Javascript In Your Ruby: Mongoid Map Reduce
Oct 10, 2011 · 3 minute read
We’re pretty fond of MongoDB at work and I’ve been getting an opportunity to kick some of the more interesting tyres recently. I thought I’d document something I found myself doing here, half hoping it might be useful for anyone else with a similar problem, and half to see if anyone else has a much neater approach. The examples are obviously pretty trivial, but hopefully you get the idea.
So, we’re making use of the rather nice Mongoid Ruby library for defining our models as Ruby classes. Here are a couple of very simple classes. Anyone familiar with DataMapper or Django’s ORM should be right at home here.
class Publication
  include Mongoid::Document
  field :name, :type => String
  field :section, :type => String
  field :body, :type => String
  field :is_published, :type => Boolean
end

class LongerPublication < Publication
  field :extra_body, :type => String
end
So we now have a good few publications and longer publications in our system, and folks have been creating sections with wild abandon. What I’d like to do now is some reporting: specifically, I want to know the number of publications by type and by publication status. And let’s allow a breakdown by section while we’re at it.
One approach to this is using Mongo’s built-in map-reduce capability. Mongoid exposes this pretty cleanly in my view, by allowing you to write the required JavaScript functions (a mapper and a reducer) inline in the Ruby code. This might feel evil, but it seems the best of the available options. I can see that for much larger functions, splitting this out into separate JavaScript files for ease of testing might be nice, but where you can just test the input and output of the whole job this works for me.
# JavaScript expressions naming the field to group by.
KLASS = "this._type"
SECTION = "this.section"

def self.count_by(type)
  # Mapper: emit one record per document, keyed by the chosen field.
  map = <<-EOF
    function() {
      function truthy(value) {
        return (value == true) ? 1 : 0;
      }
      emit(#{type}, {type: #{type}, count: 1, published: truthy(this.is_published)});
    }
  EOF
  # Reducer: sum the counts of all records that share a key.
  reduce = <<-EOF
    function(key, values) {
      var count = 0, published = 0, type;
      values.forEach(function(doc) {
        count += parseInt(doc.count);
        published += parseInt(doc.published);
        type = doc.type;
      });
      return {type: type, count: count, published: published};
    }
  EOF
  collection.mapreduce(map, reduce).find()
end
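To make the interpolation a bit more concrete, here is a minimal sketch in plain Ruby with no database involved. The helper name build_map is hypothetical (it is not part of Mongoid), and the mapper is cut down to the key and a count. It only shows that the constant’s string value is pasted into the JavaScript source before MongoDB ever sees it.

```ruby
# The constants hold JavaScript expressions as plain Ruby strings.
KLASS = "this._type"
SECTION = "this.section"

# Hypothetical helper: builds a simplified mapper for a given key expression.
def build_map(key_expression)
  <<-EOF
function() {
  emit(#{key_expression}, {type: #{key_expression}, count: 1});
}
  EOF
end

# Print the JavaScript that would be sent for a per-section breakdown.
puts build_map(SECTION)
```

Swapping SECTION for KLASS changes only the expression inside emit, which is why one Ruby method can serve both reports.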
In our case that will return something like the following, or more precisely a Mongo::Cursor that lets you get at the following data.
[{"_id"=>"Publication", "value"=>{"type"=>"Publication", "count"=>42.0, "published"=>29.0}},
{"_id"=>"LongerPublication", "value"=>{"type"=>"LongerPublication", "count"=>12.0, "published"=>10.0}}]
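As a rough illustration of what the job computes, the same grouping can be sketched in plain Ruby over an in-memory array. Everything here is an assumption made for the example: the sample documents are made up, count_by_in_memory is a hypothetical name, and this is not how MongoDB executes the job. It also returns integers, where the real output above contains floats.

```ruby
# Made-up sample documents standing in for the collection.
docs = [
  { "_type" => "Publication", "section" => "news", "is_published" => true },
  { "_type" => "Publication", "section" => "news", "is_published" => false },
  { "_type" => "LongerPublication", "section" => "features", "is_published" => true }
]

# Hypothetical pure-Ruby equivalent: group by a field, then total each group.
def count_by_in_memory(docs, field)
  docs.group_by { |doc| doc[field] }.map do |key, group|
    {
      "_id" => key,
      "value" => {
        "type" => key,
        "count" => group.size,
        "published" => group.count { |doc| doc["is_published"] == true }
      }
    }
  end
end

# Walk the rows the same way you would walk the cursor's results.
count_by_in_memory(docs, "_type").each do |row|
  value = row["value"]
  puts "#{row['_id']}: #{value['published']} of #{value['count']} published"
end
```

Passing "section" instead of "_type" gives the per-section breakdown, mirroring the SECTION constant in the real job.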
I’ve been pretty impressed with both Mongo and Mongoid here. I like the feel of map-reduce jobs for this sort of reporting task. In particular, it’s surprising how writing two languages mixed together like this doesn’t really affect the readability of the code, in my view. Given that with a relational database you’d probably be writing SQL anyway, maybe that’s not so surprising: the syntactic differences between JavaScript and Ruby are much smaller than those between SQL and pretty much any other language. Lots of folks have written about the increase in polyglot programming, but I wonder if we’ll see an increase in the embedding of one language in another?