Discrete Passions: Chipping away at manipulating and querying mongoDB sub-documents (with Groovy/Java)

I'm building a little library of commonly needed MongoDB Groovy scripts using the MongoDB Java driver. Sub-documents are my favorite aspect of MongoDB schema design. They're how I like to illustrate a core difference between document DBs and the "joining tables" world of SQL. So I figured I'd tackle storing and fetching sub-documents.

There are more programmer/Java-friendly strategies, such as Morphia. My preference is to start with a reasonably low-level API and get a foundational understanding before jumping up a level or two to some very helpful abstraction. It's a control thing, I'm sure... ;}

It's easy to create some test data with the mongo shell's native JavaScript, so I started with that. Below I'm using a collection called "diary". The document records John's activities and the dates he did them.

mongo

> use blog
> db.diary.insert({name: 'john', activities: []});  // setup activities array for subsequent content updates
> db.diary.update({name:'john'},{"$push" : {"activities" : { "date" : "20130812", "name" :"Go to school"}}});
> db.diary.update({name:'john'},{"$push" : {"activities" : { "date" : "20130817", "name" :"Bird watching on Rio Grande"}}});
> db.diary.findOne({}, {_id:0}); // return the first doc found; don't display the _id

{"activities" : [
       {
               "date" : "20130812",
               "name" : "Go to school"
       },
       {
               "date" : "20130817",
               "name" : "Bird watching on Rio Grande"
       }
       ],
       "name" : "john"
}

I now have a key called "activities" that collects sub-documents in array form. Each item in the array is a sub-document representing a date-stamped activity, such as "Go to school", which I do every day in one form or other.

The code below loops through the matching top-level documents and pulls out the value of the "activities" key. Since we're talking MongoDB, there's no requirement that such a key exists in every document. Whatever policy you decide on for missing keys, enforcing it is the responsibility of your code!

Here's some tested code that performs the query I've been looking for.

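import com.mongodb.BasicDBList
import com.mongodb.BasicDBObject

// main...
    dumpActivities(fetchActivities('john'))
// ...end of main

// Returns a map of date -> activity name for the named person.
// `diary` is the DBCollection for blog.diary, set up elsewhere.
def fetchActivities(name) {
  def activities = [:]
  def q = new BasicDBObject().append("name", name)  // query: { name: <name> }
  def cursor = diary.find(q)
  while (cursor.hasNext()) {
    // the value of "activities" is an array, so cast it to a BasicDBList
    def activityDocs = (BasicDBList) cursor.next().get("activities")
    // DANGER -- This loop assumes one event per day...
    for (BasicDBObject activity: activityDocs) {
      activities.put(activity.date, activity.name)
    }
  }
  return activities
}

// Prints one line per (date, activity) pair.
def dumpActivities(activities) {
  activities.each { dateStamp, activity ->
    println "Activity: "+ activity + " on " + dateStamp
  }
}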

So we're looping through each document that matches the name query and reading its "activities" key. The driver hands the value back as a plain Object, so we cast it to a BasicDBList. Basically, we're treating the value of the activities key as the array of sub-documents that it is.
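Since nothing guarantees that every document carries an "activities" key, that cast can be handed a null. Here is a minimal sketch of one possible policy: treat a missing key as an empty list. Plain Groovy maps and lists stand in for BasicDBObject and BasicDBList (which are themselves Map and List implementations), and the helper name activitiesOf is my own invention for illustration, not something the driver provides.

```groovy
// Hypothetical helper (not part of the MongoDB driver): return a document's
// activities, or an empty list when the "activities" key is missing or
// does not hold a list.
List activitiesOf(Map doc) {
    def value = doc.get("activities")
    return (value instanceof List) ? value : []
}

// Plain maps stand in for documents fetched from the diary collection.
def withActivities = [name: 'john', activities: [[date: '20130812', name: 'Go to school']]]
def withoutActivities = [name: 'jane']   // made-up document with no activities key

assert activitiesOf(withActivities).size() == 1
assert activitiesOf(withoutActivities).isEmpty()
```

Returning an empty list keeps the calling loop free of null checks; throwing an exception would be an equally valid policy if a missing key means bad data in your application.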

The innermost loop processes each sub-document as the hash (represented by a BasicDBObject) that it is, storing its "date" and "name" values as a key-value pair in the result map.
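Because the result map is keyed by date, a second activity on the same day would silently replace the first. One way around that, sketched here under the assumption that a list of names per date is acceptable, is to collect a list for each date. Plain maps again stand in for the BasicDBObject sub-documents; groupByDate and the "Write blog post" entry are made up for the example.

```groovy
// Hypothetical helper: build a map of date -> list of activity names,
// so two activities on the same day are both kept.
def groupByDate(List activityDocs) {
    def byDate = [:]
    for (activity in activityDocs) {
        // create the list the first time a date is seen, then append
        if (!byDate.containsKey(activity.date)) {
            byDate[activity.date] = []
        }
        byDate[activity.date] << activity.name
    }
    return byDate
}

// Sample sub-documents; the second 20130817 entry is invented test data.
def docs = [
    [date: '20130817', name: 'Bird watching on Rio Grande'],
    [date: '20130817', name: 'Write blog post'],
    [date: '20130812', name: 'Go to school']
]

def grouped = groupByDate(docs)
assert grouped['20130817'] == ['Bird watching on Rio Grande', 'Write blog post']
assert grouped['20130812'] == ['Go to school']
```

A dump routine for this shape would need a nested loop, one line per name under each date.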

So! Now I have the basis of my standard query pattern. Next post should be about inserting new activity sub-documents with Groovy.

Saturday, August 17, 2013