Datas manipulation in MongoDB: rename field, change type, add sub-document

mars 15th, 2013

The other day I’ve done a big data import in MongoDB and I had to change some of my fields types, to rename fields, to transform String fields into sub-document array and other operations. All these operations took me time to search solution on the Web and I’ve wanted to share these solutions I’ve found. I’ve used MongoDB for over a year now in several different projects, and it schemaless structure is very powerfull.

All operation in this post will be made with the mongo shell. There’s a .pretty() command function to prettify shell output.

db.collection.find().pretty()

which is similar to :

db.collection.find().forEach(printjson);

If you want to enable pretty print globally by default, just add the following line to your file in $HOME/.mongorc.js.

DBQuery.prototype._prettyShell = true

You may also want output colorized + other enhancements, so check out Mongo-hacker on GitHub

Structure example

Let’s start with our example now, here is the « members » Mongodb collections :


>db.members.find()
{
  "_id" : ObjectId("51435013044641f8f5a1252e"),
  "fullname"  : "John Doe",
  "birthdate" : "1974-11-26 00:00:00",
  "children" : "0",
  "hair" : "blond",
  "photos" : "john-doe.jpg"
},
{
  "_id" : ObjectId("51435013044641f8f5a1252f"),
  "fullname"  : "Dave Brown",
  "birthdate" : "1966-07-01 00:00:00",
  "children" : "1",
  "hair" : "brown",
  "photos" : "dave-brown.jpg"
},
{
  "_id" : ObjectId("51435013044641f8f5a12530"),
  "fullname"  : "Joe Black",
  "birthdate" : "1920-04-15 00:00:00",
  "children" : "2",
  "hair" : "black",
  "photos" : "joe-black.jpg"
}

Let’s say we want to make the following changes to it :

  • Change the « birthdate » field’s type to ISODate() instead of type String.
  • Change the « children » field’s type to Number instead of type String.
  • Change the field member’s « hair » name to « hair_color ».
  • Change the member’s « Joe Black » hair color value.
  • Turn the « photos » structure’s field into an array of sub-document and change the photo value.

Change the « birthdate » field’s type to ISODate() instead of String.

When you start using the find() method, it returns a cursor to the result. Then you can iterate through cursor with the methods hasNext() and next() like so :


var cursor = db.members.find();
while (cursor.hasNext()) {
  var doc = cursor.next();
  db.members.update({_id : doc._id}, {$set : {birthdate : new ISODate(doc.birthdate) }});
}

All « birthdate »’s fields were changed to ISODate() and here’s the result with after a new find() :


> db.members.find()
{
  "_id" : ObjectId("51435013044641f8f5a1252e"),
  "birthdate" : ISODate("1974-11-26T00:00:00Z"),
  "children" : "0",
  "fullname" : "John Doe",
  "hair" : "blond",
  "photos" : "john-doe.jpg"
}
{
  "_id" : ObjectId("51435013044641f8f5a1252f"),
  "birthdate" : ISODate("1966-07-01T00:00:00Z"),
  "children" : "1",
  "fullname" : "Dave Brown",
  "hair" : "brown",
  "photos" : "dave-brown.jpg"
}
{
  "_id" : ObjectId("51435013044641f8f5a12530"),
  "birthdate" : ISODate("1920-04-15T00:00:00Z"),
  "children" : "2",
  "fullname" : "Joe Black",
  "hair" : "black",
  "photos" : "joe-black.jpg"
}

Change the « children » field’s type to Number instead of String.

The mongodb shell is based on JavaScript, so you can use methods internally on the server-side. To parse the string we’ll use the function parseInt() to change the value and save the « members » after all.


db.members.find().forEach( function (x) {
  x.children = parseInt(x.children);
  db.members.save(x);
});

And verify if everything works as expected :


> db.members.find()
{
  "_id" : ObjectId("51435013044641f8f5a1252e"),
  "birthdate" : ISODate("1974-11-26T00:00:00Z"),
  "children" : 0,
  "fullname" : "John Doe",
  "hair" : "blond",
  "photos" : "john-doe.jpg"
}
{
  "_id" : ObjectId("51435013044641f8f5a1252f"),
  "birthdate" : ISODate("1966-07-01T00:00:00Z"),
  "children" : 1,
  "fullname" : "Dave Brown",
  "hair" : "brown",
  "photos" : "dave-brown.jpg"
}
{
  "_id" : ObjectId("51435013044641f8f5a12530"),
  "birthdate" : ISODate("1920-04-15T00:00:00Z"),
  "children" : 2,
  "fullname" : "Joe Black",
  "hair" : "black",
  "photos" : "joe-black.jpg"
}

Change the field member’s « hair » name to « hair_color ».

To accomplish this, we’ll use the $rename operator as follow :

db.members.update({}, {$rename: {"hair": "hair_color"}}, false, true);

Change the member’s « Joe Black » hair color value.

This is a basic query, can be executed as follows :

db.members.update ({_id: ObjectId("51435013044641f8f5a1252e")}, {$set: {"hair":"white"}}, false, true);

Change the « photos » structure’s field into array of sub-document and the photo value.

For this operation we would like to change the « photos » field to an array of sub-document and add a sub-field name « uri » which refer to a new photo url.

After the operation, the structure will be as follows :


{
  "_id" : ObjectId("51435013044641f8f5a1252e"),
  ...
  "photos" : [
    "uri": "/images/john-doe.jpg"
  ]
}

This can be done with the following command :

db.members.find().snapshot().forEach( function (x) { x.photos = [{"uri": "/images/" + x.photos}]; db.members.save(x); });

We’re done for today, but if you have any enhancements or new query, just let me know !

Hope this will help !