Special Features

JSON

While building a web application, you might want to create a REST API with JSON support. You may then need to convert all your Documents to JSON in order to pass them through the REST API. Unfortunately (or fortunately), MongoDB supports field types which are not supported by JSON. This is the case for datetime, but also for any CustomTypes you may have built and for your embedded objects.
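To see why, here is a minimal illustration using only the standard library (not MongoKit itself): the stock json module refuses to serialize a datetime value.

```python
import json
from datetime import datetime

doc = {"title": "hello", "created_at": datetime(2020, 1, 1)}
try:
    json.dumps(doc)
except TypeError as err:
    # datetime is not JSON serializable out of the box
    print("not serializable:", err)
```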

Document supports JSON import and export.

to_json()

This simple method exports your document as a JSON string:

>>> # Python 3
>>> class MyDoc(Document):
...         structure = {
...             "bla":{
...                 "foo":str,
...                 "bar":int,
...             },
...             "spam":[],
...         }
>>> con.register([MyDoc])
>>> mydoc = tutorial.MyDoc()
>>> mydoc['_id'] = 'mydoc'
>>> mydoc["bla"]["foo"] = "bar"
>>> mydoc["bla"]["bar"] = 42
>>> mydoc['spam'] = list(range(10))
>>> mydoc.save()
>>> json = mydoc.to_json()
>>> json
'{"_id": "mydoc", "bla": {"foo": "bar", "bar": 42}, "spam": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}'

>>> # Python 2
>>> class MyDoc(Document):
...         structure = {
...             "bla":{
...                 "foo":unicode,
...                 "bar":int,
...             },
...             "spam":[],
...         }
>>> con.register([MyDoc])
>>> mydoc = tutorial.MyDoc()
>>> mydoc['_id'] = u'mydoc'
>>> mydoc["bla"]["foo"] = u"bar"
>>> mydoc["bla"]["bar"] = 42
>>> mydoc['spam'] = range(10)
>>> mydoc.save()
>>> json = mydoc.to_json()
>>> json
u'{"_id": "mydoc", "bla": {"foo": "bar", "bar": 42}, "spam": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}'

from_json()

To load a JSON string into a Document, use the from_json class method:

>>> # Python 3
>>> class MyDoc(Document):
...     structure = {
...         "bla":{
...             "foo":str,
...             "bar":int,
...         },
...         "spam":[],
...     }
>>> json = '{"_id": "mydoc", "bla": {"foo": "bar", "bar": 42}, "spam": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}'
>>> mydoc = tutorial.MyDoc.from_json(json)
>>> mydoc
{'_id': 'mydoc', 'bla': {'foo': 'bar', 'bar': 42}, 'spam': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}

>>> # Python 2
>>> class MyDoc(Document):
...     structure = {
...         "bla":{
...             "foo":unicode,
...             "bar":int,
...         },
...         "spam":[],
...     }
>>> json = '{"_id": "mydoc", "bla": {"foo": "bar", "bar": 42}, "spam": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}'
>>> mydoc = tutorial.MyDoc.from_json(json)
>>> mydoc
{'_id': 'mydoc', 'bla': {'foo': 'bar', 'bar': 42}, 'spam': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}

Note that from_json() will take care of all your embedded Documents if you used the to_json() method to generate the JSON. Indeed, some extra values have to be set: the database and the collection where the embedded document lives. These are added by the to_json() method:

>>> # Python 3
>>> class EmbedDoc(Document):
...     db_name = "test"
...     collection_name = "mongokit"
...     structure = {
...         "foo":str
...     }
>>> class MyDoc(Document):
...    db_name = "test"
...    collection_name = "mongokit"
...    structure = {
...        "doc":{
...            "embed":EmbedDoc,
...        },
...    }
...    use_autorefs = True
>>> con.register([EmbedDoc, MyDoc])

>>> # Python 2
>>> class EmbedDoc(Document):
...     db_name = "test"
...     collection_name = "mongokit"
...     structure = {
...         "foo":unicode
...     }
>>> class MyDoc(Document):
...    db_name = "test"
...    collection_name = "mongokit"
...    structure = {
...        "doc":{
...            "embed":EmbedDoc,
...        },
...    }
...    use_autorefs = True
>>> con.register([EmbedDoc, MyDoc])

Let’s create an embedded doc:

>>> # Python 3
>>> embed = tutorial.EmbedDoc()
>>> embed['_id'] = "embed"
>>> embed['foo'] = "bar"
>>> embed.save()

>>> # Python 2
>>> embed = tutorial.EmbedDoc()
>>> embed['_id'] = u"embed"
>>> embed['foo'] = u"bar"
>>> embed.save()

and embed this doc to another doc:

>>> # Python 3
>>> mydoc = tutorial.MyDoc()
>>> mydoc['_id'] = 'mydoc'
>>> mydoc['doc']['embed'] = embed
>>> mydoc.save()

>>> # Python 2
>>> mydoc = tutorial.MyDoc()
>>> mydoc['_id'] = u'mydoc'
>>> mydoc['doc']['embed'] = embed
>>> mydoc.save()

Now let’s see what the generated JSON looks like:

>>> json = mydoc.to_json()
>>> json
u'{"doc": {"embed": {"_collection": "tutorial", "_database": "test", "_id": "embed", "foo": "bar"}}, "_id": "mydoc"}'

As you can see, two new fields have been added: _collection and _database, which hold respectively the collection and the database where the embedded doc has been saved. That information is needed to regenerate the embedded document. These fields are removed when calling the from_json() method:

>>> # Python 3
>>> mydoc = tutorial.MyDoc.from_json(json)
>>> mydoc
{u'doc': {u'embed': {u'_id': u'embed', u'foo': u'bar'}}, u'_id': u'mydoc'}

>>> # Python 2
>>> mydoc = tutorial.MyDoc.from_json(json)
>>> mydoc
{'doc': {'embed': {'_id': 'embed', 'foo': 'bar'}}, '_id': 'mydoc'}

And the embedded document is an instance of EmbedDoc:

>>> isinstance(mydoc['doc']['embed'], EmbedDoc)
True

ObjectId support

from_json() can detect whether the _id is an ObjectId instance or a simple string. When you serialize an object with an ObjectId _id to JSON, the generated JSON object looks like this:

'{"_id": {"$oid": "..."}, ...}'

The “$oid” field is added to tell from_json() that ‘_id’ is an ObjectId instance. The same happens with embedded docs:

>>> mydoc = tutorial.MyDoc()
>>> mydoc['doc']['embed'] = embed
>>> mydoc.save()
>>> mydoc.to_json()
'{"doc": {"embed": {"_id": {"$oid": "4b5ec45090bce737cb000002"}, "foo": "bar", "_database": "test", "_collection": "tutorial"}}, "_id": {"$oid": "4b5ec45090bce737cb000003"}}'
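This “$oid” wrapping follows MongoDB’s Extended JSON convention. As a rough sketch of how such an encoder can be written with the standard library (FakeObjectId is a hypothetical stand-in for bson.ObjectId, used only for illustration):

```python
import json

class FakeObjectId:
    """Stand-in for bson.ObjectId, purely for illustration."""
    def __init__(self, hex_str):
        self.hex = hex_str

def mongo_default(obj):
    # Encode ObjectId-like values as {"$oid": "..."} per Extended JSON
    if isinstance(obj, FakeObjectId):
        return {"$oid": obj.hex}
    raise TypeError("not JSON serializable: %r" % obj)

doc = {"_id": FakeObjectId("4b5ec45090bce737cb000002"), "foo": "bar"}
print(json.dumps(doc, default=mongo_default, sort_keys=True))
```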

Migration

Let’s say we have created a blog post which looks like this:

>>> # Python 3
>>> from mongokit import *
>>> con = Connection()
>>> class BlogPost(Document):
...     structure = {
...         "blog_post":{
...             "title": str,
...             "created_at": datetime,
...             "body": str,
...         }
...     }
...     default_values = {'blog_post.created_at':datetime.utcnow}

>>> # Python 2
>>> from mongokit import *
>>> con = Connection()
>>> class BlogPost(Document):
...     structure = {
...         "blog_post":{
...             "title": unicode,
...             "created_at": datetime,
...             "body": unicode,
...         }
...     }
...     default_values = {'blog_post.created_at':datetime.utcnow}

Let’s create some blog posts:

>>> for i in range(10):
...     con.test.tutorial.BlogPost({'title':u'hello %s' % i, 'body': u'I am post number %s' % i}).save()

Now, development goes on and we add a ‘tags’ field to our BlogPost:

# Python 3
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": str,
            "created_at": datetime,
            "body": str,
            "tags":  [str],
        }
    }
    default_values = {'blog_post.created_at':datetime.utcnow}

# Python 2
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": unicode,
            "created_at": datetime,
            "body": unicode,
            "tags":  [unicode],
        }
    }
    default_values = {'blog_post.created_at':datetime.utcnow}

We will run into trouble when we try to save a fetched document, because the structures don’t match:

>>> blog_post = con.test.tutorial.BlogPost.find_one()
>>> blog_post['blog_post']['title'] = u'Hello World'
>>> blog_post.save()
Traceback (most recent call last):
    ...
StructureError: missed fields : ['tags']

If we want to fix this issue, we have to add the ‘tags’ field manually to every BlogPost in the database:

>>> con.test.tutorial.update({'blog_post':{'$exists':True}, 'blog_post.tags':{'$exists':False}},
...    {'$set':{'blog_post.tags':[]}}, multi=True)

and now we can save our blog_post:

>>> blog_post.reload()
>>> blog_post['blog_post']['title'] = u'Hello World'
>>> blog_post.save()

Lazy migration

Important

You cannot use this feature if use_schemaless is set to True

Mongokit provides a convenient way to set migration rules and apply them lazily. We will explain how to do that using the previous example.

Let’s create a BlogPostMigration which inherits from DocumentMigration:

class BlogPostMigration(DocumentMigration):
    def migration01__add_tags_field(self):
        self.target = {'blog_post':{'$exists':True}, 'blog_post.tags':{'$exists':False}}
        self.update = {'$set':{'blog_post.tags':[]}}

How does it work? All migration rules are simply methods of the BlogPostMigration class. They must begin with migration and be numbered, so they can be applied in a defined order. The rest of the name should describe the rule. Here, we create our first rule (migration01), which adds the ‘tags’ field to our BlogPost.

Then you must set two attributes: self.target and self.update. These are both regular MongoDB queries.

self.target tells MongoKit which documents match this rule. The migration will be applied to every document matching this query.

self.update is a MongoDB update query with modifiers. It describes the updates that should be applied to the matching documents.
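As an aside, the ordering implied by the numbered names can be illustrated in plain Python (FakeMigration is a hypothetical class, not MongoKit’s DocumentMigration): sorting the method names lexicographically gives the order in which the rules would be applied.

```python
class FakeMigration:
    # hypothetical migration class illustrating the naming convention
    def migration01__add_tags_field(self):
        pass
    def migration02__rename_created_at(self):
        pass

# collect the rule methods in application order
rules = sorted(name for name in dir(FakeMigration)
               if name.startswith('migration'))
print(rules)
```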

Now that our BlogPostMigration is created, we have to tell MongoKit which documents these migration rules should be applied to. To do that, we set the migration_handler attribute on BlogPost:

# Python 3
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": str,
            "created_at": datetime,
            "body": str,
            "tags": [str],
        }
    }
    default_values = {'blog_post.created_at':datetime.utcnow}
    migration_handler = BlogPostMigration

# Python 2
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": unicode,
            "created_at": datetime,
            "body": unicode,
            "tags": [unicode],
        }
    }
    default_values = {'blog_post.created_at':datetime.utcnow}
    migration_handler = BlogPostMigration

Each time an error is raised while validating a document, migration rules are applied to the object and the document is reloaded.

Caution

If migration_handler is set then skip_validation is deactivated. Validation must be on to allow lazy migration.

Bulk migration

Lazy migration is useful if you have many documents to migrate, because an update would lock the database. But sometimes you might want to run a migration on just a few documents, and you don’t want to slow down your application with validation. In that case, use bulk migration.

Bulk migration works like lazy migration, but the DocumentMigration methods must start with allmigration. Because lazy migration adds the document’s _id to self.target, with bulk migration you should provide more information in self.target. Here’s an example of bulk migration, where we finally want to remove the tags field from BlogPost:

# Python 3
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": str,
            "creation_date": datetime,
            "body": str,
        }
    }
    default_values = {'blog_post.creation_date':datetime.utcnow}

# Python 2
class BlogPost(Document):
    structure = {
        "blog_post":{
            "title": unicode,
            "creation_date": datetime,
            "body": unicode,
        }
    }
    default_values = {'blog_post.creation_date':datetime.utcnow}

Note that we don’t need to set migration_handler: it is required only for lazy migration.

Let’s edit the BlogPostMigration:

class BlogPostMigration(DocumentMigration):
    def allmigration01__remove_tags(self):
        self.target = {'blog_post.tags':{'$exists':True}}
        self.update = {'$unset':{'blog_post.tags':[]}}

To apply the migration, instantiate the BlogPostMigration and call the migrate_all method:

>>> migration = BlogPostMigration(BlogPost)
>>> migration.migrate_all(collection=con.test.tutorial)

Note

Because migration_* methods are not called by migrate_all(), you can safely mix migration_* and allmigration_* methods.

Migration status

Once all your documents have been migrated, some migration rules may become deprecated. To find out which rules are deprecated, use the get_deprecated() method:

>>> migration = BlogPostMigration(BlogPost)
>>> migration.get_deprecated(collection=con.test.tutorial)
{'deprecated':['allmigration01__remove_tags'], 'active':['migration02__rename_created_at']}

Here we can remove the rule allmigration01__remove_tags.

Advanced migration

Lazy migration

Sometimes you might want to build a more advanced migration. For instance, say you want to copy a field’s value into another field: you can access the current document’s values via self.doc. In the following example, we want to add an update_date field and copy the creation_date value into it:

class BlogPostMigration(DocumentMigration):
    def migration01__add_update_field_and_fill_it(self):
        self.target = {'blog_post.update_date':{'$exists':False}, 'blog_post':{'$exists':True}}
        self.update = {'$set':{'blog_post.update_date': self.doc['blog_post']['creation_date']}}

Advanced and bulk migration

If you want to do the same thing with bulk migration, things are a little different:

class BlogPostMigration(DocumentMigration):
    def allmigration01__add_update_field_and_fill_it(self):
        self.target = {'blog_post.update_date':{'$exists':False}, 'blog_post':{'$exists':True}}
        if not self.status:
            for doc in self.collection.find(self.target):
                self.update = {'$set':{'blog_post.update_date': doc['blog_post']['creation_date']}}
                self.collection.update(self.target, self.update, multi=True, safe=True)

In this example, the method allmigration01__add_update_field_and_fill_it will directly modify the database and will also be called by get_deprecated(). But calling get_deprecated() should not harm the database, so we need to specify which portion of the code must be skipped when get_deprecated() is called. That is the purpose of the "if not self.status:" test on the second line.

Paginator

Implementing pagination in a project is made easy with mongokit.paginator.Paginator. Paginator wraps a query result cursor in a Paginator object and provides useful properties on it.

Using Paginator consists of the following two steps:

  1. Importing the Paginator class from mongokit.paginator.
  2. Applying it to your query result cursor.

Let’s apply these steps in the following detailed example.

A detailed Example:

Consider following as a sample model class:

>>> # Python 3
>>> from mongokit import Document, Connection
>>> connection = Connection()
>>> @connection.register
... class Wiki(Document):
...
...    __collection__ = 'wiki'
...    __database__ = 'db_test_pagination'
...
...    structure = {
...        "name": str,  # name of wiki
...        "description": str,  # content of wiki
...        "created_by": str,  # username of user
...    }
...
...    required_fields = ['name', 'created_by']

>>> # Python 2
>>> from mongokit import Document, Connection
>>> connection = Connection()
>>> @connection.register
... class Wiki(Document):
...
...    __collection__ = 'wiki'
...    __database__ = 'db_test_pagination'
...
...    structure = {
...        "name": unicode,  # name of wiki
...        "description": unicode,  # content of wiki
...        "created_by": basestring,  # username of user
...    }
...
...    required_fields = ['name', 'created_by']

Now let’s assume that you have created 55 instances of the Wiki class, and that a query returns all of them in a result cursor:

>>> wiki_collection = connection['db_test_pagination']
>>> total_wikis = wiki_collection.Wiki.find()
>>> total_wikis.count()
55

Now let’s paginate the result cursor, total_wikis. As stated previously, we will first import Paginator and then apply pagination to the cursor.

>>> from mongokit.paginator import Paginator
>>> page_no = 2  # page number
>>> no_of_objects_pp = 10  # total no of objects or items per page

Keyword arguments required for Paginator class are as follows:

  1. cursor – Cursor of a returned query (total_wikis in our example)

  2. page – The page number requested (page_no in our example)

  3. limit – The number of items per page (no_of_objects_pp in our example)

    >>> paged_wiki = Paginator(total_wikis, page_no, no_of_objects_pp)
    

We applied pagination to the total_wikis cursor and stored the result in paged_wiki, which is a Paginator object.

Note: The cursor (total_wikis) that we passed as an argument to Paginator also gets limited to no_of_objects_pp (10 in our case), so looping over it will iterate no_of_objects_pp (10) times.
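The limiting behaviour described in the note can be sketched with a hypothetical paginate helper in plain Python (an illustration, not MongoKit’s actual implementation):

```python
def paginate(items, page, limit):
    # Return only the items belonging to the requested page,
    # mirroring how Paginator limits the underlying cursor.
    start = (page - 1) * limit
    return items[start:start + limit]

wikis = list(range(1, 56))      # stand-in for 55 Wiki documents
page2 = paginate(wikis, 2, 10)  # page 2, 10 items per page
print(len(page2), page2[0], page2[-1])
```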

Paginated object properties:

Let’s move ahead and try some properties on paged_wiki. MongoKit provides the following properties on the Paginator object:

Property-1: items
Returns the paginated Cursor object. For example, assuming each wiki has a name field, as in the structure above:

>>> for wiki in paged_wiki.items:
...     print(wiki['name'])

The above code will loop 10 times, printing the name of each object on the page.

Property-2: is_paginated
Boolean value determining if the cursor has multiple pages.

>>> paged_wiki.is_paginated
True

Property-3: start_index
int index of the first item on the requested page.

>>> paged_wiki.start_index
11

Since there are 10 items per page, the first item on the second page has index 11.

Property-4: end_index
int index of the last item on the requested page.

>>> paged_wiki.end_index
20

Since there are 10 items per page, the last item on the second page has index 20.

Property-5: current_page
int page number of the requested page.

>>> paged_wiki.current_page
2

Property-6: previous_page
int page number of the previous page with respect to the current requested page.

>>> paged_wiki.previous_page
1

Property-7: next_page
int page number of the next page with respect to the current requested page.

>>> paged_wiki.next_page
3

Property-8: has_next
True or False if the Cursor has a next page.

>>> paged_wiki.has_next
True

Property-9: has_previous
True or False if the Cursor has a previous page.

>>> paged_wiki.has_previous
True

Property-10: page_range
list of all the page numbers, in ascending order.

>>> paged_wiki.page_range
[1, 2, 3, 4, 5, 6]

Property-11: num_pages
int of the total number of pages.

>>> paged_wiki.num_pages
6

Property-12: count
int total number of items on the cursor.

>>> paged_wiki.count
55

It’s the same as total_wikis.count().
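For reference, the arithmetic behind these index and page-count properties can be sketched in plain Python (an illustration of the expected values, not MongoKit’s implementation; the names mirror the properties above):

```python
import math

page_no, per_page, total = 2, 10, 55

start_index = (page_no - 1) * per_page + 1   # first item on the page
end_index = min(page_no * per_page, total)   # last item on the page
num_pages = math.ceil(total / per_page)      # 55 items / 10 per page -> 6
page_range = list(range(1, num_pages + 1))

print(start_index, end_index, num_pages, page_range)
```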