Saturday 21 February 2015

MongoDB : dropDups : Drop duplicates from a collection


In the previous post we saw how to create unique indexes. While creating a unique index, you can use the dropDups option, which removes documents that have a duplicate value for the indexed key. If dropDups is set to true, MongoDB indexes only the first occurrence of a key and removes all documents from the collection that contain subsequent occurrences of that key.

Let’s say the employee collection has 3 documents like below.
> db.employee.find()
{ "_id" : 1, "mailId" : "abcabc@abc.com" }
{ "_id" : 2, "mailId" : "defdef@def.com" }
{ "_id" : 3, "mailId" : "abcabc@abc.com" }


If you try to create a unique index on the field “mailId”, you will get an error, because documents 1 and 3 have the same mailId.
> db.employee.ensureIndex({"mailId":1}, {"unique":true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "ok" : 0,
        "errmsg" : "E11000 duplicate key error index: test.employee.$mailId_1  dup key: { : \"abcabc@abc.com\" }",
        "code" : 11000
}
>
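Before deciding whether to drop anything, it can help to see which values collide. The following is only a minimal sketch in plain JavaScript, not a MongoDB feature: findDuplicateKeys is a hypothetical helper, and it assumes the documents are already in an array (for example, loaded with db.employee.find().toArray()) and that the field values can be compared as strings.

```javascript
// Hypothetical helper, not part of MongoDB.
// Groups _id values by the given field and returns only the
// groups that contain more than one document.
function findDuplicateKeys(docs, field) {
  var idsByValue = {};
  for (var i = 0; i < docs.length; i++) {
    // Assumption: field values are compared by their string form.
    var value = String(docs[i][field]);
    if (!Object.prototype.hasOwnProperty.call(idsByValue, value)) {
      idsByValue[value] = [];
    }
    idsByValue[value].push(docs[i]._id);
  }

  var duplicates = {};
  for (var key in idsByValue) {
    if (Object.prototype.hasOwnProperty.call(idsByValue, key) &&
        idsByValue[key].length > 1) {
      duplicates[key] = idsByValue[key];
    }
  }
  return duplicates;
}

// The same three documents as in the employee collection.
var employees = [
  { _id: 1, mailId: "abcabc@abc.com" },
  { _id: 2, mailId: "defdef@def.com" },
  { _id: 3, mailId: "abcabc@abc.com" }
];

var duplicateMailIds = findDuplicateKeys(employees, "mailId");
// duplicateMailIds is { "abcabc@abc.com": [1, 3] }
```

For this data the helper reports the same value that appears in the E11000 error message, along with the _ids of the documents that share it.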


You can still create the unique index by telling MongoDB to remove the duplicate documents with the dropDups option.
> db.employee.ensureIndex({"mailId":1}, {"unique":true, "dropDups":true})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
>
> db.employee.find()
{ "_id" : 1, "mailId" : "abcabc@abc.com" }
{ "_id" : 2, "mailId" : "defdef@def.com" }
>


You can observe that document 3 has been removed from the employee collection.
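The "keep the first occurrence, remove the rest" rule can be sketched in plain JavaScript. This is an illustration only, not how the server implements it: dropDupsSketch is a hypothetical helper, and it assumes that array order stands in for the order in which MongoDB encounters the documents. Since dropDups was deprecated in MongoDB 2.6 and removed in 3.0, logic along these lines could also be used to decide which documents to delete by hand on newer servers.

```javascript
// Hypothetical helper, not part of MongoDB.
// Splits docs into those that would be kept (first occurrence of
// each field value) and those that would be removed (later ones).
function dropDupsSketch(docs, field) {
  var seen = {};
  var kept = [];
  var removed = [];
  for (var i = 0; i < docs.length; i++) {
    // Assumption: field values are compared by their string form.
    var value = String(docs[i][field]);
    if (Object.prototype.hasOwnProperty.call(seen, value)) {
      removed.push(docs[i]);   // later occurrence of a known value
    } else {
      seen[value] = true;      // first time this value appears
      kept.push(docs[i]);
    }
  }
  return { kept: kept, removed: removed };
}

var result = dropDupsSketch([
  { _id: 1, mailId: "abcabc@abc.com" },
  { _id: 2, mailId: "defdef@def.com" },
  { _id: 3, mailId: "abcabc@abc.com" }
], "mailId");
// result.kept holds the documents with _id 1 and 2,
// result.removed holds the document with _id 3.
```

The split matches the find() output after the index was built: documents 1 and 2 remain, and document 3 is gone.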

