How to print count of duplicate documents in a Mongo collection? (Pymongo) [duplicate]












This question already has an answer here:




  • MongoDB Duplicate Documents even after adding unique key

    2 answers




Each document in the collection looks like this. In this case, A and C are fine but B has a duplicate.



{
  "_id": {
    "$oid": "5bef93fc1c4b3236e79f9c25"  # all these are unique
  },
  "Created_at": "Sat Nov 17 04:07:12 +0000 2018",
  "ID": {
    "$numberLong": "1063644700727480320"  # duplicates identified by this ID
  },
  "Category": "A"  # this is the category
}

{
  "_id": {
    "$oid": "5bef93531c4b3236e79f9c11"
  },
  "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
  "ID": {
    "$numberLong": "1063644018276360192"
  },
  "Category": "B"
}

{
  "_id": {
    "$oid": "5bef94e81c4b3236e79f9c3b"
  },
  "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
  "ID": {
    "$numberLong": "1063644018276360192"
  },
  "Category": "B"
}

{
  "_id": {
    "$oid": "5bef94591c4b3236e79f9cee"
  },
  "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
  "ID": {
    "$numberLong": "1063644700727481111"
  },
  "Category": "C"
}


Duplicates are defined by their ID. I want to count the number of duplicates and print their category like this.



Category A : 5 (5 duplicates tagged Category A)
Category B : 6
Category C : 15



This is what I have tried but it doesn't print anything. I have already seeded my Mongo database with duplicates.



cursor = db.collection.aggregate([
    {
        "$group": {
            "_id": {"ID": "$ID"},
            "uniqueIds": {"$addToSet": "$_id"},
            "count": {"$sum": 1}
        }
    },
    {"$match": {"count": {"$gt": 1}}}
])

for document in cursor:
    print(document)


Your help is appreciated :)










marked as duplicate by Neil Lunn mongodb
Nov 19 '18 at 6:01


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.















  • It should work. Maybe no group has a count greater than 1 ($gt)? Try this: db.collection.aggregate([ { "$group": { "_id": "$ID", "uniqueIds": { "$addToSet": "$Category" }, "count": { "$sum": 1 } } } ])
    – Anthony Winzlet
    Nov 17 '18 at 11:57










  • Thanks for your help. I've tried your code but it still doesn't work. No errors either. It just prints nothing.
    – Joshua
    Nov 17 '18 at 12:04










  • I have added more documents.
    – Joshua
    Nov 17 '18 at 12:16






  • Take a look at mongoplayground.net/p/WtwN32is1G9. Is it ok?
    – Anthony Winzlet
    Nov 17 '18 at 12:18










  • Yes, that looks good, but I still can't print the output. I need to print the result of db.collection.aggregate.
    – Joshua
    Nov 17 '18 at 12:19
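
The thread never establishes why nothing is printed. One possibility, offered only as a guess: in PyMongo, db.collection addresses a collection literally named "collection", and aggregating over a missing or empty collection returns an empty cursor without raising an error. A minimal sketch for checking that, assuming a PyMongo Database object; the helper name describe_collection and the names in the usage comment are hypothetical:

```python
def describe_collection(db, name):
    """Return (exists, document_count) for a collection name in a PyMongo database."""
    # list_collection_names() only reports collections that actually exist.
    exists = name in db.list_collection_names()
    # count_documents({}) counts every document; skip the call if the collection is absent.
    count = db[name].count_documents({}) if exists else 0
    return exists, count


# Usage (connection string, database and collection names are placeholders):
# from pymongo import MongoClient
# db = MongoClient("mongodb://localhost:27017")["mydb"]
# print(describe_collection(db, "collection"))
```

If the first value is False or the count is 0, the pipeline has nothing to group, which would match the "prints nothing, no errors" symptom.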
















python mongodb






edited Nov 17 '18 at 12:15

asked Nov 17 '18 at 11:52
Joshua

1 Answer
Try this:



db.collection.aggregate([
    {
        $group : {
            "_id" : {"ID" : "$ID", "Category" : "$Category"},
            "Count" : {$sum : 1}
        }
    },
    {
        $match : {
            "Count" : {$gt : 1}
        }
    },
    {
        $project : {
            "_id" : 0,
            "ID" : "$_id.ID",
            "Category" : "$_id.Category",
            "Count" : "$Count"
        }
    }
]);


Hope this helps!
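
The pipeline above is written in mongo shell syntax. A rough PyMongo sketch of the same idea, with one extra $group stage to roll the result up per category as the question asks. It assumes "number of duplicates" means the surplus copies of each ID (count minus 1), which the question does not spell out; the function names and the names in the usage comment are hypothetical:

```python
def duplicates_per_category_pipeline():
    """Build an aggregation pipeline that counts surplus copies per Category."""
    return [
        # One group per (ID, Category) pair, as in the shell pipeline above.
        {"$group": {"_id": {"ID": "$ID", "Category": "$Category"},
                    "count": {"$sum": 1}}},
        # Keep only IDs that occur more than once.
        {"$match": {"count": {"$gt": 1}}},
        # Roll up per category; count - 1 is the number of extra copies (assumption).
        {"$group": {"_id": "$_id.Category",
                    "duplicates": {"$sum": {"$subtract": ["$count", 1]}}}},
        # Sort by category name for stable output.
        {"$sort": {"_id": 1}},
    ]


def print_duplicates(collection):
    """Run the pipeline on a PyMongo collection and print one line per category."""
    lines = []
    for doc in collection.aggregate(duplicates_per_category_pipeline()):
        lines.append("Category {} : {}".format(doc["_id"], doc["duplicates"]))
    for line in lines:
        print(line)
    return lines


# Usage (connection string, database and collection names are placeholders):
# from pymongo import MongoClient
# coll = MongoClient("mongodb://localhost:27017")["mydb"]["tweets"]
# print_duplicates(coll)
```

In Python the stage operators must be quoted strings ("$group", "$sum"), unlike in the shell. To count every document that belongs to a duplicated ID rather than only the extras, replace the $subtract expression with "$count".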






answered Nov 17 '18 at 14:00
Arsen Davtyan
