How to print count of duplicate documents in a Mongo collection? (Pymongo) [duplicate]












This question already has an answer here:




  • MongoDB Duplicate Documents even after adding unique key

    2 answers




Each document in the collection looks like this. In this case, A and C are fine but B has a duplicate.



{
    "_id": {
        "$oid": "5bef93fc1c4b3236e79f9c25"  # all these are unique
    },
    "Created_at": "Sat Nov 17 04:07:12 +0000 2018",
    "ID": {
        "$numberLong": "1063644700727480320"  # duplicates identified by this ID
    },
    "Category": "A"  # this is the category
}

{
    "_id": {
        "$oid": "5bef93531c4b3236e79f9c11"
    },
    "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
    "ID": {
        "$numberLong": "1063644018276360192"
    },
    "Category": "B"
}

{
    "_id": {
        "$oid": "5bef94e81c4b3236e79f9c3b"
    },
    "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
    "ID": {
        "$numberLong": "1063644018276360192"
    },
    "Category": "B"
}

{
    "_id": {
        "$oid": "5bef94591c4b3236e79f9cee"
    },
    "Created_at": "Sat Nov 17 05:17:12 +0000 2018",
    "ID": {
        "$numberLong": "1063644700727481111"
    },
    "Category": "C"
}


Duplicates are defined by their ID. I want to count the number of duplicates and print their category like this.



Category A : 5 (5 duplicates tagged Category A)



Category B : 6



Category C : 15



This is what I have tried but it doesn't print anything. I have already seeded my Mongo database with duplicates.



cursor = db.collection.aggregate([
    {
        "$group": {
            "_id": {"ID": "$ID"},
            "uniqueIds": {"$addToSet": "$_id"},
            "count": {"$sum": 1}
        }
    },
    {"$match": {"count": {"$gt": 1}}}
])

for document in cursor:
    print(document)


Your help is appreciated :)










marked as duplicate by Neil Lunn mongodb
Users with the  mongodb badge can single-handedly close mongodb questions as duplicates and reopen them as needed.

Nov 19 '18 at 6:01


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.















  • It should work. Maybe no count is greater than 1 ($gt)? Try this: db.collection.aggregate([ { "$group": { "_id": "$ID", "uniqueIds": { "$addToSet": "$Category" }, "count": { "$sum": 1 } }} ])
    – Anthony Winzlet
    Nov 17 '18 at 11:57










  • Thanks for your help. I've tried your code but it still doesn't work. No errors either. It just prints nothing.
    – Joshua
    Nov 17 '18 at 12:04










  • I have added more documents.
    – Joshua
    Nov 17 '18 at 12:16






  • Take a look at mongoplayground.net/p/WtwN32is1G9. Is it ok?
    – Anthony Winzlet
    Nov 17 '18 at 12:18










  • Yes, that looks good, but I still can't print the output. I need to print the result of db.collection.aggregate.
    – Joshua
    Nov 17 '18 at 12:19
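
One possible reason for the silent, empty output, offered as a guess rather than a diagnosis: in PyMongo, `db.collection` refers to a collection literally named `collection`, and aggregating over a collection that is empty or does not exist returns an empty cursor without raising an error. The sketch below checks the document count first and then runs a simplified version of the pipeline from the question. The function name `find_duplicate_groups` and the placeholder names `mydb` and `tweets` are assumptions made for illustration, not names from the question.

```python
def find_duplicate_groups(collection, key="ID"):
    """Return one summary dict per `key` value that occurs more than once.

    `collection` is assumed to be a pymongo Collection; any object with
    compatible count_documents() and aggregate() methods also works.
    """
    total = collection.count_documents({})
    if total == 0:
        # An empty or non-existent collection gives an empty cursor, not an error.
        print("Collection is empty: check the database and collection names.")
        return []

    pipeline = [
        # Count how many documents share each value of `key`.
        {"$group": {"_id": "$" + key, "count": {"$sum": 1}}},
        # Keep only the values that occur more than once.
        {"$match": {"count": {"$gt": 1}}},
    ]
    groups = list(collection.aggregate(pipeline))
    print(f"{total} documents, {len(groups)} duplicated {key} values")
    return groups


# Hypothetical usage; "mydb" and "tweets" are placeholder names:
# from pymongo import MongoClient
# db = MongoClient()["mydb"]
# for group in find_duplicate_groups(db["tweets"]):
#     print(group)
```

If the first message is printed, the pipeline itself is probably fine and the code is simply pointed at the wrong database or collection.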
















python mongodb






edited Nov 17 '18 at 12:15

asked Nov 17 '18 at 11:52 by Joshua

1 Answer
Try this:



db.collection.aggregate([
    {
        $group: {
            "_id": {"ID": "$ID", "Category": "$Category"},
            "Count": {$sum: 1}
        }
    },
    {
        $match: {
            "Count": {$gt: 1}
        }
    },
    {
        $project: {
            "_id": 0,
            "ID": "$_id.ID",
            "Category": "$_id.Category",
            "Count": "$Count"
        }
    }
]);


Hope this helps!
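
The pipeline above is written for the mongo shell and reports one row per duplicated ID. A possible PyMongo adaptation that prints one line per category, in the format the question asks for, is sketched below. It assumes that every document beyond the first with a given ID counts as one duplicate; the question does not say whether that or the total size of each duplicate group is wanted. The helper names `build_pipeline`, `format_rows`, and `print_duplicate_counts` are made up for this sketch.

```python
def build_pipeline():
    """Aggregation stages that count extra copies of each ID per category."""
    return [
        # Count documents per (ID, Category) pair, as in the pipeline above.
        {"$group": {
            "_id": {"ID": "$ID", "Category": "$Category"},
            "count": {"$sum": 1},
        }},
        # Keep only IDs that occur more than once.
        {"$match": {"count": {"$gt": 1}}},
        # Sum the extra copies (count - 1) within each category.
        {"$group": {
            "_id": "$_id.Category",
            "duplicates": {"$sum": {"$subtract": ["$count", 1]}},
        }},
        {"$sort": {"_id": 1}},
    ]


def format_rows(rows):
    """Turn aggregation results into 'Category X : N' strings."""
    return [f"Category {row['_id']} : {row['duplicates']}" for row in rows]


def print_duplicate_counts(collection):
    """`collection` is assumed to be a pymongo Collection."""
    for line in format_rows(collection.aggregate(build_pipeline())):
        print(line)
```

In Python the stage names must be quoted strings (`"$group"`, not `$group`). For the four sample documents in the question, only the Category B pair shares an ID, so under the assumption above the output would be a single line for Category B with a count of 1.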






answered Nov 17 '18 at 14:00 by Arsen Davtyan














