Best practices for Feed restructuring in a social app database



























I have a social app where users can post to the school they chose at registration. Each post has a school and, optionally, a hashtag (both are pointers; the school is required, the hashtag is not).



Users see posts based on school or hashtag. Now we want to introduce a new feature, Polls. Polls are another type of post... Users will be seeing them in their feed among other posts! Besides other properties, they too have a hashtag and a school.



Right now, when a user asks for posts, I just query the Post collection, comparing only the school or hashtag field (depending on the screen they're on). So if they're on the Home screen, I'm going to have something like:




find all documents in Post, where school equals <some_value>




And if they are on the Hashtag screen, we have something like:




find all documents in Post, where hashtag equals <some_value>




Since we are going to have the new Poll collection, I'm wondering what implementation would be best... Here's what I have so far:



1st implementation



We keep the posts query as is and add a second, nearly identical query for the polls, then combine the results. Keep in mind that our app paginates, fetching items 20 at a time. I mention this because if both queries return 20 documents for a page, we'll have to compare dates to decide what the page should contain, e.g. 16 posts and 4 polls.
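To make the merge step concrete, here is a minimal Python sketch of the first implementation. It assumes each query returns its documents newest first and that every document carries a `createdAt` timestamp; the field name, the `merge_page` helper, and the sample data are illustrative assumptions, not part of the app.

```python
from datetime import datetime, timedelta
from heapq import merge
from itertools import islice

PAGE_SIZE = 20  # the page size mentioned in the question


def merge_page(posts, polls, page_size=PAGE_SIZE):
    """Merge two newest-first result lists and keep the newest `page_size` items.

    Assumes both inputs are already sorted by `createdAt` in descending order
    (`createdAt` is a hypothetical field name).
    """
    merged = merge(posts, polls, key=lambda doc: doc["createdAt"], reverse=True)
    return list(islice(merged, page_size))


# Illustrative data: one post per minute, one poll every five minutes.
now = datetime(2018, 11, 19, 22, 0, 0)
posts = [{"kind": "post", "createdAt": now - timedelta(seconds=60 * i)} for i in range(20)]
polls = [{"kind": "poll", "createdAt": now - timedelta(seconds=300 * j + 30)} for j in range(20)]

page = merge_page(posts, polls)
```

With this sample data the page ends up holding 16 posts and 4 polls. Note that 40 documents were fetched to show 20, and the next page would probably have to restart both queries from the `createdAt` of the last item shown, which is the bookkeeping this approach adds.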



2nd (and last) implementation



We create a new collection, Feed, which would combine both collections and it would look like this:



Feed

post: <Pointer to Post>
poll: <Pointer to Poll>
creator: <Pointer to User> // we need this for block checking or if you open your profile
school: <Pointer to School>
hashtag: <Pointer to Hashtag>


Every time a post or poll is created, its trigger will create a Feed document with all the necessary values.
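As a rough illustration of what such a trigger might build, the sketch below derives a Feed document from a freshly saved post or poll. The `feed_entry` helper, the `_id` and `createdAt` fields, and the sample values are assumptions made for the example; only `post`, `poll`, `creator`, `school`, and `hashtag` come from the schema above.

```python
from datetime import datetime


def feed_entry(kind, source):
    """Build a Feed document for a newly created post or poll.

    `kind` is "post" or "poll"; `source` is the document that was just saved.
    """
    if kind not in ("post", "poll"):
        raise ValueError(f"unknown kind: {kind!r}")
    return {
        "post": source["_id"] if kind == "post" else None,
        "poll": source["_id"] if kind == "poll" else None,
        "creator": source["creator"],
        "school": source["school"],
        "hashtag": source.get("hashtag"),  # optional, so it may be None
        "createdAt": source["createdAt"],  # assumed field, copied so Feed can be sorted on its own
    }


# Hypothetical poll without a hashtag.
poll = {
    "_id": "poll1",
    "creator": "user1",
    "school": "school1",
    "createdAt": datetime(2018, 11, 19, 22, 18),
}
entry = feed_entry("poll", poll)
```

Exactly one of `post` and `poll` is set on each entry, so the reader of the feed can tell which collection to follow.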



I believe the second solution is the best, as it is much cleaner than the first. I also think they're pretty much the same complexity-wise.



Any ideas on which one is better? If you have any other implementation to suggest, please do so!



NOTE: All collections are indexed.

























migrated from stackoverflow.com Nov 24 '18 at 16:09


This question came from our site for professional and enthusiast programmers.






























      mongodb database-design














      edited Nov 24 '18 at 16:17 by Paul White
      asked Nov 19 '18 at 22:18 by Sotiris Kaniras




























          1 Answer
































          MongoDB schema design centers around the queries, and only the queries. From the database perspective, there is no need to separate the document types. From the query perspective, I get the impression that you basically need a single collection that contains all documents. If you need to distinguish between posts and polls, do this via a field in the document.
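A minimal Python sketch of that idea, under some assumptions: posts and polls live in one collection, a hypothetical `kind` field tells them apart, and a hypothetical `createdAt` field drives pagination. The `feed_filter` helper only builds the filter document; the driver call is shown as a comment and is not executed here.

```python
from datetime import datetime

PAGE_SIZE = 20  # the page size mentioned in the question


def feed_filter(school=None, hashtag=None, before=None):
    """Build a MongoDB-style filter for one feed page (Home or Hashtag screen)."""
    if (school is None) == (hashtag is None):
        raise ValueError("pass exactly one of school or hashtag")
    query = {"school": school} if school is not None else {"hashtag": hashtag}
    if before is not None:
        # Only items older than the last one already shown.
        query["createdAt"] = {"$lt": before}
    return query


# Two document shapes sharing one collection; `kind` distinguishes them.
post = {"kind": "post", "school": "school1", "text": "hello"}
poll = {"kind": "poll", "school": "school1", "question": "lunch?", "options": ["a", "b"]}

home_query = feed_filter(school="school1")
next_page_query = feed_filter(hashtag="exams", before=datetime(2018, 11, 19, 22, 18))

# With pymongo this could be used roughly as:
#   db.posts.find(home_query).sort("createdAt", -1).limit(PAGE_SIZE)

# In-memory stand-in for the query: both kinds match the same filter.
matching = [doc for doc in (post, poll) if doc["school"] == home_query["school"]]
```

The filter never mentions `kind`, which is the point: one query returns posts and polls together, already in feed order, and the client branches on `kind` when rendering.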



          Also get rid of this "pointer to user" stuff. Put a copy of the necessary user fields into the respective document, and use this to display whatever you need.
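For illustration, embedding a copy might look like the following Python sketch. The field names (`displayName`, `avatarUrl`, `email`) and both helpers are assumptions chosen for the example, not taken from the app.

```python
def creator_snapshot(user):
    """Copy only the user fields the feed needs in order to render an item."""
    return {
        "_id": user["_id"],
        "displayName": user["displayName"],
        "avatarUrl": user.get("avatarUrl"),
    }


def new_post(user, school, text, hashtag=None):
    """Build a post document that embeds a creator snapshot instead of a pointer."""
    return {
        "kind": "post",
        "creator": creator_snapshot(user),
        "school": school,
        "hashtag": hashtag,
        "text": text,
    }


# Hypothetical user; private fields such as the email are deliberately not copied.
user = {"_id": "user1", "displayName": "Alex", "email": "alex@example.com"}
post = new_post(user, "school1", "hello")
```

Keeping `_id` in the snapshot still allows block checks and profile queries against the creator without a second lookup.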



          The ultimate goal is to display your complete page with a single query, without using the aggregation framework (which is slow and only a mediocre workaround for bad database design).



          And get these relational concepts out of your head. Use redundancy and optimize for the reading query. If a user changes, then yes, you'll have to update a thousand documents, but this is not the normal case, so you optimize the read and do more work on the write.
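The write-side cost can be sketched as a single multi-document update. The Python example below assumes each document embeds its author under a `creator` sub-document keyed by `_id`; that layout, the `creator_update` helper, and the field names are assumptions. It only builds the filter and update documents; the driver call is shown as a comment.

```python
def creator_update(user_id, changes):
    """Return (filter, update) documents that refresh every embedded creator copy."""
    filter_doc = {"creator._id": user_id}  # dot notation reaches into the embedded copy
    update_doc = {"$set": {f"creator.{field}": value for field, value in changes.items()}}
    return filter_doc, update_doc


filter_doc, update_doc = creator_update("user1", {"displayName": "Alexandra"})

# With pymongo this could be applied roughly as:
#   db.posts.update_many(filter_doc, update_doc)
```

An index on `creator._id` would likely be needed for this update to stay cheap, and the same index would serve the profile screen.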



          In a comment, you asked:




          What if I still created a new collection Feed, but instead of pointers, I'll use copies of documents? (without every field, only selected ones) Could that somehow slow things down or not be as fast/good as just integrating the polls inside the Post collection?




          There's nothing wrong with duplicating the necessary data for a given query in a separate collection like your "Feed" collection. But "having too many fields" is once more a relational thought.



          A collection itself doesn't have any fields. You have documents, and each document has its own set of fields.



          There is no negative effect when you have 100 document types which sum up to 10000 different fields. There may be a technical drawback if you start to create many indexes, but apart from that you may well dump everything in a single collection.
















































          edited Nov 24 '18 at 16:15 by Paul White
          answered Nov 20 '18 at 6:24 by mtj




































