Best practices for Feed restructuring in a social app database
I have a social app where users can post to the school they chose on registration. Each post can have a hashtag and a school (both are pointers; hashtag is optional, school is not).
Users see posts based on school or hashtag. Now we want to introduce a new feature, Polls. Polls are another type of post, and users will see them in their feed among other posts. Besides other properties, they too have a hashtag and a school.
Right now when a user asks for posts, I'm just making a query to the Post collection, comparing only the school or hashtag field (depending on the screen they're at). So if they're on the Home screen, I'm going to have something like:
find all documents in Post, where school equals <some_value>
And if they are on the Hashtag screen, we have something like:
find all documents in Post, where hashtag equals <some_value>
Since we are going to have the new Poll collection, I'm wondering what implementation would be best... Here's what I have so far:
1st implementation
We keep the posts query as is and add a second, nearly identical query for the polls, then combine the results. Keep in mind that our app fetches data with pagination, 20 items at a time. I mention this because if both queries return 20 documents each for a page, we'll have to compare dates to decide which 20 items make up the response, e.g. 16 posts and 4 polls.
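To make the merge step concrete, here is a minimal sketch in Python of combining two already-sorted result pages. It assumes both queries return documents newest first and that each document carries a `created_at` date; the field names `created_at` and `kind`, the helper `merge_page`, and the sample data are all made up for illustration.

```python
from datetime import datetime, timedelta
from heapq import merge
from itertools import islice

PAGE_SIZE = 20  # page size mentioned in the question


def merge_page(posts, polls, page_size=PAGE_SIZE):
    """Merge two newest-first lists and keep the newest page_size items.

    Both inputs must already be sorted by created_at, newest first.
    """
    newest_first = merge(
        posts, polls, key=lambda doc: doc["created_at"], reverse=True
    )
    return list(islice(newest_first, page_size))


# Hypothetical sample data: 20 posts and 20 polls, each list newest first.
base = datetime(2018, 11, 19, 12, 0, 0)
posts = [
    {"kind": "post", "created_at": base - timedelta(minutes=i)}
    for i in range(20)
]
polls = [
    {"kind": "poll", "created_at": base - timedelta(minutes=5 * i, seconds=30)}
    for i in range(20)
]

page = merge_page(posts, polls)
post_count = sum(1 for doc in page if doc["kind"] == "post")
poll_count = sum(1 for doc in page if doc["kind"] == "poll")
print(post_count, poll_count)  # 16 4 with this sample data
```

Note that the 24 documents that did not make the page are thrown away, and the next page has to resume each collection from the date of the oldest item actually shown, not from a fixed offset of 20.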
2nd (and last) implementation
We create a new collection, Feed, which would combine both collections and it would look like this:
Feed
post: <Pointer to Post>
poll: <Pointer to Poll>
creator: <Pointer to User> // we need this for block checking or if you open your profile
school: <Pointer to School>
hashtag: <Pointer to Hashtag>
Every time a post or poll is created, its trigger will create a Feed document with all the necessary values.
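As a rough sketch of what such a trigger would have to produce, the Python function below builds the Feed document from a freshly saved post or poll. The function name `build_feed_entry`, the `_id` and `created_at` fields, and the sample values are assumptions for illustration; `created_at` is added only so the feed can be sorted by date.

```python
def build_feed_entry(kind, doc):
    """Build the Feed document for a newly created post or poll.

    kind is "post" or "poll"; doc is the saved Post or Poll document.
    """
    if kind not in ("post", "poll"):
        raise ValueError(f"unsupported kind: {kind}")
    return {
        "post": doc["_id"] if kind == "post" else None,
        "poll": doc["_id"] if kind == "poll" else None,
        "creator": doc["creator"],
        "school": doc["school"],
        "hashtag": doc.get("hashtag"),  # optional, so it may be missing
        "created_at": doc["created_at"],  # assumed field, used for sorting
    }


# Hypothetical poll without a hashtag.
new_poll = {
    "_id": "poll1",
    "creator": "user1",
    "school": "school1",
    "created_at": "2018-11-19T22:18:00Z",
}
entry = build_feed_entry("poll", new_poll)
print(entry["poll"], entry["post"], entry["hashtag"])
```

Exactly one of `post` and `poll` is set on each Feed document, so the client has to check which one is present before rendering.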
I believe the second solution is the best, as it is much cleaner than the first. I also think they're pretty much the same complexity-wise.
Any ideas on which one is better? If you have any other implementation to suggest, please do so!
NOTE: All collections are indexed.
mongodb database-design
migrated from stackoverflow.com Nov 24 '18 at 16:09
This question came from our site for professional and enthusiast programmers.
edited Nov 24 '18 at 16:17
Paul White♦
asked Nov 19 '18 at 22:18
Sotiris Kaniras
1 Answer
Real MongoDB design centers on the queries, and only the queries. From the database perspective, there is no need to separate the document types. From the query perspective, I get the impression that you basically need a single collection that contains all documents. If you need to distinguish between posts and polls, do this via a field in the document.
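A minimal sketch of that idea in Python, using a plain list as a stand-in for the collection. The `type` field name, the helper `find`, and the sample documents are assumptions for illustration; the query dictionaries have the same shape as simple MongoDB equality filters.

```python
def find(collection, query):
    """Tiny in-memory stand-in for an equality-only find()."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in query.items())
    ]


# One collection holding both document types, told apart by "type".
content = [
    {"type": "post", "school": "s1", "hashtag": "h1", "text": "Hello"},
    {"type": "poll", "school": "s1", "question": "Lunch?", "options": ["A", "B"]},
    {"type": "post", "school": "s2", "hashtag": "h1", "text": "Hi"},
]

home_feed = find(content, {"school": "s1"})        # posts and polls together
hashtag_feed = find(content, {"hashtag": "h1"})    # same query shape as before
polls_only = find(content, {"school": "s1", "type": "poll"})

print([doc["type"] for doc in home_feed])
```

The Home and Hashtag queries stay exactly as they were for posts alone, and the `type` field is only consulted when a screen really needs one kind of document.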
Also get rid of this "pointer to user" stuff. Put a copy of the necessary user fields into the respective document, and use this to display whatever you need.
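As a sketch of what "a copy of the necessary user fields" could look like, the Python helper below embeds a small snapshot of the user in the document at write time. The helper name `with_creator_snapshot` and the fields `name` and `avatar_url` are hypothetical; the user id is kept in the copy so block checks and profile queries still have something to match on.

```python
def with_creator_snapshot(doc, user, fields=("name", "avatar_url")):
    """Return a copy of doc carrying the user fields needed for display."""
    snapshot = {"id": user["_id"]}
    snapshot.update({field: user[field] for field in fields})
    return {**doc, "creator": snapshot}


# Hypothetical user and post.
user = {
    "_id": "u1",
    "name": "Alice",
    "avatar_url": "https://example.com/a.png",
    "email": "alice@example.com",  # not needed for display, so not copied
}
post = {"type": "post", "school": "s1", "text": "Hello"}

stored = with_creator_snapshot(post, user)
print(stored["creator"])
```

Rendering a feed item then needs no second lookup for the author.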
The ultimate goal is to display your complete page with a single query, without using the aggregation framework (which is slow and only a mediocre workaround for bad database design).
And get these relational concepts out of your head. Use redundancy, optimize for the reading query. If a user changes, well yes, you'll have to update a thousand documents - but this is not the normal case, so you optimize the read and do more work on the write.
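To illustrate the write-side cost, here is a small in-memory sketch in Python of pushing a user change into every document that carries a copy of that user. The function `propagate_user_change`, the `creator` sub-document shape, and the sample data are assumptions for illustration.

```python
def propagate_user_change(collection, user_id, changes):
    """Rewrite the embedded creator copy in every document by this user.

    Returns the number of documents touched.
    """
    touched = 0
    for doc in collection:
        if doc["creator"]["id"] == user_id:
            doc["creator"].update(changes)
            touched += 1
    return touched


# Hypothetical data: a thousand documents by one user, five by another.
content = [
    {"type": "post", "creator": {"id": "u1", "name": "Alice"}}
    for _ in range(1000)
]
content += [
    {"type": "poll", "creator": {"id": "u2", "name": "Bob"}}
    for _ in range(5)
]

updated = propagate_user_change(content, "u1", {"name": "Alice B."})
print(updated)  # 1000
```

Against a real database this would normally be a single multi-document update rather than a client-side loop, for example pymongo's `update_many` with a filter on `creator.id` and a `$set` on `creator.name`.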
In a comment, you asked:
"What if I still created a new collection Feed, but instead of pointers, I'll use copies of documents? (without every field, only selected ones) Could that somehow slow things down or not be as fast/good as just integrating the polls inside the Post collection?"
There's nothing wrong with duplicating the necessary data for a given query in a separate collection like your "Feed" collection. But "having too many fields" is once more a relational thought.
You don't have any fields in a collection. You have documents, and these documents have a number of fields individually.
There is no negative effect when you have 100 document types which sum up to 10000 different fields. There may be a technical drawback if you start to create many indexes, but apart from that you may well dump everything in a single collection.
edited Nov 24 '18 at 16:15
Paul White♦
answered Nov 20 '18 at 6:24
mtj