Deriving marginal likelihood formula











The formula for the marginal likelihood is the following:

$ p(D \mid m) = \int p(D \mid \theta)\, p(\theta \mid m)\, d\theta $

But if I try to simplify the right-hand side, how would I prove this equality?

$ = \int \frac{p(D, \theta)}{p(\theta)} \frac{p(\theta, m)}{p(m)}\, d\theta $

... and so on? I can't seem to simplify it. I can't just "remove" $\theta$ here, right, like you would do if there were only one expression? That is, this isn't the same as:

$ p(D)\,\frac{p(m)}{p(m)} = p(D)? $










  • You want to be performing the marginalisation over $\theta$, so you want to rearrange your integrand so that $\theta$ is the argument of the density function and doesn't appear as a variable being conditioned on.
    – Nadiels
    Nov 15 at 11:32

  • @Nadiels Can you show what you mean? I don't see how I could expand it more than I've done in step 2.
    – Ferus
    Nov 15 at 12:01















Tags: probability, probability-theory, probability-distributions, conditional-probability, marginal-probability






asked Nov 15 at 11:02
– Ferus












1 Answer
accepted
The marginal likelihood is more of a definition than a result. What you always have from basic probability theory is the marginalisation
$$
p(D \mid m) = \int p(D, \theta \mid m)\, d\theta, \tag{1}
$$

so there is an assumption that $p(D \mid \theta, m) = p(D \mid \theta)$; this is a hierarchical modelling setup.

Since we know $(1)$, my comment is just that, as quickly as possible, you want to write
$$
p(D \mid \theta)\, p(\theta \mid m) = p(D, \theta \mid m),
$$

so instead of an expansion like the one you have considered, you actually want to condense everything into a joint density.
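As a sanity check of the marginalisation identity, here is a small numerical sketch. The conjugate Gaussian model and all numbers in it are my own illustrative choices, not from the thread; it is a case where the marginal likelihood has a known closed form to compare against:

```python
# Numerical sanity check of the marginalisation identity
#   p(D | m) = ∫ p(D | θ) p(θ | m) dθ
# for a conjugate Gaussian model (an illustrative choice, not from the thread):
# one observation x ~ N(θ, σ²) with prior θ ~ N(μ₀, τ²) under model m,
# so the closed-form marginal is x ~ N(μ₀, σ² + τ²).
import numpy as np

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

x, sigma2 = 1.3, 0.5   # observed datum and likelihood variance (made up)
mu0, tau2 = 0.0, 2.0   # prior mean and variance under model m (made up)

theta = np.linspace(-20.0, 20.0, 200_001)  # dense grid; integrand ≈ 0 at ends
dtheta = theta[1] - theta[0]
integrand = normal_pdf(x, theta, sigma2) * normal_pdf(theta, mu0, tau2)

marginal_numeric = float(np.sum(integrand) * dtheta)       # ∫ p(D|θ) p(θ|m) dθ
marginal_exact = float(normal_pdf(x, mu0, sigma2 + tau2))  # closed form

assert abs(marginal_numeric - marginal_exact) < 1e-8
```

The grid is chosen wide and dense enough that the Gaussian integrand has effectively vanished at both endpoints, so a simple Riemann sum agrees with the closed form to high precision.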






  • Okay, but how did you get $p(D \mid \theta)\, p(\theta \mid m) = p(D, \theta \mid m)$?
    – Ferus
    Nov 15 at 13:32

  • 'Cause $p(D, \theta \mid m) = p(D \mid \theta, m)\, p(\theta \mid m)$, but $p(D \mid \theta, m) = p(D \mid \theta)$ by assumption.
    – Nadiels
    Nov 15 at 13:52

  • Alright, then I get it. It was the assumption that confused me. Thanks!
    – Ferus
    Nov 15 at 13:55

  • No worries – I think that assumption is actually responsible for a lot of confusion when it comes to the marginal likelihood.
    – Nadiels
    Nov 15 at 13:59
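Putting the comment exchange together, the full chain from the integrand back to the marginal can be written out as:

$$
\int p(D \mid \theta)\, p(\theta \mid m)\, d\theta
= \int p(D \mid \theta, m)\, p(\theta \mid m)\, d\theta
= \int p(D, \theta \mid m)\, d\theta
= p(D \mid m),
$$

where the first equality uses the assumption $p(D \mid \theta, m) = p(D \mid \theta)$ and the second is the product rule with everything conditioned on $m$.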











answered Nov 15 at 12:43
– Nadiels











