The formula of ScoreDoc.score in Lucene




















I want to build a search engine using Lucene. From the Lucene documentation, I see that ScoreDoc.score gives the similarity score between a document and the query.

How is this similarity score calculated?

Please help me.










      java apache search solr lucene














asked Nov 22 '18 at 16:45, edited Nov 22 '18 at 20:37
– Noran
























          1 Answer
The similarity score is calculated by the similarity model configured for the field being queried. The two I am aware of are TF-IDF and BM25.

Both use characteristics of the document and the index, such as document length, term frequency, and IDF (inverse document frequency). You could go through this link if it helps.
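To make the ingredients above concrete, here is a minimal sketch of the textbook BM25 score for a single query term, in plain Java with no Lucene dependency. The class name `Bm25Sketch`, the method names, and the sample numbers are all hypothetical, and `k1 = 1.2`, `b = 0.75` are assumed as commonly used defaults. Lucene's own BM25 implementation may differ in details (for example, how document length is stored, or constant factors depending on the version), so treat this as an illustration of the idea rather than the exact value of ScoreDoc.score.

```java
final class Bm25Sketch {
    // Assumed parameters: k1 controls term-frequency saturation,
    // b controls how strongly document length is penalized.
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Inverse document frequency: terms found in fewer documents weigh more. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Contribution of one query term to the score of one document. */
    static double termScore(double tf, long docCount, long docFreq,
                            double docLen, double avgDocLen) {
        // Documents longer than average get a larger denominator, so a lower score.
        double lengthNorm = 1.0 - B + B * (docLen / avgDocLen);
        return idf(docCount, docFreq) * (tf * (K1 + 1.0)) / (tf + K1 * lengthNorm);
    }

    public static void main(String[] args) {
        // Hypothetical index: 1000 documents, the term occurs in 10 of them,
        // and 3 times in a 100-token document (average length 120 tokens).
        System.out.println(termScore(3, 1000, 10, 100, 120));
    }
}
```

For a query with several terms, the per-term contributions are summed. Raising the term frequency increases the score only up to a ceiling, which is the main practical difference from a plain TF-IDF weighting.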

















answered Nov 23 '18 at 17:22
– Aman Tandon













• That link doesn't really explain much about how BM25 works. A much better explanation can be found at BM25 - The Next Generation of Lucene Relevation. BM25 is the default similarity in Solr these days.
  – MatsLindh
  Nov 23 '18 at 19:30

• @MatsLindh The page is not found.
  – Noran
  Nov 23 '18 at 20:30

• @AmanTandon I would like to normalize the scores in Lucene. Do you know how to do this?
  – Noran
  Nov 23 '18 at 20:36

• @Noran Please refer to github.com/apache/lucene-solr/blob/releases/lucene-solr/6.4.0/… You can normalize the score by dividing the score given by the scoring algorithm (BM25) by the max score returned by getMaxScore. However, could you explain why you want to normalize those values?
  – Aman Tandon
  Nov 24 '18 at 7:37

• @Noran Here is the corrected link that Mats provided, which gives a good explanation of BM25: opensourceconnections.com/blog/2015/10/16/…
  – Aman Tandon
  Nov 24 '18 at 7:39
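The normalization suggested in the comments (divide each score by the maximum score) could be sketched as follows. This is plain Java operating on a `float[]` that stands in for the ScoreDoc.score values of one result list; the class `ScoreNormalizer` and its method are hypothetical names, and the maximum is computed from the array itself instead of calling getMaxScore, since it is worth checking whether your Lucene version still provides that method.

```java
import java.util.Arrays;

final class ScoreNormalizer {
    /** Divides every score by the largest one, so the best hit becomes 1.0. */
    static float[] normalizeByMax(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        float[] normalized = new float[scores.length];
        if (max <= 0f) {
            return normalized; // no positive score to divide by: leave all zeros
        }
        for (int i = 0; i < scores.length; i++) {
            normalized[i] = scores[i] / max;
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Hypothetical raw scores for three hits of a single query.
        System.out.println(Arrays.toString(normalizeByMax(new float[] {4f, 2f, 1f})));
    }
}
```

Note that values normalized this way are only relative to the top hit of the same query. A normalized score of 0.5 for one query does not mean the same thing as 0.5 for another query, which may be why the comment asks what the normalization is needed for.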










































