Rank Transformation of an Array























Is there a built-in function that rank-transforms an array of data? By rank transformation I mean



data = {2.4,5,1,6,7,10,2}
Rank[data]={3,4,1,5,6,7,2}


where each value in data is assigned its rank in ascending order: the lowest value gets rank 1, the next lowest gets rank 2, etc.
Ordering does not accomplish this; instead we obtain



Ordering[data]
{3,7,1,2,4,5,6}


Edit 1: As Carl pointed out, I need to say what should happen in the case of tied ranks. Ultimately, I want to use this rank transformation to reproduce the definition of Spearman's rho, so that



Covariance[Transpose[{Rank[X], Rank[Y]}]][[1, 2]]/(
StandardDeviation[Rank[X]]*StandardDeviation[Rank[Y]])


should equal



SpearmanRho[Transpose[{X,Y}]][[1,2]]


where X and Y are arrays of data of equal length.
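As a quick sanity check of the identity above, here is a minimal sketch for tie-free data. It assumes `Ordering[Ordering[#]] &` as the rank function; the name `rank` and the sample vector `Y` are made up for illustration. With tied values, the ranks would have to be averaged within each tie group before the two sides could be expected to agree.

```mathematica
X = {2.4, 5, 1, 6, 7, 10, 2};
Y = {0.3, 4.1, 2.2, 9.5, 3.7, 8.8, 1.6};   (* made-up sample without ties *)

rank = Ordering[Ordering[#]] &;            (* assumption: valid only when there are no ties *)

(* Pearson correlation of the ranks, written out as in the question *)
lhs = Covariance[rank[X], rank[Y]]/
   (StandardDeviation[rank[X]]*StandardDeviation[rank[Y]]);
rhs = SpearmanRho[X, Y];

(* compare numerically; the difference should vanish up to rounding if the identity holds *)
Chop[N[lhs] - N[rhs]]
```

The two-argument forms `Covariance[v1, v2]` and `SpearmanRho[v1, v2]` return scalars directly, which avoids extracting the `[[1, 2]]` entry of a matrix.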


































  • What do you want to return when there are ties?
    – Carl Woll
    Nov 26 at 19:20










  • Ah, great question. Give me a moment to respond in this comment with an edit.
    – tquarton
    Nov 26 at 19:30










  • I've actually edited the question to address your point Carl.
    – tquarton
    Nov 26 at 19:38










  • closely related / possible duplicate: How to get the ranked order
    – kglr
    Nov 26 at 22:40















Tags: functions, data






Question asked Nov 26 at 18:37 by tquarton, edited Nov 26 at 19:43






















3 Answers

















Score: 1 (accepted)










Statistics`Library`GetDataRankings[{2.4, 5, 1, 6, 7, 10, 2}]



{3, 4, 1, 5, 6, 7, 2}




This gives the same result as Ordering@Ordering@#& if there are no ties in the input data.



If the input data has ties, each tied value receives the average of the ranks its group spans:



Statistics`Library`GetDataRankings[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]



{1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10}
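The tied output above is consistent with giving each group of equal values the mean of the sorted positions it occupies (2 sits at positions 2 to 5, so it gets 7/2). A minimal sketch of that rule using only documented functions follows; `averageRank` is a hypothetical name, and this is not claimed to be how `GetDataRankings` is implemented internally.

```mathematica
(* Hypothetical helper: ties get the mean of the sorted positions they occupy. *)
(* Assumes equal values are identical expressions; 2 and 2. would not be merged. *)
averageRank[x_List] := Lookup[Mean /@ PositionIndex[Sort[x]], x]

averageRank[{1, 2, 2, 2, 2, 3, 3, 3, 4, 5}]
(* {1, 7/2, 7/2, 7/2, 7/2, 7, 7, 7, 9, 10} *)

(* without ties it reduces to the plain ranking *)
averageRank[{2.4, 5, 1, 6, 7, 10, 2}]
(* {3, 4, 1, 5, 6, 7, 2} *)
```

`PositionIndex` groups the sorted positions by value in one pass, so the cost is dominated by the sort rather than by a search per element.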




It is faster than Ordering@Ordering@#& but slower than Henrik Schumacher's Ranking:



SeedRandom[1]
data = RandomReal[{-1, 1}, 1000000];
a = Ranking[data]; // RepeatedTiming // First



0.18




b = Ordering[Ordering[data]]; // RepeatedTiming // First



0.307




c = Statistics`Library`GetDataRankings[data]; // RepeatedTiming // First



0.226




a == b == c



True




A slightly faster alternative (still slower than Ranking):



ranks = Module[{r = Range@Length@#, o = Ordering@#}, Permute[r, o]] &;
d = ranks @ data; // RepeatedTiming // First



0.203




a == b == c == d



True





































    Score: 5













    What about this?



    Ordering[Ordering[data]]



    {3, 4, 1, 5, 6, 7, 2}




    Since Ordering is the bottleneck, here is a variant that needs only one call to Ordering:



    Ranking[data_] := Module[{a},
      a = Range[Length[data]];
      (* write 1, 2, ... into the positions listed by Ordering, i.e. invert the permutation *)
      a[[Ordering[data]]] = a;
      a
    ]
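To see why the single assignment works: `Ordering[data]` lists, for each rank, the position that holds it, so writing 1, 2, ... into those positions inverts the permutation. A small sketch on the question's data (the variable names `ord` and `ranks` are illustrative only):

```mathematica
data = {2.4, 5, 1, 6, 7, 10, 2};

ord = Ordering[data]
(* {3, 7, 1, 2, 4, 5, 6}: the smallest value sits at position 3, the next at 7, ... *)

ranks = ConstantArray[0, Length[data]];
ranks[[ord]] = Range[Length[data]];   (* position ord[[k]] receives rank k *)
ranks
(* {3, 4, 1, 5, 6, 7, 2} *)

(* the same inverse permutation via a built-in *)
ranks === InversePermutation[ord]
(* True *)
```

This also explains why `Ordering[Ordering[data]]` gives the same answer: applying `Ordering` to a permutation list returns its inverse, at the cost of a second sort.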


    Comparison:



    data = RandomReal[{-1, 1}, 1000000];
    a = Ranking[data]; // RepeatedTiming // First
    b = Ordering[Ordering[data]]; // RepeatedTiming // First
    a == b



    0.13



    0.234



    True






























    • Brilliant! This does it. Thanks very much.
      – tquarton
      Nov 26 at 19:13










    • You're welcome.
      – Henrik Schumacher
      Nov 26 at 19:14










    • Carl Woll brought up a great point with regards to tied rankings. I thought I'd ping you in this comment in case you wanted to address it. For my purposes, I don't believe there are ties in my dataset of interest, so your solution still holds.
      – tquarton
      Nov 26 at 19:37


















    Score: 1













    I'll answer my own question with a constructed function which does the job:



    Rank[x_]:=Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]


    Ordering gives, in a sense, the inverse of the above function: it returns the positions in the unsorted data at which the elements of the sorted data are found. The Rank function, by contrast, returns the position in the sorted data of each element of the unsorted data.
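One caveat worth noting: with tied values, `Position` returns every matching position, so the flattened result is longer than the input. Below is a minimal sketch of the issue and one possible adjustment. `rankPos` and `rankMean` are hypothetical names; `rankPos` has the same body as `Rank` above.

```mathematica
(* same definition as Rank above, under a different name *)
rankPos[x_] := Flatten[Table[Position[Sort[x], x[[i]]], {i, 1, Length[x]}]]

rankPos[{1, 2, 2, 3}]
(* {1, 2, 3, 2, 3, 4}: six entries for four inputs *)

(* one possible fix: average the matching positions for each element *)
rankMean[x_] := With[{s = Sort[x]}, Mean[Flatten[Position[s, #, {1}]]] & /@ x]

rankMean[{1, 2, 2, 3}]
(* {1, 5/2, 5/2, 4} *)
```

Both versions search the sorted list once per element, so the running time grows quadratically with the length of the input; for large data the `Ordering`-based approaches are the better choice.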





























Accepted answer (GetDataRankings): answered Nov 27 at 0:24 by kglr, edited Nov 27 at 0:31






























Answer with Ordering[Ordering[data]] and Ranking: answered Nov 26 at 19:09 by Henrik Schumacher, edited Nov 26 at 19:18




















Self-answer (Rank via Position): answered Nov 26 at 18:58 by tquarton, edited Nov 26 at 19:07





























