How to construct a list of lengths efficiently

up vote
17
down vote

favorite

Say I have a sorted list of integers

RandomInteger[{1, 100000}, 10000] // Sort // Short

I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:

Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]

This is terribly inefficient, but for some reason I cannot come up with a better a approach. What's a better way to accomplish this? This seems to be a fairly standard exercise, so there should be plenty of duplicates, but I can find none.
I am probably missing a key word...

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

1

What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
Nov 13 at 13:08

add a comment |

up vote
17
down vote

favorite

Say I have a sorted list of integers

RandomInteger[{1, 100000}, 10000] // Sort // Short

I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:

Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

1

What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
Nov 13 at 13:08

add a comment |

up vote
17
down vote

favorite

Say I have a sorted list of integers

RandomInteger[{1, 100000}, 10000] // Sort // Short

I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:

Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

Say I have a sorted list of integers

RandomInteger[{1, 100000}, 10000] // Sort // Short

I want to construct another list whose $m$-th element is the number of elements in the original list that are less than or equal to $m$:

Table[Length@Select[%, LessEqualThan[m]], {m, 10000}]

list-manipulation performance-tuning filtering

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

edited Nov 13 at 12:30

Alexey Popkov

38k4104260

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

asked Nov 13 at 1:00

AccidentalFourierTransform

4,9681941

1

What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
Nov 13 at 13:08

add a comment |

1

What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
Nov 13 at 13:08

What do you want to do with the table? What's your original problem? It sounds like you want to build some kind of CDF table manually. Maybe an EmpiricalDistribution or BinCounts already can accomplish what you want?
– Thies Heidecke
Nov 13 at 13:08

add a comment |

4 Answers
4

active

oldest

votes

up vote
20
down vote

accepted

You can use the usual UnitStep + Total tricks:

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming



r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming



r1 === r2

{0.435358, Null}

{41.4357, Null}

True

Update

As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:

mincounts[s_] := With[

    {

    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],

    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]

    },

    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},

        counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]

    ]

]

Comparison:

SeedRandom[1];

s=RandomInteger[{1,100000},10000]//Sort;



(* my first answer *)

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

(* J42161217's answer *)

r2 = Flatten[

    Join[

        {Table[0, s[[1]] - 1]}, 

        Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]

    ]

][[;;10000]]; // AbsoluteTiming

(* using Nearest *)

r3 = mincounts[s]; //AbsoluteTiming



r1 === r2 === r3

{0.432897, Null}

{0.122198, Null}

{0.025923, Null}

True

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

add a comment |

up vote
13
down vote

BinCounts and Accumulate combination is faster than all the methods posted so far:

r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First

0.00069

versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:

r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[

      1 ;; Length[s]]]

      ]; // RepeatedTiming // First

0.00081

r3 = mincounts[s]; // RepeatedTiming // First

0.016

r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

             Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

0.149

r2 == r3 == r4 == r5

True

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

1

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

1

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

add a comment |

up vote
12
down vote

I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?

r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

Table[Table[i, Differences[s][[i]]], {i, 

    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming

{0.157123, Null}

Using MapThread the same code is way faster

r6 = Flatten[

 Join[{Table[0, s[[1]] - 1]}, 

  MapThread[

   Table, {Range[t = Length[Select[s, # <= 10000 &]]], 

    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming   



r6===r3

{0.008387, Null}

True

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

1

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

add a comment |

up vote
5
down vote

s = Sort[RandomInteger[{1, 100000}, 10000]];

Let us just imagine for the moment that the target list r is supposed to have length 100000 (we can truncate it afterwards). Then for each entry i in the list s, the list r needs to have a jump of height 1 at position i. The jumps are the "derivative" of r (in a discrete sense) and the antiderivative can be obtained with Accumulate. In order to get the list of jumps, we need additive matrix assembly.

This can be done with this function:

mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 

 If[(Head[rules] === Rule) && (rules[[1]] === {}),

  rules[[2]],

  With[{spopt = SystemOptions["SparseArrayOptions"]},

   Internal`WithLocalSettings[

    SetSystemOptions[

     "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],

    SparseArray[rules, dims, background],

    SetSystemOptions[spopt]]

   ]

  ]

So, in total, r can be obtained as follows:

r4 = Accumulate[

 mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]

 ]; // RepeatedTiming // First

0.00055

For comparison:

r3 = Flatten[

      Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

         Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

r3 == r4

0.116

True

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "387"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f185874%2fhow-to-construct-a-list-of-lengths-efficiently%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

4 Answers
4

active

oldest

votes

4 Answers
4

active

oldest

votes

up vote
20
down vote

accepted

You can use the usual UnitStep + Total tricks:

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming



r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming



r1 === r2

{0.435358, Null}

{41.4357, Null}

True

Update

As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:

mincounts[s_] := With[

    {

    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],

    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]

    },

    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},

        counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]

    ]

]

Comparison:

SeedRandom[1];

s=RandomInteger[{1,100000},10000]//Sort;



(* my first answer *)

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

(* J42161217's answer *)

r2 = Flatten[

    Join[

        {Table[0, s[[1]] - 1]}, 

        Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]

    ]

][[;;10000]]; // AbsoluteTiming

(* using Nearest *)

r3 = mincounts[s]; //AbsoluteTiming



r1 === r2 === r3

{0.432897, Null}

{0.122198, Null}

{0.025923, Null}

True

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

add a comment |

up vote
20
down vote

accepted

You can use the usual UnitStep + Total tricks:

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming



r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming



r1 === r2

{0.435358, Null}

{41.4357, Null}

True

Update

As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:

mincounts[s_] := With[

    {

    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],

    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]

    },

    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},

        counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]

    ]

]

Comparison:

SeedRandom[1];

s=RandomInteger[{1,100000},10000]//Sort;



(* my first answer *)

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

(* J42161217's answer *)

r2 = Flatten[

    Join[

        {Table[0, s[[1]] - 1]}, 

        Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]

    ]

][[;;10000]]; // AbsoluteTiming

(* using Nearest *)

r3 = mincounts[s]; //AbsoluteTiming



r1 === r2 === r3

{0.432897, Null}

{0.122198, Null}

{0.025923, Null}

True

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

add a comment |

up vote
20
down vote

accepted

You can use the usual UnitStep + Total tricks:

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming



r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming



r1 === r2

{0.435358, Null}

{41.4357, Null}

True

Update

As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:

mincounts[s_] := With[

    {

    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],

    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]

    },

    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},

        counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]

    ]

]

Comparison:

SeedRandom[1];

s=RandomInteger[{1,100000},10000]//Sort;



(* my first answer *)

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

(* J42161217's answer *)

r2 = Flatten[

    Join[

        {Table[0, s[[1]] - 1]}, 

        Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]

    ]

][[;;10000]]; // AbsoluteTiming

(* using Nearest *)

r3 = mincounts[s]; //AbsoluteTiming



r1 === r2 === r3

{0.432897, Null}

{0.122198, Null}

{0.025923, Null}

True

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

You can use the usual UnitStep + Total tricks:

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming



r2 = Table[Length@Select[s,LessEqualThan[m]],{m,10000}];//AbsoluteTiming



r1 === r2

{0.435358, Null}

{41.4357, Null}

True

Update

As @J42161217 points out, you can take advantage of the fact that the data is sorted to speed things up. He used Differences. Here is a version that uses Nearest instead:

mincounts[s_] := With[

    {

    unique = DeleteDuplicates@Nearest[s->"Element",s][[All,-1]],

    counts = Prepend[0] @ DeleteDuplicates@Nearest[s->"Index",s][[All,-1]]

    },

    With[{near = Nearest[unique->"Index", Range @ Length @ s][[All,1]]},

        counts[[1+near-UnitStep[unique[[near]]-Range@Length@s-1]]]

    ]

]

Comparison:

SeedRandom[1];

s=RandomInteger[{1,100000},10000]//Sort;



(* my first answer *)

r1 = Table[Total[UnitStep[m-s]], {m,10000}]; //AbsoluteTiming

(* J42161217's answer *)

r2 = Flatten[

    Join[

        {Table[0, s[[1]] - 1]}, 

        Table[Table[i, Differences[s][[i]]], {i, Length[Select[s, # <= 10000 &]]}]

    ]

][[;;10000]]; // AbsoluteTiming

(* using Nearest *)

r3 = mincounts[s]; //AbsoluteTiming



r1 === r2 === r3

{0.432897, Null}

{0.122198, Null}

{0.025923, Null}

True

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

edited Nov 13 at 4:11

answered Nov 13 at 1:11

Carl Woll

65.2k285171

answered Nov 13 at 1:11

Carl Woll

65.2k285171

answered Nov 13 at 1:11

Carl Woll

65.2k285171

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

add a comment |

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

Ah, great answer, as usual.
– AccidentalFourierTransform
Nov 13 at 1:19

Can you please check my answer because my laptop is very slow
– J42161217
Nov 13 at 3:05

add a comment |

up vote
13
down vote

BinCounts and Accumulate combination is faster than all the methods posted so far:

r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First

0.00069

versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:

r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[

      1 ;; Length[s]]]

      ]; // RepeatedTiming // First

0.00081

r3 = mincounts[s]; // RepeatedTiming // First

0.016

r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

             Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

0.149

r2 == r3 == r4 == r5

True

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

1

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

1

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

add a comment |

up vote
13
down vote

BinCounts and Accumulate combination is faster than all the methods posted so far:

r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First

0.00069

versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:

r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[

      1 ;; Length[s]]]

      ]; // RepeatedTiming // First

0.00081

r3 = mincounts[s]; // RepeatedTiming // First

0.016

r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

             Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

0.149

r2 == r3 == r4 == r5

True

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

1

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

1

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

add a comment |

up vote
13
down vote

BinCounts and Accumulate combination is faster than all the methods posted so far:

r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First

0.00069

versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:

r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[

      1 ;; Length[s]]]

      ]; // RepeatedTiming // First

0.00081

r3 = mincounts[s]; // RepeatedTiming // First

0.016

r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

             Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

0.149

r2 == r3 == r4 == r5

True

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

BinCounts and Accumulate combination is faster than all the methods posted so far:

r4 = Accumulate @ BinCounts[s, {1, 1 + 10000, 1}]; // RepeatedTiming // First

0.00069

versus Henrik Schumacher's mySparseArray, Carl Woll's mincounts and J42161217's Differences-based method:

r5 = Accumulate[mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[

      1 ;; Length[s]]]

      ]; // RepeatedTiming // First

0.00081

r3 = mincounts[s]; // RepeatedTiming // First

0.016

r2 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

             Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

0.149

r2 == r3 == r4 == r5

True

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

edited Nov 13 at 10:10

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:31

kglr

171k8194399

answered Nov 13 at 6:31

kglr

171k8194399

answered Nov 13 at 6:31

kglr

171k8194399

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

1

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

1

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

add a comment |

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

1

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

1

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

Beat me to it - BinCounts is the way... +1
– ciao
Nov 13 at 6:49

Hey @ciao, you are back?!!
– kglr
Nov 13 at 6:53

Sorry @Henrik; thanks for the edit.
– kglr
Nov 13 at 10:23

Short and fast. No need sorting.. Excellent!!... +1
– Okkes Dulgerci
Nov 13 at 15:14

@kglr - Been here all along, just busy herding a couple of startups, but I read several times a week, and usually learn something new each time.
– ciao
Nov 13 at 22:42

add a comment |

up vote
12
down vote

I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?

r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

Table[Table[i, Differences[s][[i]]], {i, 

    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming

{0.157123, Null}

Using MapThread the same code is way faster

r6 = Flatten[

 Join[{Table[0, s[[1]] - 1]}, 

  MapThread[

   Table, {Range[t = Length[Select[s, # <= 10000 &]]], 

    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming   



r6===r3

{0.008387, Null}

True

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

1

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

add a comment |

up vote
12
down vote

I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?

r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

Table[Table[i, Differences[s][[i]]], {i, 

    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming

{0.157123, Null}

Using MapThread the same code is way faster

r6 = Flatten[

 Join[{Table[0, s[[1]] - 1]}, 

  MapThread[

   Table, {Range[t = Length[Select[s, # <= 10000 &]]], 

    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming   



r6===r3

{0.008387, Null}

True

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

1

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

add a comment |

up vote
12
down vote

I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?

r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

Table[Table[i, Differences[s][[i]]], {i, 

    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming

{0.157123, Null}

Using MapThread the same code is way faster

r6 = Flatten[

 Join[{Table[0, s[[1]] - 1]}, 

  MapThread[

   Table, {Range[t = Length[Select[s, # <= 10000 &]]], 

    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming   



r6===r3

{0.008387, Null}

True

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

I think this is at least x3 faster than Mr. Carl Woll's answer

Can anybody compare my timing?

r3 = Flatten[Join[{Table[0, s[[1]] - 1]}, 

Table[Table[i, Differences[s][[i]]], {i, 

    Length[Select[s, # <= 10000 &]]}]]][[;;10000]]; // AbsoluteTiming

{0.157123, Null}

Using MapThread the same code is way faster

r6 = Flatten[

 Join[{Table[0, s[[1]] - 1]}, 

  MapThread[

   Table, {Range[t = Length[Select[s, # <= 10000 &]]], 

    Differences[s][[1 ;; t]]}]]][[;; 10000]]; // AbsoluteTiming   



r6===r3

{0.008387, Null}

True

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

edited Nov 13 at 12:57

answered Nov 13 at 3:04

J42161217

2,993219

answered Nov 13 at 3:04

J42161217

2,993219

answered Nov 13 at 3:04

J42161217

2,993219

1

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

add a comment |

1

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

These are the timing on my laptop r1={0.444354, Null},r2={39.456, Null},r3={0.157123, Null} True
– Okkes Dulgerci
Nov 13 at 3:55

Hey, thanks for checking. I will use your timing. Thanks for the confirmation
– J42161217
Nov 13 at 4:01

add a comment |

up vote
5
down vote

s = Sort[RandomInteger[{1, 100000}, 10000]];

This can be done with this function:

mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 

 If[(Head[rules] === Rule) && (rules[[1]] === {}),

  rules[[2]],

  With[{spopt = SystemOptions["SparseArrayOptions"]},

   Internal`WithLocalSettings[

    SetSystemOptions[

     "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],

    SparseArray[rules, dims, background],

    SetSystemOptions[spopt]]

   ]

  ]

So, in total, r can be obtained as follows:

r4 = Accumulate[

 mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]

 ]; // RepeatedTiming // First

0.00055

For comparison:

r3 = Flatten[

      Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

         Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

r3 == r4

0.116

True

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

add a comment |

up vote
5
down vote

s = Sort[RandomInteger[{1, 100000}, 10000]];

This can be done with this function:

mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 

 If[(Head[rules] === Rule) && (rules[[1]] === {}),

  rules[[2]],

  With[{spopt = SystemOptions["SparseArrayOptions"]},

   Internal`WithLocalSettings[

    SetSystemOptions[

     "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],

    SparseArray[rules, dims, background],

    SetSystemOptions[spopt]]

   ]

  ]

So, in total, r can be obtained as follows:

r4 = Accumulate[

 mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]

 ]; // RepeatedTiming // First

0.00055

For comparison:

r3 = Flatten[

      Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

         Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

r3 == r4

0.116

True

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

add a comment |

up vote
5
down vote

s = Sort[RandomInteger[{1, 100000}, 10000]];

This can be done with this function:

mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 

 If[(Head[rules] === Rule) && (rules[[1]] === {}),

  rules[[2]],

  With[{spopt = SystemOptions["SparseArrayOptions"]},

   Internal`WithLocalSettings[

    SetSystemOptions[

     "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],

    SparseArray[rules, dims, background],

    SetSystemOptions[spopt]]

   ]

  ]

So, in total, r can be obtained as follows:

r4 = Accumulate[

 mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]

 ]; // RepeatedTiming // First

0.00055

For comparison:

r3 = Flatten[

      Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

         Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

r3 == r4

0.116

True

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

s = Sort[RandomInteger[{1, 100000}, 10000]];

This can be done with this function:

mySparseArray[rules_, dims_, f_: Total, background_: 0.] := 

 If[(Head[rules] === Rule) && (rules[[1]] === {}),

  rules[[2]],

  With[{spopt = SystemOptions["SparseArrayOptions"]},

   Internal`WithLocalSettings[

    SetSystemOptions[

     "SparseArrayOptions" -> {"TreatRepeatedEntries" -> f}],

    SparseArray[rules, dims, background],

    SetSystemOptions[spopt]]

   ]

  ]

So, in total, r can be obtained as follows:

r4 = Accumulate[

 mySparseArray[Partition[s, 1] -> 1, {s[[-1]]}, Total, 0][[1 ;; Length[s]]]

 ]; // RepeatedTiming // First

0.00055

For comparison:

r3 = Flatten[

      Join[{Table[0, s[[1]] - 1]}, 

       Table[Table[i, Differences[s][[i]]], {i, 

         Length[Select[s, # <= 10000 &]]}]]][[;; 10000]]; // 

  RepeatedTiming // First

r3 == r4

0.116

True

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

edited Nov 13 at 23:11

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

answered Nov 13 at 6:32

Henrik Schumacher

45.1k365131

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Cfrgtkky