python numpy vectorize an array of object instances











up vote
0
down vote

favorite












I'd like to encapsulate my calc function and all its parameters inside an object, but vectorize the execution for millions of objects much like how numpy would do it. Any suggestions?



the calculation is still basic arithmetic which numpy should be able to vectorize.



Example code:



import numpy as np
myarray = np.random.rand(3, 10000000)

############################# This works fine: FAST ###################################

def calc(a,b,c):
return (a+b/c)**b/a


res1 = calc(*myarray) #0.7 seconds

############################# What I'd like to do (unsuccessfully): SLOW ###################################

class MyClass():
__slots__ = ['a','b','c']

def __init__(self, a,b,c):
self.a, self.b, self.c = a,b,c

def calc(self):
return (self.a + self.b / self.c) ** self.b / self.a

def classCalc(myClass:MyClass):
return myClass.calc()

vectorizedClassCalc = np.vectorize(classCalc)
myobjects = np.array([MyClass(*args) for args in myarray.transpose()])


res2 = vectorizedClassCalc(myobjects) #8 seconds no different from a list comprehension
res3 = [obj.calc() for obj in myobjects] #7.5 seconds


perhaps pandas has additional features?










share|improve this question






















  • This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
    – user8408080
    Nov 13 at 14:29






  • 1




    Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
    – hpaulj
    Nov 13 at 14:46






  • 1




    hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
    – Matt Messersmith
    Nov 13 at 15:13












  • Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
    – anishtain4
    Nov 13 at 15:20










  • An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
    – hpaulj
    Nov 13 at 15:32















up vote
0
down vote

favorite












I'd like to encapsulate my calc function and all its parameters inside an object, but vectorize the execution for millions of objects much like how numpy would do it. Any suggestions?



the calculation is still basic arithmetic which numpy should be able to vectorize.



Example code:



import numpy as np
myarray = np.random.rand(3, 10000000)

############################# This works fine: FAST ###################################

def calc(a,b,c):
return (a+b/c)**b/a


res1 = calc(*myarray) #0.7 seconds

############################# What I'd like to do (unsuccessfully): SLOW ###################################

class MyClass():
__slots__ = ['a','b','c']

def __init__(self, a,b,c):
self.a, self.b, self.c = a,b,c

def calc(self):
return (self.a + self.b / self.c) ** self.b / self.a

def classCalc(myClass:MyClass):
return myClass.calc()

vectorizedClassCalc = np.vectorize(classCalc)
myobjects = np.array([MyClass(*args) for args in myarray.transpose()])


res2 = vectorizedClassCalc(myobjects) #8 seconds no different from a list comprehension
res3 = [obj.calc() for obj in myobjects] #7.5 seconds


perhaps pandas has additional features?










share|improve this question






















  • This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
    – user8408080
    Nov 13 at 14:29






  • 1




    Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
    – hpaulj
    Nov 13 at 14:46






  • 1




    hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
    – Matt Messersmith
    Nov 13 at 15:13












  • Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
    – anishtain4
    Nov 13 at 15:20










  • An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
    – hpaulj
    Nov 13 at 15:32













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I'd like to encapsulate my calc function and all its parameters inside an object, but vectorize the execution for millions of objects much like how numpy would do it. Any suggestions?



the calculation is still basic arithmetic which numpy should be able to vectorize.



Example code:



import numpy as np
myarray = np.random.rand(3, 10000000)

############################# This works fine: FAST ###################################

def calc(a,b,c):
return (a+b/c)**b/a


res1 = calc(*myarray) #0.7 seconds

############################# What I'd like to do (unsuccessfully): SLOW ###################################

class MyClass():
__slots__ = ['a','b','c']

def __init__(self, a,b,c):
self.a, self.b, self.c = a,b,c

def calc(self):
return (self.a + self.b / self.c) ** self.b / self.a

def classCalc(myClass:MyClass):
return myClass.calc()

vectorizedClassCalc = np.vectorize(classCalc)
myobjects = np.array([MyClass(*args) for args in myarray.transpose()])


res2 = vectorizedClassCalc(myobjects) #8 seconds no different from a list comprehension
res3 = [obj.calc() for obj in myobjects] #7.5 seconds


perhaps pandas has additional features?










share|improve this question













I'd like to encapsulate my calc function and all its parameters inside an object, but vectorize the execution for millions of objects much like how numpy would do it. Any suggestions?



the calculation is still basic arithmetic which numpy should be able to vectorize.



Example code:



import numpy as np
myarray = np.random.rand(3, 10000000)

############################# This works fine: FAST ###################################

def calc(a,b,c):
return (a+b/c)**b/a


res1 = calc(*myarray) #0.7 seconds

############################# What I'd like to do (unsuccessfully): SLOW ###################################

class MyClass():
__slots__ = ['a','b','c']

def __init__(self, a,b,c):
self.a, self.b, self.c = a,b,c

def calc(self):
return (self.a + self.b / self.c) ** self.b / self.a

def classCalc(myClass:MyClass):
return myClass.calc()

vectorizedClassCalc = np.vectorize(classCalc)
myobjects = np.array([MyClass(*args) for args in myarray.transpose()])


res2 = vectorizedClassCalc(myobjects) #8 seconds no different from a list comprehension
res3 = [obj.calc() for obj in myobjects] #7.5 seconds


perhaps pandas has additional features?







pandas class numpy object vectorization






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 13 at 13:41









user1441053

133126




133126












  • This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
    – user8408080
    Nov 13 at 14:29






  • 1




    Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
    – hpaulj
    Nov 13 at 14:46






  • 1




    hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
    – Matt Messersmith
    Nov 13 at 15:13












  • Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
    – anishtain4
    Nov 13 at 15:20










  • An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
    – hpaulj
    Nov 13 at 15:32


















  • This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
    – user8408080
    Nov 13 at 14:29






  • 1




    Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
    – hpaulj
    Nov 13 at 14:46






  • 1




    hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
    – Matt Messersmith
    Nov 13 at 15:13












  • Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
    – anishtain4
    Nov 13 at 15:20










  • An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
    – hpaulj
    Nov 13 at 15:32
















This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
– user8408080
Nov 13 at 14:29




This creates a lot of overhead because now Python has to deal with millions of different instances. You really should redesign you solution as I don't think, this will get much faster
– user8408080
Nov 13 at 14:29




1




1




Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
– hpaulj
Nov 13 at 14:46




Your array will consist of pointers to the objects elsewhere in memory, much like a list. Iteration over such an array is like a list comprehension,and no where as fast as the compiled numpy code for numeric dtypes. Search for np.frompyfunc for past discussions on this topic.
– hpaulj
Nov 13 at 14:46




1




1




hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
– Matt Messersmith
Nov 13 at 15:13






hpaulj is pretty spot on as to why you're seeing a slow down: good access patterns over data and contiguous memory are a very large part of why numpy is so fast. Another thing to consider is why you want to use OOP. The function that you have looks perfectly reasonable. A lot of scientific code (especially numpy/scipy heavy code) simply doesn't use OOP, or uses it sparingly when it's convenient or the right tool for the job. OOP is a tool: nothing more. But as an aside, consider just leaving your calc function as is and just pass it in your array: it looks fine as-is.
– Matt Messersmith
Nov 13 at 15:13














Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
– anishtain4
Nov 13 at 15:20




Are you not making a single object that contains the big arrays because you're not sure how many instances you have?
– anishtain4
Nov 13 at 15:20












An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
– hpaulj
Nov 13 at 15:32




An example from a couple of weeks ago: stackoverflow.com/q/53034280/901925
– hpaulj
Nov 13 at 15:32

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282346%2fpython-numpy-vectorize-an-array-of-object-instances%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282346%2fpython-numpy-vectorize-an-array-of-object-instances%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

How to change which sound is reproduced for terminal bell?

Title Spacing in Bjornstrup Chapter, Removing Chapter Number From Contents

Can I use Tabulator js library in my java Spring + Thymeleaf project?