Quadratic n term equation using multiindex


I have two DFs which I would like to use to calculate the following:

w(ti,ti)*a(ti)^2 + w(tj,tj)*b(tj)^2 + 2*w(ti,tj)*a(ti)*b(tj)

The above uses two terms (a, b), the values found at Tn labels ti and tj.
w is the weight DataFrame: its index and columns are both the Tn labels, so w(ti,tj) is the weight in row ti, column tj.

Set Up - Edit dynamic W

import pandas as pd
import numpy as np

I = ['i'+ str(i) for i in range(4)]
Q = ['q' + str(i) for i in range(5)]
T = ['t' + str(i) for i in range(3)]
n = 100

df1 = pd.DataFrame({'I': [I[np.random.randint(len(I))] for i in range(n)],
                    'Q': [Q[np.random.randint(len(Q))] for i in range(n)],
                    'Tn': [T[np.random.randint(len(T))] for i in range(n)],
                    'V': np.random.rand(n)}).groupby(['I','Q','Tn']).sum()

df1.head(5)

I  Q  Tn  V
i0 q0 t0  1.626799
      t2  1.725374
   q1 t0  2.155340
      t1  0.479741
      t2  1.039178

w = np.random.randn(len(T),len(T))
w = (w*w.T)/2
np.fill_diagonal(w,1)
W = pd.DataFrame(w, columns = T, index = T)

W

          t0        t1        t2
t0  1.000000  0.029174 -0.045754
t1  0.029174  1.000000  0.233330
t2 -0.045754  0.233330  1.000000

Effectively, I would like to apply the above equation within every (I, Q) group, using that group's Tn index in df1 to look up the weights in W.

The end result for df1.loc['i0','q0'] in the example above should be:

  W(t0,t0) * V(t0)^2
+ W(t2,t2) * V(t2)^2
+ 2 * W(t0,t2) * V(t0) * V(t2)
=
  1.0 * 1.626799**2
+ 1.0 * 1.725374**2
+ 2 * (-0.045754) * 1.626799 * 1.725374

The end result for df1.loc['i0','q1'] in the example above should be:

  W(t0,t0) * V(t0)^2
+ W(t1,t1) * V(t1)^2
+ W(t2,t2) * V(t2)^2
+ 2 * W(t0,t1) * V(t0) * V(t1)
+ 2 * W(t0,t2) * V(t0) * V(t2)
+ 2 * W(t1,t2) * V(t1) * V(t2)
=
  1.0 * 2.155340**2
+ 1.0 * 0.479741**2
+ 1.0 * 1.039178**2
+ 2 * 0.029174 * 2.155340 * 0.479741
+ 2 * (-0.045754) * 2.155340 * 1.039178
+ 2 * 0.233330 * 0.479741 * 1.039178

This pattern repeats for however many Tn terms each Q has, so the solution should be robust enough to handle any number of Tn terms (the example uses 3, but there could be 100 or more).

Each result should then be saved in a new DataFrame with index [I, Q].
The solution should also be no slower than Excel as n increases.
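For reference, both expansions above are instances of the quadratic form sum over i and j of W(ti,tj) * V(ti) * V(tj), restricted to the Tn labels present in each (I, Q) group. A minimal loop-based sketch of that definition follows. It is meant as a slow baseline for checking faster solutions, not as the recommended approach, and the helper name `quadratic_by_group` is hypothetical.

```python
import numpy as np
import pandas as pd

def quadratic_by_group(df, W, value_col="V"):
    """Hypothetical helper: v' W v for every (I, Q) group, one group at a time."""
    out = {}
    for key, g in df.groupby(level=["I", "Q"]):
        tn = g.index.get_level_values("Tn")   # Tn labels present in this group
        v = g[value_col].to_numpy()
        w = W.loc[tn, tn].to_numpy()          # sub-matrix of W for those labels
        out[key] = float(v @ w @ v)
    return pd.Series(out).rename_axis(["I", "Q"])

# Values copied from the df1.head(5) and W printouts in the question
T = ["t0", "t1", "t2"]
W = pd.DataFrame([[1.0, 0.029174, -0.045754],
                  [0.029174, 1.0, 0.233330],
                  [-0.045754, 0.233330, 1.0]], index=T, columns=T)
idx = pd.MultiIndex.from_tuples(
    [("i0", "q0", "t0"), ("i0", "q0", "t2"),
     ("i0", "q1", "t0"), ("i0", "q1", "t1"), ("i0", "q1", "t2")],
    names=["I", "Q", "Tn"])
df1 = pd.DataFrame({"V": [1.626799, 1.725374, 2.155340, 0.479741, 1.039178]},
                   index=idx)

result = quadratic_by_group(df1, W)
print(result)
```

Because W is symmetric, each off-diagonal pair is counted twice by `v @ w @ v`, which is where the factor 2 on the cross terms comes from.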

Thanks in advance

edited Nov 25 '18 at 8:36

asked Nov 22 '18 at 21:33

RealRageDontQuit

Your equation implies the value 'w' is the same for all three terms, but it is not. Maybe you should rename them and describe how they relate to, or are derived from, the df1 indices. Make it easier for your readers.

– wwii
Nov 22 '18 at 21:52

df1.loc['i0','q0'] has three Tn's. How does it work?

– wwii
Nov 22 '18 at 22:03

Isn't W supposed to be symmetric? If not, how do I know which factor to use between W.loc['t3','t4'] and W.loc['t4','t3'] for the example you give? You use the first one, but why?

– Ben.T
Nov 22 '18 at 22:09

I have changed the question to correspond with the comments.

– RealRageDontQuit
Nov 22 '18 at 22:42


python numpy dataframe multi-index quadratic


1 Answer

One way is to first reindex your dataframe df1 with every possible combination of the lists I, Q and T, built with pd.MultiIndex.from_product, filling the missing values in the column 'V' with 0. The column then has len(I)*len(Q)*len(T) elements, so you can reshape the values so that each row corresponds to one combination of I and Q:

ar = (df1.reindex(pd.MultiIndex.from_product([I,Q,T], names=['I','Q','Tn']),fill_value=0)
         .values.reshape(-1,len(T)))

To see the relation between my input df1 and ar, here are the corresponding rows:

print (df1.head(6))
                 V
I  Q  Tn
i0 q0 t1  1.123666
   q1 t0  0.538610
      t1  2.943206
   q2 t0  0.570990
      t1  0.617524
      t2  1.413926

print (ar[:3])
[[0.         1.1236656  0.        ]
 [0.53861027 2.94320574 0.        ]
 [0.57099049 0.61752408 1.4139263 ]]

Now, to multiply by the elements of W, one way is to take the outer product of each row of ar with itself, which gives a len(T)*len(T) matrix per row. For example, the second row:

[0.53861027 2.94320574 0.        ]

becomes

[[0.29010102, 1.58524083, 0.        ], #0.29010102 = 0.53861027**2, 1.58524083 = 0.53861027*2.94320574 ...
 [1.58524083, 8.66246003, 0.        ],
 [0.        , 0.        , 0.        ]]

Several methods are possible, such as ar[:,:,None]*ar[:,None,:] or np.einsum with the right subscripts: np.einsum('ij,ik->ijk',ar,ar). Both give the same result.
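As a quick check of that equivalence, the following sketch builds the row-wise outer product both ways on the three rows of ar printed above and compares them with np.outer for the second row. The variable names `outer_bc` and `outer_es` are only illustrative.

```python
import numpy as np

# The three rows of ar shown in the answer
ar = np.array([[0.0, 1.1236656, 0.0],
               [0.53861027, 2.94320574, 0.0],
               [0.57099049, 0.61752408, 1.4139263]])

# Broadcasting: (rows, T, 1) * (rows, 1, T) -> (rows, T, T)
outer_bc = ar[:, :, None] * ar[:, None, :]

# einsum: keep i as the batch axis, form every j-k product within a row
outer_es = np.einsum('ij,ik->ijk', ar, ar)

print(outer_bc.shape)                     # one T x T matrix per row of ar
print(np.allclose(outer_bc, outer_es))    # both constructions agree
print(outer_bc[1])                        # outer product of the second row
```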

The next step can be done with np.tensordot, specifying the right axes. With ar and W as inputs:

print (np.tensordot(np.einsum('ij,ik->ijk',ar,ar),W.values,axes=([1,2],[0,1])))
array([ 1.26262437, 15.29352438, 15.94605435, ...

To check the second value: 1*0.29010102 + 1*8.66246003 + 2.*2*1.58524083 = 15.29352438 up to rounding (where 1 is W(t0,t0) and W(t1,t1), and 2 is W(t0,t1); the leading 2. accounts for both the (t0,t1) and (t1,t0) entries).
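Since the question mentions 100 or more Tn terms, it may be worth noting that the same numbers can be computed without materializing the (rows, len(T), len(T)) intermediate. The sketch below is an alternative, not part of the original answer. It compares the tensordot route with a single einsum and with a plain matrix product. The W used here is hypothetical: only W(t0,t0) = W(t1,t1) = 1 and W(t0,t1) = 2 are taken from the check above, and the remaining entries are made up.

```python
import numpy as np

ar = np.array([[0.0, 1.1236656, 0.0],
               [0.53861027, 2.94320574, 0.0],
               [0.57099049, 0.61752408, 1.4139263]])

# Hypothetical symmetric W; the t2 row and column are invented for illustration
W = np.array([[1.0, 2.0, 0.5],
              [2.0, 1.0, -0.3],
              [0.5, -0.3, 1.0]])

# Route used in the answer: explicit outer products, then contract with W
via_tensordot = np.tensordot(np.einsum('ij,ik->ijk', ar, ar), W,
                             axes=([1, 2], [0, 1]))

# Same contraction expressed as one einsum call
via_einsum = np.einsum('ij,jk,ik->i', ar, W, ar)

# Same contraction as (ar @ W) * ar summed per row; the largest
# intermediate here has the shape of ar, not (rows, T, T)
via_matmul = ((ar @ W) * ar).sum(axis=1)

print(via_tensordot[:2])
print(np.allclose(via_tensordot, via_einsum), np.allclose(via_tensordot, via_matmul))
```

The first two results do not depend on the invented entries, because the first two rows of ar are zero in the t2 column.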

Finally, to create the dataframe as expected, use pd.MultiIndex.from_product again:

new_df = pd.DataFrame({'col1': np.tensordot(np.einsum('ij,ik->ijk',ar,ar),
                                            W.values,axes=([1,2],[0,1]))},
                      index=pd.MultiIndex.from_product([I,Q], names=['I','Q']))

print (new_df.head(3))
            col1
I  Q
i0 q0   1.262624
   q1  15.293524
   q2  15.946054
...
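Putting the steps together, here is a sketch of the whole pipeline wrapped in a function and spot-checked against the direct definition for one group. The function name `quadratic_form_table` and the seeded generator are assumptions added for reproducibility. The data follows the set-up in the question.

```python
import numpy as np
import pandas as pd

def quadratic_form_table(df1, W, I, Q, T, value_col="V"):
    """Hypothetical wrapper around the reindex / outer product / tensordot steps."""
    full = pd.MultiIndex.from_product([I, Q, T], names=["I", "Q", "Tn"])
    # Dense (len(I)*len(Q), len(T)) array, zeros where a Tn label is absent
    ar = df1[value_col].reindex(full, fill_value=0).to_numpy().reshape(-1, len(T))
    w = W.reindex(index=T, columns=T).to_numpy()   # align W with the order of T
    vals = np.tensordot(np.einsum("ij,ik->ijk", ar, ar), w, axes=([1, 2], [0, 1]))
    return pd.DataFrame({"col1": vals},
                        index=pd.MultiIndex.from_product([I, Q], names=["I", "Q"]))

rng = np.random.default_rng(0)                     # seed chosen arbitrarily
I = ["i" + str(i) for i in range(4)]
Q = ["q" + str(i) for i in range(5)]
T = ["t" + str(i) for i in range(3)]
n = 100
df1 = (pd.DataFrame({"I": rng.choice(I, n), "Q": rng.choice(Q, n),
                     "Tn": rng.choice(T, n), "V": rng.random(n)})
         .groupby(["I", "Q", "Tn"]).sum())

w = rng.standard_normal((len(T), len(T)))
w = (w * w.T) / 2                                  # symmetric, as in the question
np.fill_diagonal(w, 1)
W = pd.DataFrame(w, columns=T, index=T)

new_df = quadratic_form_table(df1, W, I, Q, T)

# Spot check: first (I, Q) group present in df1, computed directly as v' W v
key = df1.index[0][:2]
v = df1.xs(key, level=["I", "Q"])["V"]
expected = v.to_numpy() @ W.loc[v.index, v.index].to_numpy() @ v.to_numpy()
print(key, new_df.loc[key, "col1"], expected)
```

(I, Q) combinations that never occur in df1 end up with a value of 0 in this layout, since their whole row of ar is zero.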

Note: if you are SURE that each element of T appears at least once in the last level of df1, ar can be obtained with unstack, e.g. ar=df1.unstack(fill_value=0).values. But I would suggest using the reindex method above to prevent any error.
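To illustrate the caveat in that note, here is a small made-up example in which the label t2 never occurs: unstack then yields only two columns, which no longer line up with a 3x3 W. Reindexing the columns against T restores the expected shape. The data and variable names are hypothetical.

```python
import pandas as pd

T = ["t0", "t1", "t2"]
idx = pd.MultiIndex.from_tuples(
    [("i0", "q0", "t0"), ("i0", "q1", "t0"), ("i0", "q1", "t1")],
    names=["I", "Q", "Tn"])
df1 = pd.DataFrame({"V": [1.0, 2.0, 3.0]}, index=idx)   # note: no t2 rows at all

wide = df1["V"].unstack(fill_value=0)            # columns are only t0 and t1
safe = wide.reindex(columns=T, fill_value=0)     # force one column per label in T

print(wide.shape, list(wide.columns))
print(safe.shape, list(safe.columns))
```

unstack also keeps only the (I, Q) pairs that actually occur, whereas the from_product reindex produces a row for every combination, so the two approaches can differ in row count as well.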

answered Nov 23 '18 at 19:40

Ben.T


This seems to work. However, I found an edge case in my problem which would make this answer incorrect. Otherwise you have taught me something new! Thank you.

– RealRageDontQuit
Nov 24 '18 at 11:16

@RealRageDontQuit what you call an edge case is actually a different problem. You change the dataframe structure by adding an index level, change the formula by multiplying by another matrix s, and sum over this new index (at least as far as I understood). I think my answer can be adapted to this problem fairly easily, but if you want a general method it will be more complicated.

– Ben.T
Nov 24 '18 at 12:14

You are correct to say that this answer solves the initial problem. I have changed the question to reflect the initial problem, ticked your answer as accepted (thanks), and will also create a new question for this edge case.

– RealRageDontQuit
Nov 25 '18 at 8:37
