splitting a dataframe into chunks and naming each new chunk into a dataframe
up vote
0
down vote
favorite
is there a good code to split dataframes into chunks and automatically name each chunk into its own dataframe?
for example, dfmaster has 1000 records. split by 200 and create df1, df2,….df5
any guidance would be much appreciated.
I've looked on other boards and there is no guidance for a function that can automatically create new dataframes.
python loops dataframe split chunks
add a comment |
up vote
0
down vote
favorite
is there a good code to split dataframes into chunks and automatically name each chunk into its own dataframe?
for example, dfmaster has 1000 records. split by 200 and create df1, df2,….df5
any guidance would be much appreciated.
I've looked on other boards and there is no guidance for a function that can automatically create new dataframes.
python loops dataframe split chunks
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
is there a good code to split dataframes into chunks and automatically name each chunk into its own dataframe?
for example, dfmaster has 1000 records. split by 200 and create df1, df2,….df5
any guidance would be much appreciated.
I've looked on other boards and there is no guidance for a function that can automatically create new dataframes.
python loops dataframe split chunks
is there a good code to split dataframes into chunks and automatically name each chunk into its own dataframe?
for example, dfmaster has 1000 records. split by 200 and create df1, df2,….df5
any guidance would be much appreciated.
I've looked on other boards and there is no guidance for a function that can automatically create new dataframes.
python loops dataframe split chunks
python loops dataframe split chunks
edited Nov 15 at 7:05
Torxed
13.1k105486
13.1k105486
asked Nov 15 at 7:03
pynewbee
124112
124112
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08
add a comment |
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08
add a comment |
2 Answers
2
active
oldest
votes
up vote
1
down vote
Use numpy
for splitting:
See example below:
In [2095]: df
Out[2095]:
0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.00 0.0 0.00 0.0 0.94 0.00 0.00 0.63 0.00
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.00 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN
In [2096]: np.split(df, 2)
Out[2096]:
[ 0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.0 0.0 0.0 0.0 0.94 0.0 0.0 0.63 0.0
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN,
0 1 2 3 4 5 6 7 8 9 10
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.0 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN]
df
gets split into 2 dataframes having 2
rows each.
You can do np.split(df, 500)
add a comment |
up vote
0
down vote
I find these ideas helpful:
solution via list:
https://stackoverflow.com/a/49563326/10396469
solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html
just use df = df.values
first to convert from dataframe to numpy.array.
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
Use numpy
for splitting:
See example below:
In [2095]: df
Out[2095]:
0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.00 0.0 0.00 0.0 0.94 0.00 0.00 0.63 0.00
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.00 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN
In [2096]: np.split(df, 2)
Out[2096]:
[ 0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.0 0.0 0.0 0.0 0.94 0.0 0.0 0.63 0.0
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN,
0 1 2 3 4 5 6 7 8 9 10
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.0 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN]
df
gets split into 2 dataframes having 2
rows each.
You can do np.split(df, 500)
add a comment |
up vote
1
down vote
Use numpy
for splitting:
See example below:
In [2095]: df
Out[2095]:
0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.00 0.0 0.00 0.0 0.94 0.00 0.00 0.63 0.00
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.00 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN
In [2096]: np.split(df, 2)
Out[2096]:
[ 0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.0 0.0 0.0 0.0 0.94 0.0 0.0 0.63 0.0
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN,
0 1 2 3 4 5 6 7 8 9 10
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.0 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN]
df
gets split into 2 dataframes having 2
rows each.
You can do np.split(df, 500)
add a comment |
up vote
1
down vote
up vote
1
down vote
Use numpy
for splitting:
See example below:
In [2095]: df
Out[2095]:
0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.00 0.0 0.00 0.0 0.94 0.00 0.00 0.63 0.00
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.00 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN
In [2096]: np.split(df, 2)
Out[2096]:
[ 0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.0 0.0 0.0 0.0 0.94 0.0 0.0 0.63 0.0
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN,
0 1 2 3 4 5 6 7 8 9 10
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.0 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN]
df
gets split into 2 dataframes having 2
rows each.
You can do np.split(df, 500)
Use numpy
for splitting:
See example below:
In [2095]: df
Out[2095]:
0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.00 0.0 0.00 0.0 0.94 0.00 0.00 0.63 0.00
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.00 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN
In [2096]: np.split(df, 2)
Out[2096]:
[ 0 1 2 3 4 5 6 7 8 9 10
0 0.25 0.00 0.0 0.0 0.0 0.0 0.94 0.0 0.0 0.63 0.0
1 0.51 0.51 NaN NaN NaN NaN NaN NaN NaN NaN NaN,
0 1 2 3 4 5 6 7 8 9 10
2 0.54 0.54 0.00 0.0 0.63 0.0 0.51 0.54 0.51 1.0 0.51
3 0.81 0.05 0.13 0.7 0.02 NaN NaN NaN NaN NaN NaN]
df
gets split into 2 dataframes having 2
rows each.
You can do np.split(df, 500)
answered Nov 15 at 7:15
Mayank Porwal
3,8601621
3,8601621
add a comment |
add a comment |
up vote
0
down vote
I find these ideas helpful:
solution via list:
https://stackoverflow.com/a/49563326/10396469
solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html
just use df = df.values
first to convert from dataframe to numpy.array.
add a comment |
up vote
0
down vote
I find these ideas helpful:
solution via list:
https://stackoverflow.com/a/49563326/10396469
solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html
just use df = df.values
first to convert from dataframe to numpy.array.
add a comment |
up vote
0
down vote
up vote
0
down vote
I find these ideas helpful:
solution via list:
https://stackoverflow.com/a/49563326/10396469
solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html
just use df = df.values
first to convert from dataframe to numpy.array.
I find these ideas helpful:
solution via list:
https://stackoverflow.com/a/49563326/10396469
solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html
just use df = df.values
first to convert from dataframe to numpy.array.
answered Nov 15 at 7:12
Ruslan S.
63
63
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53314056%2fsplitting-a-dataframe-into-chunks-and-naming-each-new-chunk-into-a-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
If you're reading your data with pd.read_csv or anything similar, you can use the chunksize-parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You'll make a simple for chunk in pd.read_csv(chunksize=200), and so on.
– RoyM
Nov 15 at 7:08