splitting a dataframe into chunks and naming each new chunk into a dataframe


























Is there a good way to split a dataframe into chunks and automatically give each chunk its own name as a separate dataframe?

For example, dfmaster has 1000 records. Split it into chunks of 200 rows to create df1, df2, ..., df5.
Any guidance would be much appreciated.

I've looked on other boards and found no guidance on a function that can automatically create new dataframes.







python loops dataframe split chunks














edited Nov 15 at 7:05 by Torxed

asked Nov 15 at 7:03 by pynewbee












  • If you're reading your data with pd.read_csv or something similar, you can use the chunksize parameter: pandas.pydata.org/pandas-docs/version/0.23/generated/…. You can then write a simple loop, for chunk in pd.read_csv(chunksize=200), and so on.
    – RoyM
    Nov 15 at 7:08
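
The chunksize suggestion in this comment could look roughly like the sketch below. It is only an illustration, not the commenter's code: the in-memory CSV text, the column name value, and the chunks dict are assumptions made so the example runs on its own. With a real file you would pass its path to pd.read_csv instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file on disk: a header plus 1000 rows.
csv_text = "value\n" + "\n".join(str(i) for i in range(1000))

# With chunksize set, read_csv yields DataFrames of up to 200 rows each
# instead of returning one large DataFrame.
chunks = {}
reader = pd.read_csv(io.StringIO(csv_text), chunksize=200)
for n, chunk in enumerate(reader, start=1):
    chunks[f"df{n}"] = chunk  # keyed by name rather than a new variable

print(list(chunks))        # ['df1', 'df2', 'df3', 'df4', 'df5']
print(len(chunks["df1"]))  # 200
```

Storing the chunks in a dict keeps the df1 ... df5 names from the question without creating variables dynamically.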






























2 Answers





























Use numpy for splitting:

See example below:

In [2095]: df
Out[2095]:
      0     1     2    3     4    5     6     7     8     9    10
0  0.25  0.00  0.00  0.0  0.00  0.0  0.94  0.00  0.00  0.63  0.00
1  0.51  0.51   NaN  NaN   NaN  NaN   NaN   NaN   NaN   NaN   NaN
2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.00  0.51
3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN   NaN   NaN

In [2096]: np.split(df, 2)
Out[2096]:
[      0     1    2    3    4    5     6    7    8     9   10
 0  0.25  0.00  0.0  0.0  0.0  0.0  0.94  0.0  0.0  0.63  0.0
 1  0.51  0.51  NaN  NaN  NaN  NaN   NaN  NaN  NaN   NaN  NaN,
       0     1     2    3     4    5     6     7     8    9    10
 2  0.54  0.54  0.00  0.0  0.63  0.0  0.51  0.54  0.51  1.0  0.51
 3  0.81  0.05  0.13  0.7  0.02  NaN   NaN   NaN   NaN  NaN   NaN]



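df gets split into 2 dataframes having 2 rows each.

The second argument is the number of equal sections, not the chunk size. For the 1000-record dfmaster in the question, np.split(dfmaster, 5) gives five chunks of 200 rows each, whereas np.split(df, 500) would give 500 chunks of 2 rows each.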
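
To get the named pieces the question asks for, one option is to slice by row position and keep the chunks in a dict keyed by name. This is a minimal sketch under stated assumptions: the helper split_into_named_chunks, its prefix parameter, and the single value column are hypothetical names for illustration, and it uses iloc slicing rather than np.split so the last chunk may be shorter when the row count is not a multiple of the chunk size.

```python
import pandas as pd


def split_into_named_chunks(df, chunk_size, prefix="df"):
    """Map 'df1', 'df2', ... to consecutive slices of chunk_size rows."""
    starts = range(0, len(df), chunk_size)
    return {
        f"{prefix}{n}": df.iloc[start:start + chunk_size]
        for n, start in enumerate(starts, start=1)
    }


# Hypothetical 1000-row frame standing in for dfmaster from the question.
dfmaster = pd.DataFrame({"value": range(1000)})
frames = split_into_named_chunks(dfmaster, 200)

print(list(frames))        # ['df1', 'df2', 'df3', 'df4', 'df5']
print(len(frames["df3"]))  # 200
```

A chunk is then looked up as frames["df3"], which is usually easier to loop over than separate df1, df2, ... variables.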

















answered Nov 15 at 7:15 by Mayank Porwal




































I find these ideas helpful:

Solution via a list:
https://stackoverflow.com/a/49563326/10396469

Solution using numpy.split:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.split.html

To work on a plain array, first convert the dataframe to a numpy.array with df = df.values. Note that this drops the index and column labels.
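
The two ideas can be sketched side by side as below. This is an illustration only and not taken from the linked answer: the small two-column frame, the chunk size of 4, and the use of np.array_split (which, unlike np.split, accepts a row count that does not divide evenly) are assumptions chosen to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-row frame for illustration.
df = pd.DataFrame({"a": range(10), "b": range(10, 20)})

# List-based variant: consecutive slices of 4 rows; the last one is shorter.
chunk_list = [df.iloc[i:i + 4] for i in range(0, len(df), 4)]
print([len(c) for c in chunk_list])  # [4, 4, 2]

# numpy variant: df.values is a plain ndarray, so the labels are gone and
# have to be re-attached if DataFrames are wanted back.
arrays = np.array_split(df.values, 3)
rebuilt = [pd.DataFrame(a, columns=df.columns) for a in arrays]
print([a.shape[0] for a in arrays])  # [4, 3, 3]
```

The list variant keeps the original index and column labels on every chunk, which is why converting to an array is usually only worth it when the labels are not needed.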

















answered Nov 15 at 7:12 by Ruslan S.





























