pandas.read_csv leads to shifted column labels when dropping lines below header

I am trying to read a .csv file with pandas, with a header looking like this:



System Information_1
System Information_2
System Information_3
System Information_4

"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"


I'm using the following code to read it:
df = pd.read_csv('data.csv', sep=';', header=5, skiprows=[6,7], encoding='latin1')



However, my DataFrame ends up with "unit1", "unit2", "unit3", "unit4", "unit5", "unit6" as column labels instead of "Label1", "Label2", "Label3", "Label4", "Label5", "Label6".

With an older version of my CSV file, the same import code works properly. The only difference I could spot between the files is that the older file has a full set of separators in the first 4 rows:

System Information_1;;;;;
System Information_2;;;;;
etc.

Does anyone know where this error comes from and how to solve it?

python pandas csv

edited Nov 20 '18 at 11:55 by user31415629
asked Nov 20 '18 at 10:21 by Judith

  • Please format your question's description properly.

    – RomanPerekhrest
    Nov 20 '18 at 10:23

  • What do you mean by "properly"? Sorry, I'm new here.

    – Judith
    Nov 20 '18 at 10:24

  • Open "edit" mode, use the Markdown editing panel, and wrap your code blocks in the proper formatting (stackoverflow.com/editing-help).

    – RomanPerekhrest
    Nov 20 '18 at 10:26

  • @Judith, welcome to SO. It would help to add details about what your raw data looks like; if you have a DataFrame, you can post a few lines of it together with the desired output. Would you be able to post the CSV file somewhere online?

    – pygo
    Nov 20 '18 at 10:31

  • @RomanPerekhrest You can edit the formatting yourself (I've done it now).

    – user31415629
    Nov 20 '18 at 10:41

3 Answers

You could skip the leading rows as well. In that case, don't set header to 5: once those rows are skipped, the label row is row 0, so you can leave the header to be detected automatically:

df = pd.read_csv('data.csv', sep=';', skiprows=[0,1,2,3,4,6,7], encoding='latin1')
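
To try this without the original file, here is a minimal sketch that rebuilds the header layout from the question in memory. The two data rows and the `quoted_row` helper are made up for illustration, and the sample omits the space after each semicolon so that the quoted labels parse cleanly.

```python
import io

import pandas as pd


def quoted_row(values):
    """Join values into one semicolon-separated line of quoted fields."""
    return ";".join(f'"{v}"' for v in values)


labels = [f"Label{i}" for i in range(1, 7)]

# Hypothetical stand-in for data.csv; the two data rows are invented.
lines = [
    "System Information_1",                                # file line 0
    "System Information_2",                                # file line 1
    "System Information_3",                                # file line 2
    "System Information_4",                                # file line 3
    "",                                                    # file line 4 (blank)
    quoted_row(labels),                                    # file line 5
    quoted_row(f"alternative {name}" for name in labels),  # file line 6
    quoted_row(f"unit{i}" for i in range(1, 7)),           # file line 7
    "1;2;3;4;5;6",
    "10;20;30;40;50;60",
]
raw = "\n".join(lines)

# skiprows lists 0-based file line numbers. After dropping lines 0-4, 6 and 7,
# the label row is the first line left, so the inferred header picks it up.
df = pd.read_csv(io.StringIO(raw), sep=";", skiprows=[0, 1, 2, 3, 4, 6, 7])

print(df.columns.tolist())
print(df)
```

Because every unwanted line is named explicitly in `skiprows`, this variant does not depend on how blank lines are counted for `header`.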






answered Nov 20 '18 at 10:53 by SpghttCd

  • Thanks a lot, this works perfectly fine :)

    – Judith
    Nov 20 '18 at 11:07

You could use a list as your header argument:

import pandas as pd
from io import StringIO

data = """System Information_1
System Information_2
System Information_3
System Information_4

"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"
1;2;3;4;5;6
10;20;30;40;50;60
"""

# header=[4]: the blank line is not counted, so the label row is row 4.
# skiprows=[6, 7] drops the "alternative Label" and "unit" lines by file line number.
df = pd.read_csv(StringIO(data), sep=';', header=[4], skiprows=[6, 7], encoding='latin1')

This gives a DataFrame with the label row as column names. (The original screenshot is not preserved.)
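
A likely reason the label row is row 4 here rather than row 5: the pandas documentation states that `header` ignores empty lines when `skip_blank_lines=True` (the default), while `skiprows` refers to line numbers in the file. The sketch below compares the two ways of counting on a made-up in-memory file; the data rows and the `quoted_row` helper are illustrative, and the space after each semicolon is left out to keep the quoting unambiguous.

```python
import io

import pandas as pd


def quoted_row(values):
    """Join values into one semicolon-separated line of quoted fields."""
    return ";".join(f'"{v}"' for v in values)


labels = [f"Label{i}" for i in range(1, 7)]

# Hypothetical file content mirroring the layout in the question.
raw = "\n".join([
    "System Information_1",
    "System Information_2",
    "System Information_3",
    "System Information_4",
    "",  # blank line (file line 4)
    quoted_row(labels),
    quoted_row(f"alternative {name}" for name in labels),
    quoted_row(f"unit{i}" for i in range(1, 7)),
    "1;2;3;4;5;6",
    "10;20;30;40;50;60",
])

# Default behaviour: the blank line is ignored when counting header rows,
# so the labels are row 4.
df_default = pd.read_csv(io.StringIO(raw), sep=";", header=4, skiprows=[6, 7])

# With skip_blank_lines=False the blank line counts as a row,
# so the labels are row 5, matching their line number in the file.
df_keep_blank = pd.read_csv(
    io.StringIO(raw), sep=";", header=5, skiprows=[6, 7], skip_blank_lines=False
)

print(df_default.columns.tolist())
print(df_keep_blank.columns.tolist())
```

If the older file had `;;;;;` in place of the empty line, that line would count as a row, which could explain why `header=5` worked there. This is a guess, since the full older file is not shown.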







answered Nov 20 '18 at 10:58 by Owen

  • The answer from @SpghttCd works perfectly well too. They got here quicker than I did!

    – Owen
    Nov 20 '18 at 10:59

The "header" parameter starts counting after the rows listed in "skiprows" have been dropped.

If you want to use the label row as the header:

df = pd.read_csv('pruebasof.csv', sep=';', skiprows=[0,1,2,3,4,6], encoding='latin1')

Otherwise, if you want to use the alternative label row as the header:

df = pd.read_csv('pruebasof.csv', sep=';', skiprows=6, encoding='latin1')

I wrote the first call so that you get the labels as the header while keeping the "units" row as data.
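
Both variants can be tried on an in-memory copy of the layout from the question. This is only a sketch: the data rows and the `quoted_row` helper are invented, and the sample omits the space after each semicolon.

```python
import io

import pandas as pd


def quoted_row(values):
    """Join values into one semicolon-separated line of quoted fields."""
    return ";".join(f'"{v}"' for v in values)


labels = [f"Label{i}" for i in range(1, 7)]
alt_labels = [f"alternative {name}" for name in labels]
units = [f"unit{i}" for i in range(1, 7)]

# Hypothetical file content mirroring the layout in the question.
raw = "\n".join([
    "System Information_1",
    "System Information_2",
    "System Information_3",
    "System Information_4",
    "",
    quoted_row(labels),      # file line 5
    quoted_row(alt_labels),  # file line 6
    quoted_row(units),       # file line 7
    "1;2;3;4;5;6",
    "10;20;30;40;50;60",
])

# Variant 1: drop lines 0-4 and 6; the labels become the header
# and the units row stays as the first data row.
df_labels = pd.read_csv(io.StringIO(raw), sep=";", skiprows=[0, 1, 2, 3, 4, 6])

# Variant 2: an integer skips that many lines from the top of the file,
# so the alternative labels (file line 6) become the header.
df_alt = pd.read_csv(io.StringIO(raw), sep=";", skiprows=6)

print(df_labels.head(1))
print(df_alt.columns.tolist())
```

Keeping the units row as data means every column mixes text with numbers, so the values are read as text rather than as integers; they would need converting after the units row is split off.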







answered Nov 20 '18 at 11:14 by Francisco del Valle Bas