pandas.read_csv leads to shifted column labels when dropping lines below header

I am trying to read a .csv file with pandas, with a header looking like this:

System Information_1
System Information_2
System Information_3
System Information_4

"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"

I'm using the following code to read it:
df = pd.read_csv('data.csv', sep=';', header=5, skiprows=[6,7], encoding='latin1')

However, my DataFrame ends up with "unit1", "unit2", "unit3", "unit4", "unit5", "unit6" as column labels instead of "Label1", "Label2", "Label3", "Label4", "Label5", "Label6".

With an older version of my CSV file, the same import code works properly. The only difference I could spot between the files is that the older file has a full set of separators in the first 4 rows:

System Information_1;;;;;
System Information_2;;;;;
etc.

Does anyone know where that error comes from and how to solve it?

asked Nov 20 '18 at 10:21 by Judith, edited Nov 20 '18 at 11:55 by user31415629

please format your question's description properly

– RomanPerekhrest
Nov 20 '18 at 10:23

what do you mean by "properly"? sorry, I'm new here

– Judith
Nov 20 '18 at 10:24

Open the question in "edit" mode, use the Markdown editing panel, and format your code blocks with the appropriate editor controls (stackoverflow.com/editing-help).

– RomanPerekhrest
Nov 20 '18 at 10:26

@Judith, welcome to SO. It would help to add details on what your raw data looks like; if you have a DataFrame, you can post a few lines of it together with the output you want. Would you be able to post the CSV file somewhere online?

– pygo
Nov 20 '18 at 10:31

@RomanPerekhrest You can edit the formatting yourself (I've done it now).

– user31415629
Nov 20 '18 at 10:41

python pandas csv


3 Answers

You could skip the leading rows as well. In that case, don't set header to 5: once those rows are skipped, the label row is row 0 of what remains, so you can leave the header to be detected automatically:

df = pd.read_csv('data.csv', sep=';', skiprows=[0,1,2,3,4,6,7], encoding='latin1')
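For anyone who wants to try this without the original file, here is a minimal sketch on a shortened, made-up sample (3 columns and 2 data rows instead of the real file). The in-memory string `raw` and the extra `skipinitialspace=True` argument are my own assumptions, not part of the answer above; the latter is only there so that the space after each `;` does not end up inside the column names.

```python
import io

import pandas as pd

# Shortened, made-up stand-in for the file from the question.
raw = (
    "System Information_1\n"  # line 0
    "System Information_2\n"  # line 1
    "System Information_3\n"  # line 2
    "System Information_4\n"  # line 3
    "\n"                      # line 4 (blank)
    '"Label1"; "Label2"; "Label3"\n'                                      # line 5
    '"alternative Label1"; "alternative Label2"; "alternative Label3"\n'  # line 6
    '"unit1"; "unit2"; "unit3"\n'                                         # line 7
    "1;2;3\n"                 # line 8
    "10;20;30\n"              # line 9
)

# Skip everything except the label row (line 5) and the data rows.
# No header argument: the first remaining line is used as the header.
df = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    skiprows=[0, 1, 2, 3, 4, 6, 7],
    skipinitialspace=True,  # assumption: drop the space after each ';'
)

print(list(df.columns))
print(df)
```

The numbers in `skiprows` are 0-based line numbers in the file, so they do not depend on how `header` is counted.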

answered Nov 20 '18 at 10:53 by SpghttCd

Thanks a lot, this works perfectly fine :)

– Judith
Nov 20 '18 at 11:07


You could use a list as your header argument:

import pandas as pd
from io import StringIO

data = """System Information_1
System Information_2
System Information_3
System Information_4

"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"
1;2;3;4;5;6
10;20;30;40;50;60
"""

df = pd.read_csv(StringIO(data), sep=';', header=[4], skiprows=[6, 7], encoding='latin1')

gives:

[image: the resulting DataFrame]

answered Nov 20 '18 at 10:58 by Owen

The answer from @SpghttCd works perfectly well too. They got here quicker than I did!

– Owen
Nov 20 '18 at 10:59


The "header" parameter is counted relative to the lines that remain after "skiprows" has been applied.

If you want to use the label row as the header:

df = pd.read_csv('pruebasof.csv', sep=';', skiprows=[0,1,2,3,4,6], encoding='latin1')

Otherwise, if you want to use the alternative label row as the header:

df = pd.read_csv('pruebasof.csv', sep=';', skiprows=6, encoding='latin1')

The first call uses the labels as column names while keeping the "units" row as the first row of data.
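As a rough sketch of what keeping the units row looks like in practice, the snippet below reads a shortened, made-up sample (3 columns, 2 data rows) and then splits the units row off from the numeric data. The names `raw`, `units` and `values`, the `skipinitialspace=True` argument and the split step itself are assumptions added for illustration, not part of the answer above.

```python
import io

import pandas as pd

# Shortened, made-up stand-in for the file from the question.
raw = (
    "System Information_1\n"  # line 0
    "System Information_2\n"  # line 1
    "System Information_3\n"  # line 2
    "System Information_4\n"  # line 3
    "\n"                      # line 4 (blank)
    '"Label1"; "Label2"; "Label3"\n'                                      # line 5
    '"alternative Label1"; "alternative Label2"; "alternative Label3"\n'  # line 6
    '"unit1"; "unit2"; "unit3"\n'                                         # line 7
    "1;2;3\n"                 # line 8
    "10;20;30\n"              # line 9
)

# Keep the label row (line 5) as header and the units row (line 7) as data.
df = pd.read_csv(
    io.StringIO(raw),
    sep=";",
    skiprows=[0, 1, 2, 3, 4, 6],
    skipinitialspace=True,  # assumption: drop the space after each ';'
)

# The first row holds the units; every column is text at this point.
units = df.iloc[0].to_dict()

# The remaining rows are the measurements; convert them back to numbers.
values = df.iloc[1:].reset_index(drop=True).apply(pd.to_numeric)

print(units)
print(values)
```

Because the units row is text, every column is read as strings first, which is why the numeric conversion is a separate step here.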

answered Nov 20 '18 at 11:14 by Francisco del Valle Bas
