pandas.read_csv leads to shifted column labels when dropping lines below header
I am trying to read a .csv file with pandas, with a header looking like this:
System Information_1
System Information_2
System Information_3
System Information_4
"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"
I'm using the following code to read it:
df = pd.read_csv('data.csv', sep=';', header=5, skiprows=[6,7], encoding='latin1')
However, my DataFrame ends up with "unit1", "unit2", "unit3", "unit4", "unit5", "unit6" as column labels instead of "Label1", "Label2", "Label3", "Label4", "Label5", "Label6".
In an older version of my csv-file, however, the import code works properly. The difference I could spot between the files was that the older file has a full set of separators in the first 4 rows:
System Information_1;;;;;
System Information_2;;;;;
etc.
Does anyone know where that error comes from and how to solve it?
python pandas csv
please format your question's description properly
– RomanPerekhrest
Nov 20 '18 at 10:23
what do you mean by "properly"? sorry, I'm new here
– Judith
Nov 20 '18 at 10:24
--> enter "edit" mode --> use the "Markdown editing panel" --> surround your code blocks with the proper markup (stackoverflow.com/editing-help)
– RomanPerekhrest
Nov 20 '18 at 10:26
@Judith, welcome to SO. It would help to add details about what your raw data looks like; if you have a DataFrame, you can post a few lines of it along with the output you want from the data. Would you be able to post the CSV file somewhere on the Internet?
– pygo
Nov 20 '18 at 10:31
@RomanPerekhrest You can edit the formatting yourself (I've done it now).
– user31415629
Nov 20 '18 at 10:41
edited Nov 20 '18 at 11:55
user31415629
asked Nov 20 '18 at 10:21
Judith
3 Answers
You could skip the leading rows as well. In that case, don't set header to 5: once those rows are skipped, the label row is row 0, so you can leave the header to be detected automatically:
df = pd.read_csv('data.csv', sep=';', skiprows=[0,1,2,3,4,6,7], encoding='latin1')
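A minimal sketch of this approach, assuming a small in-memory stand-in for the file (the three-column `raw` sample and the content-based lookup of the label row are assumptions for illustration, not the asker's actual data). The entries of `skiprows` are line indices of the raw file, so it can help to derive them from the position of the label row instead of hard-coding them:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv: four free-text lines, then
# labels, alternative labels, units, and two data rows.
raw = (
    "System Information_1\n"
    "System Information_2\n"
    "System Information_3\n"
    "System Information_4\n"
    '"Label1";"Label2";"Label3"\n'
    '"alternative Label1";"alternative Label2";"alternative Label3"\n'
    '"unit1";"unit2";"unit3"\n'
    "1;2;3\n"
    "10;20;30\n"
)

# Find the label row by its content (assumes it starts with "Label1").
lines = raw.splitlines()
header_idx = next(i for i, line in enumerate(lines) if line.startswith('"Label1"'))

# Skip everything before the label row plus the two rows right after it.
skip = list(range(header_idx)) + [header_idx + 1, header_idx + 2]

# With no header argument, the first remaining line becomes the header.
df = pd.read_csv(io.StringIO(raw), sep=";", skiprows=skip)
print(list(df.columns))  # ['Label1', 'Label2', 'Label3']
```

For this sample `skip` comes out as `[0, 1, 2, 3, 5, 6]`; a file with a different number of leading lines would need different indices, which is why the lookup is done by content here.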
Thanks a lot, this works perfectly fine :)
– Judith
Nov 20 '18 at 11:07
You could use a list as your header argument:
import pandas as pd
from io import StringIO
data = """System Information_1
System Information_2
System Information_3
System Information_4
"Label1"; "Label2"; "Label3"; "Label4"; "Label5"; "Label6"
"alternative Label1"; "alternative Label2"; "alternative Label3"; "alternative Label4"; "alternative Label5"; "alternative Label6"
"unit1"; "unit2"; "unit3"; "unit4"; "unit5"; "unit6"
1;2;3;4;5;6
10;20;30;40;50;60
"""
df = pd.read_csv(StringIO(data), sep=';', header=[4], skiprows=[6, 7], encoding='latin1')
gives:

The answer from @SpghttCd works perfectly well too. They got here quicker than I did!
– Owen
Nov 20 '18 at 10:59
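As a related sketch of the list-valued header argument: a list can also name several rows, in which case pandas builds MultiIndex columns. The in-memory `raw` sample below is a hypothetical stand-in for the file, and keeping all three header rows is an assumption about what might be useful, not something the answer above proposes:

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file described in the question.
raw = (
    "System Information_1\n"
    "System Information_2\n"
    "System Information_3\n"
    "System Information_4\n"
    '"Label1";"Label2";"Label3"\n'
    '"alternative Label1";"alternative Label2";"alternative Label3"\n'
    '"unit1";"unit2";"unit3"\n'
    "1;2;3\n"
    "10;20;30\n"
)

# Skip the four free-text lines; the next three lines then form a
# three-level column index (label, alternative label, unit).
df = pd.read_csv(io.StringIO(raw), sep=";", skiprows=[0, 1, 2, 3], header=[0, 1, 2])

labels = df.columns.get_level_values(0)  # first header row
units = df.columns.get_level_values(2)   # third header row

# Flatten to the plain labels while keeping the units available separately.
flat = df.copy()
flat.columns = labels
```

This keeps the unit information without turning it into a data row, at the cost of a more complex column index until it is flattened.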
The "header" parameter starts counting after the rows removed by the "skiprows" parameter.
If you want to use the label as header:
df = pd.read_csv('pruebasof.csv', sep=';', skiprows=[0,1,2,3,4,6], encoding='latin1')
Otherwise, if you want to use the alternative label as header:
df = pd.read_csv('pruebasof.csv', sep=';', skiprows=6, encoding='latin1')
I set up the first variant so that you can use the labels as the header while keeping the "units" row as data.
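The counting rule can be checked with a small sketch. The in-memory `raw` sample and the row indices below are assumptions for illustration (four leading lines rather than the asker's real file), and splitting the units row off afterwards is one possible follow-up, not part of the answer:

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file described in the question.
raw = (
    "System Information_1\n"
    "System Information_2\n"
    "System Information_3\n"
    "System Information_4\n"
    '"Label1";"Label2";"Label3"\n'
    '"alternative Label1";"alternative Label2";"alternative Label3"\n'
    '"unit1";"unit2";"unit3"\n'
    "1;2;3\n"
    "10;20;30\n"
)

# Rows 0-3 (free text) and row 5 (alternative labels) are skipped.
# header=0 then refers to the first line that is left, i.e. the labels.
df = pd.read_csv(io.StringIO(raw), sep=";", skiprows=[0, 1, 2, 3, 5], header=0)

# The units row is now the first data row, so every column holds strings.
units = df.iloc[0].to_dict()

# Drop the units row and convert the remaining values to integers.
values = df.iloc[1:].reset_index(drop=True).astype(int)
```

Keeping the units as a data row forces string columns until that row is separated, which is why the conversion step is needed here.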
answered Nov 20 '18 at 10:53
SpghttCd
answered Nov 20 '18 at 10:58
Owen
answered Nov 20 '18 at 11:14
Francisco del Valle Bas