How can I extract data from a text file?
The text file is as follows:
- Delimiter: space
- Table: 3,496,080 rows x 6 columns
- Column A: year
- Column B: day of the year
- Column C: hour
- Column D: one of 30, 32.5, 35, 37.5, 40, 42.5 and 45. The value starts at 30, increases by 2.5 after every 499,440 rows, and ends at 45 (7 blocks x 499,440 rows = 3,496,080 rows).
- Column E: one of 25, 30, 35, 40 and 45. The value starts at 25, increases by 5 on each row, and starts over after five rows.
- Column F: value
- Columns A, B and C start over after every 499,440 rows.
1st row: 1998 152 1 30 25 12.5
499,441st row: 1998 152 1 32.5 25 11.6
1998 152 1 30 25 12.5
1998 152 1 30 30 12
1998 152 1 30 35 11.8
1998 152 1 30 40 11.9
1998 152 1 30 45 12
1998 152 3 30 25 10.9
1998 152 3 30 30 10.7
1998 152 3 30 35 10.6
1998 152 3 30 40 10.5
1998 152 3 30 45 10.4
1998 152 5 30 25 9.6
1998 152 5 30 30 9.5
1998 152 5 30 35 9.2
1998 152 5 30 40 9
1998 152 5 30 45 8.7
1998 152 7 30 25 8.4
1998 152 7 30 30 8.5
1998 152 7 30 35 8.9
1998 152 7 30 40 9.6
1998 152 7 30 45 10.7
1998 152 9 30 25 13.2
1998 152 9 30 30 14.3
1998 152 9 30 35 15.2
1998 152 9 30 40 15.9
1998 152 9 30 45 16.2
1998 152 11 30 25 16.2
1998 152 11 30 30 16.5
1998 152 11 30 35 16.8
1998 152 11 30 40 17.2
1998 152 11 30 45 17.9
1998 152 13 30 25 18
1998 152 13 30 30 18.6
1998 152 13 30 35 19.3
1998 152 13 30 40 20.1
1998 152 13 30 45 21.2
1998 152 15 30 25 20.4
1998 152 15 30 30 21.4
1998 152 15 30 35 22.5
1998 152 15 30 40 23.7
1998 152 15 30 45 25
1998 152 17 30 25 21.8
1998 152 17 30 30 23.2
1998 152 17 30 35 24.7
1998 152 17 30 40 26
1998 152 17 30 45 26.9
1998 152 19 30 25 22.4
1998 152 19 30 30 23.4
1998 152 19 30 35 24.3
1998 152 19 30 40 25
1998 152 19 30 45 25.6
1998 152 21 30 25 25.1
1998 152 21 30 30 25
1998 152 21 30 35 24.3
1998 152 21 30 40 23.3
1998 152 21 30 45 22
1998 152 23 30 25 20.9
1998 152 23 30 30 19
1998 152 23 30 35 17.2
1998 152 23 30 40 15.7
1998 152 23 30 45 14.5
I'd like to extract all rows where D = 30, E = 25 and 152 <= B <= 241, and write them to a text file.
What I tried
I tried with MATLAB, with the code below, but it is very slow:
% Read the whole file into one long column vector of numbers
fid=fopen('table.txt','r');
formats='%f';
RawData=fscanf(fid,formats);
fclose(fid);
L=length(RawData);
fileID=fopen('test.txt','w');
% Rebuild each 6-value row and test it (data grows on every iteration)
for i=1:L/6
    data(i,:)=RawData((i-1)*6+1:(i-1)*6+6)';
    if data(i,4)==30
        if data(i,5)==25
            if data(i,2)>=152 && data(i,2)<=241
                fprintf(fileID,'%d %d %d %d %d %3.1f\n',data(i,:));
            end
        end
    end
end
fclose(fileID);
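For experimenting without the full 3,496,080-row file, a miniature file with the same column layout can be generated. This is only a sketch under stated assumptions: it produces just year 1998 and days 152-153, and it writes the dummy value 12.5 in column F on every row. The file name table.txt is the one used in the question; everything else is illustrative.

```shell
#!/bin/sh
# Sketch: generate a small table.txt with the column layout from the question.
# Assumptions: only year 1998 and days 152-153 are generated, and column F
# holds the dummy value 12.5 on every row.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

awk 'BEGIN {
    for (d = 30; d <= 45; d += 2.5)                  # column D changes slowest
        for (day = 152; day <= 153; day++)           # column B: day of the year
            for (hour = 1; hour <= 23; hour += 2)    # column C: odd hours, as in the sample
                for (e = 25; e <= 45; e += 5)        # column E changes fastest
                    print 1998, day, hour, d, e, 12.5
}' > table.txt

# Show the row count and the first few rows
awk 'END { print NR, "rows" }' table.txt
head -n 5 table.txt
```

The loop nesting mirrors the description: E cycles on every row, the hour changes every five rows, and D only changes once all days and hours have been written.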
command-line text
edited Mar 19 at 12:31
Jacob Vlijm
asked Mar 19 at 9:45
Y. Suat
This doesn't feel like an assignment at all. Was there any effort from your side?
– Jacob Vlijm
Mar 19 at 11:11
In fact, I made an effort to extract the data with MATLAB, but it takes a long time to get the data. I can share my MATLAB code.
– Y. Suat
Mar 19 at 11:59
Usually a good idea to add that information. It prevents the impression of saying "here's my problem, write something for me".
– Jacob Vlijm
Mar 19 at 12:07
I get it. I didn't share it because I wrote the code on a different operating system.
– Y. Suat
Mar 19 at 12:20
Oh, I believe you :), but better add something to the question, mentioning what you did with what result. (the vote wasn't mine).
– Jacob Vlijm
Mar 19 at 12:28
3 Answers
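"I'd like to extract all rows and then write data to text file, which is D= 30 and E=25 and B>=152 and B<=241."
This should be straightforward in Awk:
awk '$4==30 && $5==25 && $2>151 && $2<242' file > newfile
Awk splits each line into fields on whitespace by default, so $2, $4 and $5 are columns B, D and E. Because B is a whole day number, $2>151 && $2<242 selects the same rows as 152 <= B <= 241. Matching lines are printed unchanged, so the output stays space-delimited.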
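The filter can be checked on a handful of rows before running it on the full file. The sketch below uses the inclusive form of the day range; sample.txt and filtered.txt are hypothetical file names, and the rows for days 151, 241 and 242 carry made-up values in the last column (the other rows are taken from the question).

```shell
#!/bin/sh
# Sketch: try the awk filter on a few rows around the range boundaries.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Rows for days 151, 241 and 242 use made-up values in column F.
cat > sample.txt <<'EOF'
1998 151 23 30 25 14.1
1998 152 1 30 25 12.5
1998 152 1 30 30 12
1998 241 23 30 25 9.9
1998 242 1 30 25 9.7
1998 152 1 32.5 25 11.6
EOF

# Keep rows with D == 30, E == 25 and 152 <= B <= 241 (columns 4, 5 and 2).
awk '$4 == 30 && $5 == 25 && $2 >= 152 && $2 <= 241' sample.txt > filtered.txt

cat filtered.txt
```

Only the rows for days 152 and 241 with D = 30 and E = 25 should survive; the rows just outside the day range, the E = 30 row and the D = 32.5 row are dropped.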
answered Mar 19 at 11:05
steeldriver
Should I use '>' instead of '>>'?
– Y. Suat
Mar 19 at 12:05
@Y.Suat it depends whether you want to append the results to an existing file (>> newfile) or overwrite its contents (> newfile)
– steeldriver
Mar 19 at 12:13
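The difference between the two redirections can be seen in a short experiment. This is a minimal sketch; newfile is the name used in the answer, and the three strings written to it are placeholders.

```shell
#!/bin/sh
# Sketch: '>' truncates the target file first, '>>' adds to the end of it.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf 'first run\n'  >  newfile   # creates newfile
printf 'second run\n' >  newfile   # overwrites: "first run" is gone
printf 'third run\n'  >> newfile   # appends after "second run"

cat newfile
```

For a one-off extraction, > is usually the safer choice, because rerunning the command then replaces the old result instead of duplicating it.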
How do I add another condition to your code? I think a loop is necessary, but I am a beginner at Linux. Let me search for the code, please. For example: get the rows where the 2nd column is between 152 and 243 (instead of between 151 and 242) if the first column is 1998.
– Y. Suat
Mar 19 at 12:52
@Y.Suat, if you can't figure out that you'll need to change "151" to "152", you can't even be attempting to understand what the awk code actually does. The learning process isn't about blindly cutting and pasting canned formulas.
– Ray Butterworth
Mar 19 at 13:14
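No explicit loop is needed for an extra condition: awk already evaluates the pattern once per input line, so a year test is just one more term joined with &&. The sketch below passes the limits in with -v so they are easy to change; the variable names year, first and last, the file name sample.txt, and all rows except the one for day 152 are made up for illustration.

```shell
#!/bin/sh
# Sketch: add a year condition and a different day range to the awk filter.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Made-up rows, except the day-152 row, which comes from the question.
cat > sample.txt <<'EOF'
1997 200 1 30 25 10.1
1998 151 1 30 25 10.2
1998 152 1 30 25 12.5
1998 243 1 30 25 10.3
1998 244 1 30 25 10.4
1998 200 1 30 30 10.5
EOF

# awk applies this pattern to every line; matching lines are printed as-is.
awk -v year=1998 -v first=152 -v last=243 \
    '$1 == year && $2 >= first && $2 <= last && $4 == 30 && $5 == 25' \
    sample.txt > newfile

cat newfile
```

With these limits the rows for days 152 and 243 of 1998 are kept, while the 1997 row, days 151 and 244, and the E = 30 row are filtered out.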
Comment: if you are writing nested "if" statements, you are certainly doing it wrong.
So, even in MATLAB, which will likely be slower than a command-line tool for this job, once you have loaded this data into a large array, do something like
data = reshape(RawData, 6, []).';   % one row per line of the file, 6 columns
my_output = data(data(:,2)>=152 & data(:,2)<=241 & data(:,4)==30 & data(:,5)==25, :);   % logical index picks all matching rows at once
and make that into a table() and write that to your output (for example with writetable()).
answered Mar 19 at 15:09
Carl Witthoft
You can use the TextQL tool to run SQL queries against a text file in order to extract data from it.
You can install it using the following command (I believe the package is only available as of Ubuntu 18.04; otherwise you would need to install it another way, such as with Docker or from source):
sudo apt install textql
In your case the command would be:
textql -sql "select * where c3=30 and c4=25 and c1>=152 and c1<=241" \
  -dlm='0x20' \
  -output-dlm='0x20' \
  <file-name>
Explanation:
-sql "select * where c3=30 and c4=25 and c1>=152 and c1<=241"
The normal SQL query; the from clause is omitted as it is not needed in this case. Since your file doesn't have column headers, the default column names are c0 for the first column, c1 for the second column, c2 for the third column, etc.
-dlm='0x20'
This parameter tells the command that the delimiter is a space instead of the default comma (,). 0x20 (20 in hexadecimal) is the character code of the space character.
-output-dlm='0x20'
This parameter tells the command to use the space character as the delimiter in the output instead of the default comma (,).
<file-name>
Replace this with the path to the actual file.
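The claim that hexadecimal 20 is the space character can be checked from the shell. This small sketch only uses the standard printf utility and says nothing about textql itself; the variable names hex_as_decimal and space_code are illustrative.

```shell
#!/bin/sh
# Sketch: confirm that 0x20 and the space character are the same code (32).

# Convert the hexadecimal constant 0x20 to decimal.
hex_as_decimal=$(printf '%d' 0x20)

# A leading quote makes printf output the numeric code of the next character,
# here a single space.
space_code=$(printf '%d' "' ")

echo "0x20 in decimal:        $hex_as_decimal"
echo "code of the space char: $space_code"
```

Both lines should report the same number, which is why 0x20 can stand in for a literal space in the delimiter options.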
edited Mar 19 at 16:01
answered Mar 19 at 15:53
Dan