Add lines to files to make them equal length

I have a bunch of .csv files with N columns and different numbers of rows (lines). I would like to append as many empty records (lines that contain only the semicolon separators, e.g. ;;;;;) as needed to make all files the same length. I can get the length of the longest file manually, but it would be good to have that done automatically as well.

For example:

I have,

file1.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72

file2.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892

file3.csv
121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62

I need,

file1.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72
;;;;;
;;;;;
;;;;;

file2.csv
128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892

file3.csv
121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62
;;;;;
;;;;;

edited Dec 4 at 10:35 by Jeff Schaller

asked Dec 4 at 9:49 by myradio

A simple (but probably not optimal) way to do it would be to use wc to count the lines of each file and find the max. You can then echo ";;;;" >> file for each file until its line count reaches the max.
– Bear'sBeard
Dec 4 at 10:45

Why do you want the files to have the same number of lines? Maybe there is a good method where you can use the files as they are (with their different numbers of lines).
– sudodus
Dec 4 at 11:12

@Bear'sBeard Yep, something like that did it; I was looking for a more compact way.
– myradio
Dec 4 at 11:50

@sudodus Well, there are people before and after me in the pipeline, and things must match certain formats...
– myradio
Dec 4 at 11:51


shell-script text-processing awk files csv


3 Answers

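Thanks to @Sparhawk for the suggestions in the comments; I updated the script based on those:

#!/bin/bash

# Padding line; it must be quoted, and the number of semicolons should match your files.
emptyLine=';;;;;;;'
# Line counts of all matching files (sed drops the "total" line printed by wc).
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# Print one count per line so that sort can pick the maximum.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
for name in files*pattern.txt; do
    lineNumber=$(wc -l < "$name")
    missing=$((max - lineNumber))
    for ((i = 0; i < missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done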

Well, it is neither elegant nor efficient. It actually takes a couple of seconds, which feels like an eternity given the small size of the data. Nevertheless, it works:

#!/bin/bash

# Original version: it still parses ls and pipes cat into wc (see the comments).
emptyLine=';;;;;;;'
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# Print one count per line so that sort can pick the maximum.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
for name in $(ls files*pattern.txt); do
    lineNumber=$(cat "$name" | wc -l)
    missing=$((max - lineNumber))
    for ((i = 0; i < missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done

I just put this script in the directory that contains the files; it works provided there is a pattern, such as files*pattern.txt, that I can use to list them.
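The same idea can be sketched as a reusable function that counts each file only once and appends all missing lines with a single redirection per file. This is a minimal sketch, not the author's script: the name pad_files and its argument order are assumptions made for illustration, and it assumes every file ends with a newline.

```bash
#!/bin/bash
# pad_files PAD FILE...
# Append the line PAD to each FILE until every file has as many lines as the
# longest one. (pad_files is a hypothetical helper name.)
pad_files() {
    local pad=$1; shift
    local -a counts=()
    local f n i max=0

    # First pass: count each file once and remember the maximum.
    for f in "$@"; do
        n=$(( $(wc -l < "$f") ))     # arithmetic strips any padding wc adds
        counts+=("$n")
        if (( n > max )); then max=$n; fi
    done

    # Second pass: write the missing lines, opening each file only once.
    i=0
    for f in "$@"; do
        for (( n = counts[i]; n < max; n++ )); do
            printf '%s\n' "$pad"
        done >> "$f"
        i=$(( i + 1 ))
    done
}

# Demo on throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
printf '1;a\n2;b\n3;c\n' > "$demo_dir/long.csv"
printf '1;a\n' > "$demo_dir/short.csv"
pad_files ';' "$demo_dir"/*.csv
wc -l "$demo_dir"/*.csv
rm -rf "$demo_dir"
```

Passing the file list as arguments keeps the glob in the caller's hands, for example pad_files ';;;;;' files*pattern.txt.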

edited Dec 5 at 9:08

answered Dec 4 at 11:54 by myradio

Nice one (+1)! Thanks for posting the solution. One small note: don't parse ls, just use for name in files*pattern.txt; do instead.
– Sparhawk
Dec 4 at 12:28

And while I'm nit-picking, there's a "useless use of cat" there too. Just do wc -l $name
– Sparhawk
Dec 4 at 12:30

@Sparhawk: I think you meant wc -l < $name
– Thor
Dec 4 at 12:37

@Thor No? wc [OPTION]... [FILE]... works too, as per the man page. In fact, this script uses this construction in an earlier line.
– Sparhawk
Dec 4 at 20:54

@Sparhawk: Sure, but if you want the equivalent of the output of cat file | wc -l, redirection is the way to go.
– Thor
Dec 5 at 8:07
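The difference discussed in these comments can be seen with a short sketch. The helper name count_lines and the sample file are assumptions made for illustration only.

```bash
#!/bin/bash
# count_lines FILE: print only the number of lines, without the file name.
# (count_lines is a hypothetical helper name.)
count_lines() {
    wc -l < "$1"
}

sample=$(mktemp)
printf 'a;b\nc;d\n' > "$sample"

wc -l "$sample"          # prints the count followed by the file name
wc -l < "$sample"        # prints only the count
cat "$sample" | wc -l    # same output as the redirection, with one extra process

# Only the name-free form can be used directly in shell arithmetic.
lines=$(count_lines "$sample")
echo "missing lines to reach 5: $(( 5 - lines ))"
rm -f "$sample"
```

With a file name argument, the output would have to be post-processed (for example with awk '{print $1}') before it could be used as a number.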


An improvement on @myradio's answer.

The part inside the loop is written in awk, which should be much faster.

max=$(wc -l file*.csv | sed '$ d' | sort -n | tail -n1 | awk '{print $1}')
for f in file*.csv; do
    awk -F';' -v max="$max" \
      'END{
         # build NF-1 blanks, then turn each blank into the field separator
         s=sprintf("%*s",NF-1,"");
         gsub(/ /,FS,s);
         # print one padding line per missing record
         for(i=NR;i<max;i++)
           print s;
       }' "$f" >> "$f"
done

With -F you set the correct field separator of your files (here -F';').

The s=sprintf();gsub(); part dynamically builds a line with the right number of field separators (FS), a technique taken from another answer. It relies on NF still holding the field count of the last record inside END, which is the case in most awk implementations.

You could simply replace that with print ";;;;;" or other static content if you like.
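To see the sprintf/gsub step in isolation, here is a small sketch that prints the padding line for a file based on its first record. The function name padding_line and its optional separator argument are assumptions for illustration; the separator is assumed to be a single character that is not special in a gsub replacement (so not & or a backslash).

```bash
#!/bin/bash
# padding_line FILE [SEP]
# Print NF-1 copies of SEP (default ";"), where NF is the number of fields
# in the first record of FILE. (padding_line is a hypothetical helper name.)
padding_line() {
    awk -F"${2:-;}" 'NR == 1 {
        s = sprintf("%*s", NF - 1, "")   # a string of NF-1 blanks
        gsub(/ /, FS, s)                 # replace every blank with the separator
        print s
        exit
    }' "$1"
}

sample=$(mktemp)
printf '128; pep; 93; 22:22:10; 3; 11\n' > "$sample"
padding_line "$sample"    # prints ;;;;;
rm -f "$sample"
```

Reading the first record instead of the last one avoids depending on the value of NF inside END.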

edited Dec 5 at 9:51

answered Dec 4 at 13:53 by RoVo

I like this solution. It's certainly harder to read, but it's good that it has a dynamic FS. Nevertheless, two things: 1. About the efficiency, I don't know what your timing results mean, because they depend on the number and size of the files. 2. I actually wanted to try this to compare the timing with my version, but I ran into a problem: awk complains that gensub is undefined. Is gensub maybe only in gawk?
– myradio
Dec 5 at 9:22

Yeah, that seems to be GNU Awk only. I replaced it with the gsub solution from the linked answer.
– RoVo
Dec 5 at 9:52

About the timing results, I think I misunderstood the statement "it takes a couple of seconds" in your answer to be the time needed to process the example files from your question ... I removed that.
– RoVo
Dec 5 at 9:54


In order to count the lines in each file only once:

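wc -l *csv | sort -nr | sed 1d | {
    # sed 1d drops the "total" line, so the first line read is the longest file
    read max file
    pad=$(sed q "$file" | tr -cd ";")  # extract separators from first record
    while read lines file; do
        # append padding until this file is as long as the longest one
        while [ $((lines+=1)) -le "$max" ]; do
            echo "$pad" >> "$file"
        done
    done
}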

Note that any newlines in the filenames will cause problems for both sort and the while read loop, but they can handle filenames containing normal spaces.
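The reason spaces are safe is that read puts the first word into the first variable and the whole remainder of the line into the last one. A minimal sketch of that splitting, using a hypothetical helper named split_wc_line and a made-up file name:

```bash
#!/bin/bash
# split_wc_line LINE: parse one line of `wc -l` output the way the
# `read lines file` loop does. (split_wc_line is a hypothetical helper name.)
split_wc_line() {
    local lines file
    # Leading blanks are skipped, the first word goes to "lines",
    # and the remainder of the line (spaces included) goes to "file".
    read -r lines file <<< "$1"
    printf 'lines=%s file=%s\n' "$lines" "$file"
}

split_wc_line '      3 my file.csv'    # prints: lines=3 file=my file.csv
```

A newline inside a file name would split the wc output into two lines, which is why that case breaks both sort and the read loop.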

answered Dec 4 at 15:42 by JigglyNaga


