Add lines to files to make them equal length























I have a bunch of .csv files with N columns and different numbers of rows (lines). I would like to append as many empty rows ;...; (N-1 semicolons, one between each pair of columns) as needed to make them all the same length. I can find the length of the longest file manually, but it would be good to do this automatically too.



For example:



I have:



file1.csv



128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72


file2.csv



128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892


file3.csv



121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62


I need:



file1.csv



128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
171; pep; 73; 22:26:10; 3; 72
;;;;;
;;;;;
;;;;;


file2.csv



128; pep; 93; 22:22:10; 3; 11
127; qep; 93; 12:52:10; 3; 15
121; fng; 96; 09:42:10; 3; 52
141; gep; 53; 21:22:10; 3; 62
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892


file3.csv



121; fng; 96; 09:42:10; 3; 52
171; pep; 73; 22:26:10; 3; 72
221; ahp; 93; 23:52:10; 3; 892
141; gep; 53; 21:22:10; 3; 62
;;;;;
;;;;;





























  • 1




    A simple (but probably not optimal) way to do it would be to use wc to get the line count of each file and find the max. You can then echo ";;;;" >> file for each file until its line count reaches the max.
    – Bear'sBeard
    Dec 4 at 10:45






  • 1




    Why do you want the files to have the same number of lines? Maybe there is a good method, where you can use the files as they are (with their different number of lines).
    – sudodus
    Dec 4 at 11:12










  • @Bear'sBeard Yep, something like that did it, I was looking for a more compact way.
    – myradio
    Dec 4 at 11:50










  • @sudodus Well, there're people before and after me in the pipeline, things must match certain formats...
    – myradio
    Dec 4 at 11:51















shell-script text-processing awk files csv














asked Dec 4 at 9:49 by myradio, edited Dec 4 at 10:35 by Jeff Schaller


















3 Answers






























Thanks to @Sparhawk for the suggestions in the comments; I have updated the script based on those:



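#!/bin/bash

# One padding row. It must be quoted: a bare ;; is a syntax error in bash.
emptyLine=';;;;;;;'
# Line counts of all matching files; sed drops the "total" line that wc
# prints when it is given more than one file.
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# One count per line, so that sort can pick the largest.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
for name in files*pattern.txt; do
    lineNumber=$(wc -l < "$name")
    missing=$((max - lineNumber))
    for ((i = 0; i < missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done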


Well, it is neither elegant nor efficient. Actually, it takes a couple of seconds, which feels like an eternity given the small size of the data. Nevertheless, it works:



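#!/bin/bash

# One padding row. It must be quoted: a bare ;; is a syntax error in bash.
emptyLine=';;;;;;;'
rr=($(wc -l files*pattern.txt | awk '{print $1}' | sed '$ d'))
# One count per line, so that sort can pick the largest.
max=$(printf '%s\n' "${rr[@]}" | sort -nr | head -n1)
# Parsing ls and piping cat into wc are kept here as originally posted.
for name in $(ls files*pattern.txt); do
    lineNumber=$(cat "$name" | wc -l)
    missing=$((max - lineNumber))
    for ((i = 0; i < missing; i++)); do
        echo "$emptyLine" >> "$name"
    done
done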


I just put this script in the directory where I have the files; it works provided that there is a pattern, such as files*pattern.txt, that I can use to list them.

























  • 1




    Nice one (+1)! Thanks for posting up the solution. One small note, don't parse ls, just use for name in files*pattern.txt; do instead.
    – Sparhawk
    Dec 4 at 12:28






  • 1




    And while I'm nit-picking, there's a "useless use of cat" there too. Just do wc -l $name
    – Sparhawk
    Dec 4 at 12:30






  • 2




    @Sparhawk: I think you meant wc -l < $name
    – Thor
    Dec 4 at 12:37










  • @Thor No? wc [OPTION]... [FILE]... works too, as per the man. In fact, this script uses this construction in an earlier line.
    – Sparhawk
    Dec 4 at 20:54










  • @Sparhawk: Sure, but if you wanted the equivalent output of cat file | wc -l redirection is the way to go.
    – Thor
    Dec 5 at 8:07
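
To make the difference discussed in these comments concrete, here is a small sketch. `count_lines` is a hypothetical helper name, not something from the answer. It only wraps `wc -l < file`, which prints the bare count because wc reads standard input and never sees a file name. The `tr` call is there only because some wc implementations pad the number with blanks.

```shell
# count_lines FILE: print only the number of newline-terminated lines in FILE.
# (Hypothetical helper name, for illustration only.)
count_lines() {
    # With input redirection wc reads stdin, so it has no file name to print.
    # Some wc implementations pad the count with blanks; tr strips them.
    wc -l < "$1" | tr -d '[:space:]'
}
```

By contrast, `wc -l "$name"` prints the count followed by the file name, so its output cannot be used directly in the arithmetic that computes the number of missing lines.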































An improvement of @myradio's answer.

The part inside the loop is written in awk, which should be much faster.



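max=$(wc -l file*.csv | sed '$ d' | sort -n | tail -n1 | awk '{print $1}')
for f in file*.csv; do
    awk -F';' -v max="$max" '
        END {
            # Build NF-1 blanks, then turn each blank into a field separator.
            s = sprintf("%*s", NF - 1, "");
            gsub(/ /, FS, s);
            # Print one padding row for every line the file is short of max.
            for (i = NR; i < max; i++)
                print s;
        }' "$f" >> "$f"
done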


With -F you set the correct field separator of your files (here -F';').



The s=sprintf();gsub(); part dynamically builds a string with the right number of FS (= field separator) characters.

You could simply replace that with print ";;;;;" or other static content if you like.
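
As a variation on the same idea, the counting and the padding can both be done in a single awk call. This is only a sketch under some assumptions: `pad_csv_awk` is a made-up function name, the separator is assumed to be `;`, the width of the padding row is taken from the widest row seen in any file, every file is assumed to end with a newline, and empty files are skipped because awk never sees a record from them.

```shell
# pad_csv_awk FILE...: append separator-only rows so that every FILE has as
# many lines as the longest one. (Hypothetical name; assumes ';' separators.)
pad_csv_awk() {
    awk -F';' '
        # Remember the files in the order they were given.
        FNR == 1 { order[++nfiles] = FILENAME }
        {
            count[FILENAME] = FNR           # ends up as the line count
            if (FNR > max)  max = FNR       # longest file seen so far
            if (NF > cols)  cols = NF       # widest row seen so far
        }
        END {
            # A padding row has one separator fewer than there are columns.
            pad = ""
            for (i = 1; i < cols; i++) pad = pad FS
            # All input has been read by now, so appending is safe.
            for (k = 1; k <= nfiles; k++) {
                f = order[k]
                for (i = count[f]; i < max; i++) print pad >> f
                close(f)
            }
        }
    ' "$@"
}
```

It would be called as `pad_csv_awk file*.csv`. Each file is read only once, and no separate wc pass is needed.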





























  • I like this solution. It's certainly harder to read, but it's good that it has a dynamic FS. Nevertheless, two things: 1. About the efficiency, I don't know what your time results mean, because this depends on the number and size of the files. 2. I actually wanted to try this to compare the timings with my version, but I ran into a problem: awk complains about gensub being undefined. Is gensub maybe only in gawk?
    – myradio
    Dec 5 at 9:22










  • yeah that seems to be GNU Awk. I replaced it with the gsub solution from the linked answer.
    – RoVo
    Dec 5 at 9:52












  • About the time results, I think I misunderstood the statement "it takes a couple of seconds" from your answer to be the time needed to process the example files from your question ... I removed that.
    – RoVo
    Dec 5 at 9:54

































In order to count the lines in each file only once:



wc -l *csv | sort -nr | sed 1d | {
    read -r max file
    pad=$(sed q "$file" | tr -cd ";")  # extract separators from first record
    while read -r lines file; do
        while [ $((lines += 1)) -le "$max" ]; do
            echo "$pad" >> "$file"
        done
    done
}


Note that any newlines in the filenames will cause problems for both sort and the while read loop, but they can handle filenames containing normal spaces.
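
If file names with newlines are a concern, one possible way around it is to never parse file names out of the wc output at all. The sketch below still counts each file only once, but it takes the files as arguments and keeps the counts in a plain space-separated list. `pad_to_longest` is a hypothetical name, and, as above, the padding row is taken from the first line of the longest file; both are assumptions for illustration.

```shell
# pad_to_longest FILE...: append separator-only rows so that every FILE ends
# up with as many lines as the longest one. (Hypothetical helper name.)
# The body runs in a subshell, so its variables do not leak to the caller.
pad_to_longest() (
    counts='' max=0 longest=''

    # First pass: count each file exactly once and remember the longest.
    for f do
        n=$(wc -l < "$f") || exit 1
        n=$((n))                  # drop any blank padding added by wc
        counts="$counts$n "
        if [ "$n" -gt "$max" ]; then max=$n longest=$f; fi
    done
    [ "$max" -gt 0 ] || exit 0    # nothing to pad against

    # Padding row: the separators found in the longest file's first line.
    pad=$(sed q < "$longest" | tr -cd ';')

    # Second pass: append the missing rows, opening each file only once.
    for f do
        n=${counts%% *}           # pop this file's count off the list
        counts=${counts#* }
        while [ "$n" -lt "$max" ]; do
            printf '%s\n' "$pad"
            n=$((n + 1))
        done >> "$f"
    done
)
```

It would be called as `pad_to_longest *csv`. Because the names only ever travel as arguments and redirection targets, spaces or newlines in them should not matter.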



























    myradio's answer: answered Dec 4 at 11:54, edited Dec 5 at 9:08








    RoVo's answer: answered Dec 4 at 13:53, edited Dec 5 at 9:51












    • I like this solution. It's certainly harder to read but is good that had a dynamic FS. Nevertheless, 2 things: 1. About the efficiency, I don't know what your time results mean because this depends on (number and size of) the files. 2. I actually wanted to try this to compare the time results with my version, but I got a problem, awk complains about gensub being undefined. Is gensub maybe on gawk instead?
      – myradio
      Dec 5 at 9:22










    • yeah that seems to be GNU Awk. I replaced it with the gsub solution from the linked answer.
      – RoVo
      Dec 5 at 9:52












    • About the time results, I think I misunderstood the statement "it takes a couple of seconds" from your answer to be the time needed to process the example files from your question ... I removed that.
      – RoVo
      Dec 5 at 9:54
































    up vote
    1
    down vote













    In order to count the lines in each file only once:



    wc -l *csv | sort -nr | sed 1d | {       # sed 1d drops the "total" line printed by wc
        read -r max file                     # the longest file comes first after sorting
        pad=$(sed q "$file" | tr -cd ";")    # extract separators from first record
        while read -r lines file; do
            while [ $((lines+=1)) -le "$max" ]; do
                echo "$pad" >> "$file"
            done
        done
    }


    Note that any newlines in the filenames will cause problems for both sort and the while read loop, but they can handle filenames containing normal spaces.
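
    If filenames with newlines are a concern, one possible workaround is to avoid parsing the wc output altogether. The following is only a sketch under that assumption, not part of the pipeline above: pad_files is a hypothetical function name, ';' is assumed to be the separator, and every file is assumed to end with a newline. Each file is still counted only once, and the counts are kept in a space-separated string so that plain POSIX sh is enough:

```shell
# pad_files FILE...: pad all given files to the length of the longest one.
# Hypothetical helper; uses the global variables max, counts, n, f and pad.
pad_files() {
    max=0
    counts=
    for f in "$@"; do
        n=$(( $(wc -l < "$f") ))         # count each file exactly once
        counts="$counts$n "              # remember the count, in argument order
        if [ "$n" -gt "$max" ]; then max=$n; fi
    done
    for f in "$@"; do
        n=${counts%% *}                  # take the next stored count
        counts=${counts#* }
        pad=$(sed q "$f" | tr -cd ';')   # separators of the first record
        while [ "$n" -lt "$max" ]; do
            printf '%s\n' "$pad" >> "$f"
            n=$((n + 1))
        done
    done
}
```

    It could be called as pad_files ./*.csv; because the filenames never pass through sort or read, spaces and newlines in them should not matter.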
















        answered Dec 4 at 15:42









        JigglyNaga






























