bash: how to compute average of different columns?























I am writing a script for automatically computing average runtime.



First I need to run time ./foo.py 100 times and append the output to the file time.txt (this part works):



$ for i in `seq 100`; do { time ./foo.py; } 2>> time.txt; done


The output looks as follows:



time ./foo.py
real 0m0,030s
user 0m0,030s
sys 0m0,000s
[...]


Runtimes from different scripts are in the same file. Each entry starts with time ./foo.py, followed by 100 "triplets" of real, user and sys.



Now, if possible, I would love to have the script automatically compute the average runtime for each tested file by using all 100 "triplets", and neatly returning only one "mean triplet".



I have thought about maybe using awk to calculate the mean, like this



awk '{ total += $2 } END { print total/NR }' time.txt


But the command would need to be adapted to fit my needs: in a value such as 0m0,030s, only the part after the , (e.g. ,030s) can be used for the computation, and the trailing s also has to be disregarded.



Since I do not know how to achieve this objective, I thought to ask the community.



Any help is greatly appreciated.










      bash awk time mean














      edited Nov 13 at 3:46









      mjuarez











      asked Nov 13 at 3:13









      OingoBoingo

























          1 Answer




















          accepted










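          It's easier if you tell time to output the time info in POSIX format (in bash, time -p ./foo.py), which reports plain seconds instead of values like 0m0,030s. The averages can then be computed with: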



          awk '/^real/ { totalReal += $2 } /^user/ { totalUser += $2 } /^sys/ { totalSys += $2 } END { print "realAvg " totalReal/(NR/4) "\n" "userAvg " totalUser/(NR/4) "\n" "sysAvg  " totalSys/(NR/4) }' time.txt


          Prints output as follows:



          realAvg 12.62
          userAvg 27
          sysAvg 3.8


          Explanation:




          • awk goes through each line in the file; if the line starts with real, it adds the second field to the totalReal variable, and likewise for user and sys. This keeps a running total for each of the three "types".

          • At the end, it prints the three running totals, each divided by the number of lines divided by 4. This is because you want each "set" of 4 lines to count as 1 instance, and awk's NR just counts the number of lines. Note that this assumes every run contributes exactly 4 lines to the file.
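If the file does not have exactly 4 lines per run, or still contains values in the default bash format such as 0m0,030s, a variant that counts the entries itself may be more forgiving. The following is only a sketch under some assumptions, not the method above: the names avg_times, to_seconds, use_label and mean are made up for illustration, and it assumes that each group of timings is introduced by a header line like time ./foo.py, as described in the question. It accepts a decimal comma or dot and prints one "mean triplet" per header.

```shell
# Hypothetical helper: average real/user/sys per tested script in a timing log.
# Usage: avg_times FILE
avg_times() {
  # LC_ALL=C so that awk reads and prints numbers with a dot as decimal point.
  LC_ALL=C awk '
    # Convert "0m0,030s" (default bash format) or "0.03" (time -p) to seconds.
    function to_seconds(v,    parts) {
      gsub(/,/, ".", v)                  # accept a decimal comma
      if (v ~ /m/) {                     # minutes and seconds, e.g. 1m2.500s
        split(v, parts, "m")
        sub(/s$/, "", parts[2])          # drop the trailing s
        return parts[1] * 60 + parts[2]
      }
      return v + 0
    }
    # Remember the current group and the order in which groups first appear.
    function use_label(name) {
      label = name
      if (!(name in seen)) { seen[name] = 1; order[++n] = name }
    }
    # Mean of one kind of timing for one group; 0 if there are no entries.
    function mean(name, kind) {
      return count[name, kind] ? sum[name, kind] / count[name, kind] : 0
    }
    # A header line such as "time ./foo.py" starts a new group (assumed format).
    $1 == "time" { use_label($2); next }
    $1 == "real" || $1 == "user" || $1 == "sys" {
      if (label == "") use_label("(all)")   # no header seen yet
      sum[label, $1] += to_seconds($2)
      count[label, $1]++
    }
    END {
      for (i = 1; i <= n; i++) {
        name = order[i]
        printf "%s realAvg %.3f userAvg %.3f sysAvg %.3f\n",
               name, mean(name, "real"), mean(name, "user"), mean(name, "sys")
      }
    }
  ' "$1"
}
```

After defining the function, it would be called as avg_times time.txt. Because each total is divided by the number of matching lines actually seen, blank lines or other stray lines in the file do not skew the result.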






          answered Nov 13 at 3:46









          mjuarez













          • Thank you very much! The idea with POSIX and keeping track of instances is really great. I'll try that out as soon as I have got the time. If I get it to work, I'll accept your answer. Thanks again!
            – OingoBoingo
            Nov 13 at 15:34
























