Pandas measure elapsed time since a condition











up vote
14
down vote

favorite
1












I have the following dataframe:



               Time   Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off


I would like to creat a new column with the elapsed time since the device started working.



               Time   Work    Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0


How can I do it?










share|improve this question


















  • 4




    Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
    – zarose
    Dec 6 at 20:21















up vote
14
down vote

favorite
1












I have the following dataframe:



               Time   Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off


I would like to creat a new column with the elapsed time since the device started working.



               Time   Work    Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0


How can I do it?










share|improve this question


















  • 4




    Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
    – zarose
    Dec 6 at 20:21













up vote
14
down vote

favorite
1









up vote
14
down vote

favorite
1






1





I have the following dataframe:



               Time   Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off


I would like to creat a new column with the elapsed time since the device started working.



               Time   Work    Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0


How can I do it?










share|improve this question













I have the following dataframe:



               Time   Work
2018-12-01 10:00:00 Off
2018-12-01 10:00:02 On
2018-12-01 10:00:05 On
2018-12-01 10:00:06 On
2018-12-01 10:00:07 On
2018-12-01 10:00:09 Off
2018-12-01 10:00:11 Off
2018-12-01 10:00:14 On
2018-12-01 10:00:16 On
2018-12-01 10:00:18 On
2018-12-01 10:00:20 Off


I would like to creat a new column with the elapsed time since the device started working.



               Time   Work    Elapsed Time
2018-12-01 10:00:00 Off 0
2018-12-01 10:00:02 On 2
2018-12-01 10:00:05 On 5
2018-12-01 10:00:06 On 6
2018-12-01 10:00:07 On 7
2018-12-01 10:00:09 Off 0
2018-12-01 10:00:11 Off 0
2018-12-01 10:00:14 On 3
2018-12-01 10:00:16 On 5
2018-12-01 10:00:18 On 7
2018-12-01 10:00:20 Off 0


How can I do it?







python pandas time timedelta






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Dec 6 at 17:36









Rafael

713




713








  • 4




    Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
    – zarose
    Dec 6 at 20:21














  • 4




    Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
    – zarose
    Dec 6 at 20:21








4




4




Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
– zarose
Dec 6 at 20:21




Welcome to Stack Overflow, Rafael! I definitely came here just because the title seemed amusing, but left learning what Pandas actually means in this context.
– zarose
Dec 6 at 20:21












5 Answers
5






active

oldest

votes

















up vote
13
down vote













You can use groupby:



# df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
sec = df['Time'].dt.second
df['Elapsed Time'] = (
sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0
1 2018-12-01 10:00:02 On 2
2 2018-12-01 10:00:05 On 5
3 2018-12-01 10:00:06 On 6
4 2018-12-01 10:00:07 On 7
5 2018-12-01 10:00:09 Off 0
6 2018-12-01 10:00:11 Off 0
7 2018-12-01 10:00:14 On 3
8 2018-12-01 10:00:16 On 5
9 2018-12-01 10:00:18 On 7
10 2018-12-01 10:00:20 Off 0


The idea is to extract the seconds portion and subtract the elapsed time from the first moment the state changes from "Off" to "On". This is done using transform and first.



cumsum is used to identify groups:



df.Work.eq('Off').cumsum()

0 1
1 1
2 1
3 1
4 1
5 2
6 3
7 3
8 3
9 3
10 4
Name: Work, dtype: int64




If there's a possibility your device can span multiple minutes while in the "On", then, initialise sec as:



sec = df['Time'].values.astype(np.int64) // 10e8

df['Elapsed Time'] = (
sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





share|improve this answer























  • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
    – coldspeed
    Dec 6 at 19:19










  • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
    – coldspeed
    Dec 6 at 19:21










  • The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
    – Rafael
    Dec 6 at 19:32










  • @Rafael can you read my comments just about yours again please?
    – coldspeed
    Dec 6 at 19:32










  • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
    – Rafael
    Dec 6 at 19:49


















up vote
8
down vote













IIUC first with transform



(df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
Out[1090]:
0 0
1 2
2 5
3 6
4 7
5 0
6 0
7 3
8 5
9 7
10 0
Name: Time, dtype: int64





share|improve this answer





















  • If I set the column Time as the index, how should I change the code so it would also work?
    – Rafael
    2 days ago










  • @Rafael df.reset_index(inplace=True)
    – W-B
    2 days ago










  • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
    – Rafael
    2 days ago










  • @Rafael this is df.reset_index(inplace=True) reset not set
    – W-B
    2 days ago


















up vote
7
down vote













You could use two groupbys. The first calculates the time difference within each group. The second then sums those within each group.



s = (df.Work=='Off').cumsum()
df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


Output



                  Time Work  Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





share|improve this answer





















  • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
    – Rafael
    Dec 6 at 19:25










  • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
    – ALollz
    Dec 6 at 19:34




















up vote
4
down vote













Using a groupby, you can do this:



df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
.transform(lambda x: x.diff()
.dt.total_seconds()
.cumsum())
.fillna(0))

>>> df
Time Work Elapsed Time
0 2018-12-01 10:00:00 Off 0.0
1 2018-12-01 10:00:02 On 2.0
2 2018-12-01 10:00:05 On 5.0
3 2018-12-01 10:00:06 On 6.0
4 2018-12-01 10:00:07 On 7.0
5 2018-12-01 10:00:09 Off 0.0
6 2018-12-01 10:00:11 Off 0.0
7 2018-12-01 10:00:14 On 3.0
8 2018-12-01 10:00:16 On 5.0
9 2018-12-01 10:00:18 On 7.0
10 2018-12-01 10:00:20 Off 0.0





share|improve this answer




























    up vote
    3
    down vote













    A numpy slicy approach



    u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
    t = df.Time.values

    df['Elapsed Time'] = t - t[f[i]]
    df

    Time Work Elapsed Time
    0 2018-12-01 10:00:00 Off 00:00:00
    1 2018-12-01 10:00:02 On 00:00:02
    2 2018-12-01 10:00:05 On 00:00:05
    3 2018-12-01 10:00:06 On 00:00:06
    4 2018-12-01 10:00:07 On 00:00:07
    5 2018-12-01 10:00:09 Off 00:00:00
    6 2018-12-01 10:00:11 Off 00:00:00
    7 2018-12-01 10:00:14 On 00:00:03
    8 2018-12-01 10:00:16 On 00:00:05
    9 2018-12-01 10:00:18 On 00:00:07
    10 2018-12-01 10:00:20 Off 00:00:00




    We can nail down the integer bit with



    df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
    df

    Time Work Elapsed Time
    0 2018-12-01 10:00:00 Off 0
    1 2018-12-01 10:00:02 On 2
    2 2018-12-01 10:00:05 On 5
    3 2018-12-01 10:00:06 On 6
    4 2018-12-01 10:00:07 On 7
    5 2018-12-01 10:00:09 Off 0
    6 2018-12-01 10:00:11 Off 0
    7 2018-12-01 10:00:14 On 3
    8 2018-12-01 10:00:16 On 5
    9 2018-12-01 10:00:18 On 7
    10 2018-12-01 10:00:20 Off 0





    share|improve this answer





















      Your Answer






      StackExchange.ifUsing("editor", function () {
      StackExchange.using("externalEditor", function () {
      StackExchange.using("snippets", function () {
      StackExchange.snippets.init();
      });
      });
      }, "code-snippets");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "1"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53656878%2fpandas-measure-elapsed-time-since-a-condition%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      5 Answers
      5






      active

      oldest

      votes








      5 Answers
      5






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      13
      down vote













      You can use groupby:



      # df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
      sec = df['Time'].dt.second
      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0
      1 2018-12-01 10:00:02 On 2
      2 2018-12-01 10:00:05 On 5
      3 2018-12-01 10:00:06 On 6
      4 2018-12-01 10:00:07 On 7
      5 2018-12-01 10:00:09 Off 0
      6 2018-12-01 10:00:11 Off 0
      7 2018-12-01 10:00:14 On 3
      8 2018-12-01 10:00:16 On 5
      9 2018-12-01 10:00:18 On 7
      10 2018-12-01 10:00:20 Off 0


      The idea is to extract the seconds portion and subtract the elapsed time from the first moment the state changes from "Off" to "On". This is done using transform and first.



      cumsum is used to identify groups:



      df.Work.eq('Off').cumsum()

      0 1
      1 1
      2 1
      3 1
      4 1
      5 2
      6 3
      7 3
      8 3
      9 3
      10 4
      Name: Work, dtype: int64




      If there's a possibility your device can span multiple minutes while in the "On", then, initialise sec as:



      sec = df['Time'].values.astype(np.int64) // 10e8

      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer























      • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
        – coldspeed
        Dec 6 at 19:19










      • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
        – coldspeed
        Dec 6 at 19:21










      • The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
        – Rafael
        Dec 6 at 19:32










      • @Rafael can you read my comments just about yours again please?
        – coldspeed
        Dec 6 at 19:32










      • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
        – Rafael
        Dec 6 at 19:49















      up vote
      13
      down vote













      You can use groupby:



      # df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
      sec = df['Time'].dt.second
      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0
      1 2018-12-01 10:00:02 On 2
      2 2018-12-01 10:00:05 On 5
      3 2018-12-01 10:00:06 On 6
      4 2018-12-01 10:00:07 On 7
      5 2018-12-01 10:00:09 Off 0
      6 2018-12-01 10:00:11 Off 0
      7 2018-12-01 10:00:14 On 3
      8 2018-12-01 10:00:16 On 5
      9 2018-12-01 10:00:18 On 7
      10 2018-12-01 10:00:20 Off 0


      The idea is to extract the seconds portion and subtract the elapsed time from the first moment the state changes from "Off" to "On". This is done using transform and first.



      cumsum is used to identify groups:



      df.Work.eq('Off').cumsum()

      0 1
      1 1
      2 1
      3 1
      4 1
      5 2
      6 3
      7 3
      8 3
      9 3
      10 4
      Name: Work, dtype: int64




      If there's a possibility your device can span multiple minutes while in the "On", then, initialise sec as:



      sec = df['Time'].values.astype(np.int64) // 10e8

      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer























      • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
        – coldspeed
        Dec 6 at 19:19










      • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
        – coldspeed
        Dec 6 at 19:21










      • The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
        – Rafael
        Dec 6 at 19:32










      • @Rafael can you read my comments just about yours again please?
        – coldspeed
        Dec 6 at 19:32










      • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
        – Rafael
        Dec 6 at 19:49













      up vote
      13
      down vote










      up vote
      13
      down vote









      You can use groupby:



      # df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
      sec = df['Time'].dt.second
      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0
      1 2018-12-01 10:00:02 On 2
      2 2018-12-01 10:00:05 On 5
      3 2018-12-01 10:00:06 On 6
      4 2018-12-01 10:00:07 On 7
      5 2018-12-01 10:00:09 Off 0
      6 2018-12-01 10:00:11 Off 0
      7 2018-12-01 10:00:14 On 3
      8 2018-12-01 10:00:16 On 5
      9 2018-12-01 10:00:18 On 7
      10 2018-12-01 10:00:20 Off 0


      The idea is to extract the seconds portion and subtract the elapsed time from the first moment the state changes from "Off" to "On". This is done using transform and first.



      cumsum is used to identify groups:



      df.Work.eq('Off').cumsum()

      0 1
      1 1
      2 1
      3 1
      4 1
      5 2
      6 3
      7 3
      8 3
      9 3
      10 4
      Name: Work, dtype: int64




      If there's a possibility your device can span multiple minutes while in the "On", then, initialise sec as:



      sec = df['Time'].values.astype(np.int64) // 10e8

      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer














      You can use groupby:



      # df['Time'] = pd.to_datetime(df['Time'], errors='coerce') # Uncomment if needed.
      sec = df['Time'].dt.second
      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0
      1 2018-12-01 10:00:02 On 2
      2 2018-12-01 10:00:05 On 5
      3 2018-12-01 10:00:06 On 6
      4 2018-12-01 10:00:07 On 7
      5 2018-12-01 10:00:09 Off 0
      6 2018-12-01 10:00:11 Off 0
      7 2018-12-01 10:00:14 On 3
      8 2018-12-01 10:00:16 On 5
      9 2018-12-01 10:00:18 On 7
      10 2018-12-01 10:00:20 Off 0


      The idea is to extract the seconds portion and subtract the elapsed time from the first moment the state changes from "Off" to "On". This is done using transform and first.



      cumsum is used to identify groups:



      df.Work.eq('Off').cumsum()

      0 1
      1 1
      2 1
      3 1
      4 1
      5 2
      6 3
      7 3
      8 3
      9 3
      10 4
      Name: Work, dtype: int64




      If there's a possibility your device can span multiple minutes while in the "On", then, initialise sec as:



      sec = df['Time'].values.astype(np.int64) // 10e8

      df['Elapsed Time'] = (
      sec - sec.groupby(df.Work.eq('Off').cumsum()).transform('first'))

      df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0






      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Dec 6 at 19:50

























      answered Dec 6 at 17:42









      coldspeed

      114k18105182




      114k18105182












      • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
        – coldspeed
        Dec 6 at 19:19










      • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
        – coldspeed
        Dec 6 at 19:21










      • The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
        – Rafael
        Dec 6 at 19:32










      • @Rafael can you read my comments just about yours again please?
        – coldspeed
        Dec 6 at 19:32










      • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
        – Rafael
        Dec 6 at 19:49


















      • @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
        – coldspeed
        Dec 6 at 19:19










      • @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
        – coldspeed
        Dec 6 at 19:21










      • The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
        – Rafael
        Dec 6 at 19:32










      • @Rafael can you read my comments just about yours again please?
        – coldspeed
        Dec 6 at 19:32










      • I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
        – Rafael
        Dec 6 at 19:49
















      @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
      – coldspeed
      Dec 6 at 19:19




      @Rafael Yeah, the assumption here is that your row starts in the "Off" condition. Can you append a row at the beginning of your frame?
      – coldspeed
      Dec 6 at 19:19












      @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
      – coldspeed
      Dec 6 at 19:21




      @Rafael Okay, and regarding your second point, does df['Time'].values.astype(np.int64) // 10e8 work?
      – coldspeed
      Dec 6 at 19:21












      The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
      – Rafael
      Dec 6 at 19:32




      The code worked fine for seconds. However, when the first cell of the column Work was 'On' the elapsed time did not begin in zero. Besides, when the time changed to the next minute, the elapsed time was negative. I tried using sec = df['Time'].astype(int) but I got the error: cannot astype a datetimelike from [datetime64[ns]] to [int32];
      – Rafael
      Dec 6 at 19:32












      @Rafael can you read my comments just about yours again please?
      – coldspeed
      Dec 6 at 19:32




      @Rafael can you read my comments just about yours again please?
      – coldspeed
      Dec 6 at 19:32












      I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
      – Rafael
      Dec 6 at 19:49




      I deleted the comment and posted it again so I could edit it. Regarding your answers, I receive the data every day, it begins 'On' and ends 'On', so I am not sure if I can append a row, but I will try to using the date change as condition. The code df['Time'].values.astype(np.int64) // 10e8 did work.
      – Rafael
      Dec 6 at 19:49












      up vote
      8
      down vote













      IIUC first with transform



      (df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
      Out[1090]:
      0 0
      1 2
      2 5
      3 6
      4 7
      5 0
      6 0
      7 3
      8 5
      9 7
      10 0
      Name: Time, dtype: int64





      share|improve this answer





















      • If I set the column Time as the index, how should I change the code so it would also work?
        – Rafael
        2 days ago










      • @Rafael df.reset_index(inplace=True)
        – W-B
        2 days ago










      • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
        – Rafael
        2 days ago










      • @Rafael this is df.reset_index(inplace=True) reset not set
        – W-B
        2 days ago















      up vote
      8
      down vote













      IIUC first with transform



      (df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
      Out[1090]:
      0 0
      1 2
      2 5
      3 6
      4 7
      5 0
      6 0
      7 3
      8 5
      9 7
      10 0
      Name: Time, dtype: int64





      share|improve this answer





















      • If I set the column Time as the index, how should I change the code so it would also work?
        – Rafael
        2 days ago










      • @Rafael df.reset_index(inplace=True)
        – W-B
        2 days ago










      • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
        – Rafael
        2 days ago










      • @Rafael this is df.reset_index(inplace=True) reset not set
        – W-B
        2 days ago













      up vote
      8
      down vote










      up vote
      8
      down vote









      IIUC first with transform



      (df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
      Out[1090]:
      0 0
      1 2
      2 5
      3 6
      4 7
      5 0
      6 0
      7 3
      8 5
      9 7
      10 0
      Name: Time, dtype: int64





      share|improve this answer












      IIUC first with transform



      (df.Time-df.Time.groupby(df.Work.eq('Off').cumsum()).transform('first')).dt.seconds
      Out[1090]:
      0 0
      1 2
      2 5
      3 6
      4 7
      5 0
      6 0
      7 3
      8 5
      9 7
      10 0
      Name: Time, dtype: int64






      share|improve this answer












      share|improve this answer



      share|improve this answer










      answered Dec 6 at 18:05









      W-B

      97.9k73162




      97.9k73162












      • If I set the column Time as the index, how should I change the code so it would also work?
        – Rafael
        2 days ago










      • @Rafael df.reset_index(inplace=True)
        – W-B
        2 days ago










      • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
        – Rafael
        2 days ago










      • @Rafael this is df.reset_index(inplace=True) reset not set
        – W-B
        2 days ago


















      • If I set the column Time as the index, how should I change the code so it would also work?
        – Rafael
        2 days ago










      • @Rafael df.reset_index(inplace=True)
        – W-B
        2 days ago










      • I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
        – Rafael
        2 days ago










      • @Rafael this is df.reset_index(inplace=True) reset not set
        – W-B
        2 days ago
















      If I set the column Time as the index, how should I change the code so it would also work?
      – Rafael
      2 days ago




      If I set the column Time as the index, how should I change the code so it would also work?
      – Rafael
      2 days ago












      @Rafael df.reset_index(inplace=True)
      – W-B
      2 days ago




      @Rafael df.reset_index(inplace=True)
      – W-B
      2 days ago












      I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
      – Rafael
      2 days ago




      I added the line df.set_index('Time', inplace=True) before the code you wrote for the elapsed time. So I have to adapt the code to subtract in the index column instead of the Time column. I tried (df.index-df.index.groupby(df.Operation.eq('Off').cumsum()).transform('first')) but it did not work.
      – Rafael
      2 days ago












      @Rafael this is df.reset_index(inplace=True) reset not set
      – W-B
      2 days ago




      @Rafael this is df.reset_index(inplace=True) reset not set
      – W-B
      2 days ago










      up vote
      7
      down vote













      You could use two groupbys. The first calculates the time difference within each group. The second then sums those within each group.



      s = (df.Work=='Off').cumsum()
      df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


      Output



                        Time Work  Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer





















      • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
        – Rafael
        Dec 6 at 19:25










      • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
        – ALollz
        Dec 6 at 19:34

















      up vote
      7
      down vote













      You could use two groupbys. The first calculates the time difference within each group. The second then sums those within each group.



      s = (df.Work=='Off').cumsum()
      df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


      Output



                        Time Work  Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer





















      • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
        – Rafael
        Dec 6 at 19:25










      • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
        – ALollz
        Dec 6 at 19:34















      up vote
      7
      down vote










      up vote
      7
      down vote









      You could use two groupbys. The first calculates the time difference within each group. The second then sums those within each group.



      s = (df.Work=='Off').cumsum()
      df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


      Output



                        Time Work  Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer












      You could use two groupbys. The first calculates the time difference within each group. The second then sums those within each group.



      s = (df.Work=='Off').cumsum()
      df['Elapsed Time'] = df.groupby(s).Time.diff().dt.total_seconds().fillna(0).groupby(s).cumsum()


      Output



                        Time Work  Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0






      share|improve this answer












      share|improve this answer



      share|improve this answer










      answered Dec 6 at 17:41









      ALollz

      10.9k31334




      10.9k31334












      • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
        – Rafael
        Dec 6 at 19:25










      • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
        – ALollz
        Dec 6 at 19:34




















      • The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
        – Rafael
        Dec 6 at 19:25










      • @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
        – ALollz
        Dec 6 at 19:34


















      The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
      – Rafael
      Dec 6 at 19:25




      The code worked fine. However, when the first work cell of the dataframe was 'On', the elapsed time was not zero.
      – Rafael
      Dec 6 at 19:25












      @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
      – ALollz
      Dec 6 at 19:34






      @Rafael good point. There might be a neat way to fix it in the calculation, but you can just fix it after the fact with df.loc[df.index < s[s==1].idxmax(), 'Elapsed Time'] = 0. I guess there's still an issue if the machine never tuns on, but that can be fixed or handled too.
      – ALollz
      Dec 6 at 19:34












      up vote
      4
      down vote













      Using a groupby, you can do this:



      df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
      .transform(lambda x: x.diff()
      .dt.total_seconds()
      .cumsum())
      .fillna(0))

      >>> df
      Time Work Elapsed Time
      0 2018-12-01 10:00:00 Off 0.0
      1 2018-12-01 10:00:02 On 2.0
      2 2018-12-01 10:00:05 On 5.0
      3 2018-12-01 10:00:06 On 6.0
      4 2018-12-01 10:00:07 On 7.0
      5 2018-12-01 10:00:09 Off 0.0
      6 2018-12-01 10:00:11 Off 0.0
      7 2018-12-01 10:00:14 On 3.0
      8 2018-12-01 10:00:16 On 5.0
      9 2018-12-01 10:00:18 On 7.0
      10 2018-12-01 10:00:20 Off 0.0





      share|improve this answer

























        up vote
        4
        down vote













        Using a groupby, you can do this:



        df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
        .transform(lambda x: x.diff()
        .dt.total_seconds()
        .cumsum())
        .fillna(0))

        >>> df
        Time Work Elapsed Time
        0 2018-12-01 10:00:00 Off 0.0
        1 2018-12-01 10:00:02 On 2.0
        2 2018-12-01 10:00:05 On 5.0
        3 2018-12-01 10:00:06 On 6.0
        4 2018-12-01 10:00:07 On 7.0
        5 2018-12-01 10:00:09 Off 0.0
        6 2018-12-01 10:00:11 Off 0.0
        7 2018-12-01 10:00:14 On 3.0
        8 2018-12-01 10:00:16 On 5.0
        9 2018-12-01 10:00:18 On 7.0
        10 2018-12-01 10:00:20 Off 0.0





        share|improve this answer























          up vote
          4
          down vote










          up vote
          4
          down vote









          Using a groupby, you can do this:



          df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
          .transform(lambda x: x.diff()
          .dt.total_seconds()
          .cumsum())
          .fillna(0))

          >>> df
          Time Work Elapsed Time
          0 2018-12-01 10:00:00 Off 0.0
          1 2018-12-01 10:00:02 On 2.0
          2 2018-12-01 10:00:05 On 5.0
          3 2018-12-01 10:00:06 On 6.0
          4 2018-12-01 10:00:07 On 7.0
          5 2018-12-01 10:00:09 Off 0.0
          6 2018-12-01 10:00:11 Off 0.0
          7 2018-12-01 10:00:14 On 3.0
          8 2018-12-01 10:00:16 On 5.0
          9 2018-12-01 10:00:18 On 7.0
          10 2018-12-01 10:00:20 Off 0.0





          share|improve this answer












          Using a groupby, you can do this:



          df['Elapsed Time'] = (df.groupby(df.Work.eq('Off').cumsum()).Time
          .transform(lambda x: x.diff()
          .dt.total_seconds()
          .cumsum())
          .fillna(0))

          >>> df
          Time Work Elapsed Time
          0 2018-12-01 10:00:00 Off 0.0
          1 2018-12-01 10:00:02 On 2.0
          2 2018-12-01 10:00:05 On 5.0
          3 2018-12-01 10:00:06 On 6.0
          4 2018-12-01 10:00:07 On 7.0
          5 2018-12-01 10:00:09 Off 0.0
          6 2018-12-01 10:00:11 Off 0.0
          7 2018-12-01 10:00:14 On 3.0
          8 2018-12-01 10:00:16 On 5.0
          9 2018-12-01 10:00:18 On 7.0
          10 2018-12-01 10:00:20 Off 0.0






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Dec 6 at 17:42









          sacul

          29.7k41640




          29.7k41640






















              up vote
              3
              down vote













              A numpy slicy approach



              u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
              t = df.Time.values

              df['Elapsed Time'] = t - t[f[i]]
              df

              Time Work Elapsed Time
              0 2018-12-01 10:00:00 Off 00:00:00
              1 2018-12-01 10:00:02 On 00:00:02
              2 2018-12-01 10:00:05 On 00:00:05
              3 2018-12-01 10:00:06 On 00:00:06
              4 2018-12-01 10:00:07 On 00:00:07
              5 2018-12-01 10:00:09 Off 00:00:00
              6 2018-12-01 10:00:11 Off 00:00:00
              7 2018-12-01 10:00:14 On 00:00:03
              8 2018-12-01 10:00:16 On 00:00:05
              9 2018-12-01 10:00:18 On 00:00:07
              10 2018-12-01 10:00:20 Off 00:00:00




              We can nail down the integer bit with



              df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
              df

              Time Work Elapsed Time
              0 2018-12-01 10:00:00 Off 0
              1 2018-12-01 10:00:02 On 2
              2 2018-12-01 10:00:05 On 5
              3 2018-12-01 10:00:06 On 6
              4 2018-12-01 10:00:07 On 7
              5 2018-12-01 10:00:09 Off 0
              6 2018-12-01 10:00:11 Off 0
              7 2018-12-01 10:00:14 On 3
              8 2018-12-01 10:00:16 On 5
              9 2018-12-01 10:00:18 On 7
              10 2018-12-01 10:00:20 Off 0





              share|improve this answer

























                up vote
                3
                down vote













                A numpy slicy approach



                u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
                t = df.Time.values

                df['Elapsed Time'] = t - t[f[i]]
                df

                Time Work Elapsed Time
                0 2018-12-01 10:00:00 Off 00:00:00
                1 2018-12-01 10:00:02 On 00:00:02
                2 2018-12-01 10:00:05 On 00:00:05
                3 2018-12-01 10:00:06 On 00:00:06
                4 2018-12-01 10:00:07 On 00:00:07
                5 2018-12-01 10:00:09 Off 00:00:00
                6 2018-12-01 10:00:11 Off 00:00:00
                7 2018-12-01 10:00:14 On 00:00:03
                8 2018-12-01 10:00:16 On 00:00:05
                9 2018-12-01 10:00:18 On 00:00:07
                10 2018-12-01 10:00:20 Off 00:00:00




                We can nail down the integer bit with



                df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
                df

                Time Work Elapsed Time
                0 2018-12-01 10:00:00 Off 0
                1 2018-12-01 10:00:02 On 2
                2 2018-12-01 10:00:05 On 5
                3 2018-12-01 10:00:06 On 6
                4 2018-12-01 10:00:07 On 7
                5 2018-12-01 10:00:09 Off 0
                6 2018-12-01 10:00:11 Off 0
                7 2018-12-01 10:00:14 On 3
                8 2018-12-01 10:00:16 On 5
                9 2018-12-01 10:00:18 On 7
                10 2018-12-01 10:00:20 Off 0





                share|improve this answer























                  up vote
                  3
                  down vote










                  up vote
                  3
                  down vote









                  A numpy slicy approach



                  u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
                  t = df.Time.values

                  df['Elapsed Time'] = t - t[f[i]]
                  df

                  Time Work Elapsed Time
                  0 2018-12-01 10:00:00 Off 00:00:00
                  1 2018-12-01 10:00:02 On 00:00:02
                  2 2018-12-01 10:00:05 On 00:00:05
                  3 2018-12-01 10:00:06 On 00:00:06
                  4 2018-12-01 10:00:07 On 00:00:07
                  5 2018-12-01 10:00:09 Off 00:00:00
                  6 2018-12-01 10:00:11 Off 00:00:00
                  7 2018-12-01 10:00:14 On 00:00:03
                  8 2018-12-01 10:00:16 On 00:00:05
                  9 2018-12-01 10:00:18 On 00:00:07
                  10 2018-12-01 10:00:20 Off 00:00:00




                  We can nail down the integer bit with



                  df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
                  df

                  Time Work Elapsed Time
                  0 2018-12-01 10:00:00 Off 0
                  1 2018-12-01 10:00:02 On 2
                  2 2018-12-01 10:00:05 On 5
                  3 2018-12-01 10:00:06 On 6
                  4 2018-12-01 10:00:07 On 7
                  5 2018-12-01 10:00:09 Off 0
                  6 2018-12-01 10:00:11 Off 0
                  7 2018-12-01 10:00:14 On 3
                  8 2018-12-01 10:00:16 On 5
                  9 2018-12-01 10:00:18 On 7
                  10 2018-12-01 10:00:20 Off 0





                  share|improve this answer












                  A numpy slicy approach



                  u, f, i = np.unique(df.Work.eq('Off').values.cumsum(), True, True)
                  t = df.Time.values

                  df['Elapsed Time'] = t - t[f[i]]
                  df

                  Time Work Elapsed Time
                  0 2018-12-01 10:00:00 Off 00:00:00
                  1 2018-12-01 10:00:02 On 00:00:02
                  2 2018-12-01 10:00:05 On 00:00:05
                  3 2018-12-01 10:00:06 On 00:00:06
                  4 2018-12-01 10:00:07 On 00:00:07
                  5 2018-12-01 10:00:09 Off 00:00:00
                  6 2018-12-01 10:00:11 Off 00:00:00
                  7 2018-12-01 10:00:14 On 00:00:03
                  8 2018-12-01 10:00:16 On 00:00:05
                  9 2018-12-01 10:00:18 On 00:00:07
                  10 2018-12-01 10:00:20 Off 00:00:00




                  We can nail down the integer bit with



                  df['Elapsed Time'] = (t - t[f[i]]).astype('timedelta64[s]').astype(int)
                  df

                  Time Work Elapsed Time
                  0 2018-12-01 10:00:00 Off 0
                  1 2018-12-01 10:00:02 On 2
                  2 2018-12-01 10:00:05 On 5
                  3 2018-12-01 10:00:06 On 6
                  4 2018-12-01 10:00:07 On 7
                  5 2018-12-01 10:00:09 Off 0
                  6 2018-12-01 10:00:11 Off 0
                  7 2018-12-01 10:00:14 On 3
                  8 2018-12-01 10:00:16 On 5
                  9 2018-12-01 10:00:18 On 7
                  10 2018-12-01 10:00:20 Off 0






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Dec 6 at 17:59









                  piRSquared

                  151k22138282




                  151k22138282






























                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Stack Overflow!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.





                      Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                      Please pay close attention to the following guidance:


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53656878%2fpandas-measure-elapsed-time-since-a-condition%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      How to change which sound is reproduced for terminal bell?

                      Title Spacing in Bjornstrup Chapter, Removing Chapter Number From Contents

                      Can I use Tabulator js library in my java Spring + Thymeleaf project?