Make Datetime Series from separate year, month, and date columns in Pandas












How can we use the columns 'Yr', 'Mo' and 'Dy' to create a new column of datetime type and set it as the index of the Pandas DataFrame?



      Tags: python, pandas, datetime, dataframe














      asked Nov 20 '18 at 23:47 by Shivam Sinha, edited Nov 23 '18 at 13:49 by Brad Solomon
























          2 Answers














          First, convert Yr to a four-digit int, e.g. 1961 or 2061, so that the century is unambiguous. The approach below also requires a four-digit year, because when you pass a DataFrame or dict, Pandas assembles the date with format='%Y%m%d' in pandas/core/tools/datetimes.py:



          # From pandas/core/tools/datetimes.py, if you pass a DataFrame or dict
          values = to_datetime(values, format='%Y%m%d', errors=errors)
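To make the quoted line more concrete, here is a minimal sketch of what it implies. It assumes that pandas packs year, month and day into a single YYYYMMDD number before parsing; the `parts`, `packed`, `by_hand` and `assembled` names are hypothetical and only used for illustration.

```python
import pandas as pd

# Hypothetical component columns, already carrying the names pandas expects.
parts = pd.DataFrame({"year": [1961, 1962], "month": [1, 2], "day": [2, 3]})

# Pack each row into one YYYYMMDD integer, then parse it with '%Y%m%d'.
# This mirrors the quoted internal call by hand (an assumption about internals).
packed = parts["year"] * 10000 + parts["month"] * 100 + parts["day"]
by_hand = pd.to_datetime(packed.astype(str), format="%Y%m%d")

# The public entry point: let pandas assemble the same dates itself.
assembled = pd.to_datetime(parts)
```

With a two-digit year such as 61, the packed value would be 610102, which no longer has the eight digits that '%Y%m%d' expects. That is one way to see why the four-digit year matters.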


          So, to take an example:



          from itertools import product

          import numpy as np
          import pandas as pd
          np.random.seed(444)

          datecols = ['Yr', 'Mo', 'Dy']
          mapper = dict(zip(datecols, ('year', 'month', 'day')))
          df = pd.DataFrame(list(product([61, 62], [1, 2], [1, 2, 3])),
                            columns=datecols)
          df['data'] = np.random.randn(len(df))


          Here is df:



          In [11]: df
          Out[11]:
              Yr  Mo  Dy      data
          0   61   1   1  0.357440
          1   61   1   2  0.377538
          2   61   1   3  1.382338
          3   61   2   1  1.175549
          4   61   2   2 -0.939276
          5   61   2   3 -1.143150
          6   62   1   1 -0.542440
          7   62   1   2 -0.548708
          8   62   1   3  0.208520
          9   62   2   1  0.212690
          10  62   2   2  1.268021
          11  62   2   3 -0.807303


          Let's assume for the sake of simplicity that the true range is 1921 through 2020, i.e. two-digit years up to 20 belong to the 2000s and all others to the 1900s:



          In [16]: yr = df['Yr']

          In [17]: df['Yr'] = np.where(yr <= 20, 2000 + yr, 1900 + yr)

          In [18]: df
          Out[18]:
                Yr  Mo  Dy      data
          0   1961   1   1  0.357440
          1   1961   1   2  0.377538
          2   1961   1   3  1.382338
          3   1961   2   1  1.175549
          4   1961   2   2 -0.939276
          5   1961   2   3 -1.143150
          6   1962   1   1 -0.542440
          7   1962   1   2 -0.548708
          8   1962   1   3  0.208520
          9   1962   2   1  0.212690
          10  1962   2   2  1.268021
          11  1962   2   3 -0.807303
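If the cutoff between centuries differs for your data, the same idea can be wrapped in a small helper. This is only a sketch: the `expand_year` name and its `pivot` parameter are hypothetical, and the default of 20 simply repeats the assumption made above.

```python
import numpy as np
import pandas as pd


def expand_year(two_digit, pivot=20):
    """Map two-digit years to four digits: values <= pivot become 20xx, others 19xx."""
    two_digit = pd.Series(two_digit)
    # Guard against values that already carry a century (or are otherwise out of range).
    if not two_digit.between(0, 99).all():
        raise ValueError("expected two-digit years in the range 0-99")
    expanded = np.where(two_digit <= pivot, 2000 + two_digit, 1900 + two_digit)
    return pd.Series(expanded, index=two_digit.index)


years = expand_year([61, 62, 20, 21, 0])
```

The range check is there so that running the conversion twice on the same column fails loudly instead of producing years like 3861.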


          The second thing you need to do is rename the columns. When you pass a mapping or DataFrame to pd.to_datetime(), Pandas only assembles dates from columns named year, month and day, so 'Yr', 'Mo' and 'Dy' will not be recognized as they are. Here is that step and the result:



          In [21]: df.index = pd.to_datetime(df[datecols].rename(columns=mapper))

          In [22]: df
          Out[22]:
                        Yr  Mo  Dy      data
          1961-01-01  1961   1   1  0.357440
          1961-01-02  1961   1   2  0.377538
          1961-01-03  1961   1   3  1.382338
          1961-02-01  1961   2   1  1.175549
          1961-02-02  1961   2   2 -0.939276
          1961-02-03  1961   2   3 -1.143150
          1962-01-01  1962   1   1 -0.542440
          1962-01-02  1962   1   2 -0.548708
          1962-01-03  1962   1   3  0.208520
          1962-02-01  1962   2   1  0.212690
          1962-02-02  1962   2   2  1.268021
          1962-02-03  1962   2   3 -0.807303
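A possible variation on the step above, sketched with made-up sample values: instead of renaming, pass pd.to_datetime() a dict whose keys are already year, month and day, and drop the component columns once the index exists. The `dates` and `indexed` names and the index name "date" are hypothetical.

```python
import pandas as pd

# Small stand-in frame with four-digit years already in place.
df = pd.DataFrame({
    "Yr": [1961, 1961, 1962],
    "Mo": [1, 2, 1],
    "Dy": [1, 3, 2],
    "data": [0.1, 0.2, 0.3],
})

# The dict keys supply the names pandas looks for, so no rename is needed.
dates = pd.to_datetime({"year": df["Yr"], "month": df["Mo"], "day": df["Dy"]})

# Use the dates as the index and drop the now-redundant component columns.
indexed = (
    df.drop(columns=["Yr", "Mo", "Dy"])
      .set_index(pd.DatetimeIndex(dates, name="date"))
)
```

Whether to keep 'Yr', 'Mo' and 'Dy' after building the index is a matter of taste; dropping them avoids carrying the same information twice.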


          Lastly, here's an alternative that concatenates the columns as strings. It is shown on the original two-digit Yr values, which is why the format uses %y:



          In [27]: as_str = df[datecols].astype(str)

          In [30]: pd.to_datetime(
              ...:     as_str['Yr'] + '-' + as_str['Mo'] + '-' + as_str['Dy'],
              ...:     format='%y-%m-%d'
              ...: )
          Out[30]:
          0    2061-01-01
          1    2061-01-02
          2    2061-01-03
          3    2061-02-01
          4    2061-02-02
          5    2061-02-03
          6    2062-01-01
          7    2062-01-02
          8    2062-01-03
          9    2062-02-01
          10   2062-02-02
          11   2062-02-03
          dtype: datetime64[ns]


          Notice that %y picks the century for you: 61 becomes 2061 here, not 1961. If you want to be explicit, add the correct century as shown above before defining as_str, and switch the format to '%Y-%m-%d'.
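A sketch of the explicit variant of the string approach, assuming the same cutoff as earlier (two-digit years up to 20 are treated as 20xx). The sample values are made up, and zero-padding month and day is an extra precaution that the original snippet does not include.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Yr": [61, 62, 5], "Mo": [1, 2, 12], "Dy": [1, 3, 31]})

# Add the century first, so that the format can use the unambiguous %Y.
year = pd.Series(
    np.where(df["Yr"] <= 20, 2000 + df["Yr"], 1900 + df["Yr"]),
    index=df.index,
)

# Build 'YYYY-MM-DD' strings, zero-padding month and day to two digits.
as_str = (
    year.astype(str)
    + "-" + df["Mo"].astype(str).str.zfill(2)
    + "-" + df["Dy"].astype(str).str.zfill(2)
)

dates = pd.to_datetime(as_str, format="%Y-%m-%d")
```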




















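            As pointed out by Brad, the year needs its century added first. This is how I fixed it:



            import pandas as pd

            def adjustyear(x):
                # Assumption: two-digit years above 20 belong to the 1900s and
                # the rest to the 2000s (a test such as x >= 1800 is never true
                # for two-digit input).
                if x > 20:
                    return 1900 + x
                return 2000 + x

            def parsefunc(x):
                # Cast to int: a row taken from a frame with float columns arrives as floats.
                year = adjustyear(int(x['Yr']))
                # Zero-pad month and day so that '%Y%m%d' parses unambiguously.
                datetimestr = '{:04d}{:02d}{:02d}'.format(year, int(x['Mo']), int(x['Dy']))
                # Invalid dates raise here rather than being returned unparsed.
                return pd.to_datetime(datetimestr, format='%Y%m%d')

            # data is the DataFrame holding the 'Yr', 'Mo' and 'Dy' columns.
            data['newindex'] = data.apply(parsefunc, axis=1)
            data.index = data['newindex']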
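When some rows might hold impossible dates (February 30, for example), one option is errors='coerce', which turns them into NaT so they can be inspected. This is a minimal sketch with made-up values; the `parts`, `dates` and `bad_rows` names are hypothetical.

```python
import pandas as pd

# Two sample rows: a valid date and an impossible one (February 30).
parts = pd.DataFrame({"year": [1961, 1961], "month": [2, 2], "day": [28, 30]})

# With errors='coerce', rows that cannot form a date become NaT.
dates = pd.to_datetime(parts, errors="coerce")

# Pull out the offending input rows for inspection before indexing on the dates.
bad_rows = parts[dates.isna()]
```

Checking `bad_rows` before setting the index avoids ending up with NaT entries in the index without noticing.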






































                  First answer: answered Nov 21 '18 at 0:05 by Brad Solomon, edited Nov 21 '18 at 0:15



































                          Second answer: answered Nov 21 '18 at 5:36 by Shivam Sinha





























