- Data manipulation - Time series
- Possible frequencies:
Print current datetime
from datetime import datetime
now = datetime.now()
date_time = now.strftime("%m/%d/%Y %H:%M:%S")
print(date_time)
Datetime from a timestamp, and timestamp from a datetime
from datetime import datetime
timestamp = 1458987543
date_time = datetime.fromtimestamp(timestamp)
print(date_time)
timestamp_regenerated = datetime.timestamp(date_time)
print(timestamp_regenerated)
Get a datetime from string
from datetime import datetime
date_string = "21 May, 1985"
date_time = datetime.strptime(date_string, "%d %B, %Y")
Get a datetime from string automaticly
from dateutil import parser
date_string = "21 May, 1985"
date_time = parser.parse(date_string)
import pandas as pd
datetime_string = "21 May, 1985"
date = pd.to_datetime(datetime_string)
import pandas as pd
index = pd.DatetimeIndex(['2013-05-01', '2014-05-01', '2015-07-04', '2016-05-01'])
data = pd.Series([0, 1, 2, 3], index=index)
import pandas as pd
index = pd.DatetimeIndex(['2013-05-01', '2014-05-01', '2015-07-04', '2016-05-01'])
data = pd.DataFrame(
{
'column_1': [0, 1, 2, 3],
'column_2': [0, 1, 2, 3]
},
index=index
)
Important: To prevent errors while converting to datetime use errors parameter For instance errors='coerce' will put NaT values in invalid values
import pandas as pd
# my_dataframe have a colum date with string date value inside
my_dataframe['date'] = pd.to_datetime(my_dataframe['date'], errors='coerce')
my_dataframe.set_index('date', inplace=True)
Process on index
import pandas as pd
df = pd.DataFrame({ 'date_column': ['2013-05-01', '2014-05-01', '2015-07-04', '2016-05-01'] })
df.date_column = pd.to_datetime(df.date_column)
df.set_index('date_column', inplace=True)
df['date_year_string'] = df.index.strftime('%Y')
Process on a column
import pandas as pd
df = pd.DataFrame({ 'date_column': ['2013-05-01', '2014-05-01', '2015-07-04', '2016-05-01'] })
df.date_column = pd.to_datetime(df.date_column)
df['date_year_string'] = df.date_column.dt.strftime('%Y')
import pandas as pd
data = pd.date_range('1998-03-10', '1998-03-15', freq='D')
Convert generated range to python datetime type
import pandas as pd
data = pd.date_range('1998-03-10', '1998-03-15', freq='D').to_pydatetime()
Possible frequencies:
- 'H' - hourly
- 'D' - Daily
- 'B' - Buisiness daily
- 'W' - Weekly
- 'M' - Monthly
- 'Q' - Quarterly
- 'A' - Annual
Refer to the full list in the bottom of page
import pandas as pd
# my_dataframe an index of datetime type
# select all data that match year: 2020
df_2020 = my_dataframe.loc['2020':'2020']
More complex example with date parse
from dateutil import parser
from_date_raw_string = '21 May 1985'
to_date_raw_string = '22 May 1985'
a = parser.parse(from_date_raw_string).strftime('%Y-%m-%d %H:%M:%S')
b = parser.parse(to_date_raw_string).strftime('%Y-%m-%d %H:%M:%S')
# my_dataframe an index of datetime type
df_filtered = my_dataframe.loc[a:b]
import pandas as pd
my_dataframe = my_dataframe.set_index('datetime_column')
my_dataframe = my_dataframe.resample('D').count()
import pandas as pd
my_dataframe = my_dataframe.set_index('datetime_column')
summed_serie = my_dataframe.serie_to_sum.resample('D').sum()
- resample('D').mean()
- resample('D').first()
- resample('D').last()
import pandas as pd
my_dataframe = my_dataframe.set_index('datetime_column')
def custom_sum(array_like):
return np.sum(array_like)
summed_serie = my_dataframe.serie_to_sum.resample('D').apply(custom_sum)
- B business day frequency
- C custom business day frequency (experimental)
- D calendar day frequency
- W weekly frequency
- M month end frequency
- SM semi-month end frequency (15th and end of month)
- BM business month end frequency
- CBM custom business month end frequency
- MS month start frequency
- SMS semi-month start frequency (1st and 15th)
- BMS business month start frequency
- CBMS custom business month start frequency
- Q quarter end frequency
- BQ business quarter endfrequency
- QS quarter start frequency
- BQS business quarter start frequency
- A year end frequency
- BA, BY business year end frequency
- AS, YS year start frequency
- BAS, BYS business year start frequency
- BH business hour frequency
- H hourly frequency
- T, min minutely frequency
- S secondly frequency
- L, ms milliseconds
- U, us microseconds
- N nanoseconds