In this article, we will work with a dataset that has columns for names, ages, and timestamps. Our goal is to extract time components from the timestamps, such as hours, minutes, seconds, and milliseconds, along with calendar fields such as year, month, day, ISO week, and quarter. We will also show how to convert the timestamps to a specific timezone in Python. To do this, we will use the pandas and pytz libraries.
Prerequisites
- Python 3.7 or higher
- pandas library (version 1.1 or higher, which the dt.isocalendar() call used below requires)
- pytz library
To install the required libraries, use the following command:
pip install pandas pytz
Input Data
Let’s assume we have a CSV file named “data.csv” with the following content:
name,age,timestamp
Sachin,30,2022-12-01 12:30:15.123456
Barry,25,2023-01-10 16:45:35.789012
Suzy,35,2023-02-07 09:15:30.246801
Loading the Dataset
First, let’s load the dataset into a pandas DataFrame:
import pandas as pd
data = pd.read_csv("data.csv")
data['timestamp'] = pd.to_datetime(data['timestamp'])
print(data)
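If you want to try the loading step without creating a file first, the following is a minimal sketch that reads the same sample rows from an in-memory string. The variable name csv_text is only illustrative, and parse_dates is used here as an alternative to the separate pd.to_datetime call above, not as a replacement the article requires.

```python
import io

import pandas as pd

# Same rows as the sample data.csv, held in memory so no file is needed.
csv_text = """name,age,timestamp
Sachin,30,2022-12-01 12:30:15.123456
Barry,25,2023-01-10 16:45:35.789012
Suzy,35,2023-02-07 09:15:30.246801
"""

# parse_dates converts the column to datetimes while reading the CSV.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

# The timestamp column should now have a datetime64 dtype instead of object.
print(data.dtypes)
```

Checking the dtypes at this point is worthwhile, because the .dt accessor used in the next section only works on datetime columns.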
Extracting Time Components
Now, we will extract various time components from the ‘timestamp’ column:
data['hour'] = data['timestamp'].dt.hour
data['minute'] = data['timestamp'].dt.minute
data['second'] = data['timestamp'].dt.second
data['millisecond'] = data['timestamp'].dt.microsecond // 1000
data['year'] = data['timestamp'].dt.year
data['month'] = data['timestamp'].dt.month
data['day'] = data['timestamp'].dt.day
data['week'] = data['timestamp'].dt.isocalendar().week
data['quarter'] = data['timestamp'].dt.quarter
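The .dt accessor exposes more fields than the ones listed above. The sketch below shows a few of them on the same three timestamps, built in memory so it runs on its own. The column names in the extra DataFrame are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# The three sample timestamps from data.csv.
ts = pd.Series(pd.to_datetime([
    "2022-12-01 12:30:15.123456",
    "2023-01-10 16:45:35.789012",
    "2023-02-07 09:15:30.246801",
]))

extra = pd.DataFrame({
    # Full sub-second part in microseconds (0 to 999999).
    "microsecond": ts.dt.microsecond,
    # Integer division drops the last three digits, leaving whole milliseconds.
    "millisecond": ts.dt.microsecond // 1000,
    # Day of the week as a number, where Monday is 0 and Sunday is 6.
    "day_of_week": ts.dt.dayofweek,
    # Day of the week as an English name.
    "day_name": ts.dt.day_name(),
    # Ordinal day within the year, starting at 1.
    "day_of_year": ts.dt.dayofyear,
})

print(extra)
```

Note that the millisecond value is truncated rather than rounded, so 789012 microseconds becomes 789 milliseconds.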
Converting Timestamps to a Specific Timezone
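To convert the timestamps to a specific timezone, we will use the pytz library. The parsed timestamps are timezone-naive, so we first have to declare which timezone they were recorded in; this example assumes UTC. We then convert them to the ‘America/New_York’ timezone: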
import pytz
local_timezone = pytz.timezone('America/New_York')
data['timestamp_local'] = data['timestamp'].dt.tz_localize('UTC').dt.tz_convert(local_timezone)
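The offset applied by the conversion depends on the date, because New York switches between standard time and daylight saving time. The following sketch illustrates this with two hypothetical timestamps (they are not part of data.csv) that are assumed to be in UTC. It passes the timezone names as plain strings, which pandas also accepts in place of a pytz timezone object.

```python
import pandas as pd

# Hypothetical naive timestamps, assumed to have been recorded in UTC.
ts = pd.Series(pd.to_datetime([
    "2022-12-01 12:30:15",  # winter: New York is on EST (UTC-5)
    "2023-07-01 12:30:15",  # summer: New York is on EDT (UTC-4)
]))

# tz_localize attaches a timezone to naive values without shifting them;
# tz_convert then shifts the aware values into the target timezone.
local = ts.dt.tz_localize("UTC").dt.tz_convert("America/New_York")

print(local)
print(local.dt.hour.tolist())
```

If the source timestamps were recorded in some other timezone, that zone should be passed to tz_localize instead of "UTC"; otherwise the converted values will be shifted by the wrong amount.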
Result
After running the above code, the resulting DataFrame will look like this:
name age timestamp hour minute second millisecond year month day week quarter timestamp_local
0 Sachin 30 2022-12-01 12:30:15.123456 12 30 15 123 2022 12 1 48 4 2022-12-01 07:30:15.123456-05:00
1 Barry 25 2023-01-10 16:45:35.789012 16 45 35 789 2023 1 10 2 1 2023-01-10 11:45:35.789012-05:00
2 Suzy 35 2023-02-07 09:15:30.246801 9 15 30 246 2023 2 7 6 1 2023-02-07 04:15:30.246801-05:00