When dealing with data in Google BigQuery, especially time-series data, it’s common to want to truncate dates and timestamps to a specific level of precision. For instance, you might want to round a timestamp down to the start of the hour or the day. This guide will provide a step-by-step walkthrough on how to truncate dates and timestamps in Google BigQuery.
Step 1: Set up the table
Let’s first create a table named freshers_in_page_views which will contain the page view counts and their associated timestamps.
CREATE TABLE freshers_in_page_views (
view_id INT64,
view_timestamp TIMESTAMP,
view_count INT64
);
Step 2: Insert sample data
For the sake of demonstration, we’ll insert a few rows of sample data into our table.
Insert Statement:
INSERT INTO freshers_in_page_views (view_id, view_timestamp, view_count)
VALUES
(1, TIMESTAMP "2023-09-14 15:43:22 UTC", 120),
(2, TIMESTAMP "2023-09-14 12:05:45 UTC", 89),
(3, TIMESTAMP "2023-09-14 09:15:00 UTC", 155);
Step 3: Truncate Date and Timestamp
Now, let’s look at some common truncation scenarios:
Sample data:
view_id | view_timestamp | view_count
------- | -------------------------- | ----------
1 | 2023-09-14 15:43:22 UTC | 120
2 | 2023-09-14 12:05:45 UTC | 89
3 | 2023-09-14 09:15:00 UTC | 155
Truncate to the start of the day:
This will get you the date without the time portion.
SELECT
view_id,
DATE(view_timestamp) AS truncated_date,
view_count
FROM freshers_in_page_views;
view_id | truncated_date | view_count
------- | -------------- | ----------
1 | 2023-09-14 | 120
2 | 2023-09-14 | 89
3 | 2023-09-14 | 155
Truncate to the start of the hour:
This will give you the timestamp rounded down to the start of the hour.
SELECT
view_id,
TIMESTAMP_TRUNC(view_timestamp, HOUR) AS truncated_hour,
view_count
FROM freshers_in_page_views;
view_id | truncated_hour | view_count
------- | ------------------------- | ----------
1 | 2023-09-14 15:00:00 UTC | 120
2 | 2023-09-14 12:00:00 UTC | 89
3 | 2023-09-14 09:00:00 UTC | 155
To get the timestamp at the start of the month.
SELECT
view_id,
TIMESTAMP_TRUNC(view_timestamp, MONTH) AS truncated_month,
view_count
FROM freshers_in_page_views;
view_id | truncated_month | view_count
------- | ------------------------- | ----------
1 | 2023-09-01 00:00:00 UTC | 120
2 | 2023-09-01 00:00:00 UTC | 89
3 | 2023-09-01 00:00:00 UTC | 155
Truncating dates and timestamps is essential when you need to aggregate data over specific periods or when you want to simplify the visual representation of time-series data. Google BigQuery offers functions like DATE() and TIMESTAMP_TRUNC() to make this process straightforward and efficient.
BigQuery import urls to refer