How to Generate a Date and Time Series in Firebolt

Introduction

Generating a series of dates and times is a common requirement in data analysis and reporting, especially when dealing with time series data. Firebolt provides powerful functions to create such series efficiently. By the end of this article, you will know how to generate a continuous series of dates and times for various intervals and use cases using Firebolt SQL syntax.

TL;DR

  • Use GENERATE_SERIES to create a sequence of numbers.

  • Combine GENERATE_SERIES with DATE_ADD to generate dates and times.

  • Customize the start date, start time, and interval as needed.

  • Integrate the generated date and time series into your queries.

  • If needed, store the generated series in an array.

Step-by-Step Guide

All the example SQL code uses the Ultra Fast Gaming data set. To familiarize yourself with this dataset, visit this link: Ultra Fast Gaming Data Set.

Step 1: Generate a Sequence of Numbers

The first step in creating a date and time series is to generate a sequence of numbers using the GENERATE_SERIES function. This function creates a series of integers that we can later transform into dates and times.

Example SQL code:

-- Generate a series of numbers from 1 to 10
SELECT n FROM GENERATE_SERIES(1, 10) s(n);

In this query, GENERATE_SERIES(1, 10) generates numbers from 1 to 10. The alias s(n) assigns these numbers to a column n.

Step 2: Add the Generated Numbers to a Start Date and Time

Next, use the DATE_ADD function to convert the generated numbers into dates and times. You specify the start date and time, and the interval (e.g., days, hours, minutes) to add to this date and time.

Example SQL code:

-- Generate a date series starting from 2023-01-01 for 10 days
SELECT DATE_ADD('day', n, '2023-01-01'::DATE) AS date_series FROM GENERATE_SERIES(0, 9) s(n);

This query generates a series of dates starting from 2023-01-01, adding each number in the series to this start date.

To generate a time series, modify the interval to hours, minutes, or seconds:

Example SQL code:

-- Generate a time series starting from 2023-01-01 00:00:00 for 10 hours
SELECT DATE_ADD('hour', n, '2023-01-01 00:00:00'::TIMESTAMP) AS time_series FROM GENERATE_SERIES(0, 9) s(n);

In this query, each generated timestamp is one hour apart, starting from 2023-01-01 00:00:00.

Step 3: Generate a Custom Interval Date and Time Series

You can customize the interval by changing the interval parameter in the DATE_ADD function. This allows you to create date and time series with different intervals, such as minutes, hours, days, or months.

Example SQL code:

-- Generate a date series with a weekly interval starting from 2023-01-01 for 10 weeks
SELECT DATE_ADD('week', n, '2023-01-01'::DATE) AS date_series FROM GENERATE_SERIES(0, 9) s(n);

In this query, each generated date is one week apart, starting from 2023-01-01.

Example SQL code:

-- Generate a time series with a 15-minute interval starting from 2023-01-01 00:00:00 for 10 intervals
SELECT DATE_ADD('minute', n * 15, '2023-01-01 00:00:00'::TIMESTAMP) AS time_series FROM GENERATE_SERIES(0, 9) s(n);

In this query, each generated timestamp is 15 minutes apart, starting from 2023-01-01 00:00:00.

Step 4: Integrate the Date and Time Series into Your Queries

You can integrate the generated date and time series into your data analysis queries. This is particularly useful for time series analysis and reporting.

Example SQL code:

-- Integrate the date series into a query to count stats per day for a player
WITH date_series AS (  
  SELECT DATE_ADD('day', n, '2021-01-01'::DATE) AS date_series  
  FROM GENERATE_SERIES(0, 9) s(n))
SELECT ds.date_series, COUNT(p.*) AS stat_count
  FROM date_series ds
  LEFT JOIN playstats p ON ds.date_series = p.stattime::DATE
  AND playerid = 5416
GROUP BY ds.date_series
ORDER BY ds.date_series;

In this query, the generated date series is used to count the number of statistics collected for a player per day, providing a comprehensive view of player activity.

Example SQL code:

WITH time_series AS (  
  SELECT DATE_ADD('hour', n, '2021-01-02 00:00:00'::TIMESTAMP) AS time_series  
  FROM GENERATE_SERIES(0, 23) s(n))
SELECT ts.time_series, COUNT(p.*) AS stat_count
  FROM time_series ts
  LEFT JOIN playstats p ON ts.time_series = p.stattime::DATE
  AND playerid = 5416
GROUP BY ts.time_series
ORDER BY ts.time_series;

In this query, the generated date series is used to count the number of statistics collected for a player per hour, providing a comprehensive view of player activity.

Step 5: Return Date and Time Series as an Array

If needed, the generated date and time series can be returned as an array. This can be done with the ARRAY_AGG function.

Example SQL code:

-- Store the the date series into an array
SELECT array_agg(date_series) FROM (
  SELECT DATE_ADD('day', n, '2021-01-01'::DATE) AS date_series  
  FROM GENERATE_SERIES(0, 9) s(n));

Full Example Code

SELECT n FROM GENERATE_SERIES(1, 10) s(n);

SELECT DATE_ADD('day', n, '2023-01-01'::DATE) AS date_series FROM GENERATE_SERIES(0, 9) s(n);

SELECT DATE_ADD('hour', n, '2023-01-01 00:00:00'::TIMESTAMP) AS time_series FROM GENERATE_SERIES(0, 9) s(n);

SELECT DATE_ADD('week', n, '2023-01-01'::DATE) AS date_series FROM GENERATE_SERIES(0, 9) s(n);

SELECT DATE_ADD('minute', n * 15, '2023-01-01 00:00:00'::TIMESTAMP) AS time_series FROM GENERATE_SERIES(0, 9) s(n);

WITH date_series AS (  
  SELECT DATE_ADD('day', n, '2021-01-01'::DATE) AS date_series  
  FROM GENERATE_SERIES(0, 9) s(n))
SELECT ds.date_series, COUNT(p.*) AS stat_count
  FROM date_series ds
  LEFT JOIN playstats p ON ds.date_series = p.stattime::DATE
  AND playerid = 5416
GROUP BY ds.date_series
ORDER BY ds.date_series;

WITH time_series AS (  
  SELECT DATE_ADD('hour', n, '2021-01-02 00:00:00'::TIMESTAMP) AS time_series  
  FROM GENERATE_SERIES(0, 23) s(n))
SELECT ts.time_series, COUNT(p.*) AS stat_count
  FROM time_series ts
  LEFT JOIN playstats p ON ts.time_series = p.stattime::DATE
  AND playerid = 5416
GROUP BY ts.time_series
ORDER BY ts.time_series;

SELECT array_agg(date_series) FROM (
  SELECT DATE_ADD('day', n, '2021-01-01'::DATE) AS date_series  
  FROM GENERATE_SERIES(0, 9) s(n));