How to use STRING_AGG functionality in Firebolt

Introduction

The STRING_AGG function is a powerful tool in SQL that allows you to concatenate strings from multiple rows into a single string, separated by a specified delimiter. This can be particularly useful for creating summary reports or generating combined fields for easier data presentation. By the end of this article, you will know how to efficiently use the STRING_AGG function in Firebolt SQL Syntax, to combine multiple string values into a single, comma-separated string.

TL;DR

  • Use the Firebolt functions to concatenate strings from multiple rows:

  • ARRAY_AGG: Aggregates values into an array.

  • ARRAY_TO_STRING: Joins array elements into a single string.

  • Specify a delimiter to separate the concatenated strings.

  • Apply STRING_AGG within the context of a GROUP BY clause to aggregate values for each group.

  • Utilize additional functions like DISTINCT to avoid duplicate values.

Step-by-Step Guide

All the example SQL code uses the Ultra Fast Gaming data set. To familiarize yourself with this data set, visit this link: Ultra Fast Gaming Firebolt Sample Dataset.

Step 1: Basic Usage of ARRAY_TO_STRING ARRAY_AGG

Let’s look at the table playstats with the fields PlayerID and SelectedCar.

Example SQL code:

SELECT DISTINCT PlayerID, 
SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2);

The below example will present how to find all the cars selected by specific players during their gameplay sessions.

To achieve this in Firebolt, you can use ARRAY_TO_STRING(ARRAY_AGG(<column_name>), ', ').

Example SQL code:

SELECT PlayerID, 
ARRAY_TO_STRING(ARRAY_AGG(SelectedCar), ', ') AS All_SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2)
GROUP BY PlayerID;

Step 2: Understanding how ARRAY_TO_STRING ARRAY_AGG works

Here is a step-by-step explanation of how this works:

  • ARRAY_AGG: This function collects all values of SelectedCar for each PlayerID into an array.

  • ARRAY_TO_STRING: This function then joins the elements of the array into a single string, with each value separated by a comma.

  • GROUP BY PlayerID: This clause ensures that the aggregation is done for each player individually.

Step 3: Avoiding Duplicate Values

To ensure that the concatenated string does not contain duplicate values, use the DISTINCT keyword within ARRAY_AGG, for example, if you want to avoid duplicate selected cars.

Example SQL code:

SELECT PlayerID, 
ARRAY_TO_STRING(ARRAY_AGG(DISTINCT SelectedCar), ', ') AS All_SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2)
GROUP BY PlayerID;

Full Example Code:

SELECT DISTINCT PlayerID, 
SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2);

SELECT PlayerID, 
ARRAY_TO_STRING(ARRAY_AGG(SelectedCar), ', ') AS All_SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2)
GROUP BY PlayerID;

SELECT PlayerID, 
ARRAY_TO_STRING(ARRAY_AGG(DISTINCT SelectedCar), ', ') AS All_SelectedCar
FROM playstats
WHERE PlayerID IN (1, 2)
GROUP BY PlayerID;