Ultra Fast Gaming: Firebolt Sample Dataset

TL;DR

The “Ultra Fast Gaming Inc.” use case, outlined in various help center articles, showcases a gaming company that develops and publishes online games across multiple platforms including PlayStation, Xbox, PC, iOS, and Nintendo. This example demonstrates Firebolt’s capabilities in managing large-scale data analytics for gaming statistics. It covers a variety of dimension tables, data ingestion techniques, and advanced indexing for gaming data aimed at internal analytics and external reporting.

Fictional Company and Use Case Details

  • Industry: Gaming
  • Name: Ultra Fast Gaming Inc.
  • Description: Ultra Fast Gaming is renowned for its arcade games and has recently launched its first racing game, “Johnny B. Quick”. This game is played across multiple platforms and the company utilizes Firebolt to analyze gaming stats like play time, leaderboard standings, and platform performance.
  • Use Case: With “Johnny B. Quick”, Ultra Fast Gaming collects and analyzes detailed gaming statistics to improve game features and enhance player engagement. They aim to use these insights for future game development and potentially monetize anonymized data for targeted advertising.

Data Model At High Level

The data model includes several dimension tables representing various aspects of the game and its players, as well as fact tables for detailed gaming statistics.

Dimension Tables

  • Games: Details about the games like game ID, title, launch date, and supported platforms.
  • Levels: Information on game levels including level type and scoring details.
  • Players: Data about players such as player ID, nickname, and platform used.
  • Tournaments: Information on gaming tournaments including ID, name, and prize details.

Fact Tables

  • PlayStats: Records detailed player statistics during game sessions.
  • Rankings: Player rankings in tournaments based on their performance scores.

UltraFast Gaming Inc. Query Examples for Firebolt

The scripts and questions below will give you a head start in using Firebolt. We have gathered different types of questions using various functionalities to help you explore Firebolt’s functionality and performance. Be sure to check the next section about setting up your engine and database to ensure you are ready.

Ensure You Are Using the Correct Database and Engine

To run the queries provided, make sure you are using the correct database and engine.

The following commands will help you set up and switch to the appropriate database and engine. You can uncomment the necessary commands if needed.


-- Use the USE DATABASE command to ensure you are using the preloaded database. "UltraFast" is written with "" since Firebolt is case sensitive.
USE DATABASE "UltraFast";

/*
-- If you have created an engine already, no need to uncomment this section. If you haven't, this SQL command will create an S type engine with a single node to run your query. You can't use system engine to run DQLs.

CREATE ENGINE "demo_dataset_engine" WITH
TYPE = "S"
NODES = 1
AUTO_STOP = 20
INITIALLY_STOPPED = false
AUTO_START = true
CLUSTERS = 1;
*/

/*
-- Once created, we will want to use this engine by using the USE ENGINE command. If you are already using a different engine, no need to uncomment. If you want to use a different engine you've setup, just uncomment and use the engine name you want to use. Just make sure you use "" in case you use both upper and lower cases in your engine name.

USE ENGINE "demo_dataset_engine";
*/

Please Note

The UltraFast dataset is preloaded for the first account under your organization. If you wish to load the UltraFast to a different account, please see the complete script under the bottom section - “Creating the Model in Firebolt”.

References:

Examples

Question 1

  • Which games support a specific platform and how many platforms are supported by each game?
  • Example of using ARRAY_CONTAINS and ARRAY_LENGTH in Firebolt:
-- This query checks if the platform "PlayStation" is supported and counts the total number of supported platforms for each game.
SELECT
    Title AS GameTitle,
    ARRAY_CONTAINS(SupportedPlatforms, 'PlayStation') AS SupportsPlayStation,
    ARRAY_LENGTH(SupportedPlatforms) AS NumberOfPlatforms
FROM
    Games;

References:

Question 2

  • How can we extract numbers from a player’s nickname and split the email to get the domain?
  • Example of using REGEXP_EXTRACT and SPLIT_PART in Firebolt:
-- This query extracts numbers from the player's nickname and splits the email to get the domain.
SELECT
    Nickname AS PlayerNickname,
    REGEXP_EXTRACT(Nickname, '([0-9]+)') AS ExtractedNumbers,
    SPLIT_PART(Email, '@', 2) AS EmailDomain
FROM
    Players;

References:

Question 3

  • How can know how many players don’t use either iOS or Nintendo?
  • Example of using ARRAY_ANY_MATCH in Firebolt:
-- This query identifies players who use either iOS or Nintendo platforms.
SELECT
    count(Nickname) as count_users
FROM
    Players
WHERE 
    ARRAY_ANY_MATCH(x -> x IN ('iOS', 'Nintendo'), Platforms) = false;

References:

Question 4

  • How can we calculate the difference in days between the game launch date and the current date, truncate the launch date to the nearest month, and find games launched within the last year?
  • Example of using DATE_DIFF, DATE_TRUNC, and INTERVAL in Firebolt:
-- This query calculates the difference in days between the launch date and the current date, truncates the launch date to the nearest month, and checks if the game was launched within the last year.
SELECT
    Title AS GameTitle,
    DATE_DIFF('day', LaunchDate, CURRENT_DATE) AS DaysSinceLaunch,
    DATE_TRUNC('month', LaunchDate) AS LaunchMonth,
    CASE 
        WHEN LaunchDate >= CURRENT_DATE - INTERVAL '1 year' THEN 'Yes'
        ELSE 'No'
    END AS LaunchedWithinLastYear
FROM
    Games;

References:

Question 5

  • What are the highest average scores per player per game, filtered by specific tournaments, and which players consistently achieve top scores using window functions?
  • Example of using subqueries and window functions in Firebolt:
-- This query calculates the highest average scores per player per game filtered by specific tournaments and identifies players consistently achieving top scores.
WITH PlayerScores AS (
    SELECT 
        ps.GameID,
        ps.PlayerID,
        AVG(ps.CurrentScore) AS AvgScore
    FROM 
        PlayStats ps
    WHERE 
        ps.TournamentID IN (56, 16, 98) -- Replace with your specific TournamentIDs
    GROUP BY ALL
),
RankedScores AS (
    SELECT 
        ps.GameID,
        ps.PlayerID,
        ps.AvgScore,
        RANK() OVER (PARTITION BY ps.GameID ORDER BY ps.AvgScore DESC) AS ScoreRank
    FROM 
        PlayerScores ps
)
SELECT 
    g.Title AS GameTitle,
    p.Nickname AS PlayerNickname,
    rs.AvgScore,
    rs.ScoreRank
FROM 
    RankedScores rs
JOIN 
    Games g ON rs.GameID = g.GameID
JOIN 
    Players p ON rs.PlayerID = p.PlayerID
WHERE 
    rs.ScoreRank = 1;

References:

Question 6

  • How does the subscription status and platform type influence average playtime and player rankings, filtered by play sessions between specific dates and tournament IDs greater than 100?
  • Example of using CASE WHEN, subqueries, window functions, UNNEST, and HAVING clause in Firebolt:

-- This query examines the influence of subscription status and platform type on average playtime and player rankings filtered by play sessions between specific dates and tournament IDs greater than 100.

WITH PlayerPlayTime AS (
    SELECT 
        ps.PlayerID,
        ps.GameID,
        p.IsSubscribedToNewsletter,
        UNNEST(p.Platforms) AS Platform,
        AVG(ps.CurrentPlayTime) AS AvgPlayTime,
        RANK() OVER (PARTITION BY ps.GameID ORDER BY AVG(ps.CurrentPlayTime) DESC) AS PlayTimeRank
    FROM 
        PlayStats ps
    JOIN 
        Players p ON ps.PlayerID = p.PlayerID
    WHERE 
        ps.StatTime BETWEEN '2020-12-01' AND '2021-02-01'
        AND ps.TournamentID > 100
    GROUP BY ALL
),
FilteredPlayers AS (
    SELECT 
        ppt.PlayerID,
        ppt.GameID,
        ppt.IsSubscribedToNewsletter,
        ppt.Platform,
        ppt.AvgPlayTime,
        ppt.PlayTimeRank,
        CASE 
            WHEN ppt.IsSubscribedToNewsletter = TRUE THEN 'Subscribed'
            ELSE 'Not Subscribed'
        END AS SubscriptionStatus
    FROM 
        PlayerPlayTime ppt
    WHERE 
        ppt.AvgPlayTime > 0
)
SELECT 
    fp.GameID,
    fp.Platform,
    fp.SubscriptionStatus,
    AVG(fp.AvgPlayTime) AS AvgPlayTime,
    AVG(fp.PlayTimeRank) AS AvgPlayTimeRank
FROM 
    FilteredPlayers fp
GROUP BY ALL
HAVING 
    AVG(fp.AvgPlayTime) > 0;

References:

Data Model - Detailed Description

This data model includes detailed specifications of dimension and fact tables structured to support the analytical needs of Ultra Fast Gaming Inc. Each table and its fields are described to outline their roles in analyzing gaming statistics.

Dimension Tables

Games

  • GameID (INTEGER): Unique identifier for each game.
  • Title (TEXT): The name of the game.
  • Abbreviation (TEXT): A short form or acronym of the game title.
  • Series (TEXT): Indicates the series to which the game belongs, if applicable.
  • Version (NUMERIC(10, 2)): Version number of the game.
  • GameDescription (TEXT): A brief description of the game.
  • Category (TEXT): Genre of the game (e.g., Racing).
  • LaunchDate (DATE): Release date of the game.
  • Author (TEXT): Developer or publisher of the game.
  • SupportedPlatforms (ARRAY(TEXT)): List of platforms on which the game is available (e.g., PlayStation, Xbox).
  • GameConfiguration (TEXT): JSON blob containing detailed game configuration such as car specifications.
  • SOURCE_FILE_NAME (TEXT): Name of the source file from which data is loaded.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp for when the source file was last updated.

Levels

  • LevelID (INTEGER): Unique identifier for each game level.
  • GameID (INTEGER): Identifier linking to the Games table.
  • Level (INTEGER): Numeric order or designation of the level within the game.
  • Name (TEXT): Name of the level.
  • LevelType (TEXT): Type of gameplay involved (e.g., Fastest lap, First to complete).
  • NextLevel (INTEGER): Identifier of the subsequent level.
  • MinPointsToPass (INTEGER): Minimum points required to pass the level.
  • MaxPoints (INTEGER): Maximum points achievable in the level.
  • NumberOfLaps (INTEGER): Number of laps required in the level (applicable in racing games).
  • MaxPlayers (INTEGER): Maximum number of players allowed in the level.
  • MinPlayers (INTEGER): Minimum number of players required to start the level.
  • PointsPerLap (REAL): Points awarded per lap completed.
  • MusicTrack (TEXT): Background music track for the level.
  • SceneDetails (TEXT): JSON blob containing details about the scene or environment settings.
  • MaxPlayTimeSeconds (INTEGER): Maximum time allowed to play this level in seconds.
  • LevelIcon (BYTEA): Binary data for the level’s icon image.
  • SOURCE_FILE_NAME (TEXT): File name where level data is sourced.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp when the file data was last updated.

Players

  • PlayerID (INTEGER): Unique identifier for each player.
  • Nickname (TEXT): Player’s chosen nickname.
  • Email (TEXT): Player’s email address.
  • AgeCategory (TEXT): Age range category of the player (e.g., 20-25).
  • Platforms (ARRAY(TEXT)): Platforms on which the player is active.
  • RegisteredOn (PGDATE): Date the player registered on the platform.
  • IsSubscribedToNewsletter (BOOLEAN): Indicates if the player is subscribed to newsletters.
  • InternalProbabilityToWin (DOUBLE PRECISION): Calculated probability of the player winning games.
  • SOURCE_FILE_NAME (TEXT): Source file from which player data is ingested.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp of the last update to the player data file.

Tournaments

  • TournamentID (INTEGER): Unique identifier for each tournament.
  • Name (TEXT): Name of the tournament.
  • GameID (INTEGER): Identifier linking to the Games table, specifying which game the tournament is for.
  • TotalPrizeDollars (INTEGER): Total prize amount in dollars offered for the tournament.
  • StartDateTime (TIMESTAMPNTZ): Date and time when the tournament starts.
  • EndDateTime (TIMESTAMPNTZ): Date and time when the tournament ends.
  • RulesDefinition (TEXT): Description of the tournament rules.
  • SOURCE_FILE_NAME (TEXT): File name where tournament data is sourced.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp when the file data was last updated.

Fact Tables

PlayStats

  • GameID (INTEGER): Links to the Games table.
  • PlayerID (INTEGER): Links to the Players table.
  • StatTime (TIMESTAMPNTZ): Timestamp of when the statistics were recorded.
  • SelectedCar (TEXT): Description of the car selected by the player for the game.
  • CurrentLevel (INTEGER): Current level being played by the player.
  • CurrentSpeed (REAL): Current speed of the player in the game.
  • CurrentPlayTime (BIGINT): Total playtime accumulated by the player in the current session.
  • CurrentScore (BIGINT): Current score of the player in the game.
  • Event (TEXT): Description of a particular event during the game session (e.g., level up, power-up).
  • ErrorCode (TEXT): Error code encountered during the game session, if any.
  • TournamentID (INTEGER): Links to the Tournaments table if the game session is part of a tournament.
  • SOURCE_FILE_NAME (TEXT): Source file name for play statistics data.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp of the last update to the play statistics data file.

Rankings

  • GameID (INTEGER): Links to the Games table.
  • PlayerID (INTEGER): Links to the Players table.
  • MaxLevel (INTEGER): Highest level achieved by the player.
  • TotalScore (BIGINT): Total score accumulated by the player across sessions.
  • PlaceWon (INTEGER): Ranking position of the player in the tournament.
  • TournamentID (INTEGER): Links to the Tournaments table.
  • SOURCE_FILE_NAME (TEXT): Source file name for ranking data.
  • SOURCE_FILE_TIMESTAMP (TIMESTAMPNTZ): Timestamp when the ranking data file was last updated.

Creating the Model in Firebolt

Your model is already preloaded to Firebolt. If you want to reingest it, please use the following scripts to set up and populate the model in your database.

CREATE ENGINE "sample_firebolt_engine" WITH
TYPE = "S"
NODES = 1
AUTO_STOP = 20;


USE ENGINE sample_firebolt_engine;


DROP TABLE IF EXISTS games CASCADE;


CREATE DIMENSION TABLE IF NOT EXISTS games (
  GameID INTEGER,
  Title TEXT,
  Abbreviation TEXT,
  Series TEXT,
  Version NUMERIC(10, 2),
  GameDescription TEXT,
  Category TEXT,
  LaunchDate DATE,
  Author TEXT,
  SupportedPlatforms ARRAY (TEXT null),
  GameConfiguration TEXT,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX GameID, Title;


DROP TABLE IF EXISTS levels CASCADE;


CREATE DIMENSION TABLE IF NOT EXISTS levels (
  LevelID INTEGER,
  GameID INTEGER,
  Level INTEGER,
  Name TEXT,
  LevelType TEXT,
  NextLevel INTEGER NULL,
  MinPointsToPass INTEGER,
  MaxPoints INTEGER,
  NumberOfLaps INTEGER,
  MaxPlayers INTEGER,
  MinPlayers INTEGER,
  PointsPerLap REAL,
  MusicTrack TEXT,
  SceneDetails TEXT,
  MaxPlayTimeSeconds INTEGER,
  LevelIcon BYTEA,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX LevelID;


DROP TABLE IF EXISTS players CASCADE;


CREATE DIMENSION TABLE IF NOT EXISTS players (
  PlayerID INTEGER,
  Nickname TEXT,
  Email TEXT,
  AgeCategory TEXT,
  Platforms ARRAY (TEXT null),
  RegisteredOn PGDATE,
  IsSubscribedToNewsletter BOOLEAN,
  InternalProbabilityToWin DOUBLE PRECISION,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX AgeCategory, RegisteredOn;


DROP TABLE IF EXISTS tournaments CASCADE;


CREATE DIMENSION TABLE IF NOT EXISTS tournaments (
  TournamentID INTEGER,
  Name TEXT,
  GameID INTEGER,
  TotalPrizeDollars INTEGER,
  StartDateTime TIMESTAMPNTZ,
  EndDateTime TIMESTAMPNTZ,
  RulesDefinition TEXT,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX TournamentID;


DROP TABLE IF EXISTS rankings CASCADE;


CREATE TABLE IF NOT EXISTS rankings (
  GameID INTEGER,
  PlayerID INTEGER,
  MaxLevel INTEGER,
  TotalScore BIGINT,
  PlaceWon INTEGER,
  TournamentID INTEGER,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX GameID, TournamentID, PlayerID;



DROP TABLE IF EXISTS playstats CASCADE;


CREATE TABLE IF NOT EXISTS playstats (
  GameID INTEGER,
  PlayerID INTEGER,
  StatTime TIMESTAMPNTZ,
  SelectedCar TEXT,
  CurrentLevel INTEGER,
  CurrentSpeed REAL,
  CurrentPlayTime BIGINT,
  CurrentScore BIGINT,
  Event TEXT,
  ErrorCode TEXT,
  TournamentID INTEGER,
  SOURCE_FILE_NAME TEXT,
  SOURCE_FILE_TIMESTAMP TIMESTAMPNTZ
) PRIMARY INDEX GameID, TournamentID, PlayerID, StatTime;



COPY INTO games (
  gameid $1, 
  title $2, 
  abbreviation $3, 
  series $4, 
  version $5, 
  gamedescription $6, 
  category $7, 
  launchdate $8, 
  author $9, 
  supportedplatforms $10, 
  gameconfiguration $11, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/games/' WITH PATTERN = '*'  TYPE = PARQUET;

COPY INTO levels (
  levelid $1, 
  gameid $2, 
  LEVEL $3, 
  name $4, 
  leveltype $5, 
  nextlevel $6, 
  minpointstopass $7, 
  maxpoints $8, 
  numberoflaps $9, 
  maxplayers $10, 
  minplayers $11, 
  pointsperlap $12, 
  musictrack $13, 
  scenedetails $14, 
  maxplaytimeseconds $15, 
  levelicon $16, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/levels/' WITH PATTERN = '*'  TYPE = PARQUET;

COPY INTO players (
  playerid $1, 
  nickname $2, 
  email $3, 
  agecategory $4, 
  platforms $5, 
  registeredon $6, 
  issubscribedtonewsletter $7, 
  internalprobabilitytowin $8, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/players/' WITH PATTERN = '*'  TYPE = PARQUET;

COPY INTO playstats (
  gameid $1, 
  playerid $2, 
  stattime $3, 
  selectedcar $4, 
  currentlevel $5, 
  currentspeed $6, 
  currentplaytime $7, 
  currentscore $8, 
  event $9, 
  errorcode $10, 
  tournamentid $11, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/playstats/' WITH PATTERN = '*'  TYPE = PARQUET;

COPY INTO rankings (
  gameid $1, 
  playerid $2, 
  maxlevel $3, 
  totalscore $4, 
  placewon $5, 
  tournamentid $6, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/rankings/' WITH PATTERN = '*'  TYPE = PARQUET;

COPY INTO tournaments (
  tournamentid $1, 
  name $2, 
  gameid $3, 
  totalprizedollars $4, 
  startdatetime $5, 
  enddatetime $6, 
  rulesdefinition $7, 
  source_file_name $SOURCE_FILE_NAME, 
  source_file_timestamp $SOURCE_FILE_TIMESTAMP)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/tournaments/' WITH PATTERN = '*'  TYPE = PARQUET;