Query Diagnosis and Troubleshooting

Introduction

If a query is not running as quickly as desired, several areas can be investigated for bottlenecks or restrictions. By the end of this article, you will know how to effectively diagnose and troubleshoot your SQL queries in Firebolt.

TL;DR

  • Utilize all basic SQL functionalities to easily maximize query optimization.

    • Use the RECOMMEND DDL command to check if the primary index defined on the table is the most optimal.

    • Use aggregating indexes to improve performance.

    • Materialize Common Table Expressions (CTEs) for better efficiency.

  • Leverage query statistics in query history for performance insights.

  • Monitor the number of queries running in parallel to understand the impact of concurrent queries.

Step-by-Step Guide

Step 1: Utilize all basic SQL functionalities

Ensure you use all the basic functionalities to optimize query speed:

Step 2: Query Statistics in Query History

engine_query_history table provides details on usage and time spent during various steps of query execution.

Example SQL code:

SELECT
eqh.start_time,
eqh.end_time,
eqh.status,
eqh.duration_us/1000000 as duration_secs,
eqh.time_in_queue_us,
eqh.query_id,
eqh.query_text,
eqh.error_message,
eqh.spilled_bytes as "amount of spilled data to disk in bytes",
eqh.scanned_rows,
eqh.inserted_rows,
eqh.user_name
FROM information_schema.engine_query_history eqh
WHERE LOWER(query_text) NOT LIKE '%engine_query_history%'
AND LOWER(query_text) NOT LIKE '%engine_running_queries%'
AND LOWER(query_text) LIKE '%<an identifying text pattern for the query>%'
AND status != 'STARTED_EXECUTION'
AND start_time BETWEEN '<start_time>' and '<end_time>'
ORDER BY start_time;

Step 3: Concurrent queries

Concurrent queries running on the engine can affect the performance of a given query. To gauge this impact, it is helpful to know the number of other queries running simultaneously. This can be achieved by displaying query activity minute-by-minute for a period before and after the execution of the query being analyzed.

Example SQL code:

SELECT minute,
count(*) as query_count,
round(avg(duration_us/1000)) as avg_dur_ms,
round(avg(time_in_queue_us)) as avg_queue_ms,
round(avg(scanned_rows),0) as avg_read_rows,
round(avg(inserted_rows),0) as avg_written_rows,
sum(scanned_rows) as tot_rows_read,
sum(select_count) as select_count,
sum(insert_count) as insert_count,
sum(ddl_count) as ddl_count,
sum(err_count) as err_count
FROM (
SELECT
date_trunc('minute', qh.start_time) as minute,
query_id, 
query_text, status,
duration_us,
qh.time_in_queue_us,
CASE
WHEN lower(query_text) LIKE '%insert into%' THEN 1
ELSE 0
END insert_count,
CASE
WHEN (lower(query_text) LIKE '%select%'
AND lower(query_text) NOT LIKE '%insert into%') THEN 1
ELSE 0
END select_count,
CASE
WHEN lower(query_text) LIKE '%create%' THEN 1
ELSE 0
END ddl_count,
CASE
WHEN error_message != '' THEN 1
ELSE 0
END err_count,
round(cpu_usage_us/1000) as cpu_usage_ms,
round(cpu_delay_us/1000) as cpu_delay_ms,
round(scanned_bytes/1000000,3) as scanned_mb,
scanned_rows,
inserted_rows
FROM information_schema.engine_query_history qh
WHERE lower(query_text) NOT LIKE '%engine_query_history%'
AND lower(query_text) NOT LIKE '%running_queries%'
AND lower(query_text) NOT LIKE '%show_indexes%'
AND lower(query_text) NOT LIKE '%show_tables%'
AND status != 'STARTED_EXECUTION'
AND start_time BETWEEN '<start_time>' and '<end_time>'
)
GROUP BY minute
ORDER BY minute;

Full Example Code

SELECT
eqh.start_time,
eqh.end_time,
eqh.status,
eqh.duration_us/1000000 as duration_secs,
eqh.time_in_queue_us,
eqh.query_id,
eqh.query_text,
eqh.error_message,
eqh.spilled_bytes as "amount of spilled data to disk in bytes",
eqh.scanned_rows,
eqh.inserted_rows,
eqh.user_name
FROM information_schema.engine_query_history eqh
WHERE LOWER(query_text) NOT LIKE '%engine_query_history%'
AND LOWER(query_text) NOT LIKE '%engine_running_queries%'
AND LOWER(query_text) LIKE '%<an identifying text pattern for the query>%'
AND status != 'STARTED_EXECUTION'
AND start_time BETWEEN '<start_time>' and '<end_time>'
ORDER BY start_time;

SELECT minute,
count(*) as query_count,
round(avg(duration_us/1000)) as avg_dur_ms,
round(avg(time_in_queue_us)) as avg_queue_ms,
round(avg(scanned_rows),0) as avg_read_rows,
round(avg(inserted_rows),0) as avg_written_rows,
sum(scanned_rows) as tot_rows_read,
sum(select_count) as select_count,
sum(insert_count) as insert_count,
sum(ddl_count) as ddl_count,
sum(err_count) as err_count
FROM (
SELECT
date_trunc('minute', qh.start_time) as minute,
query_id, 
query_text, status,
duration_us,
qh.time_in_queue_us,
CASE
WHEN lower(query_text) LIKE '%insert into%' THEN 1
ELSE 0
END insert_count,
CASE
WHEN (lower(query_text) LIKE '%select%'
AND lower(query_text) NOT LIKE '%insert into%') THEN 1
ELSE 0
END select_count,
CASE
WHEN lower(query_text) LIKE '%create%' THEN 1
ELSE 0
END ddl_count,
CASE
WHEN error_message != '' THEN 1
ELSE 0
END err_count,
round(cpu_usage_us/1000) as cpu_usage_ms,
round(cpu_delay_us/1000) as cpu_delay_ms,
round(scanned_bytes/1000000,3) as scanned_mb,
scanned_rows,
inserted_rows
FROM information_schema.engine_query_history qh
WHERE lower(query_text) NOT LIKE '%engine_query_history%'
AND lower(query_text) NOT LIKE '%running_queries%'
AND lower(query_text) NOT LIKE '%show_indexes%'
AND lower(query_text) NOT LIKE '%show_tables%'
AND status != 'STARTED_EXECUTION'
AND start_time BETWEEN '<start_time>' and '<end_time>'
)
GROUP BY minute
ORDER BY minute;