How to warmup a table (asynchronously)

The best way to instruct the system to warmup data is by running CHECKSUM() query over the results set that you want to warm up. Generally speaking, a query that looks like this:

SELECT CHECKSUM(*) FROM <table> WHERE <filters>;

Note that RAM consumption can increase depending on the amount of data you warm. If you are running a “heavy” warmup, it is best to run it before running other processes/queries. This will avoid OOM errors or slow response times.

Variations

  • It is possible to warm-up certain columns in a table. It will generally be faster to warm up some columns rather than all of them, and the warmup query will use less memory. If the table has a large number of columns (50+) and only some of them are used, specifying the columns in the query will make the warmup process faster and use less memory. For example:
SELECT CHECKSUM(col1, col2, ...) FROM <table> WHERE <filters>;
  • The warmup process time can be shortened by running multiple warmup queries at once, if mutually exclusive filter criteria can be used. A good example is if the need is to warm-up 6 months of data, it could be accomplished by running 3 queries in parallel, each covering 2 months. If using the Firebolt UI, here is an example:
/* Execute in the first tab */
SELECT CHECKSUM(*) FROM orders WHERE order_date BETWEEN '2021-01-01' AND '2021-02-28'; 
/* Execute in a second tab */
SELECT CHECKSUM(*) FROM orders WHERE order_date BETWEEN '2021-03-01' AND '2021-04-30';

/* Execute in a third tab */
SELECT CHECKSUM(*) FROM orders WHERE order_date BETWEEN '2021-05-01' AND '2021-06-30';

/* Note:  CHECKSUM(col1, col2, ...) could also be used here, if appropriate */