Understanding Sync Across Engines

Firebolt has two types of engines: general purpose engines and analytics engines.

General purpose engines can do everything analytics engines do, but can also write data to Firebolt tables. They are designed for database creation, data ingestion, and extract, load, and transform (ELT) operations. A database can have only one general purpose engine running at a time.

Analytics engines are read-only: they cannot write data to Firebolt tables and are designed for queries that do not ingest data. You can run as many analytics engines as you need at the same time.

Since analytics engines are read-only, all changes to data and schemas are done on a general purpose engine, and then synced to the analytics engines.

Syncing Changes to Data

Data changes are synced to all running analytics engines every 10 minutes. Large data changes take additional time to be applied on every engine.

Data changes are synced to a non-running engine when the engine is started.

Data changes include the following activities:

  • INSERT/UPDATE/DELETE

  • DROP PARTITION

Syncing Changes to Schemas

Schema changes are not synced to running analytics engines. Running engines must be restarted for schema changes to be reflected in the engine.

Schema changes include the following activities:

  • DROP/CREATE TABLE

  • DROP/CREATE AGGREGATING INDEX

  • DROP/CREATE JOIN INDEX

  • DROP/CREATE VIEW
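The two lists above can be turned into a quick check for which sync behavior a statement triggers. The following is a minimal sketch, not part of Firebolt: `classify_change` is a hypothetical helper that matches leading keywords only and does not parse SQL. It assumes that up to two modifier words may sit between the verb and the object type (for example `CREATE FACT TABLE`), and that a partition drop appears as a `DROP PARTITION` clause somewhere in the statement.

```python
import re

# Keyword patterns derived from the data-change and schema-change lists above.
# The allowance for up to two modifier words (e.g. FACT, DIMENSION) is an assumption.
DATA_RE = re.compile(r"^(?:INSERT|UPDATE|DELETE)\b")
SCHEMA_RE = re.compile(
    r"^(?:CREATE|DROP) (?:\w+ ){0,2}?(?:TABLE|AGGREGATING INDEX|JOIN INDEX|VIEW)\b"
)


def classify_change(statement: str) -> str:
    """Return 'data', 'schema', or 'other' for a single SQL statement.

    Hypothetical helper for illustration; it only inspects keywords.
    """
    # Collapse whitespace and upper-case so keyword matching is uniform.
    text = re.sub(r"\s+", " ", statement.strip()).upper()

    if DATA_RE.match(text):
        return "data"    # synced to running analytics engines periodically
    if SCHEMA_RE.match(text):
        return "schema"  # requires restarting running analytics engines
    if "DROP PARTITION" in text:
        return "data"    # listed as a data change in this section
    return "other"
```

A deployment script could use a check like this to warn when a batch contains a schema change, since running analytics engines would then need a restart.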

Note: Schema changes block data changes

If a table's schema is changed, data changes for that table are no longer synced to running analytics engines. A common pattern is to drop a table, re-create it with the same set of columns, and insert a new set of data. Because of the initial schema change (DROP/CREATE), the subsequent data changes do not sync to running analytics engines, even though the table structure is unchanged. As with any schema change, the running engines must be restarted before they reflect the re-created table and its data.
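The drop, re-create, and insert pattern can be illustrated with a toy model of one running analytics engine. This is only a sketch of the rules described in this section, not Firebolt code: the class `RunningAnalyticsEngine` and its methods are hypothetical names, and the model assumes that a restart brings the engine fully up to date.

```python
from dataclasses import dataclass, field


@dataclass
class RunningAnalyticsEngine:
    """Toy model of the sync rules for one running analytics engine."""

    # Tables whose schema changed since this engine was last started.
    stale_tables: set = field(default_factory=set)

    def on_schema_change(self, table: str) -> None:
        # Schema changes (e.g. DROP/CREATE TABLE) are not synced while running.
        self.stale_tables.add(table)

    def will_sync_data_change(self, table: str) -> bool:
        # Data changes sync only if the table's schema has not changed.
        return table not in self.stale_tables

    def restart(self) -> None:
        # Assumption: a restart picks up all pending schema and data changes.
        self.stale_tables.clear()


engine = RunningAnalyticsEngine()
before = engine.will_sync_data_change("sales")          # True: no schema change yet
engine.on_schema_change("sales")                        # DROP TABLE, then CREATE TABLE
after_recreate = engine.will_sync_data_change("sales")  # False: new INSERTs are blocked
engine.restart()
after_restart = engine.will_sync_data_change("sales")   # True: syncing resumes
```

Note that the model marks the table as stale even when the re-created table has identical columns, which mirrors the behavior described above.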