Quick Tip on Loading Array Columns from CSV into Firebolt

The Challenge

How to handle array data within CSV files when creating tables in Firebolt.

Solution

Firebolt supports the loading of array columns directly from CSV files by specifying the array data in a particular format. To correctly import array data, ensure that the arrays in the CSV are enclosed in double quotes.

Example

For a CSV with data structured as below:

foo,"[fizz,buzz]",bar
foo,"[fizz,buzz]",bar
foo,"[fizz,buzz]",bar
foo,"[fizz,buzz]",bar
foo,"[fizz,buzz]",bar

You can create an external table in Firebolt using the following SQL query:

CREATE EXTERNAL TABLE IF NOT EXISTS ex_array_test
    (  
      foo STRING,  
      fizzbuzz ARRAY(TEXT),  
      bar STRING
    )
    URL = 's3://path_to_your_data/'
    TYPE = (csv);

Notes

  • This method supports both text and numeric arrays. For example, array(DOUBLE) and array(INT) can be used if your array contains numeric values.
  • The order of elements within the array as specified in the CSV file is preserved when the data is loaded into Firebolt.

Conclusion

By correctly formatting your CSV data and using the appropriate SQL syntax, you can efficiently manage and query array data within Firebolt. Ensure that your CSV files are properly formatted to avoid loading issues.