In Snowflake, an array is a one-dimensional, zero-based collection of elements that can be of any data type, including other arrays. Snowflake uses the term “VARIANT” to describe data types that can store semi-structured data, such as JSON, ARRAY, and OBJECT. This article specifically focuses on the ARRAY data type and how you can work with arrays in Snowflake.
How to store data in arrays
To define an array column in Snowflake, use the ARRAY data type. Here’s an example table that contains an array column:
CREATE TABLE fruits_array (
id INTEGER AUTOINCREMENT PRIMARY KEY,
fruits ARRAY
);
You can insert data into array columns by specifying the array values in square brackets:
INSERT INTO fruits_array (fruits)
VALUES
(ARRAY_CONSTRUCT('apple', 'banana', 'cherry')),
(ARRAY_CONSTRUCT('grape', 'orange'));
Querying data within arrays
Accessing Specific Elements: You can access a specific element of an array using the zero-based index.
SELECT fruits[0] AS first_fruit FROM fruits_array;
| FIRST_FRUIT |
---------------
| apple |
| grape |
SELECT id, CONTAINS(fruits, 'banana') AS contains_banana FROM fruits_array;
| ID | CONTAINS_BANANA |
------------------------
| 1 | TRUE |
| 2 | FALSE |
The FLATTEN function to work with array data
If you want to transform each element of an array into a separate row, you can use the FLATTEN function. This is particularly useful when dealing with semi-structured data or when you want to analyze each element individually.
SELECT id, flattened.value::STRING AS individual_fruit
FROM fruits_array,
LATERAL FLATTEN(input => fruits_array.fruits) AS flattened;
| ID | INDIVIDUAL_FRUIT |
------------------------
| 1 | apple |
| 1 | banana |
| 1 | cherry |
| 2 | grape |
| 2 | orange |