Null values in databases represent missing or unknown data. Proper handling of these null values ensures data integrity, hygiene, and accuracy in analytics. Snowflake, as a popular cloud data platform, provides a plethora of functions and techniques to manage and replace these null values. Additionally, Snowflake’s support for semi-structured data types, like arrays, introduces another dimension to data handling. This article dives into handling null values in Snowflake, with a bonus insight into array handling.
Overview of NULL in databases
In relational databases, NULL is a marker indicating the absence of a value. It doesn’t mean zero or blank; instead, it signifies that a value is missing or undefined. The concept of NULL is essential, especially when ensuring data consistency, quality, and accuracy.
Functions to identify NULL values: IS NULL and IS NOT NULL
IS NULL
To identify records with NULL values in a specific column, you can use the IS NULL condition.
CREATE TABLE employees (id INT, name STRING, age INT);
INSERT INTO employees VALUES
(1, 'John', NULL),
(2, 'Doe', 30),
(3, 'Jane', NULL);
SELECT * FROM employees WHERE age IS NULL;
Output:
+----+-------+-----+
| id | name | age |
+----+-------+-----+
| 1 | John |NULL |
| 3 | Jane |NULL |
+----+-------+-----+
IS NOT NULL
Similarly, if you wish to fetch records where a specific column doesn’t have NULL values, use IS NOT NULL.
Example:
SELECT * FROM employees WHERE age IS NOT NULL;
Techniques to replace or manage NULL values using COALESCE and IFNULL
COALESCE
COALESCE is a powerful function that returns the first non-null value in its arguments.
Example:
SELECT name, COALESCE(age, 0) as age FROM employees;
Output:
+-------+-----+
| name | age |
+-------+-----+
| John | 0 |
| Doe | 30 |
| Jane | 0 |
+-------+-----+
IFNULL
IFNULL is another function that allows you to replace NULL values. It takes two arguments and returns the first if it’s not NULL, else it returns the second.
Example:
SELECT name, IFNULL(age, 0) as age FROM employees;
Array handling in Snowflake: Storing, querying, and flattening
Snowflake supports semi-structured data formats like JSON, which can include arrays.
Storing
To create a table with an array column:
CREATE TABLE array_example (id INT, values ARRAY);
Inserting
To insert data:
INSERT INTO array_example VALUES (1, ARRAY_CONSTRUCT(1,2,3)), (2, ARRAY_CONSTRUCT(4,5,6));
Querying
You can query the array data using various functions like ARRAY_CONTAINS.
SELECT * FROM array_example WHERE ARRAY_CONTAINS(2, values);
Flattening
Arrays can be flattened using the FLATTEN function.
SELECT id, flattened.value
FROM array_example, LATERAL FLATTEN(input => array_example.values) AS flattened;
Output:
+----+-------+
| id | value |
+----+-------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 4 |
| 2 | 5 |
| 2 | 6 |
+----+-------+
Refer more on
Cloud Database
cloud_storage
Snowflake
snowflake_interview