BigQuery : Handle missing or null values in BigQuery

Google Big Query @ Freshers.in

In BigQuery, missing or null values can be handled in several ways:

  1. Using the IS NULL and IS NOT NULL operators: The IS NULL and IS NOT NULL operators can be utilized within a WHERE clause to filter out rows containing missing or null values within a particular column. For instance, to retrieve all rows from a table where the “age” column is either missing or null, you can employ the following query:
SELECT * FROM my_table WHERE age IS NULL
  1. Using the COALESCE() function: You can use the COALESCE() function to return the first non-null value in a list of expressions. For example, to select the value of the “email” column or the “phone” column if the “email” column is null, you can use the following query:
SELECT COALESCE(email, phone) as contact_info FROM my_table
  1. Using the NULLIF() function: You can use the NULLIF() function to return null if two expressions are equal. For example, to set the value of the “age” column to null if the value is 0, you can use the following query:
SELECT NULLIF(age, 0) as age FROM my_table
  1. Using the IFNULL() function: You can use the IFNULL() function to return a specified value if the expression is null. For example, to replace null values in the “salary” column with 0, you can use the following query:
SELECT IFNULL(salary, 0) as salary FROM my_table
  1. Using the NOT NULL constraint: You can specify the NOT NULL constraint in the schema definition of the table which will ensure that the values in that column are not null.
  2. Using the INSERT statement: When inserting data into a table, you can use the DEFAULT keyword to insert a default value if the value is not specified or is null.

Here is an example of how to handle missing or null values in a BigQuery table:

# Create a table with data
CREATE TABLE my_table
(
    id INT64,
    name STRING,
    age INT64,
    salary FLOAT64
)
# Inserting data into table
INSERT INTO my_table (id, name, age, salary)
VALUES (1, "John Smith", 25, 50000),
       (2, "Peter Helen", NULL, 60000),
       (3, "Mike Williams", 30, DEFAULT)

In this example, the age of the second row is null, and the salary of the third row is not specified and will be set to null by default.

Then, to select the name and age of all the people older than 30, you can use the following query

SELECT name, age FROM my_table WHERE age > 30 OR age IS NULL

The query will return the name and age of all people in the table, including those whose age is null.

In this case, the query will return the name of “John Smith” and “Mike Williams” but not “Peter Helen” because their age is not greater than 30 and is null.

Author: user

Leave a Reply