Overview of how to source data from a table in a different GCP project based…
Tag: ETL
DBT : Converting S3 Paths with DBT Macros Based on Environment Variables
In data engineering, it is common to work with cloud-based storage systems such as Amazon S3. Often, the location of…
DBT : Demystifying the DBT Model: A Comprehensive Guide
Data Build Tool (DBT) has become an indispensable tool for data engineers and analysts in modern data environments. It enables…
DBT : Dealing with single quotes in SQL statements [Escape single quotes in SQL]
DBT is a popular open-source data modeling tool that allows you to transform and analyze data using SQL. One feature…
DBT : Converting a variable into a string in DBT
Jinja’s as_text filter converts a variable into a string. It is often used to…
Hive : Learn Hive external functions and how you can use external functions in Hive
Hive is built on top of Hadoop, which is a distributed file system and a framework for processing large data…
Hive : Hive custom input/output formats. How can you use custom input/output formats in Hive?
Introduction to Custom Input/Output Formats in Hive: Hive allows users to define custom input and output formats to read and…
Hive : How can you increase parallelism in Hive?
Introduction to Parallelism in Hive: Parallelism refers to the ability to execute multiple tasks simultaneously. In the context of Hive,…
Hive : How can you configure job scheduling in Hive?
To ensure that your Hive jobs run smoothly, it is important to configure job scheduling in Hive. Job scheduling allows…
Hive : How can you use RC file format (Record Columnar File) in Hive?
RC File is a columnar storage format used in Hive for storing structured data. It is designed to optimize the…
Hive : Role of Hive type coercion and how you can perform type coercion in Hive
In Hive, type coercion is the process of converting one data type to another data type during query execution. Type…