To start a dbt run from the dbt-cloud CLI (Command Line Interface), you will first…
Connecting dbt Cloud or dbt Core to Databricks – Step-by-step procedure
Here is a step-by-step procedure for connecting dbt Cloud or dbt Core to Databricks: Create a new Databricks workspace, or…
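For dbt Core, the connection step usually comes down to a `profiles.yml` entry using the dbt-databricks adapter. A minimal sketch — the project name, catalog, schema, host, HTTP path, and token below are all placeholders you would replace with your workspace's values:

```yaml
# ~/.dbt/profiles.yml -- all values are placeholders
my_project:
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: main                 # Unity Catalog name (if your workspace uses one)
      schema: analytics
      host: dbc-xxxxxxxx-xxxx.cloud.databricks.com
      http_path: /sql/1.0/warehouses/xxxxxxxxxxxxxxxx
      token: "{{ env_var('DATABRICKS_TOKEN') }}"   # keep the token out of the file
```

Reading the token from an environment variable keeps credentials out of version control.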
Step-by-step procedure for loading JSON data into a Snowflake table from an AWS S3 bucket:
Here is a step-by-step procedure for loading JSON data into a Snowflake table from an AWS S3 bucket: Create a…
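The procedure above can be sketched in Snowflake SQL. The stage, file format, integration, table, and bucket names here are illustrative, and the sketch assumes a storage integration has already been set up for the bucket:

```sql
-- File format describing the JSON files in the bucket
CREATE OR REPLACE FILE FORMAT my_json_format
  TYPE = 'JSON'
  STRIP_OUTER_ARRAY = TRUE;

-- External stage pointing at the S3 location (integration assumed to exist)
CREATE OR REPLACE STAGE my_s3_stage
  URL = 's3://my-bucket/path/'
  STORAGE_INTEGRATION = my_s3_integration
  FILE_FORMAT = my_json_format;

-- Land the raw JSON into a VARIANT column
CREATE OR REPLACE TABLE raw_json (v VARIANT);

COPY INTO raw_json
  FROM @my_s3_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_json_format');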
How to Grant access to the S3 bucket for the Snowflake account (example)
To grant access to an S3 bucket for a Snowflake account, you will need to create an AWS Identity and…
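The IAM policy attached to that role typically looks like the sketch below — read access to the objects plus list access on the bucket, scoped to a prefix. The bucket name and prefix are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion"],
      "Resource": "arn:aws:s3:::my-bucket/path/*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-bucket",
      "Condition": { "StringLike": { "s3:prefix": ["path/*"] } }
    }
  ]
}
```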
Different ways that you can load data into Snowflake.
There are several ways to load data into Snowflake, depending on the specific needs of the user and the nature…
PySpark : Getting the approximate number of unique elements in a column of a DataFrame
pyspark.sql.functions.approx_count_distinct PySpark’s approx_count_distinct function is a way to approximate the number of unique elements in a column of a DataFrame….
Utilize the power of the Pandas library with PySpark DataFrames.
pyspark.sql.functions.pandas_udf PySpark’s pandas_udf creates a type of user-defined function (UDF) that allows you to use the power of the Pandas library…
PySpark : Formatting the number X to a format like ‘#,###,###.##’, rounded to d decimal places
pyspark.sql.functions.format_number The format_number function is used to format a number as a string. The function takes two arguments: the number…
PySpark : Formatting the arguments in printf-style and returning the result as a string column
pyspark.sql.functions.format_string ‘format_string’ is a function in PySpark’s pyspark.sql.functions module. It is used to format the…
PySpark : Combine two or more arrays into a single array of tuples
pyspark.sql.functions.arrays_zip In PySpark, the arrays_zip function can be used to combine two or more arrays into a single array of…
PySpark : Transforming a column of arrays or maps into multiple rows : Converting columns into rows
pyspark.sql.functions.explode_outer In PySpark, the explode_outer() function, like explode(), is used to transform a column of arrays or maps into multiple rows, with…