pyspark.sql.functions.coalesce If you want to return the first non-null value from a list of columns, you…
PySpark : Formatting the arguments in printf-style and returning the result as a string column
pyspark.sql.functions.format_string format_string is a function in pyspark.sql.functions, not a parameter of the DataFrame select method. It is used to specify the…
PySpark : Combine two or more arrays into a single array of structs
pyspark.sql.functions.arrays_zip In PySpark, the arrays_zip function can be used to combine two or more arrays into a single array of…
PySpark : Transforming a column of arrays or maps into multiple rows : Converting columns into rows
pyspark.sql.functions.explode_outer In PySpark, the explode_outer() function is used to transform a column of arrays or maps into multiple rows, with…
Retrieving the value of a specific element in an array or map column of a DataFrame.
pyspark.sql.functions.element_at In PySpark, the element_at function is used to retrieve the value of a specific element in an array or…
In PySpark, how to sort data in descending order while putting the rows with null values at the end of the result?
pyspark.sql.Column.desc_nulls_last In PySpark, the desc_nulls_last function is used to sort data in descending order, while putting the rows with null…
In PySpark, how to sort data in descending order while putting the rows with null values at the beginning?
pyspark.sql.Column.desc_nulls_first In PySpark, the desc_nulls_first function is used to sort data in descending order, while putting the rows with null…
Comparing PySpark with MapReduce programming
PySpark is the Python library for Spark programming. It allows developers to interface with RDDs (Resilient Distributed Datasets) and perform…
Explain dense_rank. How to use the dense_rank function in PySpark?
In PySpark, the dense_rank function is used to assign a consecutive rank, with no gaps, to each row within a window partition, based on…
PySpark code to read and write data from and to Google BigQuery.
Here is some sample PySpark code that demonstrates how to read and write data from and to Google BigQuery: from…
In PySpark, what is the difference between spark.table() and spark.read.table()?
In PySpark, spark.table() is used to read a table from the Spark catalog, whereas spark.read.table() is used to read a…