Category: spark


PySpark @ Freshers.in

PySpark : Extracting minutes of a given date as integer in PySpark [minute]

pyspark.sql.functions.minute The minute function in PySpark is part of the pyspark.sql.functions module, and is used to extract the minute from…


PySpark : Function to perform simple column transformations [expr]

pyspark.sql.functions.expr The expr function is part of the PySpark SQL module and is used to create column expressions that can…


PySpark : Formatting numbers to a specific number of decimal places.

pyspark.sql.functions.format_number One of the useful functions in PySpark is the format_number function, which is used to format numbers to a…


PySpark : Creating multiple rows for each element in the array [explode]

pyspark.sql.functions.explode One of the important operations in PySpark is the explode function, which is used to convert a column of…


PySpark : How decode works in PySpark?

One of the important concepts in PySpark is data encoding and decoding, which refers to the process of converting data…


PySpark : Extracting dayofmonth, dayofweek, and dayofyear in PySpark

pyspark.sql.functions.dayofmonth, pyspark.sql.functions.dayofweek, pyspark.sql.functions.dayofyear One of the most common data manipulations in PySpark is working with date and time columns. PySpark…


Spark : Calculate the number of unique elements in a column using PySpark

pyspark.sql.functions.countDistinct In PySpark, the countDistinct function is used to calculate the number of unique elements in a column. This is…


PySpark : How to decode in PySpark?

pyspark.sql.functions.decode PySpark is a popular library for processing big data using Apache Spark. One of…
