Tag: big_data_interview

PySpark @ Freshers.in

PySpark : HiveContext in PySpark – A brief explanation

One of the key components of PySpark is the HiveContext, which provides a SQL-like interface to work with data stored…

PySpark: Explanation of PySpark Full Outer Join with example.

One of the most commonly used operations in PySpark is joining two DataFrames. Full outer join is one of…

PySpark : Extracting minutes of a given date as integer in PySpark [minute]

pyspark.sql.functions.minute: The minute function in PySpark is part of the pyspark.sql.functions module, and is used to extract the minute from…

PySpark : Function to perform simple column transformations [expr]

pyspark.sql.functions.expr: The expr function is part of the pyspark.sql.functions module and is used to create column expressions that can…

PySpark : Formatting numbers to a specific number of decimal places.

pyspark.sql.functions.format_number: One of the useful functions in PySpark is the format_number function, which is used to format numbers to a…

PySpark : Creating multiple rows for each element in the array [explode]

pyspark.sql.functions.explode: One of the important operations in PySpark is the explode function, which is used to convert a column of…
