Category: spark

Spark User full article

PySpark @ Freshers.in

What is the difference between repartition() and coalesce() ?

The repartition algorithm performs a full shuffle and creates new partitions with evenly distributed data. The repartition algorithm makes…

PySpark @ Freshers.in

How to drop nulls in a dataframe : PySpark

In most data cleansing work, the first thing you may need to do is drop the nulls in the…
