Tag: SparkExamples

PySpark @ Freshers.in

How to remove duplicate values from an array in PySpark

This blog will show you how to remove duplicates from a column that holds array elements. Consider the example below…


PySpark – groupby with aggregation (count, sum, mean, min, max)

pyspark.sql.DataFrame.groupBy: PySpark's groupBy function groups the DataFrame by the specified columns so that aggregations (count, sum, mean, min, max) can be run on each group…
