Big Data - Freshers.in

PySpark : Sort an array of elements in a DataFrame column
pyspark.sql.functions.array_sort The array_sort function is a PySpark function that allows you to sort an array…
PySpark : Find the maximum value in an array column of a DataFrame
pyspark.sql.functions.array_max The array_max function is a built-in function in PySpark that finds the maximum value…
PySpark : Getting the approximate number of unique elements in a column of a DataFrame
pyspark.sql.functions.approx_count_distinct PySpark's approx_count_distinct function is a way to approximate the number of unique elements in…
PySpark : Find the minimum value in an array column of a DataFrame
pyspark.sql.functions.array_min The array_min function is a built-in function in PySpark that finds the minimum value…
PySpark : Combine the elements of two or more arrays in a DataFrame column
pyspark.sql.functions.array_union The array_union function is a PySpark function that allows you to combine the elements…
PySpark : Removing all occurrences of a specified element from an array column in a DataFrame
pyspark.sql.functions.array_remove Syntax: pyspark.sql.functions.array_remove(col, element). pyspark.sql.functions.array_remove is a function that removes all occurrences of a specified…
PySpark : Check if two or more arrays in a DataFrame column have any common elements
pyspark.sql.functions.arrays_overlap The arrays_overlap function is a PySpark function that allows you to check if two…
How to replace a value with another value in a column of a PySpark DataFrame?
In PySpark, we can replace a value in one column or multiple columns or multiple…
PySpark : How to sort a DataFrame column in ascending order while putting the null values first?
pyspark.sql.Column.asc_nulls_first In PySpark, the asc_nulls_first() method is used to sort a column in ascending order…
How to run a DataFrame as Spark SQL - PySpark
If you have a situation where you can easily get the result using SQL/ SQL…

Tag: Big Data

PySpark : How to compute the cumulative distribution of a column in a DataFrame

PySpark : How to convert a sequence of key-value pairs into a dictionary in PySpark

PySpark : Truncate date and timestamp in PySpark [date_trunc and trunc]

PySpark : Explain map in Python or PySpark. How can it be used?

PySpark : Explanation of MapType in PySpark with Example

PySpark : Explain in detail whether Apache Spark SQL is lazy or not

PySpark : Generate a sequence number based on a specific order of the DataFrame

PySpark : Generate a unique and increasing 64-bit integer ID for each row in a DataFrame

PySpark : Inserting a row into an Apache Spark DataFrame

PySpark : How to write Scala code in the Spark shell?
