Retrieving the value of a specific element in an array or map column of a DataFrame.


pyspark.sql.functions.element_at

In PySpark, the element_at function retrieves the value of a specific element in an array or map column of a DataFrame. It takes two arguments: the first is the column (or column name) containing the array or map, and the second is the index (for arrays) or key (for maps) of the element you want to retrieve.

Here’s an example of how you might use element_at in PySpark:

from pyspark.sql import SparkSession
from pyspark.sql.functions import element_at

# Create a SparkSession
spark = SparkSession.builder.appName("element_at_example").getOrCreate()

# Create a DataFrame with an array column
df = spark.createDataFrame([
    (1, ["Tom", "Jim", "Rebecca"]),
    (2, ["Alice", "Sue"]),
    (3, ["Wilson", "Mike", "Charley"]),
], ["id", "names"])
df.show()
+---+--------------------+
| id|               names|
+---+--------------------+
|  1| [Tom, Jim, Rebecca]|
|  2|        [Alice, Sue]|
|  3|[Wilson, Mike, Ch...|
+---+--------------------+
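
Before extracting elements, it can help to confirm that the “names” column really is an array of strings. A quick schema check on the DataFrame above:

df.printSchema()
root
 |-- id: long (nullable = true)
 |-- names: array (nullable = true)
 |    |-- element: string (containsNull = true)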

Retrieve the first element of the “names” column.

df.select(element_at("names", 1).alias("first_word")).show()
+----------+
|first_word|
+----------+
|       Tom|
|     Alice|
|    Wilson|
+----------+

Retrieve the second element of the “names” column.

df.select(element_at("names", 2).alias("second_word")).show()
+-----------+
|second_word|
+-----------+
|        Jim|
|        Sue|
|       Mike|
+-----------+

Retrieve the third element of the “names” column.

df.select(element_at("names", 3).alias("third_word")).show()
+----------+
|third_word|
+----------+
|   Rebecca|
|      null|
|   Charley|
+----------+

As you can see, the function takes the column name as the first argument and the index of the element as the second. Note that the index is 1-based, not 0-based, and that an index past the end of the array returns null.
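
Negative indices are also supported and count from the end of the array. For example, to fetch the last name in each row (using the same DataFrame as above):

df.select(element_at("names", -1).alias("last_word")).show()
+---------+
|last_word|
+---------+
|  Rebecca|
|      Sue|
|  Charley|
+---------+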

You can also use the element_at function with a map column; in that case, the second argument should be the key of the element you want to fetch.

df = spark.createDataFrame([
    (1, {"apple": 1, "banana": 2}),
    (2, {"cherry": 3, "dragon fruit": 4}),
], ["id", "map"])
df.select(element_at("map", "apple").alias("value")).show()

This will output:

+-----+
|value|
+-----+
|    1|
| null|
+-----+
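
Since a missing key simply yields null, as shown above, you can pair element_at with coalesce to substitute a default value; the default of 0 below is just an illustrative choice:

from pyspark.sql.functions import coalesce, lit
df.select(coalesce(element_at("map", "apple"), lit(0)).alias("value")).show()
+-----+
|value|
+-----+
|    1|
|    0|
+-----+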
