PySpark : Determining whether the current object holds any data : Series.empty


The pandas API on Spark provides Series.empty, a property that reports whether the current Series holds any data. In this article, we look at how Series.empty behaves in the context of Spark and work through two examples.

Understanding Series.empty

Series.empty is part of the pandas API on Spark, the pandas-compatible layer built on top of Spark, a distributed computing framework. Its purpose is to check whether a Series object contains any elements or has none at all.

Syntax:

Series.empty

Usage:

Series.empty is a property rather than a method, so it is accessed without parentheses. It returns a boolean value: True if the Series is empty and False otherwise.
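One detail worth seeing in code is that "empty" means "has no rows", not "has no usable values". The following is a minimal sketch using plain pandas, which is the kind of Series the examples in this article end up with after calling toPandas(); the variable names are illustrative only.

```python
import pandas as pd

# A Series with no rows at all (dtype given explicitly for clarity)
no_rows = pd.Series([], dtype="float64")

# A Series whose only entry is a missing value still has one row
only_nan = pd.Series([float("nan")])

print(no_rows.empty)            # True: there are no rows
print(only_nan.empty)           # False: a NaN entry still counts as a row
print(only_nan.dropna().empty)  # True: dropping the NaN leaves no rows
```

If missing values should not count as data, dropping them first with dropna() and then checking empty is one way to express that.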

Examples:

Let’s explore some examples to get a better understanding of how Series.empty operates within the context of Spark.

Example 1: Empty Series

Consider a scenario where we have an empty Series. Let’s create one and check if it’s empty using Series.empty.

from pyspark.sql import SparkSession
import pandas as pd

# Initialize SparkSession
spark = SparkSession.builder \
    .appName("SeriesEmpty Example Learning @ Freshers.in ") \
    .getOrCreate()

# Create an empty DataFrame
empty_df = spark.createDataFrame([], schema="col INT")

# Collect the DataFrame to the driver as a pandas DataFrame, then select the column as a pandas Series
empty_series = empty_df.toPandas()["col"]

# Check if the Series is empty
is_empty = empty_series.empty
print("Is the Series empty?", is_empty)

Output:

Is the Series empty? True

As expected, Series.empty reports that the Series is empty.

Example 2: Non-empty Series

Now, let’s examine a case where the Series contains some data.

# Create a Spark DataFrame with some data (reuses the SparkSession from Example 1)
data = [(1,), (2,), (3,), (4,), (5,)]
df = spark.createDataFrame(data, schema="col INT")

# Collect the DataFrame to the driver as a pandas DataFrame, then select the column as a pandas Series
non_empty_series = df.toPandas()["col"]

# Check if the Series is empty
is_empty = non_empty_series.empty
print("Is the Series empty?", is_empty)

Output:

Is the Series empty? False

In this instance, Series.empty returns False, indicating that the Series contains data.

Author: user