PySpark: How to get the number of elements within an object with Series.size


The Pandas API on Spark mirrors much of the pandas interface while running on Spark. One of its simplest properties, Series.size, reports the number of elements in a Series, which makes it a quick first check before further analysis or manipulation.

Understanding Series.size

Series.size in Pandas API on Spark returns an integer giving the total number of elements in the object, which for a Series is its number of rows. It is a property rather than a method, so it is accessed without parentheses.

Example 1: Determining Size of Series

Let’s start with a simple example to illustrate the usage of Series.size:

import pyspark.pandas as ps
from pyspark.sql import SparkSession

# Create a SparkSession
spark = SparkSession.builder \
    .appName("PandasAPIOnSpark") \
    .getOrCreate()

# Sample data
data = [10, 20, 30, 40, 50]

# Create a pandas-on-Spark Series (ps.Series, not plain pandas pd.Series)
series = ps.Series(data)

# Get the size of the Series (a property, so no parentheses)
size = series.size

print("Size of the Series:", size)

Output:

Size of the Series: 5

In this example, Series.size returns 5, indicating that the Series contains five elements.

Example 2: Handling Missing Values

Now, let’s explore how Series.size handles missing values within the Series:

# Sample data with a missing value
data_missing = [10, 20, None, 40, 50]

# Create a pandas-on-Spark Series with a missing value
series_missing = ps.Series(data_missing)

# Get the size of the Series with the missing value
size_missing = series_missing.size

print("Size of the Series with Missing Values:", size_missing)

Output:

Size of the Series with Missing Values: 5

Series.size still returns the size of the Series as 5, even though one element is missing. This highlights that Series.size counts the total number of elements present, including any missing or null values.

Author: user