PySpark : Extracting the minute of a given date as an integer in PySpark [minute]



The minute function in PySpark is part of the pyspark.sql.functions module and is used to extract the minute from a date or timestamp. It takes a single argument, a column containing a date or timestamp, and returns an integer representing the minute component of that value.

Here is an example of how to use the minute function in PySpark:

from pyspark.sql import SparkSession
from pyspark.sql.functions import minute
# Start a SparkSession
spark = SparkSession.builder.appName("MinuteExample").getOrCreate()
# Create a DataFrame
data = [("2023-10-01 11:30:00",),
        ("2023-11-12 08:45:00",),
        ("2023-12-15 09:15:00",)]
df = spark.createDataFrame(data, ["timestamp"])
# Use the minute function to extract the minute component
df.select("timestamp", minute("timestamp").alias("minute")).show()


+-------------------+------+
|          timestamp|minute|
+-------------------+------+
|2023-10-01 11:30:00|    30|
|2023-11-12 08:45:00|    45|
|2023-12-15 09:15:00|    15|
+-------------------+------+

As you can see, the minute function has extracted the minute component of each timestamp in the DataFrame and returned it as an integer.

In addition to the minute function, the pyspark.sql.functions module also includes functions for extracting other components of a date or timestamp, such as hour, dayofmonth, month, and year. These functions can be combined to perform more complex operations on dates and timestamps.

In conclusion, the minute function in PySpark is a useful tool for working with dates and timestamps in Spark DataFrames. Whether you need to extract the minute component of a timestamp or perform more complex operations, the pyspark.sql.functions module provides the tools you need to get the job done.

Spark important urls to refer

  1. Spark Examples
  2. PySpark Blogs
  3. Bigdata Blogs
  4. Spark Interview Questions
  5. Official Page