big_data_interview - Freshers.in

PySpark : Adding a specified number of days to a date column in PySpark
pyspark.sql.functions.date_add: The date_add function in PySpark is used to add a specified number of days…
PySpark : How to find the difference between two dates and round it down to whole days without decimals (datediff, floor)
pyspark.sql.functions.datediff and pyspark.sql.functions.floor: In this article we will learn two functions, mainly datediff and…
PySpark - How to convert string date to Date datatype
pyspark.sql.functions.to_date: This article gives a brief overview of how you can convert a string date…
PySpark : Truncate date and timestamp in PySpark [date_trunc and trunc]
pyspark.sql.functions.date_trunc(format, timestamp): The truncation function offered by the Spark DataFrame SQL functions is date_trunc(), which returns Date…
PySpark : Extracting minutes of a given date as integer in PySpark [minute]
pyspark.sql.functions.minute: The minute function in PySpark is part of the pyspark.sql.functions module, and is used…
PySpark : Date Formatting : Converts a date, timestamp, or string to a string value with specified format in PySpark
pyspark.sql.functions.date_format: In PySpark, dates are stored as DateType and timestamps as TimestampType. However, while working with…
PySpark : A Comprehensive Guide to PySpark's current_date and current_timestamp Functions
PySpark enables data engineers and data scientists to perform distributed data processing tasks efficiently. In…
PySpark : How to read date datatype from CSV?
We specify inferSchema = true when a CSV file is being read. Spark determines the…
PySpark: How to add months to a date column in Spark DataFrame (add_months)
I have a use case where I want to add months to a date column…
PySpark : Explanation of MapType in PySpark with Example
MapType in PySpark is a data type used to represent a value that maps keys…

Tag: big_data_interview

PySpark : Subtracting a specified number of days from a given date in PySpark [date_sub]

PySpark : A Comprehensive Guide to PySpark’s current_date and current_timestamp Functions

Hive : Different types of file formats supported by Hive

Hive : Exploring Different Types of User-Defined Functions (UDFs) in Hive

Hive : Understanding the MAPJOIN Operator in Hive with an Example

Hive : Understanding the DISTRIBUTE BY Operator in Hive with an Example

PySpark : Understanding the ‘take’ Action in PySpark with Examples. [Retrieves a specified number of elements from the beginning of an RDD or DataFrame]

Sort Merge Bucket Join in Hive: A Comprehensive Guide

Hive : Map-side join – A technique used in Hive to join large datasets efficiently.

PySpark : Exploring PySpark’s joinByKey on DataFrames: [combining data from two different DataFrames] – A Comprehensive Guide
