Tag: Big Data

Hive @ Freshers.in

Hive : Hive Table Properties : How are Hive Table Properties used?

One of the key features of Hive is the ability to define table properties, which can be used to control…

Hive @ Freshers.in

Hive : Implementation of UDF in Hive using Python. A Comprehensive Guide

A User-Defined Function (UDF) in Hive is a function that is defined by the user and can be used in…

Hive @ Freshers.in

Hive : Hive metastore and its importance.

The Hive Metastore is an important component of the Apache Hive data warehouse software. It acts as a central repository…

Hive @ Freshers.in

Hive : Hive Optimizers: A Comprehensive Guide

Hive is a data warehousing tool that provides a SQL-like interface for querying large datasets stored in Hadoop Distributed File…

Hive @ Freshers.in

Hive : Comparison between the ORC and Parquet file formats in Hive

ORC (Optimized Row Columnar) and Parquet are two popular file formats for storing and processing large datasets in Hadoop-based systems…

Hive @ Freshers.in

Hive : Different types of storage formats supported by Hive. [16 Formats supported by Hive]

Apache Hive is an open-source data warehousing tool that was developed to provide an SQL-like interface to query and analyze…

PySpark @ Freshers.in

PySpark : Setting PySpark parameters – A complete Walkthrough [3 Ways]

In PySpark, you can set various parameters to configure your Spark application. These parameters can be set in different ways…

PySpark @ Freshers.in

Spark : Calculation of executor memory in Spark – A complete guide.

The executor memory is the amount of memory allocated to each executor in a Spark cluster. It determines the amount…