Category: hive

Hive @ Freshers.in

Hive : Different types of file formats supported by Hive

Apache Hive supports a variety of file formats to store and process data. These file formats can be categorized into…

Hive : Exploring Different Types of User-Defined Functions (UDFs) in Hive

In addition to its built-in functions, Hive also supports User-Defined Functions (UDFs), which enable users to extend Hive’s functionality by…

Hive : Understanding the MAPJOIN Operator in Hive with an Example

When dealing with large datasets, optimizing join operations is crucial to improving query performance. One of the techniques to achieve…

Hive : Understanding the DISTRIBUTE BY Operator in Hive with an Example

One of the key features of Hive is its ability to optimize queries for improved performance. The DISTRIBUTE BY operator…

Sort Merge Bucket Join in Hive: A Comprehensive Guide

Sort Merge Bucket (SMB) join is an optimization technique in Apache Hive that helps improve the performance of join operations…

Hive : Map-side join – A technique used in Hive to join large datasets efficiently.

Map-side join is a technique used in Hive to join large datasets efficiently. It is a type of join that…

Hive : Hive Table Properties : How are Hive Table Properties used?

One of the key features of Hive is the ability to define table properties, which can be used to control…

Hive : Implementation of UDF in Hive using Python. A Comprehensive Guide

A User-Defined Function (UDF) in Hive is a function that is defined by the user and can be used in…

Hive : Hive metastore and its importance.

The Hive Metastore is an important component of the Apache Hive data warehouse software. It acts as a central repository…

Hive : Hive Optimizers: A Comprehensive Guide

Hive is a data warehousing tool that provides a SQL-like interface for querying large datasets stored in Hadoop Distributed File…