Category: article

PySpark @ Freshers.in

How to run dataframe as Spark SQL – PySpark

If you can easily get the result using SQL, or the SQL already exists, then you…

Hive – What are the metastore tables in Hive?

The Metastore is the central repository of Apache Hive metadata. It stores metadata for Hive tables. Its tables include AUX_TABLE, BUCKETING_COLS, CDS, COLUMNS_V2, COMPACTION_QUEUE…

How to transfer a file from an SFTP server to local using Python

There are situations where you may need to programmatically transfer a file from an SFTP server to your local environment. Here we will…

How to remove a CSV header using Spark (PySpark)

A common use case when dealing with CSV files is to remove the header from the source to do data…

How to access Hive using Python (Source code)

Use case: If you want to do some scheduling or automation, you may need to access Hive…
