In Hive, managed tables (also called internal tables) are Hive-owned tables, and the table's data…
Tag: Technical
How to insert from a non-partitioned table into a partitioned table in Hive?
You can insert data from a non-partitioned table into a partitioned table. In short, if you want to have…
PySpark: How to create an RDD from a list and from AWS S3
In this article you will learn what an RDD is. How can we create an RDD from a…
How to run a DataFrame as Spark SQL – PySpark
If you have a situation where you can easily get the result using SQL, or the SQL already exists, then you…
How to get all combinations of columns using PySpark? What is a cube in Spark?
A cube is a multi-dimensional generalization of a two- or three-dimensional spreadsheet. Cube is shorthand for a multidimensional dataset, given…
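As a rough illustration of what "all combinations of columns" means for a cube, the sketch below lists every subset of a set of grouping columns in plain Python. It does not use Spark; the helper name `cube_grouping_sets` and the column names are hypothetical and chosen only for this example.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """Return every subset of `columns` as tuples, largest subsets first.

    Hypothetical helper: it only enumerates the column combinations a
    cube-style aggregation would group by; it performs no aggregation.
    """
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

# Assumed example columns, not taken from the article.
grouping_sets = cube_grouping_sets(["country", "year"])
for group in grouping_sets:
    print(group)
```

For n columns this yields 2^n combinations, including the empty tuple, which corresponds to the grand total.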
How to convert XLS to CSV? The first header column got shifted to the next column (solved)
The requirement is to convert an XLS file to CSV using Python. Initially we used pandas (pandas.read_excel) to read an…
How can I get all the Hive tables and their external locations, partitions, etc.?
There may be situations where you need to provide all the Hive tables created and their locations and…
Python: How to extract multiple words between two strings (extracting words between {})
Here we will see how to extract a string between two specific characters or strings. This is a use case when you want…
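One possible way to pull out every word enclosed in {} is a regular expression with a capture group, sketched below. This is an assumption about the approach, not necessarily the method the article uses; the function name `extract_between_braces` and the sample text are made up for illustration.

```python
import re

def extract_between_braces(text):
    """Return the contents of every {...} pair in `text`, in order.

    Hypothetical helper: matches the innermost braces only and does not
    attempt to handle nesting.
    """
    return re.findall(r"\{([^{}]*)\}", text)

# Assumed sample input, not taken from the article.
words = extract_between_braces("Hello {name}, your order {order_id} has shipped.")
print(words)
```

Excluding braces inside the capture group keeps each match confined to a single {} pair, so two placeholders on one line are returned separately.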
Amazon API Gateway interview questions
1. Can we monitor Amazon API Gateway calls? After an API is published and in use, API Gateway provides…
Apache Storm interview questions
1. What is Apache Storm? Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes…
Apache Pig interview questions
1. What is Pig? Pig is an Apache open source project which runs on top of Hadoop and provides an engine for data…