Author: user
How to merge multiple PDF files using Python?
Use case: If you have multiple files, for example chapter-wise question papers, and you need to have…
Over the Wall – 0/1 knapsack (Smallest number of boxes required to build two towers such that each of them has least height)
Ramu and Jithin want to watch the grand finale, but unfortunately, they could not get tickets to the match. However,…
How to create a UDF in PySpark? What are the different ways you can call a PySpark UDF (with example)?
PySpark UDF: A PySpark UDF is used to extend PySpark's built-in capabilities. UDFs (User Defined Functions) are used to…
How to convert MapType to multiple columns based on key using PySpark?
Use case: Converting a map to multiple columns. There can be raw data of MapType with multiple key-value pairs…
How to create an Airflow DAG (scheduler) to execute a Redshift query?
Use case: We have a Redshift query (an insert SQL) to load data from another table on daily…
Explain how you can implement dynamic partitioning in Hive (automatically creating partitions based on column value)
Dynamic partition in Hive: When there are a large number of partition values, then its…
How to insert from a non-partitioned table to a partitioned table in Hive?
You can insert data from a non-partitioned table into a partitioned table. In short, if you want to have…
How to create an AWS Glue table where partitions have different columns?
There can be a situation where you expect new columns in a JSON file regularly. There can be a…
Explain what happens internally once you upload a file to Amazon S3
This article explains what happens inside S3 once you upload a file. The client sends an HTTP…
How to create an m x n random matrix in Python without using built-in functions (NumPy etc.)
There are some scenarios where you need to create an m x n random matrix A without using built…
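As a rough illustration of the random-matrix idea in the last entry, here is a minimal sketch. It assumes that "without built-in functions" means avoiding NumPy while still allowing Python's standard-library `random` module; the full post may take a different approach. The function name `random_matrix` and its `low`, `high`, and `seed` parameters are hypothetical, chosen only for this example.

```python
import random


def random_matrix(m, n, low=0.0, high=1.0, seed=None):
    """Return an m x n matrix (list of lists) of floats between low and high.

    Hypothetical helper: relies only on the standard-library random module.
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    # A dedicated generator keeps results reproducible when a seed is given.
    rng = random.Random(seed)
    # Outer comprehension builds m rows; inner one fills each row with n values.
    return [[rng.uniform(low, high) for _ in range(n)] for _ in range(m)]


if __name__ == "__main__":
    A = random_matrix(3, 4, seed=0)
    for row in A:
        print(row)
```

Building each row with its own inner comprehension avoids the common pitfall of `[[0] * n] * m`, where every row would refer to the same list object.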