pyspark.sql.functions.array_max: The array_max function is a built-in function in PySpark that finds the maximum value…
Category: article
How to drop nulls in a dataframe : PySpark
For most data cleansing, the first thing you may need to do is drop the nulls in the…
Why can't sqitch init snowflake determine the Snowflake account name?
Databases currently supported by the Sqitch database change management tool include Snowflake’s Cloud Data Warehouse as well as PostgreSQL 8.4+, SQLite…
How to replace null values in Spark for all columns or for each column separately: PySpark (na.fill)
Spark API: pyspark.sql.DataFrameNaFunctions.fill. Syntax: fill(value, subset=None). value: “value” can only be int, long, float, string, bool or…
How to create an array containing a column repeated count times – PySpark
To repeat an array element k times in PySpark, we can use the function below. Function: pyspark.sql.functions.array_repeat. array_repeat is a…
AI for Solving Quantitative Reasoning Problems – Minerva
Google AI Introduces Minerva: A Natural Language Processing (NLP) Model for Solving Mathematical Questions. Solving mathematical and scientific questions was…
Airflow DAGs not getting refreshed/updated. How to do it manually?
Refreshing an Airflow DAG manually. Solution: After creating a DAG in Airflow, we expect it to refresh immediately. However, even after refreshing…
Docker Interview Questions and Answers for Experienced and Freshers
1. Can you explain how Docker is advantageous over hypervisors? Docker is advantageous in the following ways: 1. It is…
pyplot: Sample code to draw a graph using Python like MATLAB.
matplotlib.pyplot is a collection of functions that make matplotlib work like MATLAB. A pyplot function can create a plotting area in a…
What are the aggregate functions in SQL?
Aggregate functions are functions used to compute over a “returned column of numeric data” from your select…
What are seeds in dbt? How to load CSV files into your data warehouse using dbt?
Seeds are the CSV files in your dbt project. There will be a seed directory that dbt can load…