Author: user
PySpark : Reversing the order of strings in a list using PySpark
Let's create sample data in the form of a list of strings. from pyspark import SparkContext, SparkConf from pyspark.sql…
PySpark : Generating a 64-bit hash value in PySpark
Introduction to 64-bit Hashing: A hash function is a function that can be used to map data of arbitrary size…
PySpark : Create an MD5 hash of a certain string column in PySpark.
Introduction to MD5 Hash: MD5 (Message Digest Algorithm 5) is a widely used cryptographic hash function that produces a 128-bit…
PySpark : Introduction to BASE64_ENCODE and its Applications in PySpark
BASE64 is a group of similar binary-to-text encoding schemes that represent binary…
Hive : How to drop duplicate rows from a Hive table.
This is a workaround that shows how we can drop duplicate rows from a Hive table. Here is how to…
Shell : Check whether each file exists, given a list of paths.
In this article we will discuss a script that takes a list of file paths as input and checks whether each…
Unix : Shell script that performs log file analysis : Find the top 5 IP addresses
In this article we will explain a script that analyzes an Apache web server’s access log file to find the…
Unix : Shell script that monitors the system’s CPU usage and free memory
Here we will discuss a shell script that monitors the system’s CPU usage and free memory, and issues a…
Shell : Shell script that checks the status of a specific web page
Here we will explain a shell script that checks the status of a specific web page. It could be used…
Shell : Shell script that monitors the disk usage of a specified directory and sends an alert if the disk usage exceeds
Here we will discuss a shell script that monitors the disk usage of a specified directory and sends an…