Apache Pig Tutorial – Executing Script with Parameters The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. […]
Apache Pig Tutorial – Load Variations The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. All posts […]
Hadoop Archives (HAR) Hadoop Archives (HAR) offers an effective way to deal with the small files problem. This post will explain – The problem with small […]
Datanode Block Scanner In this blog post we saw how HDFS handles and corrects data corruption using checksums. During a write operation, the datanode […]
Can Reducer always be reused for Combiner? A Combiner function is an optional intermediary function which is executed in the Map phase right after the execution […]
What is HDFS Federation? Namenode is responsible for the successful operation of HDFS. Namenode holds the entire metadata of HDFS, which includes information about files and […]