Hadoop – Page 3 – Big Data In Real World

February 9, 2017

Published by Big Data In Real World on February 9, 2017

Categories

Hadoop

Working with HDFS

In the HDFS – Why another file system? post, we were introduced to HDFS. Now it's time to try some HDFS commands. You are probably thinking why […]

February 6, 2017

Published by Big Data In Real World on February 6, 2017

Categories

Hadoop

HDFS – Why another file system?

In the Understanding Big Data Problem post, we saw that HDFS, the Hadoop Distributed File System, takes care of all the storage-related complexities in Hadoop. In this […]

February 2, 2017

Published by Big Data In Real World on February 2, 2017

Categories

Finding the MAX tuple with Pig

Here is a sample dataset. Our goal is to find the record with the maximum record_value, which is DEF, 300 record_key, […]

January 30, 2017

Published by Big Data In Real World on January 30, 2017

Categories

Hadoop

How to find directories in HDFS which are older than N days?

Cleaning up older or obsolete files in HDFS is important. Even if you have […]

January 26, 2017

Published by Big Data In Real World on January 26, 2017

Categories

How to use multi character delimiter in a Hive table?

Sometimes your data is too complex to delimit the individual columns with a single character like […]

January 23, 2017

Published by Big Data In Real World on January 23, 2017

Categories

Change field termination value in Hive

This blog post describes how to change the field termination value in Hive. Assume when you created the Hive table, […]

January 19, 2017

Published by Big Data In Real World on January 19, 2017

Categories

DataNode process killed due to Incompatible clusterIDs error

This blog post describes how to address the Incompatible clusterIDs error with DataNodes. 2013-04-11 16:26:15,720 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed […]

January 16, 2017

Published by Big Data In Real World on January 16, 2017

Categories

FSNamesystem initialization failed

FSNamesystem initialization failed is a common error Hadoop users get, especially if they are trying to set up a Hadoop cluster for […]

January 12, 2017

Published by Big Data In Real World on January 12, 2017

Categories

Hadoop safemode recovery – taking too long!

Any time the NameNode is started or restarted, it first goes into a maintenance state called Safe Mode. When NameNode is […]

January 9, 2017

Published by Big Data In Real World on January 9, 2017

Categories

There are 0 datanode(s) running and no node(s)

You are trying to write a file to HDFS and this is what you see in your datanode […]

October 9, 2016

Published by Big Data In Real World on October 9, 2016

Categories

Admin
Hadoop

Hadoop Administrator In Real World – Course Coverage

We launched the Hadoop Developer In Real World course in Nov 2015 and got an excellent response from the […]

October 1, 2016

Published by Big Data In Real World on October 1, 2016

Categories

Admin
Hadoop

What employers expect from Hadoop Administrators?

In this post, we will discuss what employers expect from Hadoop Administrators. We also have a video version of this post, […]

September 24, 2016

Published by Big Data In Real World on September 24, 2016

Categories

Admin
Hadoop

Is Hadoop Administration right for me?

When we first announced that we were working on a new Hadoop Administration course, we had several students and members […]

July 10, 2016

Published by Big Data In Real World on July 10, 2016

Categories

Hadoop

Changing The Output File Prefix Of Hadoop MapReduce Job

Your Hadoop job can have multiple reducers, and each reducer will create a file by default with […]

June 22, 2016

Published by Big Data In Real World on June 22, 2016

Categories

Hadoop

Hadoop Mapper and Reducer Output Type Mismatch

Can you have different output key-value pair types for the Mapper and Reducer in a MapReduce program? Short answer – […]

December 31, 2015

Published by Big Data In Real World on December 31, 2015

Categories

Apache Pig Tutorial – Map

The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. All posts […]

December 31, 2015

Published by Big Data In Real World on December 31, 2015

Categories

Apache Pig Tutorial – Tuple & Bag

The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. […]

December 20, 2015

Published by Big Data In Real World on December 20, 2015

Categories

Apache Pig Tutorial – Executing Script with Parameters

The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. […]

December 20, 2015

Published by Big Data In Real World on December 20, 2015

Categories

Apache Pig Tutorial – Executing as a Script

The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy […]

December 20, 2015

Published by Big Data In Real World on December 20, 2015

Categories

Hadoop

Apache Pig Tutorial – Ordering Records

The goal of this tutorial is to learn Apache Pig concepts at a fast pace, so don’t expect lengthy posts. All […]