One Of Several Explanations To "could only be replicated to 0 nodes" Error - Big Data In Real World

One Of Several Explanations To “could only be replicated to 0 nodes” Error


There could be several reasons for the “could only be replicated to 0 nodes” message in an exception when you are trying to write something to HDFS. Here are a couple of common ones:

Datanode(s) don’t have enough disk space to store the blocks

Namenode cannot reach the Datanode(s), or the Datanode(s) are down/unavailable

So make sure there is connectivity between the Namenode and the Datanode(s), and that the Datanode(s) have sufficient space to store the new blocks.
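To check the free-space condition programmatically, here is a minimal sketch that prints the aggregate capacity and remaining space of the cluster. It assumes Hadoop 2.x or later, where FileSystem#getStatus is available, and the Namenode address below is only a placeholder; on older 1.x clusters the same numbers are reported by hadoop dfsadmin -report.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class DfsSpaceCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder Namenode URI - substitute your own cluster address
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:9000/"), conf);

    // Aggregate figures across the live Datanodes; if "remaining" is close to zero,
    // new blocks cannot be placed and the replication error can surface.
    FsStatus status = fs.getStatus();
    System.out.println("Capacity  (bytes): " + status.getCapacity());
    System.out.println("Used      (bytes): " + status.getUsed());
    System.out.println("Remaining (bytes): " + status.getRemaining());
  }
}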

There is also another possible reason: the Client you are writing from may not have access to the Datanode(s). In the case below, the Hadoop cluster is on Amazon EC2 and the Client is outside the cluster, trying to upload a file to HDFS.

Here is the code. It is very simple: we specify a URI and write a sequence file to HDFS.

String uri = "hdfs://ec2-x-x-x.compute-1.amazonaws.com:9000/user/ubuntu/input/sequence/SequenceFile.seq";
Configuration conf = new Configuration();
conf.addResource(new Path("conf/hadoop-cluster.xml"));

FileSystem fs = FileSystem.get(URI.create(uri), conf);
Path path = new Path(uri);

if (fs.exists(path))
{
  fs.delete(path, true);
}
LongWritable key = new LongWritable();
TextArrayWritable values = new TextArrayWritable();
Text val = new Text();
SequenceFile.Writer writer = null;
try {

  writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), values.getClass());

  key.set(1);
  Text[] arr = new Text[DATA1.length];
  for(int count = 0 ; count < DATA1.length ; count++) {
    arr[count] = new Text(DATA1[count]);
  }
  values.set(arr);

  writer.append(key, values);

} finally {
  IOUtils.closeStream(writer);
}

Let’s look at the error below.

14/02/17 08:57:40 INFO hdfs.DFSClient: Exception in createBlockOutputStream 172.31.x.x:50010 java.net.ConnectException: Connection timed out: no further information

14/02/17 08:57:40 INFO hdfs.DFSClient: Abandoning blk_5474642743455132775_4411
14/02/17 08:57:40 INFO hdfs.DFSClient: Excluding datanode 172.31.x.x:50010
14/02/17 08:57:40 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/input/sequence/SequenceFile.seq could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

A couple of things to note in the exception:

172.31.x.x:50010 java.net.ConnectException: Connection timed out

Excluding datanode 172.31.x.x:50010

The first message shows the Client trying and failing to communicate with the Datanode. The second message indicates that the Datanode is then excluded from storing any blocks. 172.31.x.x is a valid Datanode, but it is the internal IP of the Datanode in the EC2 cluster. Since the Client couldn’t reach the internal IP, the write failed. And since that Datanode is the only Datanode in the cluster (replication factor 1), the write fails with the exception.

Here the Namenode has no problem communicating with the Datanode, and if we run the program from within the cluster the write succeeds. The error occurs only when a Client outside the cluster tries to write a file to the cluster. But why should that matter when the Namenode can communicate with the Datanode just fine? Look at this source code from DFSClient$DFSOutputStream (Hadoop 1.2.1):

//
// Connect to first DataNode in the list.
//
success = createBlockOutputStream(nodes, clientName, false);

if (!success) {
  LOG.info("Abandoning " + block);
  namenode.abandonBlock(block, src, clientName);

  if (errorIndex < nodes.length) {
    LOG.info("Excluding datanode " + nodes[errorIndex]);
    excludedNodes.add(nodes[errorIndex]);
  }

  // Connection failed. Let's wait a little bit and retry
  retry = true;
}

 

The key thing to understand here is that the Namenode only provides the list of Datanodes that should store the blocks; the Namenode does not write the data to the Datanodes itself. It is the job of the Client to write the data to the Datanodes using the DFSOutputStream. Before any write can begin, the code above makes sure that the Client can communicate with the Datanode(s), and if communication to a Datanode fails, that Datanode is added to excludedNodes.
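Before blaming the cluster, a quick way to test this from the client machine is to try a plain TCP connection to the Datanode’s data transfer port. This is only a diagnostic sketch: the host below is the placeholder address from the log above and 50010 is the Hadoop 1.x default port, so substitute your own values.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DatanodeReachabilityCheck {
  public static void main(String[] args) {
    // Placeholder taken from the "Excluding datanode" log line - replace with your Datanode's address
    String host = "172.31.x.x";
    int port = 50010; // default Datanode data transfer port in Hadoop 1.x

    try (Socket socket = new Socket()) {
      // A short timeout is enough to tell "reachable" apart from "connection timed out"
      socket.connect(new InetSocketAddress(host, port), 5000);
      System.out.println("Client can reach Datanode " + host + ":" + port);
    } catch (IOException e) {
      System.out.println("Client cannot reach Datanode " + host + ":" + port + " - " + e.getMessage());
    }
  }
}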

Bottom line: your Datanodes could be live and healthy, and the communication between the Namenode and the Datanode(s) could be fine, but if the Client writing to HDFS has trouble communicating with the Datanode(s), the write will fail.
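If the root cause is the EC2 internal-IP situation described above, one possible workaround (not part of the setup shown in this post, and only available in Hadoop versions that support the dfs.client.use.datanode.hostname client property) is to ask the client to contact Datanodes by hostname instead of the internal IP handed out by the Namenode. This only helps if the Datanode hostnames resolve to reachable addresses from the client.

Configuration conf = new Configuration();
conf.addResource(new Path("conf/hadoop-cluster.xml"));

// Ask the HDFS client to contact Datanodes by hostname rather than by the
// (internal) IP address returned by the Namenode. Only useful when the
// Datanode hostnames are resolvable and reachable from outside the cluster.
conf.setBoolean("dfs.client.use.datanode.hostname", true);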
