One Of Several Explanations To "could only be replicated to 0 nodes" Error - Big Data In Real World

One Of Several Explanations To “could only be replicated to 0 nodes” Error


There could be several reasons for the “could only be replicated to 0 nodes” message in an exception when you are trying to write something to HDFS. Here are a couple of common ones:

Datanode(s) don’t have enough disk space to store the blocks

Namenode cannot reach the Datanode(s), or the Datanode(s) are down/unavailable

So make sure there is connectivity between the Namenode and the Datanode(s), and that the Datanode(s) have sufficient space to store the new blocks.
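To check the free-space condition programmatically, here is a minimal sketch that prints the aggregate capacity and remaining space of the cluster. It assumes Hadoop 2.x or later, where FileSystem#getStatus is available, and the Namenode address below is only a placeholder; on older 1.x clusters the same numbers are reported by hadoop dfsadmin -report.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class DfsSpaceCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder Namenode URI - substitute your own cluster address
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:9000/"), conf);

    // Aggregate figures across the live Datanodes; if "remaining" is close to zero,
    // new blocks cannot be placed and the replication error can surface.
    FsStatus status = fs.getStatus();
    System.out.println("Capacity  (bytes): " + status.getCapacity());
    System.out.println("Used      (bytes): " + status.getUsed());
    System.out.println("Remaining (bytes): " + status.getRemaining());
  }
}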

There is also another possible reason: the Client you are writing from may not have access to the Datanode(s). In the case below, the Hadoop cluster is on Amazon EC2 and the Client is outside the cluster, trying to upload a file to HDFS.

Here is the code. It is very simple: we specify a URI and write a sequence file to HDFS.

String uri = "hdfs://ec2-x-x-x.compute-1.amazonaws.com:9000/user/ubuntu/input/sequence/SequenceFile.seq";
Configuration conf = new Configuration();
conf.addResource(new Path("conf/hadoop-cluster.xml"));

FileSystem fs = FileSystem.get(URI.create(uri), conf);
Path path = new Path(uri);

if (fs.exists(path))
{
  fs.delete(path, true);
}
LongWritable key = new LongWritable();
TextArrayWritable values = new TextArrayWritable();
Text val = new Text();
SequenceFile.Writer writer = null;
try {

  writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), values.getClass());

  key.set(1);
  Text[] arr = new Text[DATA1.length];
  for(int count = 0 ; count < DATA1.length ; count++) {
    arr[count] = new Text(DATA1[count]);
  }
  values.set(arr);

  writer.append(key, values);

} finally {
  IOUtils.closeStream(writer);
}

Let’s look at the error below.

14/02/17 08:57:40 INFO hdfs.DFSClient: Exception in createBlockOutputStream 172.31.x.x:50010 java.net.ConnectException: Connection timed out: no further information

14/02/17 08:57:40 INFO hdfs.DFSClient: Abandoning blk_5474642743455132775_4411
14/02/17 08:57:40 INFO hdfs.DFSClient: Excluding datanode 172.31.x.x:50010
14/02/17 08:57:40 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/input/sequence/SequenceFile.seq could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

A couple of things to note in the exception:

172.31.x.x:50010 java.net.ConnectException: Connection timed out

Excluding datanode 172.31.x.x:50010

The first message shows the Client trying and failing to communicate with the Datanode. The second message indicates that the Datanode is then excluded from storing any blocks. 172.31.x.x is a valid Datanode, but it is the internal IP of the Datanode in the EC2 cluster. Since the Client couldn’t reach the internal IP, the write failed. And since that Datanode is the only Datanode in the cluster (replication factor 1), the write fails with the exception.

Here the Namenode has no problem communicating with the Datanode, and if we run the program from within the cluster the write succeeds. The error occurs only when a Client outside the cluster tries to write a file to the cluster. But why should that matter when the Namenode can communicate with the Datanode just fine? Look at this source code from DFSClient$DFSOutputStream (Hadoop 1.2.1):

//
// Connect to first DataNode in the list.
//
success = createBlockOutputStream(nodes, clientName, false);

if (!success) {
  LOG.info("Abandoning " + block);
  namenode.abandonBlock(block, src, clientName);

  if (errorIndex < nodes.length) {
    LOG.info("Excluding datanode " + nodes[errorIndex]);
    excludedNodes.add(nodes[errorIndex]);
  }

  // Connection failed. Let's wait a little bit and retry
  retry = true;
}

 

The key thing to understand here is that the Namenode only provides the list of Datanodes that should store the blocks; the Namenode does not write the data to the Datanodes itself. It is the job of the Client to write the data to the Datanodes using the DFSOutputStream. Before any write can begin, the code above makes sure that the Client can communicate with the Datanode(s), and if communication to a Datanode fails, that Datanode is added to excludedNodes.
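Before blaming the cluster, a quick way to test this from the client machine is to try a plain TCP connection to the Datanode’s data transfer port. This is only a diagnostic sketch: the host below is the placeholder address from the log above and 50010 is the Hadoop 1.x default port, so substitute your own values.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DatanodeReachabilityCheck {
  public static void main(String[] args) {
    // Placeholder taken from the "Excluding datanode" log line - replace with your Datanode's address
    String host = "172.31.x.x";
    int port = 50010; // default Datanode data transfer port in Hadoop 1.x

    try (Socket socket = new Socket()) {
      // A short timeout is enough to tell "reachable" apart from "connection timed out"
      socket.connect(new InetSocketAddress(host, port), 5000);
      System.out.println("Client can reach Datanode " + host + ":" + port);
    } catch (IOException e) {
      System.out.println("Client cannot reach Datanode " + host + ":" + port + " - " + e.getMessage());
    }
  }
}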

Bottom line: your Datanodes could be live and healthy, and the communication between the Namenode and the Datanode(s) could be fine, but if the Client writing to HDFS has trouble communicating with the Datanode(s), the write will fail.
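If the root cause is the EC2 internal-IP situation described above, one possible workaround (not part of the setup shown in this post, and only available in Hadoop versions that support the dfs.client.use.datanode.hostname client property) is to ask the client to contact Datanodes by hostname instead of the internal IP handed out by the Namenode. This only helps if the Datanode hostnames resolve to reachable addresses from the client.

Configuration conf = new Configuration();
conf.addResource(new Path("conf/hadoop-cluster.xml"));

// Ask the HDFS client to contact Datanodes by hostname rather than by the
// (internal) IP address returned by the Namenode. Only useful when the
// Datanode hostnames are resolvable and reachable from outside the cluster.
conf.setBoolean("dfs.client.use.datanode.hostname", true);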
