What is the difference between hadoop fs put and copyFromLocal?


The answer to this question depends on the version of Hadoop you are using.

Older versions of Hadoop (1.x.x)

The first argument of copyFromLocal is restricted to a location in the local filesystem, whereas the put command is not restricted to the local filesystem.

With put, you can specify the filesystem scheme (file:// or hdfs://) to distinguish between the local filesystem and HDFS.

copyFromLocal

hdfs dfs -copyFromLocal <local source> <destination>

put

hdfs dfs -put <source> <destination>
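To make the role of the scheme prefix concrete, here is a minimal Java sketch that reads the scheme from a path using the standard java.net.URI class. It does not call Hadoop at all; the class name SchemeDemo, the method filesystemOf, the host namenode:8020 and the file paths are all made up for illustration.

```java
import java.net.URI;

class SchemeDemo {
    // Returns a label for the filesystem a path points at, based only on its URI scheme.
    // This is an illustration, not how Hadoop itself resolves paths.
    static String filesystemOf(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null) {
            return "no scheme given";
        }
        switch (scheme) {
            case "file": return "local filesystem";
            case "hdfs": return "HDFS";
            default:     return "other (" + scheme + ")";
        }
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        System.out.println(filesystemOf("file:///tmp/sales.csv"));
        System.out.println(filesystemOf("hdfs://namenode:8020/data/sales.csv"));
        System.out.println(filesystemOf("/data/sales.csv"));
    }
}
```

A source written as file:///tmp/sales.csv is explicitly local, while one written as hdfs://namenode:8020/data/sales.csv is explicitly on HDFS; that explicit prefix is what put accepts in the older versions.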

 

Newer versions of Hadoop (> 2.0.0)

With newer versions of Hadoop, put and copyFromLocal do exactly the same thing. In fact, copyFromLocal is defined in terms of the put command. You can see this by calling help on both commands.

[hirw@wk1 ~]$ hdfs dfs -help put
-put [-f] [-p] [-l] <localsrc> ... <dst> :
  Copy files from the local file system into fs. Copying fails if the file already
  exists, unless the -f flag is given.
  Flags:

  -p  Preserves access and modification times, ownership and the mode.
  -f  Overwrites the destination if it already exists.
  -l  Allow DataNode to lazily persist the file to disk. Forces
         replication factor of 1. This flag will result in reduced
         durability. Use with care.

[hirw@wk1 ~]$ hdfs dfs -help copyFromLocal
-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst> :
  Identical to the -put command.
[hirw@wk1 ~]$

The source code confirms this: the CopyFromLocal class extends Put and changes only the name and description.

public static class CopyFromLocal extends Put { 
  public static final String NAME = "copyFromLocal"; 
  public static final String USAGE = Put.USAGE; 
  public static final String DESCRIPTION = "Identical to the -put command."; 
}
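The effect of that inheritance can be sketched with simplified stand-ins. The classes below are not the real Hadoop classes; the run method and the paths are hypothetical and exist only to show that a subclass which overrides nothing but metadata behaves exactly like its parent.

```java
class InheritanceDemo {
    // Simplified stand-in for Hadoop's Put command (not the real class).
    static class Put {
        public static final String NAME = "put";

        // Pretends to copy a file and reports what it would do.
        String run(String localSrc, String dst) {
            return "copy " + localSrc + " -> " + dst;
        }
    }

    // Same shape as the snippet above: only the metadata differs, no behaviour is overridden.
    static class CopyFromLocal extends Put {
        public static final String NAME = "copyFromLocal";
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        String viaPut = new Put().run("/tmp/a.txt", "/user/demo/a.txt");
        String viaCopyFromLocal = new CopyFromLocal().run("/tmp/a.txt", "/user/demo/a.txt");

        System.out.println(viaPut);
        System.out.println(viaCopyFromLocal);
        System.out.println("same behaviour: " + viaPut.equals(viaCopyFromLocal));
    }
}
```

Because CopyFromLocal inherits run unchanged, both commands in this sketch produce the same result; only the NAME constant differs.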

 

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.
