The DistCp command works for copying from one HDFS cluster to another. The problem is that, as far as I am aware, nowhere is it specified what the source and destination URLs should be.
Many places say to use hdfs://namenode1:50070/path_to_file, and many others give some other port number. (50070 is the default port of the NameNode web interface, not the port the filesystem itself listens on.)
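So, after a lot of debugging, I figured out that each URL should be built from the fs.default.name property specified in hadoop-site.xml on the corresponding cluster:
hadoop distcp hdfs://<host:port from the source cluster's fs.default.name>/path_to_file hdfs://<host:port from the destination cluster's fs.default.name>/path_to_file
For example: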
hadoop distcp hdfs://remotehost:10000/opt/hadoop-name/foo/bar hdfs://localhost:54310/opt/hadoop-name/foo/bar
The important point here is that each URL must be taken exactly as it appears in fs.default.name in hadoop-site.xml in that cluster's Hadoop conf directory.
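As a rough sketch of how one might read that value instead of copying it by hand, the helper below prints the fs.default.name value from a config file. The function name fs_default_name is hypothetical (it is not part of Hadoop), and the sketch assumes the usual layout where the <value> element follows the <name> element within the same property, and that the value is a full URI such as hdfs://localhost:54310.

```shell
# Hypothetical helper (not part of Hadoop): print the value of
# fs.default.name from the Hadoop config file given as $1.
# Assumes <value> appears on the same line as, or on a line after,
# the matching <name> element.
fs_default_name() {
  awk '
    /<name>fs\.default\.name<\/name>/ { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and anything after
      print
      exit
    }
  ' "$1"
}
```

Run on each cluster against its own hadoop-site.xml, this should print the prefix to put in front of the path for that side of the distcp command, for instance hdfs://localhost:54310 for the local cluster in the example above.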
Monday, February 09, 2009
Distributed copy from remote hdfs to local hdfs