The Copy to HDFS stage copies a local file/folder into HDFS.

The stage communicates with HDFS through the FileSystem methods of the HDFS API.


Configuration

Element       Type     Default                Description
hdfsLocation  String   hdfs://localhost:8020  The HDFS NameNode URL.
hdfsPath      String   (none)                 The path within the HDFS server where the data will be copied to.
localPath     String   (none)                 The folder where the data to be copied to HDFS is located.
overwrite     boolean  false                  Whether or not to overwrite the HDFS folder when copying.
validate      boolean  false                  Whether or not to validate if the local file/folder data contains valid AspireInputFormat.


Example

The following example configures Copy to HDFS to copy a local folder to a local HDFS server.

<component name="PostHDFS" subType="copy-to-hdfs" factoryName="aspire-hadoop-hdfs">
  <hdfsLocation>hdfs://localhost:8020/</hdfsLocation>
  <folderPath>/user/jsmith/testData/</folderPath> 
  <localPath>C:\LocalData\testData</localPath>
  <overwrite>true</overwrite>
  <validate>true</validate>
</component>
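
As a further sketch, a configuration that relies on the documented defaults might look like the following. It omits hdfsLocation, overwrite and validate, so according to the configuration table the stage would presumably connect to hdfs://localhost:8020, leave an existing HDFS folder untouched, and skip validation. The component name and both paths are hypothetical. The target-path element follows the example above (folderPath), although the configuration table lists it as hdfsPath, so use whichever name your version of the component accepts.

```xml
<!-- Hypothetical minimal configuration relying on the documented defaults -->
<component name="CopyToHDFSDefaults" subType="copy-to-hdfs" factoryName="aspire-hadoop-hdfs">
  <!-- Target path inside HDFS (element name taken from the example above; an assumption) -->
  <folderPath>/user/jsmith/archive/</folderPath>
  <!-- Local folder whose contents are copied (hypothetical path) -->
  <localPath>/data/export/archive</localPath>
</component>
```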

Output

The output AspireObject is the same as the input AspireObject, because the job only acts as a trigger to perform the copy.
