The Hadoop Emit stage writes a key/value pair into a Hadoop Context. The Hadoop Context reference is in a job variable called hadoopContext. The key is configured as a SimpleTemplate and the value is the job's AspireObject.
Configuration
Element | Type | Default | Description
keyTemplate | SimpleTemplate String | {hadoopKey} | A simple template to extract the value of the key from the Aspire job.
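To illustrate how a key template such as {hadoopKey} could turn into a concrete key, here is a minimal sketch of placeholder substitution. It is not Aspire's SimpleTemplate implementation: the function resolve_key_template, the variables dictionary, and the sample value doc-0001 are all hypothetical, and the prefixed forms (TAG:, XML:, XPATH:) are not modeled.

```python
import re

# Hypothetical helper for illustration only; NOT Aspire's SimpleTemplate code.
# Matches a {name} placeholder that contains no nested braces.
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def resolve_key_template(template, variables):
    """Replace each {name} placeholder in template with variables[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # Fail loudly rather than emit a record with an unresolved key.
            raise KeyError(f"no value for placeholder {{{name}}}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


# Assumed job variables; "doc-0001" is made-up sample data.
job_variables = {"hadoopKey": "doc-0001"}
key = resolve_key_template("{hadoopKey}", job_variables)
print(key)
```

In this sketch a missing variable raises an error instead of producing an empty key. How the real stage handles that case is not stated here.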
Example Configuration
This section provides examples of Hadoop Emit configuration.
Set a key from a job variable
<component name="Emit" subType="default" factoryName="aspire-hadoop-emit">
  <keyTemplate>{hadoopKey}</keyTemplate>
</component>
Set a key from a field in the AspireObject
<component name="Emit" subType="default" factoryName="aspire-hadoop-emit">
  <keyTemplate>{XML:url}</keyTemplate>
</component>

<component name="Emit" subType="default" factoryName="aspire-hadoop-emit">
  <keyTemplate>{TAG:url}</keyTemplate>
</component>
Note: Both TAG and XML work the same way.
Set a key by querying the AspireObject with AXPath
<component name="Emit" subType="default" factoryName="aspire-hadoop-emit">
  <keyTemplate>{XPATH:/doc/field[@name='url']/.}</keyTemplate>
</component>
Set the key to the job ID
<component name="Emit" subType="default" factoryName="aspire-hadoop-emit">
  <keyTemplate>{JOBID}</keyTemplate>
</component>
Output
The stage produces no job output. The key/value pair is written directly to the Hadoop Context referenced by the job variable hadoopContext.
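The write path described above can be sketched with a stand-in object in place of the real Hadoop Context. Everything here is an assumption for illustration: FakeContext, the emit function, and the sample key and document are hypothetical. The real stage runs inside Aspire and writes to the actual Hadoop Context.

```python
class FakeContext:
    """Hypothetical stand-in for the Hadoop Context; it only records writes."""

    def __init__(self):
        self.pairs = []

    def write(self, key, value):
        # Store the pair so it can be inspected afterwards.
        self.pairs.append((key, value))


def emit(job_variables, key, aspire_object):
    """Write (key, aspire_object) to the context held in the job variables.

    Returns nothing, mirroring the fact that the stage has no job output.
    """
    context = job_variables["hadoopContext"]
    context.write(key, aspire_object)


# Assumed setup: the context reference lives in the hadoopContext job variable.
context = FakeContext()
job_variables = {"hadoopContext": context}

# Made-up key and document content, used only to exercise the sketch.
result = emit(job_variables, "doc-0001", "<doc><url>http://example.com</url></doc>")
print(context.pairs)
```

The only effect of emit in this sketch is the recorded pair on the context. Nothing is returned to the caller.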