ALPHA VERSION
The Avro Files Extractor reads an Avro file from a job's content stream and submits each Avro record as a subjob:
Avro Files Extractor | |
---|---|
Factory Name | com.searchtechnologies.aspire:aspire-avro-extractor |
subType | default |
Inputs | A job containing a data stream (object['contentStream']), which is a stream to the Avro file to process. |
Outputs | One subDocument for each Avro record in the Avro file, submitted as a subjob. |

Element | Type | Default | Description |
---|---|---|---|
subJobTimeout | long | 600000 | Timeout for subjobs, in milliseconds. |
debug | boolean | false | If true, the component logs debug information. |
notStoreIds | boolean | false | If true, the component will NOT store IDs for future deletes. |
noInfoMessages | boolean | false | If true, info messages are not written to the log. |
bulkSize | int | 1000 | The bulk size for NoSQL. |
bulkTimeout | long | 1000 | The bulk timeout for NoSQL, in milliseconds. |
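
As an illustration of how the elements in the table map onto a component definition, the fragment below sets each one to its documented default as a literal value. This is a minimal sketch, not a complete configuration: the branches are omitted, and the placement of subJobTimeout as a direct child of the component is an assumption based on how the other elements are configured.

```xml
<component name="AvroSubJobExtractor" subType="default" factoryName="aspire-avro-extractor">
  <!-- Assumption: subJobTimeout sits alongside the other elements; value in milliseconds -->
  <subJobTimeout>600000</subJobTimeout>
  <!-- Documented defaults, written as literals instead of ${...} properties -->
  <debug>false</debug>
  <notStoreIds>false</notStoreIds>
  <noInfoMessages>false</noInfoMessages>
  <bulkSize>1000</bulkSize>
  <bulkTimeout>1000</bulkTimeout>
</component>
```

In practice the values are usually supplied through ${...} properties, as in the full configuration sample.
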
    <component name="AvroSubJobExtractor" subType="default" factoryName="aspire-avro-extractor">
      <debug>${debug}</debug>
      <notStoreIds>${notStoreIds}</notStoreIds>
      <noInfoMessages>${noInfoMessages}</noInfoMessages>
      <bulkSize>${bulkSize}</bulkSize>
      <bulkTimeout>${bulkTimeout}</bulkTimeout>
      <branches>
        <branch event="onAddUpdateSubJob" pipelineManager="CompletePM" batching="false" />
        <branch event="onDeleteSubJob" pipelineManager="DeletePM" batching="false" />
      </branches>
    </component>
    <component name="CompletePM" subType="pipeline" factoryName="aspire-application">
      <debug>${debug}</debug>
      <gatherStatistics>${debug}</gatherStatistics>
      <pipelines>
        <pipeline name="completePipeline" default="true">
          <script>
            <![CDATA[
              job | AddUpdateJobLogger
              job.addRoute("/"+doc.sourceId.text()+"/ProcessPipelineManager@${addUpdatePM}")
            ]]>
          </script>
        </pipeline>
      </pipelines>
      <components>
        <component name="AddUpdateJobLogger" subType="jobLogger" factoryName="aspire-tools">
          <debug>${debug}</debug>
          <logFile>log/${app.name}/addUpdate.jobs</logFile>
        </component>
      </components>
    </component>
    <component name="DeletePM" subType="pipeline" factoryName="aspire-application">
      <debug>${debug}</debug>
      <gatherStatistics>${debug}</gatherStatistics>
      <pipelines>
        <pipeline name="deletePipeline" default="true">
          <script>
            <![CDATA[
              job | DeletedJobLogger
              job.addRoute("/"+doc.sourceId.text()+"/ProcessPipelineManager@${deletePM}")
            ]]>
          </script>
        </pipeline>
      </pipelines>
      <components>
        <component name="DeletedJobLogger" subType="jobLogger" factoryName="aspire-tools">
          <debug>${debug}</debug>
          <logFile>log/${app.name}/deleted.jobs</logFile>
        </component>
      </components>
    </component>