The Content Type Detector is a workflow component for Aspire.

| Content Type Detector | |
|---|---|
| Factory Name | com.accenture.aspire:content-type-detector |
| subType | job-input |
| Inputs | The field from which you want to get the value and a field to be created in the Aspire document. |
| Outputs | An Aspire object that contains a subjob with metadata and the checked-out content extracted from the file. |
Configuration
This section lists all configuration parameters available for the Content Type Detector component.
| Element | Type | Default | Description |
|---|---|---|---|
| **General** | | | |
| Ignore Delete Jobs | boolean | True | Option to skip delete jobs. |
| Fetch file | boolean | False | Select if the file needs to be fetched. |
| Use default document path | boolean | True | Select so that Aspire uses the fetchUrl or displayUrl as the location of the file. |
| Document fetch path | None | None | Location in the Aspire document of the path to the file to fetch. |
| Max Lookahead in MBytes for type detection | text | No | Maximum amount of the file stream, in megabytes, to consume when detecting the type. |
| Max percent of column variability to allow in text separated files | text | No | Maximum percentage of variability to allow in the number of columns. |
| Apache Tika configuration path | text | No | Path to the Apache Tika configuration file. |
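
To illustrate how the fetch-related options might interact, here is a minimal Python sketch. It is not Aspire's implementation: the function `resolve_fetch_location`, the dict-based stand-in for an Aspire document, the preference of `fetchUrl` over `displayUrl`, and the treatment of `/doc` as the document root are all assumptions made for illustration. Only the option names (`enableFetchUrl`, `defaultFetchPath`, `fetchPath`) are taken from the Example Configuration.

```python
from typing import Optional


def resolve_fetch_location(doc: dict, config: dict) -> Optional[str]:
    """Return the location of the file to fetch, or None if fetching is
    disabled or no location can be found (hypothetical helper)."""
    # "Fetch file" switched off: nothing to resolve.
    if not config.get("enableFetchUrl", False):
        return None

    # "Use default document path": fall back from fetchUrl to displayUrl.
    # The order of preference is an assumption, not documented behavior.
    if config.get("defaultFetchPath", True):
        return doc.get("fetchUrl") or doc.get("displayUrl")

    # "Document fetch path": walk a path such as "/doc/fetchUrl".
    # Treating the leading "doc" segment as the document root is an assumption.
    segments = [s for s in config.get("fetchPath", "").split("/") if s]
    if segments and segments[0] == "doc":
        segments = segments[1:]

    node = doc
    for segment in segments:
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node if isinstance(node, str) else None
```

With the default path enabled, a document that has only a `displayUrl` would resolve to that value; with it disabled, the lookup follows `fetchPath` instead.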
Example Configuration
"General":[
  {
    "ignoreDeleteJobs": true,
    "enableFetchUrl": true,
    "defaultFetchPath": true,
    "fetchPath": "/doc/fetchUrl",
    "maxLookaheadSize": 0.5,
    "variabilityPercent": 0,
    "tikaConfig": "/path/to/tikaConfig.xml"
  }
],
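
Because the example above is a fragment of a larger configuration, it cannot be parsed as JSON on its own. The following Python sketch shows one way to sanity-check such a fragment before deploying it. The helpers `parse_general` and `check_types` and the `EXPECTED_TYPES` mapping are hypothetical and not part of Aspire. The expected types are inferred from the example values, so the two numeric settings are checked as numbers even though the parameter table lists them as text.

```python
import json

# The fragment from the example configuration, copied verbatim.
FRAGMENT = """
"General":[
  {
    "ignoreDeleteJobs": true,
    "enableFetchUrl": true,
    "defaultFetchPath": true,
    "fetchPath": "/doc/fetchUrl",
    "maxLookaheadSize": 0.5,
    "variabilityPercent": 0,
    "tikaConfig": "/path/to/tikaConfig.xml"
  }
],
"""

# Assumed types, inferred from the example values (not an official schema).
EXPECTED_TYPES = {
    "ignoreDeleteJobs": bool,
    "enableFetchUrl": bool,
    "defaultFetchPath": bool,
    "fetchPath": str,
    "maxLookaheadSize": (int, float),
    "variabilityPercent": (int, float),
    "tikaConfig": str,
}


def parse_general(fragment: str) -> dict:
    """Wrap the fragment in braces so it parses, then return the first General entry."""
    body = fragment.strip().rstrip(",")
    return json.loads("{" + body + "}")["General"][0]


def check_types(settings: dict) -> list:
    """Return a list of problems; an empty list means every key looks plausible."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in settings:
            problems.append(f"missing: {key}")
        elif isinstance(settings[key], bool) and expected is not bool:
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"wrong type: {key}")
        elif not isinstance(settings[key], expected):
            problems.append(f"wrong type: {key}")
    return problems


settings = parse_general(FRAGMENT)
print(check_types(settings))
```

A check like this only catches missing keys and obvious type mismatches; it does not confirm that the Tika configuration file exists or that the values are accepted by the component.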