The Content Type Detector component can be configured using the Aspire Admin UI. It requires the following entities to be created:

  • Connector
  • Seed

This component is an application for workflow configuration, and it is used in the "onAddUpdate" Workflow Event.

Create Workflow


  1. On the Aspire Admin UI, go to the workflow page.
  2. All existing workflows will be listed. Click on the new button.
  3. Enter the new workflow description.
  4. Select the “Create” button.
  5. Go to the Workflow Event “onAddUpdate”.
  6. In “Type criteria”, search the Applications options and drag the Content Type Detector component into the onAddUpdate section.
  7. Enter a new description for this application component.
  8. General:
    1. Ignore Delete Jobs: Select if delete jobs need to be ignored.
    2. Fetch file: Fetch the file before text extraction. If you disable this, make sure some preceding stage or component has assigned a content stream to the job.
      1. Use the default document path: Select this so that Aspire uses the fetchUrl or displayUrl as the location of the file. Clear it if your Aspire document stores the path to the file in a different location.
        1. Document fetch path: The location in the Aspire document of the path to the file to fetch.
    3. Max Lookahead in MBytes for type detection: The maximum amount of the file stream, in megabytes, to consume when detecting the type, especially for CSV/TSV detection.
    4. Max percent of column variability to allow in text separated files: The maximum percentage of variability to allow in the number of columns when detecting the Content Type of separated value files. NOTE: If you set a high variability, you may get wrongly detected types for the files.
    5. Apache Tika configuration path: Path to the Apache Tika configuration file.
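
To give a feel for how the lookahead and column variability settings interact, here is a minimal Python sketch. It is not Aspire's implementation, and the function names, the parameter names, and the way variability is computed (the share of lines whose column count differs from the most common count) are assumptions made for illustration only.

```python
import io
from collections import Counter


def column_variability_percent(sample: str, delimiter: str) -> float:
    """Percentage of non-empty lines whose column count differs from the
    most common column count (an assumed definition of "variability")."""
    counts = [len(line.split(delimiter)) for line in sample.splitlines() if line.strip()]
    if not counts:
        return 100.0
    most_common_frequency = Counter(counts).most_common(1)[0][1]
    return 100.0 * (len(counts) - most_common_frequency) / len(counts)


def looks_like_separated_values(stream, delimiter, max_lookahead_mbytes, max_variability_percent):
    """Read at most max_lookahead_mbytes from the stream and decide whether
    the sample is regular enough to be treated as a separated value file."""
    # Only the first N megabytes are inspected; the rest of the stream is not consumed.
    # Note: if the limit cuts a line in half, this sketch counts the partial line too.
    sample_bytes = stream.read(int(max_lookahead_mbytes * 1024 * 1024))
    sample = sample_bytes.decode("utf-8", errors="replace")
    if delimiter not in sample:
        return False
    return column_variability_percent(sample, delimiter) <= max_variability_percent


# Hypothetical sample: three lines with 3 columns and one line with 2 columns.
csv_bytes = b"a,b,c\n1,2,3\n4,5,6\n7,8\n"
print(column_variability_percent(csv_bytes.decode("utf-8"), ","))  # 25.0
print(looks_like_separated_values(io.BytesIO(csv_bytes), ",", 1, 10.0))  # False
print(looks_like_separated_values(io.BytesIO(csv_bytes), ",", 1, 30.0))  # True
```

In this sketch, raising the allowed variability from 10 to 30 percent is what makes the irregular sample count as a CSV file, which illustrates why a high variability setting can lead to wrongly detected types.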












