A workflow is a set of rules, grouped by workflow event, that are executed sequentially for every job being processed. A workflow can be assigned to multiple seeds; all seeds assigned the same workflow share all of its rules.
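The relationship between seeds and workflows can be pictured with a small lookup. This is only an illustrative sketch: the dictionary layout, the seed and workflow IDs, and the rule IDs are all made up for the example and do not reflect how Aspire stores its configuration.

```python
from typing import Dict, List

# Hypothetical workflow: event name -> ordered list of rule IDs.
workflows: Dict[str, Dict[str, List[str]]] = {
    "wf-common": {
        "onAddUpdate": ["example-rule-1", "example-rule-2"],
        "onPublish": ["example-publish-rule"],
    }
}

# Hypothetical seeds, each pointing at a workflow by ID.
seeds: Dict[str, str] = {"seed-a": "wf-common", "seed-b": "wf-common"}


def rules_for(seed_id: str, event: str) -> List[str]:
    """Return the rule IDs a job from this seed runs for the event, in order."""
    return workflows[seeds[seed_id]].get(event, [])
```

Because both seeds point at the same workflow, `rules_for("seed-a", "onAddUpdate")` and `rules_for("seed-b", "onAddUpdate")` return the same list.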


Rules


A workflow rule is any application or script executed as part of the workflow. Rules can process the content of the jobs received in the workflow or control their flow.

Rule Types

Type | Description
Application | Rules that reference an Aspire application contain the Maven coordinates of the application and the properties to configure it.
Template | Rules created from a workflow template contain the ID of the template and the properties to configure it. The type of these rules matches the one from the referenced template (Custom, Application, etc.)
Custom | A custom Groovy script rule contains the script to execute.
Folder | A type of template created to contain references to other rules.

Templates

Templates are predefined workflow rules. In general, they are Groovy scripts for common job operations such as adding or copying fields. To use a template, create a rule that references it and includes the required properties.

Provided Templates

Name | Description
add-static-acl | Adds the configured ACL to the ACLs list of all jobs.
add-static-group | Adds the configured group to the groups list of all jobs.
override-with-public-acl | Overrides the ACLs of a job with a single PUBLIC:ALL ACL.
override-acl | Overrides the ACLs of a job with the ACL provided.
override-acl-domain | Overrides the domain of all the ACLs with the provided domain.
dump-document | Logs the content of the job with the configured log severity level.
log | Logs the specified message with the configured log severity level.
exception | Throws an exception with the configured message. This will cause the job to fail.
exception-conditional | Throws an exception with the configured message if the specified job field matches the configured value. This will cause the job to fail.
terminate | Terminates the job. This ceases the processing of the job.
terminate-conditional | Terminates the job if the specified job field matches the configured value. This ceases the processing of the job.
terminate-by-file-extension | Terminates the job based on the extension of the URL field. This ceases the processing of the job.
terminate-by-file-size | Terminates the job based on the value of the dataSize field. This ceases the processing of the job.
terminate-by-file-name | Terminates the job based on the file name in the URL field. This ceases the processing of the job.
field-copy | Copies the value of the source field to a target field. This does not override an existing target field; it creates a new one with the same name.
field-fallback-copy | Copies the value of the source field to a target field only if the target field does not exist.
field-default | Creates the configured field with the specified value if it does not exist.
field-set | Overrides the configured field with the specified value; the field is created if it does not exist.
boolean | Checks if a field matches a specific value. Execution of the job continues with the condition it matches.
field-equals | Checks if a field matches another. Execution of the job continues with the condition it matches.
field-isEmpty | Checks if a field value is empty or null, or if it does not exist. Execution of the job continues with the condition it matches.
switch | Executes a switch statement on the specified field. Execution of the job continues with the condition it matches.
switch-custom | Executes a switch statement on the specified field. Execution of the job continues with the condition it matches.
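The four field templates differ only in how they treat an existing target field. The sketch below models those differences in Python for illustration; it is not the code of the provided templates, which are Groovy scripts. The document is modeled here as a dictionary that maps a field name to a list of values, which is an assumption chosen so that field-copy can add a second field with the same name, and all function names are hypothetical.

```python
from typing import Dict, List

# Assumed model: field name -> list of values (a name may occur more than once).
Doc = Dict[str, List[str]]


def field_copy(doc: Doc, source: str, target: str) -> None:
    """Add the source values as new target entries; existing entries are kept."""
    if source in doc:
        doc.setdefault(target, []).extend(list(doc[source]))


def field_fallback_copy(doc: Doc, source: str, target: str) -> None:
    """Copy source to target only when the target field does not exist."""
    if source in doc and target not in doc:
        doc[target] = list(doc[source])


def field_default(doc: Doc, name: str, value: str) -> None:
    """Create the field with the given value only when it does not exist."""
    if name not in doc:
        doc[name] = [value]


def field_set(doc: Doc, name: str, value: str) -> None:
    """Replace the field with the given value, creating it if needed."""
    doc[name] = [value]
```

For a document `{"title": ["Report"], "name": ["old"]}`, `field_copy(doc, "title", "name")` leaves `name` holding both `"old"` and `"Report"`, whereas `field_fallback_copy` with the same arguments changes nothing because `name` already exists.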



Workflow Events


A workflow event is a virtual set of rules, executed sequentially, that lives inside a workflow object. There are 10 possible events, each triggered at a different stage in the lifespan of a document:

  • onScan
    • Executed for each document after it has been discovered (as soon as its parent document discovers it)
  • onAddUpdate
    • Executed once the document is identified, based on its incremental state, as having one of the following actions: add or update
  • onDelete
    • Executed once the document is identified to be deleted (action: delete)
  • onSubJob
    • Executed for any subjob created by Aspire components. This is the event in which to place any rule that should be executed only for subjobs. (Available from Aspire 5.1)
  • onPublish
    • Executed after onAddUpdate, onDelete or onSubJob. This is the event in which to place all common publishing rules, regardless of the document action.
  • onError
    • Executed if any stage within the connector framework or workflow rule fails while processing a job.
  • onIdentity
    • Executed for each identity extracted in an Identity crawl.
  • onIdentityError
    • Executed if any stage within the connector framework or workflow rule fails while processing an identity job.
  • onStart
    • Executed only for the crawl start job. Rules that must be executed only at the beginning of the crawl should be added here. (Available from Aspire 5.1)
  • onEnd
    • Executed only for the crawl end job. Rules that must be executed only at the end of the crawl should be added here. (Available from Aspire 5.1)
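The relationship between the action-specific events and onPublish can be sketched as a small lookup. This is a simplified model under stated assumptions, not Aspire's dispatch code: the function names are hypothetical, the rule that a subjob always goes to onSubJob regardless of its action is an assumption, and onScan, the error events, the identity events, and onStart/onEnd are left out.

```python
from typing import List


def action_event(action: str, is_sub_job: bool = False) -> str:
    """Pick the action-specific event for a job."""
    if is_sub_job:
        # Assumption: a subjob is routed to onSubJob whatever its action is.
        return "onSubJob"
    if action in ("add", "update"):
        return "onAddUpdate"
    if action == "delete":
        return "onDelete"
    raise ValueError(f"unknown action: {action!r}")


def event_sequence(action: str, is_sub_job: bool = False) -> List[str]:
    """onPublish runs after the action-specific event."""
    return [action_event(action, is_sub_job), "onPublish"]
```

Under this model, `event_sequence("update")` yields `["onAddUpdate", "onPublish"]` and `event_sequence("delete")` yields `["onDelete", "onPublish"]`, which is why rules common to every action belong in onPublish.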

Event Items

Each event is composed of a set of items. Each item defines an action to execute in the event, and in some cases they can contain nested event items. There are three types of event items:

Reference

These items are references to rules that will be executed in the event; each reference contains the ID of the rule to execute. References to rules of type “Folder” can contain nested references or exit items.

Exit

Exit items end the execution of the event at a specific point. Any job that reaches an exit item leaves the workflow event, but the job is not terminated: processing continues with the next stage, pipeline, or event.

Condition

Conditions are items used to control the flow of a job. If a job matches the condition statement, it is processed by the items contained in the condition; if not, it continues to the next item in the workflow. Condition items are always nested in references to rules of type “Choice”, and they can contain other references and exit items. All condition items are associated with the template “Condition”.
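How the three item types interact can be sketched with a tiny interpreter. Everything here is a hypothetical model for illustration: the class names, the `run_event` function, and the use of a Python callable as the condition statement are assumptions. The sketch also assumes that every condition is evaluated in order and that execution continues with the following item after a matched condition finishes; the text above does not say whether Aspire stops at the first match.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

Job = Dict[str, str]  # assumed job model: field name -> value


@dataclass
class Exit:
    """Ends the current event; the job itself is not terminated."""


@dataclass
class Condition:
    matches: Callable[[Job], bool]
    items: List["Item"] = field(default_factory=list)


@dataclass
class Reference:
    rule_id: str
    # Nested items, as allowed for references to Folder or Choice rules.
    items: List["Item"] = field(default_factory=list)


Item = Union[Reference, Exit, Condition]


class _ExitEvent(Exception):
    """Internal signal used to unwind out of nested items."""


def _run(items: List[Item], job: Job, trace: List[str]) -> None:
    for item in items:
        if isinstance(item, Exit):
            raise _ExitEvent()
        if isinstance(item, Condition):
            if item.matches(job):
                _run(item.items, job, trace)
            continue  # no match: move on to the next item
        trace.append(item.rule_id)  # stand-in for executing the rule
        _run(item.items, job, trace)


def run_event(items: List[Item], job: Job) -> List[str]:
    """Return the IDs of the rules executed for this job, in order."""
    trace: List[str] = []
    try:
        _run(items, job, trace)
    except _ExitEvent:
        pass  # the job leaves this event and moves on to the next stage
    return trace
```

In this model, an exit item nested inside a matched condition stops the whole event, so any references that follow the enclosing Choice reference are skipped for that job only.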
