...
To reference a stage configured outside the current Pipeline Manager, use the path to that stage component:
```groovy
job | stage("../OtherPipelineManager/HierarchyExtractor")
```
...
Groovy pipelines allow you to dynamically build the list of stages to execute. This gives you finer control over which stages should and shouldn't be processed, based on the input job metadata.
```groovy
def myPath = ((doc.action == "add" || doc.action == "update") ?
        FetchUrl |        // Stages to process if an "add" or "update" action was received
        ExtractText |
        ExtractDomain :
        PrintToFile       // Stages to process if no "add" or "update" action was received
    ) | Feed2Solr         // Stage to process every time, after all other stages
job | myPath
```
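The ternary above works because a stage path is an ordinary Groovy value. Groovy translates `a | b` into `a.or(b)`, so the idea can be modelled in plain Groovy outside Aspire. The sketch below is only an illustration: the `Stage` and `StagePath` classes, the lowercase stage variables, and the `buildPath` closure are hypothetical stand-ins, not Aspire's implementation.

```groovy
// Hypothetical stand-in for a pipeline stage (NOT the Aspire class).
class Stage {
    String name
    // Groovy calls or() when it evaluates `stage | other`
    StagePath or(Stage next) { new StagePath(stages: [this, next]) }
    StagePath or(StagePath next) { new StagePath(stages: [this] + next.stages) }
}

// Hypothetical ordered list of stages produced by chaining with `|`.
class StagePath {
    List<Stage> stages = []
    StagePath or(Stage next) { new StagePath(stages: stages + [next]) }
    StagePath or(StagePath next) { new StagePath(stages: stages + next.stages) }
    List<String> names() { stages*.name }
}

def fetchUrl      = new Stage(name: 'FetchUrl')
def extractText   = new Stage(name: 'ExtractText')
def extractDomain = new Stage(name: 'ExtractDomain')
def printToFile   = new Stage(name: 'PrintToFile')
def feed2Solr     = new Stage(name: 'Feed2Solr')

// Choose the head of the path from the action, then always append the final stage.
def buildPath = { String action ->
    def head = (action in ['add', 'update']) ? (fetchUrl | extractText | extractDomain) : printToFile
    head | feed2Solr
}

assert buildPath('add').names() == ['FetchUrl', 'ExtractText', 'ExtractDomain', 'Feed2Solr']
assert buildPath('delete').names() == ['PrintToFile', 'Feed2Solr']
```

Because the selected branch is just a value, the common final stage is appended once, whichever branch was taken.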
...
Use the redirect feature to write the contents of the jobs received in the current Groovy pipeline to a file: apply the ">>" operator, followed by the target file path.
```groovy
job | FetchUrl | ExtractText >> "out.xml" | Feed2Solr
```
In the previous example, the redirect is executed before the "Feed2Solr" stage, so if that stage adds or modifies any content in the job metadata, the change will not be reflected in the "out.xml" file.
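To capture the changes made by "Feed2Solr" as well, the redirect can presumably be attached to that stage instead, following the `Post2Solr >> "..."` pattern used in the control flow example on this page. A minimal sketch of such a pipeline script fragment, assuming the same stage names:

```groovy
// Redirect attached to Feed2Solr, so the file should reflect the job as it leaves that stage.
// Assumption: ">>" may follow the last stage of a path, as in the control flow example.
job | FetchUrl | ExtractText | Feed2Solr >> "out.xml"
```

Note that in Groovy ">>" binds more tightly than "|", so the redirect applies to the stage immediately before it.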
...
A Closure Stage is a stage embedded in the Groovy Pipeline that receives a Groovy closure to execute. For example:
```groovy
job | stage{it.doc.add("fetchUrl","http://www.searchtechnologies.com")} |
    FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile | Feed2Solr
```
You can use this to configure other job flows too:
```groovy
job | stage{
    it.doc.add("fetchUrl","http://www.searchtechnologies.com");
    it | FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain;
    println "End of Closure Stage"
} | PrintToFile | Feed2Solr
```
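The examples on this page refer to the job inside a closure stage both as `it` (as above) and through a bare `doc`. In plain Groovy, `it` is the implicit single parameter of a closure, and unqualified names can be resolved through the closure's delegate. The sketch below shows both mechanisms outside Aspire. `FakeJob` and `runClosureStage` are hypothetical, and whether Aspire actually sets a delegate this way is an assumption.

```groovy
// Hypothetical job object (NOT the Aspire Job class); doc is modelled as a plain Map.
class FakeJob {
    Map doc = [:]
}

// Hypothetical runner: executes a closure against a job, roughly as a closure stage might.
FakeJob runClosureStage(FakeJob job, Closure body) {
    body.delegate = job                            // lets the body resolve a bare `doc` on the job
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body.call(job)                                 // the job is also passed in as `it`
    return job
}

// Access through the implicit parameter
def viaIt = runClosureStage(new FakeJob()) { it.doc.put('fetchUrl', 'http://www.searchtechnologies.com') }
// Access through the delegate
def viaDelegate = runClosureStage(new FakeJob()) { doc.put('stageNum', 3) }

assert viaIt.doc.fetchUrl == 'http://www.searchtechnologies.com'
assert viaDelegate.doc.stageNum == 3
```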
...
Groovy control flow statements let you choose which pipeline to execute based on any condition:
```groovy
job | FetchUrl | ExtractText;
if (doc.type.text == "text/xml")
    job | XMLProcessor | Post2Solr >> "xmlFiles.xml";
else if (doc.type.text == "text/html")
    job | HTTPProcessor | Post2Solr >> "htmlFiles.xml";
else
    job | Post2Solr >> "otherFiles.xml";
```
...
You can loop through some stages as needed:
```groovy
for (i in 0..9) {
    job | stage {doc.add("stageNum",i)}
}
```
...
You can also configure exceptions to lists of Stages:
```groovy
def myStagePath = FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile;
job | myStagePath.exceptions([
    onComplete: Feed2Solr
]);
```
...
Nested exception handling is also available:
```groovy
def myStagePath = FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile;
job | myStagePath.exceptions([
    onComplete: Feed2Solr.exceptions([
        onError: stage{"it >> 'fetchUrlError.xml'"},
        onComplete: stage{"it >> 'indexedJobs.xml'"}
    ])
]);
```
...
Groovy pipelines provide a way to control the flow of sub-jobs through stages. Using the subJobs() method of a stage, you can specify what to execute for any sub-jobs generated in that stage. The method accepts either a single Groovy closure or a Map that associates each label (the one used when the sub-job was branched) with a Stage (or a List of stages):
```groovy
job | FetchUrl | XmlSubJobExtractor.subJobs([
    onSubJob: stage{it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
])
```
...
or just a single closure that is executed regardless of the branch labels of the sub-jobs:
```groovy
job | FetchUrl | XmlSubJobExtractor.subJobs(
    {it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
)
```
...
In the current design, when you create a sub-job extractor for use in Groovy pipelines, you need to create a dummy sub-job extractor. For example:
```xml
<component name="XmlSubJobExtractor" subType="xmlSubJobExtractor" factoryName="aspire-xml-files">
  <branches>
    <branch event="onSubJob" pipelineManager="DUMMY" />
  </branches>
</component>
```
...
For example:
```xml
<pipeline name="doc-process" default="true">
  <script maxThreadPools="10" maxThreadsPerPool="10" maxQueueSizePerPool="30"><![CDATA[
    job | FetchUrl | XmlSubJobExtractor.subJobs([
      onSubJob: stage{it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
    ])
  ]]></script>
</pipeline>
```
...
You can create jobs inside a Groovy Pipeline by using the createJob method:
```groovy
contactsJob = createJob('<doc><url>'+doc.url.text()+'/contacts.html</url></doc>')
contactsJob | FetchUrl | ExtractText
```
...
Example:
```groovy
dir {it | FetchUrl | ExtractText >> "files.xml"}                         // Only files inside the Aspire_Home directory
dir ({it | FetchUrl | ExtractText >> "files+dir.xml"},"+d")              // Files and directories inside the Aspire_Home directory
dir ({it | FetchUrl | ExtractText >> "files+dir.xml"},"+d+r")            // Files and directories recursively inside the Aspire_Home directory
dir ("data",{it | FetchUrl | ExtractText >> "data_files.xml"})           // Only files inside the Aspire_Home/data directory
dir ("data",{it | FetchUrl | ExtractText >> "data_files+dir.xml"},"+d")  // Files and directories inside the Aspire_Home/data directory
```
...
Configuration:
```xml
<initialization name="Test Long Initializer">
  <check componentRef="/pipeline/LongInitializer"/>
  <check componentRef="/pipeline/Concat-Test"/>
</initialization>
```
...
Configuration:
```xml
<jobCount name="Count of Document Jobs" redThreshold="3" yellowThreshold="1"/>
```
...
Configuration:
```xml
<timestamp name="Rebuild Dictionary Token Stats" history="5" redThreshold="10000" yellowThreshold="2000"/>
```
...
Configuration: (defaults to 15-minute intervals over 24 hours)
```xml
<latency name="Process Single Document" jobsToAverage="5" isSticky="true"
         redThreshold="15000" yellowThreshold="5000" />
```
...
Configuration: (specify the interval and history length)
```xml
<latency name="Process Single Document" jobsToAverage="5" isSticky="true"
         redThreshold="15000" yellowThreshold="5000"
         interval="3600000" history="48" />
```
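As a rough check of what these two attributes imply, the arithmetic can be worked through in plain Groovy. This is a sketch under two assumptions that the page does not state explicitly: that `interval` is given in milliseconds and that `history` is the number of intervals retained.

```groovy
import java.time.Duration

// Values copied from the <latency> example; the units are assumed, not confirmed by the source.
def interval = Duration.ofMillis(3600000L)    // assumed milliseconds, i.e. one hour per interval
int history = 48                               // assumed: number of intervals kept
def window = interval.multipliedBy(history)    // total time span covered by the history

assert interval.toMinutes() == 60
assert window.toHours() == 48

// Under the same assumptions, the stated default (15-minute intervals over 24 hours)
// would correspond to 96 retained intervals.
assert Duration.ofMinutes(15).multipliedBy(96) == Duration.ofHours(24)
```

Under those assumptions, the example keeps 48 one-hour intervals, that is, two days of latency history.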
...