
...

To reference a stage configured outside the current Pipeline Manager, use the path to that stage component:

Code Block

language	xml
linenumbers	true

  job | stage("../OtherPipelineManager/HierarchyExtractor")

...

Groovy pipelines let you build the list of stages to execute dynamically. This gives you simpler, finer control over which stages should and shouldn't be processed, based on the input job metadata.

Code Block

language	xml
linenumbers	true

  def myPath = ((doc.action == "add" || doc.action == "update") ?
                  FetchUrl |        // Stages to process if "add" or "update" action was received
                    ExtractText |
                    ExtractDomain :
                  PrintToFile       // Stages to process if no "add" or "update" action was received
               ) | Feed2Solr        // Stage to process every time after all stages

  job | myPath
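
The composition above relies on Groovy operator overloading, where `|` is dispatched to a method named `or`. The following is a minimal, self-contained sketch of that idea and not Aspire's implementation. The `Stage` and `StagePath` classes, the `buildPath` closure, and the lower-case variable names are hypothetical stand-ins introduced only for illustration.

```groovy
// Hypothetical stand-ins for illustration only; these are not Aspire classes.
class Stage {
    String name
    // Groovy dispatches the '|' operator to a method named 'or'.
    StagePath or(Stage next) { new StagePath(stages: [this, next]) }
}

class StagePath {
    List<Stage> stages = []
    // Appending a stage returns a new, longer path.
    StagePath or(Stage next) { new StagePath(stages: stages + [next]) }
}

def fetchUrl      = new Stage(name: 'FetchUrl')
def extractText   = new Stage(name: 'ExtractText')
def extractDomain = new Stage(name: 'ExtractDomain')
def printToFile   = new Stage(name: 'PrintToFile')
def feed2Solr     = new Stage(name: 'Feed2Solr')

// Choose the stage list from the action, then always finish with Feed2Solr.
def buildPath = { String action ->
    ((action == 'add' || action == 'update') ?
        fetchUrl | extractText | extractDomain :
        printToFile) | feed2Solr
}

println buildPath('add').stages*.name
println buildPath('delete').stages*.name
```

Because the ternary operator binds more loosely than `|`, each branch is composed into its own path before the final `| feed2Solr` is applied to whichever branch was selected.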

...

You can use the redirect feature to write the contents of the jobs received in the current Groovy pipeline to a file. Use the ">>" operator followed by the target file path.

Code Block

language	xml

   job | FetchUrl | ExtractText >> "out.xml" | Feed2Solr

In the previous example the redirect is executed before the "Feed2Solr" stage, so if that stage adds or modifies any job metadata, the change will not be reflected in the "out.xml" file.
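
One reason the redirect attaches to "ExtractText" rather than to the whole pipeline is Groovy operator precedence: `>>` binds more tightly than `|`. The sketch below only records how Groovy groups the expression. The `Traced` class and the lower-case variable names are hypothetical, and the sketch says nothing about how Aspire implements either operator.

```groovy
// Hypothetical recorder class for illustration only; not an Aspire class.
class Traced {
    String label
    // '|' is dispatched to a method named 'or'.
    Traced or(Traced other) {
        new Traced(label: '(' + label + ' | ' + other.label + ')')
    }
    // '>>' is dispatched to a method named 'rightShift'.
    Traced rightShift(String file) {
        new Traced(label: '(' + label + ' >> ' + file + ')')
    }
}

def job         = new Traced(label: 'job')
def fetchUrl    = new Traced(label: 'FetchUrl')
def extractText = new Traced(label: 'ExtractText')
def feed2Solr   = new Traced(label: 'Feed2Solr')

// Same shape as the pipeline expression in the example above.
def result = job | fetchUrl | extractText >> 'out.xml' | feed2Solr

// Prints: (((job | FetchUrl) | (ExtractText >> out.xml)) | Feed2Solr)
println result.label
```

The printed grouping shows `ExtractText >> out.xml` being formed first, which matches the behavior described above: the redirect belongs to the stage immediately to its left.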

...

A Closure Stage is a stage embedded in the Groovy pipeline that receives a Groovy closure to execute. For example:

Code Block

language	xml
linenumbers	true

 job | stage{it.doc.add("fetchUrl","http://www.searchtechnologies.com")} | FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile | Feed2Solr

You can also use a Closure Stage to configure other job flows:

Code Block

language	xml
linenumbers	true

   job | stage{
           it.doc.add("fetchUrl","http://www.searchtechnologies.com");
           it | FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain;
           println "End of Closure Stage"
        } | PrintToFile | Feed2Solr

...

Groovy control flow statements can be used to choose which pipeline to execute under any condition you want:

Code Block

language	xml
linenumbers	true

  job | FetchUrl | ExtractText;

  if (doc.type.text == "text/xml")
    job | XMLProcessor | Post2Solr >> "xmlFiles.xml";
  else if (doc.type.text == "text/html")
    job | HTTPProcessor | Post2Solr >> "htmlFiles.xml";
  else
    job | Post2Solr >> "otherFiles.xml";

...

You can loop through some stages as needed:

Code Block

language	xml
linenumbers	true

  for (i in 0..9) {
    job | stage {doc.add("stageNum",i)}
  }

...

You can also attach handlers to a list of stages with the exceptions() method:

Code Block

language	xml
linenumbers	true

  def myStagePath = FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile;
  job | myStagePath.exceptions([
        onComplete: Feed2Solr
      ]);

...

Nested exception handling is also available:

Code Block

language	xml	linenumbers	true

  def myStagePath = FetchUrl | ExtractText | Splitter | DateChooser | ExtractDomain | PrintToFile;
  job | myStagePath.exceptions([
        onComplete: Feed2Solr.exceptions([
                      onError: stage{it >> 'fetchUrlError.xml'},
                      onComplete: stage{it >> 'indexedJobs.xml'}
                    ])
      ]);

...

Groovy pipelines provide a way of controlling the flow of sub-jobs through stages. Using the subJobs() method of each stage, you can specify what to execute for any sub-jobs generated in that stage. The method receives either a single Groovy Closure or a Map from branch label (the label used when the sub-job was branched) to a Stage (or a List of stages):

Code Block

language	xml
linenumbers	true

   job | FetchUrl | XmlSubJobExtractor.subJobs([
                     onSubJob: stage{it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
                   ])

...

or just a single Closure that will be executed regardless of the branch labels of the sub-jobs:

Code Block

language	xml	linenumbers	true

   job | FetchUrl | XmlSubJobExtractor.subJobs(
                     {it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
                   )
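
The two calling forms above (a Map keyed by branch label, or a single Closure) can be pictured as two overloads of one method. The sketch below is a simplified model under that assumption and not Aspire's implementation. The `SubJobRouter` class, its `route` method, and the string "sub-jobs" are hypothetical.

```groovy
// Hypothetical model for illustration only; not an Aspire class.
class SubJobRouter {
    Map<String, Closure> byLabel = [:]
    Closure forAll

    // Map form: one handler per branch label.
    SubJobRouter subJobs(Map<String, Closure> handlers) {
        byLabel.putAll(handlers)
        this
    }

    // Closure form: one handler for every sub-job, whatever its label.
    SubJobRouter subJobs(Closure handler) {
        forAll = handler
        this
    }

    // Pick the handler registered for the label, else the catch-all handler.
    def route(String label, Object subJob) {
        Closure handler = byLabel[label] ?: forAll
        handler != null ? handler.call(subJob) : null
    }
}

def mapRouter = new SubJobRouter().subJobs([
    onSubJob: { 'processed ' + it }
])
def closureRouter = new SubJobRouter().subJobs({ 'processed ' + it })

println mapRouter.route('onSubJob', 'subjob-1')
println mapRouter.route('otherLabel', 'subjob-2')
println closureRouter.route('otherLabel', 'subjob-3')
```

In this model the Map form ignores sub-jobs whose label has no entry, while the Closure form handles every sub-job. Whether Aspire treats unmatched labels the same way is not stated in the text above.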

...

In the current design, when you create a sub-job extractor for use in Groovy pipelines, you need to configure it with a dummy branch that points to a placeholder pipeline manager. For example:

Code Block

language	xml
linenumbers	true

  <component name="XmlSubJobExtractor" subType="xmlSubJobExtractor" factoryName="aspire-xml-files">
    <branches><branch event="onSubJob" pipelineManager="DUMMY" /></branches>
  </component>

...

For example:

Code Block

language	xml
linenumbers	true

  <pipeline name="doc-process" default="true">
    <script maxThreadPools="10" maxThreadsPerPool="10" maxQueueSizePerPool="30"><![CDATA[
        job | FetchUrl | XmlSubJobExtractor.subJobs([
                     onSubJob: stage{it | FetchUrl | ExtractText | PostHttp >> "subjobs.xml"}
                   ])
    ]]></script>
  </pipeline>

...

You can create jobs inside a Groovy Pipeline by using the createJob method:

Code Block

language	xml	linenumbers	true

  contactsJob = createJob('<doc><url>'+doc.url.text()+'/contacts.html</url></doc>')
  contactsJob | FetchUrl | ExtractText

...

Example:

Code Block

language	xml	linenumbers	true

  dir {it | FetchUrl | ExtractText >> "files.xml"}                          // Only files inside the Aspire_Home directory
  dir ({it | FetchUrl | ExtractText >> "files+dir.xml"},"+d")               // Files and directories inside the Aspire_Home directory
  dir ({it | FetchUrl | ExtractText >> "files+dir.xml"},"+d+r")             // Files and directories recursively inside the Aspire_Home directory
  dir ("data",{it | FetchUrl | ExtractText >> "data_files.xml"})            // Only files inside the Aspire_Home/data directory
  dir ("data",{it | FetchUrl | ExtractText >> "data_files+dir.xml"},"+d")   // Files and directories inside the Aspire_Home/data directory

...

Configuration:

Code Block

language	xml	linenumbers	true

  <initialization name="Test Long Initializer">
    <check componentRef="/pipeline/LongInitializer"/>
    <check componentRef="/pipeline/Concat-Test"/>
  </initialization>

...

Configuration:

Code Block

language	xml
linenumbers	true

  <jobCount name="Count of Document Jobs" redThreshold="3" yellowThreshold="1"/>

...

Configuration:

Code Block

language	xml
linenumbers	true

  <timestamp name="Rebuild Dictionary Token Stats" history="5" redThreshold="10000" yellowThreshold="2000"/>

...

Configuration: (defaults to 15-minute intervals over 24 hours)

Code Block

language	xml	linenumbers	true

  <latency name="Process Single Document" jobsToAverage="5" isSticky="true" 
          redThreshold="15000" yellowThreshold="5000" />

...

Configuration: (specify the interval and history length)

Code Block

language	xml	linenumbers	true

  <latency name="Process Single Document" jobsToAverage="5" isSticky="true" 
          redThreshold="15000" yellowThreshold="5000"
          interval="3600000" history="48" />
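
As a quick check of what the interval and history settings above amount to, the arithmetic can be worked through in a few lines. This sketch assumes that "interval" is expressed in milliseconds and that "history" is the number of intervals retained. Neither unit is stated explicitly above, so treat both as assumptions.

```groovy
// Assumption: 'interval' is in milliseconds and 'history' counts retained intervals.
long intervalMs = 3600000L   // interval="3600000"
int history = 48             // history="48"

long minutesPerInterval = intervalMs.intdiv(60000)
long windowHours = (intervalMs * history).intdiv(3600000)

// Stated default: 15-minute intervals over 24 hours.
int defaultIntervals = (24 * 60).intdiv(15)

println "interval: ${minutesPerInterval} min, window: ${windowHours} h"
println "default history: ${defaultIntervals} intervals"
```

Under those assumptions the example configuration keeps 48 one-hour intervals, covering 48 hours, while the stated default corresponds to 96 intervals of 15 minutes.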

...
