The Pipeline Manager is responsible for managing pipelines and for processing jobs (which contain documents to process) through them. Pipeline managers are essentially passive: they wait for jobs to arrive from feeders. Each arriving job is put on an internal job queue and is then picked up by an execution thread from a thread pool; that thread processes the job through the pipeline to completion.
Pipeline managers process jobs. Jobs can come from either of two sources:
Every pipeline manager maintains a queue and a thread pool. New jobs received by the pipeline manager are first placed on the queue. When a thread becomes available, it takes the next job from the queue and processes that job through the specified pipeline.
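The queue-and-pool model described above can be illustrated with a minimal Python sketch. This is not the product's implementation: the class `SketchPipelineManager`, its methods, the `max_threads` and `queue_size` arguments, and the stage functions are all hypothetical names chosen for illustration.

```python
import queue
import threading


class SketchPipelineManager:
    """Hypothetical stand-in for a pipeline manager: one queue, one thread pool."""

    def __init__(self, pipelines, max_threads=2, queue_size=10):
        self.pipelines = pipelines  # pipeline name -> list of stage functions
        self.jobs = queue.Queue(maxsize=queue_size)
        self.workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(max_threads)
        ]
        for worker in self.workers:
            worker.start()

    def enqueue(self, pipeline_name, job):
        # New jobs are always placed on the queue first.
        self.jobs.put((pipeline_name, job))

    def _run(self):
        while True:
            # Block until a job is available, then carry it through every stage.
            pipeline_name, job = self.jobs.get()
            try:
                for stage in self.pipelines[pipeline_name]:
                    stage(job)
            finally:
                self.jobs.task_done()

    def wait_until_idle(self):
        # Returns once every enqueued job has been fully processed.
        self.jobs.join()


def extract_text(job):
    job["text"] = job["raw"].strip()


def count_words(job):
    job["words"] = len(job["text"].split())


manager = SketchPipelineManager({"index": [extract_text, count_words]})
docs = [{"raw": "  hello pipeline world "}, {"raw": "one two"}]
for doc in docs:
    manager.enqueue("index", doc)
manager.wait_until_idle()
```

In this sketch the worker that dequeues a job runs every stage of the named pipeline itself, which mirrors the statement that a thread processes its job through the specified pipeline.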
Note that the thread carries the job all the way through to completion (see below). This may include branching the job to other pipelines (see the <branches> tag below).
See below for a list of parameters that control job queueing and thread pools (the size of the queue, the maximum number of threads, etc.). Also note that idle threads time out, so that system resource usage is minimized when not all threads are required.
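As a rough illustration of an idle-thread timeout, the following Python sketch starts worker threads on demand (up to a maximum) and lets each one exit after it has waited too long for work. The class `IdleTimeoutPool` and the parameters `max_threads` and `idle_timeout` are assumptions made for this example; they are not the product's parameter names, and the real thread management may differ.

```python
import queue
import threading
import time


class IdleTimeoutPool:
    """Hypothetical pool: threads start on demand and exit when idle too long."""

    def __init__(self, handler, max_threads=4, idle_timeout=0.2):
        self.handler = handler
        self.max_threads = max_threads
        self.idle_timeout = idle_timeout
        self.jobs = queue.Queue()
        self.lock = threading.Lock()
        self.active = 0  # number of live worker threads

    def submit(self, job):
        self.jobs.put(job)
        with self.lock:
            # Simplification: start a new thread whenever we are below the maximum.
            if self.active < self.max_threads:
                self.active += 1
                threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            try:
                job = self.jobs.get(timeout=self.idle_timeout)
            except queue.Empty:
                with self.lock:
                    if not self.jobs.empty():
                        continue  # a job arrived while timing out; keep working
                    self.active -= 1  # idle for too long: release this thread
                    return
            try:
                self.handler(job)
            finally:
                self.jobs.task_done()


results = []
pool = IdleTimeoutPool(results.append, max_threads=2, idle_timeout=0.1)
for n in range(5):
    pool.submit(n)
pool.jobs.join()   # wait for all five jobs to finish
time.sleep(0.5)    # long enough for the idle workers to time out
```

After the final sleep no worker threads remain, so an idle pool holds no thread resources until the next job is submitted.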
If you need multiple queues and multiple thread pools, then just create multiple pipeline managers. This is a useful technique for managing thread pools to ensure that one pool does not get starved.
In general, a pipeline manager should only process a certain type of job. If you have multiple types of jobs, it is best to create multiple pipeline managers. For example, parent jobs and sub-jobs are best handled by multiple pipeline managers to ensure that parent job processing is not starved for threads while the sub-jobs are processing.
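The starvation concern can be sketched in Python with two separate thread pools, one standing in for a parent-job pipeline manager and one for a sub-job pipeline manager. This assumes, for illustration only, that a parent job blocks until its sub-jobs finish; the pool names and job functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Two pools stand in for two pipeline managers (hypothetical sizes).
parent_pool = ThreadPoolExecutor(max_workers=2)
sub_job_pool = ThreadPoolExecutor(max_workers=2)


def process_sub_job(record):
    return record.upper()


def process_parent_job(records):
    # The parent thread blocks here until all of its sub-jobs finish.
    # Because sub-jobs run on a different pool, blocked parents cannot
    # occupy the threads that the sub-jobs need.
    futures = [sub_job_pool.submit(process_sub_job, r) for r in records]
    return [f.result() for f in futures]


parents = [
    parent_pool.submit(process_parent_job, batch)
    for batch in (["a", "b"], ["c"], ["d", "e", "f"])
]
results = [p.result() for p in parents]

parent_pool.shutdown()
sub_job_pool.shutdown()
```

If both kinds of job shared a single two-thread pool in this sketch, two blocked parent jobs could hold both threads while their sub-jobs waited in the queue indefinitely. Separate pools avoid that.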
Branching from one pipeline manager to another does not cause the job to be re-enqueued onto the remote pipeline manager's job queue. Instead, the original thread is used to continue processing of the job through the remote pipeline manager's pipeline.
This means that once a thread has accepted a job to process, that thread will process the job all the way through to completion - even if this means calling the process() method of other pipeline managers to process the job.
The same is true when jobs are routed to other Pipeline Managers using routing tables.
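The same-thread behavior described above can be sketched in Python as a direct method call from one manager to another, with no queue in between. The class `SketchManager`, its `branch_to` attribute, and the `trace` field are hypothetical and exist only to make the thread identity visible.

```python
import threading


class SketchManager:
    """Hypothetical manager whose process() may branch to another manager."""

    def __init__(self, name, stages, branch_to=None):
        self.name = name
        self.stages = stages
        self.branch_to = branch_to  # another SketchManager, or None

    def process(self, job):
        for stage in self.stages:
            stage(job)
        # Record which thread ran this manager's pipeline.
        job["trace"].append((self.name, threading.get_ident()))
        if self.branch_to is not None:
            # Direct call: the job is not re-enqueued on the remote manager,
            # so the current thread keeps carrying it.
            self.branch_to.process(job)


remote = SketchManager("remote", [lambda job: job.update(indexed=True)])
local = SketchManager("local", [lambda job: job.update(parsed=True)], branch_to=remote)

job = {"trace": []}
worker = threading.Thread(target=local.process, args=(job,))
worker.start()
worker.join()
```

Both entries in `job["trace"]` carry the same thread identifier, which is the point of the design: the remote manager's thread pool is never involved when a job branches into its pipeline.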
A pipeline manager is a sub-class of component manager. This means that component manager configuration (such as installing bundles and creating components) is also available as part of the pipeline manager configuration.
A job is "completed" in either of two situations:
If either of these two situations occurs, then the job is "completed". The pipeline manager will do the following for completed jobs:
Pipeline managers are also responsible for performing system health checks. Health checks monitor the overall health of the system for things like:
Once configured, the health of the entire server, as well as detailed information on each health check, is available through the admin RESTful interface.
[...] successful jobs, since these are the only ones which will provide reliable measurement data.