To configure reprocessing, set the following properties under reprocessQueue in the StageR configuration file, and define the number of reprocessing workers under workers. batchSize is the number of documents to process as a batch, querySize is the number of records a reprocess worker retrieves at a time, and timeout is how long a worker waits when there are no items in the queue:
{
  ...,
  reprocessQueue: {
    batchSize: 20,
    querySize: 40,
    timeout: 5000
  },
  workers: {
    restapi: 1,
    reprocess: 4,
    replication: 1
  },
  ...
}
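To illustrate how the three reprocessQueue settings relate to each other, here is a minimal Python sketch of a worker loop. It is not StageR's implementation: the function name run_worker, the in-memory deque standing in for the queue, and the max_idle_polls and sleep parameters are all hypothetical, and the sketch assumes timeout is expressed in milliseconds, which the configuration above does not state.

```python
import time
from collections import deque

# Values mirror the reprocessQueue settings shown above.
REPROCESS_QUEUE = {"batchSize": 20, "querySize": 40, "timeout": 5000}


def run_worker(queue, process_batch, settings, max_idle_polls=1, sleep=time.sleep):
    """Drain `queue` the way a reprocess worker might (illustrative only)."""
    idle_polls = 0
    while idle_polls < max_idle_polls:
        # Retrieve at most querySize records in one query.
        count = min(settings["querySize"], len(queue))
        fetched = [queue.popleft() for _ in range(count)]
        if not fetched:
            # Nothing queued: wait `timeout` before polling again.
            # Assumption: timeout is in milliseconds.
            sleep(settings["timeout"] / 1000)
            idle_polls += 1
            continue
        idle_polls = 0
        # Hand the fetched records to processing in chunks of batchSize.
        for start in range(0, len(fetched), settings["batchSize"]):
            process_batch(fetched[start:start + settings["batchSize"]])
```

Under these assumptions, 50 queued records would be retrieved in two queries (40, then 10) and processed as batches of 20, 20, and 10 before the worker finds the queue empty and waits.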
Specify the content processing modules for the scope to be reprocessed using the admin/setContentProcessingModules API call:
POST admin/setContentProcessingModules/STORAGE_UNIT
{
  "modules" : {
    "connector" : [
      {
        "settings" : {
          "solr-collection" : "testcollection"
        },
        "module" : "SolrPublisher"
      }
    ]
  },
  "settings" : {
    "solr-hosts" : "localhost:8983"
  }
}
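Because a hand-written body is easy to misquote, one option is to build it from a dictionary and let a JSON library serialize it. The sketch below does that in Python; the helper name build_modules_payload is hypothetical, and the structure simply mirrors the request body shown above.

```python
import json


def build_modules_payload(solr_collection, solr_hosts):
    """Return the body for admin/setContentProcessingModules as a dict."""
    return {
        "modules": {
            "connector": [
                {
                    "settings": {"solr-collection": solr_collection},
                    "module": "SolrPublisher",
                }
            ]
        },
        "settings": {"solr-hosts": solr_hosts},
    }


# Serializing with json.dumps guarantees every key and value is quoted correctly.
body = json.dumps(build_modules_payload("testcollection", "localhost:8983"), indent=2)
```

The resulting string can be sent as the POST body with any HTTP client.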
Enable the reprocessing queue at the storage unit level. This will register the storage unit to be scanned by the reprocess workers.
PUT admin/enableReprocessingQueue/STORAGE_UNIT/true
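As a rough sketch, the call above could be issued from Python's standard library as follows. The base URL (host and port) is an assumption about the deployment, the function name is hypothetical, and passing false to turn the queue off again is assumed by symmetry rather than documented here.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the StageR REST API is reachable at this address; adjust as needed.
BASE_URL = "http://localhost:3000"


def enable_reprocessing_request(storage_unit, enabled=True, base_url=BASE_URL):
    """Build (but do not send) the PUT request for a storage unit."""
    # Assumption: "false" disables the queue; only "true" appears in the text above.
    flag = "true" if enabled else "false"
    unit = quote(storage_unit, safe="")  # escape characters unsafe in a URL path
    url = f"{base_url}/admin/enableReprocessingQueue/{unit}/{flag}"
    return Request(url, method="PUT")
```

Passing the returned object to urllib.request.urlopen would send the request.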
Automatic Updates
When foreign keys are defined for a content record and the ForeignKeyJoin processing module is defined for a storage unit, StageR creates a foreign key lookup table. This table lets the foreign item know which primary items reference it, so any update to the foreign item triggers an automatic update of those primary items (their keys are sent to be reprocessed).
As long as the reprocessing queue for the primary item's storage unit is enabled and there are reprocessing workers configured, automatic updates will be triggered.
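The mechanism can be pictured with a small in-memory model. This is only a conceptual sketch, not how StageR stores the lookup table: the class ForeignKeyLookup, the function on_foreign_item_updated, and the example keys are hypothetical, and the two conditions are modeled as a simple guard.

```python
from collections import defaultdict, deque


class ForeignKeyLookup:
    """Maps each foreign key to the primary keys that reference it."""

    def __init__(self):
        self._referrers = defaultdict(set)

    def register(self, primary_key, foreign_keys):
        # Record that `primary_key` references every key in `foreign_keys`.
        for foreign_key in foreign_keys:
            self._referrers[foreign_key].add(primary_key)

    def referrers(self, foreign_key):
        # Sorted for a deterministic order; unknown keys yield an empty list.
        return sorted(self._referrers.get(foreign_key, ()))


def on_foreign_item_updated(lookup, foreign_key, reprocess_queue,
                            queue_enabled, worker_count):
    """Send the referencing primary keys to reprocess, if the conditions hold."""
    # Simplification: without an enabled queue and at least one worker,
    # no automatic update is triggered.
    if not queue_enabled or worker_count < 1:
        return []
    keys = lookup.referrers(foreign_key)
    reprocess_queue.extend(keys)
    return keys
```

For example, if order-1 and order-2 both reference customer-9, an update to customer-9 places both order keys on the reprocessing queue, while the same update with the queue disabled places nothing.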