The worker nodes can be configured by setting environment variables, by setting JVM properties, or through the settings JSON that is uploaded to the NoSQL database.
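As a rough illustration, the sketch below shows one way such layered configuration could be resolved, with environment variables taking precedence over JVM properties and the settings JSON acting as the fallback. The precedence order, the key names, and the `WorkerConfig` class are assumptions for illustration, not the actual implementation.

```java
// Minimal sketch of layered worker configuration (assumed precedence):
// environment variable -> JVM system property -> settings JSON from the NoSQL store.
import java.util.Map;

public final class WorkerConfig {

    private final Map<String, String> settingsJson; // parsed settings document

    public WorkerConfig(Map<String, String> settingsJson) {
        this.settingsJson = settingsJson;
    }

    /** Resolve a key: env var first, then -D JVM property, then settings JSON. */
    public String get(String key, String defaultValue) {
        // Assumed convention: worker.threads -> WORKER_THREADS for env vars.
        String env = System.getenv(key.toUpperCase().replace('.', '_'));
        if (env != null) return env;

        String jvm = System.getProperty(key);
        if (jvm != null) return jvm;

        return settingsJson.getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        // Settings document as it might be fetched from the NoSQL database.
        WorkerConfig cfg = new WorkerConfig(Map.of("worker.threads", "8"));
        System.out.println(cfg.get("worker.threads", "4")); // "8" unless overridden
    }
}
```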
Seeds can share connectors and workflows: on a worker node, a single instance of a given connector or workflow serves multiple seeds. These components are loaded on demand as soon as a worker node needs to process an item from a given seed, and are kept in memory while in use.
If the assigned connector or workflow is not loaded when items arrive for processing, the worker automatically loads it before processing any item. Loaded components are tracked in a list and removed after a certain idle period. The worker detects configuration changes using a checksum: whenever a new crawl for a seed is detected, it verifies that the currently loaded configuration is still valid. Configuration changes while a seed is running are not allowed by the API.
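The sketch below shows the general shape such a component cache could take, with lazy loading, checksum revalidation when a new crawl starts, and eviction after an idle timeout. The `ComponentCache` class and its method names are hypothetical.

```java
// Sketch of an on-demand component cache: components are loaded lazily,
// revalidated against a configuration checksum, and evicted when idle.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public final class ComponentCache {

    static final class Entry {
        final Object component;       // loaded connector or workflow instance
        final String configChecksum;  // checksum of the config it was built from
        volatile long lastUsedMillis;

        Entry(Object component, String configChecksum) {
            this.component = component;
            this.configChecksum = configChecksum;
            this.lastUsedMillis = System.currentTimeMillis();
        }
    }

    private final Map<String, Entry> loaded = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    ComponentCache(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Return the component for this id, reloading it if the config checksum changed. */
    Object acquire(String componentId, String currentChecksum, Supplier<Object> loader) {
        Entry e = loaded.compute(componentId, (id, existing) -> {
            // Reload when nothing is cached or the stored checksum is stale.
            if (existing == null || !existing.configChecksum.equals(currentChecksum)) {
                return new Entry(loader.get(), currentChecksum);
            }
            existing.lastUsedMillis = System.currentTimeMillis();
            return existing;
        });
        return e.component;
    }

    /** Invoked periodically to drop components idle past the timeout. */
    void evictIdle() {
        long now = System.currentTimeMillis();
        loaded.entrySet().removeIf(en -> now - en.getValue().lastUsedMillis > idleTimeoutMillis);
    }
}
```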
Batching is now done at the publisher level inside the workflow. This allows batches to contain items from different seeds that share a destination. When a batch fails, the failure is reported back to every job it contained and, depending on the connector configuration, each job is marked appropriately (marked as error, marked as completed, or deleted from the queue).
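A minimal sketch of this publisher-side behavior is shown below: jobs from any seed accumulate into one batch bound for a shared destination, and a batch failure is propagated to every job according to its configured policy. The `PublisherBatch` type, the `FailurePolicy` enum, and the console output are illustrative assumptions.

```java
// Sketch of publisher-level batching across seeds, with per-job failure handling.
import java.util.ArrayList;
import java.util.List;

public final class PublisherBatch {

    enum FailurePolicy { MARK_ERROR, MARK_COMPLETED, DELETE_FROM_QUEUE }

    record Job(String seedId, String itemId, FailurePolicy policy) {}

    private final List<Job> jobs = new ArrayList<>();
    private final int maxSize;

    PublisherBatch(int maxSize) { this.maxSize = maxSize; }

    /** Jobs from any seed may join, as long as they target the same destination. */
    void add(Job job) {
        jobs.add(job);
        if (jobs.size() >= maxSize) flush();
    }

    void flush() {
        try {
            publish(jobs);                       // single call to the shared destination
        } catch (Exception failure) {
            for (Job job : jobs) {               // report the failure to every job
                switch (job.policy()) {
                    case MARK_ERROR -> System.out.println(job.itemId() + ": marked as error");
                    case MARK_COMPLETED -> System.out.println(job.itemId() + ": marked completed");
                    case DELETE_FROM_QUEUE -> System.out.println(job.itemId() + ": deleted from queue");
                }
            }
        } finally {
            jobs.clear();
        }
    }

    private void publish(List<Job> batch) throws Exception {
        // Placeholder for the real destination call (e.g. a bulk index request).
    }
}
```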