Some Aspire components, such as the NoSQL provider, must be configured before settings.json is read. These components are configured using environment variables or JVM parameters.

Some components that are configured in the settings file, such as the Worker and Manager nodes, can alternatively be configured with environment variables or JVM parameters.

Format

All properties can be passed either as an environment variable or as a JVM parameter. The "." and "_" characters in property names are interchangeable.

# Setting a property as an environment variable (Windows command prompt)
SET aspire_noSql_elastic_server=http://localhost:9200
# Passing a property as a JVM parameter
java -Daspire.noSql.elastic.server=http://localhost:9200

Warning
In the case of environment variables, the property name should always use "_" instead of ".".
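
The name mapping described above can be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name to_env_name is hypothetical and is not part of Aspire.

```shell
# Hypothetical helper (not part of Aspire): convert a dotted property
# name into the underscore form used for environment variables.
to_env_name() {
  printf '%s\n' "$1" | tr '.' '_'
}

# Export the property under its environment variable name.
name="$(to_env_name aspire.noSql.elastic.server)"
export "$name=http://localhost:9200"
echo "$name"
```

A helper like this may be convenient when the same property list has to be maintained both as JVM parameters and as environment variables.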

Memory settings

In Linux-based systems, the memory usage can be changed with the following environment variables:

Parameter	Required	Default	Description
aspire.max.heap.memory	no	2g	Maximum Heap the JVM will request and use from the host system
aspire.max.metaspace.size	no	256m	Maximum metaspace the JVM will request and use from the host system.
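
As an illustration, the two variables could be set in the shell that starts Aspire. This is only a sketch: the values 4g and 512m are arbitrary examples, and the underscore form of the names follows the rule for environment variables given in the Format section.

```shell
# Example values only; choose sizes that fit the host.
export aspire_max_heap_memory=4g
export aspire_max_metaspace_size=512m

echo "heap=$aspire_max_heap_memory metaspace=$aspire_max_metaspace_size"
```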

In Windows-based systems, if Aspire is started with the aspire.bat file, the memory usage should be changed in aspire.bat itself. Otherwise, if it is running as a Windows Service:

Browse to
- HKEY_LOCAL_MACHINE / SOFTWARE / WOW6432Node / Apache Software / Procrun 2.0 / AspireService / Parameters / Java
Modify JvmMs and JvmMx to what you need (in bytes), and then restart the service.

Elasticsearch NoSQL Provider

Below is the list of properties to configure the Elasticsearch NoSQL Provider:

Parameter	Required	Default	New & Available from	Description
aspire.noSql.elastic.server	yes	http://localhost:9200		The Elasticsearch server URL to use. Multiple URLs can be given, separated by ","
aspire.noSql.elastic.authentication.basic	no	false		Enables Elasticsearch basic authentication
aspire.noSql.elastic.user	no	null		The Elasticsearch user to use for basic authentication
aspire.noSql.elastic.password	no	null		The Elasticsearch password to use for basic authentication
aspire.noSql.elastic.authentication.aws	no	false		Enables Elasticsearch AWS authentication
aspire.noSql.elastic.authentication.useCredentialsProviderChain	no	false		Option to use the AWS credentials provider chain to get the credentials
aspire.noSql.elastic.aws.assumeRole	no	false	5.0.3	If a role must be assumed to access Elasticsearch. Must be true/false
aspire.noSql.elastic.aws.roleArn	no	null	5.0.3	The Role ARN to be assumed.
aspire.noSql.elastic.aws.access.key	no	null		AWS access key for authentication
aspire.noSql.elastic.aws.secret.key	no	null		AWS secret key for authentication
aspire.noSql.elastic.aws.region	no	null		The AWS region
aspire.noSql.elastic.keepSearchContextAlive	no	5m		The amount of time to keep the search context of Elasticsearch scroll requests alive, using "m" as the unit
aspire.noSql.elastic.maxRequestSize	no	10485760B		The maximum size for a bulk request. The value can be specified in B, K, M and G units
aspire.noSql.elastic.maxConnections	no	100		The maximum number of connections to keep open.
aspire.noSql.elastic.maxConnectionsPerRoute	no	10		The maximum number of connections per server
aspire.noSql.elastic.readTimeout	no	30000		The socket timeout.
aspire.noSql.elastic.connectionTimeout	no	15000		The connection timeout.
aspire.noSql.elastic.maxRetries	no	3		The number of times to retry each request
aspire.noSql.elastic.retriesWaitTime	no	5000		The time to wait in ms between retries
aspire.noSql.elastic.useThrottling	no	false		Enables throttling of requests to Elasticsearch
aspire.noSql.elastic.throttlingRate	no	5000		The throttling rate in ms
aspire.noSql.elastic.throttlingConnectionRate	no	500		The maximum number of requests allowed in the period specified by the throttlingRate
aspire.noSql.elastic.waitTime429	no	3000		The time to wait in ms before retrying after a 429 error
aspire.noSql.elastic.bulk	no	true		Enables using bulk for requests
aspire.noSql.elastic.bulkSize	no	500		The maximum number of documents to include in a bulk request
aspire.noSql.elastic.bulkInactivityTimeout	no	5		The inactivity time in s before flushing a bulk request
aspire.noSql.elastic.bulkRegularTimeout	no	30		The maximum amount of time in s for a bulk request to be kept in memory before flushing
aspire.noSql.elastic.debugFile	no	null		The path to the debug file. Requests to Elasticsearch are logged in this file
aspire.noSql.elastic.mappingFile	no	null		The file path (including file name) that includes the mapping for the indexes used by Aspire. By default, the mapping included in the provider is used
aspire.noSql.elastic.index.prefix	no	aspire		The prefix to use for the indexes created by the provider
aspire.noSql.elastic.debug	no	false		Enables debug logging information
aspire.noSql.elastic.usePooling	no	true		Enables HTTP connection pooling
aspire.nosql.elastic.timeSeriesType	no	index	5.0.3, 5.3 (OpenSearch)	Enables rollover for the audit, error and log time series indexes. "index" (default): no rollover. "dataStream": rollover based on a data stream. "dataStreamOpensearch": rollover based on an OpenSearch data stream.
aspire.nosql.elastic.ilmPolicyFile	no	default provider policy file	5.0.3	The file path (including file name) of the ILM policy file for the indexes used by Aspire. Used only for the time series option "dataStream*".
aspire.nosql.elastic.indexTemplateFile	no	default provider template file	5.0.3	The file path (including file name) of the template file for the indexes used by Aspire. Used only for the time series option "dataStream*".
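
As a worked example of the two throttling properties, the sketch below computes the average request ceiling they imply. It uses the default values from the table, applies only when aspire.noSql.elastic.useThrottling is enabled, and assumes the limit is enforced once per throttlingRate period as the descriptions state.

```shell
# Default values taken from the property table.
throttling_rate_ms=5000          # aspire.noSql.elastic.throttlingRate
throttling_connection_rate=500   # aspire.noSql.elastic.throttlingConnectionRate

# Average ceiling in requests per second (integer arithmetic).
max_rps=$(( throttling_connection_rate * 1000 / throttling_rate_ms ))
echo "$max_rps"
```

With the defaults this works out to 100 requests per second on average.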

Index Sharding and Replicas settings

The sharding and replica settings of each index can be changed by using a custom mapping file and configuring the aspire.noSql.elastic.mappingFile property.

Since 5.0.1, when using the default mapping settings, you can use the following prefix parameters:

aspire.noSql.elastic.index.shards.[indexName]
- Should hold an integer specifying the number of shards for the given index
aspire.noSql.elastic.index.replicas.[indexName]
- Should hold an integer specifying the number of replicas for the given index

If none of these properties are specified, the default replicas and shard number of the cluster will be used (usually 1 shard and 1 replica).

The different index names available are:

audit
base
errors
hierarchy
log
map
queue
retryLog
set
settings
snapshot
updateQueue
identityCache
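
The prefix parameters can be generated for several indexes at once. The following is a minimal sketch, assuming a POSIX shell; the function name index_opts and the shard and replica counts are illustrative only, and the resulting options would still have to be added to the java command that starts Aspire.

```shell
# Hypothetical helper: print -D options that set the same shard and
# replica counts for each index name passed as an argument.
index_opts() {
  shards="$1"; replicas="$2"; shift 2
  for index in "$@"; do
    printf ' -Daspire.noSql.elastic.index.shards.%s=%s' "$index" "$shards"
    printf ' -Daspire.noSql.elastic.index.replicas.%s=%s' "$index" "$replicas"
  done
}

# Example: 3 shards and 2 replicas for the audit and log indexes.
JVM_OPTS="$(index_opts 3 2 audit log)"
echo "$JVM_OPTS"
```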

SSL Certificates

Below is the list of properties to configure the SSL certificates:

Parameter	Required	Default	Description
aspire.ssl.trustAll	no	false	Configure if all certificates should be trusted
aspire.ssl.overwriteFactory	no	false	Configure if the created key managers should overwrite the Java connection factory
aspire.ssl.truststore.file	no	null	The path of the trust store file
aspire.ssl.truststore.password	no	null	The trust store file password
aspire.ssl.truststore.type	no	jks	The file format of the trust store file
aspire.ssl.keystore.file	no	null	The path of the key store file
aspire.ssl.keystore.password	no	null	The key store file password
aspire.ssl.keystore.type	no	jks	The file format of the key store file
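
For example, a custom trust store could be configured through environment variables as sketched below. The file path and the TRUSTSTORE_PASSWORD variable are placeholders chosen for this example, not Aspire defaults.

```shell
# Placeholder path and password source; adjust to the installation.
export aspire_ssl_truststore_file=/opt/aspire/certs/truststore.jks
export aspire_ssl_truststore_password="${TRUSTSTORE_PASSWORD:-changeit}"
export aspire_ssl_truststore_type=jks

echo "trust store: $aspire_ssl_truststore_file ($aspire_ssl_truststore_type)"
```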

Security and encryption

Below is the list of properties related to Aspire security and encryption:

Parameter	Required	Default	Description
aspire.ldap.bind.dn.password	no	null	The password of the User DN. Not required if the authentication is anonymous
aspire.security.api.auditing	no	true	Audits the Aspire API calls. Available from Aspire 5.1 onward.

Encryption

When a password, secret or token in a configuration must be persisted, Aspire encrypts it and stores only the encrypted value. The default encryption mechanism is AES 256 with a local key. AWS KMS encryption can be used instead as of version 5.0.3.

Standard AES 256 Encryption with local key

Parameter	Required	Default	Description
aspire.encryption.key.file	no	null	(Optional) Path (including file name) where the encryption key is located. If not provided, a default in-memory key will be used; for production installations it must always be provided. This can also be passed as a JVM parameter or as the environment variable aspire_encryption_key_file.

This should be a 32-byte file; if it is longer, the first 32 bytes will be used as the encryption key.

Grant read access to the Aspire user only (chmod 400 <file>).

This file could be generated randomly:

$ head -c 32 /dev/urandom > encryption.key
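
The steps above can be combined into one short script. This is a sketch that assumes a Linux shell with head and /dev/urandom available; the file name encryption.key and its location in the current directory are only examples.

```shell
# Generate a random 32-byte key and restrict it to the owner.
key_file=encryption.key
head -c 32 /dev/urandom > "$key_file"
chmod 400 "$key_file"

# Print the size in bytes, then point Aspire at the file.
wc -c < "$key_file"
export aspire_encryption_key_file="$PWD/$key_file"
```

Because the file is made read-only, running the script a second time in the same directory will fail rather than overwrite an existing key.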

AWS KMS Encryption

Available since Aspire 5.0.3. Uses AWS Key Management Service (KMS) to encrypt sensitive data, with a key held in KMS used to encrypt and decrypt. See more details about this encryption provider at Aspire KMS encryption.

Parameter	Required	Default	Description
aspire.encryption.kms.roleARN	no	null	(Optional) If the KMS service must be accessed through the assumption of an IAM role, specify the role ARN.
aspire.encryption.kms.keyARN	yes	N/A	The KMS key ARN. See Aspire KMS encryption for more information about creating a KMS key for Aspire.
aspire.encryption.kms.region	yes	N/A	The AWS region on which the KMS service will be used
aspire.encryption.kms.accessKey	no	null	(Optional) Specify the access key if static credentials must be used. If this is not specified, the Default Credential Provider Chain will be used.
aspire.encryption.kms.secretKey	no	null	(Optional) Specify the secret key if static credentials must be used. If this is not specified, the Default Credential Provider Chain will be used.

Worker Node

These properties will be used by all worker nodes in the cluster.

Parameter	Required	Default	Description
aspire.node.worker.maxMemQueueSize	yes	4000	The maximum number of items to keep in the in-memory queue
aspire.node.worker.queueSizeThreshold	yes	0.5	The capacity threshold of the in-memory queue before requesting more items from the managers
aspire.node.worker.cleanUpWaitTime	yes	300000	The wait time in ms for the thread that checks the connectors' clean-up threshold
aspire.node.worker.cleanUpThreshold	yes	3600000	The time in ms for a connector to be idle before being removed from memory
aspire.node.worker.maxEnqueueRetries	yes	5	The number of retries to enqueue an item into the framework pipeline
aspire.node.worker.workflow.appCleanUpWaitTime	yes	60000	The wait time in ms for the thread that checks the workflow applications' clean-up threshold
aspire.node.worker.workflow.appCleanUpThreshold	yes	3600000	The time in ms for a workflow application to be idle before being removed from memory
aspire.node.worker.tags	no		The tags of the worker node. These tags will determine which items this node can process. It should be a comma-separated list of tags.
aspire.node.worker.entryProcessorBaseSleep	yes	10	The base sleep time in ms for the thread in charge of queuing received items into the connector framework pipelines
aspire.node.worker.entryProcessorMaxSleep	yes	2000	The maximum sleep in ms for the thread in charge of queuing received items into the connector framework pipelines
aspire.node.worker.entryProcessorMaxIterations	yes	5	The number of iterations without queuing items before increasing the sleep time
aspire.node.worker.entryProcessorMultiplier	yes	1.25	The multiplier used to increase the sleep time after the specified iterations without queuing items
aspire.node.worker.batchLoaderBaseSleep	yes	10	The base sleep time in ms for the thread in charge of requesting batches from the manager nodes
aspire.node.worker.batchLoaderMaxSleep	yes	2000	The maximum sleep in ms for the thread in charge of requesting batches from the manager nodes
aspire.node.worker.batchLoaderMaxIterations	yes	5	The number of iterations without receiving batches from the manager nodes before increasing the sleep time
aspire.node.worker.batchLoaderMultiplier	yes	1.25	The multiplier used to increase the sleep time after the specified iterations without receiving batches from the manager nodes
aspire.node.worker.connectionTimeout	yes	20000	The connection timeout for requests to other Aspire nodes
aspire.node.worker.socketTimeout	yes	20000	The socket timeout for requests to other Aspire nodes
aspire.node.worker.maxRetries	yes	3	The number of retries for requests to other Aspire nodes
aspire.node.worker.proxyHost	no	null	The proxy host to use for requests to other Aspire nodes
aspire.node.worker.proxyPort	no	0	The proxy port to use for requests to other Aspire nodes. Must be provided if the proxyHost is configured
aspire.node.worker.proxyUser	no	null	The proxy user to use for requests to other Aspire nodes
aspire.node.worker.proxyPassword	no	null	The proxy password to use for requests to other Aspire nodes.
aspire.node.worker.pingFrequency	yes	15000	The frequency for the node to ping Elasticsearch. The pings are used to determine if a node is alive and working properly
aspire.node.worker.nodeFailureTimeout	yes	30000	The ping timeout used to determine if a node is not working. The node will be marked as failed in this case and will eventually shut itself down
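
To get a feel for how the BaseSleep, MaxSleep and Multiplier properties interact, the sketch below prints a capped multiplicative backoff schedule. The exact algorithm is an assumption made for illustration and is not taken from the Aspire source; in particular it ignores the MaxIterations setting and simply multiplies on every step. The function name backoff_schedule is hypothetical.

```shell
# Sketch of a capped multiplicative backoff (assumed behaviour):
# print successive sleep times in ms, truncated to whole numbers,
# starting at the base value and ending at the maximum.
backoff_schedule() {
  awk -v base="$1" -v max="$2" -v mult="$3" 'BEGIN {
    delay = base
    while (delay < max) {
      printf "%d\n", delay
      delay = delay * mult
    }
    printf "%d\n", max
  }'
}

# Defaults of the entryProcessor* and batchLoader* properties.
backoff_schedule 10 2000 1.25
```

Under this assumption the sleep time grows slowly at first and is clamped at the MaxSleep value, so a lower multiplier keeps an idle thread more responsive at the cost of more frequent polling.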

Manager Node

These properties will be used by all manager nodes in the cluster.

Parameter	Required	Default	Description
aspire.node.manager.scanBatchCreatorBaseSleep	yes	30	The base sleep time in ms for the thread in charge of creating batches from the scan queue
aspire.node.manager.scanBatchCreatorMaxSleep	yes	2000	The maximum sleep in ms for the thread in charge of creating batches from the scan queue
aspire.node.manager.scanBatchCreatorMaxIterations	yes	10	The number of iterations without creating new scan batches before increasing the sleep time
aspire.node.manager.scanBatchCreatorMultiplier	yes	1.25	The multiplier used to increase the sleep time after the specified iterations without creating new scan batches
aspire.node.manager.processBatchCreatorBaseSleep	yes	30	The base sleep time in ms for the thread in charge of creating batches from the process queue
aspire.node.manager.processBatchCreatorMaxSleep	yes	2000	The maximum sleep in ms for the thread in charge of creating batches from the process queue
aspire.node.manager.processBatchCreatorMaxIterations	yes	10	The number of iterations without creating new process batches before increasing the sleep time
aspire.node.manager.processBatchCreatorMultiplier	yes	1.25	The multiplier used to increase the sleep time after the specified iterations without creating new process batches
aspire.node.manager.crawlProgressManagerBaseSleep	yes	100	The base sleep time in ms for the thread in charge of monitoring active crawls
aspire.node.manager.schedulerBaseSleep	yes	10000	The base sleep time in ms for the thread in charge of executing seeds based on the configured schedules
aspire.node.manager.maxBatches	yes	1000	The maximum number of batches the manager will keep in memory
aspire.node.manager.maxBatchItems	yes	100	The maximum number of documents per batch
aspire.node.manager.connectionTimeout	yes	20000	The connection timeout for requests to other Aspire nodes
aspire.node.manager.socketTimeout	yes	20000	The socket timeout for requests to other Aspire nodes
aspire.node.manager.maxRetries	yes	3	The number of retries for requests to other Aspire nodes
aspire.node.manager.proxyHost	no	null	The proxy host to use for requests to other Aspire nodes
aspire.node.manager.proxyPort	no	0	The proxy port to use for requests to other Aspire nodes. Must be provided if the proxyHost is configured
aspire.node.manager.proxyUser	no	null	The proxy user to use for requests to other Aspire nodes
aspire.node.manager.proxyPassword	no	null	The proxy password to use for requests to other Aspire nodes.
aspire.node.manager.pingFrequency	yes	15000	The frequency for the node to ping Elasticsearch. The pings are used to determine if a node is alive and working properly
aspire.node.manager.nodeFailureTimeout	yes	30000	The ping timeout used to determine if a node is not working. The node will be marked as failed in this case and will eventually shut itself down
aspire.node.manager.inProgressJobTimeout	yes	3600000	The maximum time in ms a job can be in "in-progress" status before being released. Default is 1 hour
aspire.node.manager.inProgressJobTimeoutCheckFrequency	yes	1800000	How frequently to check for timed-out "in-progress" jobs
aspire.node.manager.ackCleanBaseSleep	no	5000	How frequently the manager should check for acknowledged batches that need to be removed from memory
aspire.node.manager.proxyCall	no	false	Configure if you want to perform a proxy call to the main manager from a non-main manager and avoid redirect calls.
aspire.node.manager.tags	no		The tags of the manager node. These tags will determine which seeds this node can process. It should be a comma-separated list of tags.
aspire.node.manager.workerRoundRobin	no	false	Whether round-robin should be applied when serving workers with batches
aspire.node.manager.workerRoundRobinTimeout	no	600000	The time in ms after which the worker is considered timed out when round-robin is used.
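
As a quick sanity check on the defaults, the sketch below derives two figures from the table: an upper bound on the number of documents a manager could hold in memory (assuming every batch is full) and the job timeout values converted from ms to minutes. The helper name ms_to_minutes is hypothetical.

```shell
# Defaults from the manager node table.
max_batches=1000       # aspire.node.manager.maxBatches
max_batch_items=100    # aspire.node.manager.maxBatchItems

# Upper bound on documents kept in memory, assuming every batch is full.
max_docs_in_memory=$(( max_batches * max_batch_items ))
echo "$max_docs_in_memory"

# Hypothetical helper: convert milliseconds to whole minutes.
ms_to_minutes() {
  echo $(( $1 / 60000 ))
}

ms_to_minutes 3600000   # aspire.node.manager.inProgressJobTimeout
ms_to_minutes 1800000   # aspire.node.manager.inProgressJobTimeoutCheckFrequency
```

With the defaults, the bound is 100000 documents, the job timeout is 60 minutes and the check runs every 30 minutes.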

User Interface

Since 5.0.3, the following properties are also available for managing the behavior of the user interface.

Parameter	Required	Default	Description
aspire.ui.refreshRate	no	5s	How often to trigger the auto-refresh for the listing pages. It can be defined with a time unit, e.g. 15s, 1m

Dashboards

These properties will be used to generate the dashboard links in the UI

Parameter	Required	Default	Description
aspire.dashboards.enabled	no	false	Enables the dashboard links on the UI
aspire.dashboards.base	no		Base URL to Kibana
aspire.dashboards.main	no		Main Dashboard relative URL
aspire.dashboards.metrics	no		Metrics Dashboard relative URL
