Some Aspire components, such as the NoSQL provider, must be configured before settings.json is read. This configuration is done using environment variables or JVM parameters.

Some components configured in the settings file, such as the Worker and Manager nodes, can alternatively be configured with environment variables or JVM parameters.

Format


All properties can be passed either as an environment variable or as a JVM parameter. The "." and "_" characters can be replaced with each other.

Setting a property as an environment variable (Windows command prompt):
SET aspire_noSql_elastic_server=http://localhost:9200
Passing a property as a JVM parameter:
java -Daspire.noSql.elastic.server=http://localhost:9200
Warning

For environment variables, the property name should always use "_" instead of ".".
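
As a small illustration of the naming rule, the following sketch derives the environment variable name from a property name by replacing each "." with "_". The variable names prop and env_name exist only for this example; the property and URL are the ones used above.

```shell
# Property name as it would be passed as a JVM parameter
prop="aspire.noSql.elastic.server"

# Environment variable form: every "." replaced by "_"
env_name=$(printf '%s' "$prop" | tr '.' '_')

# Export the variable under the derived name
export "$env_name=http://localhost:9200"

echo "$env_name"
```

The same substitution applies to every property listed on this page when it is set through the environment.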

Memory settings


On Linux-based systems, the memory usage can be changed with the following environment variables:

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.max.heap.memory | no | 2g | Maximum heap the JVM will request and use from the host system |
| aspire.max.metaspace.size | no | 256m | Maximum metaspace the JVM will request and use from the host system |
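
For instance, on a Linux host the heap and metaspace limits could be raised before starting Aspire, as sketched below. The values 4g and 512m are arbitrary examples, and the "_" form of the names is assumed from the environment variable rule in the Format section.

```shell
# Example values only; choose sizes that fit the host
export aspire_max_heap_memory=4g
export aspire_max_metaspace_size=512m

# Print the values that are now visible to child processes
echo "heap=$aspire_max_heap_memory metaspace=$aspire_max_metaspace_size"
```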


On Windows-based systems, if you use the aspire.bat file, change the memory settings in aspire.bat itself. Otherwise, if Aspire is running as a Windows Service:

  1. Browse to
    • HKEY_LOCAL_MACHINE / SOFTWARE / WOW6432Node / Apache Software / Procrun 2.0 / AspireService / Parameters / Java
  2. Modify JvmMs and JvmMx to what you need (in bytes), and then restart the service.


Elasticsearch NoSQL Provider


Below is the list of properties to configure the Elasticsearch NoSQL Provider.

| Parameter | Required | Default | Available from | Description |
| --- | --- | --- | --- | --- |
| aspire.noSql.elastic.server | yes | http://localhost:9200 | | The Elasticsearch server URL to use. Multiple URLs can be specified, separated by "," |
| aspire.noSql.elastic.authentication.basic | no | false | | Enables Elasticsearch basic authentication |
| aspire.noSql.elastic.user | no | null | | The Elasticsearch user to use for basic authentication |
| aspire.noSql.elastic.password | no | null | | The Elasticsearch password to use for basic authentication |
| aspire.noSql.elastic.authentication.aws | no | false | | Enables Elasticsearch AWS authentication |
| aspire.noSql.elastic.authentication.useCredentialsProviderChain | no | false | | Option to use the AWS credentials provider chain to get the credentials |
| aspire.noSql.elastic.aws.assumeRole | no | false | 5.0.3 | Whether a role must be assumed to access Elasticsearch. Must be true/false |
| aspire.noSql.elastic.aws.roleArn | no | null | 5.0.3 | The Role ARN to be assumed |
| aspire.noSql.elastic.aws.access.key | no | null | | AWS access key for authentication |
| aspire.noSql.elastic.aws.secret.key | no | null | | AWS secret key for authentication |
| aspire.noSql.elastic.aws.region | no | null | | The AWS region |
| aspire.noSql.elastic.keepSearchContextAlive | no | 5m | | The amount of time to keep the search context of Elasticsearch scroll requests alive, using "m" as the unit |
| aspire.noSql.elastic.maxRequestSize | no | 10485760B | | The maximum size for a bulk request. The value can be specified in B, K, M and G units |
| aspire.noSql.elastic.maxConnections | no | 100 | | The maximum number of connections to keep open |
| aspire.noSql.elastic.maxConnectionsPerRoute | no | 10 | | The maximum number of connections per server |
| aspire.noSql.elastic.readTimeout | no | 30000 | | The socket timeout |
| aspire.noSql.elastic.connectionTimeout | no | 15000 | | The connection timeout |
| aspire.noSql.elastic.maxRetries | no | 3 | | The number of times to retry each request |
| aspire.noSql.elastic.retriesWaitTime | no | 5000 | | The time to wait in ms between retries |
| aspire.noSql.elastic.useThrottling | no | false | | Enables throttling of requests to Elasticsearch |
| aspire.noSql.elastic.throttlingRate | no | 5000 | | The throttling rate in ms |
| aspire.noSql.elastic.throttlingConnectionRate | no | 500 | | The maximum number of requests allowed in the period specified by the throttlingRate |
| aspire.noSql.elastic.waitTime429 | no | 3000 | | The time to wait in ms before retrying after a 429 error |
| aspire.noSql.elastic.bulk | no | true | | Enables using bulk for requests |
| aspire.noSql.elastic.bulkSize | no | 500 | | The maximum number of documents to include in a bulk request |
| aspire.noSql.elastic.bulkInactivityTimeout | no | 5 | | The inactivity time in s before flushing a bulk request |
| aspire.noSql.elastic.bulkRegularTimeout | no | 30 | | The maximum amount of time in s for a bulk request to be kept in memory before flushing |
| aspire.noSql.elastic.debugFile | no | null | | The path to the debug file. Requests to Elasticsearch are logged in this file |
| aspire.noSql.elastic.mappingFile | no | null | | The file path (including file name) of the mapping for the indexes used by Aspire. By default, the mapping included in the provider is used |
| aspire.noSql.elastic.index.prefix | no | aspire | | The prefix to use for the indexes created by the provider |
| aspire.noSql.elastic.debug | no | false | | Enables debug logging information |
| aspire.noSql.elastic.usePooling | no | true | | Enables HTTP connection pooling |
| aspire.nosql.elastic.timeSeriesType | no | index | 5.0.3, 5.3 (Opensearch) | Enables rollover for the audit, error and log time series indexes. "index" (default): no rollover. "dataStream": rollover based on data streams. "dataStreamOpensearch": rollover based on Opensearch data streams |
| aspire.nosql.elastic.ilmPolicyFile | no | default provider policy file | 5.0.3 | The file path (including file name) of the ILM policy file for the indexes used by Aspire. Used only for the time series option "dataStream*" |
| aspire.nosql.elastic.indexTemplateFile | no | default provider template file | 5.0.3 | The file path (including file name) of the template file for the indexes used by Aspire. Used only for the time series option "dataStream*" |

Index Sharding and Replicas settings

The sharding and replica settings of each index can be changed by using a custom mapping file and configuring the aspire.noSql.elastic.mappingFile property.

Since 5.0.1, when using the default mapping settings, you can use the following parameter prefixes:

  • aspire.noSql.elastic.index.shards.[indexName]
    • Should hold an integer specifying the number of shards for the given index
  • aspire.noSql.elastic.index.replicas.[indexName]
    • Should hold an integer specifying the number of replicas for the given index

If none of these properties are specified, the cluster's default number of shards and replicas is used (usually 1 shard and 1 replica).

The different index names available are:

  • audit
  • base
  • errors
  • hierarchy
  • log
  • map
  • queue
  • retryLog
  • set
  • settings
  • snapshot
  • updateQueue
  • identityCache
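
As an illustration, the per-index properties could be set as environment variables like this. The shard and replica counts are arbitrary example values, and the "_" form of the names is assumed from the environment variable rule in the Format section.

```shell
# 3 shards and 2 replicas for the audit index (example values)
export aspire_noSql_elastic_index_shards_audit=3
export aspire_noSql_elastic_index_replicas_audit=2

# List the per-index settings that are currently exported
env | grep '^aspire_noSql_elastic_index_' | sort
```

The index name at the end of each variable must be one of the names listed above.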

SSL Certificates


Below is the list of properties to configure the SSL certificates.

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.ssl.trustAll | no | false | Configure if all certificates should be trusted |
| aspire.ssl.overwriteFactory | no | false | Configure if the created key managers should overwrite the Java connection factory |
| aspire.ssl.truststore.file | no | null | The path of the trust store file |
| aspire.ssl.truststore.password | no | null | The trust store file password |
| aspire.ssl.truststore.type | no | jks | The file format of the trust store file |
| aspire.ssl.keystore.file | no | null | The path of the key store file |
| aspire.ssl.keystore.password | no | null | The key store file password |
| aspire.ssl.keystore.type | no | jks | The file format of the key store file |

Security and encryption


Below is the list of properties related to Aspire security and encryption.

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.ldap.bind.dn.password | no | null | The password of the User DN. Not required if the authentication is anonymous |
| aspire.security.api.auditing | no | true | Audits the Aspire API calls. Available from Aspire 5.1 onwards |

Encryption

When a password, secret or token in a configuration must be persisted, Aspire encrypts it before storing it. The default encryption mechanism is AES 256 with a local key. As of version 5.0.3, AWS KMS encryption can be used instead.

Standard AES 256 Encryption with local key

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.encryption.key.file | no | null | (Optional) Path (including file name) where the encryption key is located. If not provided, a default in-memory key is used; for production installations it must always be provided. This can also be passed as a JVM parameter or as an environment variable, aspire_encryption_key_file |

The key file should be 32 bytes long. If it is longer, the first 32 bytes are used as the encryption key.

Grant read access to the Aspire user only (chmod 400 <file>).

The file can be generated randomly:

$ head -c 32 /dev/urandom > encryption.key
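
The steps above can be combined into a short sketch. The file name encryption.key is taken from the command above; keeping it in the current directory is an assumption made for the example.

```shell
# Generate a 32-byte random key
head -c 32 /dev/urandom > encryption.key

# Restrict access to read-only for the owner
chmod 400 encryption.key

# Confirm the size of the key file in bytes
key_size=$(wc -c < encryption.key | tr -d ' ')
echo "key size: $key_size bytes"

# Point Aspire at the key through the environment variable
export aspire_encryption_key_file="$PWD/encryption.key"
```

Because the file is made read-only, running the sketch a second time in the same directory fails at the first step instead of overwriting an existing key.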

AWS KMS Encryption

Available since Aspire 5.0.3. This provider uses the AWS Key Management Service (KMS) to encrypt sensitive data, with a key in KMS used to encrypt and decrypt it. For more details about this encryption provider, see Aspire KMS encryption.

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.encryption.kms.roleARN | no | null | (Optional) If the KMS service must be accessed through the assumption of an IAM role, specify the role ARN |
| aspire.encryption.kms.keyARN | yes | N/A | The KMS key ARN. See Aspire KMS encryption for more information about creating a KMS key for Aspire |
| aspire.encryption.kms.region | yes | N/A | The AWS region in which the KMS service will be used |
| aspire.encryption.kms.accessKey | no | null | (Optional) Specify the access key if static credentials must be used. If this is not specified, the Default Credential Provider Chain will be used |
| aspire.encryption.kms.secretKey | no | null | (Optional) Specify the secret key if static credentials must be used. If this is not specified, the Default Credential Provider Chain will be used |

Worker Node


These properties will be used by all worker nodes in the cluster. 

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.node.worker.maxMemQueueSize | yes | 4000 | The maximum number of items to keep in the in-memory queue |
| aspire.node.worker.queueSizeThreshold | yes | 0.5 | The capacity threshold of the in-memory queue before requesting more items from the managers |
| aspire.node.worker.cleanUpWaitTime | yes | 300000 | The wait time in ms for the thread that checks the connectors' clean-up threshold |
| aspire.node.worker.cleanUpThreshold | yes | 3600000 | The time in ms for a connector to be idle before being removed from memory |
| aspire.node.worker.maxEnqueueRetries | yes | 5 | The number of retries to enqueue an item into the framework pipeline |
| aspire.node.worker.workflow.appCleanUpWaitTime | yes | 60000 | The wait time in ms for the thread that checks the workflow applications' clean-up threshold |
| aspire.node.worker.workflow.appCleanUpThreshold | yes | 3600000 | The time in ms for a workflow application to be idle before being removed from memory |
| aspire.node.worker.tags | no | | The tags of the worker node. These tags determine which items this node can process. It should be a comma-separated list of tags |
| aspire.node.worker.entryProcessorBaseSleep | yes | 10 | The base sleep time in ms for the thread in charge of queuing received items into the connector framework pipelines |
| aspire.node.worker.entryProcessorMaxSleep | yes | 2000 | The maximum sleep in ms for the thread in charge of queuing received items into the connector framework pipelines |
| aspire.node.worker.entryProcessorMaxIterations | yes | 5 | The number of iterations without queuing items before increasing the sleep time |
| aspire.node.worker.entryProcessorMultiplier | yes | 1.25 | The multiplier used to increase the sleep time after the specified iterations without queuing items |
| aspire.node.worker.batchLoaderBaseSleep | yes | 10 | The base sleep time in ms for the thread in charge of requesting batches from the manager nodes |
| aspire.node.worker.batchLoaderMaxSleep | yes | 2000 | The maximum sleep in ms for the thread in charge of requesting batches from the manager nodes |
| aspire.node.worker.batchLoaderMaxIterations | yes | 5 | The number of iterations without receiving batches from the manager nodes before increasing the sleep time |
| aspire.node.worker.batchLoaderMultiplier | yes | 1.25 | The multiplier used to increase the sleep time after the specified iterations without receiving batches from the manager nodes |
| aspire.node.worker.connectionTimeout | yes | 20000 | The connection timeout for requests to other Aspire nodes |
| aspire.node.worker.socketTimeout | yes | 20000 | The socket timeout for requests to other Aspire nodes |
| aspire.node.worker.maxRetries | yes | 3 | The number of retries for requests to other Aspire nodes |
| aspire.node.worker.proxyHost | no | null | The proxy host to use for requests to other Aspire nodes |
| aspire.node.worker.proxyPort | no | 0 | The proxy port to use for requests to other Aspire nodes. Must be provided if the proxyHost is configured |
| aspire.node.worker.proxyUser | no | null | The proxy user to use for requests to other Aspire nodes |
| aspire.node.worker.proxyPassword | no | null | The proxy password to use for requests to other Aspire nodes |
| aspire.node.worker.pingFrequency | yes | 15000 | The frequency for the node to ping Elasticsearch. The pings are used to determine if a node is alive and working properly |
| aspire.node.worker.nodeFailureTimeout | yes | 30000 | The ping timeout used to determine if a node is not working. In this case the node is marked as failed and eventually shuts itself down |
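
The entryProcessor* and batchLoader* settings describe a sleep time that grows while the thread stays idle. The exact algorithm is not spelled out here, so the following is only a sketch under the assumption that the sleep time starts at the base value, is multiplied by the multiplier each time the configured number of idle iterations is reached, and is capped at the maximum. It uses the default values from the table.

```shell
# Defaults from the table: base 10 ms, max 2000 ms, multiplier 1.25
# Assumed rule: sleep = min(max, sleep * multiplier) per idle round
result=$(awk -v base=10 -v max=2000 -v mult=1.25 'BEGIN {
    sleep_ms = base
    rounds = 0
    while (sleep_ms < max) {
        sleep_ms = sleep_ms * mult
        if (sleep_ms > max) sleep_ms = max
        rounds++
    }
    printf "rounds=%d final=%d\n", rounds, sleep_ms
}')
echo "$result"
```

Under these assumptions the sleep time reaches the 2000 ms cap after 24 increases, which gives a feel for how quickly an idle thread backs off with the default multiplier.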

Manager Node


These properties will be used by all manager nodes in the cluster. 

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.node.manager.scanBatchCreatorBaseSleep | yes | 30 | The base sleep time in ms for the thread in charge of creating batches from the scan queue |
| aspire.node.manager.scanBatchCreatorMaxSleep | yes | 2000 | The maximum sleep in ms for the thread in charge of creating batches from the scan queue |
| aspire.node.manager.scanBatchCreatorMaxIterations | yes | 10 | The number of iterations without creating new scan batches before increasing the sleep time |
| aspire.node.manager.scanBatchCreatorMultiplier | yes | 1.25 | The multiplier used to increase the sleep time after the specified iterations without creating new scan batches |
| aspire.node.manager.processBatchCreatorBaseSleep | yes | 30 | The base sleep time in ms for the thread in charge of creating batches from the process queue |
| aspire.node.manager.processBatchCreatorMaxSleep | yes | 2000 | The maximum sleep in ms for the thread in charge of creating batches from the process queue |
| aspire.node.manager.processBatchCreatorMaxIterations | yes | 10 | The number of iterations without creating new process batches before increasing the sleep time |
| aspire.node.manager.processBatchCreatorMultiplier | yes | 1.25 | The multiplier used to increase the sleep time after the specified iterations without creating new process batches |
| aspire.node.manager.crawlProgressManagerBaseSleep | yes | 100 | The base sleep time in ms for the thread in charge of monitoring active crawls |
| aspire.node.manager.schedulerBaseSleep | yes | 10000 | The base sleep time in ms for the thread in charge of executing seeds based on the configured schedules |
| aspire.node.manager.maxBatches | yes | 1000 | The maximum number of batches the manager will keep in memory |
| aspire.node.manager.maxBatchItems | yes | 100 | The maximum number of documents per batch |
| aspire.node.manager.connectionTimeout | yes | 20000 | The connection timeout for requests to other Aspire nodes |
| aspire.node.manager.socketTimeout | yes | 20000 | The socket timeout for requests to other Aspire nodes |
| aspire.node.manager.maxRetries | yes | 3 | The number of retries for requests to other Aspire nodes |
| aspire.node.manager.proxyHost | no | null | The proxy host to use for requests to other Aspire nodes |
| aspire.node.manager.proxyPort | no | 0 | The proxy port to use for requests to other Aspire nodes. Must be provided if the proxyHost is configured |
| aspire.node.manager.proxyUser | no | null | The proxy user to use for requests to other Aspire nodes |
| aspire.node.manager.proxyPassword | no | null | The proxy password to use for requests to other Aspire nodes |
| aspire.node.manager.pingFrequency | yes | 15000 | The frequency for the node to ping Elasticsearch. The pings are used to determine if a node is alive and working properly |
| aspire.node.manager.nodeFailureTimeout | yes | 30000 | The ping timeout used to determine if a node is not working. In this case the node is marked as failed and eventually shuts itself down |
| aspire.node.manager.inProgressJobTimeout | yes | 3600000 | The maximum time in ms a job can be in "in-progress" status before being released. Default is 1 hour |
| aspire.node.manager.inProgressJobTimeoutCheckFrequency | yes | 1800000 | How frequently to check for timed-out "in-progress" jobs |
| aspire.node.manager.ackCleanBaseSleep | no | 5000 | How frequently the manager should check for acknowledged batches that need to be removed from memory |
| aspire.node.manager.proxyCall | no | false | Configure if you want to perform a proxy call to the main manager from a non-main manager and avoid redirect calls |
| aspire.node.manager.tags | no | | The tags of the manager node. These tags determine which seeds this node can process. It should be a comma-separated list of tags |
| aspire.node.manager.workerRoundRobin | no | false | Whether round-robin should be applied when serving workers with batches |
| aspire.node.manager.workerRoundRobinTimeout | no | 600000 | The time in ms after which the worker is considered timed out when round-robin is used |

User Interface


Since 5.0.3, the following properties are also available for managing the behavior of the user interface.

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.ui.refreshRate | no | 5s | How often to trigger the auto-refresh for the listing pages. It can be defined with a time unit, e.g. 15s, 1m |

Dashboards


These properties will be used to generate the dashboard links in the UI

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| aspire.dashboards.enabled | no | false | Enables the dashboard links on the UI |
| aspire.dashboards.base | no | | Base URL to Kibana |
| aspire.dashboards.main | no | | Main Dashboard relative URL |
| aspire.dashboards.metrics | no | | Metrics Dashboard relative URL |