When deploying an Aspire cluster in a production environment, resource allocation and security settings become very important, as these environments should be configured to be as stable and secure as possible.

Resource Allocation

When deploying Aspire nodes, it is important to correctly size each VM or container running a node, because each node type has different resource consumption behavior.

Starting the manager and worker capabilities in the same JVM can be done for development and testing, but it is neither supported nor recommended for production deployments. There should be at least one distribution for the worker and one for the manager, each running in a different VM.

Manager nodes

The number of manager nodes affects the availability and responsiveness of the cluster, as each manager node handles a set of active seeds (seeds for which there is a running crawl).

The optimum number of manager nodes also depends on the number of worker nodes: the more worker nodes there are, the harder each manager node has to work to keep up with their requests. If the manager/worker ratio is off, one of two things happens. With too few managers, the manager nodes might not serve worker requests quickly enough. With too few workers, there is not enough capacity to consume the work created by the managers, which under-utilizes the managers' resources.

Minimum nodes    Recommended nodes    Resources
1                2                    4 GB RAM, 2 CPU cores

For each manager node, it is recommended to add one CPU core for every 100 concurrent seeds that node will manage. For instance:

Suppose you have 2 manager nodes and initially planned for 200 concurrent seeds at a time. This means each manager will manage at most 100 seeds concurrently. If you need to raise the total to 400 concurrent seeds, that adds 100 seeds per manager node, so it is recommended to increase the CPU cores of each manager node by 1.
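
The arithmetic in the example above can be sketched as a small helper. This is only an illustration of the guideline, not part of Aspire: the function name is hypothetical, and it assumes that seeds are spread evenly across manager nodes and that partial blocks of 100 seeds round up to a whole core.

```python
import math

# Guideline from the text: one extra CPU core per 100 concurrent seeds per manager.
SEEDS_PER_EXTRA_CORE = 100


def extra_cores_per_manager(manager_nodes, current_seeds, target_seeds):
    """Return how many CPU cores to add to each manager node.

    Assumptions (not stated by the guideline itself):
    - seeds are distributed evenly across manager nodes;
    - a partial block of 100 extra seeds still counts as one core.
    """
    if manager_nodes < 1:
        raise ValueError("at least one manager node is required")

    # Worst-case number of concurrent seeds handled by a single manager.
    current_per_node = math.ceil(current_seeds / manager_nodes)
    target_per_node = math.ceil(target_seeds / manager_nodes)

    # Only growth requires extra cores; shrinking the load adds none.
    extra_seeds = max(0, target_per_node - current_per_node)
    return math.ceil(extra_seeds / SEEDS_PER_EXTRA_CORE)


if __name__ == "__main__":
    # The scenario from the text: 2 managers, growing from 200 to 400 seeds.
    print(extra_cores_per_manager(2, 200, 400))
```

With the numbers from the example (2 managers, 200 seeds growing to 400), the helper reports one additional core per manager node.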

Worker nodes

The number of worker nodes directly affects crawl throughput, as these are the nodes doing the actual work.

Minimum nodes    Recommended nodes    Resources
1                2                    16 GB RAM, 4 CPU cores

Security Settings

Create a customized Encryption Key File

Aspire stores sensitive configuration, such as credentials, encrypted with the AES-256 algorithm. To do so, it uses an encryption key located in a file accessible by the Aspire process. If no such key is configured, a constant default key is used to encrypt and decrypt.

Using the default key is not secure! Anything encrypted with it can be decrypted in any other Aspire deployment that uses the default key.

It is strongly recommended to create a random 256-bit (32-byte) key file and configure it as the encryption key for all Aspire nodes in the same cluster. See Encryption properties for details on setting it.
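
As a minimal sketch, a random 32-byte key file could be generated as shown below. The function name and file path are hypothetical, and the sketch assumes the key file holds the raw key bytes; check Encryption properties for the exact format and location Aspire expects before using it.

```python
import os
import secrets


def write_key_file(path, num_bytes=32):
    """Create a file holding num_bytes of cryptographically secure random data.

    Assumption: the key file stores raw bytes. The format Aspire actually
    expects is described in Encryption properties, not here.
    """
    key = secrets.token_bytes(num_bytes)

    # O_EXCL refuses to overwrite an existing key; mode 0o600 restricts the
    # file to its owner on POSIX systems (largely ignored on Windows).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as key_file:
        key_file.write(key)
    return path


if __name__ == "__main__":
    # Hypothetical file name, used only for illustration.
    write_key_file("aspire-encryption.key")
```

The same file then needs to be made available to every node in the cluster, since data encrypted with one key cannot be decrypted with another.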

Secured Access Authentication and Authorization to Aspire Admin UI and REST API

If an engineering team will be managing Aspire, it is recommended to secure access to the UI by using LDAP to control who can perform which actions. See Security API for information on the security model and roles, and Ldap Configuration for how to configure it.

Enable HTTPS on Aspire Admin UI and REST API

It is recommended to secure access to Aspire HTTP endpoints with a TLS/SSL certificate (HTTPS). This is important because some requests contain sensitive information such as credentials. See Enable HTTPS for information on this.

Custom Keystore and Truststore configuration

If you use HTTPS services (such as the Elasticsearch provider, or crawling HTTPS repositories) and need to trust the CA of those services, it is recommended to include a Java Keystore providing the custom trusted certificates. See Crawling via HTTPs.