Beginning with release 3.0, Aspire uses an external MongoDB instance for connectors built with the new NoSQL Connector Framework. The database is used to keep crawl metadata and to allow processing and scanning to be distributed. All the MongoDB configuration is done in the settings.xml file.

Basic Example:


  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="false" sslInvalidHostNameAllowed="false">
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
  </noSQLConnectionProvider>


Aspire creates one MongoDB database for each configured content source. When the content source is deleted, the database is dropped. The database name is taken from the normalized value of the content source name. Starting in Aspire 3.2, database names are prefixed with "aspire-" to avoid possible name conflicts. To change the prefix, add a "namespace" to the configuration:


  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="false" sslInvalidHostNameAllowed="false">
    <namespace>myNamespace</namespace>
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
  </noSQLConnectionProvider>


Connect to a Multi-node MongoDB Installation

To connect to a multi-node MongoDB installation, provide a comma-separated list of hostname:port pairs for the MongoDB nodes in the cluster.

Example:

  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="false" sslInvalidHostNameAllowed="false">
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host1:27017,mongodb-host2:27017,mongodb-host3:27017,mongodb-host4:27017</servers>
  </noSQLConnectionProvider>

Using TLS/SSL

To connect to a MongoDB instance configured to use TLS/SSL, set the following attributes on the noSQLConnectionProvider tag:

Attribute                  Value       Description
sslEnabled                 true        Enables SSL on the Aspire MongoDB client
sslInvalidHostNameAllowed  true/false  When true, disables hostname verification during SSL validation
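
As a minimal sketch, a provider configuration with TLS/SSL enabled could look like the following. It reuses the elements from the basic example and changes only the sslEnabled attribute; mongodb-host is a placeholder host name, not a real server.

```xml
  <!-- Sketch: noSQL provider with TLS/SSL enabled; mongodb-host is a placeholder -->
  <noSQLConnectionProvider sslEnabled="true" sslInvalidHostNameAllowed="false">
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
  </noSQLConnectionProvider>
```

Leaving sslInvalidHostNameAllowed set to false keeps hostname verification on, which is likely the safer choice outside of test environments.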


To use TLS/SSL, make sure that the Certificate Authority (CA) that signed the server certificate MongoDB is using (server.pem) is trusted, or that its trust chain leads to a trusted certificate. If you use a self-signed Certificate Authority to sign your server certificate, you need to add it to the Java truststore.

To add it to a Java truststore, take the Certificate Authority certificate (.cert) and import it using the following command:

$ keytool -import -trustcacerts -alias slc -file <your-CA-certificate.cert> -keystore truststore.jks -storepass <your-truststore-password> -noprompt

After importing the certificate into a truststore, add the truststore to the Aspire startup script. See Crawling via HTTPs for instructions on how to do this.

Retries Settings

The provider automatically retries operations that could not be completed because of connection errors. The maximum number of retries is configurable using the "maxRetries" option. By default (if nothing is provided), operations are not retried at all.

  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="false" sslInvalidHostNameAllowed="false">
    <namespace>myNamespace</namespace>
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
    <maxRetries>5</maxRetries>
  </noSQLConnectionProvider>

MongoDB Authentication

Aspire 3.2 supports authenticating to MongoDB using X.509. Depending on your requirements, you will need to modify the settings.xml file.

X.509 Authentication

Aspire 3.2 only supports authenticating to MongoDB using X.509.

The X.509 mechanism authenticates a user whose name is derived from the distinguished subject name of the X.509 certificate presented by the driver during SSL negotiation. This authentication method requires the use of SSL connections with certificate validation.

To configure it, add the following to your settings.xml file:

  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="true" sslInvalidHostNameAllowed="false">
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
    <x509username>CN=user,OU=OrgUnit,O=myOrg</x509username>
  </noSQLConnectionProvider>

If you don't know what to put in the <x509username> field, execute the following command using the X.509 client certificate:

$ openssl x509 -in client.pem -inform PEM -subject -nameopt RFC2253 | grep subject
subject= CN=aaguilar-lptp.search.local,OU=demouser,O=Search Technologies S.A.,ST=Limon,C=CR


To use X.509 authentication, import the client X.509 certificate into a Java keystore so that Aspire can present it to the server for authentication. (For self-signed certificates, the truststore should already be set in the startup script.)

To import the X.509 certificate (client.pem) into a Java keystore, execute the following commands:

$ openssl pkcs12 -export -out client.pkcs12 -in client.pem
Enter Export Password: <your-password-here>

$ keytool -importkeystore -srckeystore client.pkcs12 -srcstoretype PKCS12 -destkeystore client.jks -deststoretype JKS
Enter destination keystore password:
Re-enter new password: <your-password-here>
Enter source keystore password: <your-password-here>
Entry for alias 1 successfully imported.
Import command completed:  1 entries successfully imported, 0 entries failed or cancelled

After importing the client certificate into a Java keystore, include it in the Aspire startup script (aspire.bat):

-Djavax.net.ssl.keyStore=C:\pathToKeyStore\client.jks
-Djavax.net.ssl.keyStorePassword=password

Encrypt sensitive fields in MongoDB

If you want to be extra safe and encrypt the URLs, IDs, or any other metadata stored in MongoDB, you can do so by specifying the names of the fields to encrypt:


  <!-- noSql database provider for the 3.2 connector framework -->
  <noSQLConnectionProvider sslEnabled="false" sslInvalidHostNameAllowed="false">
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <servers>mongodb-host:27017</servers>
    <encryptFields>
      <field>_id</field> <!-- Encrypts all the IDs -->
      <field>url</field> <!-- Encrypts the url fields -->
      <field>fetchUrl</field> 
      <field>parentId</field> 
    </encryptFields>
  </noSQLConnectionProvider>
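
The options on this page are each shown in isolation. The following sketch puts several of them into a single provider element, on the assumption (not confirmed by this page) that they can be combined freely. Every element and attribute is taken from the examples above; the host names, namespace, and X.509 subject are placeholders.

```xml
  <!-- Sketch only: assumes the options shown on this page can be combined -->
  <noSQLConnectionProvider sslEnabled="true" sslInvalidHostNameAllowed="false">
    <namespace>myNamespace</namespace>
    <implementation>com.searchtechnologies.aspire:aspire-mongodb-provider</implementation>
    <!-- Placeholder host names for a multi-node installation -->
    <servers>mongodb-host1:27017,mongodb-host2:27017,mongodb-host3:27017</servers>
    <maxRetries>5</maxRetries>
    <!-- Placeholder subject; use the subject of your own client certificate -->
    <x509username>CN=user,OU=OrgUnit,O=myOrg</x509username>
    <encryptFields>
      <field>url</field>
      <field>fetchUrl</field>
    </encryptFields>
  </noSQLConnectionProvider>
```

Because X.509 authentication requires SSL connections with certificate validation, sslEnabled is set to true here.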