Configuration
This section lists all configuration parameters available for the Publish to Azure Search App Bundle component.
Element | Type | Default | Description |
---|
ElasticNoUrl | boolean | true | If true, the publisher builds the URL from the host and port entered; if false, it uses the complete URL given in ElasticUrl. |
server |  |  | The name of the service endpoint to use. |
index |  |  | Azure Search index where the feeds are going to be sent. |
ElasticUrl |  |  | Complete URL where the feeds are going to be sent, e.g. http://localhost:9200/_bulk |
ElasticPort | int | 9200 | ElasticSearch port to send the feeds to. |
ElasticHost | String | - | ElasticSearch hostname or IP address, e.g. server.domain.com |
ElasticIndex | String | index1 | Index to which the jobs are going to be published. |
aspireToElasticGroovy |  | aspireToElasticsearchBulk.groovy | Location of the Groovy file that transforms the job data into an ElasticSearch feed. |
maxResults | int | 1000000 | Maximum number of documents the search engine can fetch for the same query. |
pageSize | int | 10000 | How many documents to fetch per page. |
idField | String | hits._id | Field used to store the id in the search engine. |
urlField | String | hits.fields.url | Field used to store the url in the search engine. |
timestampField | String | hits.fields.submitTime | The name of the timestamp field holding the index timestamp of every document. |
aspireToAzureSearchGroovy |  | aspireToAzureSearchBulk.groovy | Location of the Groovy file that transforms the job data into a feed. |
debug | boolean | false | If true, the component logs debug information. |
Example Configuration
With Host and Port
Code Block |
---|
|
<application config="com.searchtechnologies.aspire:app-publish-to-elasticsearch">
<properties>
<ElasticNoUrl>true</ElasticNoUrl>
<ElasticHost>localhost</ElasticHost>
<ElasticPort>9200</ElasticPort>
<ElasticIndex>index1</ElasticIndex>
<aspireToElasticGroovy>${appbundle.home}/config/groovy/aspireToElasticsearchBulk.groovy</aspireToElasticGroovy>
<server>corest.search.windows.net</server>
<maxResults>1000000</maxResults>
<pageSize>10000</pageSize>
<index>test</index>
<idField>hits._id</idField>
<urlField>hits.fields.url</urlField>
<timestampField>hits.fields.submitTime</timestampField>
<debug>false</debug>
</properties>
</application> |
With Complete Url
Code Block |
---|
|
<application config="com.searchtechnologies.aspire:app-publish-to-elasticsearch">
<properties>
<apiVersion>2016-09-01</apiVersion>
<ElasticNoUrl>false</ElasticNoUrl>
<ElasticUrl>http://localhost:9200/_bulk</ElasticUrl>
<apiKey>encrypted:9804B36327DAF1E712E4E82301B6A276FCBBA459834EB15F6A94255B6B0BC32B20A3E7262DD2D3D74A6FE5A70A251FCD</apiKey>
<ElasticIndex>index1</ElasticIndex>
<aspireToElasticGroovy>${appbundle.home}/config/groovy/aspireToElasticsearchBulk.groovy</aspireToElasticGroovy>
<aspireToAzureSearchGroovy>${appbundle.home}/config/groovy/aspireToAzureSearchBulk.groovy</aspireToAzureSearchGroovy>
<maxResults>1000000</maxResults>
<pageSize>10000</pageSize>
<idField>hits._id</idField>
<urlField>hits.fields.url</urlField>
<timestampField>hits.fields.submitTime</timestampField>
<debug>false</debug>
</properties>
</application> |
Edit Groovy
The default Groovy transformation file is aspireToAzureSearchBulk.groovy.
The default Groovy transformation file provided by the publisher expects metadata as described in Connector Metadata.
To add a new metadata field extracted by an Aspire Connector, add a Groovy element inside the builder.$object() that comes right after the builder.flush().
metadata-name doc.metadatafield
Change the document ID
The id of an Azure Search document is used to uniquely identify a file in the index. By default, Publish to Azure Search will use the MD5 of the following fields from the Aspire document in order of precedence (if one is missing, then the next will be used):
If you want to change this behavior, edit or create a new Groovy file which has the following element inside builder.index():
Code Block |
---|
language | groovy |
---|
|
'_id' value-for-id |
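As a rough illustration of the default behaviour described in this section (MD5 of the first available field), the following plain Groovy sketch picks the first non-empty field from a list and hashes its value. It is not the publisher's actual code: the field names `firstChoice` and `secondChoice` and the helpers `md5Hex` and `documentId` are hypothetical placeholders, and the Aspire document is modelled as a simple map.

```groovy
import java.security.MessageDigest

// Hex-encoded MD5 of a string (32 lowercase hex characters).
String md5Hex(String text) {
    byte[] digest = MessageDigest.getInstance('MD5').digest(text.getBytes('UTF-8'))
    return new BigInteger(1, digest).toString(16).padLeft(32, '0')
}

// Returns the MD5 of the first candidate field that has a value in the document.
// 'candidateFields' is listed in order of precedence (hypothetical names).
String documentId(Map doc, List<String> candidateFields) {
    String field = candidateFields.find { doc[it] }
    if (field == null) {
        throw new IllegalArgumentException("No id field present among ${candidateFields}")
    }
    return md5Hex(doc[field].toString())
}

// Hypothetical document: 'firstChoice' is missing, so 'secondChoice' is used.
def doc = [secondChoice: 'hello']
println documentId(doc, ['firstChoice', 'secondChoice'])
```

A custom id can follow the same pattern: compute the value however you need and pass it as the '_id' element.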
Tip |
---|
For more information on how to create a Groovy transformation file, please see JSON Transformation. |
Connector-specific fields
By default, the connector-specific fields of the document are not indexed. To enable indexing of connector-specific fields, add them to the connectorSpecificMap map at the start of the Groovy file:
Code Block |
---|
language | groovy |
---|
|
def connectorSpecificMap = [
'isContainer':'is_container'
] |
The key of each map entry is the name of the connector-specific field as it appears in the document, and the value is the name used for indexing. Only the fields specified in this map will be indexed.
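The selection and renaming described above can be sketched in plain Groovy as follows. This is only an illustration under stated assumptions, not the code of the bundled transformation file: the `selectForIndexing` helper, the sample `connectorSpecific` map, and its `owner` field are hypothetical.

```groovy
// Mapping from connector-specific field name to index field name.
def connectorSpecificMap = [
    'isContainer': 'is_container'
]

// Hypothetical stand-in for the connector-specific fields of a document.
def connectorSpecific = [
    'isContainer': 'true',
    'owner'      : 'jdoe'   // not listed in the map, so it is skipped
]

// Copies only the mapped fields, renaming each key to its index name.
Map selectForIndexing(Map<String, String> mapping, Map source) {
    def out = [:]
    mapping.each { sourceName, indexName ->
        if (source.containsKey(sourceName)) {
            out[indexName] = source[sourceName]
        }
    }
    return out
}

def indexed = selectForIndexing(connectorSpecificMap, connectorSpecific)
println indexed
```

Adding another entry to connectorSpecificMap, for example a mapping for the hypothetical `owner` field, would make that field appear in the result under its index name.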