The Elasticsearch connector crawls content from an Elasticsearch repository.


Introduction


The Elasticsearch connector retrieves documents stored in an Elasticsearch index, using an Elasticsearch query to select which documents to extract.
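As an illustration of filtering documents with a query, the sketch below builds a small Query DSL search body. It only shows the general shape of such a query; the function name, the field name `status`, and the value `published` are hypothetical and are not part of the connector's configuration.

```python
import json


def build_search_body(field, value, page_size=100):
    """Build a Query DSL search body that keeps only documents whose
    `field` exactly matches `value`.

    The field and value are placeholders chosen for this example.
    """
    return {
        "size": page_size,
        "query": {
            "bool": {
                # A filter clause selects documents without scoring them.
                "filter": [
                    {"term": {field: value}}
                ]
            }
        },
    }


body = build_search_body("status", "published")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the `_search` endpoint of the index being crawled; only documents matching the filter would be returned.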

Environment and Access Requirements


Repository Support

The Elasticsearch connector supports crawling the following repositories:

Repository | Version | Connector Version
Elasticsearch | ALL | 5.0

Account Privileges

For the Elasticsearch connector to be able to crawl content, the Aspire Worker nodes must be run with an account with (permissions).

If authentication is enabled in the Elasticsearch server, a user account with sufficient privileges must be supplied.
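For a server that uses HTTP Basic authentication, the supplied account is ultimately sent as a standard `Authorization` header. The following is a minimal sketch of how that header is formed, assuming Basic authentication; the function name and the credentials are hypothetical, and the connector's own handling of credentials is not shown here.

```python
import base64


def basic_auth_header(username, password):
    """Return the HTTP header for Basic authentication.

    The credentials are joined with a colon and Base64-encoded,
    as defined by the HTTP Basic scheme.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Example credentials are placeholders only.
headers = basic_auth_header("user", "pass")
print(headers)
```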

Environment Requirements


Framework and Connector Features


Framework Features

Name | Supported
Content Crawling | yes
Identity Crawling | no
Snapshot-based Incrementals | yes
Non-snapshot-based Incrementals | yes
Document Hierarchy |

Connector Features

The Elasticsearch connector has the following features:

  • Extracts documents from multiple Elasticsearch indexes.
  • Uses Query DSL to define queries.
  • Supports slices for querying.
  • Supports Basic and AWS Signature V4 authentication.
  • Uses the Get or MGet Elasticsearch methods to fetch content.
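The slice and MGet features listed above correspond to standard Elasticsearch request shapes. The sketch below builds the request bodies only, under the assumption that slicing follows the Elasticsearch sliced-scroll convention (`slice.id` and `slice.max`) and that MGet uses the `docs` form; the helper names, index names, and document IDs are hypothetical, and this is not the connector's implementation.

```python
def sliced_bodies(query, max_slices):
    """Return one search body per slice so the slices can be
    scrolled independently (for example, in parallel).

    Elasticsearch expects the slice count to be greater than 1.
    """
    if max_slices < 2:
        raise ValueError("max_slices must be at least 2")
    return [
        {"slice": {"id": slice_id, "max": max_slices}, "query": query}
        for slice_id in range(max_slices)
    ]


def mget_body(doc_refs):
    """Return an MGet body for (index, id) pairs, which lets one
    request fetch documents from several indexes."""
    return {"docs": [{"_index": index, "_id": doc_id} for index, doc_id in doc_refs]}


slices = sliced_bodies({"match_all": {}}, 3)
batch = mget_body([("index-a", "1"), ("index-b", "7")])
print(slices)
print(batch)
```

Fetching with Get retrieves one document per request, while MGet batches several document lookups into a single request, which usually reduces round trips when many documents are fetched.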


Content Crawled


The Elasticsearch connector can crawl the following objects:

Name | Type | Relevant Metadata | Content Fetch and Extraction | Description
Indexes | container | | N/A | The containers that hold the documents.
Documents | document | | Yes | The documents held by indices. A document can be any structured data (numbers, strings, dates, etc.) encoded in JSON.






Limitations


No limitations defined
