The Apache Kafka Connector will crawl content from a repository.


Introduction


The Apache Kafka Connector crawls content from Apache Kafka topics using the PLAINTEXT protocol.

Environment and Access Requirements


Repository Support

The Apache Kafka connector supports crawling the following repositories:

Repository    Version    Connector Version
Windows       all        5.0.2
Linux         all        5.0.2

Account Privileges

For the Apache Kafka connector to be able to crawl content, the Aspire Worker nodes must be run with an account with.

Environment Requirements

Requirement     Version
Apache Kafka    3.0.0

Framework and Connector Features


Framework Features

Name                               Supported
Content Crawling                   Yes
Identity Crawling                  No
Snapshot-based Incrementals        Yes
Non-snapshot-based Incrementals    No
Document Hierarchy                 No

Connector Features

The Apache Kafka connector has the following features:

  • Extract documents from multiple Kafka topics.
  • Crawl a single server or multiple servers.
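
As an illustration of the two features above, the following minimal sketch shows how a list of servers and a list of topics could be turned into client settings. It is not the connector's own configuration format: the function name `build_client_settings`, the host names, and the topic names are hypothetical. Only the property names `bootstrap.servers` and `security.protocol` are standard Apache Kafka client settings.

```python
def build_client_settings(servers, topics):
    """Return (Kafka client properties, de-duplicated topic list).

    Hypothetical helper for illustration only; it does not reflect how the
    connector stores its configuration.
    """
    if not servers:
        raise ValueError("at least one server is required")
    if not topics:
        raise ValueError("at least one topic is required")

    # Each server must be written as host:port.
    for server in servers:
        host, sep, port = server.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {server!r}")

    properties = {
        # Standard Kafka client property: comma-separated list of servers.
        "bootstrap.servers": ",".join(servers),
        # PLAINTEXT means no encryption and no authentication.
        "security.protocol": "PLAINTEXT",
    }
    # Drop repeated topic names while keeping their original order.
    unique_topics = list(dict.fromkeys(topics))
    return properties, unique_topics


# Example with made-up hosts and topics.
properties, topics = build_client_settings(
    ["kafka1.example.com:9092", "kafka2.example.com:9092"],
    ["orders", "invoices", "orders"],
)
print(properties["bootstrap.servers"])
print(topics)
```

A single server is simply a one-element list, so the same settings cover both the single-server and the multi-server case.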


Content Crawled


The Apache Kafka connector is able to crawl the following objects:


Name          Type                                                 Content Fetch and Extraction    Description
Topic         System.String                                        Yes                             The topic name
Partitions    System.Collections.Generic.List<PartitionMetadata>   Yes                             Metadata for the partitions of the topic.
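
To make the shape of these two objects concrete, here is a small sketch that models a crawled topic as a name plus a list of partition metadata. The class names and the fields `partition_id`, `leader`, and `replicas` are assumptions chosen for illustration; the page above does not list the actual fields of `PartitionMetadata`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PartitionMetadata:
    """Hypothetical metadata for one partition (field names are assumed)."""
    partition_id: int
    leader: int
    replicas: List[int] = field(default_factory=list)


@dataclass
class CrawledTopic:
    """A topic as described in the table: a name and its partition metadata."""
    topic: str
    partitions: List[PartitionMetadata] = field(default_factory=list)

    def partition_ids(self) -> List[int]:
        # Return the partition ids in ascending order.
        return sorted(p.partition_id for p in self.partitions)


# Example with made-up values.
orders = CrawledTopic(
    topic="orders",
    partitions=[
        PartitionMetadata(partition_id=1, leader=2, replicas=[2, 3]),
        PartitionMetadata(partition_id=0, leader=1, replicas=[1, 2]),
    ],
)
print(orders.topic, orders.partition_ids())
```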

Limitations


The Apache Kafka Connector has the following limitations:

  • A REST API is not yet implemented; it is planned for future development.
  • No authentication mechanisms have been implemented to date.

