Introduction


The Apache Kafka connector crawls content from Apache Kafka topics over the PLAINTEXT protocol.

Environment and Access Requirements


Repository Support

The Apache Kafka connector supports crawling the following repositories:

Repository | Version | Connector Version
Windows | all | Kafka 5.0.2
Linux | all | 5.0.2

Account Privileges

For the Apache Kafka connector to be able to crawl content, the Aspire Worker nodes must be run with an account with .

Environment Requirements

Requirement | Version
Apache Kafka | 3.0.0

Framework and Connector Features


Framework Features

Name | Supported
Content Crawling | Yes
Identity Crawling | No
Snapshot-based Incrementals | Yes
Non-snapshot-based Incrementals | No
Document Hierarchy | No

Connector Features

The Apache Kafka connector has the following features:

  • Extract documents from multiple Kafka topics.
  • Select between one and multiple servers to crawl.
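
As a rough illustration of the two features above, the sketch below builds a client configuration for one or more servers and a list of topics. The keys `bootstrap.servers`, `security.protocol`, and `group.id` are standard Kafka client settings; the helper name, host names, group ID, and topic names are hypothetical and are not taken from the connector's own configuration.

```python
from typing import Dict, List


def build_consumer_config(servers: List[str], group_id: str) -> Dict[str, str]:
    """Build a Kafka client configuration for one or more servers (hypothetical helper)."""
    if not servers:
        raise ValueError("at least one server is required")
    return {
        # A comma-separated list lets the client reach several servers.
        "bootstrap.servers": ",".join(servers),
        # PLAINTEXT means no TLS and no authentication on the connection.
        "security.protocol": "PLAINTEXT",
        "group.id": group_id,
    }


# Hypothetical hosts, group ID, and topic names, for illustration only.
config = build_consumer_config(["kafka-1:9092", "kafka-2:9092"], "aspire-crawl")
topics = ["orders", "invoices"]
```

A single-server crawl would pass a one-element list, which produces a `bootstrap.servers` value with no comma.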


Content Crawled


The Apache Kafka connector is able to crawl the following objects:


Name | Type | Relevant Metadata | Content Fetch and Extraction | Description
Topic | System.String | partition, offset, topic, value | Yes | The topic name
Partitions | System.Collections.Generic.List<PartitionMetadata> | | Yes | Metadata for the partitions of the topic.
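
To make the metadata fields concrete, here is a minimal sketch of how a single Kafka record might be mapped to a crawled document carrying `topic`, `partition`, `offset`, and `value`. The `KafkaRecord` class, the `to_document` function, and the ID format are assumptions made for illustration; the connector's actual document layout is not described here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KafkaRecord:
    """Hypothetical stand-in for a record read from a topic."""
    topic: str
    partition: int
    offset: int
    value: str


def to_document(record: KafkaRecord) -> Dict[str, object]:
    """Map a record to a document dictionary (assumed layout, not the connector's)."""
    return {
        # Topic, partition, and offset together identify one record in a cluster.
        "id": f"{record.topic}/{record.partition}/{record.offset}",
        "metadata": {
            "topic": record.topic,
            "partition": record.partition,
            "offset": record.offset,
        },
        # The record value becomes the document content.
        "content": record.value,
    }


doc = to_document(KafkaRecord(topic="orders", partition=0, offset=42, value="hello"))
```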

Limitations


The Apache Kafka connector has the following limitations:

  • A REST API implementation is planned for future development.
  • No authentication mechanisms have been implemented to date.