The Apache Kafka Connector will crawl content from an Apache Kafka repository.



Introduction


The Apache Kafka Connector will crawl content from an Apache Kafka server using the PLAINTEXT protocol.

Environment and Access Requirements


Repository Support

The Apache Kafka connector supports crawling the following repositories:

Repository    Version    Connector Version
Windows       all        5.0.2
Linux         all        5.0.2

Account Privileges

For the Apache Kafka connector to be able to crawl content, the Aspire Worker nodes must be run with an account with the necessary privileges.

Environment Requirements

Requirement     Version
Apache Kafka    3.0.0

Framework and Connector Features


Framework Features

Name                               Supported
Content Crawling                   Yes
Identity Crawling                  No
Snapshot-based Incrementals        Yes
Non-snapshot-based Incrementals    No
Document Hierarchy                 No
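
As a rough illustration of what snapshot-based incrementals mean in general, the sketch below compares the items seen in a previous crawl with those seen in the current one. The function name `diff_snapshots` and the idea of storing one signature string per item id are assumptions made for this example; the page does not describe how Aspire stores or compares its snapshots.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots given as {item_id: signature} dictionaries.

    Hypothetical helper: returns the ids that were added, updated, or
    deleted between the previous crawl and the current one.
    """
    added = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return {"added": added, "updated": updated, "deleted": deleted}


if __name__ == "__main__":
    before = {"orders/0/1": "sig-a", "orders/0/2": "sig-b"}
    after = {"orders/0/2": "sig-c", "orders/0/3": "sig-d"}
    print(diff_snapshots(before, after))
```

Only items whose signature changed, or that appear in just one of the two snapshots, would need to be processed in an incremental crawl.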

Connector Features

The Apache Kafka connector has the following features:

  • Extract documents from multiple Kafka topics.
  • Crawl either a single server or multiple servers.
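
The two features above can be pictured as a small configuration step, sketched here under stated assumptions. The helper names `build_consumer_config` and `parse_topics` and the default group id are hypothetical and are not part of the connector. The keys `bootstrap.servers`, `security.protocol`, and `group.id` are standard Kafka client settings, but the page does not say which settings the connector actually uses.

```python
def parse_topics(raw):
    """Split a comma-separated topic list, dropping blanks and whitespace."""
    return [t.strip() for t in raw.split(",") if t.strip()]


def build_consumer_config(servers, group_id="aspire-kafka-crawl"):
    """Build a Kafka consumer configuration for one or more servers.

    Hypothetical helper: 'servers' is a list of host:port strings.
    """
    if not servers:
        raise ValueError("at least one Kafka server is required")
    return {
        # Kafka clients accept a comma-separated list of host:port pairs.
        "bootstrap.servers": ",".join(s.strip() for s in servers),
        # PLAINTEXT means no encryption and no authentication.
        "security.protocol": "PLAINTEXT",
        "group.id": group_id,
    }


if __name__ == "__main__":
    print(build_consumer_config(["kafka1:9092", "kafka2:9092"]))
    print(parse_topics("orders, invoices"))
```

A dictionary of this shape is what common Kafka client libraries take as consumer configuration, so the same sketch works for a single server or for several.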


Content Crawled


The Apache Kafka connector is able to crawl the following objects:


Name          Type                                                    Relevant Metadata                  Content Fetch and Extraction    Description
Topic         System.String                                           partition, offset, topic, value    Yes                             The topic name
Partitions    System.Collections.Generic.List<PartitionMetadata>      N/A                                Yes                             Metadata for the partitions of the topic.
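
To make the metadata fields concrete, here is a minimal sketch of how one Kafka record could be turned into a document carrying the `topic`, `partition`, `offset`, and `value` fields listed above. The `CrawledRecord` class, the `to_document` function, and the id format are assumptions made for illustration; the page does not describe the document structure the connector actually emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawledRecord:
    """Hypothetical container for one record read from a topic."""
    topic: str
    partition: int
    offset: int
    value: str


def to_document(record):
    """Map a record to a flat document dictionary.

    The id combines topic, partition, and offset, which together
    identify a single record within a Kafka cluster.
    """
    return {
        "id": f"{record.topic}/{record.partition}/{record.offset}",
        "topic": record.topic,
        "partition": record.partition,
        "offset": record.offset,
        "value": record.value,
    }


if __name__ == "__main__":
    print(to_document(CrawledRecord("orders", 0, 42, '{"total": 10}')))
```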

Limitations


The Apache Kafka Connector has the following limitations:

  • A REST API implementation is part of our future development plan.
  • No authentication mechanisms have been implemented to date.