WebHDFS configuration


WebHDFS must be enabled on the cluster in order to use this publisher.
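One way to check that WebHDFS is reachable is to request the status of the root path and look at the HTTP status code. The sketch below is only an illustration: `webhdfs_status` is a hypothetical helper name, and the base URL is a placeholder for your NameNode's HTTP address.

```shell
# Hypothetical helper: print the HTTP status code returned by the WebHDFS
# endpoint for a GETFILESTATUS request on the root path.
webhdfs_status() {
  base_url="$1"   # placeholder, e.g. http://<host>:<port>
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  curl -s -o /dev/null -w '%{http_code}' "${base_url}/webhdfs/v1/?op=GETFILESTATUS"
}

# Usage (placeholders):
# webhdfs_status "http://<host>:<port>"
```

A `200` suggests that WebHDFS is answering. On a Kerberized cluster an unauthenticated request is expected to return `401` instead, which still indicates that the endpoint exists.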

Grant Read Permissions to Crawl Path


READ permission must be granted on the path to be crawled; the connector cannot retrieve any data from a restricted path.
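As a rough sketch, read access could be granted to a single user with an HDFS ACL. The helper name `grant_crawl_read`, the user, and the path are all placeholders, and this assumes ACLs are enabled on the NameNode (`dfs.namenode.acls.enabled`); if they are not, `hdfs dfs -chmod` is the usual alternative.

```shell
# Hypothetical helper: give one user read access to a crawl path via HDFS ACLs.
# r-x is applied recursively: r to read files and list directories,
# x to traverse directories.
grant_crawl_read() {
  crawl_user="$1"   # user the connector authenticates as (assumption)
  crawl_path="$2"   # HDFS path to be crawled
  hdfs dfs -setfacl -R -m "user:${crawl_user}:r-x" "$crawl_path"
}

# Usage (placeholders; run as a user allowed to change ACLs on the path):
# grant_crawl_read <crawl-user> <path-to-crawl>
# hdfs dfs -getfacl <path-to-crawl>   # inspect the resulting ACL
```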

Kerberized Clusters


On Kerberized clusters, a delegation token is required to crawl any path in HDFS. To obtain this token:

  1. SSH into your cluster.
  2. Run:

    Code Block
    $ kinit <your-user-principal>
    $ curl -i --negotiate -u : "http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN"
    ...
    {"Token":{"urlString":"<A-VERY-LONG-TOKEN>"}}
  3. Copy the value of the "Token" field and set it in the connector configuration.
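The copy step can be scripted. The following is a minimal sketch, assuming the response body has the shape shown above and that the token string itself contains no double quotes; `extract_token` is a hypothetical helper, not part of the connector or of Hadoop.

```shell
# Hypothetical helper: pull the "urlString" value out of a
# GETDELEGATIONTOKEN response body passed as the first argument.
extract_token() {
  printf '%s\n' "$1" |
    sed -n 's/.*"urlString"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with a canned response in the format shown above.
response='{"Token":{"urlString":"<A-VERY-LONG-TOKEN>"}}'
extract_token "$response"

# Against a live cluster (placeholders, after kinit), the body could be
# captured without headers by dropping -i and adding -s:
# response=$(curl -s --negotiate -u : "http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN")
```

The WebHDFS REST API accepts such a token through the `delegation=<token>` query parameter, which can be used to confirm that the token works before it is pasted into the connector configuration.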