

WebHDFS configuration


WebHDFS must be enabled on the cluster for this connector to work.
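One way to confirm that WebHDFS is reachable is to issue a simple REST call against the NameNode. The sketch below only builds the request URL; the helper name `webhdfs_url`, the host `namenode.example.com`, and the port `9870` are illustrative assumptions, not values taken from this page.

```shell
# Hypothetical helper: build a WebHDFS REST URL from host, port, path and operation.
webhdfs_url() {
  printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Print the URL for listing the root directory (host and port are placeholders).
webhdfs_url namenode.example.com 9870 / LISTSTATUS

# Against a real cluster (not run here), the URL can be passed to curl:
#   curl -i "$(webhdfs_url namenode.example.com 9870 / LISTSTATUS)"
# An HTTP 200 response with a JSON body suggests that WebHDFS is enabled.
```

If the call fails, the cluster's HDFS configuration and the NameNode HTTP port are the first things to check.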

Grant Read Permissions to Crawl Path


READ permission on the path to be crawled is required: if the path is restricted, the connector cannot retrieve any data.
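As a rough illustration, read access could be granted with an HDFS ACL entry. This is a sketch under assumptions: the helper `grant_read_cmd`, the user `crawler`, and the path `/data/to/crawl` are hypothetical, and ACLs must be enabled on the cluster for `setfacl` to work. The helper only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print the HDFS command that would grant a user
# read (r) and traverse (x) access to a crawl path, recursively.
grant_read_cmd() {
  printf 'hdfs dfs -setfacl -R -m user:%s:r-x %s\n' "$1" "$2"
}

# Show the command for an example user and path (both are placeholders).
grant_read_cmd crawler /data/to/crawl

# After reviewing it, run the printed command as the path owner or an HDFS
# superuser, then confirm the new entry (not run here):
#   hdfs dfs -getfacl /data/to/crawl
```

Where ACLs are not available, adjusting the path's owner, group, or mode bits with `hdfs dfs -chmod` is an alternative.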

Kerberized Clusters


On Kerberized clusters, a delegation token is required to crawl any path in HDFS. To obtain the token:

  1. SSH into your cluster.
  2. Run:

    $ kinit <your-user-principal>
    $ curl -i --negotiate -u : "http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN"
    ...
    {"Token":{"urlString":"<A-VERY-LONG-TOKEN>"}}
  3. Copy the token value (the "urlString" string inside the "Token" field) and set it in the connector configuration.
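The copy step can be sketched in shell. The helper `extract_token` and the sample value `EXAMPLE-TOKEN` are hypothetical; the sketch assumes the response body has the shape shown in step 2 and was captured without headers (for example with `curl -s` instead of `curl -i`).

```shell
# Hypothetical helper: pull the urlString value out of a GETDELEGATIONTOKEN response body.
extract_token() {
  printf '%s\n' "$1" | sed -n 's/.*"urlString"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response body with a placeholder token value.
response='{"Token":{"urlString":"EXAMPLE-TOKEN"}}'
token="$(extract_token "$response")"
echo "$token"

# To sanity-check a real token before configuring the connector, WebHDFS
# accepts it through the "delegation" query parameter (not run here):
#   curl -i "http://<host>:<port>/webhdfs/v1/<path>?delegation=${token}&op=LISTSTATUS"
```

Delegation tokens expire, so a token that worked earlier may need to be renewed or requested again.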

