The WebHDFS feature must be enabled in order to use this connector.
READ permissions must be granted on the path to be crawled; if the path is restricted, the connector will not be able to retrieve any data.
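Both prerequisites above can be checked before configuring the connector. The following is a minimal sketch, not part of the connector itself: it issues a WebHDFS `LISTSTATUS` request and interprets the HTTP status code. The helper names, host, port, and user are hypothetical placeholders, and the `user.name` parameter only applies to clusters without Kerberos security.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode


def liststatus_url(host, port, path, user=None):
    """Build a WebHDFS LISTSTATUS URL (helper name is illustrative)."""
    params = {"op": "LISTSTATUS"}
    if user:
        # user.name is only honored when the cluster is NOT Kerberized
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(params)}"


def describe_status(code):
    """Map an HTTP status code from WebHDFS to a short diagnosis."""
    if code == 200:
        return "readable"
    if code == 401:
        return "authentication required"
    if code == 403:
        return "permission denied"
    if code == 404:
        return "path not found"
    return f"unexpected status {code}"


def check_path(host, port, path, user=None, timeout=10):
    """Send the request and report whether the path can be listed."""
    url = liststatus_url(host, port, path, user)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return describe_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the code is still informative
        return describe_status(err.code)
```

For example, `check_path("<host>", <port>, "/path/to/crawl", user="<user>")` returning "permission denied" suggests that READ permission is missing on that path for the given user.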
For Kerberized clusters, a delegation token is required in order to crawl any path within HDFS. To obtain this token you must:
Run:
$ kinit <your-user-principal>
$ curl -i --negotiate -u : "http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN"
...
{"Token":{"urlString":"<A-VERY-LONG-TOKEN>"}}
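The token is the value of `urlString` in the JSON body of the response. As a rough sketch, assuming the `curl -i` output has been captured as text, the token can be extracted and then passed to later WebHDFS requests through the `delegation` query parameter. The function names below are hypothetical and only illustrate the idea.

```python
import json
import re
from urllib.parse import urlencode


def extract_token(curl_output):
    """Return Token.urlString from captured `curl -i` output.

    Assumes the output is one or more header blocks separated by blank
    lines, followed by the JSON body as the last block.
    """
    body = re.split(r"\r?\n\r?\n", curl_output.strip())[-1]
    return json.loads(body)["Token"]["urlString"]


def url_with_delegation(host, port, path, op, token):
    """Build a WebHDFS URL that authenticates with a delegation token."""
    query = urlencode({"op": op, "delegation": token})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"
```

A request built this way does not need `--negotiate`, since the delegation token takes the place of the Kerberos handshake for as long as the token remains valid.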