The Azure Data Lake Connector can be configured using the Aspire Admin UI. It requires the following entities to be created:

  • Credential
  • Connection
  • Connector
  • Seed


Create Credential 


  1. In the Aspire Admin UI, go to the Credentials page.
  2. All existing credentials will be listed. Click the New button.
  3. Enter the new credential description.
  4. Select Azure Data Lake from the Type list.
    1. Authorization Token End Point: OAuth 2 endpoint supplied by Azure for your app, in the format https://login.microsoftonline.com/[mykey]/oauth2/token
    2. Account Name: Storage Account name
    3. Application ID: Client ID for your application
    4. Application Secret: Key supplied by Azure
    5. Fully Qualified Domain Name or FQDN: Full path of your Data Lake domain, in the format [mydomain].azuredatalakestore.net
    6. Tenant ID: Tenant ID for your application
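
The two formatted fields above can be sanity-checked before they are pasted into the form. The following is a minimal sketch, assuming the formats given in the field descriptions; the function name and the regular expressions are hypothetical and are not part of Aspire.

```python
import re
from typing import List

# Both patterns are assumptions derived from the formats quoted in the field list:
#   https://login.microsoftonline.com/[mykey]/oauth2/token
#   [mydomain].azuredatalakestore.net
TOKEN_ENDPOINT_RE = re.compile(
    r"^https://login\.microsoftonline\.com/[^/\s]+/oauth2/token$"
)
FQDN_RE = re.compile(r"^[A-Za-z0-9-]+\.azuredatalakestore\.net$")


def check_credential_fields(token_endpoint: str, fqdn: str) -> List[str]:
    """Return a list of problems; an empty list means both values look well-formed."""
    problems = []
    if not TOKEN_ENDPOINT_RE.match(token_endpoint):
        problems.append(
            "Authorization Token End Point should look like "
            "https://login.microsoftonline.com/[mykey]/oauth2/token"
        )
    if not FQDN_RE.match(fqdn):
        problems.append("FQDN should look like [mydomain].azuredatalakestore.net")
    return problems
```

This only checks the shape of the values; it does not contact Azure or confirm that the credential is valid.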


Create Connection 


  1. In the Aspire Admin UI, go to the Connections page.
  2. All existing connections will be listed. Click the New button.
  3. Enter the new connection description. 
  4. Select Azure Data Lake from the Type list.
    1. Scan all Filesystems: Select if all file systems are to be scanned
    2. File System Name: Specify the name of the file system
    3. Index Containers: Select if folders are to be indexed
    4. Scan Recursively: Select if sub-folders are to be scanned
    5. Scan Excluded Items: If selected, the scanner will scan the sub-items of container items that have been excluded by a pattern (because the item matches an exclude pattern or because it doesn't match an include pattern)
    6. Include patterns: Specify regex patterns for the display URLs to include
    7. Exclude patterns: Specify regex patterns for the display URLs to exclude
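
The interaction between the include patterns, the exclude patterns, and Scan Excluded Items can be sketched as follows. This is an illustration of the rule described above, not Aspire's implementation: the function names are hypothetical, and whether Aspire matches a pattern anywhere in the display URL (as `re.search` does here) or against the whole URL is an assumption.

```python
import re
from typing import List


def is_excluded(url: str, include: List[str], exclude: List[str]) -> bool:
    """Excluded if the URL matches an exclude pattern, or if include
    patterns are defined and the URL matches none of them."""
    if any(re.search(pattern, url) for pattern in exclude):
        return True
    if include and not any(re.search(pattern, url) for pattern in include):
        return True
    return False


def scans_sub_items(url: str, include: List[str], exclude: List[str],
                    scan_excluded_items: bool) -> bool:
    """Whether the sub-items of a container at `url` would be scanned.

    With Scan Excluded Items selected, an excluded container is still
    descended into; otherwise its sub-items are skipped.
    """
    return scan_excluded_items or not is_excluded(url, include, exclude)
```

For example, with the exclude pattern `/tmp$`, a folder whose display URL ends in `/tmp` is itself excluded, but its sub-items are still scanned when Scan Excluded Items is selected.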



Create Connector Instance


For the creation of the Connector object using the Admin UI, check this page.


Create Seed 


  1. In the Aspire Admin UI, go to the Seeds page.
  2. All existing seeds will be listed. Click the New button.
  3. Enter the new seed description.
  4. Select Azure Data Lake from the Type list.
    1. Collect from Root: With this option, the connector crawls from the root directory ("/") of the supplied Azure Data Lake FQDN.
    2. Use Seeds File: This option collects the paths from a file at a supplied location, which is very useful if the paths change constantly and are controlled by a third-party process. Paths should be listed one per line in the form /folder/sub-folder
      1. For Windows: D:\folder\folder1\paths.txt
      2. For Linux: /home/user/folder/folder1/paths.txt
    3. Specific Paths: This option lets the admin submit any number of paths, each in the format /folder/sub-folder
    4. Specific Path: Specific path to crawl. If “Scan all Filesystems” was checked in the Connection, this path will be ignored.
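
Because a seeds file may be generated by another process, it can help to check it before the crawl. The following is a minimal sketch, assuming the one-path-per-line /folder/sub-folder format described above; the function name is hypothetical, and skipping blank lines is an assumption rather than documented Aspire behavior.

```python
from typing import List


def read_seed_paths(text: str) -> List[str]:
    """Parse seeds-file content: one path per line, each starting with '/'."""
    paths = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if not line.startswith("/"):
            raise ValueError(
                f"line {line_no}: expected a path like /folder/sub-folder, got {line!r}"
            )
        paths.append(line)
    return paths
```

To check a real file, read it first and pass the content in, for example `read_seed_paths(open("/home/user/folder/folder1/paths.txt").read())`.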

