The Azure Blob Storage connector crawls content from an Azure Blob container repository.


Introduction


Microsoft Azure Blob storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data, such as text or binary data. 

For more information about Azure Blob storage, see the official Microsoft Azure Blob Storage documentation.

Environment and Access Requirements


Repository Support

The Azure Blob Storage connector supports crawling the following repositories:

Repository | Version | Connector Version
Azure Blob Storage | All | 5.1

Environment Requirements

To access the Azure Blob storage, a connection must be established to a valid Azure storage account.

Microsoft Azure storage is a service that is independent of Accenture Aspire technologies and licenses. To set up an account, see Microsoft's "Create a storage account" documentation.


User Account Requirements

To access the Azure Blob container, a connection string must be supplied. See Microsoft's "Manage Storage Account Access Keys" documentation for the steps to obtain the connection string.
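Before a connection string is pasted into the connector configuration, it can help to check that it is well formed. The following is a minimal sketch, not part of the connector: it assumes the account-key form of the connection string (semicolon-separated `Key=Value` settings such as `AccountName` and `AccountKey`), and the function names and the sample values are hypothetical. Connection strings based on a shared access signature use different settings and would need a different check.

```python
from typing import Dict, List


def parse_connection_string(conn_str: str) -> Dict[str, str]:
    """Split a connection string into its key/value settings."""
    settings: Dict[str, str] = {}
    for part in conn_str.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: Base64 account keys often end in "=" padding.
        key, sep, value = part.partition("=")
        if not sep or not key:
            # Do not echo the segment itself, since it may contain a secret.
            raise ValueError("Malformed segment in connection string")
        settings[key] = value
    return settings


def missing_required(settings: Dict[str, str]) -> List[str]:
    """Return the account-key settings that are absent or empty."""
    return [name for name in ("AccountName", "AccountKey") if not settings.get(name)]


# Hypothetical sample values, for illustration only.
example = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=examplestorage;"
    "AccountKey=ZmFrZS1rZXk=;"
    "EndpointSuffix=core.windows.net"
)

settings = parse_connection_string(example)
print(missing_required(settings))
```

An empty list from `missing_required` means both settings are present; it does not confirm that the key is valid for the account.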


Framework and Connector Features


Framework Features

Name | Supported
Content Crawling | Yes
Identity Crawling | Use Azure Identity Connector
Snapshot-based Incrementals | Yes
Non-snapshot-based Incrementals | No
Document Hierarchy | Yes

Connector Features

The Azure Blob Storage connector has the following features:

  • Performs incremental crawling (so that only new/updated documents are indexed)
  • Fetches Object ACLs (Access Control Lists) for Azure document-level security
  • Runs from any machine with access to the given Azure Blob Storage source
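The idea behind snapshot-based incremental crawling can be sketched as a comparison between the previous crawl's snapshot and the current listing. This is an illustrative sketch only, not the connector's actual implementation: it assumes each blob is tracked by its name and a change signature (for example the blob's ETag), and `CrawlDelta` and `diff_snapshots` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class CrawlDelta(NamedTuple):
    added: List[str]    # present now, absent from the previous snapshot
    updated: List[str]  # present in both, but the signature changed
    deleted: List[str]  # present in the previous snapshot, absent now


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> CrawlDelta:
    """Compare two {blob name: signature} snapshots."""
    added = sorted(name for name in current if name not in previous)
    updated = sorted(
        name for name in current
        if name in previous and current[name] != previous[name]
    )
    deleted = sorted(name for name in previous if name not in current)
    return CrawlDelta(added, updated, deleted)


# Hypothetical snapshots, for illustration only.
previous = {"docs/a.txt": "etag-1", "docs/b.txt": "etag-2", "docs/c.txt": "etag-3"}
current = {"docs/a.txt": "etag-1", "docs/b.txt": "etag-9", "docs/d.txt": "etag-4"}

delta = diff_snapshots(previous, current)
print(delta)
```

Only the blobs in `added` and `updated` would need to be fetched and indexed again, and those in `deleted` removed from the index; unchanged blobs such as `docs/a.txt` are skipped.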


Content Crawled


The Azure Blob Storage connector can crawl the following objects:

Name | Type | Relevant Metadata | Content Fetch and Extraction | Description
Container | container | | N/A | Organizes a set of blobs, similar to a directory in a file system
Blob | document | | Yes | Stores text and binary data
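To illustrate how a container and its blobs can form a document hierarchy, the sketch below derives a blob's ancestor chain from its container and name. It relies on the common convention that "/" in a blob name marks a virtual directory; whether the connector builds its hierarchy exactly this way is an assumption, and `hierarchy_path` is a hypothetical helper.

```python
from typing import List


def hierarchy_path(container: str, blob_name: str) -> List[str]:
    """Return the chain from the container down to the blob itself."""
    # Treat "/" as a virtual directory separator and ignore empty segments
    # caused by leading, trailing, or doubled slashes.
    segments = [segment for segment in blob_name.split("/") if segment]
    return [container] + segments


# Hypothetical container and blob names, for illustration only.
path = hierarchy_path("reports", "2023/q1/summary.pdf")
print(path)
```

Here the container is the root of the hierarchy and the last element is the blob, which is the item crawled as a document.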

