The SharePoint Online connector will crawl content from any SharePoint Online site collection URL. The connector will retrieve Sites, Lists, Folders, List Items and Attachments, as well as other pages (in .aspx format). This connector supports SharePoint running in the Microsoft 365 offering.
This is not an O365 connector; the individual repository offerings within O365, such as OneDrive, Calendar, Tasks, and Yammer, have their own connectors.
The SharePoint Online connector supports crawling the following repositories:
Repository | Version | Connector Version |
---|---|---|
SharePoint | Microsoft 365 | 5.0 |
The connector offers two authentication options to access the SharePoint REST API: user account or Azure AD application.
To configure a user crawl account, use the following guide.
To use a user crawl account on multiple site collections, repeat the steps in that guide on each site collection where access is needed.
To configure an Azure AD application for crawling, see Azure AD Access for SharePoint Online.
An Azure AD application grants access to all site collections under the tenant.
The connector uses SharePoint's REST API, so the Aspire Worker nodes must have internet access to connect to the Microsoft 365 environment. Optionally, you can configure a proxy on the connector to enable internet access.
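Before starting a crawl, it can help to confirm that a worker node can actually reach the SharePoint REST API, directly or through a proxy. The following is a minimal sketch of such a check, not the connector's own implementation. It assumes you already hold a valid OAuth bearer token for the tenant (for example, one issued to the Azure AD application); the site URL, token, and proxy address are placeholders, and the function names are hypothetical.

```python
import urllib.request
from typing import Optional


def build_lists_request(site_url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request for the lists of a site via the SharePoint REST API."""
    # /_api/web/lists is the standard SharePoint REST endpoint for a site's lists.
    endpoint = site_url.rstrip("/") + "/_api/web/lists"
    headers = {
        "Accept": "application/json;odata=nometadata",
        # Assumption: the token was obtained elsewhere and is still valid.
        "Authorization": "Bearer " + access_token,
    }
    return urllib.request.Request(endpoint, headers=headers, method="GET")


def build_opener(proxy_url: Optional[str] = None) -> urllib.request.OpenerDirector:
    """Create an opener that routes traffic through proxy_url when one is given."""
    handlers = []
    if proxy_url:
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    # Without an explicit handler, urllib falls back to the proxy settings
    # found in the environment (if any).
    return urllib.request.build_opener(*handlers)


def can_reach_sharepoint(
    site_url: str,
    access_token: str,
    proxy_url: Optional[str] = None,
    timeout: float = 10.0,
) -> bool:
    """Return True if the site answers the REST call with HTTP 200."""
    opener = build_opener(proxy_url)
    request = build_lists_request(site_url, access_token)
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: this covers network
        # failures, proxy errors, and 401/403 authentication rejections.
        return False
```

Calling `can_reach_sharepoint("https://<tenant>.sharepoint.com/sites/<site>", token, proxy_url="http://<proxy-host>:<port>")` from a worker node gives a quick yes/no answer. A `False` result does not distinguish a connectivity problem from an authentication problem, so check the proxy settings and the token separately if it fails.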
Name | Supported |
---|---|
Content Crawling | yes |
Identity Crawling | yes |
Snapshot-based Incrementals | yes |
Non-snapshot-based Incrementals | yes |
Document Hierarchy | yes |
The SharePoint connector has the following features:
The File System connector is able to crawl the following objects:
Name | Type | Relevant Metadata | Content Fetch & Extraction | Description |
---|---|---|---|---|
Folder | container | | NA | The directories of the file system. Each directory will be scanned to retrieve more directories or files. |
File | document | | yes | The files contained by the directories in the crawled file system. |
The SharePoint Online connector has the following limitations: