The Box Connector will crawl content from a Box.com repository.
The connector retrieves the supported elements through the Box RESTful Content API (version 2.0). Authentication goes through the Box API, which uses OAuth 2, with a JWT configuration.
The Box connector supports crawling the following repositories:
Repository | Version | Connector Version |
---|---|---|
Box.com | All | 5.3 |
To access a Box repository, a user account with sufficient privileges must be supplied.
To access the Box APIs, you'll need to create a Box Application (Client ID and Client Secret).
Select "API permissions" > "Add a permission" > "Microsoft APIs".
Authentication
The connector supports two types of authentication: a Box Application's Client ID and Client Secret (Client Credentials Grant), or JSON Web Token (JWT).
Client Credential Grant
This server-side authentication does not require end-user interaction and, if granted the proper privileges, can be used to act on behalf of any user in an enterprise. This matters because it allows all the content to be extracted. Identity is validated using the application's client ID and client secret.
See the Box guide at https://developer.box.com/guides/authentication/client-credentials/client-credentials-setup/ for the steps to configure a Custom App, using the following settings:
Select 'Server Authentication (Client Credentials Grant)' as the authentication method.
In Application Access, choose App Access + Enterprise Access.
For Application Scopes, select 'Read all files and folders stored in Box', 'Manage users', 'Manage groups', 'Manage enterprise properties'.
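To make the Client Credentials Grant concrete, the sketch below builds the token request that a Custom App configured as above would send to Box's OAuth 2 token endpoint. It is an illustration only, not the connector's own code: the function name `build_ccg_token_request` and the placeholder credentials are assumptions, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

# Box's OAuth 2 token endpoint.
TOKEN_URL = "https://api.box.com/oauth2/token"


def build_ccg_token_request(client_id, client_secret, enterprise_id):
    """Build (but do not send) a Client Credentials Grant token request.

    The function name and argument names are illustrative.
    """
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Authenticate as the enterprise's service account.
        "box_subject_type": "enterprise",
        "box_subject_id": enterprise_id,
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with the Custom App's real credentials.
    request = build_ccg_token_request("my-client-id", "my-client-secret", "123456")
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the JSON response is expected to carry an `access_token` field that is then sent as a Bearer token on subsequent API requests.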
JWT
Server-side authentication using JSON Web Tokens (JWT) does not require end-user interaction and, if granted the proper privileges, can be used to act on behalf of any user in an enterprise. This matters because it allows all the content to be extracted. Identity is validated using a JWT assertion and a public/private keypair.
Follow the JWT configuration steps at https://developer.box.com/guides/authentication/jwt/jwt-setup/ to create a Custom App.
For application authentication, select Server Authentication (with JWT).
Generate a keypair. Make sure to write down your Application Key at the time of creation. It will not be shown again after you exit the portal.
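As a rough illustration of what the JWT assertion contains, the sketch below assembles the header and claims that Box expects and produces the unsigned `header.payload` string. It is not the connector's implementation: the function names are hypothetical, and the RS256 signature over the signing input (made with the private key from the generated keypair) is deliberately left out because it needs a cryptography library.

```python
import base64
import json
import secrets
import time

# The assertion's audience is Box's OAuth 2 token endpoint.
BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_unsigned_assertion(client_id, enterprise_id, public_key_id, now=None):
    """Return (header, claims, signing_input) for a Box JWT assertion.

    Names are illustrative; the caller still has to sign signing_input
    with RS256 and append the signature as a third dot-separated segment.
    """
    now = int(time.time()) if now is None else int(now)
    header = {"alg": "RS256", "typ": "JWT", "kid": public_key_id}
    claims = {
        "iss": client_id,              # the application's client ID
        "sub": enterprise_id,          # act as the enterprise service account
        "box_sub_type": "enterprise",
        "aud": BOX_TOKEN_URL,
        "jti": secrets.token_hex(16),  # unique identifier for this assertion
        "exp": now + 45,               # short-lived; Box caps this at 60 seconds
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return header, claims, signing_input


if __name__ == "__main__":
    _, claims, signing_input = build_unsigned_assertion("my-client-id", "123456", "abc123")
    print(claims["aud"])
    print(signing_input)
```

The signed assertion is then exchanged for an access token by posting it to the same token endpoint with the `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type.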
Name | Supported |
---|---|
Content Crawling | Yes |
Identity Crawling | Yes |
Snapshot-based Incrementals | No |
Non-snapshot-based Incrementals | Yes |
Document Hierarchy | Yes |
Some of the features of the Box connector include:
The Box connector retrieves several types of objects:
Name | Type | Relevant Metadata | Content Fetch and Extraction | Description |
---|---|---|---|---|
Folder | container | Yes | | Directories that contain the files. Each directory is scanned for further subfolders and documents. Collaborators are included as ACLs (Access Control Lists). |
File | document | Yes | | Files stored in folders and subfolders. Content, tasks, and comments are extracted as metadata fields. |
Box Note | document | Yes | | Type of document |
Bookmark | document | Yes | | Web links |
Google Doc | document | Yes | | Type of document |
Google spreadsheet | document | Yes | | Type of document |
Word doc | document | Yes | | Type of document |
PowerPoint doc | document | Yes | | Type of document |
Excel spreadsheet | document | Yes | | Type of document |
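The folder scan described in the table can be pictured as a breadth-first walk in which every folder is listed and its subfolders are queued for scanning in turn. The sketch below is a minimal illustration under stated assumptions: `list_items` is a hypothetical callable standing in for the Box "list items in folder" call, and the in-memory tree is made-up sample data. The item types `folder`, `file`, and `web_link`, and the root folder ID `"0"`, follow the Box API.

```python
from collections import deque


def crawl(root_id, list_items):
    """Walk a folder tree breadth-first.

    list_items(folder_id) is a hypothetical callable that returns a list
    of dicts with 'type', 'id' and 'name' keys for one folder.
    """
    queue = deque([root_id])
    found = []
    while queue:
        folder_id = queue.popleft()
        for item in list_items(folder_id):
            found.append((item["type"], item["id"], item["name"]))
            if item["type"] == "folder":
                # Containers are scanned again for subfolders and documents.
                queue.append(item["id"])
    return found


if __name__ == "__main__":
    # Made-up sample tree keyed by folder ID; "0" is the Box root folder.
    sample_tree = {
        "0": [
            {"type": "folder", "id": "11", "name": "Reports"},
            {"type": "file", "id": "21", "name": "readme.txt"},
        ],
        "11": [
            {"type": "file", "id": "22", "name": "q1.xlsx"},
            {"type": "web_link", "id": "31", "name": "Wiki"},
        ],
    }
    for entry in crawl("0", lambda folder_id: sample_tree.get(folder_id, [])):
        print(entry)
```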
For incremental crawls, the connector uses the event `stream_position` value, so only the changes reported between that position and the current moment are crawled.
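A minimal sketch of this idea follows, assuming a hypothetical `fetch_events` callable that wraps the Box events endpoint and returns one page as a dict with `entries` and `next_stream_position` (the field names used by the Box events API). The loop pages forward from the stored position until an empty page is returned, then hands back the position to store for the next incremental crawl. The sample pages and event types are illustrative only.

```python
def collect_changes(fetch_events, stream_position):
    """Gather all events reported after stream_position.

    fetch_events(position) is a hypothetical callable returning a dict
    with 'entries' (a list) and 'next_stream_position'.
    Returns (events, position_to_store_for_next_crawl).
    """
    changes = []
    while True:
        page = fetch_events(stream_position)
        entries = page.get("entries", [])
        changes.extend(entries)
        next_position = page["next_stream_position"]
        if not entries:
            # An empty page means the stream is drained up to "now".
            return changes, next_position
        stream_position = next_position


if __name__ == "__main__":
    # Made-up pages keyed by the position they are requested with.
    pages = {
        "100": {"entries": [{"event_type": "ITEM_UPLOAD"}], "next_stream_position": "200"},
        "200": {"entries": [{"event_type": "ITEM_TRASH"}], "next_stream_position": "300"},
        "300": {"entries": [], "next_stream_position": "300"},
    }
    events, position = collect_changes(lambda pos: pages[pos], "100")
    print(len(events), position)
```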
The following events are supported:
Folder collaborators, group memberships, and users make up the ACLs and Identity Fetcher values, which are part of the security of the content.
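As a hedged illustration of how collaborators could translate into ACL entries, the sketch below maps Box collaboration objects (which carry an `accessible_by` principal of type `user` or `group`, and a `role`) to a flat list of entries. The output shape and the function name `collaborations_to_acls` are assumptions made for this example, not the connector's actual ACL format.

```python
def collaborations_to_acls(collaborations):
    """Map Box collaboration objects to simple ACL entries.

    The returned dict layout is hypothetical and only for illustration.
    """
    acls = []
    for collab in collaborations:
        principal = collab.get("accessible_by")
        if not principal:
            # Skip collaborations that have no resolved user or group.
            continue
        acls.append(
            {
                "principal_type": principal["type"],  # "user" or "group"
                "principal_id": principal["id"],
                "role": collab.get("role"),
            }
        )
    return acls


if __name__ == "__main__":
    # Made-up sample collaborations on one folder.
    sample = [
        {"accessible_by": {"type": "user", "id": "1001"}, "role": "editor"},
        {"accessible_by": {"type": "group", "id": "5001"}, "role": "viewer"},
        {"accessible_by": None, "role": "viewer"},
    ]
    for acl in collaborations_to_acls(sample):
        print(acl)
```

Group entries would still need to be expanded through group memberships, which is the part the identity crawl covers.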