The SMB connector crawls content from a Samba shared folder.

Introduction


The SMB connector can scan and fetch the directories and documents of a Samba shared folder.

Environment and Access Requirements


Repository Support

The SMB connector supports crawling the following repositories:

Repository    Version    Connector Version
Windows       All        5.0
Linux         All        5.0

This component has been officially tested on local Windows and Linux.

Account Privileges

For the SMB connector to be able to crawl, the Aspire Worker nodes must run under a domain account with full read permissions over the shared folder to be crawled.

If the feature to "not change the last access date" is used, the account also requires write permissions.
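The write-permission requirement follows from how such a feature typically has to work: reading a file can update its last access date, so the original timestamp must be written back afterwards, and that is a metadata write. The following is a minimal sketch of the idea on a locally accessible path, not the connector's actual implementation; the function name `read_preserving_atime` is hypothetical.

```python
import os


def read_preserving_atime(path):
    """Read a file, then restore its access and modification timestamps.

    Hypothetical helper for illustration only; the connector's own
    mechanism is not described in this document.
    """
    # Capture the timestamps before the read touches the file.
    before = os.stat(path)

    with open(path, "rb") as handle:
        data = handle.read()

    # Writing the timestamps back changes file metadata. This is the step
    # that can fail for an account that only has read access.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```

With a read-only account the read itself would succeed, but the call to `os.utime` would typically raise `PermissionError`.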

Environment Requirements

The SMB connector was created and tested using the Microsoft SMB2 protocol.

A Samba file server must be installed and set up. The Samba file server enables file sharing across different operating systems over a network.

Framework and Connector Features


Framework Features

Name                               Supported
Content Crawling                   Yes
Identity Crawling                  No
Snapshot-based Incrementals        Yes
Non-snapshot-based Incrementals    No
Document Hierarchy                 Yes
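As a rough illustration of what a snapshot-based incremental crawl means in general: the crawler keeps a record (a snapshot) of every item seen in the previous crawl and compares it with the current scan to decide what was added, updated, or deleted. The sketch below assumes a snapshot is a plain dictionary from path to a `(last_modified, size)` signature; the snapshot format Aspire actually uses is not described here, and `diff_snapshots` is a hypothetical name.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots and classify the changes.

    Both arguments map an item path to a signature tuple, assumed here
    to be (last_modified, size). Returns (added, updated, deleted) as
    sorted lists of paths.
    """
    added = sorted(path for path in current if path not in previous)
    deleted = sorted(path for path in previous if path not in current)
    updated = sorted(
        path
        for path in current
        if path in previous and current[path] != previous[path]
    )
    return added, updated, deleted
```

For example, if the previous snapshot holds `a.txt` and `b.txt`, and the current scan finds a changed `a.txt` plus a new `c.txt`, the function reports `c.txt` as added, `a.txt` as updated, and `b.txt` as deleted.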

Connector Features

The SMB connector has the following features:

  • Document filtering using include and exclude regex patterns.
  • Static ACLs can be added to the crawled documents.
  • Distributed File System support.
  • Security information retrieval.
  • Reading documents without changing the last accessed date.
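The include and exclude filtering listed above can be pictured with a small sketch. It rests on assumptions that this document does not confirm: that exclude patterns take precedence over include patterns, that an empty include list accepts everything, and that patterns are matched anywhere in the path. The name `make_filter` is hypothetical.

```python
import re


def make_filter(include_patterns, exclude_patterns):
    """Build a predicate that decides whether a path should be crawled.

    Assumptions (not confirmed by the connector documentation):
    excludes win over includes, and no include patterns means
    "include everything".
    """
    includes = [re.compile(pattern) for pattern in include_patterns]
    excludes = [re.compile(pattern) for pattern in exclude_patterns]

    def accepts(path):
        # Any matching exclude pattern rejects the path outright.
        if any(rx.search(path) for rx in excludes):
            return False
        # With no include patterns, everything not excluded is accepted.
        return not includes or any(rx.search(path) for rx in includes)

    return accepts
```

A filter built with `make_filter([r"\.pdf$", r"\.docx$"], [r"/archive/"])` would accept `share/docs/report.pdf` and reject both `share/archive/old.pdf` and `share/docs/notes.txt`.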

Content Crawled


The SMB connector is able to crawl the following objects:

Name      Type         Relevant Metadata                Content Fetch & Extraction    Description
Folder    container    Last Modified Date               NA                            The directories of the shared folder. Each directory is scanned to retrieve more directories or files.
File      document     Last Modified Date, Data size    Yes                           The files contained in the directories of the crawled shared folder.
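To make the container and document distinction concrete, here is a minimal sketch that walks a directory tree and emits one record per folder and per file with the metadata listed above. It assumes the share is reachable as a local or mounted path, whereas the connector itself speaks the SMB protocol directly; the function name `scan_share` and the record field names are hypothetical.

```python
import os
from datetime import datetime, timezone


def scan_share(root):
    """Yield one record per directory (container) and file (document).

    Illustrative only: assumes `root` is a locally mounted path.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        dir_stat = os.stat(dirpath)
        # Folders are containers: scanned for children, no content fetched.
        yield {
            "path": dirpath,
            "type": "container",
            "last_modified": datetime.fromtimestamp(
                dir_stat.st_mtime, tz=timezone.utc
            ),
        }
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            file_stat = os.stat(file_path)
            # Files are documents: they also carry a data size.
            yield {
                "path": file_path,
                "type": "document",
                "last_modified": datetime.fromtimestamp(
                    file_stat.st_mtime, tz=timezone.utc
                ),
                "data_size": file_stat.st_size,
            }
```

Each directory is emitted before its files, which mirrors the idea that a container is scanned to discover further directories or files.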

Limitations


The SMB connector has the following limitations:

  • The following features are not currently implemented, but are on the development plan:

    • SMBv3 support