Crawl Zip File Process

While a crawl is running, whenever the scanner finds a zip file it reads the file's metadata and processes the file to obtain information about every entry it contains.
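As a rough illustration of what "getting the info of every entry" can look like, the sketch below lists per-entry metadata from a zip archive without extracting any content. It uses Python's standard `zipfile` module and is not the Aspire implementation; the `EntryInfo` class and the `list_zip_entries` and `build_sample_zip` functions are hypothetical names introduced only for this example.

```python
import io
import zipfile
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class EntryInfo:
    """Hypothetical record for one archive entry (not an Aspire type)."""
    name: str
    size: int
    compressed_size: int
    modified: datetime
    is_dir: bool


def list_zip_entries(data: bytes) -> List[EntryInfo]:
    """Return metadata for every entry in a zip archive, without extracting content."""
    entries = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            entries.append(EntryInfo(
                name=info.filename,
                size=info.file_size,
                compressed_size=info.compress_size,
                modified=datetime(*info.date_time),
                is_dir=info.is_dir(),
            ))
    return entries


def build_sample_zip() -> bytes:
    """Build a small in-memory zip so the example needs no files on disk."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docs/readme.txt", "hello")
        archive.writestr("data.csv", "a,b\n1,2\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for entry in list_zip_entries(build_sample_zip()):
        print(entry.name, entry.size, entry.modified.isoformat())
```

In a crawl, each such record would presumably become its own item handed on for further processing, but how Aspire maps entries to items is not described here.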

For now, zip file processing is only available for these three connectors:

  • File System
  • CIFS
  • Lotus

The process is able to extract and process these file types:

  • ZIP
  • AR
  • ARJ
  • CPIO
  • JAR
  • DUMP
  • TAR
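One way to picture the supported list is as a lookup that decides whether a file should be handed to the archive process at all. The sketch below assumes detection by file extension, which this page does not confirm; the extension mapping (in particular `.dump`) and the `archive_format` function are hypothetical.

```python
from pathlib import PurePath
from typing import Optional

# Assumed extension-to-format mapping; the real process may detect
# formats differently (for example, by inspecting file signatures).
SUPPORTED_FORMATS = {
    ".zip": "ZIP",
    ".ar": "AR",
    ".arj": "ARJ",
    ".cpio": "CPIO",
    ".jar": "JAR",
    ".dump": "DUMP",
    ".tar": "TAR",
}


def archive_format(path: str) -> Optional[str]:
    """Return the supported format name for a path, or None if unsupported."""
    suffix = PurePath(path).suffix.lower()
    return SUPPORTED_FORMATS.get(suffix)


if __name__ == "__main__":
    for name in ["backup.TAR", "lib/app.jar", "photos.rar", "notes.txt"]:
        print(name, "->", archive_format(name))
```

Under this assumption, RAR and 7z files simply fall through as unsupported, which matches the limitations listed below.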


Known Limitations:

  • RAR is a proprietary format and is not included in this version.
  • 7z cannot be opened as a stream, so it is excluded from this version.
  • If ZIP files are excluded from the crawl, the Scan Excluded Items option will not work.


Configuration parameters:

  • extractFolders (same as the scanner's)
  • scanRecursive (same as the scanner's)


A planned alternative approach is to move zip file processing into a separate component placed in the pipeline after the scanner, so that the process is available for all connectors.

This is planned for upcoming Aspire releases.

