This page lists all of the updates for version 3.1 of the Aspire Framework.

New Features

  • Web Crawler named Aspider Connector replaces the Legacy Heritrix Connector.
  • Salesforce Connector has been refactored to include the following features:
    • Runs in the new connector framework.
    • Supports execution in a distributed environment.
    • Allows concurrent crawling of multiple endpoints.
    • Provides faster incremental crawls.
    • Uses snapshots.
  • New way to manage Failed Documents for all of the Source Connectors:
    • Allows reprocessing of documents that previously failed in the processing or publishing stages.
  • Avro Reader Extractor Application and Avro Publisher. ALPHA VERSION
  • Parquet Extractor Application. ALPHA VERSION
  • HDFS Connector and Web HDFS Publisher. ALPHA VERSION
  • Http Generic Service. ALPHA VERSION
  • Documentum DQL
    • Error-tolerant option to index metadata when fetching the document fails.
    • New RenditionType option for indexing.
  • Support for Azure Authentication in the SharePoint Online Connector.
  • New features for the SharePoint Connector (2007/2010):
    • Supports default snapshots on incremental crawls.
    • Supports crawling specific views on lists.
  • Implemented a single security key-store throughout all of Aspire.
  • Updated SharePoint 2007/2010 Web Service Extensions.

Bug Fixes

Aspire Core

  • Incorrect Historical Statistics for a new connector after another was executed.
  • Negative DPS appeared on the Statistics.
  • Link related to Aspire Authentication was updated in the config/settings.xml file.
  • Crawl Begin/End Auditing actions displayed different elements on Auditing.
  • NPE when a crawl stopped without pausing.
  • NPE when a crawl stopped in ReleaseController after pausing.
  • UI not correctly showing the error for an invalid Groovy script.
  • Content source name could be updated with blank spaces.
  • Historical Crawl Statistics for one connector appearing in another one.
  • Incorrect logs shown in the QueueLoader component.
  • Exception loading workflow.xml file when an empty custom Groovy script was added.
  • Advanced scheduler option was not working.
  • Server error was encountered when displaying Audit Log.
  • Group Expansion and Advanced properties check boxes were misplaced.
  • Page was stuck when adding a Custom Publisher.
  • Service start and stop buttons displayed wrong tooltips.
  • Crawl stalling in Linux after publishing end job.
  • Couldn't save content source when using multiple check box selectors with one option checked.
  • OpenDXF was not escaping characters in JSON inputs.
  • Any exception thrown produced an NPE in the app-rap-connector.
  • Failover: Dual instance full test (interrupted) - Recovery Option Full - Not all items Crawled
  • Failover: Dual instance full test (interrupted) - Recovery Option Incremental - Never ended crawling
  • Allowed relaxing of the deletes policy in the connector framework.
  • Aspire not detecting changes in the connector settings and not asking to save them.
  • Stop crawl option not working correctly.
  • Crawl showing wrong time when crawling in Linux.
  • Statistics showing items In Progress when paused.
  • Docs not crawled were reported as Adds on Auditing.
  • -create_master option not working properly on Linux.
  • Hierarchy extractor needed to have default values selected on Workflow jobs to work properly.
  • Minor improvements and fixes in Aspire UI.
  • Validation improvements and fixes for several components.

External Technical Limitations  

  • Zip files are not crawled with the Activity Incrementals when they are created inside Jive Documents.
  • When a Salesforce SOQL statement selects several large fields (such as two or more custom fields of type long text), Salesforce may return fewer records than the configured page size in order to control the overall response payload size. The same reduction occurs with base64-encoded fields (or blob fields), such as the Body. Remove these fields from the query if you want Salesforce to return the number of rows specified by the page size.
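
The Salesforce workaround above can be illustrated with a minimal sketch that builds a SOQL SELECT statement while leaving out large fields. This is not part of Aspire; the helper name build_soql and the custom field names are hypothetical, and only Body is taken from the note above.

```python
# Hypothetical set of large fields to leave out of the query.
# "Body" comes from the limitation note; the custom fields are made-up examples.
LARGE_FIELDS = {"Body", "Long_Description__c", "Notes__c"}


def build_soql(sobject, fields, exclude=LARGE_FIELDS):
    """Return a SOQL SELECT that skips fields likely to shrink the page size."""
    kept = [f for f in fields if f not in exclude]
    if not kept:
        raise ValueError("no fields left to select")
    return "SELECT {} FROM {}".format(", ".join(kept), sobject)


# The blob field Body is dropped; Id and Name are kept.
query = build_soql("Attachment", ["Id", "Name", "Body"])
print(query)  # SELECT Id, Name FROM Attachment
```

If the large fields are still needed, one option is to fetch them in a separate query restricted to the records that require them, accepting the smaller pages for that query only.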

Items to Deprecate in Aspire 3.2

The following items are marked for deprecation in the next Aspire version:

  • Elasticsearch bootloader
    • aspire-elastic-bootloader
  • DCM
    • aspire-dcm-enterprise
    • aspire-amazonec2-dm
    • aspire-zk-dm
  • Big Data
    • app-semantic-co-occurrence-hadoop
    • app-semantic-co-occurrence-hadoop-soln
    • aspire-hadoop-job-launcher
    • aspire-hadoop-hdfs
    • aspire-hadoop-wiki-dict-generator
    • aspire-load-hdfs

Known Issues

Aspire Core 

  • Imported connectors with special characters in the path fields not loading correctly.
  • Auditing
    • Dump option not working with Solr 6.2.0 & 6.3.0
    • Dump option not working with Elasticsearch 5.0.2
    • Incremental Crawl - Unchanged documents not displayed in Audit Log.
  • Aspire Shell
    • The load-content-sources option not working.
    • Relative paths were not working for the commands that create jobs.
    • Sometimes it was possible to delete the Aspire Shell prompt.
  • Failed Documents
    • Connector getting stuck when stopping the crawl.
  • Failover
    • Single instance, full test, interrupted, incremental recovery: error processing some files after resuming crawl.
    • During full crawls, some documents were left out if an instance was killed.
    • File System connector resumed crawling after restart.