This page lists all of the updates in version 3.3 of Aspire.


New Features


  • The Aspider Connector, a new web crawler, replaces the legacy Heritrix Connector.
  • Salesforce Connector has been refactored to include the following features:
    • Runs in the new connector framework.
    • Supports execution in a distributed environment.
    • Allows concurrent crawling of multiple endpoints.
    • Provides faster incremental crawls.
    • Uses snapshots.
  • New way to manage Failed Documents for all of the Source Connectors:
    • Allows reprocessing of documents that previously failed in either the processing or the publishing stage.
  • Avro Reader Extractor Application and Avro Publisher. ALPHA VERSION
  • Parquet Extractor Application. ALPHA VERSION
  • SMTP Connector. ALPHA VERSION
  • HDFS Connector and Web HDFS Publisher. ALPHA VERSION
  • Documentum DQL
    • Error-tolerant option to index metadata when fetching the document fails.
    • New RenditionType option for indexing.
  • Support of Azure Authentication on the SharePoint Online Connector.
  • New features for the SharePoint Connector (2007/2010):
    • Supports default snapshots on incremental crawls.
    • Supports crawling specific views on lists.
  • Implemented a single security key-store throughout all of Aspire.
  • Updated SharePoint 2007/2010 Web Service Extensions.


Bug Fixes


Aspire Core

  • Incorrect Historical Statistics for a new connector after another was executed.
  • Negative DPS appeared on the Statistics.
  • Link related to Aspire Authentication was updated in the config/settings.xml file.
  • Crawl Begin/End Auditing actions displayed different elements on Auditing.
  • NPE when a crawl stopped without pausing.
  • NPE when a crawl stopped in ReleaseController after pausing.
  • UI was not correctly showing the error for an invalid Groovy script.
  • Content source name could be updated with blank spaces.
  • Historical Crawl Statistics for one connector appearing in another one.
  • Incorrect logs were shown in the QueueLoader component.
  • Exception loading workflow.xml file when an empty custom groovy script was added.
  • Advanced scheduler option was not working.
  • Server error was encountered when displaying Audit Log.
  • Group Expansion and Advanced properties check boxes were misplaced.
  • Page was stuck when adding a Custom Publisher.
  • Service start and stop buttons displayed wrong tooltips.
  • Crawl stalling in Linux after publishing end job.
  • Couldn't save content source when using multiple check box selectors with one option checked.
  • OpenDXF was not escaping characters in JSON inputs.
  • Any exception thrown produced an NPE in the app-rap-connector.
  • Failover: Dual instance full test (interrupted) - Recovery Option Full - Not all items Crawled
  • Failover: Dual instance full test (interrupted) - Recovery Option Incremental - Never ended crawling
  • Allowed relaxing of the deletes policy in the connector framework.
  • Aspire not detecting changes in the connector settings and not asking to save them.
  • Stop crawl option not working correctly.
  • Crawl showing wrong time when crawling in Linux.
  • Statistics showing items In Progress when paused.
  • Docs not crawled were reported as Adds on Auditing.
  • Aspire.sh -create_master option not working properly on Linux.
  • Hierarchy extractor needed to have default values selected on Workflow jobs to work properly.
  • Minor improvements and fixes in Aspire UI.
  • Validation improvements and fixes for several components.

Applications

  • Archive Extractor
    • Incremental crawl on archive files was not working when using the Lotus connector.
    • Nested archive threw an "Archive not recognized" error.

Connectors

  • CIFS
    • Malformed URL was not being validated.
    • Removed slash character at the end of name attribute on Hierarchy.
  • Confluence
    • Name attribute for the level 1 hierarchy showed the name of the content source.
  • Documentum DQL
    • Fixed NPE when running an incremental crawl.
    • DisplayUrl field was not separating webtop from document id.
    • Include/Exclude fields appeared as part of the configurations.
  • eRoom
    • Updates and Deletes on certain items (comments and votes for polls) were not picked up by incremental crawls.
    • UI validation when using wrong URL.
    • No error was reported when setting a wrong username/password.
  • FileSystem
    • Improved the wording in some tooltips.
  • Heritrix
    • Implemented deletes handling feature from Heritrix in the connector framework.
  • Jive
    • Changes to Document ACLs were reflected incorrectly for both Activity Incremental and Normal Incremental.
    • Non-text Document filtering reported Add instead of Update for the documents filtered.
    • Page Size value was not using the UI parameter.
  • Lotus
    • Exclude pattern was not working as expected for items that were not attachments.
    • Incremental on archive files was not working.
    • Incremental crawl with index containers was not working.
    • No error was shown if the database and view were the same.
  • RDB Snapshot
    • Crawl was not finishing when the Use Slices option was enabled and a bad Extract SQL was set.
    • No error was reported when setting a wrong ACL SQL.
    • Wrong SQL statement in a full crawl was not showing errors.
  • RDB Tables
    • Action column was ignored for the incremental crawl.
  • Service Now
    • Displayed incorrect URL field in Knowledge Articles (XML representation).
    • Inclusion/Exclusion pattern was not working for attachments.
    • Aspire error when two image files were attached and a full crawl was run.
  • Social Cast
    • The nonTextDocument tag was missing from the Aspire Object.
  • SharePoint 2007
    • Error on console and UI while crawling an item updated on root using both Index Containers and Scan Recursively disabled.
    • NPE when processing a container after changing the ACL on an incremental crawl.
    • ACLs showed the same item as group and user.
  • SharePoint 2010
    • Minor fixes to the tooltips.
  • SharePoint 2013
    • Incremental reported duplicate jobs when adding a subsite.
    • Delete job had the incorrect displayUrl and fetchUrl after renaming a file.
  • SharePoint Online
    • Adding specific site collections made the incremental crawl process everything.
    • Renaming an item returned an add, update and delete on the same crawl.
    • Error when crawling site URL with encoded blank spaces.

Publishers

  • Publish to Solr
    • Deletes were not working correctly.

Services

  • Add Service button was not working.
  • Azure Group Expander
    • Azure GE and SharePoint Online GE were not deleting users.
  • CEWS Listener
    • PropertyOflong and PropertyOfArrayOflong were not working.
  • Fast Content API
    • Missing validations.
  • Group Expansion Manager
    • Fixed 'Missing version number' error when service was loaded.
    • Some validations were missing.
  • LDAP Cache
    • Some validations were missing.
    • Problems with tooltips for LDAP Attribute in Cache user and Cache group options.


External Technical Limitations  


  • Zip files are not crawled with the Activity Incrementals when they are created inside Jive Documents.




To Be Released


  • Amazon S3
  • Box
  • CEWS Listener
  • FTP
  • GSA Publisher
  • IBM Connections
  • PST Extractor
  • Publish to HDFS
  • Publish to SharePoint 2013
  • Publish to SharePoint 2013 (Install & Setup)
  • Salesforce
  • Subversion
  • Teamforge




Items to Deprecate on Aspire 3.3


The following items are marked for deprecation in the next Aspire version:

  • Elasticsearch bootloader
    • aspire-elastic-bootloader
  • DCM
    • aspire-dcm-enterprise
    • aspire-amazonec2-dm
    • aspire-zk-dm
  • The old Admin UI(s)
    • Parts of aspire-application
  • Big Data
    • app-semantic-co-occurrence-hadoop
    • app-semantic-co-occurrence-hadoop-soln
    • aspire-hadoop-job-launcher
    • aspire-hadoop-hdfs
    • aspire-hadoop-wiki-dict-generator
    • aspire-load-hdfs
  • Connectors
    • Staging Repo Connector
  • Solutions
    • OCR
    • Semantic Co-occurrence
  • Publishers
    • Cloudsearch
    • Staging Repo Publisher

Known Issues


Aspire Core 

  • Importing connector with special characters in the path fields not loading correctly.
  • Auditing
    • Dump option not working with Solr 6.2.0 and 6.3.0.
    • Dump option not working with ElasticSearch 5.0.2.
    • Incremental Crawl - Unchanged documents not displayed in Audit Log.
  • Aspire Shell
    • The load-content-sources option not working.
    • Relative paths were not working for the commands that create jobs.
    • Sometimes it was possible to delete the Aspire Shell prompt.
  • Failed Documents
    • Connector getting stuck when stopping the crawl.
  • Failover
    • Single instance, full test, interrupted, incremental recovery: Error Processing some files after resuming crawl.
    • During full crawls, some documents were left out if an instance was killed.
    • File System connector resumed crawling after restart.

Applications

  • Archive Extractor
    • Routing options not working with OnError.
    • Delete by Query not working as expected when using ElasticSearch 5.0.1.

Connectors

  • Include and Exclude patterns trimming empty spaces.
  • Aspider
    • Aspider - Crawl statistics not displayed on the UI when version was used.
  • Documentum
    • GroupExpansion marking groups as users.
    • Error on console was displayed while crawling a folder: "Stream handler unavailable due to: null"
  • Heritrix
    • Reject Images/Videos/Javascript/CSS not working for external site out of domain.
  • Jira Issues
    • Multiple connection timeouts occurring.
  • Jive
    • Crawl generated unnecessary deletes on Normal Incremental with the Use Progressive Retries option.
  • RDB Snapshot
    • BIGINT SQL Server database type not supported for SQL Slices.
    • Inconsistency when crawling ACL information using ACL fetching options.
    • Crawl not working with a specific column when the Use column from Extraction SQL option was specified.
  • RDB Tables
    • Wrong value in "sequence column" parameter not showing UI error.
    • Inconsistent crawling ACL information when using ACL fetching options.
  • SharePoint Connectors
    • Renaming one document included by a pattern not generating a deletion.
    • No descriptive error with non-existent URL. 
  • SharePoint 2007
    • Connector generating wrong hierarchy on updates.
    • Custom headers not being added to the request.
  • SharePoint 2010
    • Documents inside a folder not being picked up by incrementals if the parent ACL changed.

Services

  • Import Service option not working when a service with the same name already existed.
  • Group Expansion Service
    • Removing the GE collection from MongoDB leaves GE unusable for connectors.
  • Group Expansion Manager
    • More than one GE Service using the same servlet name generating an error.

UI

  • Add Source List not showing until clicking refresh sources.
  • Updating a service name to a string of spaces was not always validated.

