
Info

For version 3.3, Aspire requires a license file to run.

See Aspire Licensing for information on obtaining a license.


The following are the NoSQL DB providers supported by the Aspire 3.3 release:

  • MongoDB version 3.6
  • HBase version 1.2.4

The supported version of Elasticsearch is 6.3.0.

The supported version of StageR is 1.2. Note: StageR 1.2 is the latest version, and it supports MongoDB 3.4.10.




This page maintains a list of all of the updates for version 3.3 of Aspire.


    New Features

    • Web Crawler named Aspider Connector replaces the Legacy Heritrix Connector.
    • Salesforce Connector has been refactored to include the following features:
      • Runs in the new connector framework.
      • Supports execution in a distributed environment.
      • Allows concurrent crawling of multiple endpoints.
      • Provides faster incremental crawls.
      • Uses snapshots.
    • New way to manage Failed Documents for all of the Source Connectors:
      • Allows document reprocessing that previously failed in both processing and publishing stages.
    • Avro Reader Extractor Application and Avro Publisher (Alpha version).
    • Parquet Extractor Application (Alpha version).
    • SMTP Connector (Alpha version).
    • HDFS Connector and Web HDFS Publisher (Alpha version).
    • Http Generic Service (Alpha version).
    • Documentum DQL
      • Error tolerant option to index metadata when fetch document fails.
      • New RenditionType option for indexing.
    • Support of Azure Authentication on the SharePoint Online Connector.
    • New features for the SharePoint Connector (2007/2010):
      • Supports default snapshots on incremental crawls.
      • Supports crawling-specific views on lists.
    • Implemented a single security key-store throughout all of Aspire.
    • Updated SharePoint 2007/2010 Web Service Extensions.

    Bug Fixes

    Aspire Core

    • Incorrect Historical Statistics for a new connector after another was executed.
    • Negative DPS appeared on the Statistics.
    • Link related to Aspire Authentication was updated in the config/settings.xml file.
    • Crawl Begin/End Auditing actions displayed different elements on Auditing.
    • NPE when a crawl stopped without pausing.
    • NPE when a crawl stopped in ReleaseController after pausing.
    • UI was not correctly showing the error for an invalid Groovy script.
    • Content source name could be updated with blank spaces.
    • Historical Crawl Statistics for one connector appearing in another one.
    • Incorrect logs were shown in the QueueLoader component.
    • Exception loading workflow.xml file when an empty custom groovy script was added.
    • Advanced scheduler option was not working.
    • Server error was encountered when displaying Audit Log.
    • Group Expansion and Advanced properties check boxes were misplaced.
    • Page was stuck when adding a Custom Publisher.
    • Service start and stop buttons displayed wrong tooltips.
    • Crawl stalling in Linux after publishing end job.
    • Couldn't save content source when using multiple check box selectors with one option checked.
    • OpenDXF was not escaping characters in JSON inputs.
    • Any exception thrown produced a NPE in the app-rap-connector.
    • Failover: Dual instance full test (interrupted) - Recovery Option Full - Not all items Crawled
    • Failover: Dual instance full test (interrupted) - Recovery Option Incremental - Never ended crawling
    • Allowed relaxing of the deletes policy in the connector framework.
    • Aspire not detecting changes in the connector settings and not asking to save them.
    • Stop crawl option not working correctly.
    • Crawl showing wrong time when crawling in Linux.
    • Statistics showing items In Progress when paused.
    • Docs not crawled were reported as Adds on Auditing.
    • Aspire.sh -create_master option not working properly on Linux.
    • Hierarchy extractor needed to have default values selected on Workflow jobs to work properly.
    • Minor improvements and fixes in Aspire UI.
    • Validations improvements and fixes for several components.

    Applications

    • Archive Extractor
      • Incremental on archive files was not working using the Lotus connector.
      • Nested archive threw an "Archive not recognized" error.

    Connectors

    • CIFS
      • Malformed URL was not being validated.
      • Removed slash character at the end of name attribute on Hierarchy.
    • Confluence
      • Name attribute for the level 1 hierarchy showed the name of the content source.
    • Documentum DQL
      • Fixed NPE when running an incremental crawl.
      • DisplayUrl field was not separating webtop from the document ID.
      • Include/Exclude fields appeared as part of the configurations.
    • eRoom
      • Updates and Deletes were not picked up by Incremental crawl over certain items (Comments and Votes for polls.)
      • UI validation when using wrong URL.
      • No error was reported when setting a wrong username/password.
    • FileSystem
      • Improved the wording in some tooltips.
    • Heritrix
      • Implemented deletes handling feature from Heritrix in the connector framework.
    • Jive
      • Changes on Document ACLs were reflected incorrectly for both Activity Incremental and Normal Incremental.
      • Non-text Document filtering reported Add instead of Update for the documents filtered.
      • Page Size value was not using the UI parameter.
    • Lotus
      • Exclude pattern was not working as expected for items that were not attachments.
      • Incremental on archive files was not working.
      • Incremental crawl with index containers was not working.
      • No error showed if the database and view were the same.
    • RDB Snapshot
      • Crawl was not finishing when the Use Slices option was set together with a bad Extract SQL.
      • No error was reported when setting a wrong ACL SQL.
      • Wrong SQL statement in Full crawl was not showing errors.
    • RDB Tables
      • Action column was ignored for the incremental crawl.
    • Service Now
      • Displayed incorrect URL field in Knowledge Articles (XML representation).
      • Inclusion/Exclusion pattern was not working for attachments.
      • Aspire error when two image files were attached and a full crawl was run.
    • Social Cast
      • Tag nonTextDocument was missing from the Aspire Object.
    • SharePoint 2007
      • Error on console and UI while crawling an item updated on root using both Index Containers and Scan Recursively disabled.
      • NPE when processing a container after changing an ACL on an Incremental crawl.
      • ACLs showed the same item as group and user.
    • SharePoint 2010
      • Minor fixes to the tooltips.
    • SharePoint 2013
      • Incremental reported duplicate jobs when adding a subsite.
      • Delete job had the incorrect displayUrl and fetchUrl after renaming a file.
    • SharePoint Online
      • Adding specific site collections caused the incremental crawl to crawl everything.
      • Renaming an item returned an add, update and delete on the same crawl.
      • Error when crawling site URL with encoded blank spaces.

    Publishers

    • Publish to Solr
      • Deletes were not working correctly.

    Services

    • Add Service button was not working.
    • Azure Group Expander
      • Azure GE and SharePoint Online GE were not deleting users.
    • CEWS Listener
      • PropertyOflong and PropertyOfArrayOflong were not working.
    • Fast Content API
      • Missing validations.
    • Group Expansion Manager
      • Fixed 'Missing version number' error when service was loaded.
      • Some validations were missed.
    • LDAP Cache
      • Some validations were missed.
      • Problems with tooltips for LDAP Attribute in Cache user and Cache group options.

    External Technical Limitations

    • Zip files are not crawled with the Activity Incrementals when they are created inside Jive Documents.
    • When a Salesforce SOQL statement selects a number of large fields (such as two or more custom fields of type long text) then Salesforce may return fewer records than defined in the page size in order to control the overall response payload size. The reduction in page size also occurs when dealing with base64 encoded fields (or blob fields), such as the Body. Remove these fields from the query if you want Salesforce to return the number of rows specified on the page size.
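
    As a rough illustration of the Salesforce limitation above, the sketch below builds a SOQL statement that leaves out long text and base64 fields, so that Salesforce is more likely to return the full page size. The field names, the type labels, and the build_soql helper are hypothetical; they are not part of Aspire or of the Salesforce API.

```python
# Field types treated as "large" for this sketch (labels are assumptions,
# chosen for illustration rather than taken from Salesforce metadata).
LARGE_FIELD_TYPES = {"textarea_long", "base64"}


def build_soql(object_name, fields):
    """Build a SELECT statement that skips long text and base64 fields.

    fields maps a field name to a type label; insertion order is preserved.
    """
    kept = [name for name, ftype in fields.items()
            if ftype not in LARGE_FIELD_TYPES]
    if not kept:
        raise ValueError("no fields left after filtering")
    return "SELECT " + ", ".join(kept) + " FROM " + object_name


# Hypothetical object description with two long text fields and a blob.
fields = {
    "Id": "id",
    "Name": "string",
    "Body": "base64",
    "Notes__c": "textarea_long",
    "Details__c": "textarea_long",
}

query = build_soql("Attachment", fields)
print(query)
```

    Large fields that are still needed could be fetched in a separate, smaller query; whether that fits a given crawl depends on the connector configuration.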

    To Be Released

    • Box
    • CEWS Listener
    • GSA Publisher
    • IBM Connections
    • Publish to HDFS
    • Publish to SharePoint 2013

    Items to Deprecate on Aspire 3.3

    The following items are marked to be deprecated in the next Aspire version:

    • Elasticsearch bootloader
      • aspire-elastic-bootloader
    • DCM
      • aspire-dcm-enterprise
      • aspire-amazonec2-dm
      • aspire-zk-dm
    • The old Admin UI(s)
      • Parts of aspire-application
    • Big Data
      • app-semantic-co-occurrence-hadoop
      • app-semantic-co-occurrence-hadoop-soln
      • aspire-hadoop-job-launcher
      • aspire-hadoop-hdfs
      • aspire-hadoop-wiki-dict-generator
      • aspire-load-hdfs
    • Connectors
      • Staging Repo Connector
    • Solutions
      • OCR
      • Semantic Co-occurrence
    • Publishers
      • Cloudsearch
      • Staging Repo Publisher

    Known Issues

    Aspire Core

    • Importing connector with special characters in the path fields not loading correctly.
    • Auditing
      • Dump option not working with Solr 6.2.0 & 6.3.0
      • Dump option not working with Elasticsearch 5.0.2
      • Incremental Crawl - Unchanged documents not displayed in Audit Log.
    • Aspire Shell
      • The option load-content-sources not working.
      • Relative paths were not working for the commands that create jobs.
      • Sometimes it was possible to delete the Aspire Shell prompt.
    • Failed Documents
      • FailedDocuments - Connector getting stuck when stopping the crawl.
    • Failover
      • Single instance, full test, interrupted, incremental recovery: Error Processing some files after resuming crawl.
      • During full crawls, some documents were left out if an instance was killed.
      • File System connector resumed crawling after restart.

    Applications

    • Archive Extractor
      • Routing options not working with OnError.
      • Delete by Query not working as expected using Elasticsearch 5.0.1

    Connectors

    • Include and Exclude pattern trimming empty spaces.
    • Aspider
      • Aspider - Crawl statistics not displayed on the UI when version was used.
    • Documentum
      • GroupExpansion marking groups as users.
      • Error on console was displayed while crawling a folder: "Stream handler unavailable due to: null"
    • Heritrix
      • Reject Images/Videos/Javascript/CSS not working for external site out of domain.
    • Jira Issues
      • Multiple connection timeouts occurring.
    • Jive
      • Crawl generated unnecessary deletes on Normal Incremental with the Use Progressive Retries option.
    • RDB Snapshot
      • bigINT SQL Server database type not supported for SQL Slices.
      • Inconsistency when crawling ACL information using ACL fetching options.
      • Crawl not working with a specific column when the Use column from Extraction SQL option was specified.
    • RDB Tables
      • Wrong value in "sequence column" parameter not showing UI error.
      • Inconsistent crawling ACL information when using ACL fetching options.
    • SharePoint Connectors
      • Renaming one document included by a pattern not generating a deletion.
      • No descriptive error with non-existent URL. 
    • SharePoint 2007
      • Connector generating wrong hierarchy on updates.
      • Custom headers not being added to the request.
    • SharePoint 2010
      • Documents inside a folder not being picked up by incrementals if the parent ACL changed.

    Services

    • Import Service option not working when the service under the same name already existed.
    • Group Expansion Service
      • Removing GE collection from mongo causing unusable GE for connectors.
    • Group Expansion Manager
      • GE Manager - More than one GE Service using the same servlet name generating an error.

    UI

    • Add Source List not showing until clicking refresh sources.

    New and Enhanced Features


    Aspire Core and Framework Components

    • Salted Challenge Response Authentication Mechanism (SCRAM) support has been added to the MongoDB used with Aspire.
    • The ability to dynamically load jar files has been added to Aspire with Java 9.
    • When starting Aspire either normally or in debug mode, the debug line in the settings.xml file is handled appropriately. 
    • A section has been added to the settings.xml file for HBase information.

    • Logging of remote IP addresses for successful or failed logins will now occur.

    • The Mongo provider now encrypts/hashes IDs.
    • Record fields have been improved.
    • Entitlements checking no longer checks missing components at every restart.

    • Time zones have been normalized for Aspire, including logs and statistics.

    • The documentation has been updated for Keytab/Kerberos.
    • Improvements have been added to Job usage.
    • Updates have been made to the ExtractText default configuration limit for text extracted from a stream.
    • List page retrieval and metadata extraction have been improved in SharePoint Commons.

    Aspire UI

    • To re-fetch entitled components (after deleting the Resources folder), an "Allow Refresh" button has been added.
    • The ability to show Provider Information has been added.

    Connectors

    • Aspider
      • A Headless browser has been added for rendering dynamically generated pages (client-side JavaScript pages).
    • IBM Connections
    • Elastic
    • SharePoint 2010

      • On the Multiple URLs drop-down, when the 'Site Discovery' option is set, the 'Set List View' option is removed.

    • SharePoint Online
      • An NPE at crawl end error could occur if bad credentials were used.
      • Incremental crawls no longer detect containers as updated items.
      • Scan recursively was not working as expected.

    • SMB

      • Added DFS support and override last access date of documents
    • Twitter

    Publishers

    • Elasticsearch
      • Case sensitive index names can be handled properly now.
    • Google Cloud Search
      • A new Google Cloud Search (GCS) publisher receives content from Aspire connectors and uses the Java Client library to index the content into Cloud Search.
    • HBase
      • Content can now be deleted.
      • During a full crawl, the publisher now defaults to clean.
      • When not in file configuration mode, the publisher can now be used without security. 

    • Publish to StageR
      • Field level help has been added for the special scope $record.

    Applications

    • The Entitlements Admin application has been updated.


    Bug Fixes


    Aspire Core and Framework Components

    • Admin UI
      • The ability to configure a weekly schedule could cause an error when saving
    • Aspire Application
      • ConfigManager could log a debug message into {aspire.home}/logs/configmanager.log

      • A problem could occur when editing a custom application in the Admin UI

      • Startup problems could occur using the Staging Publisher
    • Connector Framework
      • When stopping and restarting Aspire while the GroupDownload process was running, the group download did not start again

    • MongoDB Provider
      • The LDAP Cache could report a MongoDB Duplicate key error

      • Aspider could stop with a MongoDB Duplicate key error

    • SharePoint Commons

      • An out of memory (OOM) exception could occur during large crawls

      • Added support for incrementals using Aspire Snapshots on SP
    • The Aspire Archetype had "http" rather than "https" repository and entitlement URLs
    • Failed to connect to Artifactory with custom keystore. Artifactory certificates were added to the distribution. See: https://contentanalytics.digital.accenture.com/pages/viewpage.action?spaceKey=aspire33&title=Crawling+via+HTTPs
    • AspireObject was casting an incorrect numeric type when created from JSON
    • The AspireObject isEmpty method returned true even if the object had children

    • The processDeletes (String) was missing a Status page
    • The Aspire Connector Framework was not using shouldScan during incremental crawls
    • When running a full crawl, a "Provider 'encrypted' not installed" message could occur
    • The Mongo provider generated an invalid JSON object during document conversion

    • Audit logs were incomplete
    • For AIP integration, the logout action was not being logged

    • Publisher framework retryDelay, retryDelayMultiplier and maxRetryDelay properties were not supported by Dynamic XML Forms (DXF)

    • The Aspire-Services jar file was missing a noSQL package

    • The "Loading Application" message could display whether a connector was loading or not

    • Extract Text
      • Use the Apache Tika SAX Parser for Microsoft documents
    • Scheduler
      • The option to create a Cache Groups scheduler was not being displayed
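
    The publisher framework properties retryDelay, retryDelayMultiplier and maxRetryDelay listed above suggest a capped exponential backoff. The sketch below shows how three such values commonly combine. It assumes the usual convention (start at retryDelay, multiply after each attempt, never exceed maxRetryDelay); the exact Aspire semantics and units are not stated in these notes, and the retry_delays helper is hypothetical.

```python
def retry_delays(retry_delay, retry_delay_multiplier, max_retry_delay, attempts):
    """Return the delay to wait before each retry attempt.

    Assumed convention: the first delay is retry_delay, each later delay is
    the previous one times retry_delay_multiplier, capped at max_retry_delay.
    """
    delays = []
    delay = retry_delay
    for _ in range(attempts):
        delays.append(min(delay, max_retry_delay))
        delay *= retry_delay_multiplier
    return delays


# Illustrative values only; they are not Aspire defaults.
schedule = retry_delays(1000, 2, 10000, 6)
print(schedule)  # [1000, 2000, 4000, 8000, 10000, 10000]
```

    The cap keeps a long outage from producing ever-growing waits, while the multiplier avoids hammering a search engine that is temporarily unavailable.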

    Aspire UI

    • A Connector component might not show the actual state of a crawl
    • The link that points to the Confluence wiki has been updated

    Connectors

    • Aspider
      • An authentication form error could occur indicating "Target host is not specified while crawling"
      • Neither NTLM nor ADFS authentication was occurring when a host was specified in the Credentials
      • On any port, the Port field was not working correctly with any value except "-1"

      • A crawl could cause a warning about duplicate IDs in MongoDB

      • To indicate that the Gateway was not working, the exception message in ADFS needed updating 

    • Confluence

      • ACLs info appeared inside the hierarchy section

      • A batch error could display while publishing to Elasticsearch 6.3.0
    • Documentum
      • Exception was being thrown during Group Expansion
    • File System
      • Starting Directories in the File option was not working as expected
    • IBM Connections

      • The connector needed to use the Aspire GroupExpansion instead of SharePoint Integrated security with an optimized IBM Connections Group Downloader
      • Memory leaks could occur
      • During an incremental crawl, the deletes of Blogs, Wikis and Files were not working
      • Content crawled from IBM Connections did not contain a last-modified date because of a problem with the date format
    • Kafka
      • A "NO-NAME" field could occur

    • SharePoint 2010

      • A problem could occur when identifying the site-collections for a WEB-Application
      • When adding a link on a site collection to crawl, [NO-NAME] should not be part of the name attribute in the hierarchy section
      • No error should occur during the incremental crawl for the Blog site collection
      • No errors should occur when crawling a specified list (views included)
    • SharePoint 2013

      • When crawling incrementals for an External list, the connector was not picking up the changes
      • When crawling SP2013, errors such as "HTTP Error 400. The size of the request headers is too Long" might occur
      • KeyNotFoundException while trying to check attachments for list with lookup references deleted

      • When crawling a list, the name in the hierarchy of the documents was displayed as NO-NAME even though the items had a title field
      • The placeholder needed to be changed for the 'Seeds file' field

      • The connector was unable to crawl large lists

    • SharePoint 2016

      • An error could occur while crawling site

    • SharePoint Online

      • NPE when crawling in distributed mode. Random NPE in the item complete callback

      • String index out of range while getting a List display url

      • Error while crawling after a crawl was stopped: Item parent wasn't assigned during crawl

    • Standalone Mode

      • When a user added a custom connector, feedback needed to be provided by the Aspire UI
    • Staging Repository
      • A global variable was not working when configuring the server in the Staging Repository connector

      • When crawling over multiple documents and publishing at two different scopes, the items published could be duplicated

      • The Stager connection could be broken when running a full crawl

    Publishers

    • Stager BDC Plugin could randomly fail during the crawls after setup

    • Elasticsearch
      • DeleteByQuery was not being used with Elasticsearch 6.1.1
    • GCS Publisher

      • A resource config/application.xml was missing on the jar file

      • A relative path was not working in the Credentials Key File field
      • An error could occur when crawling and publishing to GCS

    • Kafka
      • An error could be masked when running a non-batched job

    • Publish to Avro
      • Validation needed to be added to the Time Rollover Threshold field

    • TLS 1.2 support was needed for the SharePoint Security Pre Trimmer

    Services

    • Azure Group Expander could refuse to start.

    • Group Expansion failed if user data exceeded the Mongo Max Document Limit (16MB)

    • For Aspire Distributed mode, Services in the master node were not starting automatically after saving changes
    • Errors to reflect failed Services were not being generated
    • Services that were set up in an Aspire cluster were not synced up correctly

    • Azure Active Directory Group Expander
      • Users were not being removed

    • Group Expansion Service
      • The userGroupCache map was accessed when the Group Expansion Service was running
    • LDAP Cache Service

      • The controls did not display and the Schedule was set to Advanced even if Minutes or Hourly were set

      • Problems with the LDAP-Cache component could include: reporting a duplicate key error twice, stopping with a duplicate error, taking too long, and refresh refusing to start. The connector could not look up ACL information in the LDAP-Cache component

    • You are now able to check the user’s cache for the Azure Group Expander via the Debug console

    Applications

    • Archive Extractor
      • The "Send delete by query first" option could throw an exception

      • Deleting files inside of an archive file was not handled properly for incremental crawls

    • AVRO Extractor

      • During an incremental crawl, a "duplicate key error" message could display


    Updating a service name to a string of spaces was not always validated.

    Known Issues


    Connectors 

    • FTP 

      • The FTP connector works only on Unix systems, not on Windows
    • Twitter
      • Full/Incremental crawls for retweets are not working

    Publishers

    • Google Cloud Search
      • Bundle location error loading the publisher for the first time
      • NullPointerException publishing with Batch and Content Type Raw options
      • ItemUploadRequest exception
      • Pending required field validation for the 'Indexer Type' field