Elasticsearch is now available as a NoSQL provider.
All Publishers now use the new Publisher Framework.
Updates to the Listener service (push updates).
Cluster Mode improvements (Zookeeper stability).
Implemented the FIFOQueue for the MongoDB Provider.
New options for the Extract Text configuration and new Throttling section in Advanced Configuration for connectors.
Background processing.
Saga natural language processing is available as an Aspire plug-in and can be used to perform NLP as part of an Aspire workflow.

Aspire UI

Import/Export System configuration.
Log Browser.

Connectors

OneDrive.

Publishers

Amazon S3.
Background Queue.
Azure Search.

Plugins

Stager BDC Plugin.
- Now supports SharePoint 2019.

Services

Binary Store.
Thumbnails.

New Features

A new web crawler, the Aspider Connector, replaces the legacy Heritrix Connector.
Salesforce Connector has been refactored to include the following features:
- Runs in the new connector framework.
- Supports execution in a distributed environment.
- Allows concurrent crawling of multiple endpoints.
- Provides faster incremental crawls.
- Uses snapshots.
New way to manage Failed Documents for all of the Source Connectors:
- Allows reprocessing of documents that previously failed in either the processing or the publishing stage.
Avro Reader Extractor Application and Avro Publisher (Alpha version).
Parquet Extractor Application (Alpha version).
SMTP Connector (Alpha version).
HDFS Connector and Web HDFS Publisher (Alpha version).
Http Generic Service (Alpha version).
Documentum DQL
- Error-tolerant option to index metadata when fetching the document fails.
- New RenditionType option for indexing.
Support of Azure Authentication on the SharePoint Online Connector.
New features for the SharePoint Connector (2007/2010):
- Supports default snapshots on incremental crawls.
- Supports crawling specific views on lists.
Implemented a single security key-store throughout all of Aspire.
Updated SharePoint 2007/2010 Web Service Extensions.

Bug Fixes

Aspire Core and Framework Components

Incorrect Historical Statistics for a new connector after another was executed.

Negative DPS appeared on the Statistics.

Missing headers on OAuth classes.
Wrong URL for the Aspire UI Authentication documentation link in the config/settings.xml file.

Crawl Begin/End Auditing actions displayed different elements on Auditing.

NPE when a crawl stopped without pausing.

NPE when a crawl stopped in ReleaseController after pausing.

UI not correctly showing the error for an invalid Groovy script.

Content source name could be updated with blank spaces.

Historical Crawl Statistics for one connector appearing in another one.

Incorrect logs shown in the QueueLoader component.

Exception loading workflow.xml file when an empty custom groovy script was added.

Advanced scheduler option was not working.

Server error was encountered when displaying Audit Log.

Group Expansion and Advance properties check boxes were misplaced.

Page was stuck when adding a Custom Publisher.

Service start and stop buttons displayed wrong tooltips.

Crawl stalling in Linux after publishing end job.

Couldn't save content source when using multiple check box selectors with one option checked.

OpenDXF was not escaping characters in JSON inputs.

Any exception thrown produced a NPE in the app-rap-connector.

Failover: Dual instance full test (interrupted) - Recovery Option Full - Not all items Crawled

Failover: Dual instance full test (interrupted) - Recovery Option Incremental - Never ended crawling

Allowed relaxing of the deletes policy in the connector framework.

Aspire not detecting changes in the connector settings and not asking to save them.

Stop crawl option not working correctly.

Crawl showing wrong time when crawling in Linux.

Statistics showing items In Progress when paused.

Docs not crawled were reported as Adds on Auditing.

Aspire.sh -create_master option not working properly on Linux.

Hierarchy extractor needed to have default values selected on Workflow jobs to work properly.

Minor improvements and fixes in Aspire UI.

Validations improvements and fixes for several components.

Master password ssh file not working on CentOS.
Errors processing failed documents with the Exception Patterns option.
Components on Workflow not saved if the content source was not saved first.
Invalid entitlements host caused missing workflow applications.
Double click ignored on disabled workflow item.
NPE after shutting down 2 Aspire instances in distributed mode.
Java 1.8 error when the name of the Application and the name of the Publisher were the same.
Aspire not starting in shell mode on CentOS.
Publishers added to the workflow not being unpacked into the cache folder, so they were unavailable.
Error installing Aspire as a service in Windows.
Status not displayed in Aspire after a crawl was aborted.
NPE scheduling the "cacheGroups" option without the GEM configured.
Two different entries in status collection being generated for the same crawl ID.
NPE when the Artifactory user had no entitlements assigned.
Mongo database name limit exceeded by the Aspire Database name.
NPE with the Non Text Document Filter and Open Data Stream options enabled.
NPE using encrypted password at the SSL settings in settings.xml file.
NPE pausing a crawl with MongoDB and Zookeeper in distributed mode.
Error trying to import a Service since some services do not have a workflow associated.
Previous crawl errors displayed when current crawl was running.
Crawls on distributed mode not populating correctly ancestor ID and ACLs.
Error uninstalling Aspire as a service.
Aspire not getting alert if Elasticsearch provider is not running.
Crawl statistics not reflecting the deletes if there were adds/updates.
NPE after an authentication method was configured in the settings.xml file.
NPE displayed while stopping a crawl after it just started.
Some Aspire UI settings configured in settings.xml file being ignored.
Out of Memory error using a very big number in Hierarchy Cache Size option.
Every time a groovy script was updated, a blank line was added at the beginning of the script.
Invalid characters validation in the Extension List option of the Non Text Document filter.

Applications

Archive Extractor
- Using Select/Deselect All option closed the Configuration window.
AVRO Extractor
- ASPIRE-8112/ASPIRE-8113 Routing section options not displaying correctly.
Hierarchy Extractor
- The User/Group field on ACLs section is now required.

Aspire UI

Typos on Accenture license information.
UI refreshing stacks over and over while changing between the Cards View and the List View.
Aspire DXF not accepting Windows relative paths.
The word "content sources" displayed in the Service Group control.
Navigation controls at the bottom overlapping the footer.
Fixed special characters allowed in the connector's name.

Connectors

Adobe Experience Manager
- Use scheduled (de)activation item settings not working without include/exclude properties.
- Fetch ACLs option not working.
- Updates on pages not crawled on incremental.
- Wrong credentials threw unclear message on Basic Authentication.
- Malformed URLs not validated.
- More user friendly exception for non-existent pages/assets.
- Normalized date format for the "lastModified" field.
Amazon S3
- Crawl failing for items published with the S3 Publisher.
- Some exceptions using the connector, the Archive Extractor application and the Elasticsearch publisher.
- Crawl failing if directory URL not ending in the "/" character.
- Using a bad Include Pattern prevented the crawl from starting.
Aspider
- Crawls not finishing in distributed mode.
- Updates processed as Adds instead of Updates.
- Some options missing from the Extract Text section.
- Hierarchy info appearing with the Hierarchy option disabled.
- NPE displayed while content cleanup is selected but nothing is configured.
- A [NO-NAME] value displayed in the hierarchy section.
- Extract Text & Hierarchy options taken out of the Advanced Configuration section.
- Content cleanup of web pages not working in Aspire 3.3.0.4.
- Images not being crawled using the Extract Text option enabled.
Azure Blob
- Seed file option is not working.
- <Non text document> tag not published using open data stream.
- Split Words per XML/HTML Tag is not working.
- HTML Output not producing any document output.
- Crawl errors displayed in the UI.
- Incremental crawl not detecting updates.
- Storage Connection String set as a placeholder.
- Problem crawling folders.
Azure EventHub
- Valid tooltips for the field in the Credentials section.
Box
- SSLExceptions during crawls (HTTP error code 429).
- Incremental actions not working.
- Access token issue during crawls.
- Incremental crawls getting more items than expected.
Database Server
- Scan errors crawling all tables in the RDBMS.
Elasticsearch
- 429 Error Management.
File System
- Hierarchy information incomplete.
- No error using invalid filename specified in the "Path to Root directories file" field.
- NPE using Multiple starting points option.
IBM Connections
- Crawling specific endpoints (Communities, Forums, and Wikis) not working.
- Error caching groups.
- Incremental after deletes using Elasticsearch not working.
- Querying an IBM Domino getting an OperationNotSupportedException (LDAP: error code 12 - Unavailable Critical Extension).
- Hierarchy information not being published to Elasticsearch.
Lotus
- Error publishing to Elasticsearch. See the Known Issues section for a workaround.
RDB Snapshot
- Problems crawling delete actions.
RDB Tables
- Exception using the Slices option not reported on the UI.
SharePoint 2013
- Connector processes same document with different ID between crawls.
- NPE pausing a crawl.
- Problem running incremental using Lists option. All content being crawled.
SharePoint 2016
- Issue on incremental when an External List was included in a pattern.
- Problem on incremental using Tokens with Crawl Attachments option enabled.
- Connector not crawling folders created under site collection.
SharePoint Online
- List threshold: not all items on a big list are being crawled.
- Group Expansion not working.
- Error generating FetchUrl and Display URL for link list items inside a folder.
SMB
- Crashes with a start URL ending without a slash.
- Connector failing when crawling a specific file URL and regex.
- Deny permissions are missing.
- SMB doesn't detect ACL changes on incremental crawl.
- SMB connector overrides the last access date only when it was changed by the connector.
ServiceNow
- Use Aggregate API not working.
- ID Unexpected displayed instead of ACL.
- Group expansion not being checked by default.
StageR
- No error message on console or UI indicating wrong storage/scope used.
Yammer
- No error message on Aspire Web UI when Yammer token is invalid.

Publishers

Elasticsearch
- Updated items published as a new item.
- NPE when using an incorrect ES port/host.
- Added validation for malformed index name.
- Groovy Transform is not validated with absolute/relative path.
- Minor UI changes (tooltips and validations).
Google Cloud Search
- NPE when hierarchy info not coming from the connector.
- $superSearcherAcl being added as part of the ACLs when setting is empty.
- Content type Raw not extracting the content for binary files.
- Option to populate the gcsUniqueId field.
- SocketTimeout exception.
- Date fields using a month range from 0 to 11 instead of 1 to 12.
Solr
- Option to set multiple URLs not working.
- XSL Transform is not validated with absolute/relative path.
- Solr URL field required Malformed URL validation.
- Removed info from tooltip about 'default core'. Core field now is required.
StageR
- Delete All Action is not always executed first.

Services

NPE while using services with workflows.
Group Expansion not loading after Aspire was restarted.
Error while adding Services with no workflows.
Broken images/icons on Services UI.
Encryption issue with authentication using LDAP Cache Service.
LDAP Cache Service: Unavailable Critical Extension error querying IBM Domino.
LDAP Cache error after importing the Service and running it.
LDAP Cache authentication problem using service account.
Discovery by Regex threw an error for non-PST files.

Known Issues

Aspire Core and Framework Components

Completed items not being removed from the process queue.
Crawl execution time still running after pausing the crawl.
Felix startup warning using Java version 11.
Connectors/publishers saved twice when Aspire components are still downloading.
HttpFeeder - Servlet added with the same name of another servlet is not notified in the UI.
Error validating the Maximum size field on Extract Text.

Publishers

- Elasticsearch: Error publishing hierarchy information to the index using the Lotus connector. This will be fixed in Aspire 4.0.1. In the meantime, as a workaround, replace line 225 of the transform.groovy file with the following line:
  - ancestors?.getChildren().each() { ancestor ->

External Technical Limitations

Aspire Core and Framework Components
- Elasticsearch Provider - "FATAL: Flushing-Error" can happen in some connectors.
Publisher
- S3: The current implementation has a 5 GB limit on uploads.
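
The 5 GB upload limit noted above can be checked before items reach the publisher. The following is a minimal sketch, not part of Aspire: the helper names are hypothetical, and it assumes "5 GB" means 5 * 1024**3 bytes, which the note does not specify.

```python
import os

# Assumption: the 5 GB limit is interpreted as 5 * 1024**3 bytes.
S3_UPLOAD_LIMIT_BYTES = 5 * 1024 ** 3


def exceeds_upload_limit(size_bytes: int, limit: int = S3_UPLOAD_LIMIT_BYTES) -> bool:
    """Return True when an item of size_bytes is larger than the upload limit."""
    return size_bytes > limit


def oversized_files(paths, limit: int = S3_UPLOAD_LIMIT_BYTES):
    """Return the subset of paths whose on-disk size exceeds the limit."""
    return [p for p in paths if exceeds_upload_limit(os.path.getsize(p), limit)]
```

Files reported by `oversized_files` could be excluded or handled separately before a crawl that publishes to S3.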

Applications

Archive Extractor
- Incremental on archive files was not working using the Lotus connector.
- Nested archive threw an "Archive not recognized" error.

Connectors

CIFS
- Malformed URL was not being validated.
- Removed slash character at the end of name attribute on Hierarchy.
Confluence
- Name attribute for the level 1 hierarchy showed the name of the content source.
Documentum DQL
- Fixed NPE when running an incremental crawl.
- DisplayUrl field was not separating webtop from document ID.
- Include/Exclude fields appeared as part of the configurations.
eRoom
- Updates and Deletes were not picked up by Incremental crawl over certain items (Comments and Votes for polls).
- UI validation when using wrong URL.
- No error was reported when setting a wrong username/password.
FileSystem
- Improved the wording in some tooltips.
Heritrix
- Implemented deletes handling feature from Heritrix in the connector framework.
Jive
- Changes on Document ACLs were reflected incorrectly for both Activity Incremental and Normal Incremental.
- Non-text Document filtering reported Add instead of Update for the documents filtered.
- Page Size value was not using the UI parameter.
Lotus
- Exclude pattern was not working as expected for items that were not attachments.
- Incremental on archive files was not working.
- Incremental crawl with index containers was not working.
- No error shown if the database and view were the same.

RDB Snapshot
- Crawl was not finishing when the Use Slices option was enabled and a bad Extract SQL was set.
- No error was reported when setting a wrong ACL SQL.
- Wrong SQL statement in Full crawl was not showing errors.
RDB Tables
- Action column was ignored for the incremental crawl.
Service Now
- Displayed incorrect URL field in Knowledge Articles (XML representation).
- Inclusion/Exclusion pattern was not working for attachments.
- Aspire error when two image files were attached and a full crawl was run.
Social Cast
- Tag nonTextDocument was missing from the Aspire Object.
SharePoint 2007
- Error on console and UI while crawling an item updated on root using both Index Containers and Scan Recursively disabled.
- NPE processing a container after changing its ACL on an Incremental crawl.
- ACLs showed the same item as group and user.
SharePoint 2010
- Minor fixes to the tooltips.
SharePoint 2013
- Incremental reported duplicate jobs when adding a subsite.
- Delete job had the incorrect displayUrl and fetchUrl after renaming a file.
SharePoint Online
- Adding specific site collections made the incremental crawl process everything.
- Renaming an item returned an add, update and delete on the same crawl.
- Error when crawling site URL with encoded blank spaces.

Publishers

Publish to Solr
- Deletes were not working correctly.

Services

Add Service button was not working.
Azure Group Expander
- Azure GE and SharePoint Online GE were not deleting users.
CEWS Listener
- PropertyOflong and PropertyOfArrayOflong were not working.
Fast Content API
- Missing validations.
Group Expansion Manager
- Fixed 'Missing version number' error when service was loaded.
- Some validations were missing.
LDAP Cache
- Some validations were missing.
- Problems with tooltips for LDAP Attribute in Cache user and Cache group options.

External Technical Limitations

Zip files are not crawled with the Activity Incrementals when they are created inside Jive Documents.
When a Salesforce SOQL statement selects a number of large fields (such as two or more custom fields of type long text) then Salesforce may return fewer records than defined in the page size in order to control the overall response payload size. The reduction in page size also occurs when dealing with base64 encoded fields (or blob fields), such as the Body. Remove these fields from the query if you want Salesforce to return the number of rows specified on the page size.
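
Because a response can contain fewer records than the configured page size, a client reading SOQL results should not treat a short page as the end of the result set. The following is a minimal sketch of that idea, assuming the Salesforce REST query response fields `records`, `done`, and `nextRecordsUrl`. The `fetch_page` callable and the fake pages are hypothetical stand-ins for real HTTP calls, not Aspire code.

```python
from typing import Callable, Dict, Iterator, Optional

Page = Dict[str, object]


def iter_records(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every record, following nextRecordsUrl until done is True.

    fetch_page is a hypothetical callable: given None it returns the first
    page of the query, given a nextRecordsUrl it returns that page.
    """
    url: Optional[str] = None
    while True:
        page = fetch_page(url)
        yield from page["records"]
        # Rely on the 'done' flag, never on len(records) == page size:
        # a page may be shortened when rows carry large fields.
        if page["done"]:
            return
        url = page["nextRecordsUrl"]


# Fake three-page result with uneven page sizes (illustrative data only).
_FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"records": [{"Id": "a"}, {"Id": "b"}], "done": False, "nextRecordsUrl": "/p2"},
    "/p2": {"records": [{"Id": "c"}], "done": False, "nextRecordsUrl": "/p3"},
    "/p3": {"records": [{"Id": "d"}, {"Id": "e"}], "done": True},
}


def fake_fetch(url: Optional[str]) -> Page:
    """Return the canned page for the given URL (None means the first page)."""
    return _FAKE_PAGES[url]


ids = [r["Id"] for r in iter_records(fake_fetch)]
print(ids)
```

The second fake page holds a single record, yet the loop continues because it follows the `done` flag rather than the page length.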

To Be Released

Box
IBM Connections

Items to Deprecate on Aspire 4.0

The following items are marked for deprecation in the next Aspire version:

Elasticsearch bootloader
- aspire-elastic-bootloader
DCM
- aspire-dcm-enterprise
- aspire-amazonec2-dm
- aspire-zk-dm
The old Admin UI(s)
- Parts of aspire-application
Big Data
- app-semantic-co-occurrence-hadoop
- app-semantic-co-occurrence-hadoop-soln
- aspire-hadoop-job-launcher
- aspire-hadoop-hdfs
- aspire-hadoop-wiki-dict-generator
- aspire-load-hdfs
Connectors
- Staging Repo Connector (File System)
Solutions
- OCR
- Semantic Co-occurrence
Publishers
- Cloudsearch
- Staging Repo Publisher (File System)

Known Issues

Aspire Core

Importing connector with special characters in the path fields not loading correctly.
Auditing
- Dump option not working with Solr 6.2.0 & 6.3.0.
- Dump option not working with Elasticsearch 5.0.2.
- Incremental Crawl - Unchanged documents not displayed in Audit Log.
Aspire Shell
- The option load-content-sources not working.
- Relative paths were not working for the commands that create jobs.
- Sometimes it was possible to delete the Aspire Shell prompt.
Failed Documents
- FailedDocuments - Connector getting stuck when stopping the crawl.
Failover
- Single instance, full test, interrupted, incremental recovery: Error Processing some files after resuming crawl.
- During full crawls, some documents were left out if an instance was killed.
- File System connector resumed crawling after restart.

Applications

Archive Extractor
- Routing options not working with OnError.
- Delete by Query not working as expected using Elasticsearch 5.0.1.
AVRO Extractor
- Routing Workflow for add/update jobs to On Error does not work.

Connectors

Include and Exclude pattern trimming empty spaces.
Aspider
- Aspider - Crawl statistics not displayed on the UI when version was used.
Documentum
- GroupExpansion marking groups as users.
- Error on console was displayed while crawling a folder: "Stream handler unavailable due to: null".
Heritrix
- Reject Images/Videos/Javascript/CSS not working for external site out of domain.
Jira Issues
- Multiple connection timeouts occurring.
Jive
- Crawl generated unnecessary deletes on Normal Incremental with the Use Progressive Retries option.
RDB Snapshot
- BIGINT SQL Server database type not supported for SQL Slices.
- Inconsistency when crawling ACL information using ACL fetching options.
- Crawl not working with a specific column when the Use column from Extraction SQL option was specified.
RDB Tables
- Wrong value in "sequence column" parameter not showing UI error.
- Inconsistent crawling ACL information when using ACL fetching options.
SharePoint Connectors
- Renaming one document included by a pattern not generating a deletion.
- No descriptive error with non-existent URL.
SharePoint 2007
- Connector generating wrong hierarchy on updates.
- Custom headers not being added to the request.
SharePoint 2010
- Documents inside a folder not being picked up by incrementals if the parent ACL changed.

Services

Import Service option not working when the service under the same name already existed.
Group Expansion Service
- Removing GE collection from mongo causing unusable GE for connectors.
Group Expansion Manager
- GE Manager - More than one GE Service using the same servlet name generating an error.

UI

Add Source List not showing until clicking refresh sources.

Updating a service name to a string of spaces was not always validated.


For version 4.0, Aspire requires a license file to run.

See Aspire Licensing for information on obtaining a license.
