RDB Snapshot

...

This page lists all of the updates in version 3.1 of Aspire.

New Features

Improved and redesigned User Interface.
Brand new Connector Framework (for more information check here)
Several connectors refactored and migrated to the new framework.
Refactored Group Expansion to use MongoDB
Added the creationDate filter option and Progressive Retries for Jive Source Connector.
New applications: Archive and PST Extractor.

Bug Fixes

Aspire Core

Fixed issue on DXF with the escapeValue flag.
Workflow - Validation now allows a workflow rule to be re-saved.
Workflow - Component with a DXF text field is now saved.
Workflow - Deleting Components now works properly.
Scheduler General Tab - Content Source configuration saved is now being displayed in the UI.
UI - Deleting a reference does not delete the application if it is shared.
Failover - After a single-instance full crawl, it is now possible to run another one.
Aspire Framework crashing when include/exclude pattern is left empty.
Hierarchy Extractor - Fixed NPE.
Workflow - Rules are editable even if disabled.
General Auditing fixes.

Applications

Field Mapper
- Multiple Source Mappings not updating correctly.

Connectors

eRoom
- [ASPIRE-3840] Extension List option and Open Data Stream not working with the Groovy script.
IBM Connections
- ACLs not being extracted.
TeamForge
- Exclude pattern works as expected.

Publishers

- Pub2HDFS
  - WebHDFS exception.

Services

- JMS
  - now loading correctly.

Known Issues

Aspire Core

NoSQLSet is not working as expected.
Source Connector disabled in distributed scenario crawls items and counts them in the statistics.
Auditing Tool - Filtering by "Batch and All" and "Job and All" is not working.
Debug console broken in Firefox.
Aspire does not load even after the console says it is loaded.
Add option does not show results when filtering from the last page.

Applications

Archive Extractor
- Deletion is not working with file names with special characters.
- "Index Archive file job" is throwing exceptions or is not adding the job requested.
PST Extractor
- Connector indexes containers.

Connectors

...

Aspire Connector Archetype

The Aspire connector archetype does not compile because of a wrong dependency. To make it work, change the "aspire-connector-framework" dependency version from 3.0 to 3.1.
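
As an illustration only, the version change described above could be applied to a generated pom.xml with a small script along these lines. This is a minimal sketch: the function name bump_dependency_version, the sample POM text, and the com.example groupId are hypothetical placeholders and are not taken from the Aspire archetype. It also assumes the <version> element directly follows the <artifactId> element.

```python
import re

# Artifact named in the known issue above.
ARTIFACT_ID = "aspire-connector-framework"

# Hypothetical POM fragment; the groupId is a placeholder, not the real one.
SAMPLE_POM = """<dependency>
  <groupId>com.example</groupId>
  <artifactId>aspire-connector-framework</artifactId>
  <version>3.0</version>
</dependency>"""


def bump_dependency_version(pom_text, old="3.0", new="3.1"):
    """Return pom_text with the framework dependency moved from `old` to `new`.

    Assumption: <version> comes immediately after <artifactId>.
    Other dependencies and other versions are left untouched.
    """
    pattern = re.compile(
        r"(<artifactId>\s*" + re.escape(ARTIFACT_ID) + r"\s*</artifactId>\s*<version>\s*)"
        + re.escape(old)
        + r"(\s*</version>)"
    )
    # A function replacement avoids mixing group references with the digits in `new`.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), pom_text)


if __name__ == "__main__":
    print(bump_dependency_version(SAMPLE_POM))
```

Editing the version by hand in pom.xml has the same effect; the script is only useful when several generated projects need the same change.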

...

IBM Connections

Adds are reported as Updates.

...

RDB Source

Need to flush jobs on RDB Source Connector.

New Features

A new web crawler, the Aspider Connector, replaces the legacy Heritrix Connector.
Salesforce Connector has been refactored to include the following features:
- Runs in the new connector framework.
- Supports execution in a distributed environment.
- Allows concurrent crawling of multiple endpoints.
- Provides faster incremental crawls.
- Uses snapshots.
New way to manage Failed Documents for all of the Source Connectors:
- Allows reprocessing of documents that previously failed in either the processing or the publishing stage.
Avro Reader Extractor Application and Avro Publisher (Alpha version).
Parquet Extractor Application (Alpha version).
SMTP Connector (Alpha version).
HDFS Connector and Web HDFS Publisher (Alpha version).
Http Generic Service (Alpha version).
Documentum DQL
- Error tolerant option to index metadata when fetch document fails.
- New RenditionType option for indexing.
Support of Azure Authentication on the SharePoint Online Connector.
New features for the SharePoint Connector (2007/2010):
- Supports default snapshots on incremental crawls.
- Supports crawling specific views on lists.
Implemented a single security key-store throughout all of Aspire.
Updated SharePoint 2007/2010 Web Service Extensions.

Bug Fixes

Aspire Core

Incorrect Historical Statistics for a new connector after another was executed.
Negative DPS appeared on the Statistics.
Link related to Aspire Authentication was updated in the config/settings.xml file.
Crawl Begin/End Auditing actions displayed different elements on Auditing.
NPE when a crawl stopped without pausing.
NPE when a crawl stopped in ReleaseController after pausing.
UI not correctly showing the error for an invalid Groovy script.
Content source name could be updated with blank spaces.
Historical Crawl Statistics for one connector appearing in another one.
Incorrect logs shown in the QueueLoader component.
Exception loading workflow.xml file when an empty custom groovy script was added.
Advanced scheduler option was not working.
Server error was encountered when displaying Audit Log.
Group Expansion and Advance properties check boxes were misplaced.
Page was stuck when adding a Custom Publisher.
Service start and stop buttons displayed wrong tooltips.
Crawl stalling in Linux after publishing end job.
Couldn't save content source when using multiple check box selectors with one option checked.
OpenDXF was not escaping characters in JSON inputs.
Any exception thrown produced a NPE in the app-rap-connector.
Failover: Dual instance full test (interrupted) - Recovery Option Full - Not all items Crawled
Failover: Dual instance full test (interrupted) - Recovery Option Incremental - Never ended crawling
Allowed relaxing of the deletes policy in the connector framework.
Aspire not detecting changes in the connector settings and not asking to save them.
Stop crawl option not working correctly.
Crawl showing wrong time when crawling in Linux.
Statistics showing items In Progress when paused.
Docs not crawled were reported as Adds on Auditing.
Aspire.sh -create_master option not working properly on Linux.
Hierarchy extractor needed to have default values selected on Workflow jobs to work properly.
Minor improvements and fixes in Aspire UI.
Validations improvements and fixes for several components.

Applications

Archive Extractor
- Incremental on archive files was not working using the Lotus connector.
- Nested archive threw an "Archive not recognized" error.

Connectors

CIFS
- Malformed URL was not being validated.
- Removed slash character at the end of name attribute on Hierarchy.
Confluence
- Name attribute for the level 1 hierarchy showed the name of the content source.
Documentum DQL
- Fixed NPE when running an incremental crawl.
- DisplayUrl field was not separating webtop from document id.
- Include/Exclude fields appeared as part of the configurations.
eRoom
- Updates and Deletes were not picked up by Incremental crawl over certain items (Comments and Votes for polls).
- UI validation when using wrong URL.
- No error was reported when setting a wrong username/password.
FileSystem
- Improved the wording in some tooltips.
Heritrix
- Implemented deletes handling feature from Heritrix in the connector framework.
Jive
- Changes on Document ACLs were reflected incorrectly for both Activity Incremental and Normal Incremental.
- Non-text Document filtering reported Add instead of Update for the documents filtered.
- Page Size value was not using the UI parameter.
Lotus
- Exclude pattern was not working as expected for items that were not attachments.
- Incremental on archive files was not working.
- Incremental crawl with index containers was not working.
- No error was shown if the database and view were the same.

RDB Snapshot

- Full crawl not working. Console and UI got stuck.
- Crawl was not finishing when the Use Slices option was set with a bad Extract SQL.
- No error was reported when setting a wrong ACL SQL.
- Wrong SQL statement in Full crawl was not showing errors.
RDB Tables
- Action column was ignored for the incremental crawl.
Service Now
- Displayed incorrect URL field in Knowledge Articles (XML representation).
- Inclusion/Exclusion pattern was not working for attachments.
- Aspire error when two image files were attached and a full crawl was run.
Social Cast
- Tag nonTextDocument was missing from the Aspire Object.
SharePoint 2007
- Error on console and UI while crawling an item updated on root using both Index Containers and Scan Recursively disabled.
- NPE when processing a container after changing its ACL on an Incremental crawl.
- ACLs showed the same item as group and user.
SharePoint 2010

...

NPE when Domain is empty.

...

SharePoint 2013

Items crawled and Items with error are the same if domain is empty.

...

Staging

Full crawl is retrieving deleted documents.
"Cannot get content source from job" error when Content Source field is not specified.
Staging Source Connector - Wrong error shown when Storage Unit does not exist.
Staging Source Connector - No error shown when scope does not exist for the given Storage Unit.

- Minor fixes to the tooltips.
SharePoint 2013
- Incremental reported duplicate jobs when adding a subsite.
- Delete job had the incorrect displayUrl and fetchUrl after renaming a file.
SharePoint Online
- Adding specific site collections made the incremental crawl process everything.
- Renaming an item returned an add, update and delete on the same crawl.
- Error when crawling site URL with encoded blank spaces.

Publishers

Publish to Solr
- Deletes were not working correctly.

Services

Add Service button was not working.
Azure Group Expander
- Azure GE and SharePoint Online GE were not deleting users.
CEWS Listener
- PropertyOflong and PropertyOfArrayOflong were not working.
Fast Content API
- Missing validations.
Group Expansion Manager
- Fixed 'Missing version number' error when service was loaded.
- Some validations were missing.
LDAP Cache
- Some validations were missing.
- Problems with tooltips for LDAP Attribute in Cache user and Cache group options.

External Technical Limitations

Zip files are not crawled with the Activity Incrementals when they are created inside Jive Documents.
When a Salesforce SOQL statement selects a number of large fields (such as two or more custom fields of type long text) then Salesforce may return fewer records than defined in the page size in order to control the overall response payload size. The reduction in page size also occurs when dealing with base64 encoded fields (or blob fields), such as the Body. Remove these fields from the query if you want Salesforce to return the number of rows specified on the page size.
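
As a rough illustration of the workaround above, the following sketch builds a SOQL SELECT statement that leaves out fields known to be large. It is only an example under stated assumptions: the helper name build_soql and the field lists are hypothetical, and the Salesforce connector may build its queries differently.

```python
def build_soql(sobject, fields, large_fields=()):
    """Build a SOQL SELECT for `sobject`, skipping any field listed in `large_fields`.

    Field names are compared case-insensitively. `large_fields` would hold
    long text or base64 (blob) fields such as Body.
    """
    skip = {name.lower() for name in large_fields}
    kept = [name for name in fields if name.lower() not in skip]
    if not kept:
        raise ValueError("no fields left to select after removing large fields")
    return "SELECT " + ", ".join(kept) + " FROM " + sobject


if __name__ == "__main__":
    # Body is a base64 field, so it is dropped from the query.
    print(build_soql("Attachment", ["Id", "Name", "Body"], large_fields=["Body"]))
    # prints: SELECT Id, Name FROM Attachment
```

With the large fields removed from the statement, Salesforce has no payload-size reason to return fewer rows than the configured page size.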

To Be Released

Box
IBM Connections

Items to Deprecate on Aspire 3.2

The following items are marked to be deprecated on the next Aspire version:

Elasticsearch bootloader
- aspire-elastic-bootloader
DCM
- aspire-dcm-enterprise
- aspire-amazonec2-dm
- aspire-zk-dm
The old Admin UI(s)
- Parts of aspire-application
Big Data
- app-semantic-co-occurrence-hadoop
- app-semantic-co-occurrence-hadoop-soln
- aspire-hadoop-job-launcher
- aspire-hadoop-hdfs
- aspire-hadoop-wiki-dict-generator
- aspire-load-hdfs
Connectors
- Staging Repo Connector (File System)
Solutions
- OCR
- Semantic Co-occurrence
Publishers
- Cloudsearch
- Staging Repo Publisher (File System)

Known Issues

Aspire Core

Importing connector with special characters in the path fields not loading correctly.
Auditing
- Dump option not working with Solr 6.2.0 & 6.3.0
- Dump option not working with ElasticSearch 5.0.2
- Incremental Crawl - Unchanged documents not displayed in Audit Log.
Aspire Shell
- The option load-content-sources not working.
- Relative paths were not working for the commands that create jobs.
- Sometimes it was possible to delete the Aspire Shell prompt.
Failed Documents
- FailedDocuments - Connector getting stuck when stopping the crawl.
Failover
- Single instance, full test, interrupted, incremental recovery: Error Processing some files after resuming crawl.
- During full crawls, some documents were left out if an instance was killed.
- File System connector resumed crawling after restart.

Applications

Archive Extractor
- Routing options not working with OnError.
- Delete by Query not working as expected when using ElasticSearch 5.0.1.
AVRO Extractor
- Routing Workflow for add/update jobs to On Error does not work

Connectors

Include and Exclude pattern trimming empty spaces.
Aspider
- Aspider - Crawl statistics not displayed on the UI when version was used.
CIFS
- Due to a security issue observed with the protocol, Microsoft has recommended deactivating SMB1 from Windows servers.
Documentum
- GroupExpansion marking groups as users.
- Error on console was displayed while crawling a folder: "Stream handler unavailable due to: null"
Heritrix
- Reject Images/Videos/Javascript/CSS not working for external site out of domain.
Jira Issues
- Multiple connection timeouts occurring.
Jive
- Crawl generated unnecessary deletes on Normal Incremental with the Use Progressive Retries option.
RDB Snapshot
- bigINT SQL Server database type not supported for SQL Slices.
- Inconsistency when crawling ACL information using ACL fetching options.
- Crawl not working with a specific column when the Use column from Extraction SQL option was specified.
RDB Tables
- Wrong value in "sequence column" parameter not showing UI error.
- Inconsistent crawling ACL information when using ACL fetching options.
SharePoint Connectors
- Renaming one document included by a pattern not generating a deletion.
- No descriptive error with non-existent URL.
SharePoint 2007
- Connector generating wrong hierarchy on updates.
- Custom headers not being added to the request.
SharePoint 2010
- Documents inside a folder not being picked up by incrementals if the parent ACL changed.

Publishers

Publish to SharePoint 2013
- Due to a security issue observed with the protocol, Microsoft has recommended deactivating SMB1 from Windows servers.

Services

Import Service option not working when the service under the same name already existed.
Group Expansion Service
- Removing GE collection from mongo causing unusable GE for connectors.
Group Expansion Manager
- GE Manager - More than one GE Service using the same servlet name generating an error.

UI

Add Source List not showing until clicking refresh sources.
Updating a service name to a string of spaces was not always validated.

Solutions

OCR
- Publish to staging is not being added to the Workflow section when connector is added.

External Technical Limitations

Changes in Box notes content are not considered for incremental crawls.
New items added to IBM Connections are reported as updated.
Changes made to the attachments of the item type Opportunity in Salesforce are not considered for incremental crawls.
Entitlements API is not supporting "User Overrides" on Jive connector. In this case, ACLs will not be retrieved.
Documents with exact same date (including milliseconds) will affect the statistics for Incremental Crawls on Jive connector.

Important Note

MapDB is not used in the new connector framework.

...
