Introduction


Aspire is a framework and a library of extensible components for building solutions that acquire data from one or more content repositories (such as file systems, relational databases, cloud storage, or content management systems); extract metadata and text from the documents; analyze, modify, and enhance the content and metadata as needed; and then publish each document, together with its metadata, to a search engine or other target application.
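
The flow described above (acquire, extract, enrich, publish) can be pictured as a chain of stages that each take a document and hand it on. The following is a minimal conceptual sketch in Python, not Aspire's actual API. Every name in it (`Document`, `extract_text`, `add_word_count`, `run_pipeline`, and the in-memory `index` list that stands in for a search engine) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Document:
    """Hypothetical stand-in for one item fetched from a content repository."""
    doc_id: str
    raw: bytes
    text: str = ""
    metadata: Dict[str, object] = field(default_factory=dict)


# A stage takes a document and returns the (possibly modified) document.
Stage = Callable[[Document], Document]


def extract_text(doc: Document) -> Document:
    # Extraction step: decode raw bytes into text.
    # (An assumption for brevity; real extractors handle many file formats.)
    doc.text = doc.raw.decode("utf-8", errors="replace")
    return doc


def add_word_count(doc: Document) -> Document:
    # Enrichment step: derive extra metadata from the extracted text.
    doc.metadata["word_count"] = len(doc.text.split())
    return doc


def run_pipeline(docs: Iterable[Document], stages: List[Stage],
                 publish: Callable[[Document], None]) -> None:
    # Pass each document through every stage in order, then publish it.
    for doc in docs:
        for stage in stages:
            doc = stage(doc)
        publish(doc)


# Usage: an in-memory list stands in for the search engine.
index: List[Document] = []
source = [Document("a", b"hello search world"), Document("b", b"one two")]
run_pipeline(source, [extract_text, add_word_count], index.append)
```

Keeping each stage a small function with the same signature is what lets stages be reused alone or combined, which mirrors the modular-component idea in the text.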

Aspire uses Apache Felix (an open source implementation of OSGi) to install, start, stop, update, and uninstall Aspire components and applications without requiring a reboot, which improves uptime and simplifies system administration. Each individual piece of processing functionality within Aspire is a modular component that can be used by itself or in conjunction with other components to create an Aspire application.

What is Aspire used for?

Aspire is used in many types of customer applications. Here are some examples:

  •   Enterprise search to enrich content with additional metadata to support advanced navigation
  •   Staffing and recruitment to provide search and match solutions between candidate CVs and job descriptions
  •   State government information site to extract metadata from OCR files and normalize the data prior to indexing
  •   Records management to automatically categorize corporate data as it is migrated into SharePoint, where content needs to be aggregated and categorized before searching
  •   Legal research to find and analyze content for forward and reverse citations to other content to improve recall and analysis
  •   Company intranet to automatically create enterprise-wide sitemaps for browsing-style investigation
  •   Federal government information site to intelligently split up large single files pertaining to laws into searchable chapters and clauses
  •   Basic content access (connector) to one or more content repositories for search engines
  •   Analyzing and grouping content geospatially for localization

Aspire is extremely flexible. Because the data processing pipelines are pulled out of the search engine, Aspire can manipulate content and metadata more powerfully and efficiently, process it in multiple pipelines simultaneously (and over multiple machines) for higher performance, and then feed it to one or more engines for indexing.
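
As a rough illustration of processing outside the engine, the sketch below runs batches of documents through a processing step concurrently and then feeds each result to more than one target. It is a conceptual Python example under stated assumptions, not Aspire's implementation: threads stand in for multiple pipelines or machines, plain lists stand in for search engines, and all names (`normalize`, `process_batch`, `feed_all`, `engine_a`, `engine_b`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Doc = Dict[str, object]


def normalize(doc: Doc) -> Doc:
    # Processing done outside the engine: tidy the title and add a derived field.
    title = str(doc["title"]).strip()
    return {**doc, "title": title, "title_length": len(title)}


def process_batch(batch: List[Doc]) -> List[Doc]:
    # One "pipeline" run over one batch of documents.
    return [normalize(doc) for doc in batch]


def feed_all(batches: List[List[Doc]],
             engines: List[Callable[[Doc], None]],
             workers: int = 2) -> None:
    # Run the batches through the pipeline concurrently, then publish every
    # processed document to every target engine. Publishing happens in the
    # calling thread, so the targets need no locking here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for processed in pool.map(process_batch, batches):
            for doc in processed:
                for publish in engines:
                    publish(doc)


# Usage: two lists stand in for two search engines.
engine_a: List[Doc] = []
engine_b: List[Doc] = []
batches = [[{"id": 1, "title": "  Alpha "}], [{"id": 2, "title": "Beta"}]]
feed_all(batches, [engine_a.append, engine_b.append])
```

Because the processing step knows nothing about the targets, the same enriched document can be sent to any number of engines without being reprocessed.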

The Aspire framework supports building Natural Language Processing (NLP), machine learning, and other analytic processing for text through a rich set of basic components. For more detailed descriptions, see the page Natural Language Processing (NLP).

If you want to start using Aspire, see here.
