Language processing requires linguistic resources. These include:

  • Dictionaries of word variations (e.g. for lemmatizers)
  • Dictionaries of names, places, products, etc.
  • The pipeline configuration itself

The Language Processing Toolkit has a "Resource Management" system for reading and using these resources.

Note that resources are intended to be shared across all engines within an instance of Saga and, depending on the implementation, possibly across multiple nodes as well.

Goals:

  • Separated storage layer
    • Allow for resources to be stored in files or different database systems
  • Isolate storage details from pipeline functionality
    • Change providers without changing pipeline configuration
  • Allow extremely large dictionary resources to be stored and used centrally
    • For example, in Redis or a similar distributed key-value system
  • Allow for Dev, Staging, and Production publishing
  • Allow business users to edit dictionaries and publish updates
  • Allow for publishing of dynamic updates to linguistic resources

Note that many of these goals are aspirational for now; they are still being implemented.
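
To make the "separated storage layer" goal concrete, here is a minimal sketch of how a pipeline could look up a resource by a logical reference while the storage backend stays swappable. It is an illustration only and not the toolkit's actual API: the names `ResourceProvider`, `FileProvider`, `InMemoryProvider`, `ResourceManager`, and the `provider:resource` reference format are all assumptions made for this example.

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class ResourceProvider(ABC):
    """Hypothetical storage-layer interface (an assumption, not the real API)."""

    @abstractmethod
    def load(self, name: str) -> Dict[str, str]:
        """Return the dictionary resource stored under `name`."""


class FileProvider(ResourceProvider):
    """Reads each resource from a JSON file named <name>.json under a root folder."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def load(self, name: str) -> Dict[str, str]:
        with open(self.root / f"{name}.json", encoding="utf-8") as handle:
            return json.load(handle)


class InMemoryProvider(ResourceProvider):
    """Stand-in for a central store such as a key-value database."""

    def __init__(self, resources: Dict[str, Dict[str, str]]) -> None:
        self._resources = resources

    def load(self, name: str) -> Dict[str, str]:
        return dict(self._resources[name])


class ResourceManager:
    """Resolves 'provider:resource' references and shares loaded resources."""

    def __init__(self) -> None:
        self._providers: Dict[str, ResourceProvider] = {}
        self._cache: Dict[str, Dict[str, str]] = {}

    def register(self, provider_name: str, provider: ResourceProvider) -> None:
        self._providers[provider_name] = provider

    def get(self, reference: str) -> Dict[str, str]:
        # Load once, then hand the same object to every caller.
        if reference not in self._cache:
            provider_name, resource_name = reference.split(":", 1)
            provider = self._providers[provider_name]
            self._cache[reference] = provider.load(resource_name)
        return self._cache[reference]
```

In this sketch a pipeline stage would only ever hold a reference such as `"dictionaries:lemmas"`. Registering a `FileProvider` or an `InMemoryProvider` under the name `dictionaries` changes where the data comes from without touching that reference, which is the kind of isolation the goals above describe.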

Resource Providers


Types of Resources


Resource Details


