Linguistic Resources

Language processing requires linguistic resources. These include:

Dictionaries of word variations (e.g. for lemmatizers)
Dictionaries of names, places, products, etc.
The pipeline configuration itself

The Language Processing Toolkit includes a "Resource Management" system that reads these resources and makes them available for language processing.

Note that resources are intended to be shared across all engines within an instance of Saga and, depending on the implementation, possibly across multiple nodes as well.

Goals:

Separated storage layer
- Allow for resources to be stored in files or different database systems
Isolate storage details from pipeline functionality
- Change providers without changing pipeline configuration
Allow extremely large dictionary resources to be stored and used centrally
- For example, in Redis or a similar distributed key-value system
Allow for Dev, Staging, and Production publishing
Allow business users to edit dictionaries and publish updates
Allow for publishing of dynamic updates to linguistic resources

Note that many of these are still only goals and are in the process of being implemented.
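
To make the "isolate storage details from pipeline functionality" goal more concrete, here is a minimal sketch in Python. It is not the Saga API: `ResourceProvider`, `InMemoryProvider`, `JsonFileProvider`, `lemmatize`, and the `lemmas` resource name are all hypothetical names chosen for illustration. The sketch assumes a resource can be treated as a simple key-value dictionary.

```python
import json
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, List, Optional


class ResourceProvider(ABC):
    """Hypothetical storage-layer interface (not part of the actual Saga API)."""

    @abstractmethod
    def lookup(self, resource: str, key: str) -> Optional[str]:
        """Return the value stored under `key` in the named resource, or None."""


class InMemoryProvider(ResourceProvider):
    """Keeps every resource in a plain dict; handy for tests."""

    def __init__(self, resources: Dict[str, Dict[str, str]]):
        self._resources = resources

    def lookup(self, resource: str, key: str) -> Optional[str]:
        return self._resources.get(resource, {}).get(key)


class JsonFileProvider(ResourceProvider):
    """Loads each resource from <directory>/<resource>.json on first use."""

    def __init__(self, directory: str):
        self._directory = Path(directory)
        self._cache: Dict[str, Dict[str, str]] = {}

    def lookup(self, resource: str, key: str) -> Optional[str]:
        if resource not in self._cache:
            path = self._directory / f"{resource}.json"
            with path.open(encoding="utf-8") as handle:
                self._cache[resource] = json.load(handle)
        return self._cache[resource].get(key)


def lemmatize(tokens: List[str], provider: ResourceProvider,
              resource: str = "lemmas") -> List[str]:
    """Pipeline-side code: it only sees the provider interface, never the storage."""
    return [provider.lookup(resource, token) or token for token in tokens]


provider = InMemoryProvider({"lemmas": {"mice": "mouse", "running": "run"}})
print(lemmatize(["mice", "running", "fast"], provider))
```

Because `lemmatize` depends only on the interface, replacing `InMemoryProvider` with `JsonFileProvider` (or, in principle, a provider backed by a key-value store) would not require touching the pipeline-side code, which is the separation the goals above describe.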
