Language processing requires linguistic resources:

  • Dictionaries of word variations (e.g. for lemmatizers)
  • Dictionaries of names, places, products, etc.
  • The pipeline configuration itself

The Language Processing Toolkit has a "Resource Management" system for reading and using these resources for language processing.

Note that resources are intended to be shared across all engines within an instance of Saga and, depending on the implementation, possibly across multiple nodes as well.

Goals:

  • Separated storage layer
    • Allow for resources to be stored in files or different database systems
  • Isolate storage details from pipeline functionality
    • Change providers without changing pipeline configuration
  • Allow extremely large dictionary resources to be stored and used centrally
    • For example, in Redis or a similar distributed key-value system
  • Allow for Dev, Staging, and Production publishing
  • Allow business users to edit dictionaries and publish updates
  • Allow dynamic updates to linguistic resources to be published

Note that many of these goals are not yet met and are still being implemented.

Resource Providers

A resource provider provides access to a specific set of resources. This can be a directory of resource files or a collection of tables.
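To illustrate how a provider can hide storage details from the pipeline, here is a minimal sketch. The names `ResourceProviderSketch`, `InMemoryProviderSketch`, `read`, and `put` are hypothetical and are not the toolkit's actual API; the in-memory map simply stands in for a directory of resource files or a collection of tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface (not the toolkit's API): pipeline code would depend
// only on this abstraction, never on where the resource is stored.
interface ResourceProviderSketch {
    // Returns the raw content of the named resource, if this provider has it.
    Optional<String> read(String resourceName);
}

// Hypothetical provider backed by an in-memory map, standing in for a
// directory of resource files or a collection of database tables.
class InMemoryProviderSketch implements ResourceProviderSketch {
    private final Map<String, String> resources = new HashMap<>();

    // Registers or overwrites a resource under the given name.
    void put(String resourceName, String content) {
        resources.put(resourceName, content);
    }

    @Override
    public Optional<String> read(String resourceName) {
        return Optional.ofNullable(resources.get(resourceName));
    }
}
```

Under this assumption, swapping the in-memory provider for a file-based or database-based one would not require changes to code that only calls `read`.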

Available Resource Provider Types

Locating the Resource Provider Class

The "type" parameter is used in the resource configuration to locate the resource provider class. For example:

  "type":"FileSystem"

or

  "type":"com.accenture.saga.resourcemgr.filesystem.FileSystemProvider"

The type parameter is used to locate the class using the following steps:

  1. If the type parameter contains periods:
    1. Try to load the class exactly as specified
      1. For example:  com.accenture.saga.resourcemgr.filesystem.FileSystemProvider
  2. Otherwise:
    1. Try to find a class with that name in the "com.accenture.saga.resourcemgr.filesystem" package
      1. For example:  "FileSystemProvider" → "com.accenture.saga.resourcemgr.filesystem.FileSystemProvider"
    2. Try to find a class with that name plus a "Provider" suffix in the "com.accenture.saga.resourcemgr.filesystem" package
      1. For example:  "FileSystem" → "com.accenture.saga.resourcemgr.filesystem.FileSystemProvider"
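The lookup steps above can be sketched in Java as follows. This is an illustration only, not the toolkit's actual implementation: the class name `ProviderTypeResolver` and its methods are hypothetical, and the sketch assumes the class is loaded with the standard `Class.forName` call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the toolkit) mirroring the lookup steps.
class ProviderTypeResolver {
    // Default package named in the lookup steps.
    static final String DEFAULT_PACKAGE = "com.accenture.saga.resourcemgr.filesystem";

    // Returns the fully qualified class names to try, in order.
    static List<String> candidateClassNames(String type) {
        List<String> candidates = new ArrayList<>();
        if (type.contains(".")) {
            // Step 1: a type with periods is treated as a full class name.
            candidates.add(type);
        } else {
            // Step 2.1: the name as given, in the default package.
            candidates.add(DEFAULT_PACKAGE + "." + type);
            // Step 2.2: the name with a "Provider" suffix, in the default package.
            candidates.add(DEFAULT_PACKAGE + "." + type + "Provider");
        }
        return candidates;
    }

    // Loads the first candidate class that exists on the classpath.
    static Class<?> resolve(String type) throws ClassNotFoundException {
        for (String name : candidateClassNames(type)) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                // Not found under this name; try the next candidate.
            }
        }
        throw new ClassNotFoundException("No resource provider class found for type: " + type);
    }
}
```

With this sketch, `candidateClassNames("FileSystem")` yields the plain name first and the "Provider"-suffixed name second, so both "FileSystem" and "FileSystemProvider" would end up at the same class if it is on the classpath.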

Types of Resources

Currently, there are two types of resources:

  • JSON Config
  • JSON Map

Resources Configuration