Warning

Page under construction

Saga Server is a JavaSpark server that implements the Saga Library and other tools as an out-of-the-box question-answering system, fully integrated with the Saga Admin UI.

Starting the Server

The server can be started in one of two ways:

With built-in configuration (Saga UI compatible)

This mode uses the configuration embedded in the server, which expects a MongoDB database called "sagaDB" with the collections and formats used by the Saga Admin UI.

Code Block
java -jar -Dfile.encoding=UTF-8 saga-server-parser-0.0.1-SNAPSHOT.jar

With custom configuration and pipeline

This mode requires a config.json file and a pipelines.json file to be passed as arguments to the application. We recommend overwriting the files inside the server jar instead, in order to use and preserve the Saga Admin UI format of the database.

Code Block
java -jar -Dfile.encoding=UTF-8 saga-server-parser-0.0.1-SNAPSHOT.jar config.json pipelines.json
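The two start modes differ only in whether the two file arguments are present. A minimal Python sketch of a launcher helper that builds either command line is shown here. The function name `build_command` is hypothetical, and the rule that both files must be given together is an assumption drawn from the description above, not a documented server behavior.

```python
import shlex

# Jar name copied from the commands above.
JAR_NAME = "saga-server-parser-0.0.1-SNAPSHOT.jar"


def build_command(config=None, pipelines=None, jar=JAR_NAME):
    """Return the argument list for starting the server.

    With no files, the built-in configuration is used. With both files,
    they are appended as positional arguments in the order shown above.
    """
    command = ["java", "-jar", "-Dfile.encoding=UTF-8", jar]
    if config is None and pipelines is None:
        return command
    if config is None or pipelines is None:
        # Assumption: the custom mode needs both files.
        raise ValueError("config and pipelines must be passed together")
    return command + [str(config), str(pipelines)]


if __name__ == "__main__":
    # Print both variants as shell-ready strings.
    print(shlex.join(build_command()))
    print(shlex.join(build_command("config.json", "pipelines.json")))
```

The returned list can be handed to `subprocess.run` directly, which avoids shell quoting issues when the file paths contain spaces.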

Server Configuration

Below is a partial example of the Server Parser config. To save space, the aggregation and fetchTimestamp values of the providers are omitted and shown as [...]:

Code Block
config.json
{
  "config": {
    "apiPort": 8080,
    "libraryJars": ["./lib"],
    "actionManagerConfig": {
      "actions": "actions-provider:actiongroups"
    },
    "actionProviders": [
      {
        "name": "constant",
        "display": "Constant",
        "type": "Constant",
        "fetchesData": false
      },
      {
        "name": "Human",
        "display": "Human",
        "type": "Elasticsearch",
        "fetchesData": true,
        "transformation": "./transformations/es-transformation.js",
        "hosts": [
          {
            "host": "localhost",
            "port": 9200,
            "schema": "http"
          }
        ],
        "index": "wikidata",
        "query": {
          "bool": {
            "should": [
              {
                "term": {
                  "_id": {
                    "value": "{{human._id}}",
                    "boost": 9999
                  }
                }
              },
              {
                "match_phrase": {
                  "label": {
                    "query": "{{human}}",
                    "boost": 1
                  }
                }
              },
              {
                "match_phrase": {
                  "aliases": {
                    "query": "{{human}}",
                    "boost": 2
                  }
                }
              },
              {
                "query_string": {
                  "query": "{{human.match}}"
                }
              }
            ]
          }
        }
      },
      {
        "name": "geography",
        "display": "Geography",
        "type": "Elasticsearch",
        "fetchesData": true,
        "transformation": "./transformations/es-transformation.js",
        "hosts": [
          {
            "host": "localhost",
            "port": 9200,
            "schema": "http"
          }
        ],
        "index": "wikidata",
        "query": {
          "bool": {
            "should": [
              {
                "term": {
                  "_id": {
                    "value": "{{geography._id}}",
                    "boost": 9999
                  }
                }
              },
              {
                "match_phrase": {
                  "label": {
                    "query": "{{geography}}",
                    "boost": 1
                  }
                }
              },
              {
                "match_phrase": {
                  "aliases": {
                    "query": "{{geography}}",
                    "boost": 2
                  }
                }
              },
              {
                "query_string": {
                  "query": "{{geography.match}}"
                }
              }
            ]
          }
        }
      },
      {
        "name": "currency",
        "display": "Currency",
        "type": "Openexchange",
        "transformation": "./transformations/openexchange-transformation.js",
        "fetchesData": false,
        "appId": "6cfdc0df634a4796b1a40748cdbe6006"
      }
    ],
    "providers": [
      {
        "name": "actions-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      },
      {
        "name": "entity-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      },
      {
        "name": "patterns-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      },
      {
        "name": "regex-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      },
      {
        "name": "equipment-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      },
      {
        "name": "unit-provider",
        "type": "MongoDB",
        "uri": "mongodb://localhost:27017",
        "database": "sagaDB",
        "aggregation": [...],
        "transactionCollection": "transactions",
        "fetchTimestamp": [...]
      }
    ]
  }
}
  • apiPort - port on which the server listens for requests
  • libraryJars - paths to the folders holding the jars that need to be added to the classpath. The paths can be either absolute or relative
  • actionManagerConfig - configuration of the Action Manager, which decides which action providers need to handle the graph response
    • actions - reference to the resource provider holding the action definitions
  • actionProviders - list of the implemented action providers to be used by the Action Manager. Each action definition can be found in Saga Actions
  • providers - standard resource providers. The provider holding the actions is also declared in this section
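The fields above can be sanity-checked before starting the server. The following Python sketch is an illustration only: it uses a trimmed-down config of the same shape as the example, and it assumes that the `actions` value follows a `provider:resource` pattern in which the part before the colon names an entry in `providers`. The helper names `split_reference` and `check_config` are hypothetical and not part of Saga.

```python
import json

# Trimmed-down config with the same shape as the example above.
CONFIG_TEXT = """
{
  "config": {
    "apiPort": 8080,
    "actionManagerConfig": {"actions": "actions-provider:actiongroups"},
    "actionProviders": [
      {"name": "constant", "type": "Constant", "fetchesData": false}
    ],
    "providers": [
      {"name": "actions-provider", "type": "MongoDB", "database": "sagaDB"},
      {"name": "entity-provider", "type": "MongoDB", "database": "sagaDB"}
    ]
  }
}
"""


def split_reference(reference):
    """Split a 'provider:resource' string into its two parts."""
    provider, separator, resource = reference.partition(":")
    if not separator or not provider or not resource:
        raise ValueError(f"expected 'provider:resource', got {reference!r}")
    return provider, resource


def check_config(document):
    """Return a list of problems found in a parsed config document."""
    config = document["config"]
    problems = []
    port = config.get("apiPort")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append(f"apiPort is not a valid port: {port!r}")
    # Assumption: the text before the colon must match a declared provider.
    declared = {provider["name"] for provider in config.get("providers", [])}
    provider, _ = split_reference(config["actionManagerConfig"]["actions"])
    if provider not in declared:
        problems.append(f"actions refers to undeclared provider {provider!r}")
    return problems


if __name__ == "__main__":
    print(check_config(json.loads(CONFIG_TEXT)))
```

Note that the full example above cannot be parsed as-is, because the `[...]` placeholders are not valid JSON.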

The other part of the configuration is the pipelines.json file, which holds the pipeline definitions. Currently there are 2 pipelines for Saga Parser:

  • process - is defined for text processing 
  • ml - used as a machine learning preprocess for the Entity Trainer

Besides that, the pipelines are exactly as described in Configure Pipelines & Resource Providers

Code Block
pipelines.json
{
  "config": {
    "pipelineConfiguration": {
      "process": {
        "reader": {
          "type": "SimpleReader",
          "splitRegex": "[\r\n]+"
        },
        "stages": [
          {
            "type": "SentenceBreakerStage"
          },
          {
            "type": "WhitespaceTokenizerStage",
            "requiredFlags": [
              "SENTENCE"
            ]
          },
          {
            "type": "CharacterSplitter"
          },
          {
            "type": "CharChangeSplitter",
            "case": true,
            "number": true,
            "punctuation": true
          },
          {
            "type": "LemmatizeStage",
            "exclude": [
              "ob",
              "syn",
              "alt"
            ],
            "skipFlags": [
              "ALL_PUNCTUATION"
            ]
          },
          {
            "type": "CaseAnalysisStage"
          },
          {
            "type": "NumberRecognizer"
          },
          {
            "type": "StopWordsStage"
          },
          {
            "type": "RegexPatternStage",
            "patterns": "regex-provider:patterns",
            "caseInsensitive": true,
            "boundaryFlags": [
              "SENTENCE_SPLIT"
            ]
          },
          {
            "type": "DictionaryTaggerStage",
            "dictionary": "entity-provider:entities",
            "skipFlags": [
              "SKIP"
            ],
            "boundaryFlags": [
              "SENTENCE_SPLIT"
            ],
            "requiredFlags": [
              "TOKEN",
              "ALL_LOWER_CASE"
            ],
            "ignoreTags": [
              "root"
            ],
            "debug": true
          },
          {
            "type": "AdvancedPattern",
            "skipFlags": [
              "SKIP"
            ],
            "patterns": "patterns-provider:patterns",
            "debug": true
          }
        ]
      },
      "ml": {
        "reader": {
          "type": "SimpleReader",
          "splitRegex": "[\r\n]+"
        },
        "stages": [
          {
            "type": "QuotationBreakerStage",
            "singleQuotes": true
          },
          {
            "type": "SentenceBreakerStage",
            "skipFlags": [
              "PROCESSED"
            ]
          },
          {
            "type": "WhitespaceTokenizerStage",
            "requiredFlags": [
              "SENTENCE"
            ]
          },
          {
            "type": "CharChangeSplitter",
            "case": false,
            "number": true,
            "punctuation": true
          },
          {
            "type": "CaseAnalysisStage"
          },
          {
            "type": "StopWordsStage"
          },
          {
            "type": "RegexPatternStage",
            "patterns": "regex-provider:patterns",
            "caseInsensitive": true,
            "boundaryFlags": [
              "SENTENCE_SPLIT"
            ]
          },
          {
            "type": "DictionaryTaggerStage",
            "dictionary": "unit-provider:entities",
            "skipFlags": [
              "SKIP"
            ],
            "boundaryFlags": [
              "SENTENCE_SPLIT"
            ],
            "requiredFlags": [
              "TOKEN",
              "ALL_LOWER_CASE"
            ],
            "ignoreTags": [
              "root"
            ]
          },
          {
            "type": "AdvancedPattern",
            "skipFlags": [
              "SKIP"
            ],
            "patterns": "patterns-provider:patterns"
          },
          {
            "type": "DictionaryTaggerStage",
            "dictionary": "equipment-provider:entities",
            "skipFlags": [
              "SKIP"
            ],
            "boundaryFlags": [
              "SENTENCE_SPLIT"
            ],
            "requiredFlags": [
              "TOKEN",
              "ALL_LOWER_CASE"
            ],
            "ignoreTags": [
              "root"
            ]
          }
        ]
      }
    }
  }
}
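Stages in pipelines.json point at resource providers through values such as "regex-provider:patterns". The following Python sketch lists the stage order of a pipeline and collects the provider names it refers to, so they can be compared against the providers declared in config.json. It works on a shortened pipeline of the same shape as the example. The helper names are hypothetical, and reading the text before the colon as a provider name is an assumption based on the provider names in the config example.

```python
import json

# Shortened pipeline definition with the same shape as the example above.
PIPELINES_TEXT = """
{
  "config": {
    "pipelineConfiguration": {
      "process": {
        "stages": [
          {"type": "SentenceBreakerStage"},
          {"type": "WhitespaceTokenizerStage", "requiredFlags": ["SENTENCE"]},
          {"type": "RegexPatternStage", "patterns": "regex-provider:patterns"},
          {"type": "DictionaryTaggerStage", "dictionary": "entity-provider:entities"},
          {"type": "AdvancedPattern", "patterns": "patterns-provider:patterns"}
        ]
      }
    }
  }
}
"""


def stage_types(document, pipeline):
    """Return the stage types of one pipeline, in execution order."""
    stages = document["config"]["pipelineConfiguration"][pipeline]["stages"]
    return [stage["type"] for stage in stages]


def referenced_providers(document):
    """Collect provider names used in 'dictionary' and 'patterns' fields."""
    names = set()
    for pipeline in document["config"]["pipelineConfiguration"].values():
        for stage in pipeline["stages"]:
            for key in ("dictionary", "patterns"):
                if key in stage:
                    # Assumption: the text before the colon is the provider name.
                    names.add(stage[key].split(":", 1)[0])
    return names


if __name__ == "__main__":
    document = json.loads(PIPELINES_TEXT)
    print(stage_types(document, "process"))
    print(sorted(referenced_providers(document)))
```

Any name returned by `referenced_providers` that is missing from the `providers` list in config.json is worth checking before the server is started.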

...

When we refer to the Saga Solution, we mean the combination of the Saga Management UI, Saga Server, Python Bridge and Saga Library, among other utilities created to accelerate the creation of an NLP (Natural Language Processing) base. In the diagram below you can see how each part fits together and how the App and the Data Storage (of which Saga is independent) interact with the Saga Solution.



Image Added