Welcome to the Getting Started guide. This is what you will achieve by following the next steps:

  • Getting Saga
  • Setting up the environment
  • Deploying Saga
  • Running Saga
  • Checking that it's working!

Prerequisites


  1. Elasticsearch 6.4.1 or above.
  2. Java 11 or above.
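
As a rough illustration of the Elasticsearch prerequisite, the sketch below compares a version string against the 6.4.1 minimum. It assumes you have the body returned by a GET request to the root URL of your Elasticsearch instance, which reports the version under "version" → "number". The helper names (parse_version, es_version_ok) are hypothetical and not part of Saga.

```python
import json

MIN_ES_VERSION = (6, 4, 1)  # minimum stated in the prerequisites above


def parse_version(text):
    """Turn '6.4.1' into (6, 4, 1); a suffix such as '-SNAPSHOT' is ignored."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def es_version_ok(root_response_body, minimum=MIN_ES_VERSION):
    """Check the version found in the JSON body of GET / on Elasticsearch."""
    info = json.loads(root_response_body)
    return parse_version(info["version"]["number"]) >= minimum


# Illustrative payload only; a real response contains more fields.
sample = '{"version": {"number": "6.8.23"}}'
print(es_version_ok(sample))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "6.10.0" would sort before "6.4.1".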

Step 1: Getting Saga


At the time of writing, Saga version 1.0.0 has just been released. You can get it here (Saga team in MS Teams).

It is a bundle containing:

  1. Saga binaries
  2. Instance of Elasticsearch (with sample data for your convenience)
  3. Saga introductory presentation
  4. Saga user manual


Note

Creators and users of Saga are subscribed to this Saga team, so you can always post a message there if you have questions or comments or need help.


Step 2: Set up the environment


  1. Check that Java 11 or above is installed on your machine by running the following in a terminal (or system console):

    $> java -version

  2. Unpack Saga in your preferred location.
    The layout used in this guide is our recommended setup, but you can arrange the paths as you wish.
    This guide will refer to Saga's working directory as {SAGA_HOME}.
  3. Saga uses Elasticsearch (6.4.1 or above) and you can get it here.
    1. Deploy Elasticsearch (ES) under {SAGA_HOME} in something like {SAGA_HOME}/Elasticsearch-6.4.1.
    2. Run ES by executing the binary in {SAGA_HOME}/Elasticsearch-6.4.1/bin.
      • Saga can run on an empty ES instance, although you will then need to add new tags and resources yourself.
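
The Java check in item 1 can also be scripted. The following is a minimal sketch, assuming that `java -version` prints a line containing `version "<number>"` (the usual format for OpenJDK and Oracle JDK builds) and that releases before Java 9 report themselves as 1.x. The function names are hypothetical.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 reports "1.8.0_xxx", so the second number is the major.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version`; return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        return None  # java is not on the PATH or cannot be executed
    # The version banner is normally written to stderr.
    return java_major_version(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found or its version could not be parsed.")
    elif major < 11:
        print(f"Java {major} found; Saga needs Java 11 or above.")
    else:
        print(f"Java {major} found.")
```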

Step 3: Deploy Saga


Once you have Saga in {SAGA_HOME}, validate the following:

  1. There is a {SAGA_HOME}/lib folder containing the following JARs:
    • saga-classification-trainer-stage-1.0.0-SNAPSHOT
    • saga-name-trainer-stage-1.0.0-SNAPSHOT
  2. Check the basic configuration in {SAGA_HOME}/config/config.json:
    • "airPort": 8080 → The port used by the server.
    • "ipAdress": "0.0.0.0" → The IP/mask used to restrict inbound connections. It is open to all connections by default.
    • "logger" → The log level configuration for each handler.
    • "provider" → Data resources, mainly used to specify the location of resource files (such as dictionaries) and the ES configuration.
      • New filesystem providers can be added to group different resource files.
      • The ES configuration includes the "port" to connect to. The default is 9200; you may change it to fit your environment.
    • "solutions" → The bundled solution schema. Change its values to run multiple servers with different "solutions" or to switch from one solution to another.
      • A solution works as a domain. By default, the "saga" solution creates ES indexes using the pattern "saga-<index>" and only loads indexes that match the same pattern.
        This means you can host multiple solutions on one ES server.
        To switch between solutions, shut down the server, change the "indexName" value, and restart the server.
  3. If you have some valid "models" you'd like to include on the server:
    1. Create a {SAGA_HOME}/nt-models folder for "name trainers" and copy the model there.
    2. Create a {SAGA_HOME}/ct-models folder for "classification trainers" and copy the model there.
  4. To add datasets:
    1. Create a {SAGA_HOME}/datasets folder.
    2. Each dataset must be placed in its own folder. This folder name will be the one displayed for "test runs".
    3. Each data document in the dataset must be compliant with Saga's data file JSON format.
    4. Each folder must contain a ".metadata" file with information about the dataset and how to read it.
      You can check the dataset format here.
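
The checks above can be automated with a short script. The sketch below is only an illustration under a few assumptions: that the JARs in the lib folder carry a .jar extension, that config.json is strict JSON, and that the datasets folder is optional. The function name validate_saga_home is hypothetical, and the script does not inspect individual configuration keys.

```python
import json
from pathlib import Path


def validate_saga_home(saga_home):
    """Return a list of problems found; an empty list means the layout looks fine."""
    home = Path(saga_home)
    problems = []

    # Item 1: a lib folder with JAR files (the .jar extension is an assumption).
    lib = home / "lib"
    if not lib.is_dir() or not any(lib.glob("*.jar")):
        problems.append("lib folder is missing or contains no JAR files")

    # Item 2: config/config.json exists and parses as JSON.
    config_path = home / "config" / "config.json"
    if not config_path.is_file():
        problems.append("config/config.json is missing")
    else:
        try:
            json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            problems.append(f"config/config.json is not valid JSON: {err}")

    # Item 4: every dataset folder needs its own .metadata file.
    datasets = home / "datasets"
    if datasets.is_dir():
        for folder in sorted(p for p in datasets.iterdir() if p.is_dir()):
            if not (folder / ".metadata").is_file():
                problems.append(f"dataset '{folder.name}' has no .metadata file")

    return problems
```

Calling validate_saga_home with the path you chose for {SAGA_HOME} and printing each returned entry gives a quick checklist before starting the server.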

Step 4: Run Saga


To run Saga:

  1. Check that Elasticsearch is running.
  2. Use the bundled startup script in {SAGA_HOME} (either startup.bat for Windows or startup.sh for Linux).
  3. If you didn't change the default port in the configuration, you should be able to access Saga at http://localhost:8080/.
    Otherwise, check your configuration for the right port.
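
To confirm that both services answer, a small reachability probe can help. This is a minimal sketch using only the Python standard library; it assumes the default ports mentioned in this guide (9200 for Elasticsearch, 8080 for Saga) and treats any HTTP response, even an error status, as proof that a server is listening. The function name is_reachable is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, just with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # connection refused, timed out, or not speaking HTTP


if __name__ == "__main__":
    # Default ports from this guide; adjust them if you changed the configuration.
    checks = {
        "Elasticsearch": "http://localhost:9200/",
        "Saga": "http://localhost:8080/",
    }
    for name, url in checks.items():
        state = "reachable" if is_reachable(url) else "NOT reachable"
        print(f"{name} at {url}: {state}")
```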