This is a guide on how to process text using Saga's REST services.

This tutorial assumes:

  • The reader is able to create a project with Maven support.
  • The data Saga will use is managed through Saga's user interface.
  • Java 11+ is installed on the machine.

Configure pom.xml

To use the code in this guide you'll need the following dependency:

Sample pom.xml section
<dependency>
   <groupId>com.fasterxml.jackson.core</groupId>
   <artifactId>jackson-databind</artifactId>
   <version>{jackson-version}</version>
</dependency>

Feel free to use your favorite JSON processing API.


This guide covers only simple usage of the REST services. The general documentation for these services can be found here.

Processing Text

The following code assumes:

  1. There is a tag named "{component}" that includes "wing" as part of its patterns.
  2. There is a tag named "{aircraft}" that includes "LAK-12" as part of its patterns.
  3. The "{aircraft}" tag confidence adjustment is 2.


ProcessText
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ProcessText {

   public static void main(String[] args) {

      try {

         // Open a POST connection to the processText endpoint of a local Saga server
         URL url = new URL("http://localhost:8080/_saga/processText");
         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
         conn.setDoOutput(true);
         conn.setRequestMethod("POST");
         conn.setRequestProperty("Content-Type", "application/json");

         // Request body: the text, the tags to apply and the output type.
         // The line-break escapes in splitRegex are doubled so that the JSON
         // carries escape sequences rather than raw control characters.
         String input = "{" +
                        "\"q\":\"A WING FAILURE, RESULTING IN SUBSTANTIAL DAMAGE TO THE LAK-12 AIRCRAFT\"," +
                        "\"tags\":[\"aircraft\",\"component\"]," +
                        "\"splitRegex\": \"[\\r|\\n]+\"," +
                        "\"type\": \"text\"," +
                        "\"pretty\": true" +
                        "}";

         // Send the body as UTF-8 and close the stream when done
         try (OutputStream os = conn.getOutputStream()) {
            os.write(input.getBytes(StandardCharsets.UTF_8));
            os.flush();
         }

         ObjectMapper mapper = new ObjectMapper();

         // Parse the JSON response into a tree
         JsonNode actualObj = mapper.readTree(new InputStreamReader(
               conn.getInputStream(), StandardCharsets.UTF_8));

         if (actualObj != null) {
            if (actualObj.get("_success").asBoolean()) {
               System.out.println("=================================================");
               System.out.println("=                    GRAPH                      =");
               System.out.println("=================================================\n\n");
               // Text rendering of the interpretation graph
               System.out.println(actualObj.get("data").get("graph").asText());
               JsonNode nodeArray = actualObj.get("data").get("line");
               final String nodeTemplate = "%s (%.2f)[pos: %s]";
               List<String> nodeList = new ArrayList<>();
               // Format each token of the highest confidence route
               if (nodeArray.isArray()) {
                  nodeArray.forEach(jsonNode -> nodeList.add(String.format(nodeTemplate,
                        jsonNode.get("_item").asText(),
                        jsonNode.get("confidence").asDouble(),
                        jsonNode.get("character").asText())));
               }
               System.out.println("=================================================");
               System.out.println("=            HIGHEST CONFIDENCE ROUTE           =");
               System.out.println("=================================================\n\n");
               System.out.println(String.join(" -> ", nodeList));
            } else {
               System.out.println("Failure");
            }
         }
         conn.disconnect();
      } catch (MalformedURLException e) {
         e.printStackTrace();
      } catch (IOException e) {
         e.printStackTrace();
      }
   }
}
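
Since this tutorial assumes Java 11+, the same request could also be sent with the built-in java.net.http client instead of HttpURLConnection. The following is only a minimal sketch under that assumption: the class name SagaHttpClientSketch and its methods are hypothetical helpers, not part of Saga, and the shortened request body is for illustration only. Treating any non-2xx status as an error is also an assumption about how the server reports failures.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Saga.
final class SagaHttpClientSketch {

   // Builds the POST request without sending it.
   static HttpRequest buildRequest(String endpoint, String jsonBody) {
      return HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
            .build();
   }

   // Sends the request and returns the raw response body as a string.
   static String send(HttpRequest request) throws IOException, InterruptedException {
      HttpClient client = HttpClient.newHttpClient();
      HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
      // Assumption: anything outside the 2xx range is treated as a failure.
      if (response.statusCode() / 100 != 2) {
         throw new IOException("Unexpected HTTP status: " + response.statusCode());
      }
      return response.body();
   }

   public static void main(String[] args) throws Exception {
      String body = "{\"q\":\"A WING FAILURE\",\"tags\":[\"component\"],\"type\":\"text\"}";
      HttpRequest request = buildRequest("http://localhost:8080/_saga/processText", body);
      System.out.println(request.method() + " " + request.uri());
      // Pass --send to perform the call against a running Saga server.
      if (args.length > 0 && "--send".equals(args[0])) {
         System.out.println(send(request));
      }
   }
}
```

The string returned by send can be handed to ObjectMapper.readTree (or any other JSON API) exactly as the response stream is in ProcessText.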

Choosing An Output Format

This is the output you can expect from the code:

Output
=================================================
=                    GRAPH                      =
=================================================


 V--------------------------------[A WING FAILURE, RESULTING IN SUBSTANTIAL DAMAGE TO THE LAK-12 AIRCRAFT]--------------------------------V 
 ^-[A]-V----[WING]-----V---[FAILURE,]----V-[RESULTING]-V-[IN]-V-[SUBSTANTIAL]-V-[DAMAGE]-V-[TO]-V-[THE]-V------[LAK-12]------V-[AIRCRAFT]-^ 
 ^-[a]-^----[wing]-----^---[failure,]----^-[resulting]-^-[in]-^-[substantial]-^-[damage]-^-[to]-^-[the]-^------[lak-12]------^-[aircraft]-^ 
       ^-[{component}]-^-[FAILURE]-V-[,]-^                                                              ^-[LAK]-V-[-]-V-[12]-^ 
                       ^-[failure]-^                                                                    ^-[lak]-^ 
                                                                                                        ^----[{aircraft}]----^ 

The first result printed by the code is the text-only representation of the interpretation graph, which is returned because the "type" parameter of the request is set to "text". It arrives as a single value in the "graph" field of the JSON response.

Output
=================================================
=            HIGHEST CONFIDENCE ROUTE           =
=================================================


A (0.40)[pos: 0:1] -> WING (0.51)[pos: 2:6] -> FAILURE, (0.50)[pos: 7:15] -> RESULTING (0.50)[pos: 16:25] -> IN (0.40)[pos: 26:28] -> SUBSTANTIAL (0.50)[pos: 29:40] -> DAMAGE (0.50)[pos: 41:47] -> TO (0.40)[pos: 48:50] -> THE (0.40)[pos: 51:54] -> {aircraft} (1.00)[pos: 55:61] -> AIRCRAFT (0.50)[pos: 62:70]

The second result is a text representation of the highest confidence route. In this case it is almost identical to the original text, but because the "{aircraft}" tag was given a confidence adjustment of 2, the tag appears in the route in place of the aircraft name "LAK-12". Each token in the route also exposes fields such as:

  • "components" - A list of strings containing the parent components of the token.
  • "stage" - The source stage that generated the token.
  • "flags" - A list of flags assigned to the token.
  • "matching" - Original text reference with the character positions.
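
The positions printed for each token can be mapped back to the original text. The sketch below assumes, based only on the sample output above, that a position is a "start:end" string of character offsets with an exclusive end. The class TokenSpanSketch and its extract method are hypothetical helpers, not part of Saga.

```java
// Hypothetical helper; the "start:end" format is inferred from the sample output.
final class TokenSpanSketch {

   // Returns the slice of the original text covered by a "start:end" position.
   static String extract(String text, String pos) {
      String[] parts = pos.split(":");
      if (parts.length != 2) {
         throw new IllegalArgumentException("Expected start:end but got: " + pos);
      }
      int start = Integer.parseInt(parts[0].trim());
      int end = Integer.parseInt(parts[1].trim());
      // The end offset is treated as exclusive, matching String.substring.
      return text.substring(start, end);
   }

   public static void main(String[] args) {
      String text = "A WING FAILURE, RESULTING IN SUBSTANTIAL DAMAGE TO THE LAK-12 AIRCRAFT";
      System.out.println(extract(text, "55:61")); // prints LAK-12
      System.out.println(extract(text, "2:6"));   // prints WING
   }
}
```

This is one way to recover the text that a tag such as {aircraft} replaced in the route.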

The "json" type returns the highest confidence route, just as the "text" type does, and also the list of semantic tags found in the graph. Printed in the same style, the tags would look something like this:

json type output
=================================================
=                 SEMANTIC TAGS                 =
=================================================


{component} (0.51)[pos: 2:6] -> {aircraft} (1.00)[pos: 55:61]

Only two semantic tags are returned because each tag matched exactly once. The same additional fields described for the highest confidence route, such as "components" and "stage", are available for each semantic tag.

The "ux" type returns a JSON structure that the Saga server application uses to render the interpretation graph. It is only useful if you want to display the graph the same way the application does.
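
Switching between output formats only requires changing the "type" field of the request body. As a rough sketch, the body could be assembled by a small helper like the one below. SagaPayloadSketch, OutputType, escape and build are hypothetical names, the enum lists only the three types named on this page, and only the "q", "tags" and "type" fields are built. Other fields such as "splitRegex" and "pretty" could be appended in the same way.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Saga.
final class SagaPayloadSketch {

   // The three output types named on this page.
   enum OutputType { TEXT, JSON, UX }

   // Escapes characters that must not appear raw inside a JSON string.
   static String escape(String s) {
      StringBuilder sb = new StringBuilder();
      for (char c : s.toCharArray()) {
         switch (c) {
            case '"':  sb.append("\\\""); break;
            case '\\': sb.append("\\\\"); break;
            case '\n': sb.append("\\n"); break;
            case '\r': sb.append("\\r"); break;
            case '\t': sb.append("\\t"); break;
            default:
               if (c < 0x20) {
                  // Remaining control characters become four-digit hex escapes.
                  sb.append(String.format("\\u%04x", (int) c));
               } else {
                  sb.append(c);
               }
         }
      }
      return sb.toString();
   }

   // Builds a request body with the chosen output type.
   static String build(String q, List<String> tags, OutputType type) {
      String tagList = tags.stream()
            .map(t -> "\"" + escape(t) + "\"")
            .collect(Collectors.joining(","));
      return "{"
            + "\"q\":\"" + escape(q) + "\","
            + "\"tags\":[" + tagList + "],"
            + "\"type\":\"" + type.name().toLowerCase(Locale.ROOT) + "\""
            + "}";
   }

   public static void main(String[] args) {
      // Print one request body per output type.
      for (OutputType type : OutputType.values()) {
         System.out.println(build("A WING FAILURE", List.of("aircraft", "component"), type));
      }
   }
}
```

Escaping the values before concatenating them keeps the body valid JSON even when the text to process contains quotes or line breaks.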

