This is a guide on how to create a new tag, assign a recognizer to it, add new patterns, and process text using Saga's REST services.

Note

This tutorial assumes:

  • The reader is able to create a project with Maven support.
  • The data Saga uses is managed through Saga's user interface; only the creation of new items is addressed here.
  • Java 11+ is installed on the machine.


Configure pom.xml

To use the code in this guide, you'll need the following dependency:

Sample pom.xml section
<dependency>
   <groupId>com.fasterxml.jackson.core</groupId>
   <artifactId>jackson-databind</artifactId>
   <version>{jackson-version}</version>
</dependency>
Note

Feel free to use your favorite JSON processing API.


This guide covers only simple usage of the REST services; the general documentation of these services can be found here.

Processing Text

Getting Basic Information

Before coding a new example, note the assumptions this guide makes.

There is a default pipeline, known as baseline-pipeline, with the following structure:

baseline-pipeline

{
  "stages": [
    {
      "language": "en",
      "type": "TextBreakerStage"
    },
    {
      "requiredFlags": [
        "SENTENCE"
      ],
      "type": "WhitespaceTokenizerStage"
    },
    {
      "type": "StopWordsStage"
    },
    {
      "type": "CaseAnalysis"
    },
    {
      "type": "CharChangeSplitter"
    }
  ]
}

Pipelines information

You can get information about the pipelines using the REST API. All we need for now is the name of the pipeline we want to use. To get a list of pipeline names we can use something like this:

Datasets information

Adding New Items

Add A New Tag

Assign A Recognizer To Tag

Add Patterns To Recognizer

Processing Text

ProcessText

package com.mkyong.rest.client;

// Jackson 2.x packages, matching the jackson-databind dependency in pom.xml
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class ProcessText {

   public static void main(String[] args) {

      try {

         // Open a POST connection to Saga's processText service
         URL url = new URL("http://localhost:8080/_saga/processText");
         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
         conn.setDoOutput(true);
         conn.setRequestMethod("POST");
         conn.setRequestProperty("Content-Type", "application/json");

         // Request body: the text to process and the tags to look for.
         // \\r and \\n are doubled so that the JSON carries the escapes \r and \n
         // instead of raw control characters, which are invalid inside a JSON string.
         String input = "{" +
                        "\"q\":\"A WING FAILURE, RESULTING IN SUBSTANTIAL DAMAGE TO THE LAK-12 AIRCRAFT\"," +
                        "\"tags\":[\"aircraft\",\"component\"]," +
                        "\"splitRegex\": \"[\\r|\\n]+\"," +
                        "\"type\": \"text\"," +
                        "\"pretty\": true" +
                        "}";

         OutputStream os = conn.getOutputStream();
         os.write(input.getBytes(StandardCharsets.UTF_8));
         os.flush();

         // Parse the JSON response into a tree
         ObjectMapper mapper = new ObjectMapper();
         JsonNode actualObj = mapper.readTree(new InputStreamReader(
               conn.getInputStream(), StandardCharsets.UTF_8));

         if (actualObj != null) {
            if (actualObj.get("_success").booleanValue()) {

               System.out.println("=================================================");
               System.out.println("=                    GRAPH                      =");
               System.out.println("=================================================\n\n");
               System.out.println(actualObj.get("data").get("graph").textValue());

               // Collect each node of the route as "match (confidence)[pos: character]"
               JsonNode nodeArray = actualObj.get("data").get("line");
               final String nodeTemplate = "%s (%.2f)[pos: %s]";
               List<String> nodeList = new ArrayList<>();
               if (nodeArray.isArray()) {
                  nodeArray.forEach(jsonNode -> {
                     nodeList.add(String.format(nodeTemplate,
                           jsonNode.get("matching").textValue(),
                           jsonNode.get("confidence").doubleValue(),
                           jsonNode.get("character").textValue()));
                  });
               }

               System.out.println("=================================================");
               System.out.println("=            HIGHEST CONFIDENCE ROUTE           =");
               System.out.println("=================================================\n\n");
               System.out.println(nodeList.stream().collect(Collectors.joining(" -> ")));
            } else {
               System.out.println("Failure");
            }
         }
         conn.disconnect();
      } catch (MalformedURLException e) {
         e.printStackTrace();
      } catch (IOException e) {
         e.printStackTrace();
      }
   }
}
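
Since this tutorial assumes Java 11+, the same request could also be prepared with the JDK's built-in java.net.http API instead of HttpURLConnection. The following is only a minimal sketch under stated assumptions: the class ProcessTextRequestSketch and its methods jsonString, buildBody and buildRequest are hypothetical names introduced for illustration and are not part of Saga. The endpoint path and the body fields (q, tags, splitRegex, type, pretty) are taken from the ProcessText example; the base URL is passed in as a parameter. The sketch only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Saga): prepares a processText request without sending it.
class ProcessTextRequestSketch {

    // Wraps a value in quotes and escapes the characters JSON forbids inside a string.
    static String jsonString(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \u00XX form
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Builds a body with the same fields as the ProcessText example.
    static String buildBody(String text, List<String> tags) {
        String tagArray = tags.stream()
                .map(ProcessTextRequestSketch::jsonString)
                .collect(Collectors.joining(",", "[", "]"));
        return "{"
                + "\"q\":" + jsonString(text) + ","
                + "\"tags\":" + tagArray + ","
                + "\"splitRegex\":" + jsonString("[\r|\n]+") + ","
                + "\"type\":\"text\","
                + "\"pretty\":true"
                + "}";
    }

    // Creates the POST request; baseUrl is an assumption supplied by the caller.
    static HttpRequest buildRequest(String baseUrl, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_saga/processText"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        String body = buildBody(
                "A WING FAILURE, RESULTING IN SUBSTANTIAL DAMAGE TO THE LAK-12 AIRCRAFT",
                List.of("aircraft", "component"));
        HttpRequest request = buildRequest("http://localhost:8080", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

To actually send the request, it could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and the returned body parsed with the JSON library of your choice. Escaping the text in one place keeps quotes or line breaks in the input from producing invalid JSON, which plain string concatenation does not guard against.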

Choosing An Output Format


...