The Generative AI client is designed to run custom Groovy scripts. The scripts can perform many different tasks, but they are most useful when Generative AI services need to be called for crawled documents and the results of those calls are to be stored in the document for later indexing. When writing Groovy scripts, we can use so-called binding variables to simplify AI-related tasks:

The REST client comes configured with the required authentication method based on UI parameters
The REST client can be configured to handle typical throttling issues, such as 429 errors, automatically
Methods for creating the POST body for typical AI tasks are available
Methods for making REST calls to typical AI URLs are available
The REST call response comes in a standard format, so it is easy to parse and use in the document
General template scripts are supported, so already prepared scripts can be selected and adjusted through template script properties
If secret information must be used in a Groovy script, it can be defined in the UI, stored encrypted, and used automatically decrypted in the scripts
Typical scenarios for embeddings and prompts related tasks are built in and can be used with just a few lines of Groovy script
Different AI service providers are supported; the present version supports Azure and Google Palm


Binding variables

Name in script	Description	Aspire type	Init script	Process script
doc	Crawled document	AspireObject	false	true
component	Aspire workflow component running Groovy scripts	ComponentImpl	true	true
connection.client	REST client component for making AI calls	GenAIRestRequester	true	true
utilities.azure.embeddings utilities.google.embeddings	Methods related to "embeddings" processing	Embeddings	true	true
job	Job containing the crawled document	Job	false	true
secrets	Map of secrets provided in UI	Map<String,String>	true	true
template	Map of selected script template variables	Map<String,String>	true	true
utilities.azure.prompts utilities.google.prompts	Methods related to "prompts" processing	Prompts	true	true
utilities.textSplitter	Method related to text splitting	TextSplitterComponent	true	true
variables	Map of variables provided in initialize script	Map<String,Object>	true	false
utilities	Various helper methods	Utils	true	true

Document

The crawled document can be used to access metadata and content, and also to store new metadata acquired from AI:

Code Block

language	groovy

doc.add(embeddings.toAspireObject());

Component

The component can typically be used as a logger:

Code Block

language	groovy

component.info("%s", "${doc.id}: Got embeddings for sentence: ${currentSentence}")

Connection

The REST client is available via the connection object and can be used for making requests to AI services:

connection.client

Authentication

The REST client is automatically configured from the UI DXF configuration when it is initialized. When the authentication method "NONE" is selected (the default option), authentication must be set up in the initialization script. Our examples typically do this by adding an "apiKey" header field:

Code Block

language	groovy

connection.client.addHeader("apiKey", "${secrets.apiKey}");

Request

A number of methods can be used; all of them are listed in the Javadoc of com.accenture.aspire.genaiclient.scriptsupport.rest.GenAIRestRequester. The methods most likely to be used in AI-related scripts are:

Method	Syntax	Init script	Process script
Execute POST	HttpResponse<?> executePost(String url, AspireObject httpBody)	false	true
	HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). It can be converted with utilities methods such as utilities.azure.embeddings.convertResponse to get the desired output (see the Embeddings and Prompts sections on this page). url: AI service URL. httpBody: the body of the POST. It can also be created with utilities methods such as utilities.azure.embeddings.createPostBody, which simplifies building Embeddings and Prompts requests (see the Embeddings and Prompts sections on this page)
Execute GET	HttpResponse<?> executeGet(String url)	false	true
	HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). url: AI service URL
Add header	addHeader(String name, String value)	true	false
	name: header name. value: header value

Code Block

language	groovy

...
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

....
def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}

Throttling, 429 Policy

The 429 policy and the related throttling can be configured using the UI DXF field Policy429. If a value other than None is selected, the connection is throttled automatically:

None - no throttling
Thread blocking - affects just the current thread. Wait time and retries can be configured
Seed blocking - blocks the whole seed and takes into account the wait time required by AI service

You can also handle throttling manually in a Groovy script. For example, here is how to use Seed blocking:

Code Block

language	groovy

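def resp = connection.client.executeGet("url");
if (resp.getStatusCode() == 429) {
  // Retry-After is read as a number of seconds and converted to an absolute time in milliseconds
  long pauseSeedUntil = System.currentTimeMillis() + (Integer.valueOf(resp.getHeaders().get("Retry-After")) * 1000);
  // ask for the whole seed to be paused until that time
  throw new com.accenture.aspire.services.ThrottlingNotificationException(pauseSeedUntil);
}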
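The Retry-After header can be missing, and HTTP also allows it to carry a date instead of a number of seconds, in which case the Integer.valueOf call above would throw. The following is a minimal sketch of a more defensive calculation. The helper name computePauseUntil, the fallback wait, and the assumption that the response headers are available as a Map<String, String> are hypothetical and not part of the Aspire API.

```groovy
// Hypothetical helper, not part of the Aspire API.
// Assumption: the response headers are available as a Map<String, String>.
long computePauseUntil(Map<String, String> headers, long nowMillis, long fallbackSeconds) {
  String raw = headers?.get("Retry-After")
  long waitSeconds = fallbackSeconds
  if (raw != null && raw.trim().isLong()) {
    // delay-seconds form of Retry-After; never wait a negative time
    waitSeconds = Math.max(0L, raw.trim().toLong())
  }
  // any other form (missing header, HTTP date) falls back to the default wait
  return nowMillis + waitSeconds * 1000L
}

long pauseSeedUntil = computePauseUntil(["Retry-After": "5"], System.currentTimeMillis(), 30L)
```

The computed value could then be passed to ThrottlingNotificationException in the same way as in the script above.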

Embeddings

utilities.azure.embeddings, utilities.google.embeddings

In the method syntax below, aiService stands for either azure or google (Google Palm).

Method	Syntax	Init script	Process script
Initialize Azure	void utilities.azure.embeddings.initialize(AspireObject config)	true	false
	Use it if you want to use the "process" method described below.

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-	https://xxx-openai.openai.azure.com/
model	Model	true	-	text-embedding-ada-002
apiVersion	API version	true	-	2022-12-01
apiKey	API Key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Method	Syntax	Init script	Process script
Initialize Google Palm	void utilities.google.embeddings.initialize(AspireObject config)	true	false
	Use it if you want to use the "process" method described below.

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-
model	Model	true	-	embedding-gecko-001
apiVersion	API version	true	-	test-version
apiKey	API key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Method	Syntax	Init script	Process script
Process	VectorEmbeddingsResult utilities.aiService.embeddings.process(List<String> splitText)	false	true
	Creates embeddings for each text chunk provided in the list. "initialize" must be called first. VectorEmbeddingsResult: see the format below; all vectors are present. splitText: text chunks for creating embeddings
Convert response	VectorEmbeddingsResult utilities.aiService.embeddings.convertResponse(String text, AspireObjectResponse response)
	Converts the response of an AI embeddings call. The response format can differ slightly between AI providers. The result can be converted to an AspireObject and stored in the document. response: HTTP response to convert
Create POST body	AspireObject utilities.aiService.embeddings.createPostBody(String text)
	Creates the POST body for calling the AI embeddings service. text: text to be converted to the POST body
Create sub document	AspireObject utilities.azure.embeddings.createSubDoc(VectorEmbeddingsResult vectorEmbeddingsResult, AspireObject doc, int chunkCount)
	Can be used when each embeddings chunk is to be posted as a separate sub job. vectorEmbeddingsResult: previously created embeddings object. doc: the current document. chunkCount: the current text chunk number (see the example below)

VectorEmbeddingsResult:

Method	Description
Hashtable<String,Double[]> getEmbeddings()	Gets the embedding vectors
AspireObject toAspireObject()	Converts to AspireObject

Example of an initialization script for using the embeddings "process" method in the process script:

Code Block

language	groovy

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))
utilities.azure.embeddings.initialize(getEmbeddingsConfig())

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    returnValue.add("model", "${template.model}");
    returnValue.add("apiVersion", "${template.apiVersion}");
    returnValue.add("apiKey", "${secrets.apiKey}");
    return returnValue;
}

def getTextSplitterConfig(String splitType) {
  .....
}

Example of process script using complex "process" method:

Code Block

language	groovy

def sentences = utilities.textSplitter.process(doc);
embeddings = utilities.azure.embeddings.process(sentences);
doc.add(embeddings.toAspireObject());
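According to the method table above, getEmbeddings() returns a Hashtable<String,Double[]>. The following is a minimal sketch of inspecting such a result, for example to check vector sizes before storing them. A hand-built Hashtable with made-up values stands in for the real result, and the assumption that the map is keyed by the text chunk is not confirmed by this page.

```groovy
// Stand-in for embeddings.getEmbeddings(); the values are made up.
// Assumption: the map is keyed by the text chunk.
Hashtable<String, Double[]> vectors = new Hashtable<>()
vectors.put("First sentence.", [0.12d, -0.05d, 0.33d] as Double[])
vectors.put("Second sentence.", [0.01d, 0.27d, -0.40d] as Double[])

// collect the vector length for each chunk
Map<String, Integer> dimensions = vectors.collectEntries { chunk, vector -> [(chunk): vector.length] }

dimensions.each { chunk, size -> println "${chunk} -> ${size} values" }
```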

Example of process script publishing sub jobs for each embedding chunk:

Code Block

language	groovy

import com.accenture.aspire.services.AspireException

// split field "content" and create "sentences"
def sentences = utilities.textSplitter.process(doc);

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

// generate and publish embeddings
sentences.eachWithIndex {currentSentence, sentencesCount ->
  embeddingVector = getEmbeddingsFromSentence(endpointEmbeddings, currentSentence)
  subJobAO = utilities.azure.embeddings.createSubDoc(embeddingVector, doc, sentencesCount);
  utilities.createSubJob(job, subJobAO)
}

def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}

Job

The job can be passed as a parameter to other methods when required:

Code Block

language	groovy

utilities.createSubJob(job, subJobAO)

Secrets

Secrets defined in the UI are stored encrypted and can be accessed in scripts. They are decrypted automatically before use.

Code Block

language	groovy

connection.client.addHeader("api-key", "${secrets.apiKey}");

Template

If a template script with properties has been selected in the UI, those properties can be accessed in the script:

Code Block

language	groovy

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    ....
}

Code Block

language	groovy

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
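The example endpoint value shown in the Embeddings config table ends with a slash, so plain string interpolation as above would produce a double slash in the URL, which a service may or may not accept. The following is a minimal sketch of a helper that normalizes this. The name buildEmbeddingsUrl is hypothetical, and a plain Map stands in for the template binding variable.

```groovy
// Hypothetical helper; a plain Map stands in for the "template" binding variable.
String buildEmbeddingsUrl(Map<String, String> template) {
  // strip trailing slashes so the joined URL contains exactly one separator
  String base = template.endpoint.replaceAll('/+$', '')
  return "${base}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
}

def sampleTemplate = [endpoint: "https://xxx-openai.openai.azure.com/", model: "text-embedding-ada-002", apiVersion: "2022-12-01"]
def normalizedEndpoint = buildEmbeddingsUrl(sampleTemplate)
```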

Prompts

// TODO

utilities.azure.prompts, utilities.google.prompts

In the method syntax below, aiService stands for either azure or google (Google Palm).

Method	Syntax	Init script	Process script
Initialize Azure	void utilities.azure.prompts.initialize(AspireObject config)	true	false
	Use it if you want to use the "process" method described below.

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-	https://xxx-openai.openai.azure.com/
model	Model	true	-	text-embedding-ada-002
apiVersion	API version	true	-	2022-12-01
apiKey	API Key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Method	Syntax	Init script	Process script
Initialize Google Palm	void utilities.google.prompts.initialize(AspireObject config)	true	false
	Use it if you want to use the "process" method described below.

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-
model	Model	true	-	embedding-gecko-001
apiVersion	API version	true	-	test-version
apiKey	API key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Method	Syntax	Init script	Process script
Process	PromptsResult utilities.aiService.prompts.process(AspireObject doc)	false	true
	Runs the prompts configured in "initialize" for the given document. "initialize" must be called first. PromptsResult: see the format below. doc: the current document
Convert response	PromptsResult utilities.aiService.prompts.convertResponse(AspireObjectResponse response)
	Converts the response of an AI prompts call. The response format can differ slightly between AI providers. The result can be converted to an AspireObject and stored in the document. response: HTTP response to convert
Create POST body	AspireObject utilities.aiService.prompts.createPostBody(Map<String, String> map)
	Creates the POST body for calling the AI prompts service. map: map to be converted to the POST body

PromptsResult:

Method	Description
AspireObject getResponseContent()	Gets the response content
AspireObject toAspireObject()	Converts to AspireObject

Example of an initialization script for using the prompts "process" method in the process script:

Code Block

language	groovy

import com.accenture.aspire.services.AspireObject;

utilities.azure.prompts.initialize(getPromptsConfig())

private AspireObject getPromptsConfig() {
    AspireObject configAO = new AspireObject("config");
    configAO.add("endpoint", "${template.endpoint}");
    configAO.add("model", "${template.model}");
    configAO.add("apiVersion", "${template.apiVersion}");
    configAO.add("apiKey", "${secrets.apiKey}");
    configAO.add("temperature", "${template.temperature}");
    configAO.add(getPromptsList());
    return configAO;
}

private AspireObject getPromptsList(){
    AspireObject prompts = new AspireObject("prompts");
    def returnValue = new ArrayList();
    AspireObject prompt = AspireObject.createFromJSON("prompt", "{\"prompt\":{\"promptType\":\"user\",\"useGroovy\":false,\"promptText\":\"Describe the paws of a polar bear named \\\"Thunder\\\"\"}}", false);
    returnValue.add(prompt);
    prompt = AspireObject.createFromJSON("prompt", "{\"prompt\":{\"promptType\":\"system\",\"useGroovy\":true,\"promptText\":\"return \\\"Describe it like you are \\\"+doc.getText(\\\"character\\\")+\\\"\\\";\"}}", false);
    returnValue.add(prompt);
    prompt = AspireObject.createFromJSON("prompt", "{\"prompt\":{\"promptType\":\"system\",\"useGroovy\":true,\"promptText\":\"return \\\"Describe it like it lived in the planet \\\"+doc.getText(\\\"planet\\\");\"}}", false);
    returnValue.add(prompt);
    prompts.add(returnValue);
    return prompts;
}
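The hand-escaped JSON strings above are easy to get wrong. As a sketch of an alternative, the same JSON text can be generated from a Groovy map with groovy.json.JsonOutput (part of the groovy-json module), which takes care of the quoting. The helper name promptJson is hypothetical, and passing the result to AspireObject.createFromJSON as in the script above is left out so that the snippet runs stand-alone.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical helper: builds the JSON text for one prompt definition.
String promptJson(String promptType, boolean useGroovy, String promptText) {
  return JsonOutput.toJson([prompt: [promptType: promptType, useGroovy: useGroovy, promptText: promptText]])
}

def userPrompt = promptJson("user", false, 'Describe the paws of a polar bear named "Thunder"')
def systemPrompt = promptJson("system", true, 'return "Describe it like you are " + doc.getText("character");')

// parse the text back to check that the quoting survived the round trip
def roundTrip = new JsonSlurper().parseText(userPrompt)
println roundTrip.prompt.promptText
```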

Example of process script using complex "process" method:

Code Block

language	groovy

promptsResponse = utilities.azure.prompts.process(doc);
doc.add(promptsResponse.toAspireObject());

Example of a process script that calls the prompts endpoint directly to summarize text chunks:

Code Block

language	groovy

import com.accenture.aspire.services.AspireObject;

// url
endpointPrompts = "${template.endpoint}/openai/deployments/${template.model}/chat/completions?api-version=${template.apiVersion}"

...
for (String paragraph : small_piece_list) {
  answer = generateSummaryOfSummaries(endpointPrompts, paragraph.take(MODEL_MAX_MESSAGE_SIZE - MODEL_MAX_TOKENS))
  if (doc.get("summarizationError")?.getContent()) {
    component.info("%s", "summarization error detected on document: ${doc.get('summarizationError')}")
    searchFields.add("summarizationError", doc.getContent("summarizationError"))
    searchFields.add("generatedSummary", "")
    return
  } else {
    summaries.add(answer["summary"])
    keyphrases.addAll(answer["keyphrases"])
  }
}

def generateSummaryOfSummaries(endpointSummary, article) {
  def requestRetries = 0
  body = [
          "messages"         : [[
                                        "role"   : "system",
                                        "content": "You are a system that, given a text, extract a summary from it, and also a list of important keywords from it, based on the user input"],
                                [
                                        "role"   : "user",
                                        "content": "CONTENT={${article}}\
1.Clean [CONTENT] by removing formatting, special characters, and non-alphanumeric symbols.\
2. Read through the entire document to grasp its main points and arguments.\
3. Identify the key topics and supporting details presented in the document.\
4. Create an outline for the summary, noting the main sections or topics covered.\
5. Summarize each main section or topic in a clear and concise manner, using your own words, focusing on presenting the most significant and relevant information while leaving out unnecessary details.Aim for a summary length of 3-5 lines or a paragraph, depending on the document's size and complexity.\
6. Review the summary for accuracy and coherence with the original document, checking that the summary conveys the main points and ideas accurately.Respond as follows:\
SUMMARY:summary\
7. Provide the final list of max 100 important keyphrases without considering the frequency. Include all the abbreviations in the list, and do not repeat any keyphrases. Respond as follows:\
 KEYPHRASES:comma separated list of keyphrases"
                                ]
          ],
          "temperature"      : "${template.temperature}",
          "max_tokens"       : 1500,
          "top_p"            : 1.0,
          "frequency_penalty": 0.0,
          "presence_penalty" : 0.0
  ];
  response = connection.client.executePost(endpointSummary, utilities.azure.prompts.createPostBody(body));
  def request_result = getSummaryFromResponse(response)
  if (!request_result["isError"]) {
    return request_result
  } else {
    responseHeaders = response.getHeaders()
  }
  doc.add("summarizationError", "Errors on request for summary and keyphrases.")
  return ["isError": true, "summary": "", "keyphrases": []]
}

def getSummaryFromResponse(response) {
  def isError = true
  def summary = ""
  def keyphrases = []
  if (response.getStatusCode() == 200) {
    def content = response.getContent();
    def choices = content.get("choices");

    if (choices != null) {
      def finish_reason = choices.getText("finish_reason");
      if (finish_reason == "content_filter") {
        component.info("%s", "${doc.id}: Unable to generate summary. Document has content that was blocked by Azure content filter. Setting summarization error")
        doc.add("summarizationError", "Unable to generate summary due to Content Filtering Policy")
        summary = "";
        keyphrases = []
        isError = false
      } else {
        def message = choices.get("message");
        component.info("%s", "${doc.id}: message content: ${message}")
        def content_openai = message != null ? message.getText("content") : "";
        //the response from the AI has now 2 parts, one SUMMARY, and one KEYPHRASES. Parsing the message to get them and store in the value to return.
        def finder = (content_openai =~ /(SUMMARY|KEYPHRASES):\s*(.+)/)
        finder.each { match ->
          if (match.size() == 3) {
            if (match[1] == "SUMMARY") {
              summary = match[2]
              component.info("%s", "${doc.id}: summary piece: ${summary}")
            }
            if (match[1] == "KEYPHRASES") {
              keyphrases = match[2].split(",")
              component.info("%s", "${doc.id}: keyphrases piece: ${keyphrases}")
            }
          }
        }
        isError = false
      }

    }
  } else {
   ....
  }
  return ["isError": isError, "summary": summary, "keyphrases": keyphrases]
}
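The parsing step of getSummaryFromResponse can be tried in isolation. The following is a minimal sketch that uses a made-up sample answer. The name parseSummaryAnswer is hypothetical, and the keyphrases are additionally trimmed here, which the script above does not do.

```groovy
// Hypothetical stand-alone version of the SUMMARY/KEYPHRASES parsing.
Map parseSummaryAnswer(String answer) {
  def summary = ""
  def keyphrases = []
  // each match is a list: [whole match, label, text after the label]
  (answer =~ /(SUMMARY|KEYPHRASES):\s*(.+)/).each { match ->
    if (match[1] == "SUMMARY") {
      summary = match[2].trim()
    } else {
      // split the comma separated list and drop surrounding whitespace
      keyphrases = match[2].split(",").collect { it.trim() }
    }
  }
  return [summary: summary, keyphrases: keyphrases]
}

def parsed = parseSummaryAnswer("SUMMARY: Polar bears have large paws.\nKEYPHRASES: polar bear, paws, Arctic")
println parsed
```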

Text Splitter

// TODO

Text splitter	utilities.textSplitter

Method	Syntax	Init script	Process script
Initialize	void utilities.textSplitter.initialize(AspireObject config)	true	false

config:

Name	Description	Default
splitType
fieldsToSplit
customSplitRegex
characterThreshold

Method	Syntax	Init script	Process script
Process	List<String> utilities.textSplitter.process(AspireObject doc)	false	true
	List<String>: TODO. doc: TODO

Example of initialization script:

Code Block

language	groovy

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))

def getTextSplitterConfig(String splitType) {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("splitType", splitType);
    returnValue.add("fieldsToSplit", "content");
    returnValue.add("customSplitRegex", "\\|+");
    returnValue.add("characterThreshold", 4);
    return returnValue;
}

Example script:

Code Block

language	groovy

def sentences = utilities.textSplitter.process(doc);

Variables

// TODO

Utilities

// TODO
