The Generative AI client is designed to run custom Groovy scripts. The scripts can perform many different tasks, but they are most useful when Generative AI services need to be called for crawled documents and the results of those calls are to be stored in the document for later indexing. When writing Groovy scripts, we can use so-called binding variables to simplify AI-related tasks:

The REST client comes configured with the required authentication method based on UI parameters
The REST client can be configured to automatically handle typical throttling issues, such as 429 errors
Methods for creating the POST body for typical AI tasks are provided
Methods for making REST calls to typical AI URLs are provided
The REST call response comes in a standard form that is easy to parse and use in the document
General template scripts are supported, so already prepared scripts can be selected and parameterized through template script properties
If secret information must be used in a Groovy script, it can be defined in the UI, stored encrypted, and used automatically decrypted in the scripts
Typical scenarios for embeddings and prompt-related tasks are built in and can be used with just a few lines of Groovy script
Different AI service providers are supported; the present version supports Azure and Google Palm

Binding variables

Name in script	Description	Aspire type	Init script	Process script
doc	Crawled document	AspireObject	false	true
component	Aspire workflow component running Groovy scripts	ComponentImpl	true	true
connection.client	REST client component for making AI calls	GenAIRestRequester	true	true
utilities.azure.embeddings utilities.google.embeddings	Methods related to "embeddings" processing	Embeddings	true	true
job	Job containing the crawled document	Job	false	true
secrets	Map of secrets provided in UI	Map<String,String>	true	true
template	Map of selected script template variables	Map<String,String>	true	true
utilities.azure.prompts utilities.google.prompts	Methods related to "prompts" processing	Prompts	true	true
utilities.textSplitter	Method related to text splitting	TextSplitterComponent	true	true
variables	Map of variables provided in initialize script	Map<String,Object>	true	false
utilities	Various helper methods	Utils	true	true

Document

The crawled document can be used to access metadata and content, and also to store new metadata acquired from AI:

doc.add(embeddings.toAspireObject());

Component

The component can typically be used as a logger:

  component.info(" %s","${doc.id}: Got embeddings for sentence: ${currentSentence}")

Connection

The REST client is available via the connection object and can be used for making requests to AI services:

connection.client

Authentication

The REST client is automatically configured from the UI DXF configuration when initialized. When the authentication method "NONE" is selected (the default option), authentication must be set up in the initialization script. In our examples we typically add an "apiKey" header field:

connection.client.addHeader("apiKey", "${secrets.apiKey}");

Request

A number of methods can be used; all are listed in the Javadoc of com.accenture.aspire.genaiclient.scriptsupport.rest.GenAIRestRequester. Here are the methods most likely to be used in AI-related scripts:

Method	Syntax	Init script	Process script
execute POST	HttpResponse<?> executePost(String url, AspireObject httpBody)	false	true
	HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). It can be converted using utilities methods such as utilities.azure.embeddings.convertResponse to get the desired output (see the Embeddings and Prompts documentation on this page)
	url: AI service URL
	httpBody: the body of the POST. It can also be created using a utilities method such as utilities.azure.embeddings.createPostBody, which makes it easier to create Embeddings and Prompts related requests (see the Embeddings and Prompts documentation on this page)
execute GET	HttpResponse<?> executeGet(String url)	false	true
	HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory")
	url: AI service URL
Add header	addHeader(String name, String value);	true	false
	name: header name
	value: header value

...
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

....
def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}

Throttling, 429 Policy

The 429 policy and related throttling can be configured using the UI DXF field Policy429. The connection is throttled automatically according to the selected value:

None - no throttling
Thread blocking - affects just the current thread. Wait time and retries can be configured
Seed blocking - blocks the whole seed and takes into account the wait time required by AI service
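
The thread blocking policy can be pictured as a retry loop that sleeps on the current thread whenever the service answers with 429. The following is a minimal sketch of that idea, not the client's actual implementation: callWithThreadBlocking, fakeService, and the map-shaped response are hypothetical stand-ins so that the snippet runs without Aspire.

```groovy
// Hypothetical helper: repeats a call while it returns HTTP 429, sleeping between attempts.
// "call" is any closure returning a map with a "status" key; it stands in for a real REST call.
def callWithThreadBlocking(Closure call, int maxRetries, long waitMillis) {
    def response = call()
    int attempt = 0
    while (response.status == 429 && attempt < maxRetries) {
        Thread.sleep(waitMillis)   // blocks only the current thread
        attempt++
        response = call()
    }
    return response
}

// Simulated service: throttles the first two calls, then succeeds.
int calls = 0
def fakeService = { ->
    calls++
    return calls <= 2 ? [status: 429] : [status: 200, body: "ok"]
}

def result = callWithThreadBlocking(fakeService, 5, 10L)
println "status=${result.status} after ${calls} calls"
```

The wait time and the retry count are plain parameters here, mirroring the two settings that the policy exposes in the UI.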

You can also handle throttling manually in a Groovy script. For example, here is how to use seed blocking:

def resp = connection.client.executeGet("url");
if (resp.getStatusCode() == 429) {
  // Retry-After is expected to hold a number of seconds; pause the seed until that time has passed
  long pauseSeedUntil = System.currentTimeMillis() + (Long.valueOf(resp.getHeaders().get("Retry-After")) * 1000L);
  throw new com.accenture.aspire.services.ThrottlingNotificationException(pauseSeedUntil);
}
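
A Retry-After header may carry either a number of seconds or an HTTP date, and it may be missing altogether. The sketch below shows one defensive way to compute the pause time under those conditions. pauseUntilMillis and its fallbackSeconds parameter are hypothetical helpers introduced for illustration, not part of the client API.

```groovy
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper: turns a Retry-After header value into an absolute time in epoch milliseconds.
def pauseUntilMillis(String retryAfter, long nowMillis, long fallbackSeconds) {
    if (retryAfter == null || retryAfter.trim().isEmpty()) {
        return nowMillis + fallbackSeconds * 1000L      // header missing: use the fallback
    }
    def value = retryAfter.trim()
    if (value.isLong()) {
        // delay given in seconds; negative values are treated as zero
        return nowMillis + Math.max(0L, value.toLong()) * 1000L
    }
    try {
        // delay given as an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant().toEpochMilli()
    } catch (DateTimeParseException ignored) {
        return nowMillis + fallbackSeconds * 1000L      // unparsable value: use the fallback
    }
}

long now = System.currentTimeMillis()
println pauseUntilMillis("120", now, 30L) - now   // delay in milliseconds for a numeric header
println pauseUntilMillis(null, now, 30L) - now    // delay in milliseconds when the header is absent
```

The returned value could then be passed to ThrottlingNotificationException in the same way as pauseSeedUntil in the script above.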

Embeddings

utilities.azure.embeddings, utilities.google.embeddings

In the syntax below, aiService stands for azure or google (Google Palm).

Method	Syntax	Init script	Process script
Initialize Azure. Use it if you want to use the "process" method described below	void utilities.azure.embeddings.initialize(AspireObject config)	true	false

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-	https://xxx-openai.openai.azure.com/
model	Model	true	-	text-embedding-ada-002
apiVersion	API version	true	-	2022-12-01
apiKey	API Key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Initialize Google Palm. Use it if you want to use the "process" method described below	void utilities.google.embeddings.initialize(AspireObject config)	true	false

config:

Name	Description	Required	Default	Example
endpoint	Endpoint	true	-
model	Model	true	-	embedding-gecko-001
apiVersion	API version	true	-	test-version
apiKey	API key	true (if this auth is required)	-	690xxx59xxxx77520cxxx05

Process. Creates embeddings for each text chunk provided in the list. It must be initialized first via "initialize"	VectorEmbeddingsResult utilities.aiService.embeddings.process(List<String> splitText)	false	true
	VectorEmbeddingsResult: see the format below. All vectors are present.
	splitText: text chunks for creating embeddings

Convert response. Converts the response from an AI embeddings call. The response format can differ slightly between AI providers. The result can be converted to an AspireObject and stored in the document.	VectorEmbeddingsResult utilities.aiService.embeddings.convertResponse(String text, AspireObjectResponse response)

VectorEmbeddingsResult:

Method	Description
Hashtable<String,Double[]> getEmbeddings()	Gets the embedding vectors
AspireObject toAspireObject()	Converts to AspireObject

response: Http response to convert
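
As an illustration of working with the map returned by getEmbeddings(), the sketch below builds a stand-in Hashtable by hand and checks that all vectors have the same dimension. It assumes that the keys are the text chunks, which this page does not state explicitly, and the sample values are made up.

```groovy
// Stand-in for the value returned by getEmbeddings(); keys are assumed to be the text chunks
Hashtable<String, Double[]> embeddings = new Hashtable<String, Double[]>()
embeddings.put("first sentence", [0.1d, 0.2d, 0.3d] as Double[])
embeddings.put("second sentence", [0.4d, 0.5d, 0.6d] as Double[])

// Collect the distinct vector lengths; vectors from a single model should all share one length
def dimensions = embeddings.values().collect { it.length }.unique()
assert dimensions.size() == 1

// Print a short summary per chunk
embeddings.each { text, vector ->
    println "${text}: ${vector.length} dimensions"
}
```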

Create POST body. Creates the POST body for calling the AI embeddings service	AspireObject utilities.aiService.embeddings.createPostBody(String text)
	text: text to be converted to the POST body

Create sub document. It can be used when each embeddings chunk is to be published as a separate sub job	AspireObject utilities.azure.embeddings.createSubDoc(VectorEmbeddingsResult vectorEmbeddingsResult, AspireObject doc, int chunkCount)
	vectorEmbeddingsResult: previously created embeddings object
	doc: the original script document
	chunkCount: the current text chunk number (see the example below)

Example of an initialization script for when we want to use the complex embeddings "process" method in the process script:

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))
utilities.azure.embeddings.initialize(getEmbeddingsConfig())

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    returnValue.add("model", "${template.model}");
    returnValue.add("apiVersion", "${template.apiVersion}");
    returnValue.add("apiKey", "${secrets.apiKey}");
    return returnValue;
}

def getTextSplitterConfig(String splitType) {
  .....
}

Example of process script using complex "process" method:

def sentences = utilities.textSplitter.process(doc);
embeddings = utilities.azure.embeddings.process(sentences);
doc.add(embeddings.toAspireObject());

Example of process script publishing sub jobs for each embedding chunk:

import com.accenture.aspire.services.AspireException

// split field "content" and create "sentences"
def sentences = utilities.textSplitter.process(doc);

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

// generate and publish embeddings
sentences.eachWithIndex {currentSentence, sentencesCount ->
  embeddingVector = getEmbeddingsFromSentence(endpointEmbeddings, currentSentence)
  subJobAO = utilities.azure.embeddings.createSubDoc(embeddingVector, doc, sentencesCount);
  utilities.createSubJob(job, subJobAO)
}

def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}

Job

The job can be passed as a parameter to other methods when required:

utilities.createSubJob(job, subJobAO)

Secrets

Secrets defined in the UI are stored encrypted and can be accessed in scripts. They are automatically decrypted before use.

connection.client.addHeader("api-key", "${secrets.apiKey}");

Template

If a template script with properties has been selected in the UI, those properties can be accessed in the script:

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    ....
}

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
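
The example endpoint given earlier on this page ends with a slash, so plain string interpolation would produce a double slash in the URL. The following sketch shows one way to guard against that. The template map here is a hand-built stand-in for the real binding, and buildEmbeddingsUrl is a hypothetical helper.

```groovy
// Stand-in for the "template" binding; in a real script it is provided by the component
def template = [
    endpoint  : "https://xxx-openai.openai.azure.com/",
    model     : "text-embedding-ada-002",
    apiVersion: "2022-12-01"
]

// Hypothetical helper: strips trailing slashes from the endpoint before appending the path
def buildEmbeddingsUrl(Map<String, String> t) {
    def base = t.endpoint.replaceAll('/+$', '')
    return "${base}/openai/deployments/${t.model}/embeddings?api-version=${t.apiVersion}".toString()
}

def url = buildEmbeddingsUrl(template)
println url
```

Calling toString() turns the GString into a plain String, which avoids surprises if the value is later used as a map key or compared with equals.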

Prompts

// TODO

Text Splitter

// TODO

Text splitter	utilities.textSplitter

Method	Syntax	Init script	Process script
Initialize	void utilities.textSplitter.initialize(AspireObject config)	true	false

config:

Name	Description	Default
splitType
fieldsToSplit
customSplitRegex
characterThreshold

Process	List<String> utilities.textSplitter.process(AspireObject doc)	false	true
	List<String>: TODO
	doc: TODO

Example of initialization script:

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))

def getTextSplitterConfig(String splitType) {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("splitType", splitType);
    returnValue.add("fieldsToSplit", "content");
    returnValue.add("customSplitRegex", "\\|+");
    returnValue.add("characterThreshold", 4);
    return returnValue;
}

Example script:

def sentences = utilities.textSplitter.process(doc);
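
Since the configuration fields are not yet described on this page, the sketch below only illustrates the general idea of regex-based splitting in plain Groovy. It is not the text splitter's implementation. In particular, treating characterThreshold as a minimum chunk length is purely an assumption made for this example, and splitByRegex is a hypothetical helper.

```groovy
// Hypothetical helper: splits text on a regex and drops chunks shorter than minLength.
// Interpreting the threshold as a minimum length is an assumption, not documented behavior.
def splitByRegex(String content, String splitRegex, int minLength) {
    def parts = content.split(splitRegex).collect { it.trim() }
    return parts.findAll { it.length() >= minLength }
}

// Same regex as in the initialization example: one or more pipe characters
def chunks = splitByRegex("First part|| ok |Second part|||Third", "\\|+", 4)
chunks.each { println it }
```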

Variables

// TODO

Utilities

// TODO

Contact Us: [email protected]
