
The Generative AI client is designed to run custom Groovy scripts. The scripts can perform many different tasks, but they are most useful when Generative AI services must be called for crawled documents and the results of those calls stored in the document for later indexing. When writing Groovy scripts, we can use so-called binding variables to simplify AI-related tasks:

  • The REST client comes configured with the required authentication methods based on UI parameters
  • The REST client can be configured to handle typical throttling issues, such as 429 errors, automatically
  • Methods for creating the POST body for typical AI tasks are provided
  • Methods for making REST calls to typical AI URLs are provided
  • The REST call response comes in a standard form that is easy to parse and use in the document
  • General template scripts are supported, so an already prepared script can be selected and parameterized through template script properties
  • If secret information must be used in a Groovy script, it can be defined in the UI, stored encrypted, and used in the scripts automatically decrypted
  • Typical scenarios for embeddings and prompts tasks are built in and can be used with just a few lines of Groovy script
  • Different AI service providers are supported; the present version supports Azure and Google Palm

Binding variables


Name in script | Description | Aspire type | Init script | Process script
doc | Crawled document | AspireObject | false | true
component | Aspire workflow component running Groovy scripts | ComponentImpl | true | true
connection.client | REST client component for making AI calls | GenAIRestRequester | true | true
utilities.azure.embeddings, utilities.google.embeddings | Methods related to "embeddings" processing | Embeddings | true | true
job | Job containing the crawled document | Job | false | true
secrets | Map of secrets provided in UI | Map<String,String> | true | true
template | Map of selected script template variables | Map<String,String> | true | true
utilities.azure.prompts, utilities.google.prompts | Methods related to "prompts" processing | Prompts | true | true
utilities.textSplitter | Methods related to text splitting | TextSplitterComponent | true | true
variables | Map of variables provided in initialize script | Map<String,Object> | true | false
utilities | Various helper methods | Utils | true | true

Document

The crawled document can be used for accessing metadata and content, and also for storing new metadata acquired from AI:

doc.add(embeddings.toAspireObject());

Component

The component can typically be used as a logger:

  component.info(" %s","${doc.id}: Got embeddings for sentence: ${currentSentence}")

Connection

The REST client is available via the connection object and can be used for making requests to AI services:

connection.client

Authentication

The REST client is automatically configured from the UI DXF configuration when it is initialized. When the authentication method "NONE" is selected (the default option), authentication must be set up in the initialization script. In our examples we typically add an "apiKey" header field:

connection.client.addHeader("apiKey", "${secrets.apiKey}");

Request

A number of methods are available; all of them are listed in the Javadoc of com.accenture.aspire.genaiclient.scriptsupport.rest.GenAIRestRequester. The methods most likely to be used in AI-related scripts are:

Method | Syntax | Init script | Process script

Execute POST | HttpResponse<?> executePost(String url, AspireObject httpBody) | false | true

  HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). It can be converted using utilities methods, for example utilities.azure.embeddings.convertResponse, to get the desired output (see the Embeddings and Prompts documentation on this page).
  url: AI service URL
  httpBody: the body of the POST. It can also be created using a utilities method, for example utilities.azure.embeddings.createPostBody, which makes it easier to build Embeddings and Prompts requests (see the Embeddings and Prompts documentation on this page).

Execute GET | HttpResponse<?> executeGet(String url) | false | true

  HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory").
  url: AI service URL

Add header | addHeader(String name, String value) | true | false

  name: header name
  value: header value



...
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

....
// call the embeddings endpoint for one sentence and convert the response
def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  def response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence))
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}
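The endpoint example in the Embeddings configuration on this page ends with a slash, so concatenating it as in the snippet above would produce a double slash in the URL. The following is a minimal sketch of one way to guard against that. The helper name buildEmbeddingsUrl and the sample values are assumptions made for illustration; they are not part of the Aspire API, and the map only stands in for the "template" binding variable.

```groovy
// Hypothetical helper (not part of the Aspire API): builds the Azure embeddings
// URL from the same three template properties used in the example above.
String buildEmbeddingsUrl(Map<String, String> template) {
    // Strip trailing slashes so "https://host/" and "https://host" give the same URL
    String base = template.endpoint.replaceAll('/+$', '')
    return "${base}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
}

// Stand-in for the "template" binding variable, with made-up values
def sampleTemplate = [
    endpoint  : 'https://example.openai.azure.com/',
    model     : 'text-embedding-ada-002',
    apiVersion: '2022-12-01'
]

println buildEmbeddingsUrl(sampleTemplate)
// prints https://example.openai.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2022-12-01
```

In a real script the helper would probably be called once in the process script with the actual "template" binding instead of the sample map.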

Throttling, 429 Policy

The 429 policy and the related throttling can be configured using the UI DXF field Policy429. If a value other than None is selected, the connection is throttled automatically:

  • None - no throttling
  • Thread blocking - affects just the current thread. Wait time and retries can be configured
  • Seed blocking - blocks the whole seed and takes into account the wait time required by AI service
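To make the idea behind the thread blocking option more concrete, here is a minimal sketch of a retry loop that waits and retries while a call reports status 429. It only illustrates the general pattern and is not the client's actual implementation. The helper name callWithRetry, its parameters, and the simulated responses are all assumptions; the closure stands in for a real REST request.

```groovy
// Hypothetical helper: retries a call while it reports HTTP 429, sleeping
// between attempts. "call" is any closure that returns a status code.
int callWithRetry(int maxRetries, long waitMillis, Closure<Integer> call) {
    int status = call()
    int attempt = 0
    while (status == 429 && attempt < maxRetries) {
        Thread.sleep(waitMillis)   // blocks only the current thread
        attempt++
        status = call()
    }
    return status                  // still 429 if all retries were used up
}

// Simulated service: throttles the first two calls, then succeeds
def responses = [429, 429, 200]
int calls = 0
def status = callWithRetry(3, 10L) { responses[calls++] }
println "status=${status} after ${calls} calls"
// prints status=200 after 3 calls
```

The wait time and the number of retries correspond to the two settings the thread blocking option exposes; how the client combines them internally is not described on this page.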

You can also handle throttling manually in the Groovy script. For example, here is how to use seed blocking:

def resp = connection.client.executeGet("url");
if(resp.getStatusCode() == 429){
  long pauseSeedUntil = System.currentTimeMillis() + (Integer.valueOf(resp.getHeaders().get("Retry-After")) * 1000);
  throw new com.accenture.aspire.services.ThrottlingNotificationException(pauseSeedUntil);
}
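The snippet above assumes that the Retry-After header is present and holds a number of seconds. The HTTP specification also allows an HTTP date in that header, and a service may omit it entirely. A minimal sketch of a more defensive conversion is shown below. The helper name pauseUntilMillis and the fallback wait are assumptions for illustration only; the code uses just the standard java.time classes.

```groovy
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper: converts a Retry-After header value into an absolute
// epoch-millisecond timestamp. Falls back to a default wait when the header
// is missing or cannot be parsed.
long pauseUntilMillis(String retryAfter, long nowMillis, long defaultWaitMillis) {
    if (retryAfter == null || retryAfter.trim().isEmpty()) {
        return nowMillis + defaultWaitMillis
    }
    String value = retryAfter.trim()
    if (value ==~ /\d{1,9}/) {
        // delay-seconds form, for example "30"
        return nowMillis + Long.parseLong(value) * 1000L
    }
    try {
        // HTTP-date form, for example "Wed, 21 Oct 2015 07:28:00 GMT"
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant().toEpochMilli()
    } catch (DateTimeParseException ignored) {
        return nowMillis + defaultWaitMillis
    }
}

println pauseUntilMillis('30', 1000L, 5000L)   // prints 31000
println pauseUntilMillis(null, 1000L, 5000L)   // prints 6000
```

The returned value could then be passed to ThrottlingNotificationException in place of the inline calculation, with System.currentTimeMillis() as the second argument.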


Embeddings 

utilities.azure.embeddings, utilities.google.embeddings

In the syntax below, aiService stands for azure or google (Palm).

Method | Syntax | Init script | Process script

Initialize Azure. Use it if you want to use the "process" method described below. | void utilities.azure.embeddings.initialize(AspireObject config) | true | false

  config:

  Name | Description | Required | Default | Example
  endpoint | Endpoint | true | - | https://xxx-openai.openai.azure.com/
  model | Model | true | - | text-embedding-ada-002
  apiVersion | API version | true | - | 2022-12-01
  apiKey | API key | true (if this auth is required) | - | 690xxx59xxxx77520cxxx05

Initialize Google Palm. Use it if you want to use the "process" method described below. | void utilities.google.embeddings.initialize(AspireObject config) | true | false

  config:

  Name | Description | Required | Default | Example
  endpoint | Endpoint | true | - |
  model | Model | true | - | embedding-gecko-001
  apiVersion | API version | true | - | test-version
  apiKey | API key | true (if this auth is required) | - | 690xxx59xxxx77520cxxx05



Process. Creates embeddings for each text chunk in the list. The utility must first be initialized via "initialize". | VectorEmbeddingsResult utilities.aiService.embeddings.process(List<String> splitText) | false | true

  VectorEmbeddingsResult: see the format below. All vectors are present.
  splitText: text chunks for creating embeddings
Convert response. Converts the response from an AI embeddings call. The response format can differ slightly between AI providers. The result can be converted to an AspireObject and stored in the document. | VectorEmbeddingsResult utilities.aiService.embeddings.convertResponse(String text, AspireObjectResponse response)

  VectorEmbeddingsResult:

  Method | Description
  Hashtable<String,Double[]> getEmbeddings() | Gets the embedding vectors
  AspireObject toAspireObject() | Converts to AspireObject

  response: HTTP response to convert



Create POST body. Creates the POST body for calling the AI embeddings service. | AspireObject utilities.aiService.embeddings.createPostBody(String text)

  text: text to be converted to the POST body



Create sub document. It can be used when each embeddings chunk is to be posted as a separate sub job. | AspireObject utilities.azure.embeddings.createSubDoc(VectorEmbeddingsResult vectorEmbeddingsResult, AspireObject doc, int chunkCount)

  vectorEmbeddingsResult: previously created embeddings object
  doc: the original script document
  chunkCount: the current text chunk number (see the example below)
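As an illustration of working with the Hashtable<String,Double[]> shape returned by getEmbeddings(), the sketch below iterates over such a table and compares two vectors with cosine similarity. The keys, the vector values, and the helper name cosine are made up for the example. What the keys actually contain is not described on this page, so treating them as text chunks is an assumption.

```groovy
// Stand-in for the value returned by getEmbeddings(); keys and numbers are made up
Hashtable<String, Double[]> embeddings = new Hashtable<>()
embeddings.put('first chunk', [1.0d, 0.0d] as Double[])
embeddings.put('second chunk', [0.6d, 0.8d] as Double[])

// Hypothetical helper: cosine similarity between two vectors of equal length
double cosine(Double[] a, Double[] b) {
    assert a.length == b.length : 'vectors must have the same dimension'
    double dot = 0.0d
    double normA = 0.0d
    double normB = 0.0d
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Report the dimension of every stored vector
embeddings.each { text, vector -> println "${text}: ${vector.length} dimensions" }

// Similarity of the two sample vectors (approximately 0.6)
println cosine(embeddings['first chunk'], embeddings['second chunk'])
```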



Example of an initialization script for when we want to use the embeddings "process" method in the process script:

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))
utilities.azure.embeddings.initialize(getEmbeddingsConfig())

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    returnValue.add("model", "${template.model}");
    returnValue.add("apiVersion", "${template.apiVersion}");
    returnValue.add("apiKey", "${secrets.apiKey}");
    return returnValue;
}

def getTextSplitterConfig(String splitType) {
  .....
}

Example of process script using complex "process" method:

def sentences = utilities.textSplitter.process(doc);
embeddings = utilities.azure.embeddings.process(sentences);
doc.add(embeddings.toAspireObject());

Example of process script publishing sub jobs for each embedding chunk:

import com.accenture.aspire.services.AspireException

// split field "content" and create "sentences"
def sentences = utilities.textSplitter.process(doc);

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

// generate and publish embeddings; sentencesCount is the zero-based index of the current sentence
sentences.eachWithIndex { currentSentence, sentencesCount ->
  def embeddingVector = getEmbeddingsFromSentence(endpointEmbeddings, currentSentence)
  def subJobAO = utilities.azure.embeddings.createSubDoc(embeddingVector, doc, sentencesCount)
  utilities.createSubJob(job, subJobAO)
}

// call the embeddings endpoint for one sentence and convert the response
def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
  def response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence))
  def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
  return embeddings
}

Job

Job can be used when required as a parameter for other methods:

utilities.createSubJob(job, subJobAO)

Secrets

Secrets defined in UI which are stored as encrypted can be accessed in scripts. They are automatically decrypted before using them.

connection.client.addHeader("api-key", "${secrets.apiKey}");
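Because secrets are available in scripts in decrypted form, it is easy to leak them by accident through a log message. A minimal sketch of masking a value before logging follows. The helper name maskSecret, the choice of keeping the last four characters, and the sample key are assumptions for illustration; the map only stands in for the "secrets" binding variable.

```groovy
// Hypothetical helper: keeps the last 4 characters and replaces the rest with '*'
String maskSecret(String secret) {
    if (secret == null) return ''
    if (secret.length() <= 4) return '*' * secret.length()
    return ('*' * (secret.length() - 4)) + secret[-4..-1]
}

// Stand-in for the "secrets" binding variable, with a made-up key
def sampleSecrets = [apiKey: 'abcd1234efgh5678']

println "Using API key ${maskSecret(sampleSecrets.apiKey)}"
// prints Using API key ************5678
```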

Template

If a template script with properties has been selected in the UI, those properties can be accessed in the script:

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    ....
}
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
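Since the URL above is assembled from several template properties, a missing or blank property would only show up later as a failed request. The sketch below shows one possible way to check the properties up front. The helper name missingProperties and the sample values are assumptions for illustration, and the map only stands in for the "template" binding variable.

```groovy
// Hypothetical helper: returns the names of required properties that are missing or blank
List<String> missingProperties(Map<String, String> props, List<String> required) {
    return required.findAll { name -> !props[name]?.trim() }
}

// Stand-in for the "template" binding variable; "model" is deliberately left empty
def sampleTemplate = [endpoint: 'https://example.openai.azure.com', model: '', apiVersion: '2022-12-01']

def missing = missingProperties(sampleTemplate, ['endpoint', 'model', 'apiVersion'])
if (missing) {
    println "Missing template properties: ${missing.join(', ')}"
}
// prints Missing template properties: model
```

A check like this would fit best in the initialization script, so that a misconfigured template is reported once rather than for every document.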

Prompts

// TODO

Text Splitter

// TODO

Text splitter: utilities.textSplitter

Method | Syntax | Init script | Process script

Initialize | void utilities.textSplitter.initialize(AspireObject config) | true | false

  config:

  Name | Description | Default
  splitType | |
  fieldsToSplit | |
  customSplitRegex | |
  characterThreshold | |

Process | List<String> utilities.textSplitter.process(AspireObject doc) | false | true

  List<String>: TODO
  doc: TODO

Example of initialization script:

import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))

def getTextSplitterConfig(String splitType) {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("splitType", splitType);
    returnValue.add("fieldsToSplit", "content");
    returnValue.add("customSplitRegex", "\\|+");
    returnValue.add("characterThreshold", 4);
    return returnValue;
}
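The customSplitRegex value in the script above is an ordinary Java regular expression that matches one or more pipe characters. The sketch below shows only what that pattern does when applied with the standard String.split method on a made-up sample string. How the text splitter itself applies the pattern, and what characterThreshold does, is not documented on this page, so neither is modelled here.

```groovy
// Made-up sample text with runs of one or more pipe characters as separators
def text = 'first part||second part|||third'

// The same pattern as in the configuration: "\\|+" means "one or more '|' characters"
def chunks = text.split('\\|+') as List

println chunks
// prints [first part, second part, third]
```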

Example script:

def sentences = utilities.textSplitter.process(doc);

Variables

// TODO

Utilities

// TODO
