The Generative AI client is designed to run custom Groovy scripts. Scripts can perform many different tasks, but they are most useful when Generative AI services need to be called for crawled documents and the results of those calls stored in the document for later indexing. Groovy scripts can use so-called binding variables to simplify AI-related tasks:
Name in script | Description | Aspire type | Init script | Process script |
---|---|---|---|---|
doc | Crawled document | AspireObject | false | true |
component | Aspire workflow component running Groovy scripts | ComponentImpl | true | true |
connection.client | REST client component for making AI calls | GenAIRestRequester | true | true |
utilities.azure.embeddings, utilities.google.embeddings | Methods related to "embeddings" processing | Embeddings | true | true |
job | Job containing the crawled document | Job | false | true |
secrets | Map of secrets provided in UI | Map<String,String> | true | true |
template | Map of selected script template variables | Map<String,String> | true | true |
utilities.azure.prompts, utilities.google.prompts | Methods related to "prompts" processing | Prompts | true | true |
utilities.textSplitter | Methods related to text splitting | TextSplitterComponent | true | true |
variables | Map of variables provided in the initialization script | Map<String,Object> | true | false |
utilities | Various helper methods | Utils | true | true |
The crawled document can be used to access metadata and content, and to store new metadata acquired from AI:
```groovy
doc.add(embeddings.toAspireObject());
```
The component can typically be used as a logger:
```groovy
component.info(" %s","${doc.id}: Got embeddings for sentence: ${currentSentence}")
```
The REST client is available via the connection object and can be used to make requests to AI services.
connection.client |
---|
The REST client is configured automatically from the UI DXF configuration when it is initialized. When the authentication method "NONE" is selected (the default option), authentication must be handled in the initialization script. Our examples typically do this by adding an "apiKey" header field:
```groovy
connection.client.addHeader("apiKey", "${secrets.apiKey}");
```
A number of methods are available; all of them are listed in the Javadoc of com.accenture.aspire.genaiclient.scriptsupport.rest.GenAIRestRequester. The methods below are the ones most likely to be used in AI-related scripts:
Method | Syntax | Init script | Process script | Details |
---|---|---|---|---|
execute POST | HttpResponse<?> executePost(String url, AspireObject httpBody) | false | true | HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). It can be converted with utilities methods such as utilities.azure.embeddings.convertResponse to get the desired output (see the Embeddings and Prompts documentation on this page). url: AI service URL. httpBody: the body of the POST. It can also be created with a utilities method such as utilities.azure.embeddings.createPostBody, which simplifies Embeddings and Prompts requests (see the Embeddings and Prompts documentation on this page). |
execute GET | HttpResponse<?> executeGet(String url) | false | true | HttpResponse: most likely an AspireObjectResponse (it depends on the UI configuration field "responseFactory"). url: AI service URL. |
Add header | addHeader(String name, String value) | true | false | name: header name. value: header value. |
```groovy
...
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
....

def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
    // POST the sentence to the embeddings endpoint and convert the raw response
    def response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
    def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
    return embeddings
}
```
The 429 policy and related throttling can be configured using the UI DXF field Policy429. If a value is selected, the connection is throttled automatically.
You can also handle throttling manually in a Groovy script. For example, here is how to use seed blocking:
```groovy
def resp = connection.client.executeGet("url");
if (resp.getStatusCode() == 429) {
    // Pause the seed until the time given by the Retry-After header (in seconds)
    long pauseSeedUntil = System.currentTimeMillis() + (Integer.valueOf(resp.getHeaders().get("Retry-After")) * 1000);
    throw new com.accenture.aspire.services.ThrottlingNotificationException(pauseSeedUntil);
}
```
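The Retry-After header is not always a plain number: under the HTTP specification it can carry either a delay in seconds or an HTTP date. The following standalone sketch shows one way to compute the pause timestamp for both forms. The helper name pauseUntilMillis and the 60-second fallback are assumptions made for illustration; neither is part of the Aspire API.

```groovy
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException

// Hypothetical helper (not part of Aspire): converts a Retry-After header value
// into an absolute timestamp in epoch milliseconds.
long pauseUntilMillis(String retryAfter, long nowMillis, long fallbackSeconds = 60) {
    if (retryAfter == null || retryAfter.trim().isEmpty()) {
        // Header missing: fall back to an assumed default delay
        return nowMillis + fallbackSeconds * 1000L
    }
    String value = retryAfter.trim()
    if (value.isLong()) {
        // Form 1: delay in seconds, e.g. "120"
        return nowMillis + Math.max(0L, value.toLong()) * 1000L
    }
    try {
        // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        long target = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant().toEpochMilli()
        // Never return a timestamp in the past
        return Math.max(nowMillis, target)
    } catch (DateTimeParseException ignored) {
        // Unparseable value: use the fallback delay
        return nowMillis + fallbackSeconds * 1000L
    }
}

// A 120-second delay measured from t = 1000 ms
assert pauseUntilMillis("120", 1000L) == 121000L
```

In a process script, and assuming the header value is returned as a String, the result of pauseUntilMillis(resp.getHeaders().get("Retry-After"), System.currentTimeMillis()) could be passed to ThrottlingNotificationException in place of the inline computation.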
utilities.azure.embeddings, utilities.google.embeddings (aiService = azure or google (Palm)) |
---|
Method | Syntax | Init script | Process script | Details |
---|---|---|---|---|
Initialize Azure | void utilities.azure.initialize(AspireObject config) | true | false | Use it if you want to use the "process" method described below. config: see the example initialization script below. |
Initialize Google Palm | void utilities.google.initialize(AspireObject config) | true | false | Use it if you want to use the "process" method described below. config: configuration object. |
Process | VectorEmbeddingResult utilities.aiService.embeddings.process(List<String> splitText) | false | true | Creates embeddings for each text chunk provided in the list. It must be initialized first via "initialize". VectorEmbeddingResult: see the format below; all vectors are present. splitText: text chunks for creating embeddings. |
Convert response | VectorEmbeddingsResult utilities.aiService.embeddings.convertResponse(String text, AspireObjectResponse response) | | | Converts the response from an AI embeddings call. The response format can differ slightly between AI providers. The result can be converted to an AspireObject and stored in the document. VectorEmbeddingsResult: TODO. response: HTTP response to convert. |
Create POST body | AspireObject utilities.aiService.embeddings.createPostBody(String text) | | | Creates the POST body for calling the AI embeddings service. text: text to be converted to the POST body. |
Create sub document | AspireObject utilities.azure.embeddings.createSubDoc(VectorEmbeddingsResult vectorEmbeddingsResult, AspireObject doc, int chunkCount) | | | Can be used when each embeddings chunk is to be posted as a separate sub job. vectorEmbeddingsResult: previously created embedding object. doc: the original script document. chunkCount: the current text chunk number (see the example below). |
Example of an initialization script for the case where the embeddings "process" method is used in the process script:
```groovy
import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))
utilities.azure.embeddings.initialize(getEmbeddingsConfig())

def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    returnValue.add("model", "${template.model}");
    returnValue.add("apiVersion", "${template.apiVersion}");
    returnValue.add("apiKey", "${secrets.apiKey}");
    return returnValue;
}

def getTextSplitterConfig(String splitType) {
    .....
}
```
Example of a process script using the "process" method:
```groovy
// Split the document into chunks, create embeddings for all of them, and store the result
def sentences = utilities.textSplitter.process(doc);
def embeddings = utilities.azure.embeddings.process(sentences);
doc.add(embeddings.toAspireObject());
```
Example of a process script that publishes a sub job for each embedding chunk:
```groovy
import com.accenture.aspire.services.AspireException

// split field "content" and create "sentences"
def sentences = utilities.textSplitter.process(doc);

// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

// generate and publish embeddings
sentences.eachWithIndex { currentSentence, sentencesCount ->
    def embeddingVector = getEmbeddingsFromSentence(endpointEmbeddings, currentSentence)
    def subJobAO = utilities.azure.embeddings.createSubDoc(embeddingVector, doc, sentencesCount);
    utilities.createSubJob(job, subJobAO)
}

def getEmbeddingsFromSentence(endpointEmbeddings, sentence) {
    def response = connection.client.executePost(endpointEmbeddings, utilities.azure.embeddings.createPostBody(sentence));
    def embeddings = utilities.azure.embeddings.convertResponse(sentence, response)
    return embeddings
}
```
The job can be passed as a parameter to other methods when required:
```groovy
utilities.createSubJob(job, subJobAO)
```
Secrets defined in the UI are stored encrypted and can be accessed in scripts. They are decrypted automatically before use.
```groovy
connection.client.addHeader("api-key", "${secrets.apiKey}");
```
If a template script with properties has been selected in the UI, those properties can be accessed in the script:
```groovy
def getEmbeddingsConfig() {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("endpoint", "${template.endpoint}");
    ....
}
```
```groovy
// url
endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"
```
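The interpolation above relies on Groovy resolving an expression such as ${template.model} as a map lookup inside a double-quoted string. A minimal standalone sketch follows, using a plain Map with made-up values in place of the template binding variable; the endpoint, model and API version shown are placeholders, not real settings.

```groovy
// Stand-in for the "template" binding variable; all values are hypothetical
Map<String, String> template = [
    endpoint  : "https://example.invalid",
    model     : "my-embedding-model",
    apiVersion: "2023-05-15"
]

// Fail early with a clear message if a template property was left empty
["endpoint", "model", "apiVersion"].each { key ->
    assert template[key] : "Template property '${key}' is not set"
}

// Property-style access on a Map (template.endpoint) is equivalent to template.get("endpoint")
String endpointEmbeddings = "${template.endpoint}/openai/deployments/${template.model}/embeddings?api-version=${template.apiVersion}"

println endpointEmbeddings
```

Checking the properties before building the URL is a design choice of this sketch: a missing property would otherwise be interpolated as the string "null" and only surface later as a failed request.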
// TODO
// TODO
Text splitter | utilities.textSplitter |
---|
Method | Syntax | Init script | Process script | Details |
---|---|---|---|---|
Initialize | void utilities.textSplitter.initialize(AspireObject config) | true | false | config: see the example initialization script below. |
Process | List<String> utilities.textSplitter.process(AspireObject doc) | false | true | List<String>: the text chunks produced by the split. doc: the crawled document whose fields are split. |
Example of an initialization script:
```groovy
import com.accenture.aspire.services.AspireObject;

utilities.textSplitter.initialize(getTextSplitterConfig("sentence"))

def getTextSplitterConfig(String splitType) {
    AspireObject returnValue = new AspireObject("config");
    returnValue.add("splitType", splitType);
    returnValue.add("fieldsToSplit", "content");
    returnValue.add("customSplitRegex", "\\|+");
    returnValue.add("characterThreshold", 4);
    return returnValue;
}
```
Example of a process script:
```groovy
def sentences = utilities.textSplitter.process(doc);
```
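How the splitter applies customSplitRegex and characterThreshold is not specified on this page. Purely as an illustration of what the pattern "\\|+" from the initialization example matches, the following standalone sketch splits a string on runs of pipe characters and drops very short chunks. The helper splitOnPattern and the reading of the threshold as a minimum chunk length are assumptions, not the documented behavior of utilities.textSplitter.

```groovy
// Hypothetical helper, not the Aspire text splitter: splits text on a regex
// and keeps only chunks with at least minChars characters after trimming.
List<String> splitOnPattern(String text, String regex, int minChars) {
    text.split(regex)
        .collect { it.trim() }
        .findAll { it.length() >= minChars }
}

// One or more consecutive "|" characters act as a single separator
def chunks = splitOnPattern("First part || second part | ok ||| last one", "\\|+", 4)

// "ok" is shorter than 4 characters and is dropped under the assumed threshold rule
assert chunks == ["First part", "second part", "last one"]
```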
// TODO
// TODO