The Saga Server configuration file, located at ./config/config.json, contains a field named gpt3 (if the field is not present it can be added). Inside this field, another field named key holds the OpenAI API secret key, either encrypted or as plain text; this is the key the GPT3 Proxy uses for all calls it executes. Besides key, there are two other fields: openAIHost, which holds the URL of the OpenAI BETA APIs ("https://api.openai.com"), and openAIAPIVersion, which holds the API version to use, e.g. "v1".
. . .
"gpt3": {
    "key": "encrypted:7E7fhr1bFaA1TQlh4ZdB",
    "openAIHost": "https://api.openai.com",
    "openAIAPIVersion": "v1"
},
. . .
We recommend encrypting the key using our Saga Secure utility.
One feature the GPT3 Proxy offers is the ability to create and use predefined configurations for both semantic search and text completion (text generation).
The way to generate GPT3 configurations is to:
The Completion configuration holds all the parameters for the OpenAI API, each with a detailed description. The prompt text area is optional, but if none is provided the prompt must be specified when calling the process endpoint.
The Search configuration only asks for the engine to use and, optionally, the documents to use when processing the query. As with Completion, if no documents are provided they must be specified when calling the process endpoint.
Each document added is represented by an index number shown next to its label, e.g. Document (X). This is the number returned in the response of the process endpoint.
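As an illustrative sketch in Python (the documents list and score below are made up), the Document (X) index maps back to the configured documents like this:

```python
# The documents configured for a search configuration, in order.
# Document (0) is the first entry, Document (1) the second, and so on.
documents = ["text here", "text there"]

# A made-up result in the shape returned by the process endpoint.
result = {"score": 62.777, "document": 1, "object": "search_result"}

# The "document" field is the index shown next to the document label.
matched_text = documents[result["document"]]
print(matched_text)  # prints "text there"
```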
The GPT3 Proxy API mimics the OpenAI API, restricting some features, handling the secret key, and adding the ability to work with predefined configurations.
name (type=string | required) - Name of the configuration
type (type=string | default=completion | required) - Indicates the type of configuration to create; for completion it is always "completion"
config (type=json | required) - Body with the parameters for the OpenAI API call
engine_id (type=string | required) - Name of the engine to use for the completion
prompt (type=string | required) - The prompt(s) to generate completions for, encoded as a string, a list of strings, or a list of token lists. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.
max_tokens (type=integer | default=16 | optional) - The maximum number of tokens to generate. Requests can use up to 2048 tokens shared between prompt and completion. (One token is roughly 4 characters for normal English text.)
temperature (type=double | default=1 | optional) - What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend altering this or top_p but not both.
top_p (type=double | default=1 | optional) - An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.
frequency_penalty (type=double | default=0 | optional) - Number between 0 and 1 that penalizes new tokens based on their existing frequency in the text so far. Decreases the model's likelihood to repeat the same line verbatim.
presence_penalty (type=double | default=0 | optional) - Number between 0 and 1 that penalizes new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics.
best_of (type=integer | default=1 | optional) - Generates best_of completions server-side and returns the "best" (the one with the lowest log probability per token). Results cannot be streamed. Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.
logprobs (type=integer | default=null | optional) - Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.
echo (type=boolean | default=false | optional) - Echo back the prompt in addition to the completion
stop (type=string array | required) - String or array. Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
injectStart (type=string array | optional) - Text to append after the user's input to format the model for a response.
injectRestart (type=string array | required) - Text to append after the model's generation to continue the patterned structure.
name (type=string | required) - Name of the configuration
type (type=string | default=search | required) - Indicates the type of configuration to create; for semantic search it is always "search"
config (type=json | required) - Body with the parameters for the OpenAI API call
engine_id (type=string | required) - Name of the engine to use for the semantic search
documents (type=string array | default=[] | optional) - Up to 200 documents to search over, provided as a list of strings. These will be the categories to choose from. The maximum document length (in tokens) is 2034 minus the number of tokens in the query.
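Since the per-document budget depends on the query length in tokens, a rough Python sketch of the limit follows, using the approximate 4-characters-per-token rule of thumb mentioned for max_tokens (this heuristic is an assumption for illustration, not the tokenizer OpenAI actually uses):

```python
def approx_tokens(text: str) -> int:
    # Rough rule of thumb: one token is ~4 characters of English text.
    return max(1, len(text) // 4)

def max_document_tokens(query: str) -> int:
    # Maximum document length (in tokens) is 2034 minus the query's tokens.
    return 2034 - approx_tokens(query)

print(max_document_tokens("Jaws"))  # 2033 under this heuristic
```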
_id (type=string | required) - UID of the configuration to update
body (type=json | required) - Contains all the data from the creation endpoint
_id (type=string | required) - UID of the configuration to update
body (type=json | required) - Contains all the data from the creation endpoint
_id (type=string | required) - UID of the configuration to delete
_id (type=string | required) - UID of the configuration to use for the process
text (type=string | required) - Query to process with the completion prompt
_id (type=string | required) - UID of the configuration to use for the process
text (type=string | required) - Query to categorize

GET saga/api/v2/gpt3/engines
Parameters
No parameters required.
Request Example
curl --location --request GET 'http://localhost:8080/saga/api/v2/gpt3/engines'
Response
The engines returned may differ from the ones in this sample.
[
{
"owner": "openai",
"ready": true,
"id": "ada",
"object": "engine"
},
{
"owner": "openai",
"ready": true,
"id": "babbage",
"object": "engine"
},
{
"owner": "openai",
"ready": true,
"id": "curie",
"object": "engine"
},
{
"owner": "openai",
"ready": true,
"id": "davinci",
"object": "engine"
}
]
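As a usage sketch, a client could reduce the response above to just the ids of engines that are ready (field names are taken from the sample response; the parsing itself is illustrative):

```python
import json

# The body returned by GET saga/api/v2/gpt3/engines (sample above).
body = '''[
 {"owner": "openai", "ready": true, "id": "ada", "object": "engine"},
 {"owner": "openai", "ready": true, "id": "babbage", "object": "engine"},
 {"owner": "openai", "ready": true, "id": "curie", "object": "engine"},
 {"owner": "openai", "ready": true, "id": "davinci", "object": "engine"}
]'''

# Keep the id of every engine whose "ready" flag is true.
ready_ids = [e["id"] for e in json.loads(body) if e["ready"]]
print(ready_ids)  # ['ada', 'babbage', 'curie', 'davinci']
```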
GET saga/api/v2/gpt3/configs
Parameters
No parameters required.
Request Example
curl --location --request GET 'http://localhost:8080/saga/api/v2/gpt3/configs'
Response
[
{
"name": "search1",
"_id": "FSEaz3YBJerddQVLu_Ly",
"type": "search",
"updatedAt": 1609792404466
},
{
"name": "completion1",
"_id": "EyEaz3YBJerddQVLf_L7",
"type": "completion",
"updatedAt": 1609792389083
}
]
POST saga/api/v2/gpt3/create
Completion Configuration
Parameters
Request Sample
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/create' \
--header 'Content-Type: application/json' \
--data-raw '{"name":"completion1","type":"completion","config":{"injectStart":"","injectRestart":"","engine_id":"davinci","prompt":"Here goes the prompt","max_tokens":16,"temperature":1,"top_p":1,"frequency_penalty":0,"presence_penalty":0,"best_of":1,"logprobs":0,"echo":false,"stop":[]}}'
Response Sample
{
"_success":true,
"createdAt":1609794904329,
"name":"completion1",
"_id":"LSFAz3YBJerddQVL4fIL",
"type":"completion",
"config":{
"max_tokens":16,
"presence_penalty":0,
"echo":false,
"logprobs":0,
"top_p":1,
"frequency_penalty":0,
"best_of":1,
"stop":[],
"engine_id":"davinci",
"temperature":1,
"injectStart":"",
"prompt":"Here goes the prompt",
"injectRestart":""
},
"updatedAt":1609794904329
}
Semantic Search Configuration
Parameters
Request Sample
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/create' \
--header 'Content-Type: application/json' \
--data-raw '{"name":"search1","type":"search","config":{"engine_id":"davinci","documents":["text here","text there"]}}'
Response Sample
{
"_success":true,
"createdAt":1609795190564,
"name":"search1",
"_id":"LyFFz3YBJerddQVLP_Il",
"type":"search",
"config":{
"engine_id":"davinci",
"documents":["text here","text there"]
},
"updatedAt":1609795190564
}
POST saga/api/v2/gpt3/update
Completion Configuration
Parameters
Request Sample
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/update' \
--header 'Content-Type: application/json' \
--data-raw '{"_id":"EyEaz3YBJerddQVLf_L7","body":{"name":"completion1","type":"completion","config":{"injectStart":"","injectRestart":"","engine_id":"davinci","prompt":"Here goes the prompt","max_tokens":16,"temperature":1,"top_p":1,"frequency_penalty":0,"presence_penalty":0,"best_of":1,"logprobs":0,"echo":false,"stop":[]}}}'
Response Sample
{
"_success":true,
"name":"completion1",
"_id":"EyEaz3YBJerddQVLf_L7",
"type":"completion",
"config":{
"max_tokens":16,
"presence_penalty":0,
"echo":false,
"logprobs":0,
"top_p":1,
"frequency_penalty":0,
"best_of":1,
"stop":[],
"engine_id":"davinci",
"temperature":1,
"injectStart":"",
"prompt":"Here goes the prompt",
"injectRestart":""
},
"updatedAt":1609794618513
}
Semantic Search Configuration
Parameters
Same as for the creation endpoint, but with two extra parameters.
Request Sample
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/update' \
--header 'Content-Type: application/json' \
--data-raw '{"_id":"FSEaz3YBJerddQVLu_Ly","body":{"name":"search1","type":"search","config":{"engine_id":"davinci","documents":["text here","text there"]}}}'
Response Sample
{
"_success":true,
"name":"search1",
"_id":"FSEaz3YBJerddQVLu_Ly",
"type":"search",
"config":{
"engine_id":"davinci",
"documents":["text here","text there"]
},
"updatedAt":1609792669827
}
POST saga/api/v2/gpt3/delete
Parameters
Request Sample
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/delete' \
--header 'Content-Type: application/json' \
--data-raw '{"_id":"HCEgz3YB"}'
Response Sample (Successful)
{
"msg": "deleted",
"_success": true
}
POST saga/api/v2/gpt3/process
Completion Process
Parameters
Request Sample (Completion)
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/process' \
--header 'Content-Type: application/json' \
--data-raw '{"_id":"LSFAz3YBJerddQVL4fIL","text": "Are you a machine?"}'
Response Sample (Completion)
Semantic Search Process
Parameters
Request Sample (Semantic Search)
curl --location --request POST 'http://localhost:8080/saga/api/v2/gpt3/process' \
--header 'Content-Type: application/json' \
--data-raw '{"_id":"LyFFz3YBJerddQVLP_Il","text": "Jaws"}'
Response Sample (Semantic Search)
{
"_success": true,
"data": [
{
"score": 182.615,
"document": 2,
"object": "search_result"
},
{
"score": 62.777,
"document": 3,
"object": "search_result"
},
{
"score": 43.224,
"document": 1,
"object": "search_result"
},
{
"score": -10.13,
"document": 5,
"object": "search_result"
},
{
"score": -21.954,
"document": 0,
"object": "search_result"
},
{
"score": -25.182,
"document": 4,
"object": "search_result"
},
{
"score": -26.405,
"document": 6,
"object": "search_result"
}
],
"model": "davinci:2020-05-03",
"object": "list"
}
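Higher scores mean a closer match, so selecting the winning category from a response like the one above can be sketched as follows (the seven document titles are hypothetical placeholders for whatever was configured; the scores and indices are taken from the sample):

```python
# Hypothetical documents list, in configuration order; the sample
# response ranks seven documents, so seven entries are assumed.
documents = ["doc 0", "doc 1", "doc 2", "doc 3", "doc 4", "doc 5", "doc 6"]

# Scores and indices from the sample response.
results = [
    {"score": 182.615, "document": 2},
    {"score": 62.777, "document": 3},
    {"score": 43.224, "document": 1},
    {"score": -10.13, "document": 5},
    {"score": -21.954, "document": 0},
    {"score": -25.182, "document": 4},
    {"score": -26.405, "document": 6},
]

# Pick the highest-scoring result and look up its document text.
best = max(results, key=lambda r: r["score"])
print(documents[best["document"]])  # prints "doc 2"
```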