
The GPT3 Proxy manages the communication with the OpenAI BETA APIs (Completion and Semantic Search), restricting some features and adding the possibility to use predefined configurations simply by calling the ID assigned to each one.

Configuration in File


In the Saga Server configuration file, located at ./config/config.json, there is a field named gpt3 (if the field does not exist, it can be added). Inside it, a field named key holds the OpenAI API secret key; the key can be either encrypted or plain text, and it is the key the GPT3 Proxy uses for all calls. Besides key, there are two other fields: openAIHost, which holds the URL of the OpenAI BETA APIs ("https://api.openai.com"), and openAIAPIVersion, which holds the API version to use, e.g. "v1".


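A minimal sketch of the gpt3 section of config.json, using only the fields described above (the key value is a placeholder):

```json
{
    "gpt3": {
        "key": "<your-openai-secret-key>",
        "openAIHost": "https://api.openai.com",
        "openAIAPIVersion": "v1"
    }
}
```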


We recommend encrypting the key using our Saga Secure utility.



GPT3 Predefined Configurations

One feature the GPT3 Proxy offers is the possibility to create and use predefined configurations for both semantic search and text completion (text generation).

To create a GPT3 configuration:

  1. Access the Saga user interface,
  2. Go to the gear icon in the top right corner
  3. Select GPT3
  4. Click on the plus button
  5. Select either Search (for semantic search) or Completion (for text generation)



Completion Configuration

The Completion configuration holds all the parameters provided to the OpenAI API; each one has a detailed description. The prompt text area is optional, but if no prompt is provided, it must be specified when calling the process endpoint.

Search Configuration

The Search configuration only asks for the engine to use and, optionally, the documents to use when processing the query. As with the Completion configuration, if no documents are provided, they must be specified when calling the process endpoint.

Each document added is represented by an index number, shown next to the label of each document as Document (X). This is the number returned in the response of the process endpoint.

API

The GPT3 Proxy API mimics the OpenAI API, restricting some features, handling the secret key, and adding the possibility of working with predefined configurations.
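The endpoints below can also be called from code instead of curl. A minimal sketch, assuming the proxy runs on the host and port used in the request samples on this page (the helper and its names are illustrative, not part of the product):

```python
import json
from urllib import request

# Host and port taken from the request samples on this page.
BASE = "http://localhost:8080/_saga/gpt3"

def build_request(path, payload=None):
    """Build a urllib Request for a GPT3 Proxy endpoint.

    GET when there is no payload, otherwise POST with a JSON body,
    mirroring the curl samples on this page.
    """
    url = "%s/%s" % (BASE, path)
    if payload is None:
        return request.Request(url, method="GET")
    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: run a query against a stored completion configuration.
req = build_request("process", {"_id": "LSFAz3YBJerddQVL4fIL",
                                "text": "Are you a machine?"})
# The request would then be sent with request.urlopen(req).
```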

GET saga/api/v2/gpt3/engines

Parameters

  • No parameter required

Request Examples

curl --location --request GET 'http://localhost:8080/_saga/gpt3/engines'

Response

[
  {
    "owner": "openai",
    "ready": true,
    "id": "ada",
    "object": "engine"
  },
  {
    "owner": "openai",
    "ready": true,
    "id": "babbage",
    "object": "engine"
  },
  {
    "owner": "openai",
    "ready": true,
    "id": "curie",
    "object": "engine"
  },
  {
    "owner": "openai",
    "ready": true,
    "id": "davinci",
    "object": "engine"
  }
]

The engines returned may differ from the ones in the sample.

GET saga/api/v2/gpt3/configs

Parameters

  • No parameter required

Request Example

curl --location --request GET 'http://localhost:8080/_saga/gpt3/configs'

Response

[
  {
    "name": "search1",
    "_id": "FSEaz3YBJerddQVLu_Ly",
    "type": "search",
    "updatedAt": 1609792404466
  },
  {
    "name": "completion1",
    "_id": "EyEaz3YBJerddQVLf_L7",
    "type": "completion",
    "updatedAt": 1609792389083
  } 
]

POST saga/api/v2/gpt3/create

Completion Configuration

Parameters

  • name ( type=string | required ) - Name of the configuration

  • type ( type=string | default=completion | required ) - Indicates the type of configuration to create; for completion it is always "completion"

  • config ( type=json | required ) - Body with the parameters for the OpenAI API call

    • engine_id ( type=string | required ) - Name of the engine to use for the completion

    • prompt ( type=string | optional ) - The prompt(s) to generate completions for, encoded as a string, a list of strings, or a list of token lists.

      Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.

    • max_tokens ( type=integer | default=16 | optional ) - The maximum number of tokens to generate. Requests can use up to 2048 tokens shared between prompt and completion. (One token is roughly 4 characters for normal English text)

    • temperature ( type=double | default=1 | optional ) - What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer.

      We generally recommend altering this or top_p but not both.

    • top_p ( type=double | default=1 | optional ) - An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

      We generally recommend altering this or temperature but not both.

    • frequency_penalty ( type=double | default=0 | optional ) - Number between 0 and 1 that penalizes new tokens based on their existing frequency in the text so far. Decreases the model's likelihood to repeat the same line verbatim.

    • presence_penalty ( type=double | default=0 | optional ) - Number between 0 and 1 that penalizes new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics.

    • best_of ( type=integer | default=1 | optional ) - Generates best_of completions server-side and returns the "best" (the one with the lowest log probability per token). Results cannot be streamed.

      Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for max_tokens and stop.

    • logprobs ( type=integer | default=null | optional ) - Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. The API will always return the logprob of the sampled token, so there may be up to logprobs+1 elements in the response.

    • echo ( type=boolean | default=false | optional ) - Echo back the prompt in addition to the completion

    • stop ( type=string array | required ) - A string or array of up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

    • injectStart ( type=string array | optional ) - Text to append after the user's input to format the model for a response.

    • injectRestart ( type=string array | required ) - Text to append after the model's generation to continue the patterned structure.

Request Sample

curl --location --request POST 'http://localhost:8080/_saga/gpt3/create' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "completion1",
    "type": "completion",
    "config": {
        "injectStart": "",
        "injectRestart": "",
        "engine_id": "davinci",
        "prompt": "Here goes the prompt",
        "max_tokens": 16,
        "temperature": 1,
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "best_of": 1,
        "logprobs": 0,
        "echo": false,
        "stop": []
    }
}'


Semantic Search Configuration

Parameters

  • name ( type=string | required ) - Name of the configuration

  • type ( type=string | default=search | required ) - Indicates the type of configuration to create; for semantic search it is always "search"

  • config ( type=json | required ) - Body with the parameters for the OpenAI API call

    • engine_id ( type=string | required ) - Name of the engine to use for the semantic search

    • documents ( type=string array | default=[] | optional ) - Up to 200 documents to search over, provided as a list of strings. These will be the categories to choose from.

      • This field can be left empty in the configuration, but the documents must then be provided when processing the query
      • Each document is represented by its index position in the response of the process endpoint; the document text itself is not returned

      The maximum document length (in tokens) is 2034 minus the number of tokens in the query.
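The limit above can be estimated with the rough heuristic given for max_tokens (one token is roughly 4 characters of normal English text). A minimal sketch; the helper names are illustrative:

```python
def approx_tokens(text):
    # Rough heuristic from this page: one token ~ 4 characters of English text.
    return max(1, len(text) // 4)

def max_document_tokens(query):
    # The maximum document length (in tokens) is 2034 minus the query's tokens.
    return 2034 - approx_tokens(query)

budget = max_document_tokens("Jaws")
```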

Request Sample

curl --location --request POST 'http://localhost:8080/_saga/gpt3/create' \
--header 'Content-Type: application/json' \
--data-raw '{
    "name": "search1",
    "type": "search",
    "config": {
        "engine_id": "davinci",
        "documents": ["text here", "text there"]
    }
}'


POST saga/api/v2/gpt3/update

Completion Configuration

Parameters

Same as for the creation endpoint, but with two extra parameters:

  • _id ( type=string | required ) - UID of the configuration to update

  • body ( type=json | required ) - Contains all the data from the creation endpoint

Request Sample

curl --location --request POST 'http://localhost:8080/_saga/gpt3/update' \
--header 'Content-Type: application/json' \
--data-raw '{
    "_id": "EyEaz3YBJerddQVLf_L7",
    "body": {
        "name": "completion1",
        "type": "completion",
        "config": {
            "injectStart": "",
            "injectRestart": "",
            "engine_id": "davinci",
            "prompt": "Here goes the prompt",
            "max_tokens": 16,
            "temperature": 1,
            "top_p": 1,
            "frequency_penalty": 0,
            "presence_penalty": 0,
            "best_of": 1,
            "logprobs": 0,
            "echo": false,
            "stop": []
        }
    }
}'


Semantic Search Configuration

Parameters

Same as for the creation endpoint, but with two extra parameters:

  • _id ( type=string | required ) - UID of the configuration to update

  • body ( type=json | required ) - Contains all the data from the creation endpoint

Request Sample

curl --location --request POST 'http://localhost:8080/_saga/gpt3/update' \
--header 'Content-Type: application/json' \
--data-raw '{
    "_id": "FSEaz3YBJerddQVLu_Ly",
    "body": {
        "name": "search1",
        "type": "search",
        "config": {
            "engine_id": "davinci",
            "documents": ["text here", "text there"]
        }
    }
}'


POST saga/api/v2/gpt3/delete

Parameters

  • _id ( type=string | required ) - UID of the configuration to delete

Request Sample

curl --location --request POST 'http://localhost:8080/_saga/gpt3/delete' \
--header 'Content-Type: application/json' \
--data-raw '{"_id": "HCEgz3YB"}'


POST saga/api/v2/gpt3/process

Completion Process

Parameters

  • _id ( type=string | required ) - UID of the configuration to use for the process

  • text ( type=string | required ) - Query to process with the completion prompt

Request Sample (Completion)

curl --location --request POST 'http://localhost:8080/_saga/gpt3/process' \
--header 'Content-Type: application/json' \
--data-raw '{"_id": "LSFAz3YBJerddQVL4fIL", "text": "Are you a machine?"}'

Response Sample (Completion)

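Assuming the proxy forwards the OpenAI Completion response unchanged, the response has roughly the following shape (all values are illustrative, not actual output):

```json
{
    "id": "cmpl-xxxxxxxx",
    "object": "text_completion",
    "created": 1609792404,
    "model": "davinci",
    "choices": [
        {
            "text": " No, I am a language model.",
            "index": 0,
            "logprobs": null,
            "finish_reason": "length"
        }
    ]
}
```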

Semantic Search Process

Parameters

  • _id ( type=string | required ) - UID of the configuration to use for the process

  • text ( type=string | required ) - Query to categorize against the configured documents

Request Sample (Semantic Search)

curl --location --request POST 'http://localhost:8080/_saga/gpt3/process' \
--header 'Content-Type: application/json' \
--data-raw '{"_id": "LyFFz3YBJerddQVLP_Il", "text": "Jaws"}'

Response Sample (Semantic Search)

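Assuming the proxy forwards the OpenAI Semantic Search response unchanged, the response has roughly the following shape; note that each result refers to a document by its index position, not its text (all values are illustrative, not actual output):

```json
{
    "object": "list",
    "data": [
        {
            "document": 0,
            "object": "search_result",
            "score": 215.412
        },
        {
            "document": 1,
            "object": "search_result",
            "score": 40.316
        }
    ],
    "model": "davinci"
}
```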
