When deploying GAIA API, a common approach is to build a Linux container image, normally using Docker. You can then deploy that container image in one of a few possible ways.

Using Linux containers has several advantages, including security, replicability, and simplicity.

In a hurry and already know this stuff? Jump to the Dockerfile below.



Build a Docker Image for GAIA API

This section will show you how to build a Docker image for GAIA API from scratch, based on the official Python image.

This is what you would want to do in most cases, for example:

  • Using Kubernetes or similar tools
  • When running on a Raspberry Pi
  • Using a cloud service that would run a container image for you, etc.

Dockerfile

We provide a Dockerfile to build the GAIA API, pinned by default to Python 3.11.9. It is optimized for minimal size by using the slim variant of the Python Docker image. The Dockerfile is structured to allow customization through build-time arguments, enabling flexibility in configuring the application environment and dependencies.

Dockerfile
# syntax=docker/dockerfile:1

# Use a specific version of the Python image, slim variant to keep the image size minimal
ARG PYTHON_VERSION=3.11.9
FROM python:$PYTHON_VERSION-slim AS GaiaAPI

# Arguments for dependency installation and PYQPL library location
# Options for INSTALL_DEPENDENCIES: [ldap], [genai], [all], [ldap,genai] or leave empty
ARG INSTALL_DEPENDENCIES=""
ARG PYQPL_LOCATION=lib/pyqpl-1.1.4-py3-none-any.whl
ARG GAIA_CORE_LOCATION=lib/gaia_core-3.0.0.dev1-py3-none-any.whl

# Set GAIA_ENV as an environment variable, default value is 'system_default'
ARG GAIA_ENV=system_default
ENV GAIA_ENV=$GAIA_ENV

# Set CONFIG_URL as an environment variable, for custom configuration JSON file path
ARG CONFIG_URL=''
ENV CONFIG_URL=$CONFIG_URL

# Set number of Uvicorn workers, typically 1 is recommended in Docker
ARG UVICORN_WORKERS=1
ENV UVICORN_WORKERS=$UVICORN_WORKERS

# Set protocol (default 'http') as an environment variable
ARG PROTOCOL=http
ENV PROTOCOL=$PROTOCOL

# Set host for GAIA API, necessary unless default entrypoint is removed
ARG HOST=0.0.0.0
ENV HOST=$HOST

# Set port for GAIA API, necessary unless default entrypoint is removed
ARG PORT=8085
ENV PORT=$PORT

# Set domain name for GAIA API, necessary unless default entrypoint is removed
ARG DOMAIN_NAME=host.docker.internal
ENV DOMAIN_NAME=$DOMAIN_NAME

# Set cookie domain name for GAIA API, necessary unless default entrypoint is removed
ARG COOKIE_DOMAIN_NAME=''
ENV COOKIE_DOMAIN_NAME=$COOKIE_DOMAIN_NAME

# Set engine URL for GAIA API, necessary unless default entrypoint is removed
ARG ENGINE_URL=http://host.docker.internal:9200
ENV ENGINE_URL=$ENGINE_URL

# Set path to certificates, necessary only if mailer is enabled using custom SMTP
ARG CERTIFICATES_PATH=''
ENV CERTIFICATES_PATH=$CERTIFICATES_PATH

# Set AWS Elasticsearch credentials, only if using AWS service
ARG AWS_SERVICE=es
ENV AWS_SERVICE=$AWS_SERVICE

ARG AWS_REGION=us-east-1
ENV AWS_REGION=$AWS_REGION

# Set AWS Access Key and Session Token, required only if using access key and token
ARG AWS_ACCESS_KEY_ID=default-key
ENV AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID

ARG AWS_SECRET_ACCESS_KEY=default-secret
ENV AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY

ARG AWS_SESSION_TOKEN=default-token
ENV AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN

# Set to allow empty queries on search
ARG ALLOW_EMPTY_QUERY=false
ENV ALLOW_EMPTY_QUERY=$ALLOW_EMPTY_QUERY

# JWKS (JSON Web Key Set) URL is a location where public keys used for verifying JSON Web Tokens (JWTs) can be retrieved
ARG DELEGATE_JWKS_URL=''
ENV DELEGATE_JWKS_URL=$DELEGATE_JWKS_URL

# The LDAP server URL. It specifies the network address and protocol for connecting to the LDAP server.
ARG LDAP_URL=''
ENV LDAP_URL=$LDAP_URL

# Contains the password or credentials associated with the bindDN. It is used for authentication when establishing a
# connection with the LDAP server.
ARG LDAP_CREDENTIALS=''
ENV LDAP_CREDENTIALS=$LDAP_CREDENTIALS

# Contains bindDN of the user to connect with ldap and check the users.
ARG LDAP_BIND_DN=''
ENV LDAP_BIND_DN=$LDAP_BIND_DN

# Contains the searchBase to where in the LDAP look for the users.
ARG LDAP_SEARCH_BASE=''
ENV LDAP_SEARCH_BASE=$LDAP_SEARCH_BASE

# Unique identifier assigned to the client application by the IdP. It identifies the client application during
# authentication and authorization requests.
ARG OIDC_CLIENT_ID=''
ENV OIDC_CLIENT_ID=$OIDC_CLIENT_ID

# URI to the OpenID Connect configuration values from the provider's Well-Known Configuration Endpoint
ARG OIDC_OPENID_CONFIG_URI=''
ENV OIDC_OPENID_CONFIG_URI=$OIDC_OPENID_CONFIG_URI

# The secret used to sign and decrypt the JWT. Does not apply with Delegated
ARG AUTH_SECRET=52ecfd60e01b800355a8ce59780f9243b4662c3a236394ee
ENV AUTH_SECRET=$AUTH_SECRET

# AI Assistant #############################################################

# Name of the index used by the AI Assistant
ARG ASSISTANT_INDEX=''
ENV ASSISTANT_INDEX=$ASSISTANT_INDEX

# OpenAI API key for the chat, issued by your service provider
ARG OPENAI_API_KEY=''
ENV OPENAI_API_KEY=$OPENAI_API_KEY

# Base URL of your service provider for the OpenAI chat
ARG OPENAI_ENDPOINT=''
ENV OPENAI_ENDPOINT=$OPENAI_ENDPOINT

# API version of the OpenAI chat
ARG OPENAI_API_VERSION=''
ENV OPENAI_API_VERSION=$OPENAI_API_VERSION

# Name of the model to be used
ARG OPENAI_MODEL=''
ENV OPENAI_MODEL=$OPENAI_MODEL

# List of functions that trigger an exit from the loop pipeline
ARG ASSISTANT_EXIT_FUNCTIONS=''
ENV ASSISTANT_EXIT_FUNCTIONS=$ASSISTANT_EXIT_FUNCTIONS

# Switch to root user for further operations
USER root:root

# Update all OS packages to the latest version
RUN apt-get update && apt-get -y upgrade && rm -rf /var/lib/apt/lists/*

# Copy the current directory into /gaia_api in the container and set it as the working directory
WORKDIR /gaia_api
COPY . .

# Upgrade pip and install dependencies without using cache
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -e .$INSTALL_DEPENDENCIES && \
    pip install --no-cache-dir $GAIA_CORE_LOCATION && \
    # PYQPL from the local lib folder, check the VERSION before installing!
    pip install --no-cache-dir $PYQPL_LOCATION



# Expose the specified port for the GAIA API
EXPOSE $PORT

# Default startup script command to run Uvicorn with specified configurations
ENTRYPOINT ["/bin/bash", "/gaia_api/startup.sh"]

Detailed Breakdown

  1. Python Base Image:

    • ARG PYTHON_VERSION=3.11.9: Defines a variable PYTHON_VERSION to specify the Python version. The default is 3.11.9.
    • FROM python:$PYTHON_VERSION-slim AS GaiaAPI: Uses the slim variant of the specified Python version as the base image and names the build stage GaiaAPI.
  2. Build Arguments for Dependencies and Configuration:

    • Various ARG instructions define default values for build-time variables like INSTALL_DEPENDENCIES, PYQPL_LOCATION, and environment configurations (GAIA_ENV, CONFIG_URL, etc.). These arguments allow for customization during the Docker image build process.
  3. Environment Variables:

    • ENV instructions set environment variables based on the ARG values. These variables configure the application runtime environment, including API settings (host, port, protocol), AWS credentials, and more.
  4. User Context:

    • USER root:root: Sets the user context to root for subsequent commands.
  5. Application Setup:

    • WORKDIR /gaia_api: Sets the working directory inside the container.
    • COPY . .: Copies the current directory's contents into /gaia_api in the container.
  6. Dependency Installation:

    • Installs and upgrades pip.
    • Installs the application dependencies and the PYQPL library without using the cache, ensuring a fresh install.
  7. Port Exposition:

    • EXPOSE $PORT: Exposes the port defined by the PORT environment variable.
  8. Default Entrypoint:

    • The ENTRYPOINT instruction sets the default command to run the startup.sh script.


Startup Script


GAIA Startup Script
#!/bin/bash

# Maximum number of times the server will be (re)started
tries=3

# Startup/Restart loop
while [ "$tries" -gt 0 ]
do
	# Start the Python server
	python -m uvicorn app.webapp:app --host $HOST --port $PORT --workers $UVICORN_WORKERS --no-server-header
	# Capture the exit code of the Python process immediately
	exit_status=$?

	# Optional variants (uncomment as needed):
	# Behind a proxy like Nginx or Traefik, run Uvicorn with proxy headers:
	# python -m uvicorn app.webapp:app --proxy-headers --host $HOST --port $PORT --workers $UVICORN_WORKERS --no-server-header

	# To run Uvicorn with SSL, ensure the certificate and key paths are correct:
	# python -m uvicorn app.webapp:app --host $HOST --port $PORT --ssl-keyfile /path/in/container/private.key --ssl-certfile /path/in/container/certificate.crt

	# Exit code 137 is most likely a planned restart (e.g. triggered by the
	# shutdown endpoint); any other failure consumes one retry
	if [ "${exit_status}" -ne 137 ]
	then
		tries=$((tries-1))
	fi

	# A clean exit (e.g. via Keyboard Interrupt) ends the loop
	if [ "${exit_status}" -eq 0 ]
	then
		tries=0
	fi

	# Wait 3 seconds before going back to the restart loop
	sleep 3
done
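To see how the retry accounting behaves, here is a self-contained sketch of the script's exit-code logic, with a stub function standing in for the Uvicorn process (the stub and its exit codes are invented for illustration):

```shell
#!/bin/bash
# Sketch of the restart loop's exit-code accounting.
# fake_server stands in for the Uvicorn process.
fake_server() { return "$1"; }

tries=3
attempts=0
while [ "$tries" -gt 0 ]; do
    attempts=$((attempts + 1))
    fake_server 1              # simulate a crash: non-zero exit, not 137
    exit_status=$?
    if [ "$exit_status" -ne 137 ]; then
        tries=$((tries - 1))   # an unplanned failure consumes one retry
    fi
    if [ "$exit_status" -eq 0 ]; then
        tries=0                # a clean exit ends the loop
    fi
done
echo "attempts=$attempts"      # three crashes exhaust the three retries
```

If the stub returned 137 instead, `tries` would never be decremented and the loop would restart the server indefinitely, which is the intended behavior for planned restarts.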


Detailed Startup Breakdown


  1. The tries variable holds the maximum number of server restarts; the startup loop exits once this limit is reached.
  2. The loop is the startup/restart loop; it starts a new Python server using whichever start command you need.
    1. There are 3 different start commands, depending on whether SSL or proxy headers are needed.
  3. Then we capture the exit code of the Python process to determine whether the server was shut down on purpose (for example, a planned restart from the shutdown endpoint after a successful import) or stopped manually (via Keyboard Interrupt); depending on this code, the loop either continues and spawns a new Python server instance or exits.

Behind a TLS Termination Proxy

If you are running your container behind a TLS Termination Proxy (load balancer) like Nginx or Traefik, add the --proxy-headers option. This tells Uvicorn to trust the headers sent by that proxy, indicating that the application is running behind HTTPS, etc.

CMD python -m uvicorn app.webapp:app --proxy-headers --host $HOST --port $PORT --workers 1 --no-server-header  

Changing the Base Image

To change the base image to another Python base image, follow these steps:

  1. Identify the Desired Python Image:

    • Choose a different Python Docker image variant (e.g., buster, alpine) or version as required.
  2. Modify the Dockerfile:

    • Change the ARG PYTHON_VERSION value to the desired Python version.
    • Update the FROM instruction with the new image. For example, if you want to use Python 3.9 on Alpine, change it to:

       ARG PYTHON_VERSION=3.9
       FROM python:$PYTHON_VERSION-alpine AS GaiaAPI
  3. Rebuild the Docker Image:

    • Build the Docker image again with the updated Dockerfile. This will use the new base image for the application.

      docker build -t my-gaia-api .


Building the Dockerfile

Building a Docker image from this Dockerfile involves using the docker build command. You can customize the build process by specifying various arguments, allowing for different configurations of the resulting image.

Basic Build Instructions

  1. Bare Minimum Build: To build the Docker image with the default settings (as specified in the Dockerfile), navigate to the directory containing the Dockerfile and run:

    docker build -t my-gaia-api .
    • -t my-gaia-api assigns the tag my-gaia-api to the built image.
    • . specifies the current directory as the build context.

Building with Custom Arguments

  1. Available Arguments: The Dockerfile includes several build-time arguments (ARG) that allow for customizing the build. Here is a list of the available arguments and their default values:

    • PYTHON_VERSION: The version of the Python image. Default is 3.11.9.
    • INSTALL_DEPENDENCIES: Options for additional dependencies. Default is an empty string.
    • PYQPL_LOCATION: Location of the PYQPL library. Default is lib/pyqpl-1.1.4-py3-none-any.whl.
    • GAIA_ENV: Environment setting for GAIA. Default is system_default.
    • CONFIG_URL: URL for custom configuration JSON file. Default is an empty string.
    • UVICORN_WORKERS: Number of Uvicorn workers. Default is 1.
    • PROTOCOL, HOST, PORT, DOMAIN_NAME, COOKIE_DOMAIN_NAME, ENGINE_URL, CERTIFICATES_PATH, AWS_SERVICE, AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN: Various settings for the GAIA API.
  2. Example Builds with Custom Arguments:

    • To build with a specific Python version and a set of dependencies:

      docker build --build-arg PYTHON_VERSION=3.9 --build-arg INSTALL_DEPENDENCIES="[ldap]" -t my-gaia-api .
    • To build with a custom configuration URL and a specific number of Uvicorn workers:

      docker build --build-arg CONFIG_URL='http://myconfig.com/config.json' --build-arg UVICORN_WORKERS=2 -t my-gaia-api .
    • To build with AWS-specific configurations:

      docker build --build-arg AWS_ACCESS_KEY_ID=my-access-key --build-arg AWS_SECRET_ACCESS_KEY=my-secret-key -t my-gaia-api .
  • Each --build-arg flag allows you to override the default argument value specified in the Dockerfile.
  • Make sure to replace the example values (like my-access-key, http://myconfig.com/config.json) with actual values relevant to your setup.
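The individual flags above can also be combined in a single build. A sketch, with placeholder values (the version, extras, and engine URL here are examples only):

```shell
# Hypothetical combined build: pin the Python version, enable the ldap and
# genai extras, and override the engine URL in one command.
# Quoting the extras value prevents the shell from treating [ldap,genai]
# as a glob pattern.
docker build \
  --build-arg PYTHON_VERSION=3.11.9 \
  --build-arg INSTALL_DEPENDENCIES="[ldap,genai]" \
  --build-arg ENGINE_URL=http://host.docker.internal:9200 \
  -t my-gaia-api .
```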

Running a Container from the Built Image

After building the Docker image from the Dockerfile, you can run a container using the docker run command. This section provides instructions on how to do so, including how to utilize various environment variables for customizing the runtime behavior of your container.

Start the Docker Container

  • Run a container based on your image:
     docker run -d --name my-gaia-api-container -p 8085:8085 my-gaia-api

Basic Run Instructions

  1. Bare Minimum Run: To run a container with the default settings, use the following command:

     docker run -d -p 8085:8085 --name my-gaia-api-container my-gaia-api
    • -d runs the container in detached mode (in the background).
    • -p 8085:8085 maps port 8085 of the container to port 8085 on the host. Adjust the port numbers as needed based on the PORT environment variable.
    • --name my-gaia-api-container assigns a name to the container; the final argument, my-gaia-api, is the tag of the image you built.

Running with Custom Environment Variables

  1. Available Environment Variables: The Dockerfile defines several environment variables (ENV) that you can override at runtime. Here is a list of the available environment variables:

    • GAIA_ENV, CONFIG_URL, UVICORN_WORKERS, PROTOCOL, HOST, PORT, DOMAIN_NAME, COOKIE_DOMAIN_NAME, ENGINE_URL, CERTIFICATES_PATH, AWS_SERVICE, AWS_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN: Various settings for the GAIA API.
  2. Example Runs with Custom Environment Variables:

    • To run with a specific environment and a custom port:

      docker run -d -p 5000:5000 -e GAIA_ENV=production -e PORT=5000 --name my-gaia-api-container my-gaia-api
    • To run with a custom configuration URL:

      docker run -d -p 8085:8085 -e CONFIG_URL='http://myconfig.com/config.json' --name my-gaia-api-container my-gaia-api
    • To run with specific AWS credentials:

      docker run -d -p 8085:8085 -e AWS_ACCESS_KEY_ID=my-access-key -e AWS_SECRET_ACCESS_KEY=my-secret-key --name my-gaia-api-container my-gaia-api
    • To run with a custom engine URL, use the ENGINE_URL variable.

      docker run -d -p 8085:8085 -e ENGINE_URL=https://myengine.example.com --name my-gaia-api-container my-gaia-api


  • The -e flag is used to set environment variables in the container. These variables can override the defaults set in the Dockerfile.
  • Ensure the port mappings (-p) align with the PORT environment variable.
  • Replace example values with actual values relevant to your setup.
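Putting the pieces together, a typical run-and-verify sequence might look like the following sketch (the container name, image tag, and override values are examples; the /es/api path is the API root used throughout this guide):

```shell
# Start the container with a couple of overridden settings
docker run -d -p 8085:8085 \
  -e GAIA_ENV=production \
  -e ENGINE_URL=http://host.docker.internal:9200 \
  --name my-gaia-api-container my-gaia-api

# Confirm the container is up and inspect its logs
docker ps --filter name=my-gaia-api-container
docker logs my-gaia-api-container

# Hit the API root to verify it responds
curl http://localhost:8085/es/api
```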

Check it

You should be able to check it in your Docker container's URL, for example: http://localhost:8085/es/api (or equivalent, using your Docker host).

You will see something like:

GAIA API Running

Interactive API docs

Now you can go to http://localhost:8085/es/docs (or equivalent, using your Docker host).

You will see the automatic interactive API documentation (provided by Swagger UI):


Alternative API docs

Now you can go to http://localhost:8085/es/redoc (or equivalent, using your Docker host).

You will see the automatic interactive API documentation (provided by ReDoc):

Environment Variables Table

Below is a table listing all the environment variables used in the Dockerfile, along with their default values and descriptions.
| Name | Default | Description |
| --- | --- | --- |
| GAIA_ENV | system_default | Sets the environment for GAIA. Used to specify different settings for production, staging, etc. |
| CONFIG_URL | '' (empty) | URL for a custom configuration JSON file. |
| UVICORN_WORKERS | 1 | Number of Uvicorn workers to use. Typically, 1 is recommended in Docker. |
| PROTOCOL | http | Protocol used by the GAIA API (e.g., http or https). |
| HOST | 0.0.0.0 | Host for the GAIA API. Set to 0.0.0.0 to allow connections from any IP. |
| PORT | 8085 | Port for the GAIA API. |
| DOMAIN_NAME | host.docker.internal | Domain name for the GAIA API. Typically set for external access. |
| COOKIE_DOMAIN_NAME | '' (empty) | Domain name for setting cookies in the GAIA API. |
| ENGINE_URL | http://host.docker.internal:9200 | URL for the GAIA API's backend engine. |
| CERTIFICATES_PATH | '' (empty) | Path to SSL certificates, necessary if SSL is enabled. (Currently used for the mailer option.) |
| AWS_SERVICE | es | AWS service name, used when integrating with AWS services (e.g., Elasticsearch Service). |
| AWS_REGION | us-east-1 | AWS region for the service. |
| AWS_ACCESS_KEY_ID | default-key | AWS Access Key ID, required for AWS services authentication. |
| AWS_SECRET_ACCESS_KEY | default-secret | AWS Secret Access Key, required for AWS services authentication. |
| AWS_SESSION_TOKEN | default-token | AWS Session Token, used for temporary access to AWS services. |
| ALLOW_EMPTY_QUERY | false | Allow the API to accept and search for empty queries (" ", "*", "*.*"). |
| DELEGATE_JWKS_URL | '' (empty) | JWKS (JSON Web Key Set) URL, a location where public keys used for verifying JSON Web Tokens (JWTs) can be retrieved. |
| LDAP_URL | '' (empty) | The LDAP server URL. It specifies the network address and protocol for connecting to the LDAP server. |
| LDAP_CREDENTIALS | '' (empty) | Contains the password or credentials associated with the bindDN. It is used for authentication when establishing a connection with the LDAP server. |
| OIDC_CLIENT_ID | '' (empty) | Unique identifier assigned to the client application by the IdP. It identifies the client application during authentication and authorization requests. |
| OIDC_OPENID_CONFIG_URI | '' (empty) | URI to the OpenID Connect configuration values from the provider's Well-Known Configuration Endpoint. |
| AUTH_SECRET | 52ecfd60e01b800355a8ce59780f9243b4662c3a236394ee | The secret used to sign and decrypt the JWT. Does not apply with Delegated authentication. |
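As a quick illustration of how these variables feed the server start command, here is a small shell sketch that assembles the Uvicorn invocation from the environment, falling back to the defaults listed above when a variable is unset (the `${VAR:-default}` fallback syntax is standard shell; the command mirrors the startup script):

```shell
#!/bin/bash
# Assemble the Uvicorn command line from the environment, using the
# defaults from the table when a variable is not set.
HOST="${HOST:-0.0.0.0}"
PORT="${PORT:-8085}"
UVICORN_WORKERS="${UVICORN_WORKERS:-1}"

cmd="python -m uvicorn app.webapp:app --host $HOST --port $PORT --workers $UVICORN_WORKERS --no-server-header"
echo "$cmd"
```

Overriding any of these with `-e` on `docker run` (or `--build-arg` at build time) changes the assembled command accordingly.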

Changing the Base Image of the Dockerfile

Customizing the base image in the Dockerfile allows you to use a different version or variant of Python, depending on your application's needs. This can be particularly useful for matching the Python environment to your existing development or production environments.

  1. Identify the Desired Python Image:

    • Determine which Python Docker image you want to use. This could be a different version of Python or a variant like alpine, buster, etc. The official Python Docker images are available on Docker Hub.
  2. Edit the Dockerfile (2 options):

    1. Just Changing the Python Version
      1. Open the Dockerfile in a text editor.
      2. Locate the line that starts with ARG PYTHON_VERSION=.... This line defines the default Python version.
      3. Modify the PYTHON_VERSION argument to the desired version. For example, to use Python 3.8, change it to ARG PYTHON_VERSION=3.8.
      4. Next, locate the FROM python:$PYTHON_VERSION-slim AS GaiaAPI line. This line specifies the base image.
      5. Change slim to another variant if desired. For example, to use the Alpine variant, change it to FROM python:$PYTHON_VERSION-alpine AS GaiaAPI.
    2. Change to a Different Base Image
      1. Open the Dockerfile in a text editor.
      2. Remove or comment out any existing ARG PYTHON_VERSION=... line, if present.
      3. Modify the FROM instruction to reference your custom base image. For example, if your custom image is named custom-client:latest, update the line to:

        FROM custom-client:latest AS GaiaAPI
  3. Rebuild the Docker Image:

    • After saving your changes to the Dockerfile, rebuild the image to reflect the new base image. Use the docker build command with the appropriate tag:

      docker build -t my-gaia-api .

Example: Switching to Python 3.10 Alpine

  1. Modify the Dockerfile:

    • Change the PYTHON_VERSION to 3.10.
    • Update the FROM line to use the Alpine variant:

       ARG PYTHON_VERSION=3.10
       FROM python:$PYTHON_VERSION-alpine AS GaiaAPI
  2. Rebuild the Image:

    docker build -t my-gaia-api-alpine .
  • When changing the base image, especially to a different variant like Alpine, be aware of any dependencies or environment changes that may affect your application.
  • Alpine-based images are smaller and more secure but may require additional configuration for some Python packages due to their minimal nature.
  • Always test your application thoroughly after changing the base image to ensure compatibility and proper functioning.

Additional Considerations

HTTPS

If we focus just on the container image for a GAIA API application (and later the running container), HTTPS normally would be handled externally by another tool.

Alternatively, HTTPS could be handled by a cloud provider as one of their services (while still running the application in a container).


Running on Startup and Restarts

There is normally another tool in charge of starting and running your container.

It could be Docker directly, Docker Compose, Kubernetes, a cloud service, etc.

In most (or all) cases, there's a simple option to enable running the container on startup and enabling restarts on failures. For example, in Docker, it's the command line option --restart.

Without using containers, making applications run on startup and with restarts can be cumbersome and difficult. But when working with containers, in most cases that functionality is included by default.
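With plain Docker, for example, a restart policy can be set when starting the container (container and image names here are illustrative):

```shell
# Restart the container automatically on failure and on daemon startup,
# unless it was explicitly stopped
docker run -d --restart unless-stopped -p 8085:8085 \
  --name my-gaia-api-container my-gaia-api
```

Docker Compose and Kubernetes expose equivalent settings (`restart:` in a Compose file, `restartPolicy` in a pod spec).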


Replication - Number of Processes

If you have a cluster of machines with Kubernetes, Docker Swarm Mode, Nomad, or another similar complex system to manage distributed containers on multiple machines, then you will probably want to handle replication at the cluster level instead of using a process manager (like Gunicorn with workers) in each container.

One of those distributed container management systems like Kubernetes normally has some integrated way of handling replication of containers while still supporting load balancing for the incoming requests. All at the cluster level.

In those cases, you would probably want to build a Docker image from scratch as explained above, installing your dependencies, and running a single Uvicorn process instead of running something like Gunicorn with Uvicorn workers.


Load Balancer

When using containers, you would normally have some component listening on the main port. It could possibly be another container that is also a TLS Termination Proxy to handle HTTPS or some similar tool.

As this component would take the load of requests and distribute that among the workers in a (hopefully) balanced way, it is also commonly called a Load Balancer.

The same TLS Termination Proxy component used for HTTPS would probably also be a Load Balancer.

And when working with containers, the same system you use to start and manage them would already have internal tools to transmit the network communication (e.g. HTTP requests) from that load balancer (that could also be a TLS Termination Proxy) to the container(s) with your app.


One Load Balancer - Multiple Worker Containers

When working with Kubernetes or similar distributed container management systems, using their internal networking mechanisms would allow the single load balancer that is listening on the main port to transmit communication (requests) to possibly multiple containers running your app.

Each of these containers running your app would normally have just one process (e.g. a Uvicorn process running your GAIA API application). They would all be identical containers, running the same thing, but each with its own process, memory, etc. That way you would take advantage of parallelization in different cores of the CPU, or even in different machines.

And the distributed container system with the load balancer would distribute the requests to each one of the containers with your app in turns. So, each request could be handled by one of the multiple replicated containers running your app.

And normally this load balancer would be able to handle requests that go to other apps in your cluster (e.g. to a different domain, or under a different URL path prefix), and would transmit that communication to the right containers for that other application running in your cluster.


One Process per Container

In this type of scenario, you probably would want to have a single (Uvicorn) process per container, as you would already be handling replication at the cluster level.

So, in this case, you would not want to have a process manager like Gunicorn with Uvicorn workers, or Uvicorn using its own Uvicorn workers. You would want to have just a single Uvicorn process per container (but probably multiple containers).

Having another process manager inside the container (as would be with Gunicorn or Uvicorn managing Uvicorn workers) would only add unnecessary complexity that you are most probably already taking care of with your cluster system.


Containers with Multiple Processes and Special Cases

Of course, there are special cases where you could want to have a container with a Gunicorn process manager starting several Uvicorn worker processes inside.

In those cases, you can use the official Docker image that includes Gunicorn as a process manager running multiple Uvicorn worker processes, and some default settings to adjust the number of workers based on the current CPU cores automatically.

Here are some examples of when that could make sense:

A Simple App

You could want a process manager in the container if your application is simple enough that you don't need (at least not yet) to fine-tune the number of processes too much, and you can just use an automated default (with the official Docker image), and you are running it on a single server, not a cluster.

Docker Compose

You could be deploying to a single server (not a cluster) with Docker Compose, so you wouldn't have an easy way to manage replication of containers (with Docker Compose) while preserving the shared network and load balancing.

Then you could want to have a single container with a process manager starting several worker processes inside.

Prometheus and Other Reasons

You could also have other reasons that would make it easier to have a single container with multiple processes instead of having multiple containers with a single process in each of them.

For example (depending on your setup) you could have some tool like a Prometheus exporter in the same container that should have access to each of the requests that come.

In this case, if you had multiple containers, by default, when Prometheus came to read the metrics, it would get the ones for a single container each time (for the container that handled that particular request), instead of getting the accumulated metrics for all the replicated containers.

Then, in that case, it could be simpler to have one container with multiple processes, and a local tool (e.g. a Prometheus exporter) on the same container collecting Prometheus metrics for all the internal processes and exposing those metrics on that single container.


Memory

If you run a single process per container you will have a more or less well-defined, stable, and limited amount of memory consumed by each of those containers (more than one if they are replicated).

And then you can set those same memory limits and requirements in your configurations for your container management system (for example in Kubernetes). That way it will be able to replicate the containers in the available machines taking into account the amount of memory needed by them, and the amount available in the machines in the cluster.

If your application is simple, this will probably not be a problem, and you might not need to specify hard memory limits. But if you are using a lot of memory (for example with machine learning models), you should check how much memory you are consuming and adjust the number of containers that run in each machine (and maybe add more machines to your cluster).

If you run multiple processes per container (for example with the official Docker image) you will have to make sure that the number of processes started doesn't consume more memory than what is available.
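With plain Docker, for example, a hard memory limit can be set per container (the 512 MB figure and the names are only an illustration; size the limit to your own measurements):

```shell
# Cap the container at 512 MB of RAM; if the processes inside exceed
# this limit, the kernel's OOM killer will terminate them
docker run -d --memory=512m -p 8085:8085 \
  --name my-gaia-api-container my-gaia-api
```

In Kubernetes, the equivalent is the `resources.requests` and `resources.limits` memory fields on the container spec, which the scheduler also uses to place replicas on machines with enough free memory.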


Previous Steps Before Starting and Containers

If you are using containers (e.g. Docker, Kubernetes), then there are two main approaches you can use.

Multiple Containers

If you have multiple containers, probably each one running a single process (for example, in a Kubernetes cluster), then you would probably want to have a separate container, running a single process, that performs the work of the previous steps before the replicated worker containers run.

If you are using Kubernetes, this would probably be an Init Container.

If in your use case there's no problem in running those previous steps multiple times in parallel (for example if you are not running database migrations, but just checking if the database is ready yet), then you could also just put them in each container right before starting the main process.

Single Container

If you have a simple setup, with a single container that then starts multiple worker processes (or also just one process), then you could run those previous steps in the same container, right before starting the process with the app. The official Docker image supports this internally.