
When deploying FastAPI applications, a common approach is to build a Linux container image, normally using Docker. You can then deploy that container image in one of a few possible ways.

Using Linux containers has several advantages, including security, replicability, and simplicity.

In a hurry and already know this stuff? Jump to the Dockerfile below.

Dockerfile Preview
# Base image (Alpine Linux)
FROM python:3.11.2-alpine3.17 AS searchapi

# Set the working directory inside the image
WORKDIR /search_api

# Copy the project source files
COPY . .

# Install the Python dependencies declared by the project
RUN pip3 install --no-cache-dir -e .

# Install the local pyqpl library
RUN pip3 install lib/pyqpl-1.0.5-py3-none-any.whl

# Port exposed by the image (can be changed; whatever you set here is
# the port that will be open inside the container)
EXPOSE 8085

# Start the Uvicorn server
CMD ["uvicorn", "app.webapp:app", "--host", "0.0.0.0", "--port", "8085"]

Build a Docker Image for Search API

This section will show you how to build a Docker image for Search API from scratch, based on the official Python image.

This is what you would want to do in most cases, for example:

  • Using Kubernetes or similar tools
  • When running on a Raspberry Pi
  • Using a cloud service that would run a container image for you, etc.

Package Requirements

Install the Search API requirements the same way as specified in Install Python Dependencies.

But in essence, you need to run:

pip install -e .

and then:

pip install lib/pyqpl-VERSION_IN_PROJECT-py3-none-any.whl

Dockerfile

Now, in the same project directory, create a file named Dockerfile with:

# Base image (Alpine Linux)
FROM python:3.11.2-alpine3.17 AS searchapi

# Set the working directory inside the image
WORKDIR /search_api

# Copy the project source files
COPY . .

# Install the Python dependencies declared by the project
RUN pip3 install --no-cache-dir -e .

# Install the local pyqpl library
RUN pip3 install lib/pyqpl-1.0.5-py3-none-any.whl

# Port exposed by the image (can be changed; whatever you set here is
# the port that will be open inside the container)
EXPOSE 8085

# Start the Uvicorn server
CMD ["uvicorn", "app.webapp:app", "--host", "0.0.0.0", "--port", "8085"]

Behind a TLS Termination Proxy

If you are running your container behind a TLS Termination Proxy (load balancer) like Nginx or Traefik, add the --proxy-headers option. This tells Uvicorn to trust the headers sent by that proxy, indicating that the application is running behind HTTPS, etc.

CMD ["uvicorn", "app.webapp:app", "--proxy-headers", "--host", "0.0.0.0", "--port", "8085"]

Build the Docker Image

Now that all the files are in place, let's build the container image.

  • Go to the project directory (where your Dockerfile is, containing your app directory).
  • Build your FastAPI image:
$ docker build -t myimage .

Notice the . at the end; it's equivalent to ./ and tells Docker which directory to use to build the container image.

In this case, it's the same current directory (.).

Start the Docker Container

  • Run a container based on your image:
$ docker run -d --name mycontainer -p 8085:8085 myimage

Check it

You should be able to check it in your Docker container's URL, for example: http://localhost:8085/es/api (or equivalent, using your Docker host).

You will see something like:

Search API Running
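
If you prefer the command line, you could verify the same endpoint with curl (a minimal check, assuming the port mapping from the docker run command above):

$ curl http://localhost:8085/es/api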

Interactive API docs

Now you can go to http://localhost:8085/es/docs (or equivalent, using your Docker host).

You will see the automatic interactive API documentation (provided by Swagger UI).

Alternative API docs

Now you can go to http://localhost:8085/es/redoc (or equivalent, using your Docker host).

You will see the automatic interactive API documentation (provided by ReDoc).

HTTPS

If we focus just on the container image for a Search API application (and later the running container), HTTPS normally would be handled externally by another tool.

Alternatively, HTTPS could be handled by a cloud provider as one of their services (while still running the application in a container).

Running on Startup and Restarts

There is normally another tool in charge of starting and running your container.

It could be Docker directly, Docker Compose, Kubernetes, a cloud service, etc.

In most (or all) cases, there's a simple option to enable running the container on startup and enabling restarts on failures. For example, in Docker, it's the command line option --restart.

Without using containers, making applications run on startup and with restarts can be cumbersome and difficult. But when working with containers, in most cases that functionality is included by default.
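
For example, with plain Docker, enabling restarts could look like this (a minimal sketch, reusing the image and container name from above; the unless-stopped policy restarts the container on failures and on daemon startup unless it was explicitly stopped):

$ docker run -d --restart unless-stopped --name mycontainer -p 8085:8085 myimage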

Replication - Number of Processes

If you have a cluster of machines with Kubernetes, Docker Swarm Mode, Nomad, or another similar complex system to manage distributed containers on multiple machines, then you will probably want to handle replication at the cluster level instead of using a process manager (like Gunicorn with workers) in each container.

One of those distributed container management systems like Kubernetes normally has some integrated way of handling replication of containers while still supporting load balancing for the incoming requests. All at the cluster level.

In those cases, you would probably want to build a Docker image from scratch as explained above, installing your dependencies, and running a single Uvicorn process instead of running something like Gunicorn with Uvicorn workers.
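
For example, in Kubernetes, replication is normally just a replica count on a Deployment. A minimal sketch, assuming the app runs as a Deployment named search-api (a hypothetical name):

$ kubectl scale deployment search-api --replicas=3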


Load Balancer

When using containers, you would normally have some component listening on the main port. It could possibly be another container that is also a TLS Termination Proxy to handle HTTPS or some similar tool.

As this component would take the load of requests and distribute that among the workers in a (hopefully) balanced way, it is also commonly called a Load Balancer.

The same TLS Termination Proxy component used for HTTPS would probably also be a Load Balancer.

And when working with containers, the same system you use to start and manage them would already have internal tools to transmit the network communication (e.g. HTTP requests) from that load balancer (that could also be a TLS Termination Proxy) to the container(s) with your app.


One Load Balancer - Multiple Worker Containers

When working with Kubernetes or similar distributed container management systems, using their internal networking mechanisms would allow the single load balancer that is listening on the main port to transmit communication (requests) to possibly multiple containers running your app.

Each of these containers running your app would normally have just one process (e.g. a Uvicorn process running your Search API application). They would all be identical containers, running the same thing, but each with its own process, memory, etc. That way you would take advantage of parallelization in different cores of the CPU, or even in different machines.

And the distributed container system with the load balancer would distribute the requests to each one of the containers with your app in turns. So, each request could be handled by one of the multiple replicated containers running your app.

And normally this load balancer would be able to handle requests that go to other apps in your cluster (e.g. to a different domain, or under a different URL path prefix), and would transmit that communication to the right containers for that other application running in your cluster.


One Process per Container

In this type of scenario, you probably would want to have a single (Uvicorn) process per container, as you would already be handling replication at the cluster level.

So, in this case, you would not want to have a process manager like Gunicorn with Uvicorn workers, or Uvicorn using its own Uvicorn workers. You would want to have just a single Uvicorn process per container (but probably multiple containers).

Having another process manager inside the container (as would be the case with Gunicorn or Uvicorn managing Uvicorn workers) would only add unnecessary complexity that you are most probably already handling with your cluster system.


Containers with Multiple Processes and Special Cases

Of course, there are special cases where you could want to have a container with a Gunicorn process manager starting several Uvicorn worker processes inside.

In those cases, you can use the official Docker image that includes Gunicorn as a process manager running multiple Uvicorn worker processes, and some default settings to adjust the number of workers based on the current CPU cores automatically. I'll tell you more about it below in Official Docker Image with Gunicorn - Uvicorn.

Here are some examples of when that could make sense:

A Simple App

You could want a process manager in the container if your application is simple enough that you don't need (at least not yet) to fine-tune the number of processes too much, and you can just use an automated default (with the official Docker image), and you are running it on a single server, not a cluster.

Docker Compose

You could be deploying to a single server (not a cluster) with Docker Compose, so you wouldn't have an easy way to manage replication of containers (with Docker Compose) while preserving the shared network and load balancing.

Then you could want to have a single container with a process manager starting several worker processes inside.

Prometheus and Other Reasons

You could also have other reasons that would make it easier to have a single container with multiple processes instead of having multiple containers with a single process in each of them.

For example (depending on your setup), you could have some tool like a Prometheus exporter in the same container that should have access to each of the requests that come in.

In this case, if you had multiple containers, by default, when Prometheus came to read the metrics, it would get the ones for a single container each time (for the container that handled that particular request), instead of getting the accumulated metrics for all the replicated containers.

Then, in that case, it could be simpler to have one container with multiple processes, and a local tool (e.g. a Prometheus exporter) on the same container collecting Prometheus metrics for all the internal processes and exposing those metrics on that single container.

Memory

If you run a single process per container you will have a more or less well-defined, stable, and limited amount of memory consumed by each of those containers (more than one if they are replicated).

And then you can set those same memory limits and requirements in your configurations for your container management system (for example in Kubernetes). That way it will be able to replicate the containers in the available machines taking into account the amount of memory needed by them, and the amount available in the machines in the cluster.
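
For example, with plain Docker, you could cap a container's memory like this (a minimal sketch; the 512m figure is an arbitrary illustration, not a recommendation):

$ docker run -d --memory 512m --name mycontainer -p 8085:8085 myimage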

If your application is simple, this will probably not be a problem, and you might not need to specify hard memory limits. But if you are using a lot of memory (for example with machine learning models), you should check how much memory you are consuming and adjust the number of containers that run in each machine (and maybe add more machines to your cluster).

If you run multiple processes per container (for example with the official Docker image) you will have to make sure that the number of processes started doesn't consume more memory than what is available.

Previous Steps Before Starting and Containers

If you are using containers (e.g. Docker, Kubernetes), then there are two main approaches you can use.

Multiple Containers

If you have multiple containers, probably each one running a single process (for example, in a Kubernetes cluster), then you would probably want to have a separate container doing the work of the previous steps in a single container, running a single process, before running the replicated worker containers.

If you are using Kubernetes, this would probably be an Init Container.

If in your use case there's no problem in running those previous steps multiple times in parallel (for example if you are not running database migrations, but just checking if the database is ready yet), then you could also just put them in each container right before starting the main process.

Single Container

If you have a simple setup, with a single container that then starts multiple worker processes (or also just one process), then you could run those previous steps in the same container, right before starting the process with the app. The official Docker image supports this internally.
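
With the official image, this works through a prestart script: if a file /app/prestart.sh exists in the image, it is executed before the server starts. A minimal sketch (the migration command is a hypothetical placeholder):

#!/usr/bin/env sh
# /app/prestart.sh - executed by the official image before the server starts
echo "Running pre-start steps..."
# e.g. run database migrations here (hypothetical command):
# alembic upgrade head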

Official Docker Image with Gunicorn - Uvicorn

There is an official Docker image that includes Gunicorn running with Uvicorn workers, as detailed in a previous chapter: Server Workers - Gunicorn with Uvicorn.

This image would be useful mainly in the situations described above in: Containers with Multiple Processes and Special Cases.

There's a high chance that you don't need this base image or any other similar one, and would be better off by building the image from scratch as described above in: Build a Docker Image for Search API.

This image has an auto-tuning mechanism included to set the number of worker processes based on the CPU cores available.

It has sensible defaults, but you can still change and update all the configurations with environment variables or configuration files.

It also supports running previous steps before starting with a script.

To see all the configurations and options, go to the Docker image page: tiangolo/uvicorn-gunicorn-fastapi.

Number of Processes on the Official Docker Image

The number of processes on this image is computed automatically from the CPU cores available.

This means that it will try to squeeze as much performance from the CPU as possible.

You can also adjust it with the configurations using environment variables, etc.
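
For example, the image documents environment variables such as WORKERS_PER_CORE and MAX_WORKERS (see the image page linked below). A minimal sketch of overriding them at run time (the values are arbitrary illustrations):

$ docker run -d -e WORKERS_PER_CORE=0.5 -e MAX_WORKERS=8 -p 8085:8085 myimage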

But it also means that, since the number of processes depends on the CPU of the machine the container is running on, the amount of memory consumed will also depend on that.

So, if your application consumes a lot of memory (for example with machine learning models), and your server has a lot of CPU cores but little memory, then your container could end up trying to use more memory than what is available, and degrading performance a lot (or even crashing).

Create a Dockerfile

Here's how you would create a Dockerfile based on this image:

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.9

# Set the working directory inside the image
WORKDIR /search_api

# Copy the project source files
COPY . .

# Install the Python dependencies declared by the project
RUN pip3 install --no-cache-dir -e .

# Install the local pyqpl library
RUN pip3 install lib/pyqpl-1.0.5-py3-none-any.whl

# The base image's start script serves app.main:app on port 80 by default;
# point it at this project's application and port instead
ENV APP_MODULE="app.webapp:app"
ENV PORT=8085

# Port exposed by the image (can be changed; whatever you set here is
# the port that will be open inside the container)
EXPOSE 8085

# No CMD needed: the base image starts Gunicorn with Uvicorn workers

When to Use

You should probably not use this official base image (or any other similar one) if you are using Kubernetes (or others) and you are already setting replication at the cluster level, with multiple containers. In those cases, you are better off building an image from scratch as described above: Build a Docker Image for Search API.

This image would be useful mainly in the special cases described above in Containers with Multiple Processes and Special Cases. For example, if your application is simple enough that setting a default number of processes based on the CPU works well, you don't want to bother with manually configuring the replication at the cluster level, and you are not running more than one container with your app. Or if you are deploying with Docker Compose, running on a single server, etc.

Deploy the Container Image

After having a container (Docker) image, there are several ways to deploy it.

For example:

  • With Docker Compose in a single server
  • With a Kubernetes cluster
  • With a Docker Swarm Mode cluster
  • With another tool like Nomad
  • With a cloud service that takes your container image and deploys it
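
For example, a minimal sketch of the Kubernetes route using kubectl (assuming the image has been pushed to a registry the cluster can pull from; all names are illustrative):

$ kubectl create deployment search-api --image=registry.example.com/myimage:latest
$ kubectl expose deployment search-api --port=8085 --target-port=8085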