Sharing State with FastAPI

FastAPI state sharing using Redis - all in Python.

In this article, we will look at one of the central problems in distributed systems - sharing state. Then, we will implement a FastAPI application that shares state using Redis.

Example Problems

The following problems can be solved with state sharing mechanisms:

  • Rate limiting. By default, a rate limiting algorithm will only have access to local context, but a more production-ready solution will need to share context across all instances of your fleet. Imagine rate limiting a client to 1TPS - do you really want 1TPS PER process? Probably not.
  • Service Discovery. In cases where you cannot simply throw a load balancer in front of your fleet and call it a day, you will need to implement some type of service discovery. Perhaps you can use an off-the-shelf solution like ZooKeeper - but if you want to manage your own business logic, you will need to share state.
  • Resource Management. Maybe you have a worker fleet and a routing fleet. The routing fleet needs a shared awareness of which workers are busy in order to properly route requests.

These domain specific problems are interesting and will be covered in a future article - but for now we will work on developing a basic example that can then be extended to solve each particular problem.

Sharing State

Before You Start - Avoid the Problem Altogether

The point of this article isn't to be too opinionated about your design choices - but if it's possible to design your system without requiring shared state between processes and instances - try that first.

That said, here are the general strategies we will be looking at:

  • Event-based updates
  • Gossip protocols
  • Data stores
  • In-memory distributed caches

The Basic FastAPI App

Our basic app does something trivial - it fetches a value from memory and returns it to the user. In this case, we simply use the process ID as our "state".

Nothing magical here - and in a single process environment, there isn't much that could go wrong. After calling /set, subsequent calls to /get return the process ID.

However, this clearly doesn't scale - if we scale out our web server with multiple --workers or across several hosts, we will end up with different results from /get depending on which process/host the request lands on. Not a big deal for this trivial program, but imagine if the state we cared about was a customer's rate limits - ouch.

Let's take a look at some general approaches to solve this problem...

Sharing State Using a Database

We can of course use a 3rd party or locally hosted key-value store to persist and retrieve our values.

This will work fine. In many cases though, this is overkill. You may not want to persist your values indefinitely, and you may not care about strong ACID properties that - while great features of many database solutions - come with cost and performance penalties.

If the state changes infrequently, you may be tempted to just add caching with long TTLs to the instances - but this comes with a delay in state change propagation that you may not be able to tolerate.

The next solution addresses the consistency problem in a different way.

Sharing State Using Events

Imagine we know a priori that for every set there will be 10,000 get requests for our state. In that case we might be inclined to use an event based system. We still store our state in local memory, but we set up producers, consumers, and an event queue to push updates and synchronize our state across our fleet.

Of course, there will be some convergence delay for our state, but it's a design option that can be very effective if state changes very infrequently but still needs to have minimal propagation delay.

Sharing State Using Gossip

This is similar to the event based approach, except peer-to-peer gossip is used instead of a separate event queue.

This approach is simple in the sense that you don't need to introduce any additional infrastructure. You may have a tough time reasoning about your system's properties though - since it is not as intuitive what your convergence will be. Nevertheless, this solution is typically highly performant since there are no synchronous non-local state queries in the critical path of requests.

This approach has higher implementation complexity than the others.

Using an In-Memory Distributed Cache

Alright, this is the one we are going to be building - since in practice it tends to be a really effective solution that can suit most implementations.

Redis is a great choice for this - since you can more or less use a Python dict mental model when working with it. Redis offers only weak consistency guarantees, though (particularly across replicas) - so design your application accordingly (more on that later).

A Simple Redis State Sharing Application

Finally, some code to look at.

With a couple of small changes - we have swapped out our Python dictionary with a local Redis instance. Now, multiple application processes on our local machine will share state via the Redis instance.

Seems trivial - but it's a powerful tool that can be used to implement all of the functionality needed to run applications at virtually any scale.

Improve Performance

While highly performant in its own right, this approach still has some quick wins available. In particular, you may find that network calls to a remote Redis cluster are the bottleneck in your application.

  • Use caching - the easiest to implement, with the obvious consistency tradeoff.
  • Refresh state in a background thread - more complex than caching, but reduces high-percentile latency.
  • Use server-side Lua scripts to implement business logic - a powerful Redis feature that eliminates most network calls.

There are tradeoffs to these approaches, and often you will end up with your shared state being an approximation of the actual system state, which brings us to our next point...

Design for the approximate

The problem with a house of cards is that the design specification is extraordinarily specific. Any inaccuracies in the implementation (i.e. you assembling it) or spontaneous events during operation (i.e. once it's standing) are enough to ruin the system.

When we build software, especially software that is distributed or parallel, it is important for our functionality (the specification) to be resilient in situations where things don't go quite as expected.

In the context of this article, that means treating your shared state with some healthy "skepticism". Here are some quick tips on how to do that:

  • Handle internal failures and retry when appropriate. Maybe your eventually consistent shared state told you a resource existed, but when you went to query it - you got a ResourceNotFound exception - don't explode when that happens, calmly update the state and try the next resource.
  • Use timestamps and ignore stale state. Imagine hosts reserve a resource to do some work, then release it when complete. What happens if they fail before releasing? This can put us in some undesirable situations. An easy way to prevent this is to always maintain an "update timestamp" with your shared state, which you can use as a timeout mechanism when working with the data.
  • Add jitter to periodic tasks that work with the shared state. Or better yet, explicitly prevent them from being too synchronized. Ideally you don't want every node in your application performing a shared state update at exactly the same time.


Now that you know a general approach for sharing state, you can use it to implement more advanced features like service discovery and distributed throttling - topics we will cover in future explorations.

