By default, FastAPI does not process requests strictly one at a time. It runs on an ASGI server such as Uvicorn or Hypercorn: endpoints declared with async def are interleaved on a single event loop, while plain def endpoints are executed in a thread pool, so a single server process already handles many requests concurrently. What one process cannot do is spread work across multiple CPU cores, which means CPU-bound handlers can still hold up other requests.
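A minimal sketch of both endpoint styles (the paths and function names are illustrative only, not part of any existing project):

```python
import asyncio
import time

from fastapi import FastAPI

app = FastAPI()

@app.get("/async-sleep")
async def async_sleep():
    # Runs on the event loop; while this await is pending, the loop is
    # free to serve other incoming requests.
    await asyncio.sleep(1)
    return {"style": "async"}

@app.get("/sync-sleep")
def sync_sleep():
    # Plain 'def' endpoints are executed in a thread pool, so this
    # blocking sleep does not stall the event loop either.
    time.sleep(1)
    return {"style": "sync"}
```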
If you want true parallelism across CPU cores, you can run multiple worker processes or deploy FastAPI behind a load balancer. For example, both Uvicorn and Hypercorn accept a --workers option that starts several server processes, and Gunicorn can supervise Uvicorn workers via its -k uvicorn.workers.UvicornWorker worker class.
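As a rough sketch, assuming your application lives at the hypothetical import string myapp.main:app, multiple workers can be started from the command line (uvicorn myapp.main:app --workers 4) or programmatically:

```python
import uvicorn

if __name__ == "__main__":
    # Start 4 worker processes so requests can run in parallel across
    # CPU cores. With workers > 1 the app must be given as an import
    # string; "myapp.main:app" is a placeholder for your own module.
    uvicorn.run("myapp.main:app", host="0.0.0.0", port=8000, workers=4)
```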
Alternatively, you could reach for another async framework such as Sanic or Quart, but FastAPI is itself async (it is built on Starlette), so switching is rarely necessary; for I/O-bound workloads the larger gains usually come from writing async def endpoints and using async clients and database drivers so handlers never block the event loop.
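For instance, a handler that waits on an upstream service can use an async HTTP client such as httpx so the event loop stays free; the endpoint path and URL below are placeholders:

```python
import httpx
from fastapi import FastAPI

app = FastAPI()

@app.get("/proxy")
async def proxy():
    async with httpx.AsyncClient() as client:
        # While this request waits on the network, the event loop can
        # keep serving other incoming requests.
        resp = await client.get("https://example.com/api/data")
    return {"status_code": resp.status_code}
```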
It’s worth noting that running API calls in parallel can improve performance, but it can also increase the complexity of your application and introduce new challenges such as race conditions and resource contention. Therefore, it’s important to carefully consider the trade-offs before implementing parallelism in your application.
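As one illustration of the kind of pitfall involved, a naive in-process counter is only safe if concurrent handlers coordinate their updates, and it breaks down entirely with multiple worker processes, since each process holds its own copy (an external store such as Redis or a database would be needed in that case); the names below are purely illustrative:

```python
import asyncio

from fastapi import FastAPI

app = FastAPI()
counter = 0
counter_lock = asyncio.Lock()

@app.post("/increment")
async def increment():
    global counter
    # Guard the read-modify-write so concurrent requests in this
    # process cannot interleave; this does NOT protect state shared
    # across separate worker processes.
    async with counter_lock:
        counter += 1
        return {"counter": counter}
```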