API Development: Scale to Millions of Users With a Robust Backend

What Is a Backend API?

Your backend API is the combination of application servers, databases, and infrastructure that powers your client-side applications and your API consumers. A properly implemented backend API allows you to drink your IPA while your entire system works for you.

API Anatomy

A typical API is made up of the following layers:

API Gateway

When a client request comes in from a web browser, a mobile app, or a third party, the first place it goes is the API Gateway. The goal of the API Gateway is to route the request to a specific destination, which is a microservice most of the time. Depending on the implementation, the API Gateway can also handle user authentication and some request transformation before handing the request off to the microservice. Last but not least, the API Gateway provides the first level of security by routing requests from the public Internet to private resources. It is important to note that all resources behind the API Gateway, such as your microservices and databases, should remain private and never be exposed to the Internet. Only the API Gateway should be able to route a request from public to private territory.
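To make the routing role concrete, here is a minimal sketch of prefix-based routing in Python. The routing table, service names, and internal hostnames are hypothetical, and a real gateway product would add authentication, request transformation, and the actual forwarding on top of this.

```python
from typing import Optional

# Hypothetical routing table: public path prefix -> private upstream address.
# The service names and internal hostnames are made up for illustration.
ROUTES = {
    "/users": "http://users.internal:8080",
    "/orders": "http://orders.internal:8080",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the private URL a public request path should be forwarded to.

    The longest matching prefix wins. Unknown paths return None so the
    gateway can reject them instead of exposing anything by accident.
    """
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix] + path
    return None
```

With this table, a request for /users/42 resolves to the private users service, while a path such as /admin that matches no route is rejected.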

Service Layer

This is where the magic happens. Most of your business logic lives at this level. If you have a microservices architecture, different requests will go to different microservices. Depending on your use case, your service can do any number of things, such as making a database query, calling another microservice or a third-party API, caching some data for future use, writing to a queue, or performing some computation.

How to Build APIs Right


A properly architected system is at the core of a robust and scalable API. If your API is architected well, everything else comes together naturally. Your product will be able to support your level of traffic and you’ll have a lot of happy customers.

When designing your API, you’ll need to minimize the following:

  1. The number of single points of failure (in most cases, the database).
  2. The number of internal calls between services.
  3. The duration of each individual call.

The main thing to realize is that each call results in a chain of events, and while that work is happening the user is waiting for a response. Time and time again we’ve seen a seemingly quick API call take a long time or, sometimes, take the system down. Everyone on the team needs to have a clear idea of which calls are made for each operation and which resources are used during each call. Tools like New Relic or Datadog help you measure the timing of each individual call between services. By instrumenting your app and running load tests against it in your testing environment, you’ll be able to fix issues before the software gets pushed to production.
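As a rough illustration of what instrumenting a call means, the sketch below records the duration of each downstream call using only the Python standard library. The call names and the sleep-based stand-ins are hypothetical; a monitoring product would collect and ship these measurements for you.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Durations in seconds, grouped by a caller-chosen call name.
timings = defaultdict(list)


@contextmanager
def timed(name: str):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)


def handle_request():
    # Hypothetical operation made of two downstream calls.
    with timed("db.query"):
        time.sleep(0.01)  # stand-in for a database query
    with timed("payments.charge"):
        time.sleep(0.02)  # stand-in for a third-party API call
```

After a load test, the collected timings show which calls each operation makes and where the time goes.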


Once the system is architected for scale, supporting more or less traffic should be as simple as a configuration change. Here are the main things to consider:

Scaling Automatically

Most cloud providers offer auto-scaling based on rules you specify. In addition, we recommend looking into Serverless technology, which scales automatically as traffic increases or decreases. If your server provisioning is manual at the moment, it’s worth thinking about automating it. This way you can spend your engineering resources on developing new features or making the overall system more robust instead of manually provisioning servers.
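A scaling rule of this kind can be sketched as a simple proportional calculation, similar to the one documented for the Kubernetes Horizontal Pod Autoscaler. The function name, the load metric, and the minimum and maximum bounds below are assumptions for illustration, not any provider's actual API.

```python
import math


def desired_instances(current: int, observed_load: float, target_load: float,
                      min_instances: int = 1, max_instances: int = 20) -> int:
    """Scale the fleet so the average load per instance moves toward the target."""
    if current < 1 or target_load <= 0:
        raise ValueError("current must be >= 1 and target_load must be > 0")
    # Proportional rule: more load than the target means more instances.
    wanted = math.ceil(current * observed_load / target_load)
    # Clamp to the configured bounds to cap cost and keep a minimum capacity.
    return max(min_instances, min(max_instances, wanted))
```

For example, 4 instances running at 90% CPU against a 60% target give desired_instances(4, 90.0, 60.0) == 6, and the same rule scales the fleet back down when traffic drops.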

Minimize Service Chaining

Especially with microservices, it’s easy to fall into the trap of ‘nano services’. Making your services too granular can increase the number of inter-service calls. It’s a good idea to avoid having one service call another service, which calls another, and so on. The longer the call chain, the greater the chance that one of the steps fails, and the higher the overall latency. In addition, debugging a call that spans multiple services is much harder than debugging a single call to a single service.
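The cost of a long chain can be worked out with simple arithmetic. The sketch below assumes the calls run one after another and that failures are independent, which is a simplification; the availability and latency figures are made up for illustration.

```python
from math import prod


def chain_availability(availabilities):
    """Probability that every service in a sequential chain succeeds."""
    return prod(availabilities)


def chain_latency_ms(latencies_ms):
    """Total latency when each call waits for the next one in the chain."""
    return sum(latencies_ms)
```

Under these assumptions, five chained services that are each 99.9% available and take 40 ms give an end-to-end availability of about 99.5% and a latency of 200 ms, noticeably worse than any single service on its own.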

Alerting and Monitoring

Alerting and monitoring are your best friends. They should be part of the requirements for every feature. When your alerting and monitoring are properly set up, you sleep better at night. They let you be proactive and fix issues before they become problems. If something goes wrong, you’ll get notified and you’ll stay ahead of the game. This also increases the accountability of your team. Systems should be built so that they keep working without human intervention.
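As one small example of an alerting rule, the sketch below fires when the error rate over the most recent requests crosses a threshold. The class name, window size, and threshold are hypothetical defaults; a monitoring service would normally evaluate a rule like this for you.

```python
from collections import deque


class ErrorRateAlert:
    """Fire when the error rate over the last `window` requests exceeds a threshold."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        self.results = deque(maxlen=window)  # True means the request failed
        self.threshold = threshold

    def record(self, failed: bool) -> bool:
        """Record one request outcome; return True if an alert should fire."""
        self.results.append(failed)
        # Wait for a full window so a single early failure does not page anyone.
        if len(self.results) < self.results.maxlen:
            return False
        return sum(self.results) / len(self.results) > self.threshold
```

Calling record() after every request keeps the check cheap, because the deque discards the oldest outcome automatically once the window is full.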

A great book on the subject is The Art of Scalability.

Cost Efficiency

Your backend API is what lets your system scale to support a large number of users. With an increase in traffic comes an increase in cost. Therefore, when building your backend, you should be aware of the cost implications of your implementation decisions. We recommend focusing on the following areas:

Keeping your operational database slim

The main purpose of your operational database is to serve immediate API requests. Anything else you store or run there adds to the database size, reduces its availability, and makes it harder to scale, especially for SQL databases.
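One way to keep the operational database slim is to move old rows out on a schedule. The sketch below uses SQLite from the Python standard library; the orders and orders_archive tables and the created_at column are hypothetical, and in practice the archive would usually live in a separate, cheaper store rather than in the same database.

```python
import sqlite3


def archive_old_orders(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move orders created before `cutoff` (an ISO date) into an archive table.

    Both statements run in one transaction, so a failure leaves the data untouched.
    Returns the number of rows removed from the operational table.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE created_at < ?",
            (cutoff,),
        )
        deleted = conn.execute(
            "DELETE FROM orders WHERE created_at < ?", (cutoff,)
        ).rowcount
    return deleted
```

Run periodically, a job like this keeps the hot table limited to the rows that API requests actually need.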

Keeping your non-production environments small

Depending on your use case, you’ll need a number of different environments. For a simple scenario, three environments (development, staging, and production) work just fine. As your organization grows, you’ll have more use cases to support and the number of environments will grow too. The key is to give your production and pre-production environments the same configuration, and to run all other environments on a scaled-down version of your resources to keep the cost down. In addition, using Serverless or Containers helps minimize idle server time.

Using the right tool for the job

We’ve seen many cases where a SQL database stores all application logs, acts as a search index for different kinds of queries, and even acts as a data warehouse, all while providing data for API requests. There are solutions made specifically for each of these problems. Logs can be archived into flat files and processed offline. There are plenty of search solutions, such as Elasticsearch or Algolia. A data warehouse needs to be separate from your runtime operational database so that your data analysts don’t bring your runtime system down. Data warehouse users also need the data in a different format. By using the right solution for each type of problem, you’ll be able to provide a better experience for your users and have better control over your system cost.


Your backend infrastructure is the heart of your system. It supports your client applications at scale. Architecting your backend system right makes everything else easier. Using the right tool for the job is key, and there are specialized solutions for each type of problem. The goal is to build a system that always works, with comprehensive alerting and monitoring in place to notify you about issues before they become problems. If you need some help with building a robust backend API, don’t hesitate to reach out.
