Back-of-the-envelope Calculation – System Design 04

Preface

A back-of-the-envelope calculation is a way of estimating approximate values for complex problems using simple figures. As a quick review: a distributed system consists of computing nodes connected through a network. These nodes can be various types of servers, such as web servers, application servers, and storage servers.

When designing a distributed system, it is important to know how many requests each node can handle. From that, we can determine the number of nodes and the traffic capacity required. Back-of-the-envelope calculations give us these rough estimates, which then guide the design of the system.

Back-of-the-envelope

In reality, distributed systems are composed of computing nodes connected through a network. Real-world software systems use many kinds of computing nodes, connected in many different ways. Back-of-the-envelope calculations let us ignore the details of the system and focus on its more important aspects, much like the abstractions discussed in the previous article.

Examples of quantities estimated with back-of-the-envelope calculations:

  • The number of simultaneous TCP connections that the server can accept.
  • The number of requests per second (RPS) that a web page, database, or cache server can handle.
  • Storage needs of the service.

In each of these cases, unreasonable numbers can lead to design flaws. Therefore, when designing a system, we first make rough back-of-the-envelope estimates and then optimize and scale the system.

Data center server types

Data centers do not have only one type of server. Enterprise solutions use commodity hardware to reduce costs and build scalable systems. The following server types are commonly used in data centers to handle different workloads:

Web Server

For scalability, web servers are separated from application servers. The web server is the first node after the load balancer and handles API calls from clients. Its memory and storage resources vary according to need, and in general more memory and storage give the server more capacity for processing. For example, Meta has used web servers with 32 GB of RAM and 500 GB of storage to handle a large volume of computation.

Application Server

Application servers handle application and business logic. In practice it is often difficult to draw a clear line between web servers and application servers. The main difference is that application servers provide dynamic content, while web servers primarily serve static content to client browsers.

Storage Server

As Internet services grow, the amount of data they need to store increases explosively with traffic and scale. We therefore need storage servers (which can be thought of as dedicated database servers) to handle huge amounts of data, and we need to choose appropriate databases for different data types. For example, YouTube uses the following data stores:

  • Blob storage for encoded video data.
  • Bigtable specifically for large numbers of video thumbnails.
  • An RDBMS for user and video data, such as comments and likes.
  • SQL and NoSQL stores for the various types of data used in data analysis.

Common standards

Designing, planning, and implementing system services requires a large investment of money, time, and manpower. If we don't know what kinds of workloads a machine can handle, it is difficult to design any further. Latency is an important measure for judging which machines suit which workloads. The following figures come from resources found on GitHub, arranged into a table for reference.

Latency

 

Operation                                          Time (nanoseconds)
Execute an instruction                             1 (1/1,000,000,000 second)
Fetch from L1 cache                                0.5
Branch misprediction                               5
Fetch from L2 cache                                7
Mutex lock/unlock                                  25
Fetch from main memory                             100
Send 2K bytes over 1 Gbps network                  20,000
Read 1 MB sequentially from memory                 250,000
Fetch from new disk location (seek)                8,000,000
Read 1 MB sequentially from disk                   20,000,000
Send data packets to the United States and back    150,000,000 (150 milliseconds)
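
To get a feel for these magnitudes, the following minimal sketch stores a few of the latency figures in a dictionary and expresses one operation as a multiple of another. The names `LATENCY_NS` and `times_slower` are illustrative choices, not part of any library.

```python
# Latency figures from the table above, in nanoseconds.
# The dictionary and function names are illustrative, not from any library.
LATENCY_NS = {
    "l1_cache": 0.5,
    "main_memory": 100,
    "read_1mb_from_memory": 250_000,
    "disk_seek": 8_000_000,
    "read_1mb_from_disk": 20_000_000,
    "round_trip_packet": 150_000_000,
}


def times_slower(operation: str, baseline: str) -> float:
    """Return how many times slower `operation` is than `baseline`."""
    return LATENCY_NS[operation] / LATENCY_NS[baseline]


if __name__ == "__main__":
    # Sequential 1 MB read: disk versus memory.
    print(times_slower("read_1mb_from_disk", "read_1mb_from_memory"))
    # One disk seek versus one main-memory access.
    print(times_slower("disk_seek", "main_memory"))
```

With these figures, a sequential 1 MB read from disk is about 80 times slower than the same read from memory, which is why keeping hot data in RAM matters so much in rough estimates.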

QPS

In addition to the latencies listed above, there is also Queries Per Second (QPS), which measures the volume of database queries.

Data store            Queries per second (QPS)
MySQL                 1,000
Key-value database    10,000
Cache                 100,000 – 1 million
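
As a rough illustration of how these QPS figures feed into sizing, the sketch below estimates how many nodes of each kind a target query load would need. The function name `nodes_needed` is hypothetical, the cache entry uses the lower end of the quoted range, and replication and spare capacity are deliberately ignored.

```python
import math

# Rough per-node throughput from the table above, in queries per second.
# The cache entry uses the lower end of the quoted range (an assumption).
QPS_PER_NODE = {
    "mysql": 1_000,
    "key_value": 10_000,
    "cache": 100_000,
}


def nodes_needed(target_qps: int, store: str) -> int:
    """Round up, because a fraction of a node still needs a whole machine."""
    return math.ceil(target_qps / QPS_PER_NODE[store])


if __name__ == "__main__":
    # Hypothetical target load of 50,000 queries per second.
    for store in QPS_PER_NODE:
        print(store, nodes_needed(50_000, store))
```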

Unit quantity

Power of 2    Approximate value    Full name     Abbreviation
2^10          1 thousand           1 kilobyte    1 KB
2^20          1 million            1 megabyte    1 MB
2^30          1 billion            1 gigabyte    1 GB
2^40          1 trillion           1 terabyte    1 TB
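
Each unit is a power of two (2^10, 2^20, 2^30, 2^40) that is rounded to a power of ten for mental arithmetic. The short sketch below, with illustrative names, shows how far the round figure is from the exact value.

```python
# Each unit as a power of two next to the decimal value it approximates.
UNITS = [
    ("KB", 2**10, 10**3),   # thousand
    ("MB", 2**20, 10**6),   # million
    ("GB", 2**30, 10**9),   # billion
    ("TB", 2**40, 10**12),  # trillion
]


def relative_error(exact: int, approx: int) -> float:
    """Fraction by which the exact binary value exceeds the round figure."""
    return (exact - approx) / approx


if __name__ == "__main__":
    for name, exact, approx in UNITS:
        error = relative_error(exact, approx)
        print(f"1 {name} = {exact:,} bytes, {error:.1%} above {approx:,}")
```

The gap grows from about 2.4% for a kilobyte to roughly 10% for a terabyte, which is small enough to ignore in a rough estimate.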

Calculate request volume

Next, let's look at requests per second (RPS): the number of requests a server can handle each second.

A server's resources are limited, and depending on the type of client request, different resources become the bottleneck.

Requests fall mainly into two types:

  • CPU-bound requests: The limiting factor for such requests is the CPU.
  • Memory-bound requests: Such requests are subject to memory limitations.

CPU-bound requests

A common formula for calculating RPS for CPU-intensive requests is:

				
					RPS-CPU = Num-CPU x 1 / Task-Time
				
			

The variables are defined as follows:

  • RPS-CPU: CPU-intensive RPS
  • Num-CPU: Number of CPU threads
  • Task-Time: The time required to complete each task
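
The CPU-bound formula translates directly into a small function. This is a minimal sketch: the function name, the choice of seconds as the unit, and the example server (72 threads, 100 ms per task) are assumptions for illustration.

```python
def rps_cpu(num_cpu: int, task_time_s: float) -> float:
    """RPS-CPU = Num-CPU x (1 / Task-Time), with Task-Time in seconds."""
    if task_time_s <= 0:
        raise ValueError("task_time_s must be positive")
    return num_cpu * (1 / task_time_s)


if __name__ == "__main__":
    # Hypothetical server: 72 CPU threads, 100 ms per CPU-bound task.
    print(rps_cpu(num_cpu=72, task_time_s=0.1))
```

Each thread finishes 1 / Task-Time tasks per second, so the hypothetical server above handles roughly 720 CPU-bound requests per second.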

Memory-bound requests

For memory-intensive requests, we use the following formula:

				
					RPS-Memory = RAM-Size / Worker-Memory x 1 / Task-Time
				
			

The variables are defined as follows:

  • RPS-Memory: Memory-intensive RPS
  • RAM-Size: Total RAM available on the server
  • Worker-Memory: The memory each worker uses to handle a request
  • Task-Time: The time required to complete each task
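
For memory-bound requests, the number of workers that fit in RAM at once is the RAM size divided by the memory each worker needs, and each worker completes 1 / Task-Time requests per second. The sketch below encodes that reasoning; the function name, the megabyte units, and the example figures are assumptions for illustration.

```python
def rps_memory(ram_size_mb: float, worker_memory_mb: float, task_time_s: float) -> float:
    """RPS-Memory = (RAM-Size / Worker-Memory) x (1 / Task-Time)."""
    if worker_memory_mb <= 0 or task_time_s <= 0:
        raise ValueError("worker_memory_mb and task_time_s must be positive")
    workers = ram_size_mb / worker_memory_mb  # workers that fit in RAM at once
    return workers * (1 / task_time_s)


if __name__ == "__main__":
    # Hypothetical server: 16,000 MB of RAM, 200 MB per worker, 50 ms per task.
    print(rps_memory(ram_size_mb=16_000, worker_memory_mb=200, task_time_s=0.05))
```

In this hypothetical case 80 workers fit in memory and each serves about 20 requests per second, giving roughly 1,600 memory-bound requests per second.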

In practice, a service receives both CPU-intensive and memory-intensive requests. Assuming half of the requests are CPU-intensive and the other half are memory-intensive, the total RPS we can handle is the average of the two:

				
					RPS = (RPS-CPU + RPS-Memory) / 2
				
			

The calculations above are only meant to show how RPS can be approximated. In reality, many other factors affect RPS. For example, if the data is not in RAM, or a request has to go to the database server, a disk seek is required, which adds latency. Other unavoidable factors include bugs in program code, node failures, power outages, and network outages.
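
Once a per-server RPS is known, the number of servers follows by dividing the target load by it and rounding up. The sketch below is one way to do that under stated assumptions: the 50/50 request mix from the text, plus a hypothetical `headroom` parameter that adds spare capacity for the factors the formulas ignore. All names and example figures are illustrative.

```python
import math


def total_rps(rps_cpu: float, rps_memory: float) -> float:
    """Average of the two rates, assuming a 50/50 mix of request types."""
    return (rps_cpu + rps_memory) / 2


def servers_needed(target_rps: float, rps_per_server: float, headroom: float = 0.0) -> int:
    """Servers required for target_rps, with optional spare capacity.

    headroom is a hypothetical safety margin (0.3 means 30% extra) for
    factors the formulas ignore, such as failures and disk seeks.
    """
    return math.ceil(target_rps * (1 + headroom) / rps_per_server)


if __name__ == "__main__":
    # Hypothetical per-server rates: 720 CPU-bound and 1,600 memory-bound RPS.
    per_server = total_rps(720, 1_600)
    print(per_server)
    print(servers_needed(100_000, per_server))
    print(servers_needed(100_000, per_server, headroom=0.3))
```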

Types of Estimates in System Design Interviews

In a system design interview, we may need to perform the following types of estimates:

  1. Load estimation: Predict the number of requests, data volume, or user traffic your system can expect per second.
  2. Storage estimation: Estimate the amount of storage space required to hold the data generated by your system.
  3. Bandwidth estimation: Estimate the expected traffic and the network bandwidth required for data transfer.
  4. Latency estimation: Predict response times and latencies from the system architecture and its components.
  5. Resource estimation: Estimate the number of servers, CPUs, or memory required to handle the load.

Practical examples of back-of-the-envelope calculation

Load estimation

Suppose you want to design a social media platform with 100 million daily active users (DAU) and each user publishes an average of 10 posts per day. To calculate the load, we need to count the total number of posts generated per day:

				
					100 million DAU * 10 posts/user = 1 billion posts/day
				
			

Then estimate the number of requests per second:

				
					1 billion posts/day / 86,400 seconds/day ≈ 11,574 requests/second
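
The same load estimate can be written as a small function. This is a sketch that assumes traffic is spread evenly over the day; the `peak_factor` multiplier is a hypothetical addition for bursty traffic and is not a figure from this article.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(daily_active_users: int, actions_per_user: float) -> float:
    """Average requests per second, assuming load is spread evenly over the day."""
    return daily_active_users * actions_per_user / SECONDS_PER_DAY


def peak_rps(avg_rps: float, peak_factor: float = 2.0) -> float:
    """Peak estimate; peak_factor is a hypothetical multiplier, not from the article."""
    return avg_rps * peak_factor


if __name__ == "__main__":
    avg = average_rps(daily_active_users=100_000_000, actions_per_user=10)
    print(round(avg))
    print(round(peak_rps(avg)))
```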
				
			

Storage estimation

Consider a photo-sharing app with 500 million users, each uploading an average of 2 photos per day. The average size of each photo is 2 MB. To estimate the storage space required for a day’s photos, calculate as follows:

				
					500 million users * 2 photos/user * 2 MB/photo = 2,000,000,000 MB/day (about 2 PB/day)
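
Large megabyte totals are easier to reason about once converted to bigger units. The sketch below repeats the storage estimate and projects it over a year, assuming decimal units (1 PB = 10^9 MB) and ignoring replication and compression; the function names are illustrative.

```python
MB_PER_PB = 10**9  # decimal units: 1 PB = 1,000,000,000 MB


def daily_storage_mb(users: int, photos_per_user: int, photo_size_mb: int) -> int:
    """Raw storage generated per day, in megabytes."""
    return users * photos_per_user * photo_size_mb


def yearly_storage_pb(daily_mb: int, days: int = 365) -> float:
    """Storage accumulated over `days`, in petabytes, with no replication."""
    return daily_mb * days / MB_PER_PB


if __name__ == "__main__":
    daily = daily_storage_mb(users=500_000_000, photos_per_user=2, photo_size_mb=2)
    print(daily / MB_PER_PB)         # petabytes per day
    print(yearly_storage_pb(daily))  # petabytes per year
```

At 2 PB per day, the raw total reaches 730 PB in a year, before any replication factor is applied.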
				
			

Bandwidth estimation

For a video streaming service with 10 million users concurrently streaming 1080p video at 4 Mbps each, the required bandwidth is:

				
					10 million users * 4 Mbps = 40,000,000 Mbps (40 Tbps)
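
The bandwidth estimate follows the same pattern. The sketch below assumes every stream is active at the same moment, which is the worst case; function names are illustrative.

```python
def total_bandwidth_mbps(concurrent_streams: int, bitrate_mbps: float) -> float:
    """Aggregate egress bandwidth if every stream is active at the same time."""
    return concurrent_streams * bitrate_mbps


def mbps_to_tbps(mbps: float) -> float:
    """Convert megabits per second to terabits per second (decimal units)."""
    return mbps / 1_000_000


if __name__ == "__main__":
    mbps = total_bandwidth_mbps(concurrent_streams=10_000_000, bitrate_mbps=4)
    print(mbps)
    print(mbps_to_tbps(mbps))
```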
				
			

Latency estimation

Suppose you want to design an API that fetches data from three sources whose average latencies are 50 milliseconds, 100 milliseconds, and 200 milliseconds. If the sources are queried one after another, the total latency is the sum:

				
					50 milliseconds + 100 milliseconds + 200 milliseconds = 350 milliseconds
				
			

If the sources are queried in parallel, the total latency is the maximum of the individual latencies:

				
					max(50ms, 100ms, 200ms) = 200ms
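
Both cases reduce to a single line each: a sum for sequential calls and a maximum for parallel calls. The sketch below uses illustrative function names and ignores any overhead for fanning out and merging results.

```python
def sequential_latency_ms(latencies_ms: list[float]) -> float:
    """Calls made one after another: the latencies add up."""
    return sum(latencies_ms)


def parallel_latency_ms(latencies_ms: list[float]) -> float:
    """Calls made at the same time: the slowest call dominates."""
    return max(latencies_ms)


if __name__ == "__main__":
    sources = [50, 100, 200]
    print(sequential_latency_ms(sources))
    print(parallel_latency_ms(sources))
```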
				
			

Resource estimation

Suppose you are designing a system that receives 10,000 requests per second, and each request requires 10 milliseconds of CPU time. To calculate the number of CPU cores required, first calculate the total CPU time needed per second:

				
					10,000 requests/second * 10 milliseconds/request = 100,000 milliseconds/second
				
			

Since each core provides 1,000 milliseconds of CPU time per second, the number of cores required is:

				
					100,000 ms/sec / 1,000 ms/core = 100 cores
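
The core count can be computed the same way in code. In this sketch, `target_utilization` is a hypothetical parameter that is not part of the example above: setting it below 1.0 leaves spare capacity instead of running every core at 100%.

```python
import math

MS_PER_CORE_PER_SECOND = 1_000  # one core supplies 1,000 ms of CPU time each second


def cores_needed(
    requests_per_second: int,
    cpu_ms_per_request: float,
    target_utilization: float = 1.0,
) -> int:
    """Cores required, rounded up; target_utilization is a hypothetical margin."""
    cpu_ms_per_second = requests_per_second * cpu_ms_per_request
    usable_ms_per_core = MS_PER_CORE_PER_SECOND * target_utilization
    return math.ceil(cpu_ms_per_second / usable_ms_per_core)


if __name__ == "__main__":
    print(cores_needed(10_000, 10))                          # every core fully busy
    print(cores_needed(10_000, 10, target_utilization=0.5))  # cores half busy
```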
				
			

Conclusion

Back-of-the-envelope calculation is a method for quickly estimating system requirements and can be used in the early stages of system design. It supports sound design decisions early on and helps avoid problems at later stages.

Here are some considerations when using back-of-the-envelope calculations in system design:

  • A back-of-the-envelope calculation only provides a rough estimate. When designing a real system, detailed analysis is still required.
  • All major factors of the system need to be considered in the estimate, including hardware, software, and networking.

References

Teach Yourself Programming in Ten Years

