System Design

Vertical and Horizontal Scaling

Vertical Scaling (Making one server bigger):

Example: You have a server that is unprepared for how much traffic it’s receiving

Scalable Dimensions (Physical/Hardware):
Concurrent connections – a more powerful CPU lets the server handle more simultaneous connections

Memory – adding more RAM increases both how much data can be held in memory and how quickly it can be accessed

Network Interfaces – adding hardware such as another Ethernet card improves connection throughput

 

*Don’t just max out one server’s hardware. If that server were to go down for any reason, the entire system would go down with it. Adding better hardware to the server isn’t always the solution.

 

Horizontal Scaling (Adding more servers of the same type):

 

The system still achieves the same thing, but the work is split between multiple servers instead of one big one.

 

Generally this is cheaper than vertical scaling. Purchasing another similar machine costs roughly the same each time, while upgrading one specific machine gets exponentially more expensive. Multiple machines also improve availability: if one goes down, the system still works.

(Figure: left side is vertical scaling; right side is horizontal scaling.)

Load Balancing

Same Example: You are running a website that is getting more traffic than it can handle.

The obvious idea is to add another machine to your system. But the machines can’t share the same public-facing IP address, so users won’t know where to go.

To resolve this, you can introduce a load balancer. Instead of having users contact the machines’ public addresses directly, they contact the load balancer, which then splits the workload between the multiple machines in your system.

The load balancer also needs a method for how it splits the load:

 

Round Robin: If you have 3 machines in the system, the load balancer sends requests in a 1 -> 2 -> 3 -> 1… fashion.

Load Based – the load balancer picks a destination based on how much work each machine is doing. This could be connection-based (how many active connections each machine has) or resource-based (checking CPU utilization and the like).
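The two strategies above can be sketched in Python (a minimal illustration, not a production balancer; the server names are placeholders):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Cycles through servers in order: 1 -> 2 -> 3 -> 1 -> ..."""
    def __init__(self, servers):
        self._cycle = cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Connection-based load balancing: pick the server currently
    handling the fewest in-flight requests."""
    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}

    def pick(self):
        server = min(self.connections, key=self.connections.get)
        self.connections[server] += 1  # request is now in flight
        return server

    def release(self, server):
        self.connections[server] -= 1  # request finished
```

For example, `RoundRobinBalancer(["s1", "s2", "s3"])` yields s1, s2, s3, then wraps back to s1.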

 

A load balancer brings scalability and availability to the system. It’s also much more convenient because all traffic first hits one address.

Cache

Caches are used everywhere on computers. They serve as short-term memory, which is usually able to be accessed much quicker. Things that have been requested recently are likely to be requested again.

Caches can be found at every layer: hardware, OS, web browser, and web application, but they are often placed closest to the frontend.

Cache Types:

Application Server Cache: This is a cache located on the request layer. When a request comes in, the cache returns a response if it already holds the data; otherwise, the node fetches it from disk. If you have a load balancer, it can send a request to any node, which increases cache misses (https://www.geeksforgeeks.org/types-of-cache-misses/). There are two proposed solutions: the distributed cache and the global cache.

Distributed Cache: 

Each node in the system holds a part of the overall cache. The system uses a hash function to figure out which node holds the requested data. This scales well for cache space, since you can always add more cache space by adding more nodes.
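A minimal sketch of how a key could be mapped to a node by hash (naive modulo hashing; real distributed caches usually prefer consistent hashing so that adding a node doesn't remap most keys):

```python
import hashlib

def node_for_key(key, nodes):
    """Pick the node responsible for a cache key.
    Naive modulo scheme: the same key always maps to the same node,
    but adding/removing a node remaps most keys."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]
```

Every node runs the same function, so any of them can locate the data without asking a coordinator.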

Global cache: 

Every node shares the same cache space. There are two variants: one where the cache itself is responsible for retrieving information on a miss, and one where the requesting node is responsible for fetching anything not found.

CDN (Content Distribution Network):

CDNs cache large amounts of static media. On the first request for a piece of content, the CDN asks the back-end servers and then stores the response for later use.

Cache Invalidation:
When information is changed in the database, the cache could be storing stale information. To solve this, there are three approaches.

Write Through Cache:

Writes to the cache and the database at the same time. This minimizes data loss, but every write incurs extra latency.

Write Around Cache:

Writes go straight to storage, bypassing the cache, and the corresponding cache entry is invalidated. So it takes some time for reads to repopulate the cache with that information.

 

Write-back Cache:

Everything is written to the cache first; writing the information to storage happens later. Writes are fast, but there is a higher risk of data loss if the cache fails before flushing.
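The write-through vs. write-back trade-off can be sketched with a toy in-memory model (here a plain dict stands in for the database):

```python
class WriteThroughCache:
    """Write goes to both cache and backing store before returning:
    safer, but every write pays the store's latency."""
    def __init__(self, store):
        self.cache, self.store = {}, store

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value  # synchronous write to the store

class WriteBackCache:
    """Write hits only the cache; the store is updated later in a flush.
    Fast writes, but dirty entries are lost if the cache dies first."""
    def __init__(self, store):
        self.cache, self.store, self.dirty = {}, store, set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # remember what still needs persisting

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
```

With write-back, a read of the store between `write` and `flush` sees stale data; that window is exactly the data-loss risk described above.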

Cache Eviction Policies:

First In, First Out (FIFO): The cache evicts the item that was added first, regardless of how many times it has been accessed.

Last In, First Out (LIFO): The most recently added item is evicted first, regardless of how many times it has been accessed.

Least Recently Used (LRU): The cache evicts the item that has gone unused for the longest time.

Most Recently Used (MRU): The cache evicts the most recently used item, the opposite of LRU.

Least Frequently Used (LFU): This counts how often each item is accessed; the least frequently accessed items are removed first.

Random Replacement (RR): Randomly selects an item to remove from the cache.
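LRU, the most common of these policies, can be sketched with Python's `OrderedDict` (a minimal illustration, not a production cache):

```python
from collections import OrderedDict

class LRUCache:
    """Least Recently Used eviction: on overflow, drop the entry
    that has gone unread/unwritten the longest."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

For example, in a 2-entry cache holding "a" and "b", reading "a" and then inserting "c" evicts "b".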

Sharding / Data Partitioning

Sharding is a technique to break up a big database into many smaller parts. 

Sharding Methods:

  • Horizontal partitioning (Range based)
    • Putting different rows into different DBs
    • Range-based sharding splits on a key range, e.g. lexicographical order of names
    • Subject to imbalance. If you split alphabetically, early letters could end up storing far more rows than later letters. 
  • Vertical Partitioning
    • Divide tables by feature, such as one DB for user data and one for location data
    • Also subject to imbalance
  • Directory-based partitioning
    • We query a directory server that holds the mapping between each tuple key and the DB server that stores it

Sharding Criteria:

  • Hash Based (most popular)
    • Apply a hash function to some attribute of the entity and use the result to pick a shard
  • List partitioning
    • Such as partitioning based on regions. However, this can cause imbalance: there could be far more users in one geographic region than another
  • Round-Robin partitioning – rows just go to database 1, then 2, then 3, and back to 1
  • Composite Partitioning – combining multiple different sharding strategies. 
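Range-based and hash-based sharding can be contrasted with a small sketch (the shard boundaries and key names here are made up for illustration):

```python
import zlib

def range_shard(name, boundaries=("h", "p")):
    """Range-based sharding on the first letter: a-g -> shard 0,
    h-o -> shard 1, p-z -> shard 2. Simple and range-scan friendly,
    but skews if names cluster around certain letters."""
    first = name[0].lower()
    for shard, upper in enumerate(boundaries):
        if first < upper:
            return shard
    return len(boundaries)

def hash_shard(key, num_shards):
    """Hash-based sharding: a stable checksum spreads keys evenly
    across shards, at the cost of making range scans hard."""
    return zlib.crc32(key.encode()) % num_shards
```

The trade-off: `range_shard` keeps related keys together (good for range queries), while `hash_shard` balances load at the cost of scattering related keys.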

The main issue with sharding is ACID compliance: keeping transactions consistent across distributed databases is hard. The databases could also be located all around the globe, so joining data across them can be inefficient as well.

 

Indexing

Indexes are used in a database to improve the speed at which data is retrieved. There is a trade-off with indexing: it uses extra storage overhead and makes writes to the database slower, but in exchange reads become much faster. 

 

The index is like a table of contents. It tells the requester where the data lives instead of forcing a scan of the whole database. Index entries are usually kept in some sorted order. 

 

There are two different ways to do indexing:

  • Ordered Indexing
    • The indexed column is kept sorted, typically in ascending order. 
  • Hash Indexing
    • Indexing is done with a hash function and a hash table. 

 

Hash indexing tends to be more popular as it allows for more scalability.
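The two approaches can be sketched in Python, where a dict plays the hash index and a sorted key list with binary search plays the ordered index (the data is a toy example, assumed for illustration):

```python
import bisect

# Hash index: a dict gives O(1) exact-match lookups of key -> row location.
hash_index = {"alice": 10, "bob": 25, "carol": 3}

# Ordered index: a sorted list of keys supports range queries via binary search.
sorted_keys = sorted(hash_index)

def range_lookup(lo, hi):
    """Return all indexed keys in [lo, hi] using binary search."""
    left = bisect.bisect_left(sorted_keys, lo)
    right = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[left:right]
```

Note the difference in capability: the hash index answers "where is bob?" in one step, but only the ordered index can answer "everything between a and bz" without scanning every key.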