Notes from Grokking the System Design Interview
When designing systems, we need to consider:
- What architectural pieces are used
- How pieces work together
- how best utilized the pieces, right tradeoffs
1. Load Balancing
Load balancer helps to spread the traffic across a cluster of servers to improve responsiveness and availability of applications, websites or databases.
To utilize full scalability and redundancy, we can try to balance the load at each layer of the system. We can add LBs at three places:
- Between the user and the web server
- Between web servers and an internal platform layer, like application servers or cache servers
- Between internal platform layer and database.
Benefits of load balancing:
Faster user experience, less downtime and wait time, fewer failed stressed components
How does the load balancer choose the backend server?
It regularly does health checks to maintain a healthy pool of servers.
- Least Connection Method
- Least Response Time Method
- Least Bandwidth Method – This method selects the server that is currently serving the least amount of traffic, measured in megabits per second (Mbps).
- Round Robin Method
- Weighted Round Robin Method
- IP Hash — This method calculates a hash of the IP address of the client to determine which server receives the request.
Add redundant load balancer to prevent single point of failure
A cache is like short-term memory: it has a limited amount of space, but is typically faster than the original data source and contains the most recently accessed items. Caches can exist at all levels in architecture but are often found at the level nearest to the front end, where they are implemented to return data quickly without taxing downstream levels.
Application cache: save data at local to enable faster access.
-> What about there are many nodes?
Use distributed caching : Use consistent hashing function. Just add new nodes in the pool to increase caching. Or use global caching: one single cache space. Either cache or request node is responsible to retrieve missing data.
Content Distribution Network (CDN)
CDNs are a kind of cache that comes into play for sites serving large amounts of static media.
Cache Invalidation. Algorithms:
Write-through cache: Under this scheme data is written into the cache and the corresponding database at the same time. The cached data allows for fast retrieval, and since the same data gets written in the permanent storage, we will have complete data consistency between cache and storage. Also, this scheme ensures that nothing will get lost in case of a crash, power failure, or other system disruptions.
Although write through minimizes the risk of data loss, since every write operation must be done twice before returning success to the client, this scheme has the disadvantage of higher latency for write operations.
Write-around cache: This technique is similar to write through cache, but data is written directly to permanent storage, bypassing the cache. This can reduce the cache being flooded with write operations that will not subsequently be re-read, but has the disadvantage that a read request for recently written data will create a “cache miss” and must be read from slower back-end storage and experience higher latency.
Write-back cache: Under this scheme, data is written to cache alone, and completion is immediately confirmed to the client. The write to the permanent storage is done after specified intervals or under certain conditions. This results in low latency and high throughput for write-intensive applications, however, this speed comes with the risk of data loss in case of a crash or other adverse event because the only copy of the written data is in the cache.
Cache eviction policies
- First In First Out (FIFO): The cache evicts the first block accessed first without any regard to how often or how many times it was accessed before.
- Last In First Out (LIFO): The cache evicts the block accessed most recently first without any regard to how often or how many times it was accessed before.
- Least Recently Used (LRU): Discards the least recently used items first.
- Most Recently Used (MRU): Discards, in contrast to LRU, the most recently used items first.
- Least Frequently Used (LFU): Counts how often an item is needed. Those that are used least often are discarded first.
- Random Replacement (RR): Randomly selects a candidate item and discards it to make space when necessary.
3. Sharding or Data Partitioning
Data partitioning (also known as sharding) is a technique to break up a big database (DB) into many smaller parts. It is the process of splitting up a DB/table across multiple machines to improve the manageability, performance, availability and load balancing of an application.
1. Partitioning Methods
a. Horizontal partitioning: In this scheme, we put different rows into different tables. if the value whose range is used for sharding isn’t chosen carefully, then the partitioning scheme will lead to unbalanced servers.
b. Vertical Partitioning: In this scheme, we divide our data to store tables related to a specific feature to their own server. For example, profile info on one server, friend list on another. If the application experiences additional growth, it needs further partitioning.
c. Directory Based Partitioning: A loosely coupled approach to work around issues mentioned in above schemes is to create a lookup service which knows your current partitioning scheme and abstracts it away from the DB access code. So, to find out where does a particular data entity resides, we query our directory server that holds the mapping between each tuple key to its DB server. This loosely coupled approach means we can perform tasks like adding servers to the DB pool or change our partitioning scheme without having to impact your application.
2. Partitioning Criteria
a. Key or Hash-based partitioning. Use consistent hashing.
b. List partitioning: In this scheme, each partition is assigned a list of values, so whenever we want to insert a new record, we will see which partition contains our key and then store it there.
c. Round-robin partitioning: This is a very simple strategy that ensures uniform data distribution.
d. Composite partitioning: Under this scheme, we combine any of above partitioning schemes to devise a new scheme.
3. Common Problems of Sharding
a. Joins and Denormalization: once a database is partitioned and spread across multiple machines it is often not feasible to perform joins that span database shards. Such joins will not be performance efficient since data has to be compiled from multiple servers. A common workaround for this problem is to denormalize the database so that queries that previously required joins can be performed from a single table.
b. Referential integrity: As we saw that performing a cross-shard query on a partitioned database is not feasible, similarly trying to enforce data integrity constraints such as foreign keys in a sharded database can be extremely difficult. Often in such cases, applications have to run regular SQL jobs to clean up dangling references.
c. Rebalancing: There could be many reasons we have to change our sharding scheme:
- The data distribution is not uniform, e.g., there are a lot of places for a particular ZIP code, that cannot fit into one database partition.
- There are a lot of load on a shard, e.g., there are too many requests being handled by the DB shard dedicated to user photos.
In such cases, either we have to create more DB shards or have to rebalance existing shards, which means the partitioning scheme changed and all existing data moved to new locations.
The goal of creating an index on a particular table in a database is to make it faster to search through the table and find the row or rows that we want.
The downside is that indexes make it slower to add rows or make updates to existing rows for that table since we not only have to write the data but also have to update the index. So adding indexes can increase the read performance, but at the same time, decrease the write performance.
A proxy server is an intermediary piece of hardware/software that sits between the client and the back-end server. It receives requests from clients and relays them to the origin servers.
For example, we can collapse the same (or similar) data access requests into one request and then return the single result to the user; this scheme is called collapsed forwarding.
Another great way to use the proxy is to collapse requests for data that is spatially close together in the storage (consecutively on disk). This strategy will result in decreasing request latency.
In cases where individual writes (or tasks) may take a long time, achieving high performance and availability requires different components of the system to work in an asynchronous way; a common way to do that is with queues.
Queues are implemented on the asynchronous communication protocol, meaning when a client submits a task to a queue they are no longer required to wait for the results
Queues are also used for fault tolerance as they can provide some protection from service outages and failures.
Queues play a vital role in managing distributed communication between different parts of any large-scale distributed system. There are a lot of ways to implement them and quite a few open source implementations of queues available like RabbitMQ, ZeroMQ, ActiveMQ, and BeanstalkD.
7. Redundancy and Replication
Creating redundancy in a system can remove single points of failure and provide backups if needed in a crisis.
8. SQL vs. NoSQL
most common types of NoSQL:
Key-Value Stores: Data is stored in an array of key-value pairs. The ‘key’ is an attribute name, which is linked to a ‘value’. Well-known key value stores include Redis, Voldemort and Dynamo.
Document Databases: In these databases data is stored in documents, instead of rows and columns in a table, and these documents are grouped together in collections. Each document can have an entirely different structure. Document databases include the CouchDB and MongoDB.
Wide-Column Databases: Instead of ‘tables,’ in columnar databases we have column families, which are containers for rows. Unlike relational databases, we don’t need to know all the columns up front, and each row doesn’t have to have the same number of columns. Columnar databases are best suited for analyzing large datasets – big names include Cassandra and HBase.
Graph Databases: These databases are used to store data whose relations are best represented in a graph. Data is saved in graph structures with nodes (entities), properties (information about the entities) and lines (connections between the entities). Examples of graph database include Neo4J and InfiniteGraph.
High Level Difference:
Schema: In SQL, each record conforms to a fixed schema, meaning the columns must be decided and chosen before data entry and each row must have data for each column. Whereas in NoSQL, schemas are dynamic. Columns can be added on the fly, and each ‘row’ (or equivalent) doesn’t have to contain data for each ‘column.’
Querying: SQL databases uses SQL (structured query language) for defining and manipulating the data, which is very powerful. In NoSQL database, queries are focused on a collection of documents.
Scalability: In most common situations, SQL databases are vertically scalable, i.e., by increasing the horsepower (higher Memory, CPU, etc.) of the hardware, which can get very expensive. On the other hand, NoSQL databases are horizontally scalable, meaning we can add more servers easily in our NoSQL database infrastructure to handle large traffic. Any cheap commodity hardware or cloud instances can host NoSQL databases, thus making it a lot more cost-effective than vertical scaling. A lot of NoSQL technologies also distribute data across servers automatically.
Reliability or ACID Compliancy (Atomicity, Consistency, Isolation, Durability): The vast majority of relational databases are ACID compliant. So, when it comes to data reliability and safe guarantee of performing transactions, SQL databases are still the better bet. Most of the NoSQL solutions sacrifice ACID compliance for performance and scalability.
SQL VS. NoSQL – Which one to use?
use SQL database: ensure ACID compliance. Your data is structured and unchanging.
use NoSQL database: Storing large volumes of data that often have little to no structure. Making the most of cloud computing and storage. Rapid development.
A variation of the traditional polling technique that allows the server to push information to a client, whenever the data is available. With Long-Polling, the client requests information from the server exactly as in normal polling, but with the expectation that the server may not respond immediately. That’s why this technique is sometimes referred to as a “Hanging GET”.
- If the server does not have any data available for the client, instead of sending an empty response, the server holds the request and waits until some data becomes available.
- Once the data becomes available, a full response is sent to the client. The client then immediately re-request information from the server so that the server will almost always have an available waiting request that it can use to deliver data in response to an event.
What is system design interview?
You need to get into what the companies want to know about you during these 40 minutes, which is basically “your approach and strategy to handle a problem” and how organized, disciplined, systematic, and professional you are at solving it. What is your capacity to analyze an issue and your level of professional mechanics to solve it step by step?
In short, system design interview is, just understanding it from interviewer’s perspective. During the whole process, it is your discussion with the interviewer that is of core importance.
7 Steps for system design interview:
Step 1: Requirements clarifications
Step 2: System interface definition
Step 3: Back-of-the-envelope estimation
Step 4: Defining data model
Step 5: High-level design
Step 6: Detailed design
Step 7: Identifying and resolving bottlenecks
One thought on “System Design Basics”