- Grokking the System Design Interview
Let’s design a photo-sharing service like Instagram, where users can upload photos to share them with other users.
- Users should be able to upload/download/view photos.
- Users can perform searches based on photo/video titles.
- Users can follow other users.
- The system should be able to generate and display a user’s News Feed consisting of top photos from all the people the user follows.
- Our service needs to be highly available.
- The acceptable latency of the system is 200ms for News Feed generation.
- Consistency can take a hit (in the interest of availability), if a user doesn’t see a photo for a while, it should be fine.
- The system should be highly reliable, any uploaded photo or video should never be lost.
The system would be read-heavy, so we will focus on building a system that can retrieve photos quickly.
- Practically users can upload as many photos as they like. Efficient management of storage should be a crucial factor while designing this system.
- Low latency is expected while viewing photos.
- Data should be 100% reliable. If a user uploads a photo, the system will guarantee that it will never be lost.
Capacity Estimation and Constraints
- Let’s assume we have 500M total users, with 1M daily active users.
- 2M new photos every day, 23 new photos every second.
- Average photo file size => 200KB
- Total space required for 1 day of photos
2M * 200KB => 400 GB
- Total space required for 10 years:
400GB * 365 (days a year) * 10 (years) ~= 1425TBDatabase Design:
We need to store relationships between users and photos, to know who owns which photo. We also need to store the list of people a user follows. For both of these tables, we can use a wide-column datastore like Cassandra. For the ‘UserPhoto’ table, the ‘key’ would be ‘UserID’ and the ‘value’ would be the list of ‘PhotoIDs’ the user owns, stored in different columns. We will have a similar scheme for the ‘UserFollow’ table.
a. Partitioning based on UserID Let’s assume we shard based on the ‘UserID’ so that we can keep all photos of a user on the same shard. If one DB shard is 1TB, we will need four shards to store 3.7TB of data. Let’s assume for better performance and scalability we keep 10 shards.
b. Partitioning based on PhotoID If we can generate unique PhotoIDs first and then find shard number through “PhotoID % 10”, this can solve the above problems. We would not need to append ShardID with PhotoID in this case as PhotoID will itself be unique throughout the system.
Ranking and News Feed Generation
In the final step, the server will submit all these photos to our ranking algorithm which will determine the top 100 photos (based on recency, likeness, etc.) and return them to the user. A possible problem with this approach would be higher latency, as we have to query multiple tables and perform sorting/merging/ranking on the results. To improve the efficiency, we can pre-generate the News Feed and store it in a separate table.
What are the different approaches for sending News Feed contents to the users?
. Pull: Clients can pull the News Feed contents from the server on a regular basis or manually whenever they need it. Possible problems with this approach are a) New data might not be shown to the users until clients issue a pull request b) Most of the time pull requests will result in an empty response if there is no new data.
2. Push: Servers can push new data to the users as soon as it is available. To efficiently manage this, users have to maintain a Long Poll request with the server for receiving the updates. A possible problem with this approach is when a user has a lot of follows or a celebrity user who has millions of followers; in this case, the server has to push updates quite frequently.
3. Hybrid: We can adopt a hybrid approach. We can move all the users with high followings to pull based model and only push data to those users who have a few hundred (or thousand) follows. Another approach could be that the server pushes updates to all the users not more than a certain frequency, letting users with a lot of follows/updates to regularly pull data.
News Feed Creation with Sharded Data
One of the most important requirement to create the News Feed for any given user is to fetch the latest photos from all people the user follows. For this, we need to have a mechanism to sort photos on their time of creation. To efficiently do this, we can make photo creation time part of the PhotoID. As we will have a primary index on PhotoID, it will be quite quick to find latest PhotoIDs.
How to scale?
For instance, one costly operation is to render users feed. The server has to go over everyone the user follows, fetch all the pictures from them, and rank them based on particular algorithms. When a user has followed many people with a large number of pictures, the operation can be slow.
Various approaches can be applied here. We can upgrade the ranking algorithm if it’s the bottleneck, e.g. if we are ranking by date, we can just read the top N most recent pictures from each person with infinite scroll feature. Or we can use offline pipelines to precompute some signals that can speed up the ranking.
The point is that it’s unlikely to have someone following hundreds of users, but it’s likely to have someone with thousands of pictures. Therefore, accelerating the picture fetching and ranking is the core.
Since a picture sharing system is full of images, I would like to ask what can be optimized related to images?
First of all, it’s usually recommended to store all pictures separately in production. Amazon S3 is one of the most popular storage systems. However, you don’t need to be able to come up with this.
The point is that images are usually of large size and seldom get updated. So a separate system for image storage has a lot of advantages. For instance, cache and replication can be much simpler when files are static.
Secondly, to save space, images should be compressed. One common approach is to only store/serve the compressed version of images. Google photos actually is using this approach with unlimited free storage.