
Commit

Merge branch 'master' into develop
* master:
  Add Dyn DDoS attack link (#13)
  Fix multiple typos (#14)
  Add more disadvantages of REST (#9)
  Fix typo - lobsided -> lopsided (#11)
  Fix typo in Study Guide section (#10)
  Fix #1: Update reverse proxy image (#8)
donnemartin committed Mar 11, 2017
2 parents d5f125e + 69d4e5a commit ebdb203
Showing 1 changed file with 14 additions and 12 deletions.
26 changes: 14 additions & 12 deletions README.md
@@ -183,8 +183,8 @@ More experienced candidates are generally expected to know more about system des
Start broad and go deeper in a few areas. It helps to know a little about various key system design topics. Adjust the following guide based on your experience, what positions you are interviewing for, and which companies you are interviewing with.

* **Short** - Aim for **breadth** with system design topics. Practice by solving **some** interview questions.
-* **Medium** - Aim for **breadth** and **some depth** with system design topics. Practice by solving a **many** interview questions.
-* **Long** - Aim for **breadth** and **more depth** with system design topics. Practice by solving a **most** interview questions.
+* **Medium** - Aim for **breadth** and **some depth** with system design topics. Practice by solving **many** interview questions.
+* **Long** - Aim for **breadth** and **more depth** with system design topics. Practice by solving **most** interview questions.

| | Short | Medium | Long |
|---|---|---|---|
@@ -485,7 +485,7 @@ There are two main patterns to support high availability: **fail-over** and **re

With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.

-The length of downtime is determined by whether the passive server is already running in 'hot' standy or whether it needs to start up from 'cold' standby. Only the active server handles traffic.
+The length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby. Only the active server handles traffic.

Active-passive failover can also be referred to as master-slave failover.
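
The heartbeat mechanism described in this hunk can be sketched roughly as follows. This is a minimal, single-process illustration rather than anything from the README: the `PassiveStandby` class, its field names, and the 3-second timeout are all assumptions made for the example, and a real deployment would exchange heartbeats over the network and reassign the IP address on takeover.

```python
from dataclasses import dataclass


@dataclass
class PassiveStandby:
    """Hypothetical passive node that watches the active node's heartbeats."""

    timeout_s: float          # assumed threshold: longer silence triggers fail-over
    last_heartbeat_s: float   # timestamp of the most recent heartbeat seen
    is_active: bool = False   # becomes True once the standby has taken over

    def on_heartbeat(self, now_s: float) -> None:
        # The active server is alive; remember when we last heard from it.
        self.last_heartbeat_s = now_s

    def check(self, now_s: float) -> bool:
        # Promote the standby if the heartbeat has been silent for too long.
        if not self.is_active and now_s - self.last_heartbeat_s > self.timeout_s:
            # A real system would also claim the active server's IP address here.
            self.is_active = True
        return self.is_active


standby = PassiveStandby(timeout_s=3.0, last_heartbeat_s=0.0)
standby.on_heartbeat(now_s=1.0)
print(standby.check(now_s=2.0))  # heartbeat is recent, so the standby stays passive
print(standby.check(now_s=5.5))  # 4.5 s of silence exceeds the timeout, so it takes over
```

Timestamps are passed in explicitly so the sketch stays deterministic; a 'hot' standby would start serving as soon as `check` returns `True`, while a 'cold' one would first have to boot.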

@@ -541,7 +541,7 @@ Services such as [CloudFlare](https://www.cloudflare.com/dns/) and [Route 53](ht

* Accessing a DNS server introduces a slight delay, although mitigated by caching described above.
* DNS server management could be complex, although they are generally managed by [governments, ISPs, and large companies](http://superuser.com/questions/472695/who-controls-the-dns-servers/472729).
-* DNS services have recently come under DDoS attack, preventing users from accessing websites such as Twitter without knowing Twitter's IP address(es).
+* DNS services have recently come under [DDoS attack](http://dyn.com/blog/dyn-analysis-summary-of-friday-october-21-attack/), preventing users from accessing websites such as Twitter without knowing Twitter's IP address(es).

### Source(s) and further reading

@@ -618,7 +618,7 @@ Load balancers can route traffic based on various metrics, including:

* Random
* Least loaded
-* Seesion/cookies
+* Session/cookies
* [Round robin or weighted round robin](http://g33kinfo.com/info/archives/2657)
* [Layer 4](#layer-4-load-balancing)
* [Layer 7](#layer-7-load-balancing)
@@ -719,7 +719,7 @@ Workers in the application layer also help enable [asynchronism](#asynchronism).

### Microservices

-Related to this discussion are [microservices](https://en.wikipedia.org/wiki/Microservices), which can be described as a suite of independently deployable, small, modular services. Each service runs a unique process and communicates through a well-definied, lightweight mechanism to serve a business goal. <sup><a href=https://smartbear.com/learn/api-design/what-are-microservices>1</a></sup>
+Related to this discussion are [microservices](https://en.wikipedia.org/wiki/Microservices), which can be described as a suite of independently deployable, small, modular services. Each service runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business goal. <sup><a href=https://smartbear.com/learn/api-design/what-are-microservices>1</a></sup>

Pinterest, for example, could have the following microservices: user profile, follower, feed, search, photo upload, etc.

@@ -755,8 +755,8 @@ A relational database like SQL is a collection of data items organized in tables
**ACID** is a set of properties of relational database [transactions](https://en.wikipedia.org/wiki/Database_transaction).

* **Atomicity** - Each transaction is all or nothing
-* **Consistency** - Any tranaction will bring the database from one valid state to another
-* **Isolation** - Excuting transactions concurrently has the same results as if the transactions were executed serially
+* **Consistency** - Any transaction will bring the database from one valid state to another
+* **Isolation** - Executing transactions concurrently has the same results as if the transactions were executed serially
* **Durability** - Once a transaction has been committed, it will remain so
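
As a rough illustration of atomicity and consistency from the list above, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `accounts` table, the `transfer` and `balances` helpers, and the amounts are assumptions invented for this example and do not come from the README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint defines what counts as a valid state: no negative balances.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the earlier credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Alice only has 100, so the whole transfer is rejected and nothing changes.
print(transfer(conn, "alice", "bob", 150), balances(conn))
```

The credit is deliberately applied before the debit so that the failure happens mid-transaction; the rollback is what keeps Bob from holding money that never left Alice's account.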

There are many techniques to scale a relational database: **master-slave replication**, **master-master replication**, **federation**, **sharding**, **denormalization**, and **SQL tuning**.
@@ -844,7 +844,7 @@ Common ways to shard a table of users is either through the user's last name ini
##### Disadvantage(s): sharding

* You'll need to update your application logic to work with shards, which could result in complex SQL queries.
-* Data distribution can become lobsided in a shard. For example, a set of power users on a shard could result in increased load to that shard compared to others.
+* Data distribution can become lopsided in a shard. For example, a set of power users on a shard could result in increased load to that shard compared to others.
* Rebalancing adds additional complexity. A sharding function based on [consistent hashing](http://www.paperplanes.de/2011/12/9/the-magic-of-consistent-hashing.html) can reduce the amount of transferred data.
* Joining data from multiple shards is more complex.
* Sharding adds more hardware and additional complexity.
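
The rebalancing point above mentions consistent hashing. A minimal sketch of the idea, assuming a simple hash ring with virtual nodes, might look like this; the `HashRing` class, the shard names, the choice of SHA-256, and the 100 virtual nodes per shard are illustrative assumptions and are not taken from the README or the linked article.

```python
import bisect
import hashlib


class HashRing:
    """Hypothetical consistent-hash ring mapping keys to shard names."""

    def __init__(self, shards, vnodes=100):
        # Place several virtual points per shard on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def shard_for(self, key):
        # Walk clockwise to the first point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user-{i}" for i in range(10_000)]
before = HashRing(["shard-a", "shard-b", "shard-c"])
after = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])

# Only keys that now fall on the new shard's points need to be transferred.
moved = sum(before.shard_for(k) != after.shard_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding shard-d")
```

With a plain `hash(key) % shard_count` scheme, going from three shards to four would remap most keys; on the ring, every key that moves lands on the newly added shard and the rest stay where they were.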
@@ -857,7 +857,7 @@ Common ways to shard a table of users is either through the user's last name ini

#### Denormalization

-Denormalization attemps to improve read performance at the expense of some write performance. Redundant copies of the data are written in multiple tables to avoid expensive joins. Some RDBMS such as [PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) and Oracle support [materialized views](https://en.wikipedia.org/wiki/Materialized_view) which handle the work of storing redudant information and keeping redundant copies consistent.
+Denormalization attempts to improve read performance at the expense of some write performance. Redundant copies of the data are written in multiple tables to avoid expensive joins. Some RDBMS such as [PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) and Oracle support [materialized views](https://en.wikipedia.org/wiki/Materialized_view) which handle the work of storing redundant information and keeping redundant copies consistent.

Once data becomes distributed with techniques such as [federation](#federation) and [sharding](#sharding), managing joins across data centers further increases complexity. Denormalization might circumvent the need for such complex joins.

@@ -1108,7 +1108,7 @@ There are multiple levels you can cache that fall into two general categories: *
* Fully-formed serializable objects
* Fully-rendered HTML

-Generaly, you should try to avoid file-based caching, as it makes cloning and auto-scaling more difficult.
+Generally, you should try to avoid file-based caching, as it makes cloning and auto-scaling more difficult.

### Caching at the database query level

@@ -1351,7 +1351,7 @@ TCP is useful for applications that require high reliability but are less time c

Use TCP over UDP when:

-* You need all of the data to arrive in tact
+* You need all of the data to arrive intact
* You want to automatically make a best estimate use of the network throughput

### User datagram protocol (UDP)
@@ -1458,6 +1458,8 @@ REST is focused on exposing data. It minimizes the coupling between client/serv

* With REST being focused on exposing data, it might not be a good fit if resources are not naturally organized or accessed in a simple hierarchy. For example, returning all updated records from the past hour matching a particular set of events is not easily expressed as a path. With REST, it is likely to be implemented with a combination of URI path, query parameters, and possibly the request body.
* REST typically relies on a few verbs (GET, POST, PUT, DELETE, and PATCH) which sometimes doesn't fit your use case. For example, moving expired documents to the archive folder might not cleanly fit within these verbs.
+* Fetching complicated resources with nested hierarchies requires multiple round trips between the client and server to render single views, e.g. fetching content of a blog entry and the comments on that entry. For mobile applications operating in variable network conditions, these multiple roundtrips are highly undesirable.
+* Over time, more fields might be added to an API response and older clients will receive all new data fields, even those that they do not need, as a result, it bloats the payload size and leads to larger latencies.

### RPC and REST calls comparison
