Distributed systems in an alternative universe #15

lukego opened this issue Mar 9, 2016 · 3 comments

lukego commented Mar 9, 2016

Consider distributed system programming in this universe:

  • Computers have 8 cores, 64KB of RAM, and 256KB of SSD.
  • Computers communicate with each other via NFS access to a common file server.
  • NFS servers can be clustered in order to connect a few LANs together. (Each NFS server supports around a dozen computers.)
  • Network interfaces are high-speed: between 100 Gbps and 1 Tbps.

Do you recognize that universe? If you are into mechanical sympathy then you might, because this is a description of a normal x86 server when viewed through the lens of 90s computing:

  • The 8 "cores" are the execution ports of the Haswell CPU. That is, each x86 core internally has 8 asymmetric cores that execute micro-instructions.
  • The 64KB RAM is L1 cache and the 256KB of SSD is L2 cache.
  • The NFS server is L3 cache. This is shared storage for all of the cores with high throughput and medium latency. The protocol used to access this is not NFS but MESIF.
  • The I/O interfaces really are fast. The ones that are private to each core, like L1 and L2 cache, can consistently deliver nearly 1Tbps of throughput without any contention.
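
As a sanity check of these numbers, here is a minimal C sketch, assuming Linux and the sysfs cache hierarchy under /sys/devices/system/cpu/, that prints the size of each cache level for cpu0:

```c
/* Minimal sketch: print the size of each cache level for cpu0, using the
 * Linux sysfs layout /sys/devices/system/cpu/cpu0/cache/indexN/. */
#include <stdio.h>

static int read_field(int index, const char *field, char buf[16]) {
    char path[128];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu0/cache/index%d/%s", index, field);
    FILE *f = fopen(path, "r");
    if (!f) return 0;                    /* no such cache index: stop */
    int ok = (fscanf(f, "%15s", buf) == 1);
    fclose(f);
    return ok;
}

int main(void) {
    char level[16], type[16], size[16];
    for (int i = 0; i < 8; i++) {        /* index0..index7 is plenty */
        if (!read_field(i, "level", level) ||
            !read_field(i, "type",  type)  ||
            !read_field(i, "size",  size))
            break;
        printf("L%-2s %-12s %s\n", level, type, size);
    }
    return 0;
}
```

On a Haswell machine this should report roughly 32K instruction + 32K data L1, 256K L2, and a multi-megabyte shared L3.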

I find this a useful mental model for thinking about software performance. The way we would optimize distributed systems software for a network like this is also the way we should optimize application software running on x86 servers.

For example, considering "Are packet copies cheap or expensive?" is like comparing the performance of mv, cat, and cp over NFS. We might expect mv to be fast because the data never has to pass over the wire. How about cat and cp, though? This is complicated: you have to weigh the latency of requesting the data, the cost of the bandwidth (remembering that the network is full-duplex), the wider implications of taking a read vs. write lock on the data, and what else you plan to do with the data (cp may actually speed the application up if it copies the file onto local storage for further operations).
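
To make the mv-vs-cp comparison concrete, here is a rough microbenchmark sketch; the 1500-byte packet size and iteration count are arbitrary assumptions, and real numbers depend heavily on whether the buffers are already resident in L1/L2:

```c
/* Rough microbenchmark sketch: "mv" (hand over a pointer) vs "cp"
 * (memcpy the payload) for a packet-sized buffer.
 * Build: gcc -O2 copy_bench.c */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define PKT_SIZE 1500
#define ITERS    1000000

static unsigned char src[PKT_SIZE], dst[PKT_SIZE];

static double now(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void) {
    unsigned char * volatile p = src;  /* volatile: keep the handoff loop */
    double t0 = now();
    for (int i = 0; i < ITERS; i++)
        p = (i & 1) ? dst : src;       /* "mv": no payload bytes move */
    double t1 = now();
    for (int i = 0; i < ITERS; i++) {
        src[i % PKT_SIZE] ^= (unsigned char)i;  /* make each copy distinct */
        memcpy(dst, src, PKT_SIZE);    /* "cp": payload crosses the "wire" */
    }
    double t2 = now();
    printf("pointer handoff: %6.1f ns/pkt\n", (t1 - t0) * 1e9 / ITERS);
    printf("memcpy copy:     %6.1f ns/pkt\n", (t2 - t1) * 1e9 / ITERS);
    (void)p;
    return 0;
}
```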

Next thought: I would never try to troubleshoot network performance problems without access to basic tools like Wireshark. That is where you can see problems due to Nagle's algorithm, delayed acks, small congestion windows, zero windows, and so on. So what is the Wireshark for MESIF?
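
Sketching one partial answer, under the assumption of Linux: the kernel's perf_event_open(2) interface can count hardware cache misses around a region of code. It is more packet counter than protocol decoder, but it is a place to start:

```c
/* Minimal sketch: count hardware cache misses (on many CPUs, last-level
 * cache misses) around a region of code with Linux perf_event_open(2).
 * A counter, not a MESIF decoder, but a starting point. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

int main(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CACHE_MISSES;
    attr.disabled = 1;
    attr.exclude_kernel = 1;

    /* No glibc wrapper exists for perf_event_open; use the raw syscall.
     * Args: attr, pid=0 (this process), cpu=-1 (any), group=-1, flags=0. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0) { perror("perf_event_open"); return 1; }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    /* ... the code under investigation goes here ... */

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    long long misses = 0;
    if (read(fd, &misses, sizeof(misses)) == sizeof(misses))
        printf("cache misses: %lld\n", misses);
    close(fd);
    return 0;
}
```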


javajosh commented Mar 9, 2016

Hey! This is thought-provoking, but I wonder how you justify the analogy between asymmetric execution ports and (symmetric) CPU cores. E.g. ports 6 and 7 were introduced with Haswell and are specialized for integer math/branching and store-address generation, respectively. So how do you justify treating each execution port as a "core"? Thanks.


halayli commented Mar 10, 2016

I think Intel PCM & CMT are a good start.

https://software.intel.com/en-us/articles/intel-performance-counter-monitor

chrismatthieu commented

Interesting thought process ;)

Have you seen http://computes.io? I've been building an HPC supercomputer that can distribute JavaScript functions/operations to any/all cores on a globally distributed network. These cores can share memory via Firebase and storage via IPFS, and message each other in real time as well...
