Following the tweet above, I’ve decided to do a thread dump of my favorite computer science papers.
This is not a “you should read these papers” kind of post; it’s a curated list of great computer science papers that I’ve enjoyed reading and re-reading over the past years.
(I think you should read them as well!)
💡 You’ll learn about a technique called a log-structured file system that writes all modifications to disk sequentially, thereby speeding up both file writing and crash recovery.
💡 You’ll learn about a disk-based index structure called the B-Tree and its different variations. The paper does quite a good job of explaining why they have been so successful over the years.
💡 You’ll continue to learn about low-cost indexing for a file experiencing a high rate of record inserts over an extended period. The paper also provides a nice comparison of LSM-tree and B-tree I/O costs.
💡 You’ll learn about log processing, Kafka’s architecture, and design principles including producers, brokers, and consumers.
💡 You’ll learn about the ZooKeeper wait-free coordination kernel and a lot of distributed systems concepts that are nicely described in the paper.
💡 You’ll learn about one-way functions, the Lamport-Diffie one-time signature, and a new “tree-signature”, also known as a Merkle tree.
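To make the tree idea concrete, here is a minimal sketch of computing a Merkle root over a list of byte strings. It is not the paper's construction: SHA-256 stands in for the one-way function, the names `h` and `merkle_root` are made up for illustration, and duplicating the last node on odd-sized levels is just one common convention.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One-way function stand-in (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Hash the leaves, then hash pairs upward until one node is left."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: duplicate the last one (a convention,
            # assumed here so every node has a sibling).
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"a", b"b", b"c", b"d"])
tampered = merkle_root([b"a", b"b", b"X", b"d"])
print(root != tampered)  # changing any leaf changes the root
```

The root is a single short value that commits to every leaf, which is why changing any one of them is detectable.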
💡 Leslie Lamport’s most cited paper. You’ll learn about logical clocks, real-time synchronization, and concepts such as “total ordering” and “happened-before”.
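A logical clock fits in a few lines. The sketch below follows the two usual rules (advance the counter on every local event or send, and on receipt jump past the sender's timestamp). The class and method names are my own, not the paper's notation.

```python
class LamportClock:
    """Logical clock for a single process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or message send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, sent_time):
        # Message receipt: move past both our own time and the sender's.
        self.time = max(self.time, sent_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t_send = a.tick()            # a sends a message stamped with its clock
b.tick()                     # b does some unrelated local work
b.tick()
t_recv = b.receive(t_send)   # the receive is ordered after the send
print(t_send < t_recv)
```

If the send happened before the receive, the send's timestamp is guaranteed to be smaller. Breaking ties between equal timestamps with a process identifier, for example by comparing `(time, process_id)` pairs, is one way to extend this into a total order.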
💡 You’ll learn about strategies for improving a system’s overall availability while tolerating some kind of graceful degradation.
💡 You’ll learn about reliability in computer systems, and how a system can cope with the failure of one or more of its components.
💡 You’ll learn about a strong correctness condition for concurrent objects that guarantees a strict time ordering of read and write operations in a multi-threaded environment.
💡 You’ll learn about a data structure that makes the eventual consistency of a distributed object possible without coordination between replicas.
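As a rough illustration, here is one of the simplest such structures, a grow-only counter. Each replica increments only its own slot, and merging takes the per-replica maximum, so replicas converge regardless of the order in which they exchange state. The `GCounter` class and its method names are hypothetical and not taken from the paper.

```python
class GCounter:
    """Grow-only counter: a minimal state-based CRDT sketch."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, n=1):
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so merges can arrive in any order and be repeated safely.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment()
a.merge(b)
b.merge(a)
print(a.value() == b.value())  # both replicas agree after exchanging state
```

Note that the replicas never lock or vote here; they only ship state to each other and merge.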
💡 You’ll learn about an optimization made to state-based CRDTs that ensures convergence by disseminating only recently applied changes, instead of the entire (possibly large) state.
💡 You’ll learn about Erlang, concurrent programming, message passing, fault tolerance, and the concept of “let it crash”.
Looking for more papers?
These are my favorites.
I’m surely missing a few papers.
You can still find a lot of curated papers to read at @papers_we_love, @intensivedata, and @therealdatabass.
