Distributed And Cloud-Based Storage Systems
TTh 2:00pm - 3:15pm, CSIC 3118

The guiding philosophy of this course is that the best way to learn about real systems is to build one. We will gain an in-depth understanding of the issues involved in designing and deploying large-scale distributed file systems. In the course of this investigation we will be tackling a variety of topics, such as peer-to-peer systems, remote procedure calls, multi-threading, consensus protocols, cloud systems, layered systems (supporting high-level consistency guarantees on top of cloud services), and security as it relates to such systems.



Pete Keleher <keleher@cs.umd.edu> (include "828" in all correspondence)
Office hours: M 1-3pm and by appt. in AVW 4157


The class will consist of lectures by the instructor, student project presentations, a midterm and a final, and a series of probably five programming projects, all in Go (fear not if you don't know any Go; we'll all be learning together). The end goal is to have built a full-scale, reliable, highly available, and secure distributed file system, using both local disks and cloud services as backing stores. My lectures will be split between those describing the tools we will use to build our file systems and those based on recent research in the literature (such as papers from FAST 2014, SOSP 2013, OSDI 2012, and USENIX ATC 2014).

Examples of technologies we may use include FUSE (and MacFUSE), key-value stores like Bolt, gkvlite, diskv, or leveldb-go, the Amazon Simple Storage Service (and its Go binding), Google's Protocol Buffers or JSON (from Go), Google's Go language, Paxos, SQLite, Snappy, and Apple's development kit for the iPad.

Office hours: after class in my office (4157 A.V. Williams).

Note that the following set of papers is only a placeholder: more will come, some will go away.

Note: this list is out of date and will be updated by the first day of class. The papers will continue to be mutable until the week before any given class day, so please continue to check.

Tuesday Thursday
Aug 29 Intro
Aug 31 Intro/Go
Reading: (notes)

Sep 5 "The Design and Implementation of a Log-Structured File System"


"A Low-bandwidth Network File System"

Sep 7 "Deciding when to forget in the Elephant file system"

"The Google File System" (background, no blog: "GFS: Evolution on Fast-forward")
Project 1 due midnight Sunday

Sep 12 "Semantic File Systems"

"A Logic File System"

Sep 14 "Replication, history, and grafting in the Ori file system"

"The Design and Implementation of the Warp Transactional Filesystem"

Sep 19 "To FUSE or Not to FUSE: Performance of User-Space File Systems"

"Operating System Support for Planetary-Scale Network Services"

Consensus and Byzantine Consensus

Sep 21 Global system event orderings.


Sep 26 "Algorithms and Data Structures for Efficient Free Space Reclamation in WAFL"

"Knockoff: Cheap Versions in the Cloud"

Sep 28 "Application Crash Consistency and Performance with CCFS"

"Eventually Consistent"

Optional, no blog: "Camlistore is your personal storage system for life"
Project 2 due midnight Sunday

Oct 3 No Class
Oct 5 "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System"


"Flexible update propagation for weakly consistent replication"


"Session Guarantees for Weakly Consistent Replicated Data"

Oct 10 "Quantifying Eventual Consistency with PBS"

 "Bolt-on causal consistency"

Oct 12 Databases. No reading
Oct 17 More Databases. No reading
(see above slides)
Oct 19 "Existential Consistency: Measuring and Understanding Consistency at Facebook" - guowei (here)

"f4: Facebook’s Warm BLOB Storage System"

Project 3 due midnight Sunday

Oct 24 Raft, in Theory and in your Laptops
 "In search of an understandable consensus algorithm"
Oct 26 "Salt: Combining ACID and BASE in a Distributed Database"

"Highly Available Transactions: Virtues and Limitations"

Oct 31 "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications" - jackie (slides)

"FAWN: A fast array of wimpy nodes"

Nov 2 "OceanStore: An Architecture for Global-Scale Persistent Storage" - greg (slides)


"Scalable Consistency in Scatter"

Nov 7 "Sinfonia: A New Paradigm for Building Scalable Distributed Systems" - daniel (here)

"Dynamo: Amazon's highly available key-value store" - colin (slides)

Nov 9

"Spanner: Google's Globally-Distributed Database" - allen

Nov 14 No Class
Nov 16 "Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS" - yusef (here)

"Ambry: LinkedIn’s Scalable Geo-Distributed Object Store" - sankha (here)
Project 4a due Sunday Night

Nov 21 "Implementing Linearizability at Large Scale and Low Latency"

"CalvinFS: Consistent WAN Replication and Scalable Metadata Management for Distributed File Systems"


Nov 23 Thanksgiving
Nov 28 "CORFU: A Shared Log Design for Flash Clusters."

"Tango: Distributed data structures over a shared log"


Nov 30 Fault Tolerance and Security
Notes on fault tolerance in distributed, unreliable, asynchronous environments.

"Practical Byzantine Fault Tolerance" - zach

"Transactional storage for geo-replicated systems"

Dec 5 "The case for RAMCloud"

"Fast crash recovery in RAMCloud"

Dec 7 Final Exam due Dec 16 at 12:30pm.
Project 5 is due midnight Sunday

Late Policies

All projects will have a due date, and a late due date two days later.
  • Do each project by yourself. Sadly, we can and do detect and fail those that do not abide by this policy each semester. You may ask, and answer, general questions on Piazza.
  • Your grade loses 20% of the max score if the project is turned in after the due date but by the late due date. Anything turned in after the late due date receives a zero.
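
As a rough illustration of the late policy above, here is a minimal Go sketch. The names Status, classify, and applyLatePolicy are hypothetical, the due date is made up, "two days" is taken to mean exactly 48 hours, and clamping a late score at zero is an assumption rather than something the policy states.

```go
package main

import (
	"fmt"
	"time"
)

// Status says where a submission falls relative to the two deadlines.
type Status int

const (
	OnTime  Status = iota // at or before the due date
	Late                  // after the due date, but by the late due date
	TooLate               // after the late due date
)

// lateWindow is the gap between the due date and the late due date.
// Assumption: "two days" means exactly 48 hours.
const lateWindow = 48 * time.Hour

// classify compares a submission time against the due date.
func classify(due, submitted time.Time) Status {
	switch {
	case !submitted.After(due):
		return OnTime
	case !submitted.After(due.Add(lateWindow)):
		return Late
	default:
		return TooLate
	}
}

// applyLatePolicy returns the recorded score for a project.
// Assumption: a late score is clamped at zero rather than going negative.
func applyLatePolicy(raw, maxScore float64, s Status) float64 {
	switch s {
	case OnTime:
		return raw
	case Late:
		adjusted := raw - 0.20*maxScore // lose 20% of the max score
		if adjusted < 0 {
			return 0
		}
		return adjusted
	default:
		return 0
	}
}

func main() {
	// Hypothetical due date, chosen only for illustration.
	due := time.Date(2030, time.January, 6, 23, 59, 59, 0, time.UTC)
	for _, delay := range []time.Duration{-time.Hour, 24 * time.Hour, 72 * time.Hour} {
		s := classify(due, due.Add(delay))
		fmt.Printf("delay %v: score %.1f\n", delay, applyLatePolicy(90, 100, s))
	}
}
```

Note that the penalty is taken from the maximum score, not from the score earned, so a project worth 100 points that earns 90 would be recorded as 70 if turned in one day late.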

Attendance and general grading policies

Students are responsible for all material covered, and all announcements, deadlines, policies, etc., discussed in lecture and discussion section, regardless of whether they were in class to hear the information or not. It’s understood that students may occasionally have to miss class for various reasons, but email and office hours are not intended as a replacement for class attendance. Consequently, only students who typically and regularly attend class will receive assistance during office hours.

Coursework will count toward the final grade according to the following percentages:

  1. Projects: 70%
    • There will be five to six projects, each with approximately a two-week time window.
    • Projects will be weighted equally.
    • All projects must get at least half credit to pass the course.
  2. Blog entries: 5%
    • You are required to upload a blog entry before each class except the first. More details in class.
  3. Midterm/Final: 25%
    • There will be one test, timing TBD.
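
To make the weighting above concrete, here is a small Go sketch of how the pieces might combine. The function name courseScore and the sample scores are hypothetical. It also reads "at least half credit" as each project scoring 50% or more, and treats an empty project list as not meeting that requirement; both are assumptions.

```go
package main

import "fmt"

// Weights from the grading breakdown.
const (
	projectWeight = 0.70
	blogWeight    = 0.05
	examWeight    = 0.25
)

// courseScore combines component scores, each given as a percentage (0-100).
// Projects are averaged with equal weight. The second result reports whether
// every project earned at least half credit (assumed to mean >= 50).
func courseScore(projects []float64, blog, exam float64) (float64, bool) {
	if len(projects) == 0 {
		// Assumption: with no projects, the half-credit requirement is not met.
		return blogWeight*blog + examWeight*exam, false
	}
	sum := 0.0
	allHalf := true
	for _, p := range projects {
		sum += p
		if p < 50 {
			allHalf = false
		}
	}
	mean := sum / float64(len(projects))
	return projectWeight*mean + blogWeight*blog + examWeight*exam, allHalf
}

func main() {
	// Hypothetical scores, for illustration only.
	score, ok := courseScore([]float64{90, 80, 100, 70, 60}, 100, 80)
	fmt.Printf("%.1f %v\n", score, ok) // prints "81.0 true"
}
```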


Final exam: TBD

Academic integrity

The Campus Senate has adopted a policy asking students to include the following statement on each examination or assignment in every course: “I pledge on my honor that I have not given or received any unauthorized assistance on this examination (or assignment).” Consequently, you will be requested to include this pledge on each exam and project. You may review the University’s Code of Academic Integrity for yourself at