Trade-offs in implementing consistent distributed storage

Date of Completion

January 2011


Engineering, Computer|Operations Research|Computer Science




Distributed data services use replication to ensure data availability and survivability. With replication comes the challenge of guaranteeing data consistency when multiple clients access the replicas concurrently, and various consistency models have been proposed and studied. Atomicity is the strongest consistency model, providing the illusion that data is accessed sequentially. A basic problem in distributed computing is the implementation of atomic objects (registers) that support read and write operations. This thesis explores the communication costs of atomic read/write register implementations in asynchronous message-passing systems with crash-prone processors. It considers such implementations under three assumptions.

First, we consider implementations in the single writer, multiple reader (SWMR) setting. It is known that under certain restrictions on the number of readers, it is possible to obtain implementations where each read and write operation terminates after a single round-trip message exchange with a set of replicas. Such operations are called fast. This thesis removes any restrictions on the number of readers and introduces a new implementation where writes are fast, and at most one read operation performs two round-trips per write operation. Subsequently, we show that such SWMR implementations impose limitations on the number of replica failures and that multiple writer, multiple reader (MWMR) implementations with such characteristics are impossible.

Then, we consider implementations in the SWMR setting where operations access the replicated register by sending messages to predefined sets of replicas with non-empty pairwise intersections, called quorums. We show that more than one two-round read operation may be required for each write in this setting and that general quorum-based implementations are not fault-tolerant. Then we explore trading operation latency for fault-tolerance and introduce a new decision tool that enables some read operations to be fast in any general quorum construction.

Finally, we examine the latency of read and write operations in the MWMR setting. First, we study the connection between fast operations and quorum intersections in any quorum-based implementation. The decision tools introduced in the SWMR setting are then adapted to the MWMR setting to enable fast read operations. Lastly, the thesis develops a new technique leading to a near-optimal implementation that allows (but does not guarantee) both fast read and write operations in the MWMR setting.
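The quorum notion used in the abstract (sets of replicas with non-empty intersections) can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `is_quorum_system`, `majority_quorums`, and `survives` are hypothetical, and the majority construction is only one common example of a quorum system. The sketch checks the pairwise-intersection property and shows why an arbitrary quorum system may tolerate no crashes at all, for example when every quorum contains the same replica.

```python
from itertools import combinations


def is_quorum_system(quorums):
    """Return True if every quorum is non-empty and every pair of quorums intersects."""
    sets = [frozenset(q) for q in quorums]
    if not sets or any(len(q) == 0 for q in sets):
        return False
    return all(a & b for a, b in combinations(sets, 2))


def majority_quorums(replicas):
    """Illustrative construction: all subsets of size floor(n/2) + 1."""
    replicas = sorted(replicas)
    size = len(replicas) // 2 + 1
    return [frozenset(c) for c in combinations(replicas, size)]


def survives(quorums, crashed):
    """Return True if at least one quorum contains no crashed replica."""
    crashed = set(crashed)
    return any(not (set(q) & crashed) for q in quorums)


if __name__ == "__main__":
    majorities = majority_quorums(range(5))
    print(is_quorum_system(majorities))        # majorities pairwise intersect
    print(survives(majorities, {0, 1}))        # a majority of live replicas remains
    # A valid quorum system in which replica 1 belongs to every quorum:
    fragile = [{1, 2}, {1, 3}]
    print(is_quorum_system(fragile), survives(fragile, {1}))
```

In this sketch, the majority system over five replicas still has a live quorum after any two crashes, while the `fragile` system satisfies the intersection property yet loses every quorum when replica 1 crashes.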