Paper: Calvin: Fast Distributed Transactions for Partitioned Database Systems
Distributed transactions are costly because they use agreement protocols. Calvin says, surprisingly, that using a deterministic database allows you to avoid the use of agreement protocols. The approach is to use a deterministic transaction layer that does all the hard work before acquiring locks and the beginning of transaction execution.Overview:
Many distributed storage systems achieve high data access throughput via partitioning and replication, each system with its own advantages and tradeoffs. In order to achieve high scalability, however, today’s systems generally reduce transactional support, disallowing single transactions from spanning multiple partitions. Calvin is a practical transaction scheduling and data replication layer that uses a deterministic ordering guarantee to significantly reduce the normally prohibitive contention costs associated with distributed transactions. Unlike previous deterministic database system prototypes, Calvin supports disk-based storage, scales near-linearly on a cluster of commodity machines, and has no single point of failure. By replicating transaction inputs rather than effects, Calvin is also able to support multiple consistency levels—including Paxos based strong consistency across geographically distant replicas—at no cost to transactional throughput.
If you are interested Daniel Abadi gives a very accessible overview of Calvin in If all these new DBMS technologies are so scalable, why are Oracle and DB2 still on top of TPC-C? A roadmap to end their dominance.