Paper: SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine
Thursday, May 15, 2014 at 9:05AM
Todd Hoff in Paper

So how do you knit multiple datacenters and many thousands of phones and other clients into a single cooperating system?

Usually you don't. It's too hard. We see nascent attempts in services like Firebase and Parse. 

SwiftCloud, as described in SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine, goes two steps further, by leveraging Conflict free Replicated Data Types (CRDTs), which means "data can be replicated at multiple sites and be updated independently with the guarantee that all replicas converge to the same value. In a cloud environment, this allows a user to access the data center closer to the user, thus optimizing the latency for all users."

While we don't see these kind of systems just yet, they are a strong candidate for how things will work in the future, efficiently using resources at every level while supporting huge numbers of cooperating users.

Abstract:

Client-side logic and storage are increasingly used in web and mobile applications to improve response time and availability. Current approaches tend to be ad-hoc and poorly integrated with the server-side logic. We present a principled approach to integrate client and server-side storage. We support mergeable and strongly consistent transactions that target either client or server replicas and provide access to causally-consistent snapshots efficiently. In the presence of infrastructure faults, a client-assisted failover solution allows client execution to resume immediately and seamlessly access consistent snapshots without waiting. We implement this approach in SwiftCloud, the first transactional system to bring geo-replication all the way to the client machine. Example applications show that our programming model is useful across a range of application areas. Our experimental evaluation shows that SwiftCloud provides better fault tolerance and at the same time can improve both latency and throughput by up to an order of magnitude, compared to classical geo-replication techniques.

Related Articles

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.