F1 and Spanner Holistically Compared
Tuesday, October 8, 2013 at 8:45AM
Todd Hoff in Database, distributed, google

This article, F1: A Distributed SQL Database That Scales by Srihari Srinivasan, is republished with permission from a blog you really should follow: Systems We Make - Curating Complex Distributed Systems.

With both the F1 and Spanner papers out, it’s now possible to understand their interplay somewhat holistically. So let’s start by revisiting the key goals of both systems.

Key Goals of F1’s design
Spanner’s objectives
F1 – An overview
  • F1 is built on top of Spanner. Spanner offers support for features such as strong consistency through distributed transactions (2PC), global ordering based on timestamps, synchronous replication via Paxos, fault tolerance, automatic rebalancing of data, etc.
  • To this F1 adds features such as:
    • Distributed SQL queries with ability to perform joins over external data
    • Transactional consistency of indexes
    • Asynchronous schema changes
    • Uses a new ORM library
F1’s Architecture
Data Model – Hierarchical Schema
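
The clustering idea behind the hierarchical schema can be sketched in a few lines. This is an illustrative toy only; the table names follow the AdWords example in the F1 paper (Customer as the root table, with Campaign and AdGroup interleaved beneath it), and the key encoding here is a simplified stand-in for F1's real row keys. The point is that child rows sort under their root row, so reading one customer's subtree is a single contiguous range scan rather than several joins.

```python
# Toy model of F1-style hierarchical clustering (illustrative only).
# Keys are tuples; child keys extend the parent key, so sorting the
# rows interleaves each Customer's Campaigns and AdGroups under it.

rows = [
    (("Customer", 1), {"name": "acme"}),
    (("Customer", 1, "Campaign", 3), {"budget": 100}),
    (("Customer", 1, "Campaign", 3, "AdGroup", 6), {"kw": "shoes"}),
    (("Customer", 2), {"name": "globex"}),
    (("Customer", 2, "Campaign", 5), {"budget": 50}),
]

def subtree(rows, customer_id):
    """Return every row clustered under one root Customer row.

    Because keys sort hierarchically, this corresponds to one
    contiguous key-range scan in the underlying store.
    """
    prefix = ("Customer", customer_id)
    return [r for r in sorted(rows) if r[0][:2] == prefix]

# All of customer 1's data (root + campaign + ad group) in one range.
customer_1 = subtree(rows, 1)
```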

Query Processing in F1

It’s not very surprising that query processing in F1 looks remarkably similar to what we find in more recent SQL-on-Hadoop solutions such as Cloudera’s Impala and Apache Drill, as well as in their shared-nothing parallel database predecessors.

Lifecycle of a query
  • Each query has a query coordinator node, which is the node that receives the SQL query request.
  • The coordinator plans the query for execution, receives results from the penultimate execution nodes, performs any final aggregation, sorting, or filtering, and then streams the results back to the client.
  • Given that data is arbitrarily partitioned, the query planner determines the extent of parallelism needed to minimize the processing time for a query.
  • Based on the data to be processed and the scope for parallelism, the planner/optimizer may even choose to repartition the qualifying data.
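
The steps above can be sketched as a simple fan-out/merge pattern. This is a hedged illustration, not F1's actual planner: `scan_partition` and `coordinate` are hypothetical helpers, and a real coordinator would stream partial results rather than collect them. It shows the shape of the lifecycle: parallel work per partition, then a final aggregation/sort at the coordinator before results go back to the client.

```python
# Minimal sketch of the coordinator pattern described above
# (hypothetical helper names; F1's real planner is far more involved).

from concurrent.futures import ThreadPoolExecutor

def scan_partition(partition):
    # Stand-in for a remote execution node scanning and filtering its shard.
    return [row for row in partition if row["clicks"] > 0]

def coordinate(partitions):
    """Fan the query out to one worker per partition, then perform
    the final merge and sort at the coordinator."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(scan_partition, partitions)
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda r: r["clicks"], reverse=True)

partitions = [
    [{"ad": "a", "clicks": 3}, {"ad": "b", "clicks": 0}],
    [{"ad": "c", "clicks": 7}],
]
results = coordinate(partitions)
```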

Dealing with network latencies

F1’s main data store is Spanner, which is a remote data source. F1 SQL can also access other remote data sources whose access involves highly variable network latency.

The issues associated with remote data access (chiefly network latency) are mitigated through batching and pipelining across the various stages of the query life cycle. Query operators are also designed to stream as much data as possible to subsequent stages of the processing pipeline.
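
The batching-plus-streaming idea can be sketched with a generator. This is an assumption-laden toy, not F1's operator framework: `fetch_batch` is a hypothetical stand-in for a remote Spanner-style read, and batch sizes are arbitrary. The point is that the scan pays one round trip per batch of keys instead of one per row, and yields rows downstream as each batch arrives, so later pipeline stages start working before the whole scan finishes.

```python
# Hedged sketch of batching + streaming a remote scan.
# `fetch_batch` simulates one remote RPC that returns many rows at once.

def fetch_batch(keys):
    # Stand-in for a remote read; one network round trip per call.
    return [{"key": k, "value": k * 10} for k in keys]

def batched_scan(keys, batch_size=100):
    """Generator: fetches keys in batches and yields rows as each
    batch lands, so downstream operators consume a stream instead
    of waiting for the full result set."""
    for i in range(0, len(keys), batch_size):
        for row in fetch_batch(keys[i:i + batch_size]):
            yield row

# A downstream stage consumes the stream without materializing it all.
total = sum(row["value"] for row in batched_scan(range(10), batch_size=4))
```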

And finally…

The F1 system has been managing all AdWords advertising campaign data in production since early 2012. AdWords is a vast and diverse ecosystem including 100s of applications and 1000s of users, all sharing the same database. This database is over 100 TB, serves up to hundreds of thousands of requests per second, and runs SQL queries that scan tens of trillions of data rows per day. Availability reaches five nines, even in the presence of unplanned outages, and observable latency on our web applications has not increased compared to the old MySQL system. 


Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.