Behind the Scenes of Google Scalability

The recent Data-Intensive Computing Symposium brought together experts in system design, programming, parallel algorithms, data management, scientific applications, and information-based applications to better understand existing capabilities in the development and application of large-scale computing systems, and to explore future opportunities.

Google Fellow Jeff Dean gave a very interesting presentation, "Handling Large Datasets at Google: Current Systems and Future Directions." He discussed:

• Hardware infrastructure
• Distributed systems infrastructure:
  – Scheduling system
  – GFS
  – BigTable
  – MapReduce
• Challenges and future directions:
  – Infrastructure that spans all datacenters
  – More automation

It is essentially a "How does Google work" presentation in about 60 slides.

Check out the slides and the video!