This is a wonderfully informative Amazon update based on Joachim Rohde's discovery of an interview with Amazon's CTO. You'll learn how Amazon organizes its teams around services, how the CAP theorem applies to building scalable systems, how they deploy software, and a lot more. Many new additions from the ACM Queue article have also been included. Amazon grew from a tiny online bookstore to one of the largest stores on earth. They did it while pioneering new and interesting ways to rate, review, and recommend products. Greg Linden shared his version of Amazon's birth pangs in a series of blog articles. Site: http://amazon.com
I have a few Apache servers (around 11 at the moment) serving a small amount of data (around 44 GB right now). For some time I have been using rsync to keep the content identical on all servers, but the amount of data has been growing, and rsync now takes too long to "compare" all the data between source and destination and creates a lot of I/O. I have been looking at MogileFS. It seems a good and reliable option, but as the FUSE module is not finished, we would have to rewrite all our apps, and that's not an option at the moment. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features along the way, they are welcome :)
Why do I prefer a distributed file system over NAS + NFS?
- I would need 2 NAS units to avoid a single point of failure, and NAS hardware is expensive.
- Non-shared hardware: every server has its own local disks.
- As files are replicated, I can save a lot of money; RAID is not a must.
Thanks in advance for your help, and sorry for my English :)
What do you guys think/know about the scalability of the popular CMSs (like Joomla, Drupal or Typo3)? Any experience/suggestions there? I'm not sure which to pick yet... Thanks, Stephan
Dan has genuine insight into building software, and large-scale systems in particular. You'll always learn something interesting reading his blog.
A Quick Hit of What's Inside: Inverting the Reliability Stack, In Support of Non-Stop Software, Chaotic Perspectives, Latency Exists, Cope!, A Real eBay Architect Analyzes Part 3, Avoiding Two Phase Commit, Redux
Although I have a basic working knowledge of memory, SSDs, and the like, I am not technical... I have never developed or deployed a system. I was exposed to RAM disks years ago, when their expense limited their use to very small files or DB applications. I am looking to "get current" on what role memory plays in current Web 2.0 design and deployments. How is memory commonly used to remove latency and accelerate performance in typical Web 2.0 architectures? What role can memory play in massive scale-out implementations? Is there such a thing as memory "best practices"? If memory were cheap, would that significantly change the way systems are designed and deployed? Which commercial and open source products make use of memory, and what are their benefits and trade-offs? Can anyone suggest what sources (people, books, papers, products) I might look into to gain a practical understanding of this topic?
Hi there, what do you think is crucial in the code design of a scalable site? How does one prepare for web farms and clusters (e.g. in PHP)? Thanks, Stephan
Does anyone know what's behind this service? http://www.mediatemple.net/webhosting/gs/ Thanks!
Royans' scalability blog and his main blog are excellent sources of scalability information. Take a look.
A Quick Hit of What's Inside: Sharding: Different from Partitioning and Federation?, Adventures of scaling eins.de, Session, state and scalability
Theo Schlossnagle is the author of Scalable Internet Architecture and the founder of OmniTI, a global leader in Internet technology services that power the World Wide Web and email. As you might imagine, Theo frequently posts on interesting topics for the scalable website builder.