We started High Scalability to help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your system with confidence. Hopefully this site will move you further and faster along the learning curve of success. Please Start Here.
OK, this interview is with me on Java scalability issues. I sound like a bigger idiot than I would like, but I suppose it could have been worse. The Java World folks were very nice and did a good job, so there’s no blame on them :-)
The interview went an interesting direction, but there’s more I’d like add and I will do so here. Two major rules regarding Java and scalability have popped out at me:
The fecundity of the Java ecosystem can most readily be seen with the efforts to tame our multi-core future. There’s a multi-core crisis going in case you haven’t heard. It’s all you’ll hear mentioned in the halls of the Pentagon. The CPU wizards have maxed out on clock speed and the only way we can scale is by adding more cores. And we don't know how to do that. At 100 cores common ways of doing things break down. Locks don't scale. Cache contention for shared memory slows us down. Bandwidth on the bus is limited. TLB for managing more memory is in short supply. And we need more high speed network cards to handle faster CPUs.
And that’s just at the hardware level. It’s worse for programmers. Locking is just a nightmare. I didn't believe that at first. Early in my career I worked on several multi-core systems. I thought everything was cool. Be careful and it will all work out. But work with a group and it all goes to hell. People add functions, locks, take too much time. Problems like deadlock, priority inversion, and high latency all kill a system.
What can we do?
MySQL™ database, an open source database, delivers high performance and reliability while keeping costs low by eliminating licensing fees. The Solaris™ Cluster product is an integrated hardware and software environment that can be used to create highly-available data services. This article explains how to deploy the MySQL database in a Solaris Cluster environment. The article addresses the following topics:
* "Advantages of Deploying MySQL Database with Solaris Cluster" on page 1 discusses the benefits provided by a Solaris Cluster deployment of the MySQL database.
* "Overview of Solaris Cluster" on page 2 provides a high-level description of the hardware and software components of the Solaris Cluster.
* "Installation and Configuration" on page 8 explains the procedure for deploying the MySQL database on a Solaris Cluster.
This article assumes that readers have a basic understanding of Solaris Cluster and MySQL database installation and administration.
Reducing the costs of IT infrastructure and improving the manageability and efficiency of web services pose significant challenges for many organizations in today's economic climate. Recent studies describe the challenges IT managers face administering the proliferation of x86-based servers used to run web services applications. Those reports reveal that using large number of x86-based systems can increase space and power consumption, as well as cost and asset management overhead. In addition, many of these x86-based systems run a mixture of operating system and application software leading to increased management complexity and potential security concerns.
Faced with these challenges, many organizations are attracted by the idea of consolidating web and application services from multiple x86-based servers to a smaller number of high-performance servers. This approach strives to help simplify management, improve performance, and increase the efficiency of delivering web services. The combined capabilities of the Sun Fire T1000 server and Solaris Containers technology in particular offer significant promise as a web-tier consolidation platform. The Sun Fire T1000 server offers high aggregate throughput performance in a small, power-efficient footprint. Solaris containers provide a complete, isolated, and secure runtime environment for applications, enabling multiple web servers to run safely and efficiently on the same platform.
This paper explores the configuration and testing of the Sun Fire T1000 server as a web-tier consolidation platform. It discusses methodologies used to consolidate multiple web servers onto a single Sun Fire T1000 server, and explains the steps used to configure the Solaris Containers. In addition, to determine the effectiveness of this approach, testing was performed to evaluate the consolidated Sun Fire T1000 system against a baseline configuration of current Xeon servers, a popular choice as web server platform.
As individuals and businesses depend on the Web more than ever to conduct business, rapid and reliable content retrieval is critical. Reducing wait time improves productivity and increases user satisfaction. Web proxy technology has emerged as an effective solution to improve performance, help ensure content availability and enhance network security by caching and filtering Web content. The combination of Sun SPARC Enterprise servers with CoolThreads technology and the Sun Java System Web Proxy Server software provides a compelling foundation for a robust Web proxy solution. Sun SPARC Enterprise T1000 and T2000 servers include the UltraSPARC T1 processor with CoolThreads technology, offering six or eight cores with four threads per core. The Sun Java System Web Proxy Server software is highly threaded and takes advantage of the large number of threads supported by Sun UltraSPARC T1 processors with CoolThreads technology. Together, these products provide a highly scalable solution that accommodates a large number of requests, addresses peak loads, and provides future headroom for growth. This document explores the use of a Sun SPARC Enterprise T1000 server and the Sun Java System Web Proxy Server software as a replacement for an existing Web proxy implementation that used the SQUID Web proxy server software deployed on x86 servers.
With the explosive growth of the Internet, increasing complexity of user requirements, and wide choice of hardware, operating systems, and middleware, IT executives are facing new challenges in their application infrastructures. Rapid expansion of the application tier has resulted in significant cost and complexity, and many organizations are simply running out of datacenter space, power, and cooling.
It is widely recognized that MySQL is the most popular database software in the world. Since its inception in 1995, there have been 11 million product installations around the world in a wide variety of markets. There are more installations of MySQL in use today than any other database architecture. From startup companies hoping to be the next Web2.0 poster child to large global enterprises, the MySQL database architecture has proven to be flexible, extendable, scalable, and more than capable of filling high-capacity database roles in very different venues.
Sun FireTM X4540 Server as Backup Server for Zmanda's Amanda Enterprise 2.6 Software
by Thomas Hanvey (Sun Microsystems) and Dmitri Joukovski and Ken Crandall (Zmanda)
September, 2008
Explosive data growth, combined with demanding requirements for data availability, has placed a tremendous burden on IT operations staff at businesses of all sizes. Yet, many organizations do not have the staff or budget to purchase and manage complex and expensive backup and recovery software products.
The Sun FireTM X4540 server can deliver massive storage capacity and remarkable throughput so it is well-suited as a nearline storage platform for backup and restore applications. Combining the power of the SolarisTM 10 Operating System with the data integrity and simplified administration of ZFS, the Sun Fire X4540 server can be an ideal candidate for streamlining and improving backup and restore operations.
Amanda Enterprise Edition from Zmanda was designed to address these challenges, providing a backup and recovery solution that combines fast installation, simplified management, enterprise-class functionality, and low-cost subscription fees. As an open source product, Amanda Enterprise Edition uses only standard formats and tools, effectively freeing you from being locked into a vendor to recover your archived data. This guide discusses how to quickly configure the Sun Fire X4540 server as a backup server for Amanda Enterprise
Edition software.
With more users interacting, working, purchasing, and communicating over the network than ever before, Web 2.0 infrastructure is taking center stage in many organizations. Demand is rising, and companies are looking for ways to tackle the performance and scalability needs placed on Web infrastructure without raising IT operational expenses. Today companies are turning to efficient, high-performance, open source solutions as a way to decrease acquisition, licensing, and other ongoing costs and stay within budget constraints.
Update 34: Apparently Amazon pulled this article. I'm not sure what that means. Maybe time went backwards or something? Amazon dramatically drops SimpleDB pricing to $0.25 per GB per month from $1.50 per GB. This puts SimpleDB on par with Google App Engine. They also announced a few new features: a SQL-like SELECT API as well as a Batch Put operation to streamline uploading of multiple items or attributes. One of the complaints against SimpleDB is that programmers end up writing too much code to do simple things. These features and a much cheaper price should help considerably. And you can store lots of data now. GAE is still capped.
Update 33: Amazon announces Elastic Block Store (EBS), which provides lots of normal looking disk along with value added features like snapshots and snapshot copying. But database's may find EBS too slow. RightScale tells us Why Amazon’s Elastic Block Store Matters.
Update 32: You can now get all attributes for a property when querying. Previously only the ID was returned and the attributes had to be returned in separate calls. This makes the programmer's job a lot simpler. Artificial levels of parallelization code can now be dumped.
Update 31: Amazon fixes a major hole in SimpleDB by adding the ability to sort query results. Previously developers had to sort results by hand which was a non-starter for many. Now you can do basic top 10 type queries with ease.
Update 30: Amazon SimpleDB - A distributed, highly-scalable, light-weight, query-able, attribute store by Sebastian Stadil. It introduces the CAP theorem and the basics of SimpleDB. Sebastian does a lot of great work in the AWS world and in what must be his limited free time, runs the AWS Meetup group.
Scalability Perspectives is a series of posts that highlights the ideas that will shape the next decade of IT architecture. Each post is dedicated to a thought leader of the information age and his vision of the future. Be warned though – the journey into the minds and perspectives of these people requires an open mind.
Marc Andreessen is known as an internet pioneer, entrepreneur, investor, startup coach, blogger, and a multi-millionaire software engineer best known as co-author of Mosaic, the first widely-used web browser, and founder of Netscape Communications Corporation. He was the chair of Opsware, a software company he founded originally as Loudcloud, when it was acquired by Hewlett-Packard. He is also a co-founder of Ning, a company which provides a platform for social-networking websites. He has recently joined the Board of Directors of Facebook and eBay.
Marc is an investor in several startups including Digg, Metaplace, Plazes, Qik, and Twitter. His passion is to create new technologies, to start new companies, and to scale them up.
From Marc's Blog Post on the Three Kinds of Platforms:
One of the hottest of hot topics these days is the topic of Internet platforms, or platforms on the Internet. However, the concept of "platform" is also the focus of a swirling vortex of confusion -- lots of platform-related concepts, many of them highly technical, bleeding together; lots of people harboring various incompatible mental images of what's about to happen in our industry as a consequence of various platforms. I think this confusion is due in part to the term "platform" being overloaded and being used to mean many different things, and in part because there truly are a lot of moving parts at play that intersect in fascinating but complex ways.
Marc attempts to disentangle and examine the topic of "Internet platform" in detail. He has identified three distinct approaches to providing an Internet platform and shows us where each of the three approaches could go.

In Log Everything All the Time I advocate applications shouldn't bother logging at all. Why waste all that time and code? No, wait, that's not right. I preach logging everything all the time. Doh. Facebook obviously feels similarly which is why they opened sourced Scribe, their internal logging system, capable of logging 10s of billions of messages per day. These messages include access logs, performance statistics, actions that went to News Feed, and many others.
Imagine hundreds of thousands of machines across many geographical dispersed datacenters just aching to send their precious log payload to the central repository off all knowledge. Because really, when you combine all the meta data with all the events you pretty much have a complete picture of your operations. Once in the central repository logs can be scanned, indexed, summarized, aggregated, refactored, diced, data cubed, and mined for every scrap of potentially useful information.
Just imagine the log stream from all of Facebook's Apache servers alone. Brutal. My guess is these are not real-time feeds so there are no streaming query issues, but the task is still daunting. Let's say they log 10 billion messages a day. That's over 1 million messages per second!
When no off the shelf products worked for them they built their own. Scribe can be downloaded from Sourceforge. But the real action is on their wiki. It's here you'll find some decent documentation and their support forums. Not much activity on the site so you haven't missed your chance to be a charter member of the Scribe guild.
A logging system has three broad components:
Update 2: Sorting 1 PB with MapReduce. PB is not peanut-butter-and-jelly misspelled. It's 1 petabyte or 1000 terabytes or 1,000,000 gigabytes. It took six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers and the results were replicated thrice on 48,000 disks.
Update: Greg Linden points to a new Google article MapReduce: simplified data processing on large clusters. Some interesting stats: 100k MapReduce jobs are executed each day; more than 20 petabytes of data are processed per day; more than 10k MapReduce programs have been implemented; machines are dual processor with gigabit ethernet and 4-8 GB of memory.
Google is the King of scalability. Everyone knows Google for their large, sophisticated, and fast searching, but they don't just shine in search. Their platform approach to building scalable applications allows them to roll out internet scale applications at an alarmingly high competition crushing rate. Their goal is always to build a higher performing higher scaling infrastructure to support their products. How do they do that?
Scalability Perspectives is a series of posts that highlights the ideas that will shape the next decade of IT architecture. Each post is dedicated to a thought leader of the information age and his vision of the future. Be warned though – the journey into the minds and perspectives of these people requires an open mind.
Van Jacobson is a Research Fellow at PARC. Prior to that he was Chief Scientist and co-founder of Packet Design. Prior to that he was Chief Scientist at Cisco. Prior to that he was head of the Network Research group at Lawrence Berkeley National Laboratory. He's been studying networking since 1969. He still hopes that someday something will start to make sense.
As the Internet is being overrun with video traffic, many wonder if it can survive. With challenges being thrown down over the imbalances that have been created and their impact on the viability of monopolistic business models, the Internet is under constant scrutiny. Will it survive? Or will it succumb to the burden of the billion plus community that is constantly demanding more and more?
Does the Net Need an Upgrade? To answer this question a distinguished panel of Van Jacobson, Rick Hutley, Norman Lewis, David S. Isenberg has discussed the issue on the Supernova conference. In this compelling debate available on IT Conversations, the panel addresses the question and provides some differing perspectives. One of the perspectives is Content-based networking described by Van Jacobson.
Update 7: Where Amazon’s Data Centers Are Located, Expanding the Cloud: Amazon CloudFront. Why Amazon's CDN Offering Is No Threat To Akamai, Limelight or CDN Pricing. Amazon has launched their CDN with "“low latency, high data transfer speeds, and no commitments.” The perfect relationship for many. The majority of the locations are in North America, but some are in Europe and Asia.
Update 6: Amazon Launching New Content Delivery Network: No Threat To Major CDNs, Yet. All the Amazon will kill all other CDNs is a bit overblown. As usual Dan Rayburn sets us straight: The offering won't support streaming, live broadcasting, or provide many of the other products and services that video content owners need...the real story here is that Amazon is going to offer a high performance method of distributing content with low latency and high data transfer rates.
Update 5: When It Comes To Content Delivery Networks, What Is The "Edge"?. Dan Rayburn is on edge about the misuse of the term edge: closest location to the user does not guarantee quality, often content is not delivered from the closest location, all content is not replicated at every "edge" location. Lots of other essential information.
Update 4: David Cancel runs a great test to see if you should be Using Amazon S3 as a CDN?. Conclusion: "CacheFly performed the best but only slightly better than EdgeCast. The S3 option was the worst with the Nginx/DIY option performing just over 100 ms faster." Also take look at Part 2 - Cacheability?
Update 3: Mr. Rayburn takes A Detailed Look At Akamai's Application Delivery Product . They create a "bi-nodal overlay network" where users and servers are always within 5 to 10 milliseconds of each other. Your data center hosted app can't compete. The problem is that people (that is, me) can understand the data center model. I don't yet understand how applications as a CDN will work.
Update 2: Dan Rayburn starts an interesting series of articles on Highlights Of My Day In Cambridge With Akamai. Akamai is moving strong into the application distribution business. That would make an interesting cloud alternative..
Update: Streamingmedia links to new CDN DF Splash that specializes in instant-on TV-quality video streaming.
A question was raised on the forum asking for a CDN recommendation. As usual there are no definitive answers, but here are three useful articles that may help your deliberations.
Lots and lots of good stuff to learn, even if you didn't roll out of bed this morning pondering the deeper mysteries of content delivery networks and the Canadian dollar.

Rich Wolski, professor of Computer Science at the University of California, Santa Barbara, gave a spirited talk on Eucalyptus to a large group of very interested cloudsters at the Eucalyptus Cloud Meetup. If Rich could teach computer science at every school the state of the computer science industry would be stratospheric. Rich is dynamic, smart, passionate, and visionary. It's that vision that prompted him to create Eucalyptus in the first place. Rich and his group are experts in grid and distributed computing, having a long and glorious history in that space. When he saw cloud computing on the rise he decided the best way to explore it was to implement what everyone accepted as a real cloud, Amazon's API. In a remarkably short time they implement Eucalyptus and have been improving it and tracking Amazon's changes ever since.
The question I had going into the meetup was: should Eucalyptus be used to make an organization's private cloud? The short answer is no.
The project is of high quality, the people are of the highest quality, but in the end Eucalyptus is a research project from a university. As an academic project Eucalyptus is subject to changes in funding and the research interests of the team. When funding sources dry up so does the project. If the team finds another research area more interesting, or if they get tired of chasing a continuous stream of new Amazon features, or no new grad students sign on, which will happen in a few years, then the project goes dark.
Fears over continuity have at least two solutions: community support and commercial support. Eucalyptus could become community supported open source project. This is unlikely to happen though as it conflicts with the research intent of Eucalyptus. The Eucalyptus team plans to control the core for research purposes and encourage external development of add-on service like SQS. Eucalyptus won't go commercial as University projects must stay clear from commercial pretensions. Amazon is "no comment" on Eucalyptus so it's not clear what they would think of commercial development should it occur.
Taken together these concerns imply Eucalyptus is not a good base for an enterprise quality private cloud. Which they readily admit. It's not enterprise ready Rich repeats. It's not that the quality isn't there. It is and will be. And some will certainly base their private cloud on Eucalyptus, but when making a decision of this type you have to be sure your cloud infrastructure will be around for the long haul. With Eucalyptus that is not necessarily the case. Eucalyptus is still a good choice for it's original research purpose, or as cheap staging platform for Amazon, or as base for temporary clouds, but as your rock solid private cloud infrastructure of the future Eucalyptus isn't the answer.
The long answer is a little more nuanced and interesting.
Data centers are reshaping themselves by taking ideas from public cloud providers, such as Amazon and Google. The idea is to make the data center more cost-effective by enabling on-demand utility-based computing rather than dedicated machines. At the same time, it is clear that to make IT operations more effective, it doesn't make sense to run all the applications that are currently hosted in a company's data center in the private cloud. This calls for an integration between private and public cloud. In this post i discuss some of the challenges involved in making that happen:
1. How do we design applications to be cloud-agnostic?
2. How do we enable seamless fail-over to a public cloud?
3. Future-proofing: There are many cases in which we can't make a clear decision as to where our application should be running at the time of writing or developing the application. We would like to be in a position to change the decision as to where our application will be running even after our application has been completely developed.
Update 2: Overcast: Conversations on Cloud Computing. Listened to the first two podcasts and they're doing a great job. Worth a look. The singing and dance routines are way over the top however :-)
Update: 9 Sources of Cloud Computing News You May Not Know About by James Urquhart. I folded in these recommendations.
Can't get enough cloud computing? Then you must really be a glutton for punishment! But just in case, here are some cloud computing resources, collected from various sources, that will help you transform into a Tesla silently flying solo down the diamond lane.
Yahoo has developed a new language called Pig Latin that fit in a sweet spot between high-level declarative querying in the spirit of SQL, and low-level, procedural programming `a la map-reduce and combines best of both worlds.
The accompanying system, Pig, is fully implemented, and compiles Pig Latin into physical plans that are executed over Hadoop, an open-source, map-reduce implementation. Pig has just graduated from the Apache Incubator and joined Hadoop as a subproject.
The paper has a few examples of how engineers at Yahoo! are using Pig to dramatically reduce the time required for the development and execution of their data analysis tasks, compared to
using Hadoop directly.
References: Apache Pig Wiki