Update: InfoQ links to a few excellent Eucalyptus updates: Velocity Conference Video by Rich Wolski and a Visualization.com interview Rich Wolski on Eucalyptus: Open Source Cloud Computing. Eucalyptus is generating some excitement on the Cloud Computing group as a potential vendor neutral EC2 compatible cloud platform. Two reasons why Eucalyptus is potentially important: private clouds and cloud portability: Private clouds. Let's say you want a cloud like infrastructure for architectural purposes but you want it to run on your own hardware in your own secure environment. How would you do this today? Hm.... Cloud portability. With the number of cloud offerings increasing how can you maintain some level of vendor neutrality among this "swarm" of different options? Portability is a key capability for cloud customers as the only real power customers have is in where they take their business and the only way you can change suppliers is if there's a ready market of fungible services. And the only way their can be a market is if there's a high degree of standardization. What should you standardize on? The options are usually to form a great committee and take many years to spec out something that doesn't exist, nobody will build, and will never really work. Or have each application create a high enough layer interface that portability is potentially difficult, but possible. Or you can take a popular existing API, make it the general API, and everyone else is accommodated using an adapter layer and the necessary special glue to take advantage of value add features for each cloud. With great foresight Eucalyptus has chosen to create a cloud platform based on Amazon's EC2. As this is the most successful cloud platform it makes a lot of sense to use it as a model. We see something similar with the attempts to port Google AppEngine to EC2 thus making GAE a standard framework for web apps. So developers would see GAE on top of EC2. A lot of code would be portable between clouds using this approach. Even better would be to add ideas in from RightScale, 3Tera, and Mosso to get a higher level view of the cloud, but that's getting ahead of the game. Just what is Eucalyptus? From their website: Overview ¶ Elastic Computing, Utility Computing, and Cloud Computing are (possibly synonymous) terms referring to a popular SLA-based computing paradigm that allows users to "rent" Internet-accessible computing capacity on a for-fee basis. While a number of commercial enterprises currently offer Elastic/Utility/Cloud hosting services and several proprietary software systems exist for deploying and maintaining a computing Cloud, standards-based open-source systems have been few and far between. EUCALYPTUS -- Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems -- is an open-source software infrastructure for implementing Elastic/Utility/Cloud computing using computing clusters and/or workstation farms. The current interface to EUCALYPTUS is interface-compatible with Amazon.com's EC2 (arguably the most commercially successful Cloud computing service), but the infrastructure is designed to be modified and extended so that multiple client-side interfaces can be supported. In addition, EUCALYPTUS is implemented using commonly-available Linux tools and basic web service technology making it easy to install and maintain. Overall, the goal of the EUCALYPTUS project is to foster community research and development of Elastic/Utility/Cloud service implementation technologies, resource allocation strategies, service level agreement (SLA) mechanisms and policies, and usage models. The current release is version 1.0 and it includes the following features: * Interface compatibility with EC2 * Simple installation and deployment using Rocks cluster-management tools * Simple set of extensible cloud allocation policies * Overlay functionality requiring no modification to the target Linux environment * Basic "Cloud Administrator" tools for system management and user accounting * The ability to configure multiple clusters, each with private internal network addresses, into a single Cloud. The initial version of EUCALYPTUS requires Xen to be installed on all nodes that can be allocated, but no modifications to the "dom0" installation or to the hypervisor itself. For more discussion see:
A report from the CloudCamp conference on cloud computing, held in London in July 2008.
Given S3's recent failure (Cloud Status tells the tale) Kevin Burton makes the excellent suggestion of fronting S3 with a caching proxy server. A caching proxy server can reply to service requests without contacting the specified server, by retrieving content saved from a previous request, made by the same client or even other clients. This is called caching. Caching proxies keep local copies of frequently requested resources. In normal operation when an asset (a user's avatar, for example) is requested the cache is tried first. If the asset is found in the cache then it's returned. If the asset is not in the cache it's retrieved from S3 (or wherever) and cached. So when S3 goes down it's likely you can ride out the down time by serving assets out of the cache. This strategy only works when using S3 as a CDN. If you are using S3 for its "real" purpose, as a storage service, then a caching proxy can't help you... Amazon doesn't used S3 as a CDN either Amazon Not Building Out AWS To Compete With CDNs. They use Limelight Networks. Some proxy options are: Squid, Nginx, Varnish. Planaroo shares how a small startup responds to an S3 outage (summarized):
Robert Scoble in an often poignant FriendFeed thread commiserating PodTech's unfortunate end, shared what he learned about creating a successful startup. Here's a summary of a Robert's rules and why Machiavelli just may agree with them:
Jeff Atwood started a barn burner of a conversation in Maybe Normalizing Isn't Normal on how to create a fast scalable tagging system. Jeff eventually asks that terrible question: which is better -- a normalized database, or a denormalized database? And all hell breaks loose. I know, it's hard to imagine database debates becoming contentious, but it does happen :-) It's lucky developers don't have temporal power or rivers of blood would flow. Here are a few of the pithier points (summarized):
ZooKeeper is a high available and reliable coordination system. Distributed applications use ZooKeeper to store and mediate updates key configuration information. ZooKeeper can be used for leader election, group membership, and configuration maintenance. In addition ZooKeeper can be used for event notification, locking, and as a priority queue mechanism. It's a sort of central nervous system for distributed systems where the role of the brain is played by the coordination service, axons are the network, processes are the monitored and controlled body parts, and events are the hormones and neurotransmitters used for messaging. Every complex distributed application needs a coordination and orchestration system of some sort, so the ZooKeeper folks at Yahoo decide to build a good one and open source it for everyone to use. The target market for ZooKeeper are multi-host, multi-process C and Java based systems that operate in a data center. ZooKeeper works using distributed processes to coordinate with each other through a shared hierarchical name space that is modeled after a file system. Data is kept in memory and is backed up to a log for reliability. By using memory ZooKeeper is very fast and can handle the high loads typically seen in chatty coordination protocols across huge numbers of processes. Using a memory based system also mean you are limited to the amount of data that can fit in memory, so it's not useful as a general data store. It's meant to store small bits of configuration information rather than large blobs. Replication is used for scalability and reliability which means it prefers applications that are heavily read based. Typical of hierarchical systems you can add nodes at any point of a tree, get a list of entries in a tree, get the value associated with an entry, and get notification of when an entry changes or goes away. Using these primitives and a little elbow grease you can construct the higher level services mentioned above. Why would you ever need a distribute coordination system? It sounds kind of weird. That's more the question I'll be addressing in this post rather than how it works because the slides and the video do a decent job explaining at a high level what ZooKeeper can do. The low level details could use another paper however. Reportedly it uses a version of the famous Paxos Algorithm to keep replicas consistent in the face of the failures most daunting. What's really missing is a motivation showing how you can use a coordination service in your own system and that's what I hope to provide... Kevin Burton wants to use ZooKeeper to to configure external monitoring systems like Munin and Ganglia for his Spinn3r blog indexing web service. He proposes each service register its presence in a cluster with ZooKeeper under the tree "/services/www." A Munin configuration program will add a ZooKeeper Watch on that node so it will be notified when the list of services under /services/www changes. When the Munin configuration program is notified of a change it reads the service list and automatically regenerates a munin.conf file for the service. Why not simply use a database? Because of the guarantees ZooKeeper makes about its service:
Some Fast Facts
In the more cool stuff I've never heard of before department is something called Self Cleansing Intrusion Tolerance (SCIT). Botnets are created when vulnerable computers live long enough to become infected with the will to do the evil bidding of their evil masters. Security is almost always about removing vulnerabilities (a process which to outside observers often looks like a dog chasing its tail). SCIT takes a different approach, it works on the availability angle. Something I never thought of before, but which makes a great deal of sense once I thought about it. With SCIT you stop and restart VM instances every minute (or whatever depending in your desired window vulnerability).... This short exposure window means worms and viri do not have long enough to fully infect a machine and carry out a coordinated attack. A machine is up for a while. Does work. And then is torn down again only to be reborn as a clean VM with no possibility of infection (unless of course the VM mechanisms become infected). It's like curing cancer by constantly moving your consciousness to new blemish free bodies. Hmmm... SCIT is really a genius approach to scalable (I have to work in scalability somewhere) security and and fits perfectly with cloud computing and swarm (cloud of clouds) computing. Clouds provide plenty of VMs so there is a constant ready supply of new hosts. From a software design perspective EC2 has been training us to expect failures and build Crash Only Software. We've gone stateless where we can so load balancing to a new VM is not problem. Where we can't go stateless we use work queues and clusters so again, reincarnating to new VMs is not a problem. So purposefully restarting VMs to starve zombie networks was born for cloud computing. If a wider move could be made to cloud backed thin clients the internet might be a safer place to live, play, and work. Imagine being free(er) from spam blasts and DDOS attacks. Oh what a wonderful world it would be...
Flickr's lone database guy Dathan Pattishall made his excellent presentation available on how on how Flickr scales its backend to handle tremendous loads. Some of this information is available in Flickr Architecture, but the paper is so good it's worth another read. If you want to see sharding done right, at scale, take a look.
Framework Fixation SolutionsHow can you avoid the framework fixation crash?
Hi, Generating unique ids is a common requirements in many projects. Generally, this responsibility is given to Database layer. By using sequences or some other technique. This is a problem for horizontal scalability. What are the Guid generation schemes used in high scalable web sites generally? I have seen use java's SecureRandom class to generate Guid. What are the other methods generally used? Thanks Unmesh