Sequoia is a transparent middleware solution offering clustering, load balancing and failover services for any database. Sequoia is the continuation of the C-JDBC project. The database is distributed and replicated among several nodes and Sequoia balances the queries among these nodes. Sequoia handles node and network failures with transparent failover. It also provides support for hot recovery, online maintenance operations and online upgrades.
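The behavior described above (full replication, balanced queries, transparent failover) can be illustrated with a toy controller. This is a rough Python sketch, not Sequoia's actual implementation: the `ReplicatedCluster` class, the `BackendDown` exception, and the choice to model each database node as a plain callable are all assumptions made for illustration.

```python
import itertools


class BackendDown(Exception):
    """Hypothetical error a backend raises when it cannot be reached."""


class ReplicatedCluster:
    """Toy controller: round-robin reads, broadcast writes, drop failed nodes."""

    def __init__(self, backends):
        # Each backend is assumed to be a callable taking a SQL string.
        self.backends = list(backends)
        self._counter = itertools.count()

    def read(self, query):
        # Pick the next backend in rotation; on failure, disable it and retry.
        while self.backends:
            index = next(self._counter) % len(self.backends)
            backend = self.backends[index]
            try:
                return backend(query)
            except BackendDown:
                self.backends.remove(backend)
        raise RuntimeError("no backends available")

    def write(self, statement):
        # Apply the write to every replica so all copies stay identical.
        for backend in list(self.backends):
            try:
                backend(statement)
            except BackendDown:
                self.backends.remove(backend)
        if not self.backends:
            raise RuntimeError("no backends available")
```

The caller only ever talks to the cluster object, which is the sense in which the failover is transparent: a dead node is removed from rotation and the request is retried elsewhere without the caller noticing.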
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
If you have a lot of static content to store and you aren't looking forward to setting up and maintaining your own giganto SAN, maybe you can push off a lot of the heavy lifting to a CDN? Jesse Robbins at O'Reilly Radar posts that you have a lot more options now because the number of Content Distribution Networks has doubled since last year. In fact, Dan Rayburn says there are now 28 CDN providers in the market. Hopefully you can find reasonable pricing at one of them. Other than easing your burden, why might a CDN work for you? Because it makes your site faster and customers like that. How can a CDN so dramatically improve your site's performance? Steve Souders, author of High Performance Web Sites: Essential Knowledge for Front-End Engineers, lists using a CDN as one of his "Thirteen Simple Rules for Speeding Up Your Web Site." About CDNs Steve says:
Remember that 80-90% of the end-user response time is spent downloading all the components in the page: images, stylesheets, scripts, Flash, etc. This is the Performance Golden Rule, as explained in The Importance of Front-End Performance. Rather than starting with the difficult task of redesigning your application architecture, it's better to first disperse your static content. This not only achieves a bigger reduction in response times, but it's easier thanks to content delivery networks. ... At Yahoo!, properties that moved static content off their application web servers to a CDN improved end-user response times by 20% or more. Switching to a CDN is a relatively easy code change that will dramatically improve the speed of your web site.
It's at least worth looking into if you're looking for a performance boost or are concerned about storing so many buckets of bits.
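To give a feel for why switching can be "a relatively easy code change," here is a minimal Python sketch of a helper that rewrites static asset paths to point at a CDN. The host `cdn.example.com`, the `asset_url` function, and the list of extensions are assumptions for illustration, not part of any particular CDN's API.

```python
# Hypothetical CDN host; substitute whatever hostname your provider assigns.
CDN_BASE = "https://cdn.example.com"

# File extensions treated as static content in this sketch.
STATIC_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".js", ".swf")


def asset_url(path, cdn_base=CDN_BASE):
    """Return a CDN URL for static assets; leave every other path untouched."""
    # Ignore any query string when deciding whether the path is static.
    bare_path = path.split("?", 1)[0]
    if bare_path.lower().endswith(STATIC_EXTENSIONS):
        return cdn_base.rstrip("/") + "/" + path.lstrip("/")
    return path
```

Templates would then call `asset_url("/images/logo.png")` instead of hard-coding the path, so dynamic pages keep coming from the application servers while images, stylesheets, and scripts are fetched from the CDN.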
Hi, can someone explain how to implement network switch failover? We are paranoid about single points of failure. For example, you have: a dozen web servers -> switch -> DB cluster. That switch is a SPOF. How does one implement dual switches in a failover fashion?
Hi, every application server has its own session management implementation for supporting high scalability, but an application architect/developer has to design and implement the application to make the best use of it. What are the guiding principles and patterns for session state management? The WebSphere System Management Redbook mentions that "Session management performance is optimum when session data per user is around 2Kb. It degrades if session data is more than that". I have the following questions.
1. How do you measure session data per user?
2. It is generally recommended that you keep all the session state in a database and keep only the keys in the HttpSession object. Then every time a web request is processed, the session data is fetched from the database. This way all the data remains in memory only until the request is processed, and the actual data in HttpSession is very small (only a few keys). What is the general practice? At what point should you switch from keeping data in HttpSession to keeping it in a database? Do websites like Amazon or eBay follow this?
3. Is there any open source framework which helps you do session management in the way mentioned in point 2?
Thanks, Unmesh
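The pattern in point 2, together with one rough way to approach point 1, can be sketched as follows. This is an illustrative Python sketch rather than Java servlet code: a plain dict stands in for the HttpSession, another dict (`SESSION_STORE`) stands in for the database, and the function names are hypothetical. Serialized size is only an approximation of what a real application server would replicate or persist.

```python
import pickle
import uuid

# Stand-in for a database table keyed by a state id (an assumption for this
# sketch; a real deployment would use an actual database).
SESSION_STORE = {}


def session_size_bytes(session):
    """Approximate per-user session size as the size of its serialized form."""
    return len(pickle.dumps(session))


def save_state(session, state):
    """Persist the state externally and keep only a small key in the session."""
    key = session.get("state_key") or uuid.uuid4().hex
    SESSION_STORE[key] = pickle.dumps(state)
    session["state_key"] = key


def load_state(session):
    """Fetch the state for the current request; returns {} if none is stored."""
    key = session.get("state_key")
    if key is None or key not in SESSION_STORE:
        return {}
    return pickle.loads(SESSION_STORE[key])
```

With this arrangement the session object holds a single short key regardless of how much state the user accumulates, at the cost of one store lookup per request. Whether that trade is worthwhile depends on how expensive the lookup is compared with replicating a larger session.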
This is a wonderfully informative Amazon update based on Joachim Rohde's discovery of an interview with Amazon's CTO. You'll learn about how Amazon organizes their teams around services, the CAP theorem of building scalable systems, how they deploy software, and a lot more. Many new additions from the ACM Queue article have also been included. Amazon grew from a tiny online bookstore to one of the largest stores on earth. They did it while pioneering new and interesting ways to rate, review, and recommend products. Greg Linden shared his version of Amazon's birth pangs in a series of blog articles. Site: http://amazon.com
I have a few Apache servers (around 11 atm) serving a small amount of data (around 44 gigs right now). For some time I have been using rsync to keep all the content equal on all servers, but the amount of data has been growing, and rsync takes too much time to "compare" all data from source to destination and creates a lot of I/O. I have been taking a look at MogileFS. It seems a good and reliable option, but as the FUSE module is not finished, we would have to rewrite all our apps, and that's not an option atm. Any ideas? I just want a "real time, non resource-hungry" alternative to rsync. If I get more features on the way, then they are welcome :)
Why do I prefer a Distributed File System over NAS + NFS?
- I need 2 NAS if I don't want a single point of failure, and NAS hardware is expensive.
- Non-shared hardware: every server has its own local disks.
- As files are replicated, I can save a lot of money; RAID is not a MUST.
Thanks in advance for your help and sorry for my English :)
What do you guys think/know about the scalability of the popular CMSs (like Joomla, Drupal or Typo3)? Any experience/suggestions there? I'm not sure which to pick yet... Thanks, Stephan
Dan has genuine insight into building software and large scale scalable systems in particular. You'll always learn something interesting reading his blog.