lighttpd (pronounced "lighty") is a web server which is designed to be secure, fast, standards-compliant, and flexible while being optimized for speed-critical environments. Its low memory footprint (compared to other web servers), light CPU load and its speed goals make lighttpd suitable for servers that are suffering load problems, or for serving static media separately from dynamic content. lighttpd is free software / open source, and is distributed under the BSD license. lighttpd runs on GNU/Linux and other Unix-like operating systems and Microsoft Windows. * Load-balancing FastCGI, SCGI and HTTP-proxy support * chroot support * select()-/poll()-based web server * Support for more efficient event notification schemes like kqueue and epoll * Conditional rewrites (mod_rewrite) * SSL and TLS support, via openSSL. * Authentication against an LDAP server * rrdtool statistics * Rule-based downloading with possibility of a script handling only authentication * Server-side includes support * Flexible virtual hosting * Modules support * Cache Meta Language (currently being replaced by mod_magnet) * Minimal WebDAV support * Servlet (AJP) support (in versions 1.5.x and up) * HTTP compression using mod_compress and the newer mod_deflate ( 1.5.x )
This paper is a great overview of different lightweight web servers. A lot of websites use lightweight web servers to serve images and static content. YouTube is one example: http://highscalability.com/youtube-architecture. So if you need to improve performance consider changing over a different web server for some types of content. Overview: Recent years have enjoyed a florescence of interesting implementations of Web servers, including lighttpd, litespeed, and mongrel, among others. These Web servers boast different combinations of performance, ease of administration, portability, security, and related values. The following engineering study surveys the field of lightweight Web servers to help you find one likely to meet the technical requirements of your next project. "Lightweight" Web servers like lighttpd, litespeed, and mongrel can offer dramatic benefits for your projects. This article surveys the possibilities and shows how they apply to you. Important dimensions for evaluation of a Web server include: * Performance: How fast does it respond to requests? * Scalability: Does the server continue to behave reliably when many users simultaneously access it? * Security: Does the server do only the operations it should? What support does it offer for authenticating users and encrypting its traffic? Does its use make nearby applications or hosts more vulnerable? * Availability: What are the failure modes and incidences of the server? * Compliance to standards: Does the server respect the pertinent RFCs? * Flexibility: Can the server be tuned to accommodate heavy request loads, or computationally demanding dynamic pages, or expensive authentication, or ...? * Platform requirements: On what range of platforms is the server available? Does it have specific hardware needs? * Manageability: Is the server easy to set up and maintain? Is it compatible with organizational standards for logging, auditing, costing, and so on?
A lot of sites hosted in San Francisco are down because of at least 6 back-to-back power outages power outages. More details at laughingsquid. Sites like SecondLife, Craigstlist, Technorati, Yelp and all Six Apart properties, TypePad, LiveJournal and Vox are all down. The cause was an underground explosion in a transformer vault under a manhole at 560 Mission Street. Flames shot 6 feet out from the manhole cover. Over PG&E 30,000 customers are without power. What's perplexing is the UPS backup and diesel generators didn't kick in to bring the datacenter back on line. I've never toured that datacenter, but they usually have massive backup systems. It's probably one of those multiple simultaneous failure situations that you hope never happen in real life, but too often do. Or maybe the infrastructure wasn't rolled out completely. Update: the cause was a cascade of failures in a tightly couples system that could never happen :-) Details at Failure Happens: A summary of the power outage at 365 Main. It's just these sorts of emergencies that make us think. How would You handle a similar failure? How can you make your website span more than one data center? Is the cost of putting in all this infrastructure worth it compared to your website being down for a day? All good hard to answer and even harder to implement questions. Geo-distributed clustering is not easy, which is why most companies don't do it, but for some help distributing your website take a look at http://highscalability.com/tags/geo-distributed-clusters.
If you want to adopt a shard architecture, but don't want to start from scratch, you may want to consider Hibernate's sharding system. Hibernate Shards is a framework that is designed to encapsulate and minimize this complexity by adding support for horizontal partitioning to Hibernate Core. Hibernate Shards key features: * Standard Hibernate programming model - Hibernate Shards allows you to continue using the Hibernate APIs you know and love: SessionFactory, Session, Criteria, Query. If you already know how to use Hibernate, you already know how to use Hibernate Shards. * Flexible sharding strategies - Distribute data across your shards any way you want. Use one of the default strategies we provide or plug in your own application-specific logic. * Support for virtual shards - Think your sharding strategy is never going to change? Think again. Adding new shards and redistributing your data is one of the toughest operational challenges you will face once you've deployed your shard-aware application. Hibernate Sharding supports virtual shards, a feature designed to simplify the process of resharding your data. * Free/open source - Hibernate Shards is licensed under the LGPL (Lesser GNU Public License)
Google Talk is Google's instant communications service. Interestingly the IM messages aren't the major architectural challenge, handling user presence indications dominate the design. They also have the challenge of handling small low latency messages and integrating with many other systems. How do they do it? Site: http://www.google.com/talk
The Architecture* Data Center. * Storage. * Development Environment. * OS. * Web Server. * Database. * Database abstraction layer. * Load balancing. * Web Framework. * Real-time messaging. * Identity management. * Distributed job management. * Ad serving. * Standard API to website. * AJAX library. * PHP Cache. * Object and Content Cache. * Client Side Cache. * Monitoring. * Log Analysis. * Testing. * Performance Analysis. * Backup and Restore. * Fault Tolerance. * Scalability Plan. * Business Continuity Plan. * Future Directions.
Lessons LearnedTo discuss this article please visit the forums at
As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book. In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you'll learn the supporting techniques you need to carry out these tasks, including advanced configuration, benchmarking, and investigating logs. Topics include: * A review of configuration and setup options * Storage engines and table types * Benchmarking * Indexes * Query Optimization * Application Design * Server Performance * Replication * Load-balancing * Backup and Recovery * Security
This paper is behind a registration-wall, you can't do anything on the MySQL site without filling out a form of some kind, but it's a short, decent introduction to using MySQL for a good sized website.
A Quick Hit of What's InsideScale-out vs. Scale Up, Customers using MySQL, Scale-Out Reference Architecture
Eventually every database system hit its limits. Especially on the Internet, where you have millions of users which theoretically access your database simultaneously, eventually your IO system will be a bottleneck. [A] promising but more complex solution with nearly no scale-out limits is application partitioning. If and when you get into the top-1000 rank on alexa , you have to think about such solutions.