Wikipedia and Wikimedia have some of the best, most complete real-world documentation on how to build highly scalable systems. This paper by Domas Mituzas covers in detail how Wikipedia works, including: an overview of the software packages used (Linux, PowerDNS, LVS, Squid, lighttpd, Apache, PHP5, Lucene, Mono, Memcached), how they use their CDN, how caching works, how they profile their code, how they store their media, how they structure their database access, how they handle search, and how they handle load balancing and administration, all with real code examples and sample configuration files. This is a really useful resource.
Comments
Re: Paper: Wikipedia's Site Internals, Configuration, Code Examp
A very detailed document that really covers most (or all?) of the topics mentioned in the post.
I haven't finished reading it yet, but it's already absolutely clear that it's worth reading. Thanks for the link!