Friday, November 2, 2007 at 1:46AM
WordPress.com hosts 300 servers in 5 different data centers. It's always useful to learn how large installations manage all their unruly children: "Currently we use Nagios for server health monitoring, Munin for graphing various server metrics, and a wiki to keep track of all the server hardware specs, IPs, vendor IDs, etc. All of these tools have suited us well up until now, but there have been some scaling issues." The post covers how these different tools are working for them, and the comment section has some interesting discussions too.