Inspired by a xkcd comic, Peter Norvig, Director of Research at Google and all around interesting and nice guy, has created an above par code kata involving a regex program that demonstrates the core inner loop of many successful systems profiled on HighScalability.
The original code is at xkcd 1313: Regex Golf, which comes up with an algorithm to find a short regex that matches the winners and not the losers from two arbitrary lists. The Python code is readable, the process is TDDish, and the problem, which sounds simple, but soon explodes into regex weirdness, as does most regex code. If you find regular expressions confusing you'll definitely benefit from Peter's deliberate strategy for finding a regex.
The post demonstrating the iterated improvement of the program is at xkcd 1313: Regex Golf (Part 2: Infinite Problems). As with most first solutions it wasn't optimal. To improve the program Peter recommends the following steps: