37signals Architecture

Update 7: Basecamp, now with more vroom. Basecamp application servers running Ruby code were upgraded and virtualization was removed. The result: A 66 % reduction in the response time while handling multiples of the traffic is beyond what I expected. They still use virtualization (Linux KVM), just less of it now.
Update 6: Things We’ve Learned at 37Signals. Themes: less is more; don't worry be happy.
Update 5: Nuts & Bolts: HAproxy . Nice explanation (post, screencast) by Mark Imbriaco of why HAProxy (load balancing proxy server) is their favorite (fast, efficient, graceful configuration, queues requests when Mongrels are busy) for spreading dynamic content between Apache web servers and Mongrel application servers.
Update 4: O'Rielly's Tim O'Brien interviews David Hansson, Rails creator and 37signals partner. Says BaseCamp scales horizontally on the application and web tier. Scales up for the database, using one "big ass" 128GB machine. Says: As technology moves on, hardware gets cheaper and cheaper. In my mind, you don't want to shard unless you positively have to, sort of a last resort approach.
Update 3: The need for speed: Making Basecamp faster. Pages now load twice as fast, cut CPU usage by a third and database time by about half. Results achieved by: Analysis, Caching, MySQL optimizations, Hardware upgrades.
Update 2: customer support is handled in real-time using Campfire.
Update: highly useful information on creating a customer billing system.


In the giving spirit of Christmas the folks at 37signals have shared a bit about how their system works. 37signals is most famous for loosing Ruby on Rails into the world and they've use RoR to make their very popular Basecamp, Highrise, Backpack, and Campfire products. RoR takes a lot of heat for being a performance dog, but 37signals seems to handle a lot of traffic with relatively normal sounding resources. This is just an initial data dump, they promise to add more details later. As they add more I'll update it here.

Site: http://www.37signals.com

Information Sources

  • Ask 37signals: Numbers?
  • Ask 37signals: How do you process credit cards?
  • Behind the scenes at 37signals: Support
  • Ask 37signals: Why did you restart Highrise?

    Platform

  • Ruby on Rails
  • Memcached
  • Xen
  • MySQL
  • S3 for image storage

    The Stats

  • 30 servers ranging from single processor file servers to 8 CPU application servers for about 100 CPUs and 200GB of RAM.
  • Plan to diagonally scale by reducing the number of servers to 16 for about 92 CPU cores (each significantly faster than what are used today) and 230 GB of combined RAM.
  • Xen virtualization will be used to improve system management.
  • Basecamp (web based project management)
    * 2,000,000 people with accounts
    * 1,340,000 projects
    * 13,200,000 to-do items
    * 9,200,000 messages
    * 12,200,000 comments
    * 5,500,000 time tracking entries
    * 4,000,000 milestones

  • Backpack (personal and small business information management)
    * Just under 1,000,000 pages
    * 6,800,000 to-do items
    * 1,500,000 notes
    * 829,000 photos
    * 370,000 files

  • Overall storage stats (Nov 2007)
    * 5.9 terabytes of customer-uploaded files
    * 888 GB files uploaded (900,000 requests)
    * 2 TB files downloaded (8,500,000 requests)

    The Architecture

  • Memcached caching is used and they are looking to add more. Yields impressive performance results.
  • URL helper methods are used rather than building the URLs by hand.
  • Standard ActiveRecord built queries are used, but for performance reasons they will also "dig in and use" find_by_sql when necessary.
  • They fix Rails when they run into performance problems. It pays to be king :-)
  • Amazon’s S3 is used for storage of files upload by users. Extremely happy with results.

    Credit Card Processing Process

  • Bill monthly. It makes credit card companies more comfortable because they won't be on the hook for a large chunk of change if your company goes out of business. Customers also like it better because it costs less up front and you don't need a contract. Just pay as long as you want the service.

  • Get a Merchant Account. One is needed to process credit cards. They use Chase Bank. Use someone you trust and later negotiate rates when you get enough volume that it matters.
  • Authorize.net is the gateway they use to process the credit card charge.
  • A custom built system handles the monthly billing. It runs each night and bills the appropriate people and records the result.
  • On success an invoice is sent via email.
  • On failure an explanation is sent to the customer.
  • If the card is declined three times the account is frozen until a valid card number is provided.
  • Error handling is critical because problems with charges are common. Freeze to fast is bad, freezing too slow is also bad.
  • All products are being converted to using a centralized billing service.
  • You need to be PCI DSS (Payment Card Industry Data Security Standard) compliant.
  • Use a gateway service that makes it so you don't have to store credit card numbers on your site. That makes your life easier because of the greater security. Some gateway services do have reoccurring billing so you don't have to do it yourself.

    Customer Support

  • Campfire is used for customer service. Campfire is a web-based group chat tool, password-protectable, with chatting, file sharing, image previewing, and decision making.
  • Issues discussed are used to drive code changes and the subversion commit is shown in the conversation. Seems to skip a bug tracking system, which would make it hard to manage bugs and features in any traditional sense, ie, you can't track subversion changes back to a bug and you can't report what features and bugs are in a release.
  • Support can solve problems by customers uploading images, sharing screens, sharing files, and chatting in real-time.
  • Developers are always on within Campfire addressing problems in real-time with the customers.

    Lessons Learned

  • Take a lesson from Amazon and build internal functions as services from the start. This make it easier to share them across all product lines and transparently upgrade features.
  • Don't store credit card numbers on your site. This greatly reduces your security risk.
  • Developers and customers should interact in real-time on a public forum. Customers get better service as developers handle issues as they come up in the normal flow of their development cycle. Several layers of the usual BS are removed. Developers learn what customers like and dislike which makes product development more agile. Customers can see the responsiveness of the company to customers by reading the interactions. This goes a long ways to give potential customers the confidence and the motivation to sign up.
  • Evolve your software by actual features needed by users instead of making up features someone might need someday. Otherwise you end up building something that nobody wants and won't work anyway.