Scaling Early Stage Startups

Todd Hoff's picture

Mark Maunder of No VC Required--who advocates not taking VC money lest you be turned into a frog instead of the prince (or princess) you were dreaming of--has an excellent slide deck on how to scale an early stage startup. His blog also has some good SEO tips and a very spooky widget showing the geographical location of his readers. Perfect for Halloween! What is Mark's other worldly scaling strategies for startups?

Site: http://novcrequired.com/

Information Sources

  • Slides from Seattle Tech Startup Talk.
  • Scaling Early Stage Startups blog post by Mark Maunder.

    The Platform

  • Linxux
  • An ISAM type data store.
  • Perl
  • Httperf is used for benchmarking.
  • Websitepulse.com is used for perf monitoring.

    The Architecture

  • Performance matters because being slow could cost you 20% of your revenue. The UIE guys disagree saying this ain't necessarily so. They explain their reasoning in Usability Tools Podcast: The Truth About Page Download Time. The idea is: "There was still another surprising finding from our study: a strong correlation between perceived download time and whether users successfully completed their tasks on a site. There was, however, no correlation between actual download time and task success, causing us to discard our original hypothesis. It seems that, when people accomplish what they set out to do on a site, they perceive that site to be fast." So it might be a better use of time to improve the front-end rather than the back-end.

  • MySQL was dumped because of performance problems: MySQL didn't handle a high number of writes and deletes on large tables, writes blow away the query cache, large numbers of small tables (over 10,000) are not well supported, uses a lot of memory to cache indexes, maxed out at 200 concurrent read/write queuries per second with over 1 million records.

  • For data storage they evolved to a fixed length ISAM like record scheme that allows seeking directly to the data. Still uses file level locking and its benchmarked at 20,000+ concurrent reads/writes/deletes. Considering moving to BerkelyDB which is a very highly performing and is used by many large websites, especially when you primarily need key-value type lookups. I think it might be interesting to store json if a lot of this data ends up being displayed on the web page.

  • Moved to httpd.prefork for Perl. That with no keepalive on the application servers uses less RAM and works well.

    Lessons Learned

  • Configure your DB and web server correctly. MySQL and Apache's memory usage can easily spiral out of control which leads gridingly slow performance as swapping increases. Here are a few resources for helping with configuration issues.

  • Serve only the users you care about. Block content theives that crawl your site using a lot of valuable resources for nothing. Monitor the number of content pages they fetch per minute. If a threshold is exceeded and then do a reverse lookup on their IP address and configure your firewall to block them.

  • Cache as much DB data and static content as possible. Perl's Cache::FileCache was used to cache DB data and rendered HTML on disk.

  • Use two different host names in URLs to enable browser clients to load images in parallele.

  • Make content as static as possible Create a separate Image and CSS server to serve the static content. Use keepalives on static content as static content uses little memory per thread/process.

  • Leave plenty of spare memory. Spare memory allows Linux to use more memory fore file system caching which increased performance about 20 percent.

  • Turn Keepalive off on your dynamic content. Increasing http requests can exhaust the thread and memory resources needed to serve them.

  • You may not need a complex RDBMS for accessing data. Consider a lighter weight database BerkelyDB.

  • Comments

    Re: Scaling Early Stage Startups

    "they evolved to a fixed length ISAM like record scheme" - I'm not clear, which application is that? They're not using BerkleyDB and no longer using MySQL, do they say what they are using?

    Callum

    Re: Scaling Early Stage Startups

    Hi Callum,

    I got an email from Todd with your question. :)

    We built our own fast file storage routines from the ground up. It's loosely based on ISAM or MySQL's MyISAM in that it uses fixed length sequential records. It's a lot faster for certain specific operations that we require. Unfortunately it's not open source at this time but perhaps we'll release it in future.

    Regards,

    Mark Maunder
    FEEDJIT Founder & CEO

    Re: Scaling Early Stage Startups

    In the powerpoint "slide deck" he says about MySQL "MySQL doesn’t support a large number of small tables (over 10,000)."

    Why on earth would you have over 10,000 tables? That sounds like bad design.

    Re: Scaling Early Stage Startups

    @Dimitri: Joining in a bit late but in answer to your question, some cases are more efficiently solved by using multiple small tables rather than one large one.

    A case in point is WordPress Multi User (http://mu.wordpress.org/faq/) creating tables for each blog.

    Todd Hoff's picture

    Re: Scaling Early Stage Startups

    Just in case you don't want to follow the link :-)

    WordPress MU creates tables for each blog, which is the system we found worked best for plugin compatibility and scaling after lots of testing and trial and error. This takes advantage of existing OS-level and MySQL query caches and also makes it infinitely easier to segment user data, which is what all services that grow beyond a single box eventually have to do. We're practical folks, so we'll use whatever works best, and for the 2.3m and counting on WordPress.com, MU has been a champ.

    Re: Scaling Early Stage Startups

    We built our own fast file storage routines from the ground up. It's loosely based on ISAM or MySQL's MyISAM in that it uses fixed length sequential records. It's a lot faster for certain specific operations that we require. Unfortunately it's not open source at this time but perhaps we'll release it in future.

    Re: Scaling Early Stage Startups

    Nice start for early beginners
    -----
    sea plants
    sea grapes...plant roots

    Comment viewing options

    Select your preferred way to display the comments and click "Save settings" to activate your changes.

    Post new comment

    The content of this field is kept private and will not be shown publicly.
    • Web page addresses and e-mail addresses turn into links automatically.
    • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd><div ?=?><p ?=?> <img ?=?><h1 ?=?><h2 ?=?><h3 ?=?>
    • Lines and paragraphs break automatically.
    • Glossary terms will be automatically marked with links to their descriptions
    • You may link to webpages through the weblinks registry

    More information about formatting options

    To combat spam, please enter the code in the image.