Monday
Sep 10, 2007

Blog: Esoteric Curio by Theo Schlossnagle

Theo Schlossnagle is the author of Scalable Internet Architectures and the founder of OmniTI, a global leader in Internet technology services that power the World Wide Web and email. As you might imagine, Theo frequently posts on topics of interest to the scalable website builder.

A Quick Hit of What's Inside

Partitioning vs. Federation vs. Sharding, PostgreSQL warm standby on ZFS crack, Scalability vs. Performance: it isn't a battle


Monday
Sep 10, 2007

Book Store

Monday
Sep 10, 2007

Is there a difference between partitioning and federation and sharding?

Unlike Theo Schlossnagle, author of Scalable Internet Architectures, I am not a stickler for semantics, because I have an unswerving faith in the ultimate unknowability of the world as experienced by others. That's why it is Theo who bravely tackles the differences in his informative blog post Partitioning vs. Federation vs. Sharding. Royans Tharakan also talks about it on his blog. Is there a difference, and does it really matter to all our intrepid scalable website builders?

Generally, whatever Theo says is probably close to the truth. Yet in my mind I think of partitioning as a basic-level category, and of federation and sharding as more specific (subordinate) instances of partitioning. And partitioning is a more specific instance of the more general (superordinate) category divide-and-conquer. Which isn't a useful way to think about the topic at all.

So, let's say federation is like Star Trek. The Vulcans, Klingons, and Humans live in very separate policy domains, but they each pledge to work together to make sure Captain Kirk always gets the girl. And sharding is like AJAX: a great marketing term for stuff that may have already existed, but that has taken on a new useful life of its own. That new useful life is that there are very specific examples of how sharding works, how it has been successful for existing web sites, and how you can create your scalable web site using shards. Federation and partitioning are far more nebulous, less pragmatic concepts, so I am more than happy to AJAXify sharding into the popular lexicon :-)

Related Articles:

  • An Unorthodox Approach to Database Design : The Coming of the Shard.


    Sunday
    Sep 9, 2007

    Clustering Solution

    Hi, I'm interested in people's thoughts on the best choice for a database clustering solution. I have a database that is mostly varchars and numbers and doesn't store any binary data at all. It sees about 70% reads and 30% writes, though we're using memcached at the moment, so it's not really hit that hard. We're currently using mysql with m/cluster, but are interested in a new solution. Possible candidates so far are unicluster (which doesn't seem mature yet) or DRBD. Has anyone had a similar experience and can make any suggestions? Thanks


    Saturday
    Sep 8, 2007

    Making the case for PHP at Yahoo! (Oct 2002)

    This presentation by Michael Radwin describes why Yahoo! standardized on PHP going forward. It describes how, after a review of all the web technologies, including Yahoo!'s own internal ones, PHP was chosen. It shows that not only technical reasons but also business and development processes were taken into account.


    Saturday
    Sep 8, 2007

    MP3.com Web Templating Architecture (March, 2000)

    In March, 2000, I did a talk about how we scaled with semi-static files while splitting data from presentation. For dynamic pages we used mod_perl doing an internal redirect with the XML on the style templates. Since then Apache 2.0 contains the concept of filters to allow for similar functionality.


    Friday
    Sep 7, 2007

    Joost Network Architecture

    Colm MacCarthaigh, Network Architect at Joost, gave this presentation at the UK Network Operators' Forum Meeting in Manchester on April 3rd, 2007.


    Thursday
    Sep 6, 2007

    Product: Perdition Mail Retrieval Proxy

    Perdition is a fully featured POP3 and IMAP4 proxy server. It is able to handle both SSL and non-SSL connections and redirect users to a real server based on a database lookup. Perdition supports modular database access: ODBC, MySQL, PostgreSQL, GDBM, POSIX Regular Expression and NIS modules ship with the distribution. The API for modules is open, so arbitrary modules can be written to allow access to any data store. Perdition has many uses, including creating large mail systems where an end-user's mailbox may be stored on one of several hosts, integrating different mail systems together, migrating between different email infrastructures, and bridging plain-text, SSL and TLS services. It can also be used as part of a firewall. The use of Perdition to scale mail services beyond a single box is discussed in high capacity email.


    Thursday
    Sep 6, 2007

    Scaling IMAP and POP3

    Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so there is less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less used server. If any server goes offline, it only affects the fraction of users assigned to that server.
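The routing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Perdition's actual code or API: the `MailboxDirectory` class, its method names, and the `example.com` hostnames are all hypothetical, and the in-memory dictionary stands in for the database table (MySQL in Erik's setup) that the proxy would query to find a user's backend.

```python
from collections import Counter


class MailboxDirectory:
    """Hypothetical stand-in for the account-to-server lookup table."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.location = {}  # account name -> backend host

    def add_account(self, account):
        # Place a new account on the backend that currently holds the fewest.
        load = Counter(self.location.values())
        target = min(self.backends, key=lambda host: load[host])
        self.location[account] = target
        return target

    def lookup(self, account):
        # What a proxy does per connection: find the user's real server.
        return self.location[account]

    def move_account(self, account, new_backend):
        # Rebalancing is just a directory update (after the mailbox is copied);
        # end users keep connecting to the same proxy address.
        if new_backend not in self.backends:
            raise ValueError(f"unknown backend: {new_backend}")
        self.location[account] = new_backend

    def affected_by_outage(self, backend):
        # Only accounts stored on the failed backend lose access.
        return sorted(a for a, b in self.location.items() if b == backend)


directory = MailboxDirectory(
    ["mail1.example.com", "mail2.example.com", "mail3.example.com"]
)
for user in ["alice", "bob", "carol", "dave", "erin", "frank"]:
    directory.add_account(user)

print(directory.lookup("alice"))
print(directory.affected_by_outage("mail2.example.com"))
directory.move_account("alice", "mail3.example.com")
print(directory.lookup("alice"))
```

Because every proxy consults the same directory, any proxy can serve any user, which is why plain round-robin load balancing in front of the proxies is enough.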


    Thursday
    Sep 6, 2007

    Why doesn't anyone use j2ee?

    From a reader:

    > Was reading through your very interesting/useful site. Most of the architectures are non-J2EE. Does that mean that there aren't enough websites that are scalable (with a YouTube-like userbase) built with J2EE tech? Would like to know if there are any, and their architecture as well.

    eBay uses Java, but in a very pragmatic way. They use servlets, an application server, the JDK, and they do the rest themselves. They skip JSP, entity beans, and JMS. When you need to scale, putting all your eggs in one basket is a risky strategy. Why use JSP when you can do better? Why use entity beans when you can do better? Use servlets because they are a very effective way of handling HTTP requests. Use Java because it is fast, runs everywhere, and has a boatload of libraries you can use to build your custom system. Probably the major reason J2EE is absentee is simply LAMP. LAMP is so incredibly functional for most 2-tier shared-nothing sites that they don't need a better infrastructure for writing an application tier. Personally, I'm pretty excited about GWT, which uses Java and servlets. We'll see if that starts to take off a little bit more.
