Letting Clients Know What's Changed: Push Me or Pull Me?


I had a false belief
I thought I came here to stay
We're all just visiting
All just breaking like waves
The oceans made me, but who came up with me?
Push me, pull me, push me, or pull me out.

So true Pearl Jam (Push Me Pull Me lyrics), so true. I too have wondered how web clients should be notified of model changes. Should servers push events to clients or should clients pull events from servers? A topic worthy of its own song if ever there was one.

To pull events the client simply starts a timer and makes a request to the server. This is polling. You can either pull a complete set of fresh data or get a list of changes. The server "knows" if anything you are interested in has changed and makes those changes available to you. Knowing what has changed can be relatively simple with a publish-subscribe type backend, or it can get very complex with fine-grained bitmaps of attributes and per-client state tracking what each client still needs to see.
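To make the pull side concrete, here's a minimal change-polling sketch in TypeScript. The /api/changes endpoint, the since parameter, and the Change shape are all hypothetical; the point is just the timer plus the "give me everything newer than what I've seen" request.

```typescript
// Minimal change-polling sketch. The endpoint, query parameter, and payload
// shape are hypothetical; only the timer-plus-request pattern matters.
type Change = { id: string; field: string; value: unknown };

const POLL_INTERVAL_MS = 5000;
let lastSeen = 0; // sequence number of the newest change we've applied

async function pollForChanges(): Promise<void> {
  // Ask the server only for changes we haven't seen yet.
  const res = await fetch(`/api/changes?since=${lastSeen}`);
  const body = (await res.json()) as { changes: Change[]; latest: number };
  for (const change of body.changes) {
    applyChange(change); // fold each change into the client-side model
  }
  lastSeen = body.latest;
}

function applyChange(change: Change): void {
  console.log("applying", change);
}

// The timer: hit the server every POLL_INTERVAL_MS whether or not anything changed.
setInterval(() => pollForChanges().catch(console.error), POLL_INTERVAL_MS);
```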

Polling is heavy, man. Imagine all your clients hitting your servers every 5 seconds even if there are no updates. And if every poll request ends up in a flurry of database requests, your database can be hammered. Of course, caching can smooth out this jagged trip, but if you keep per-client state you need more clever per-client cache views. The overhead of polling can be mitigated somewhat by piggybacking updates on replies to client requests.
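As a sketch of that piggybacking idea (the envelope shape and endpoint below are my own invention, not any particular framework's API): the server tacks any pending updates onto the reply to a request the client was going to make anyway.

```typescript
// Hypothetical response envelope: the data the client asked for, plus any
// updates queued for this client since it last talked to the server.
type Envelope<T> = {
  data: T;
  pendingChanges: unknown[];
};

async function getOrder(orderId: string): Promise<unknown> {
  const res = await fetch(`/api/orders/${orderId}`);
  const envelope = (await res.json()) as Envelope<unknown>;
  // Apply the piggybacked updates first, so the model stays fresh without
  // a separate poll request.
  for (const change of envelope.pendingChanges) {
    console.log("piggybacked update:", change);
  }
  return envelope.data;
}
```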

So if polling has a high overhead, then it makes sense to only send data when there's an update the client should see. That is, we push data to the client. The current push model favorite is Comet: a web application architecture in which a web server sends data to a client program (normally a web browser) asynchronously, without the client explicitly requesting it. It allows the creation of event-driven web applications, enabling real-time interaction that is otherwise impossible in a browser.
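One common way Comet is implemented is long polling: the client makes a request and the server simply doesn't answer until it has something to say. A rough client-side sketch, assuming a hypothetical /api/events endpoint that parks the request until events arrive:

```typescript
// Long-polling (Comet-style) client sketch. The /api/events endpoint is
// hypothetical; the server is assumed to hold the request open until an
// event is available, then respond with a batch of events.
async function listenForEvents(): Promise<void> {
  for (;;) {
    try {
      const res = await fetch("/api/events"); // parked by the server until data arrives
      const events = (await res.json()) as unknown[];
      for (const event of events) {
        handleEvent(event);
      }
    } catch {
      // Network hiccup or timeout: back off briefly before reconnecting.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}

function handleEvent(event: unknown): void {
  console.log("server pushed:", event);
}

listenForEvents().catch(console.error);
```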

Nothing comes for free, however, and pushing has a surprising amount of overhead too. A connection has to be kept open between the client and server for the new data to be pushed over. Traditionally servers haven't handled large tables of connections very well, so this approach hasn't worked well and you had to spread the connections over multiple servers. Fortunately operating systems are getting better at handling large numbers of connections.

For every connection you also have to store the data to push to the client, and you need a thread to send it. It's easy to see how this could go bad with naive architectures.

Architecturally I've always sided with polling for complete datasets rather than pushing or polling just for changes. This is the simplest and most self-healing architecture. Machines can go up and down at will and your client will always end up correct and consistent; there's no stream of changes to get out of sync. The server side doesn't have to do anything too special, clients already know how to do it, and you use client resources to do the polling and apply the updates on the client side.
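Here's what that looks like as a sketch (the /api/snapshot endpoint is made up): instead of applying a stream of deltas, the client periodically replaces its whole model with a fresh snapshot, so a missed poll or a bounced server can never leave it permanently out of sync.

```typescript
// Complete-dataset polling sketch. /api/snapshot is hypothetical; the key idea
// is that the client rebuilds its entire view from each response instead of
// applying incremental changes.
let model: Record<string, unknown> = {};

async function refreshSnapshot(): Promise<void> {
  const res = await fetch("/api/snapshot");
  model = (await res.json()) as Record<string, unknown>; // throw away the old view
  render(model);
}

function render(m: Record<string, unknown>): void {
  console.log("rendering", Object.keys(m).length, "items");
}

// Poll less often than a change-based approach; each response is the full dataset.
setInterval(() => refreshSnapshot().catch(console.error), 30_000);
```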

All you have to do to scale polling is have enough machines, smart caching to handle the load, enough bandwidth to handle larger datasets, and a problem where low latency isn't required. That's all :-)

The Comet Daily, not affiliated with Superman I hear, is making a strong case for push in their articles Comet is Always Better Than Polling and 20,000 Reasons Why Comet Scales.

Special application server software is needed because typical app servers can't handle lots of persistent connections; they tend to run out of threads and memory. Greg Wilkins talks about these and other issues in Blocking Servlets, Asynchronous Transport. This is all pretty standard stuff when you build your own messaging system, but I guess it has taken a while to move into the web infrastructure.

With Comet they found:
"The key result is that sub-second latency is achievable even for 20,000 users. There is an expected latency vs. throughput tradeoff: for example, for 5,000 users, 100ms latency is achievable up to 2,000 messages per second, but increases to over 250ms for rates over 3,000 messages per second."

Interesting results, especially if your application requires low-latency updates. Most people haven't deployed or even considered push-based architectures. With Comet it's at least something to think about.

I can't resist adding this cute animation of a llama push me pull me.