
A Scalability checklist?

Hi everyone, I'm researching scalability for a college paper and found this site great. It has a lot of tips, articles and the like, but I can't see a hierarchical organization of subjects. I need something like a checklist of things, fields, or technologies to take into account when assessing scalability.

So far I've identified these:

- Hardware scalability:
  - scale out
  - scale up
- Cache
  What types of cache are there? App-level, OS-level, network-level, I/O-level?
- Load Balancing
- DB Clustering

Am I missing something important? (I'm sure I am)
I don't expect you to give a lecture here, but maybe point some things out, give me some useful links...

Reader Comments (1)

Like most complex subjects, scalability defies easy and meaningful categorization. Some points I find common are:

0. Optimize browser code. A fast-loading site makes for a good user experience.
1. Move work to the browser. Put as much of the UI logic in the browser as possible to offload the server and keep the user experience crisp.
2. Use Ajax to make targeted requests and keep up with events.
3. Cache content close to the edge. Use a CDN (when you can) to serve static content as close to the browser as possible because this makes a better user experience.
4. Thin, load-balanced web servers. Implement functionality by load balancing requests across backend servers. Use a scripting language like Ruby, PHP, or Python to compose pages. Scale by adding more machines.
5. Reverse proxy caching. Don't hit the backend unless you need to.
6. Connect to the backend using a service layer interface. Web server scripting code doesn't implement (much) business logic. It should glue together information gathered from encapsulated services that implement the business logic. Scale by asynchronously queuing requests and load balancing them across more servers.
7. Cache application-level objects. Shed load from the database by putting a cache between the service layer and the database. Avoid disk and work from RAM as much as possible. This doesn't mean doing what you did before with a cache bolted on. It means actively designing around the idea that fast in-memory access is available.
8. Shard. Make it so work can be parallelized to the degree you need to scale.
9. Share nothing. Parallelizing work means minimizing sharing and dependencies between components.
10. Cloud yourself. Make an elastic architecture that can grow and shrink horizontally.
11. Use *aaS. Use a storage service, a CPU service, and whatever else you can as a service. Do as little as you can get away with so you can spend time on your own application. Leverage economies of scale in areas that don't need to be your core competency.
12. Automate. Don't deploy by hand or do anything you can convince a machine to do for you.
13. Monitor. Monitor everything all the time and use that information to make your system better.
14. Use appropriate database technology. This is of course the tricky part, and it obviously interacts with all the other choices that have been made. With all the other strategies in place, it's quite possible you can get away with a relatively simple database setup. Caching layers can protect the database from load and from complex queries and other database killers. Or maybe use web-scale databases like BigTable or SimpleDB.
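
To make point 7 a little more concrete, here is a minimal sketch of an in-memory object cache sitting between callers (the service layer) and a loader function (standing in for the database). Everything in it is an assumption for illustration: the class name ObjectCache, the loader callback, and the TTL-based expiry are hypothetical and not taken from any particular product. A real deployment would more likely use a shared cache server than a per-process dictionary.

```python
import time
from typing import Callable, Dict, Tuple


class ObjectCache:
    """Hypothetical in-process cache with a time-to-live (TTL) per entry."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float = 60.0):
        self._load = load              # stand-in for a database query
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve from RAM, the database is not touched.
            self.hits += 1
            return entry[1]
        # Missing or expired: load once, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        # Call this after a write so readers do not see stale data.
        self._entries.pop(key, None)
```

A caller would wrap its lookup once, for example `cache = ObjectCache(fetch_user_row)` where `fetch_user_row` is whatever function queries the database, and then call `cache.get("user:1")` everywhere. The hit and miss counters also tie in with point 13: they are the kind of numbers worth monitoring.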

November 29, 1990 | Unregistered Commenter Todd Hoff
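
As an illustration of points 8 and 9 in the comment above, the following is a rough sketch of hash-based sharding: each key is mapped to one of N shards so that work on different shards shares nothing and can run in parallel. The function names shard_for and partition and the key format are hypothetical, and simple modulo placement is only one possible scheme.

```python
import hashlib
from typing import List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    # hashlib gives the same result on every machine and every run,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys: List[str], num_shards: int) -> List[List[str]]:
    """Group keys by shard so each group can be handled independently."""
    shards: List[List[str]] = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```

For example, `partition(["user:1", "user:2", "user:3"], 4)` returns four lists, one per shard. One known trade-off of modulo placement is that changing the number of shards moves most keys. Consistent hashing is a common alternative when shards are added or removed often.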
