Free Book: Practical Scalablility Analysis with the Universal Scalability Law
If you are very comfortable with math and modeling Dr. Neil Gunther's Universal Scalability Law is a powerful way of predicting system performance and whittling down those bottlenecks. If not, the USL can be hard to wrap your head around.
There's a free eBook for that. Performance and scalability expert Baron Schwartz, founder of VividCortex, has written a wonderful exploration of scalability truths using the USL as a lens: Practical Scalablility Analysis with the Universal Scalability Law.
As a sample of what you'll learn, here are some of the key takeaways from the book:
- Scalability is a formal concept that is best defined as a mathematical function.
- Linear scalability means equal return on investment. Double down on workers and you’ll get twice as much work done; add twice as many nodes and you’ll increase the maximum capacity twofold. Linear scalability is oft claimed but seldom delivered.
- Systems scale sublinearly because of contention, which adds queueing delay, and crosstalk, which inflates service times. The penalty for contention grows linearly and the crosstalk penalty grows quadratically. (An alternative to the crosstalk theory is that longer queues are more costly to manage.)
- Contention causes throughput to asymptotically approach the reciprocal of the serialized fraction of the workload. If your workload is 5% serialized you’ll never grow the effective speedup by more than 20-fold
- Crosstalk causes the system to regress. The harder you try to push systems with crosstalk, the more time they spend fighting amongst themselves.
- To build scalable systems, avoid contention (serialization) and crosstalk (synchronization). The contention and crosstalk penalties degrade system scalability and performance much faster than you’d think. Even tiny amounts of serialization or pairwise data synchronization cause big losses in efficiency.
- If you can’t avoid crosstalk, partition (shard) into smaller systems that will lose less efficiency by avoiding the explosion of service times at larger sizes.
- To model systems with the USL, obtain measurements of throughput at various levels of load or size, and use regression to estimate the parameters to Equation 3.
- To forecast scalability beyond what’s observable, be pessimistic and treat the USL as a best-case scenario that won’t really happen. Use Equation 4 to forecast the maximum possible throughput, but don’t forecast too far out. Use Equation 6 to forecast response time.
- Use your judgment to predict limitations that USL can’t see, such as saturation of network bandwidth or changes in the system’s model when all of the CPUs become busy
- Use the USL to explain why systems aren’t scaling well. Too much queueing? Too much crosstalk? Treat the USL as a pessimistic model and demand that your systems scale at least as well as it does.
- If you see superlinear scaling, check your measurements and how you’ve set up the system under test. In most cases σ should be positive, not negative. Make sure you’re not varying the system’s dimensions relative to each other and creating apparent superlinear efficiencies that don’t really exist.
- It’s fun to fantasize about models that might match observed system behavior more closely than the USL, but the USL arises analytically from how we know queueing systems work. Invented models might not have any basis in reality. Besides, the USL usually models systems extremely well up to the point of inflection, and modeling what happens beyond that isn’t as interesting as knowing why it happens.
- Never trust a scatterplot with an arbitrary curve fit through it unless you know why that’s the right curve. Don’t confuse the USL, hockey stick charts from queueing theory, or other charts that just happen to have similar shapes. Know what shape various plots should exhibit, and suspect bad measurements or other mistakes if you don’t see them.
Note, the link to the eBook requires entering some data, but it's free, well written, and useful, so it's probably worth it.