Free Book: Is Parallel Programming Hard, And, If So, What Can You Do About It?
“The trouble ain’t that people are ignorant: it’s that they know so much that ain’t so.” -- Josh Billings
Is Parallel Programming Hard? Yes. What Can You Do About It? To answer that, Paul McKenney, Distinguished Engineer at the IBM Linux Technology Center and veteran of parallel powerhouses SRI and Sequent, has written an epic 400+ page book: Is Parallel Programming Hard, And, If So, What Can You Do About It?
The goal of the book? "To help you understand how to design shared-memory parallel programs to perform and scale well with minimal risk to your sanity."
So it's not a book about parallelism in the sense of getting the most out of a distributed system; it's a book, in the mechanical-sympathy sense, about getting the most out of a single machine.
Some example section titles: Introduction, Alternatives to Parallel Programming, What Makes Parallel Programming Hard, Hardware and its Habits, Tools of the Trade, Counting, Partitioning and Synchronization Design, Locking, Data Ownership, Deferred Processing, Data Structures, Validation, Formal Verification, Putting It All Together, Advanced Synchronization, Parallel Real-Time Computing, Ease of Use, Conflicting Visions of the Future.
To get a feel for the kind of things you'll learn in the book, here's an interview excerpt where Paul talks about which parts of parallel programming are the hardest to master:
One common perception is that the best way to obtain an optimal parallel program is to start with the optimal sequential program. However, in my experience, low-level retrofitting of parallelism is rarely optimal. You should instead start with the high-level problem and work out a way to partition it. In other words, parallelism is first and foremost a design problem. If you get the design right, the coding will not be hard. If you fail to get the design right, then yes, coding the resulting multicore mess really can be mind-crushingly difficult. In particular, if you are pushing all of the program’s data through a single serialized data structure such as a stack or a queue, you are unclear on the concept. Yes, such a data structure will keep its elements in order, but what good is that when parallelism will rearrange the order in which they are processed both before they are added and after they are removed?
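To make the "design first, then code" point concrete, here is a minimal userspace sketch of partitioning in the spirit of the book's Counting and Data Ownership chapters. It is not code from the book, and the names (counter[], read_count(), NTHREADS) are illustrative: instead of funneling every update through one lock-protected counter, each thread owns its own counter, and the totals are combined only when someone asks for them.

```c
/*
 * Partition-first sketch (POSIX threads): each thread increments a
 * counter it owns exclusively, so the hot path needs no locking at all.
 * Illustrative only; names and sizes are assumptions, not the book's code.
 */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define UPDATES_PER_THREAD 1000000

static long counter[NTHREADS];          /* one counter per thread: no sharing */

static void *worker(void *arg)
{
	long id = (long)arg;

	for (long i = 0; i < UPDATES_PER_THREAD; i++)
		counter[id]++;          /* purely local update, no lock needed */
	return NULL;
}

static long read_count(void)
{
	long sum = 0;

	for (int i = 0; i < NTHREADS; i++)
		sum += counter[i];      /* readers pay the cost, not updaters */
	return sum;
}

int main(void)
{
	pthread_t tid[NTHREADS];

	for (long i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, (void *)i);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	printf("total: %ld\n", read_count());
	return 0;
}
```

In a real implementation you would pad each counter out to its own cache line to avoid false sharing, but the design decision is the same: the partitioning, not the clever locking, is what makes it scale.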
For another example, some people believe that parallelism must necessarily infuse all aspects of computing. While this might eventually be proven to be the case, in the meantime parallelism is but one performance-optimization technique of many. To see why this demotion of parallelism from holy grail to performance-optimization technique is justified, imagine a sequential program whose performance is satisfactory to its users. Why on earth should such a program be parallelized? There are real costs to doing so, and absolutely no benefits.
Finally, some people believe that the “problem” of parallel programming must be solved using one single optimal tool. This viewpoint makes about as much sense as insisting that the laptop I am using to type these words be constructed using only one type of fastener. “You can use hinges, screws, nails, rivets, glue, velcro, welds, or clamps. But only one of them!” The fact is that most of the recent patches that have done the most to improve the Linux kernel’s performance and scalability have used combinations of a number of synchronization mechanisms, and there is no reason to expect this situation to change.
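As a rough illustration of that "more than one fastener" point, here is a hedged userspace sketch (POSIX threads, illustrative names, definitely not a kernel patch) that mixes two mechanisms in one component: a rarely-updated lookup table guarded by a reader-writer lock, plus per-thread hit counters that rely on data ownership and need no locking at all.

```c
/*
 * Mixing synchronization mechanisms: rwlock for read-mostly data,
 * data ownership for per-thread statistics.  Illustrative sketch only.
 */
#include <pthread.h>

#define NTHREADS 4
#define TABLE_SIZE 256

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int table[TABLE_SIZE];           /* read-mostly data: reader-writer lock */
static long hits[NTHREADS];             /* per-thread stats: data ownership */

int lookup(int self, int key)
{
	int val;

	pthread_rwlock_rdlock(&table_lock); /* many readers proceed in parallel */
	val = table[key % TABLE_SIZE];
	pthread_rwlock_unlock(&table_lock);

	hits[self]++;                       /* no lock: this thread owns hits[self] */
	return val;
}

void update(int key, int val)
{
	pthread_rwlock_wrlock(&table_lock); /* rare writers exclude everyone */
	table[key % TABLE_SIZE] = val;
	pthread_rwlock_unlock(&table_lock);
}
```

The kernel work Paul describes goes much further (RCU, per-CPU data, sequence locks, and plain old spinlocks in the same subsystem), but the principle is the one he states: pick the mechanism that fits each access pattern rather than forcing one tool onto everything.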