Strategy: Stop Using Linked-Lists
What data structure is more sacred than the link list? If we get rid of it what silly interview questions would we use instead? But not using linked-lists is exactly what Aater Suleman recommends in Should you ever use Linked-Lists?
In The Secret To 10 Million Concurrent Connections one of the important strategies is not scribbling data all over memory via pointers because following pointers increases cache misses which reduces performance. And there’s nothing more iconic of pointers than the link list.
Here are Aeter's reasons to be anti-linked-list:
- They reduce the benefit of out-of-order execution.
- They throw off hardware prefetching.
- They reduce DRAM and TLB locality.
- They cannot leverage SIMD.
- They are harder to send to GPUs.
He also demolishes the pros of linked-lists, finding arrays a better option in almost every case. Good discussion in the comment section as not everyone agrees.
Patrick Wyatt details how a linked-list threading bug repeatedly crashed Starcraft. Also a good comment discussion with a lot of disagreement.
KjellKod is also against linked-lists: Number crunching: Why you should never, ever, EVER use linked-list in your code again. The idea is mathematical complexity does not reflect real-life performance in situ on actual hardware:
Big-O notations tells nothing about how one (data structure with algorithm) fare against another. Big-O will only tell you how performance will degrade as n increases. So comparing one a data structure that is RAM intensive to another data structure that is cache friendly from an abstract Big-O point-of-view is just pointless.
The key as we've seen before is locality of reference. When memory is local and you can avoid cache misses and avoid paging then operations like copying are dirt cheap, when in the mind of the programmer they are expensive. When you don't have locality of reference the thing that programmers think is dirt cheap, following pointers, is extremely expensive. While the first part of the article is frustratingly vague, the last half is full of excellent explanatory detail. Well worth reading. And there are even more good comments in the comment section.
But people really like their link lists. Probably as much as people used to love goto statements. So if performance really matters to you then this is something to think about rather than immediately dismiss as heresy.
Related Articles
- Discuss on Hacker News and on Reddit
- On C Linked Lists (Profiling and Optimizing)
- High performance alternative to bounded queues for exchanging data between concurrent threads - linked lists make poor queues also. Ring buffers backed by arrays are much better (from jhawk28)