Antirez: You Need to Think in Terms of Organizing Your Data for Fetching

Salvatore Sanfilippo (antirez) wrote a response to Michel Martens' An Open Minded Reader. There's nothing in the post or the response that's controversial. I was just struck by how clearly the conversation lays out all the effort that goes into optimizing read paths. We optimize reads through denormalisation, a crazy quilt of caching layers, key-value databases, clustering of related tables, SSD/RAM, DHTs, moving functions to storage, secondary indexes, separating OLAP from OLTP, and so on. We often focus so much on specific techniques that we forget the bigger picture of what's going on. This little exchange made me look again at the forest, not just the trees.

Michel Martens:

What does it mean to use Redis as a traditional database? If it means to save all your data and expect to retrieve it later in new and creative ways, then we have to agree that better tools are available. It is one of Redis tradeoffs: you have to think in advance how you will want to get your data back. Another tradeoff has to do with space: Redis is not a good fit for Big Data. It's not even a good fit for Medium Data. You are in charge of making good use of the available memory, and there's still no elegant way to work around that limitation.

Antirez responds with:

Not at all, the whole idea of its data model, and part of the fact that it will be so fast to retrieve your data, is that you need to think in terms of organising your data for fetching. You need to design with the query patterns in mind. In short most of the times your data inside Redis is stored in a way that is natural to query for your use case, that's why is so fast apart from being in memory, there is no query analysis, optimisation, data reordering. You are just operating on data structures via primitive capabilities offered by those data structures. End of the story.

...

The reality is that fancy queries are an awesome SQL capability (so incredible that it was hard for all us to escape this warm and comfortable paradigm), but not at scale. So anyway if your data needs to be composed to be served, you are not in good waters.
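To make "organising your data for fetching" concrete, here is a minimal sketch in Python. It does not talk to Redis at all: plain dictionaries and sorted lists stand in for Redis hashes and sorted sets, and the names (posts, timelines, add_post, latest_posts) are hypothetical, chosen only for illustration. The point it tries to show is that the write path does the organizing, so the read path is a direct operation on a data structure rather than a query to be analysed.

```python
from bisect import insort
from collections import defaultdict

# Stand-ins for Redis structures (an assumption for illustration only):
#   posts     ~ one hash per post, keyed by post id
#   timelines ~ one sorted set per author, ordered by timestamp
posts = {}
timelines = defaultdict(list)


def add_post(post_id, author, timestamp, text):
    """Write path: store the post AND update the index the read path needs."""
    posts[post_id] = {"author": author, "timestamp": timestamp, "text": text}
    # Keep each author's timeline sorted by timestamp at write time.
    insort(timelines[author], (timestamp, post_id))


def latest_posts(author, count):
    """Read path: slice a structure that is already in the order we want."""
    entries = timelines.get(author, [])
    newest = entries[max(0, len(entries) - count):]
    # Newest first; no filtering, joining, or sorting happens at read time.
    return [posts[post_id]["text"] for _, post_id in reversed(newest)]


add_post("p1", "alice", 100, "first")
add_post("p2", "bob", 150, "hello")
add_post("p3", "alice", 200, "second")
print(latest_posts("alice", 2))
```

The tradeoff Martens describes shows up directly: a question nobody planned for, such as "all posts containing a given word", has no structure prepared for it and would require scanning everything.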

Both posts are well worth reading.

On Hacker News
