Notes from A NOSQL Evening in Palo Alto
I, along with 180 other people and a veritable who's who of NoSQL vendors, attended the A NoSQL Evening in Palo Alto meetup on Tuesday. The format was a panel of 10 vendors--10gen, Basho, CouchOne, Cloudant, Cloudera, GoGrid, InfiniteGraph, Membase, Riptano, Scality--sitting in two rows of chairs in front of what seemed like a pretty diverse audience. Tim Anglade (founder, A NOSQL Summer) moderated. Tim kept things moving by asking a few leading questions and the panel chimed in with answers. Quite a few questions came from the audience, which was refreshing.
Overall a genial evening with some good discussion. I was pleased that the panel members didn't just automatically slip into marketing speak. Most of the discussions were on point rather than just another excuse to hit the talking points. There were some complaints about the talk not being technical enough, but I don't think that was really the purpose of this kind of talk. The panel format is excellent at giving a wide range of views on general topics, and that's exactly how the evening went.
Some key takeaways:
- Good energy. A lot of people are trying to do good things and are excited to be in a space where technology still matters more than politics. Real problems are being solved for customers and that's motivating.
- NoSQL took away the relational model and gave nothing back. Using NoSQL for complex data puts way too much pressure on the programmer.
- NoSQL will not converge. There's no consensus on what the next thing will be, so we are unlikely to see any standardization in the NoSQL world any time soon. There is a convergence on some features, but it seems the products will evolve to serve specific markets. This is not a bad thing. NoSQL doesn't need to converge on one stack. Products can remain differentiated by being able to solve specific problems.
- NoSQL has a parallel to the "back to the land movement". As the relational world and the framework world got ever more complex and expensive, a counter movement developed that sought out simplicity and transparency.
There's really no way to say what the discussion was about because it went in a lot of different directions. So I'll just share a few of the points that stood out. And there's also no way I can say who said what either. Pretend it came from NoSQL's collective unconscious. I apologize in advance if I missed something or got something wrong; I can only take notes so fast.
What drove the innovation of the NOSQL movement? It all seemed to come at once.
This is where having a large panel really shines. A wide variety of answers were given, each adding a brick to what must remain an unfinished building that is the true answer.
- Need.
- Failure of the RDBMS. Supporters of the relational model always said relationships aren't the problem, implementations are the problem, but we never seemed to get implementations that people wanted to use.
- Advent of the Cloud and commodity computing.
- Nature of new applications made it feasible. Social networking and interactive web sites are a natural fit.
- BigData.
- Highly concurrent applications with different access patterns.
- Knowledge of P2P systems.
- Survivors of the dot com crash were all using NoSQL tech and that spread out.
- Many NoSQL type systems existed previously but were proprietary. They needed to be productized and be made generally useful and attractive to developers.
- Scale.
- Data modeling for relational databases was hard to do, even if you got it correct. The schemaless NoSQL model is a more natural fit for dynamic applications.
- Relational databases lied about their ability to scale by adding more hardware.
- The high cost of RDBMS clusters.
- SQL at scale isn't pretty because it requires sharding on top of the database, which loses all the benefits of the relational model, so why do it?
- Building a sharded system on top of SQL is a one-of-a-kind build; it's not a transferable skill. If you build a system with a NoSQL database you've learned something that can be used going forward when building other systems.
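To make the sharding complaint above concrete, here is a minimal sketch of what "sharding on top of the database" tends to look like. It is an illustration only, not anything a panelist showed: the `users` table, the shard count, and every function name are hypothetical, and in-memory SQLite databases stand in for separate database servers. Single-key lookups route cleanly to one shard, but any query that is not keyed on the shard key has to be fanned out and merged in application code, which is where the relational model's benefits start to slip away.

```python
import sqlite3
import zlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def make_shards(n=NUM_SHARDS):
    """Create n independent in-memory SQLite databases, one per shard."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
        )
        shards.append(conn)
    return shards


def shard_for(user_id, shards):
    """Route a key to a shard with a stable hash (CRC32 modulo shard count)."""
    key = str(user_id).encode("utf-8")
    return shards[zlib.crc32(key) % len(shards)]


def insert_user(shards, user_id, name, city):
    """Write a row to the single shard that owns this user id."""
    conn = shard_for(user_id, shards)
    conn.execute("INSERT INTO users VALUES (?, ?, ?)", (user_id, name, city))


def get_user(shards, user_id):
    """A lookup by shard key touches exactly one shard."""
    conn = shard_for(user_id, shards)
    return conn.execute(
        "SELECT id, name, city FROM users WHERE id = ?", (user_id,)
    ).fetchone()


def users_in_city(shards, city):
    """A query on a non-shard column must visit every shard and merge by hand."""
    rows = []
    for conn in shards:
        rows.extend(
            conn.execute(
                "SELECT id, name, city FROM users WHERE city = ?", (city,)
            ).fetchall()
        )
    return sorted(rows)


if __name__ == "__main__":
    shards = make_shards()
    insert_user(shards, 1, "Ann", "Palo Alto")
    insert_user(shards, 2, "Bob", "Palo Alto")
    print(get_user(shards, 1))
    print(users_in_city(shards, "Palo Alto"))
```

Joins, transactions, and uniqueness constraints across shards would all need similar hand-written code, which is roughly the "one of a kind build" the panel was describing.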
Seems to be Two Trends, Becoming Developer Friendly and Operationally Friendly.
- There was an interesting discussion about whether all the products are being sucked into the same place by customers who are all saying the same things and having the same problems. They want horizontal scalability, ease of development, ease of operation, elasticity, reliability, and support for complex data types. Will all the products end up looking alike? The consensus seemed to be that no, there were enough different problems and approaches that there wouldn't be convergence, though there will be common threads.
- The Relational Database Market is multiple billions of dollars large and the NoSQL market is 1% of that. Essentially nothing. By opening up new markets and helping people do things they could never do before, the NoSQL market could grow to 10x the existing relational database market. NoSQL helps you store all the data and do something with it, relational doesn't.
- It was asserted that NoSQL can handle 70% of the use cases that a relational database covers.
- NoSQL has not arrived. We still need basic stuff. Core functionality still needs development.
- The flexibility of NoSQL data modeling is the key to agility and innovation.
- NoSQL is in the BigData space which has a lot of customers with deep pockets.
- A lot of data is being dropped on the floor currently that could be mined at profit. That's a big opportunity for NoSQL.
- Is BigData a myth? If you have any two of Velocity, Volume, or Variety then you have a BigData problem; it's not just about volume. It depends on what you are doing with the data.
- Can we pick just one database? NoSQL typically doesn't support range queries, secondary indexes, and other programmer friendly features.
- Politics and skills of the staff are heavily involved in product wins in the enterprise. Most growth is from new applications where the relational bias doesn't have to be battered down quite as hard.
Use Cases
This was an interesting section of the evening because after all the talk about how well NoSQL filled so many use cases, there wasn't much talk about what those use cases were. It's clear more work has to be done on figuring out what these use cases are. Some applications:
- Realtime and batch analytics.
- Graph analytics. Targeting social graphs and knowledge graphs.
- Realtime targeting.
- Session store.
- Anything that needs to store and get key-value data with bounded latencies.
- Applications requiring dynamic schemas, particularly those that take in feeds where the schema changes a lot.
- Aggregation of data from different sources.
- Pattern recognition.
- Syncing data between customers.
- Untethered scenarios where data is locally edited and then later synced over the network.
- Spaces where technology matters to solve real business problems. A place with fewer political battles.
- NoSQL isn't just about BigData, it's about problems that require high concurrency and support for large numbers of interactive users at the data layer.
- NoSQL took away the relational model and gave nothing back, said an audience member. NoSQL has lost the idea of how data relates to other data. All those nasty relationships are pushed to the poor programmer to implement in code rather than the database managing them. We've sacrificed usability. NoSQL is about concurrency, latency, and scalability, but not about data.
- We've built all these crazy new models. What's next?
- Explicit concurrency and parallelism will be necessary to exploit resources.
A clear theme is: real-time, interactive applications, at scale with low latency guarantees.
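The audience member's complaint above, that relationships get pushed onto the programmer, can be sketched in a few lines. This is an illustration under stated assumptions, not something presented at the meetup: a plain dict stands in for a key-value store, and the `order:` / `customer:` key scheme and all function names are hypothetical. What a relational database would express as a single JOIN becomes a loop of lookups plus hand-written handling for missing records.

```python
def put(store, key, value):
    """Write a value under a key; the dict stands in for a key-value store."""
    store[key] = value


def get(store, key):
    """Read a value by key, returning None when the key is absent."""
    return store.get(key)


def orders_with_customer(store, order_ids):
    """Application-side join: resolve each order's customer by a second lookup.

    The store knows nothing about the order -> customer relationship, so
    missing orders and dangling customer references are handled here.
    """
    result = []
    for order_id in order_ids:
        order = get(store, f"order:{order_id}")
        if order is None:
            continue  # no such order; a relational query would simply omit it
        customer = get(store, f"customer:{order['customer_id']}")
        result.append(
            {
                "order_id": order_id,
                "total": order["total"],
                # No foreign-key constraint exists, so the customer may be missing.
                "customer_name": customer["name"] if customer else None,
            }
        )
    return result


if __name__ == "__main__":
    store = {}
    put(store, "customer:7", {"name": "Ann"})
    put(store, "order:1", {"customer_id": 7, "total": 30})
    print(orders_with_customer(store, [1]))
```

Nothing in the store prevents an order from pointing at a customer that was never written, which is the kind of integrity work the speaker felt had been handed back to application code.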
NoSQL Unification
- One reason SQL won is because there was a standard. There was a standard because there was a market leader like IBM to create a standard. There's no equivalent market leader in the NoSQL world so there won't be a unification for a very long time.
- A lot of crazy sh*t hasn't been invented yet. New problems will cause new products to be developed.
- NoSQL doesn't need to converge back to one stack. NoSQL doesn't need to reinvent relational databases again using a different form. A thousand flowers can bloom. NoSQL can focus on solving practical business problems in different ways.
- Indexes of data will be orders of magnitude larger than the data. Improvements in memory technology will make this possible. Lots of available RAM will make it possible to solve problems that were impossible or too expensive to solve before.
- CPUs are still sitting at 10%-20% usage, IO is at a premium and the network is easily saturated.
Business Models
- Most enterprises don't have the expertise to set up and maintain a NoSQL infrastructure so it will be used as a service.
- Wide variety of different approaches to monetizing NoSQL. Some people think hiring an operations team is difficult so they don't want to be in that game, and others are attacking the full support play as an opportunity.
- Startups are cool with using open source products without a lot of support. Enterprises really want to have a relationship with a trusted company that they can call at 2AM to fix problems.
- Going open source is a form of escrow, letting customers feel safe because they have the code.
- Open source is a good adjunct to hiring because you can choose from project committers whose skill, personality, and knowledge of the project you already know something about.
- Some companies are selling support and services. Others are going the "extras" route where advanced features are sold separately from the core.
- Closed source is seen as a viable model as long as you have the support to go along with it. Easier for more established companies to pull off. Startups are more likely to be successful with open source.
- Enterprises are in some way the most appropriate market for NoSQL because they have knowledge and resources to understand and test NoSQL to verify that it works.
- NoSQL companies should cooperate to better compete with the relational market.
- There will be bugs. Things happen. Have a plan to deal with them professionally. People will always say bad things about you; the key is to take control of positive press generated by customers.
What's Next?
At the end companies talked about what was next for them.
- Reduce barriers to entry.
- Add features. There's a lot of catch-up being played as different vendors try to match each other feature-wise.
- Scale support.
- Build more partnerships. This seems a key strategy for some companies.
- Add more language bindings and APIs.
- Port to more platforms.
- Push more logic into the database.
- Own the entire stack. Cassandra mentioned this in the context of directly supporting Lucene. I'm not exactly clear what it means though.