Big Data In the Cloud Using Cloudify

Edd Dumbill wrote an interesting article on O’Reilly Radar covering the current solutions for running Big Data in the Cloud

Big data and cloud technology go hand-in-hand. Big data needs clusters of servers for processing, which clouds can readily provide.

Big PaaS

Edd touched briefly on the role of PaaS for delivering Big Data applications in the cloud

Beyond IaaS, several cloud services provide application layer support for big data work. Sometimes referred to as managed solutions, or platform as a service

(PaaS), these services remove the need to ucale things such as databases or MapReduce, reducing your workload and maintenance burden. Additionally, PaaS providers can realize great efficiencies by hosting at the application level, and pass those savings on to the customer.

To put it simply, managing data clusters is one thing. Being able to process the data is yet another challenge that we need to think about when we’re dealing with application platforms, as I noted in one of my earlier posts

and this is where PaaS plays an important role.

The main challenge is that quite often the management of the data processing logic is built on completely different scaling, availability and monitoring tools than the one used for managing our Big Data deployment. It turns out, that this silo thinking leads to a whole set of complexities starting from the inconsistency in having multiple managers, each determined in a different way when there is a failure or scaling event, and that quite often end up conflicting with one another. Having lots of moving parts is yet another challenge that makes the entire deployment pretty much a complete mess.

In this post, I wanted to cover more specifically how I see the evolution of cloud application platforms (PaaS) to support Big Data. I’ll refer specifically to Cloudify

which was designed primarily to support Big Data applications.

Read the full story here.