There has been a lot of talk recently about Twitter's scalability and their choice of a specific language, Scala, to address the scalability problems of their existing Ruby-based system. In this post I try to provide a more methodical approach to handling Twitter's scalability challenges, one that is centered on the right choice of architecture patterns rather than on the language itself. The architecture patterns are described in a generic fashion that is not specific to Twitter and can serve anyone who is looking to build a scalable real-time web application in the near future.
In this post I provide a summary of recent discussions outlining the main challenges that developers face today when deploying their existing JEE applications to the cloud, such as complexity, database integration, security, and standard JEE support. I also summarize how we managed to handle those challenges with our new Cloud Computing Framework, pointing to an existing production reference at a leading telco provider.
For those interested in building scalable systems, today I will talk about the Facebook Chat architecture. To start, a key quote:
"When your feature's userbase will go from 0 to 70 million practically overnight, scalability has to be baked in from the start." - Eugene Letuchy, lead engineer on Facebook Chat
Facebook's engineering director Aditya talks about Facebook's architecture: how they use MySQL, PHP and memcached, and how they have modified each of them to suit their requirements.
I'm looking for a design pattern, advice, or directions. I need to count views/downloads of a set of resources, each identified by its URL. That part is not a big problem. I also need to keep a list of the resources viewed/downloaded in the last X days. This list needs to be updated every now and then so that it reflects the real last X days of usage, meaning that requests older than X days get evicted from it. So it's sort of a black box: you feed messages (download requests) in, and it gives you that list of URLs with counters on the other end. How would you go about designing it?
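One possible shape for that black box, as a minimal sketch rather than a definitive design: keep one counter bucket per day plus a running total, and drop whole buckets once they fall out of the window. The class and method names below (`SlidingWindowCounter`, `record`, `snapshot`) are made up for illustration, and the sketch assumes requests arrive in non-decreasing day order and that per-day granularity is precise enough.

```python
from collections import Counter, deque
from datetime import date, timedelta


class SlidingWindowCounter:
    """Counts hits per URL over the last `window_days` days, one bucket per day."""

    def __init__(self, window_days):
        self.window_days = window_days
        self.buckets = deque()   # (day, Counter) pairs, oldest first
        self.totals = Counter()  # running totals over all live buckets

    def _evict(self, today):
        # A bucket is live only if its day is later than today - window_days.
        cutoff = today - timedelta(days=self.window_days)
        while self.buckets and self.buckets[0][0] <= cutoff:
            _, old = self.buckets.popleft()
            for url, n in old.items():
                self.totals[url] -= n
                if self.totals[url] <= 0:
                    del self.totals[url]  # URL no longer seen inside the window

    def record(self, url, day):
        """Feed one download/view request in (assumes days never go backwards)."""
        self._evict(day)
        if not self.buckets or self.buckets[-1][0] != day:
            self.buckets.append((day, Counter()))
        self.buckets[-1][1][url] += 1
        self.totals[url] += 1

    def snapshot(self, today):
        """Return {url: count} for the window ending on `today`."""
        self._evict(today)
        return dict(self.totals)


counter = SlidingWindowCounter(window_days=3)
counter.record("/a", date(2024, 1, 1))
counter.record("/a", date(2024, 1, 2))
counter.record("/b", date(2024, 1, 2))
print(counter.snapshot(date(2024, 1, 4)))
```

Eviction cost is proportional to the number of distinct URLs in the expired bucket, not to the total number of requests, which is the main reason for bucketing instead of storing every request timestamp.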
Hibernate, iBATIS and other similar tools have documentation with recommendations for avoiding the "N+1 select" problem. The problem is that if you want to retrieve a set of widgets from a table, one query is used to retrieve all the ids of the matching widgets (select widget_id from widget where ...) and then, for each id, another select is used to retrieve the details of that widget (select * from widget where widget_id = ?). If you have 100 widgets, it takes 101 queries to get the details of them all. I can see why this is bad, but what if you're doing entity caching? That is, you run the first query to get your list of ids, and then retrieve each widget from the cache. Surely in that case N+1 (+caching) is good? Assuming, of course, that there is a high probability of all of the matching entities being in the cache. I may be asking a daft question here, one whose answer is obviously implied by the large scalable mechanisms for storing data that are in use these days.
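To make the trade-off concrete, here is a minimal sketch of the "ids first, then cache" approach with one hedge against cold caches: any ids missing from the cache are loaded in a single batched query instead of one query each. This is not how Hibernate or iBATIS implement it; the `load_widgets` function, the plain dict standing in for the entity cache, and the `name` column are assumptions made for illustration.

```python
import sqlite3


def load_widgets(conn, cache):
    """Return (widgets, queries_issued) using an id query plus a cache lookup."""
    # Query 1: fetch only the ids of the matching widgets.
    ids = [row[0] for row in conn.execute(
        "SELECT widget_id FROM widget ORDER BY widget_id")]
    queries = 1

    # Cache misses are loaded together, so the worst case is 2 queries, not N+1.
    missing = [i for i in ids if i not in cache]
    if missing:
        placeholders = ",".join("?" * len(missing))
        rows = conn.execute(
            f"SELECT widget_id, name FROM widget "
            f"WHERE widget_id IN ({placeholders})", missing)
        for widget_id, name in rows:
            cache[widget_id] = {"widget_id": widget_id, "name": name}
        queries += 1

    return [cache[i] for i in ids], queries


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (widget_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO widget VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

cache = {1: {"widget_id": 1, "name": "a"}}   # partially warm cache
widgets, queries = load_widgets(conn, cache)
print(len(widgets), queries)                 # cold entries cost one extra query
widgets, queries = load_widgets(conn, cache)
print(len(widgets), queries)                 # fully warm cache: id query only
```

With a fully warm cache the cost is the single id query plus N cache lookups, which is the case the question describes. The batched fallback only matters when the hit rate drops. For very large miss lists the IN clause would need chunking, since databases limit the number of bound parameters.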
Lessons learned from the largest players (Flickr, YouTube, Google, etc.)
Today I would like to write about some lessons learned from the biggest players in highly scalable web applications. I will divide the lessons into 4 points:
* Start slow and small, and measure the right thing.
* Vertical scalability vs. horizontal scalability.
* Every problem has its own solution.
* General lessons learned.
Lessons learned from OpenX's large-scale deployment to Amazon EC2: