Saturday
Jun 28, 2008

ID generation schemes

Hi, generating unique ids is a common requirement in many projects. Generally this responsibility is given to the database layer, using sequences or some other technique, but that is a problem for horizontal scalability. What GUID generation schemes are generally used by highly scalable web sites? I have seen Java's SecureRandom class used to generate GUIDs. What other methods are generally used? Thanks, Unmesh
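Since the question mentions Java's SecureRandom, here is a minimal sketch (in Python, purely for illustration) of application-tier id generation that avoids a central database sequence; both helpers draw on the OS's secure random source:

```python
# Generate unique ids in the application tier so no central database
# sequence becomes a scaling bottleneck. uuid4 is seeded from the
# OS's secure random source, analogous to Java's SecureRandom.
import uuid
import secrets

def new_guid() -> str:
    """A random 122-bit UUID in the standard 36-character format."""
    return str(uuid.uuid4())

def new_token() -> str:
    """128 random bits rendered as a 32-character hex string."""
    return secrets.token_hex(16)
```

With 122 random bits, the chance of a collision is negligible for any realistic number of ids, which is why this approach needs no coordination between servers.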


Wednesday
Jun 11, 2008

Pyshards aspires to build a sharding toolkit for Python

I've been interested in sharding concepts since first hearing the term "shard" a few years back. My interest had been piqued earlier, the first time I read about Google's original approach to distributed search. It was described as a hashtable-like system in which independent physical machines play the role of the buckets. More recently, I needed the capacity and performance of a sharded system, but did not find helpful libraries or toolkits which would assist with the configuration for my language of preference these days, which is Python. And, since I had a few weeks on my hands, I decided I would begin the work of creating these tools. The result of my initial work is the Pyshards project, a still-incomplete Python and MySQL based horizontal partitioning and sharding toolkit.

HighScalability.com readers will already know that horizontal partitioning is a data segmenting pattern in which distinct groups of physical row-based datasets are distributed across multiple partitions. When the partitions exist as independent databases within a shared-nothing architecture, they are known as shards. (Google apparently coined the term shard for such database partitions, and Pyshards has adopted it.) The goal is to provide big opportunities for database scalability while maintaining good performance. Sharded datasets can be queried individually (one shard) or collectively (aggregate of all shards).

In the spirit of The Zen of Python, Pyshards focuses on one obvious way to accomplish horizontal partitioning, and that is by using a hash/modulo based algorithm. Pyshards provides the ability to reasonably add polynomial capacity (number of original shards squared) without re-balancing (re-sharding). Pyshards is also designed with re-sharding in mind (because the time will come when you must re-balance) and provides re-sharding algorithms and tools. Finally, Pyshards aspires to provide a web-based shard monitoring tool so that you can keep an eye on resource capacity.
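The hash/modulo routing idea can be sketched in a few lines. This is not Pyshards' actual API; the shard DSNs and the CRC32 hash choice here are made-up illustrations:

```python
# Route each row key to a shard with a stable hash and a modulo over
# the shard count. Any stable hash works; CRC32 is used here only
# because it is in the standard library and deterministic across runs.
import zlib

SHARDS = [
    "mysql://db0/app",  # shard 0
    "mysql://db1/app",  # shard 1
    "mysql://db2/app",  # shard 2
    "mysql://db3/app",  # shard 3
]

def shard_for(key: str) -> str:
    """Map a row key to the DSN of the shard that owns it."""
    h = zlib.crc32(key.encode("utf-8"))
    return SHARDS[h % len(SHARDS)]
```

The catch the post alludes to: changing `len(SHARDS)` remaps nearly every key, which is why re-sharding tooling (or a scheme that grows capacity without changing the modulus) matters.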
So why publish an incomplete open source project? I'd really prefer to work with others who are interested in this topic instead of working in a vacuum. If you are curious, or think you might want to get involved, come visit the project page, join a mailing list, or add a comment on the WIKI. http://code.google.com/p/pyshards/wiki/Pyshards Devin


Monday
Jun 9, 2008

Apple's iPhone to Use a Centralized Push Based Notification Architecture

Update 2: Hank Williams says iPhone Background Processing: Not Fixed But Halfway There. Excellent analysis of all the reasons you need real background processing. Hey, you can't even build an alarm clock! Hard to believe some commenters say it's not so. Update: Josh Lowensohn of Webware tells us Why users should be scared of Apple's new notification system. A big item on the iPhone developer iWishlist has been background processing. If you can't write an app to poll for new data in the background, how will you keep your even more important non-foreground app in sync? Live from the Apple developer conference we learn the solution is a centralized push based architecture. Here's the relevant MacRumorsLive transcript:

  • Thanking the developers for their hard work. Now talking about how the #1 request has been background support. Apple wants to solve this problem.
  • The wrong solution would be to allow for background processes -- bad for battery life and performance. Poking fun at Windows Mobile's task manager.
  • Apple has come up with a far better solution -- a push notification service available for all developers.
  • When the user quits the application, Apple will push updates from their servers to the iPhone. The developer's servers push the notifications to Apple. These updates can include badges, sounds, and custom messages. This requires just one persistent connection and is extremely scalable.
A poll based architecture is good when system capabilities are relatively symmetric. Clearly mobile phones are restricted along a number of dimensions, the most important being battery power. Having a large number of apps constantly polling for updates sucks down battery power faster than vampires at a phlebotomist convention. So Apple's logic is sound: keep a single connection over which data is pushed and work on the phone is minimized. You also maximize battery life and bandwidth usage because data can be aggregated on the server side and sent in large chunks rather than a random distribution of small packets. The mechanics of how this works aren't clear. Must all apps needing to push data to a phone become part of Apple's private iPhone cloud? Smart for Apple, as it gives them complete control. For sculptors of the ultimate user experience you want total control. Not so good for developers, as it's just another garden with a very high wall protecting it.


Monday
Jun 9, 2008

    FaceStat's Rousing Tale of Scaling Woe and Wisdom Won

Lukas Biewald shares a fascinating slam by slam recount of how his FaceStat (upload your picture and be judged by the masses) site was battered by a link on Yahoo's main page that caused an almost instantaneous 650,000 page view jump on their site. Yahoo spends considerable effort making sure its own properties can handle the truly massive flow from the main page. Turning the Great Eye of the Internet towards an unsuspecting newborn site must be quite the diaper-ready experience. Theo Schlossnagle eerily prophesied about such events in The Implications of Punctuated Scalabilium for Website Architecture: massive, unexpected and sudden traffic spikes will become more common as a fickle internet seeks ever for new entertainments (my summary). Exactly FaceStat's situation. This is also one of our first exposures to an application written on Merb, a popular Ruby on Rails competitor. For those who think Ruby is the problem, their architecture now serves 100 times the original load. How did our fine FaceStat fellowship fare against Yahoo's onslaught? Not a lot of details of FaceStat's architecture are available, so it's not that kind of post. What interested me is that it's a timely example of Theo's traffic spike phenomena, and I was also taken with how well the team handled the challenge. Few could do better so quickly. In fact, let's apply Theo's rubric for how to handle these situations to FaceStat:
  • Be Alert: build automated systems to detect and pinpoint the cause of these issues quickly (in less than 60 seconds). None initially, but they are building in more monitoring to better handle future situations. Better monitoring would have alerted them to the problems long before they actually were alerted. Perhaps many more potential customers might have been converted to actual customers. You can never have enough monitoring!
  • Be Prepared: understand the bottlenecks of your service systemically. As the system was relatively simple, new, and quickly changed, my assumption is they were fully aware of their system’s shortcomings, they were just busy with adding features rather than worrying about performance and scalability.
  • Perform Triage: understand the importance of the various services that make up your site. Definitely. They “started ripping out every database intensive feature” in response to the load.
  • Be Calm: any action that is not analytically driven is a waste of time and energy. They stayed amazingly calm, as can be seen from the following quote: "It's one thing to code scalably and grow slowly under increasing load, but it's been a blast to crazily rearchitect a live site like FaceStat in a day or two." I'm not sure how analytically driven they were, however. All-in-all an impressive response to the Great Eye's undivided attention. But not everyone was as impressed as I. A commenter named Bernard said: Sorry, but this is a really dumb story. Given how dirt cheap things like slicehost and linode are, it is crazy that you launched a web app and had not already prepared a redundant, highly-scalable architecture… I'd say you were damn lucky that the disappointed users came back at all. Commenter Will thought it was a "Nice problem to be having!" Which it is, of course: being noticed is better than being ignored. But Lukas was spot on when he lamented that being noticed too soon has a downside: After working so hard to get users to come to your site, it's amazingly frustrating to see hundreds of thousands of people suddenly locked out. Clearly we still don't have the ability for developers to create scalable systems as simply as they create exploratory systems. Ed from Rackspace posted that they could help with their Auto Scale of Arrays feature. And Rackspace would be an excellent solution, but the cost would be $500/month and a $2500 setup fee. No "let's put on a show" startup can afford those costs. The mode FaceStat was in is typical: We find that a Rails-like platform is invaluable for rapidly prototyping a new site, especially since we started FaceStat as a pure experiment with no idea whether people would like it or not, and with a very different feature set in mind compared to what it later became. A pay as you grow model is essential for scalability because that's the only way you can bake scalability in from the start.
And even with all the impressive advances in the industry we still don’t have the software infrastructure to make scaling second nature.

    Information Sources

  • Scaling Fast by Lukas Biewald
  • FaceStat scales! on Dlores BLog

    Platform

  • Merb. Ruby based MVC framework that is ORM-agnostic.
  • Thin. A fast and very simple Ruby web server.
  • Slicehost. Hosting service. Able to quickly provision servers as needed.
  • Amazon’s S3. Image server. Latency is high but it handles the load.
  • Capistrano. Automated deployment.
  • Git with github. Source code control system. Supports efficient simultaneous development, quick merging and deployment.
  • God. Server monitoring and management.
  • Memcached. Application caching layer.
  • PostgreSQL

    The Stats

  • Six app servers.
  • One big database machine.

    The Architecture

  • FaceStat is a write-heavy application and performs involved calculations on data.
  • S3 is used to offload the responsibility for storing images. This freed them from the massive bandwidth requirements and complexity of managing their own images.
  • Memcached offloads reads from the database to allow the database to have more time for writes.
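That read-offloading pattern is plain cache-aside. A self-contained sketch, with a dict standing in for the real memcached client and another for the database (swap in an actual client library and query in practice):

```python
# Cache-aside read path: check memcached first, fall back to the
# database on a miss, then populate the cache so subsequent reads
# never touch the database. Dicts stand in for memcached and
# PostgreSQL so this sketch runs on its own.
cache = {}                       # stands in for memcached
DB = {42: {"name": "lukas"}}     # stands in for the database

def get_user(user_id):
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None:
        return hit               # served from cache, no DB read
    row = DB.get(user_id)        # cache miss: hit the database
    if row is not None:
        cache[key] = row         # populate for subsequent reads
    return row
```

A real memcached `set` would also carry an expiry so stale entries age out; writes must either update or invalidate the cached key.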

    Lessons Learned

  • Monitor the site. The sooner you know about a problem the faster it can be fixed. Don't rely on user email or email from exception handlers or you'll never get ahead of problems.
  • Communicate with your users with an error page. A meaningful error page shows you care and that you are working on the problem. That's enough for a second chance with most people.
  • Use a cached statically generated homepage. Hard to beat that for performance.
  • Big sites might want to give a heads up when they mention smaller sites. Just a short polite email saying how your world will soon turn upside down would do.
  • High-level platform really doesn’t matter compared to overall architecture. How you handle writes, reads, caching, deployment, monitoring, etc are relatively framework independent and it's how you solve those problems that matter.
  • Ruby and Merb supported rapid prototyping to experiment and create a radically different system from the one they intended.


Sunday
Jun 8, 2008

    Search fast in million rows

I have a table. It has many columns, but searches are performed on just one column, and the table can have more than a million rows. The data in this column is something like funny, new york, hollywood. A user can search with parameters like funny hollywood. I need to take these two words and check whether the column contains them, and how many times. It is not possible to index here. If a search returns, say, 1200 results, I can't determine the number of results without comparing each and every row. This query is very frequent. How can I approach this problem, and what type of architecture and tools would help? I know this can be accomplished with a distributed system, but how can I build such a system? I also saw on this website that LinkedIn uses Lucene for search. Is Lucene helpful in my case? My table also has lots of insertions, but updates are not very frequent.
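For what it's worth, the core trick Lucene applies to this kind of problem is an inverted index: map each term to the set of rows containing it, so a query like "funny hollywood" becomes cheap set intersections instead of a scan over a million rows. A toy sketch (the sample rows are made up):

```python
# Build an inverted index: term -> set of row ids containing it.
# Queries then intersect the per-term sets instead of scanning rows.
from collections import defaultdict

rows = {
    1: "funny new york",
    2: "hollywood funny",
    3: "new york",
}

index = defaultdict(set)
for row_id, text in rows.items():
    for term in text.split():
        index[term].add(row_id)

def search(query):
    """Return the ids of rows containing every term in the query."""
    terms = query.split()
    if not terms:
        return set()
    result = index[terms[0]].copy()
    for term in terms[1:]:
        result &= index[term]   # AND semantics: all terms must match
    return result
```

Lucene adds per-term frequencies, ranking, and incremental index updates on top of this idea, which fits an insert-heavy, rarely-updated table well.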


    Friday
Jun 6, 2008

    GigaOm Structure 08 Conference on June 25th in San Francisco

If you just can't get enough high scalability talk you might want to take a look at GigaOm's Structure 08 conference. The slate of speakers looks appropriately interesting and San Francisco is truly magical this time of year. High Scalability readers even get a price break if you use the HIGHSCALE discount code! I'll be on vacation so I won't see you there, but it looks like a good time. For a nice change of pace consider visiting MoMA next door. Here's a blurb on the conference:

    A reminder to our readers about Structure 08, GigaOm's upcoming
    conference dedicated to web infrastructure. In addition to keynotes
    from leaders like Jim Crowe, chairman and CEO of Level 3
    Communications, and Werner Vogels, CTO of Amazon, the event will
    feature workshops from Google App Engine, Microsoft, and a special
    workshop from Fenwick and West, who will cover how to raise money
    for an infrastructure startup. Learn from the gurus at Amazon,
    Google, Microsoft, Sun, VMware and more about what the future of
    internet infrastructure is at Structure 08, which will be held on
    June 25 in San Francisco.
    


    Friday
Jun 6, 2008

    Economies of Non-Scale

Scalability forces us to think differently. What worked on a small scale doesn't always work on a large scale, and costs are no different. If 90% of our application is free of contention, and only 10% is spent on a shared resource, we will need to grow our compute resources by a factor of 100 to scale by a factor of 10! Another important thing to note is that 10x, in this case, is the limit of our ability to scale, even if more resources are added. 1. The cost of non-linearly scalable applications grows exponentially with the demand for more scale. 2. Non-linearly scalable applications have an absolute limit of scalability. According to Amdahl's Law, with 10% contention, the maximum scaling limit is 10. With 40% contention, our maximum scaling limit is 2.5, no matter how many hardware resources we throw at the problem. This post discusses in further detail how to measure the true cost of non-linearly scalable systems and suggests a model for reducing that cost significantly.
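The figures above follow directly from Amdahl's Law: with contended (serial) fraction s and N processors, speedup = 1 / (s + (1 - s) / N), which approaches 1/s as N grows without bound. A quick check:

```python
# Amdahl's Law: speedup with serial fraction s on n processors,
# and its limit as n goes to infinity.
def speedup(s, n):
    return 1.0 / (s + (1.0 - s) / n)

def max_speedup(s):
    """The asymptotic ceiling 1/s, reached only as n -> infinity."""
    return 1.0 / s

# 10% contention caps scaling at 10x; 40% contention caps it at 2.5x,
# matching the figures in the post. Even 100 machines against 10%
# contention yield only about 9.2x.
```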


    Wednesday
Jun 4, 2008

    LinkedIn Architecture

    LinkedIn is the largest professional networking site in the world. LinkedIn employees presented two sessions about their server architecture at JavaOne 2008. This post contains a summary of these presentations. Key topics include:

    • Up-to-date statistics about the LinkedIn user base and activity level
    • The evolution of the LinkedIn architecture, from 2003 to 2008
    • "The Cloud", the specialized server that maintains the LinkedIn network graph
    • Their communication architecture


    Monday
Jun 2, 2008

    Total Cost of Ownership for different web development frameworks

I would like to compile a comparison matrix on the total cost of ownership for .NET, Java, LAMP and Rails. Where should I start? Has anyone seen or does anyone know of a recent study on this subject?


    Saturday
May 31, 2008

    memcached and Storage of Friend list

My first post, please be gentle. I know it is long. You are all like doctors - the more info, the better the diagnosis.

What is the best way to store a list of all of your friends in the memcached cache (a simple boolean saying "yes this user is your friend", or "no")? Think Robert Scoble (26,000+ "friends") on Twitter.com. He views a list of ALL existing users, and in this list his friends are highlighted. I came up with 4 possible methods:

  • store in memcache as an array, a list of all the "yes" friend IDs
  • store your friend IDs as individual elements
  • store as a hash of arrays based on the last 3 digits of the friend's ID, so up to 1000 arrays per user
  • a comma-delimited string of IDs as one element

I'm using the second one because I think it is faster to update. The single array or hash of arrays feels like too much overhead calculating and updating (and even just loading) to check for the existence of a friend. The key is FRIEND[small ID#]_[big ID#]. The value is 1. This way there are no dupes (if I add you as a friend, it always adds me as your friend; if I remove you, you remove me). Stored with it are 2 additional flags: one denotes the start of entries, one denotes the end of entries. As friends are added, the end flag's position relative to new friends will become meaningless, but that is OK (I think). To see if someone is your friend, the system checks that both start and end flags exist. If both exist, it can check for the existence of the friend ID; if it exists, they're a friend. The start flag is required. If the start flag is pushed out of the cache, we must assume some friends were also pushed out. Currently, the system loads from the DB in a background daemon after you log in (if the two flags are not already set). Until the two flags are set, it does DB lookups. There is no timeout on the data in the cache. Adding/removing friends to your account adds/removes to/from memcache, so, theoretically, it might never have to pre-load anything.

The downside of my method is that if the elements span multiple servers and one dies, you lose some of your friends (that's the upside of using arrays). I don't know how to resolve the case where the lost box didn't contain either of the flags -- then the user's info will NEVER get refreshed. This is my concern. Any ideas? Thanks so much!
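The pair-key scheme described in the post can be sketched as follows; a dict stands in for memcached so the sketch is self-contained, and the key format follows the post's FRIEND[small ID#]_[big ID#] convention:

```python
# One cache entry per friendship, keyed by the sorted id pair so that
# (a, b) and (b, a) map to the same entry and no duplicates exist.
# A dict stands in for memcached here; the real system would also
# maintain the start/end sentinel flags the post describes.
cache = {}

def friend_key(a, b):
    lo, hi = sorted((a, b))
    return f"FRIEND{lo}_{hi}"

def add_friend(a, b):
    cache[friend_key(a, b)] = 1      # symmetric by construction

def remove_friend(a, b):
    cache.pop(friend_key(a, b), None)

def is_friend(a, b):
    return friend_key(a, b) in cache
```

Sorting the pair is what makes the relationship symmetric with a single entry; the eviction problem the post worries about comes from memcached dropping individual entries (or whole servers) independently of the sentinel flags.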
