Is there any type of server technology that allows visitors to a website to become part of the server? With BitTorrent, users share some of their bandwidth. Would something similar be possible with web servers, where a person goes to a website, then downloads and runs software that makes their internet connection, CPU, and hard disk part of the web server?
GemFire Enterprise is an in-memory distributed data management platform that pools memory (and CPU, network, and optionally local disk) across multiple processes to manage application objects and behavior. With the 6.0 release, GemFire has reached a stage of maturity in its evolution. GemStone touts this version as the true 'best of breed' distributed caching technology, solving scalability issues in all industries.
Facebook has the second largest installation of Hadoop (a software platform that lets one easily write and run applications that process vast amounts of data), Yahoo being the first.
Learn how they do it and what the challenges are on the DBMS2 blog, a blog for people who care about database and analytic technologies.
Someone posted this on Wiki: "...For relatively small installations, pub/sub provides the opportunity for better scalability than traditional client-server, through parallel operation, message caching, tree-based or network-based routing, etc. However, as systems scale up to become datacenters with thousands of servers sharing the pub/sub infrastructure, this benefit is often lost; in fact, scalability for pub/sub products under high load in large deployments is very much a research challenge." Does anyone have something to say about scaling publish/subscribe models?
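For anyone who wants something concrete to reason about, here is a minimal sketch of a topic-based pub/sub broker, assuming a single in-process broker with synchronous delivery. The `Broker` class and the topic names are hypothetical and invented for illustration; they are not the API of any real pub/sub product. In this sketch each `publish` call does work proportional to the number of subscribers on the topic, which is one simple way to see why a single shared broker can become a bottleneck as subscribers multiply.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    """Hypothetical in-process broker: routes each message by exact topic name."""

    def __init__(self) -> None:
        # topic name -> list of subscriber callbacks
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # Fan out to every subscriber of the topic; cost grows with their number.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
received: list[tuple[str, str]] = []
broker.subscribe("orders", lambda m: received.append(("billing", m)))
broker.subscribe("orders", lambda m: received.append(("shipping", m)))

delivered = broker.publish("orders", "order-42")   # two subscribers on this topic
ignored = broker.publish("refunds", "refund-7")    # nobody subscribed to this topic
```

The publisher never names its consumers, only the topic, which is what lets subscribers be added without touching publisher code.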
Wille Faler has created an excellent list of best practices for building scalable and high performance systems. Here's a short summary of his points:
The goal of DryadLINQ is to make distributed computing on large compute clusters simple enough for ordinary programmers. DryadLINQ combines two important pieces of Microsoft technology: the Dryad distributed execution engine and the .NET Language Integrated Query (LINQ).
The Dryad Project is investigating programming models for writing parallel and distributed programs to scale from a small cluster to a large data-center.
Art of Distributed
Part 1: Rethinking distributed computing models
I've been getting a lot of questions lately about distributed computing, especially distributed computing models and MapReduce, such as: What is MapReduce? Can MapReduce fit all situations? How does it compare with other technologies such as grid computing? And what is the best solution for our situation? So I decided to write an article about distributed computing in two parts. The first part covers distributed computing models and the differences between them. In the second part I will discuss reliability and distributed storage systems. Download the article in PDF format. Download the article in MS Word format. I look forward to your comments and questions, and I will answer them in part two.
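As a rough answer to "What is MapReduce?", here is a minimal single-process sketch of the programming model, using word counting as the example. It is an illustration only, not how Hadoop or any other framework is implemented: the function names (`map_words`, `reduce_counts`, `run_mapreduce`) are hypothetical, and a real system would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(document: str) -> Iterator[tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: list[int]) -> tuple[str, int]:
    """Reduce phase: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterable[tuple[str, int]]],
    reducer: Callable[[str, list[int]], tuple[str, int]],
) -> dict[str, int]:
    # Shuffle step: group every emitted value under its key.
    groups: dict[str, list[int]] = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    # Each key is reduced independently, which is what lets a cluster parallelize it.
    return dict(reducer(key, values) for key, values in groups.items())


docs = ["the quick brown fox", "the lazy dog", "the fox"]
word_counts = run_mapreduce(docs, map_words, reduce_counts)
```

The programmer writes only the two small functions. Grouping by key and distributing the work are the framework's job, which is why the model suits problems that split cleanly into independent keys and suits others less well.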
We are planning to be the first company to run a one-million-user load test, and we are looking for someone willing to be the first to be subjected to such a test! Is YOUR site scalable enough? How do you KNOW? http://capcalblog.blogspot.com. Randy Hayes CapCal
The abstract for the talk given by Bob Ippolito, co-founder and CTO of Mochi Media, Inc:
Building large systems on top of a traditional single-master RDBMS data storage layer is no longer good enough. This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability. Is your application a good fit for caches, bloom filters, bitmap indexes, column stores, distributed key/value stores, or document databases? Learn how they work (in theory and practice) and decide for yourself.
Bob does an excellent job highlighting different products and the key concepts to understand when pondering the wide variety of new database offerings. It's unlikely you'll be able to say "oh, this is the database for me" after watching the presentation, but you will be much better informed about your options. And, I imagine, slightly confused as to what to do :-)
An interesting observation in the talk is that the more robust products are either internal to large companies like Amazon and Google or are commercial. A lot of the open source products aren't yet considered ready for prime time, and Bob encourages developers to join a project and contribute patches rather than start yet another half-finished key-value store clone. From my monitoring of the interwebs this does seem to be happening, and existing products are starting to mature.
Of all the choices discussed, the column database Vertica seems closest to Bob's heart, and it's the product they use. It supports clustering, column storage, compression, bitmapped indexes, bloom filters, grids, and lots of other useful features. And most importantly: it works, which is always a plus :-)
Here's a summary of some of the points talked about in the presentation: