How do you design a reliable distributed file system when the expected availability of the individual nodes are only ~1/5? That is the case for P2P systems. Dominik Grolimund, the founder of a Swiss startup Caleido will show you how! They have launched Wuala, the social online storage service which scales as new nodes join the P2P network.
The goal of Wua.la is to provide distributed online storage that is:
by harnessing the idle resources of participating computers.
This challenge is an old dream of computer science. In fact as Andrew Tanenbaum wrote in 1995:
"The design of a world-wide, fully transparent distributed filesystem fot simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader"
After three years of research and development at at ETH Zurich, the Swiss Federal Institute of Technology on a distributed storage system, Caleido is ready to unveil the result: Wuala. Wuala is a new way of storing, sharing, and publishing files on the internet. It enables its users to trade parts of their local storage for online storage and it allows us to provide a better service for free. In this Google Tech Talk, Dominik will explain what Wuala is and how it works, and he will also show a demo.
i didnt find a nearly good solution for this problem yet:
imagine, you're responsible for a small CDN network (static images), with two different datacenter. the balancing for the two DC is done with a anycast nameservice (a nameserver in every DC, user gets on nearest location). so, one of the scenario is that one of the datacenters goes down completly. you can do a monitoring on the nameserver and only route to the dc which is still alive, no problem. But what about the TTL from the DNS-Records? Tiny TTLs like 2 min. are often ignored by several ISP (e.g. AOL). so, the client doesn't get the IP from the other Datacenter. what could be a solution in this scenario?
Hi,
I was wondering if anyone has found any information on how to architect a system to support high availability file uploads. My scenario: I have an Apache server proxying requests to a bunch of Tomcat Java application servers. When I need to upgrade my site, I stop and upgrade each of the Tomcat servers one at a time. This seems to work well as Apache automatically routes subsequent requests for the stopped app server to the remaining app servers that are up. The problem is that if a user is uploading a file when the app server is stopped, the upload fails and the user has to upload the file again. This is problematic as uploading files is an integral feature of the site and it's frustrating for the users to have to restart their uploads every time I upgrade the site (which I want to be able to do frequently).
Has anyone seen any information on how this can be done or have ideas on how this can be architected? I imagine sites like Flickr must have a solution to this problem as I have seen presentations they say that they are able to upgrade their site several times a day without the users noticing.
Thanks!
Tuyen
Our company offers a web service that is provided to users from several different hosting centers across the globe.
The content and functionality at each of the servers is almost exactly the same, and we could have based them all in a single location. However, we chose to distribute the servers geographically to offer our users the best performance, regardless where they might be.
Up until now, the only content on the servers that has had to be synchronized is the server software itself. The features and functionality of our service are being updated regularly, so every week or two we push updates out to all the servers at basically the same time. We use a relatively manual approach to do the updating, but it works fine.
Sometime soon, however, our synchronization needs are going to get a bit more complex.
In particular, we'll soon start offering a feature at our site that will involve a database with content that will change on an almost second-by-second basis, based on user input and activity. For performance reasons, a complete instance of this database will have to be present locally at each of our server locations. At the same time, the content of the database will have to be synchronized across all server locations, so that users get the same database content, regardless of the server they choose to visit.
We have not yet chosen the database that we'll use for this functionality, although we are leaning towards MySql. (We are also considering PostgreSQL.)
So, my question for the assembled experts is: What approach is the best one for us to use to synchronize the database instances across our servers?
Ideally, we'd like a solution that is resilient to a server location becoming unavailable, and we'd also prefer a solution that makes efficient use of bandwidth. (Processing power doesn't cost us a lot; bandwidth, on the other hand, can get expensive.)
FWIW ...
(1) Our servers run Apache and Tomcat on top of Centos.
(2) I've found the following "how to" that suggests an approach involving MySQL that could address our needs: http://capttofu.livejournal.com/1752.html
Thanks!
Recent comments
3 hours 41 min ago
3 hours 55 min ago
4 hours 3 min ago
6 hours 19 min ago
7 hours 1 min ago
8 hours 3 min ago
8 hours 10 min ago
1 day 10 hours ago
1 day 11 hours ago
4 days 11 hours ago