Scaling an image upload service

Hi,

First of all, I want to say that this is an extremely interesting and informative website. I have enjoyed reading the various posts on how the big sites scale to meet the needs of their customers.

The service we are developing is a webcam service. The client application sends images to the server via HTTP POST, and they are saved in a folder named after the user's ID. When a new image is sent to the server, it overwrites the current image.
Users can then view the images via our web server.
Ideally we want the images to upload as quickly as possible and allow users to view them as quickly as possible.
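
The save-and-overwrite behaviour described above could be sketched roughly as follows. This is only an illustration, not the service's actual code: the function name save_latest_image, the file name current.jpg, and the alphanumeric user ID check are all assumptions made for the example. It writes each upload to a temporary file and then renames it over the old image, so a viewer should never be served a half-written file.

```python
import os
import tempfile
from pathlib import Path


def save_latest_image(root: Path, user_id: str, data: bytes,
                      filename: str = "current.jpg") -> Path:
    """Overwrite a user's current image without exposing a partial file.

    All names here are hypothetical; the thread does not describe the
    real storage layout beyond "a folder per user id".
    """
    # Assumption: user ids are alphanumeric. This also blocks ids such as
    # "../other" that would escape the storage root.
    if not user_id.isalnum():
        raise ValueError(f"invalid user id: {user_id!r}")

    user_dir = root / user_id
    user_dir.mkdir(parents=True, exist_ok=True)
    target = user_dir / filename

    # Write to a temporary file in the same directory, then rename it over
    # the old image. The rename is atomic on POSIX when both paths are on
    # the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=user_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # mkstemp creates the file readable by its owner only; widen the
        # permissions so a separate web server process can read it.
        os.chmod(tmp_name, 0o644)
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temporary file if anything went wrong.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

A handler for the HTTP POST would call save_latest_image(Path("/some/storage/root"), user_id, body), where the storage root is whatever directory the web server serves images from.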

Would I be correct to assume that when the number of uploading clients exceeds the capability of the server, the only way to scale is to add more hardware?

Also, I assume that HTTP accelerator caches will not speed up viewing the images, as each new image will invalidate the cached copy.

I appreciate any input on the subject.

atif.ghaffar's picture

Re: Scaling an image upload service

Some ideas.

Separate your upload and download servers.
This way, you will know easily which servers to scale.

Use high-performance servers such as nginx for downloading, lighttpd for uploading, etc.
See http://trac.lighttpd.net/trac/wiki/Docs:ModUploadProgress if you want to attach a progress meter. Pretty cool for large uploads, so the user knows that something is happening and the server/browser hasn't frozen.

You will also need to tune the upload/download servers accordingly.
Perhaps you will allow a 1 MB POST, a 60-second input request time, a 60-second processing time, etc.
On the download server, do not allow POST and kill everything that takes more than 1 second to finish.
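
As a rough illustration of the limits above, here is a minimal Python sketch using only the standard library. It is not lighttpd or nginx configuration; the helper names upload_status and download_status and the handler class are made up for this example, and the numbers are simply the ones suggested in this post.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional

MAX_UPLOAD_BYTES = 1 * 1024 * 1024  # 1 MB POST limit suggested above
UPLOAD_TIMEOUT_SECONDS = 60         # 60-second allowance suggested above


def upload_status(method: str, content_length: Optional[str]) -> int:
    """Return the HTTP status an upload server could answer with."""
    if method != "POST":
        return 405  # upload servers accept POST only
    if content_length is None:
        return 411  # Length Required
    try:
        length = int(content_length)
    except ValueError:
        return 400  # malformed Content-Length header
    if length < 0:
        return 400
    if length > MAX_UPLOAD_BYTES:
        return 413  # Payload Too Large
    return 200


def download_status(method: str) -> int:
    """Download servers refuse POST; only GET and HEAD are allowed."""
    return 200 if method in ("GET", "HEAD") else 405


class UploadHandler(BaseHTTPRequestHandler):
    # Socket timeout applied to each read or write on the connection.
    # This is a per-operation timeout, not a total request deadline.
    timeout = UPLOAD_TIMEOUT_SECONDS

    def do_POST(self):
        status = upload_status("POST", self.headers.get("Content-Length"))
        if status != 200:
            self.send_error(status)
            return
        body = self.rfile.read(int(self.headers["Content-Length"]))
        # Storing `body` is left out; it depends on the storage layout.
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()
```

To try it locally, something like HTTPServer(("", 8080), UploadHandler).serve_forever() from http.server would do. In production the same limits would normally be set in the web server's own configuration rather than in application code.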

Hope this helps.

Re: Scaling an image upload service

Thanks for the input Atif,

I just want to clarify a few things regarding the points you have made.

Regarding separating the download and upload servers: do you mean running both services on the same machine, each on a different port, such as 80 for download and 8080 for upload?

If you meant two separate machines, I assume I would need a SAN or a service like MogileFS to give the download server access to the uploaded images.

I assume that using lighttpd for uploading would require PHP, correct?
The uploads are quite small, so I currently don't require a progress meter. However, I have wondered whether version 1.5 of lighttpd could be used on a production server.

I will certainly look at tuning the performance.

Thanks again for your help.

atif.ghaffar's picture

Re: Scaling an image upload service

Hello agallagher,

Regarding separating the download and upload servers: do you mean running both services on the same machine, each on a different port, such as 80 for download and 8080 for upload?
A. I meant multiple machines for each function (10 servers for uploads, 5 for downloads, etc.).

If it meant two separate machines I assume I would need a SAN or some service like MogileFS to allow the download server access to the uploaded images.
A. Yes, a NAS would do just fine.

I assume that using lighttpd for uploading would require PHP correct?
A. No, PHP is not required to use lighttpd. You can write your upload logic in any language you wish. Perhaps choose a server designed specifically for uploads, or just use an FTP server.

I have wondered if version 1.5 of lighttpd could be used in a production server.
A. We have been using lighttpd 1.5 in production for two months and it is doing quite well. We are moving all our scattered upload functionality to this upload server.

Re: Scaling an image upload service

Very interesting information. I am looking forward to implementing such a system.

Thanks again, Atif.
