Virus Scanning for Uploaded content

All,
What is the best way to scan content uploaded by users? Is there an open source solution for this? How do YouTube, Flickr, and other sites that accept user uploads handle it?
Any insight would be greatly appreciated!
Regards,
Janakan Rajendran.

Re: Virus Scanning for Uploaded content

ClamAV? :D

http://www.clamav.net/

I use it on all our servers.

Re: Virus Scanning for Uploaded content

Alberto,

Thanks for your reply. Yes, I have heard of ClamAV, but I haven't heard of bigger names like YouTube or Flickr using it.
Is ClamAV also efficient at scanning media files, in terms of both accuracy and speed?

Regards,
Janakan Rajendran.

Re: Virus Scanning for Uploaded content

It has nice docs: http://wiki.clamav.net/Main/WebHome

It gets database updates faster than commercial solutions:
http://www.pcwelt.de/start/sicherheit/archiv/111012/index2.html
http://www.pcwelt.de/start/sicherheit/archiv/108562/index.html
http://www.heise.de/security/news/meldung/53427
(These are in German; just take a look at the lists :)

Who uses it? http://www.clamav.org/about/who-use-clamav/
I can see some known names like: DynDNS, Barracuda, xs4all

It has got some awards: http://www.clamav.org/about/awards/

It probably lags behind some commercial solutions on performance, but so far it has never been a CPU hog for me.

atif.ghaffar's picture

Re: Virus Scanning for Uploaded content

Janakan,

You did ask for the open source version.
ClamAV is quite good.

Is this also for the CDN project that you are planning?
We use this at our ISP for scanning files uploaded via FTP.
We use Pure-FTPd, which has an upload script hook that can do whatever you want with a file after it has been uploaded and can decide whether to keep or remove it.
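
As a rough illustration of such a hook, here is a minimal Python sketch. It assumes ClamAV's `clamscan` is on the PATH and relies on its documented exit codes (0 = clean, 1 = virus found, 2 = error). The function name `handle_upload` and the choice to keep files when the scan itself fails are my own assumptions, not anything Pure-FTPd prescribes.

```python
import os
import subprocess
import sys

# clamscan's documented exit codes: 0 = clean, 1 = virus found, 2 = error.
CLEAN, INFECTED = 0, 1


def handle_upload(path, scanner=("clamscan", "--no-summary")):
    """Scan one uploaded file; delete it if infected. Returns a status string."""
    result = subprocess.run([*scanner, path], capture_output=True, text=True)
    if result.returncode == CLEAN:
        return "kept"
    if result.returncode == INFECTED:
        os.remove(path)
        return "removed"
    # Any other code means the scan itself failed; keep the file for review.
    return "error"


if __name__ == "__main__":
    # pure-uploadscript passes the uploaded file's path as the first argument.
    if len(sys.argv) > 1:
        print(handle_upload(sys.argv[1]))
```

The `scanner` parameter is only there so a different command line (or a stand-in for testing) can be swapped in without touching the logic.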

For HTTP-based uploads, you will have to follow a similar scheme.
Allow people to upload a file to a non-downloadable area, queue that file for inspection, and inform the uploader of the result after the inspection. This is a less expensive approach than doing everything in real time.
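
The quarantine, queue, inspect, notify flow could look roughly like the Python sketch below. Everything named here (`accept_upload`, `inspect_pending`, the `is_clean` and `notify` callbacks, the two directories) is hypothetical; in practice `is_clean` would wrap whatever scanner you choose and `notify` would e-mail or message the uploader.

```python
import queue
import shutil
from pathlib import Path


def accept_upload(data, name, quarantine_dir, pending):
    """Store an upload where it cannot be downloaded and queue it for inspection."""
    # Path(name).name drops any directory part the client may have sent.
    target = Path(quarantine_dir) / Path(name).name
    target.write_bytes(data)
    pending.put(target)
    return target


def inspect_pending(pending, public_dir, is_clean, notify):
    """Drain the queue: publish clean files, delete the rest, report each result."""
    while True:
        try:
            path = pending.get_nowait()
        except queue.Empty:
            return
        if is_clean(path):
            shutil.move(str(path), str(Path(public_dir) / path.name))
            notify(path.name, "accepted")
        else:
            path.unlink()
            notify(path.name, "rejected")
```

The web request only has to run `accept_upload`, so the uploader is never kept waiting on the scanner; `inspect_pending` can run from a separate worker or a cron job.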

Whether you use scanning or not, it is a reasonably good idea anyway to separate your upload server from your other content-serving servers.

Hope this helps.

Re: Virus Scanning for Uploaded content

Atif,

Thanks for your response. I like the idea of keeping uploads separate from the content delivery servers until they have been scanned successfully. Are there any paid commercial solutions available as an alternative to ClamAV? If I get simultaneous uploads, I'm concerned about ClamAV's support for multiple threads.

Regards,
Janakan Rajendran

Re: Virus Scanning for Uploaded content

Janakan,

I don't know of any paid service. I haven't had the need to look into it yet.
As for the multiple files... what is bothering you?

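for file in incoming/*; do    # incoming/ stands in for your upload directory
    scan_and_report.sh "$file" &    # fork it!
done
wait    # block until every background scan has finished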

Fork as many as you can handle.
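
If "as many as you can handle" needs a hard cap, a worker pool is one way to get it. This is only a sketch in Python: `scan_all` and `run_scan_script` are made-up names, and treating exit code 0 from `scan_and_report.sh` as "clean" is an assumption about a script the thread never shows.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_scan_script(path, script="scan_and_report.sh"):
    """Run the scan script on one file; assumes exit code 0 means clean."""
    return subprocess.run([script, path]).returncode == 0


def scan_all(paths, scan_one=run_scan_script, max_workers=8):
    """Run scan_one over paths with at most max_workers scans in flight."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(paths, pool.map(scan_one, paths)))
```

Threads are enough here because each worker just waits on an external scanner process; `max_workers` is the knob to tune to what the machine can take.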

Or distribute it on different machines.

Perhaps use a central database where you put a reference to every uploaded file.
Then have a dispatcher send the files to different machines in batches. (You can set your threshold, for example each machine receives no more than 50 requests at one time.)

When the scanning process finishes, the scanner reports back to the database with OK or KO.

Once you have an OK, move the file to where it should be.
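
The database-and-dispatcher scheme could be sketched along these lines, using SQLite from Python purely for illustration. The table layout, the `PENDING`/`SCANNING` states, and the function names are all assumptions of mine; only the OK/KO results and the per-machine limit of 50 come from the description above. It also assumes a single dispatcher process, so claiming a batch does not need extra locking.

```python
import sqlite3

# status is one of PENDING, SCANNING, OK, KO.
SCHEMA = """
CREATE TABLE IF NOT EXISTS uploads (
    id      INTEGER PRIMARY KEY,
    path    TEXT NOT NULL,
    status  TEXT NOT NULL DEFAULT 'PENDING',
    machine TEXT
)
"""


def register_upload(db, path):
    """Record a newly uploaded file as waiting for a scan."""
    with db:
        return db.execute("INSERT INTO uploads (path) VALUES (?)", (path,)).lastrowid


def claim_batch(db, machine, limit=50):
    """Hand a scanner machine more work without exceeding `limit` files in flight."""
    with db:
        in_flight = db.execute(
            "SELECT COUNT(*) FROM uploads WHERE status = 'SCANNING' AND machine = ?",
            (machine,),
        ).fetchone()[0]
        rows = db.execute(
            "SELECT id, path FROM uploads WHERE status = 'PENDING' ORDER BY id LIMIT ?",
            (max(limit - in_flight, 0),),
        ).fetchall()
        db.executemany(
            "UPDATE uploads SET status = 'SCANNING', machine = ? WHERE id = ?",
            [(machine, file_id) for file_id, _ in rows],
        )
    return rows


def report_result(db, file_id, clean):
    """Called by a scanner when it finishes: store OK or KO for the file."""
    with db:
        db.execute(
            "UPDATE uploads SET status = ? WHERE id = ?",
            ("OK" if clean else "KO", file_id),
        )
```

Whatever moves files to their final location then only needs to look for rows whose status is `OK`.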

Re: Virus Scanning for Uploaded content

Not open source, but there are commercial products for this type of scanning. Symantec has one such product:

http://www.symantec.com/business/products/overview.jsp?pcid=2251&pvid=83...

ICAP is a protocol specification meant for handling this type of processing: http://en.wikipedia.org/wiki/Internet_Content_Adaptation_Protocol
