Monday, Dec 29, 2008

Paper: Spamalytics: An Empirical Analysis of Spam Marketing Conversion

Under the philosophy that the best way to analyse spam is to become a spammer, this absolutely fascinating paper recounts how a team of UC Berkeley researchers went undercover to infiltrate a spam network. Part CSI, part Mission Impossible, and part MacGyver, the team hijacked the botnet so that their code was actually part of the dark network itself. Once inside, they worked out the architecture and protocols of the botnet and tallied how many sales its spam produced. Truly elegant work.

Two different spam campaigns were run on a Storm botnet of 75,800 zombie computers. Storm is a peer-to-peer botnet that uses spam to creep its tentacles through the worldwide computer network. One of the campaigns distributed viruses in order to recruit new bots into the network. This is normally accomplished by enticing people to download email attachments. An astonishing one in ten people downloaded the executable and ran it, which means we won't run out of zombies soon. The downloaded components include: backdoor/downloader, SMTP relay, e-mail address stealer, e-mail virus spreader, distributed denial of service (DDoS) attack tool, and an updated copy of the Storm Worm dropper. The second campaign sent pharmaceutical spam ("libido boosting herbal remedy") over the network.

Haven't you always wondered who clicks on spam and how much spammers could possibly make? In the study, only 28 sales resulted from 350 million spam e-mail messages sent over 26 days, a conversion rate of well under 0.00001% (a typical advertising campaign might have a conversion rate of 2-3%). The average purchase price was about $100, for $2,731.88 in total revenue. The researchers estimate that total daily revenue attributable to Storm’s pharmacy campaign is about $7,000 and that the botnet picks up between 3,500 and 8,500 new bots per day through its Trojan distribution system. And this is with only 1.5% of the entire network in use.

So, the spammers would take in total revenue of about $3.5 million a year from one product on one network. Imagine the take with multiple products and multiple networks. That's why we still have spam. And since the business works even with a conversion rate this low, it seems spam will always be with us.
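To make the numbers above easier to check, here is a small sketch that recomputes them from the figures quoted in this post. The variable names are illustrative. One thing the arithmetic shows: $7,000 a day for 365 days comes to about $2.56 million, so the $3.5 million yearly figure corresponds to a daily rate of roughly $9,600. The post does not say how the yearly figure was derived, so treat that last line as an observation, not an explanation.

```python
# Figures as quoted in the post.
emails_sent = 350_000_000
sales = 28
total_revenue = 2731.88          # dollars, from the 28 observed sales
daily_revenue_estimate = 7000    # dollars/day, researchers' estimate for the campaign
yearly_figure_in_post = 3_500_000

# Conversion rate expressed as a percentage.
conversion_rate_pct = sales / emails_sent * 100

# How many messages it took to produce one sale.
emails_per_sale = emails_sent / sales

# Average purchase price across the observed sales.
avg_purchase = total_revenue / sales

# Straight extrapolation of the daily estimate to a year.
annual_from_daily = daily_revenue_estimate * 365

# Daily revenue that the quoted yearly figure would imply.
implied_daily = yearly_figure_in_post / 365

print(f"conversion rate:        {conversion_rate_pct:.6f}%")
print(f"emails per sale:        {emails_per_sale:,.0f}")
print(f"average purchase:       ${avg_purchase:.2f}")
print(f"$7,000/day over a year: ${annual_from_daily:,}")
print(f"daily rate for $3.5M:   ${implied_daily:,.0f}")
```

Put another way, it took about 12.5 million messages to produce a single sale of a little under $100.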

As fascinating as all the spamonomics are, the explanation of the botnet architecture is just as fascinating. Storm uses a three-level self-organizing hierarchy:

  • worker bots - make requests for work and, upon receiving orders, send spam as requested. Workers pull work from higher layers.
  • proxy bots - act as coordinators between workers and master servers.
  • master servers - send commands to the workers and receive their status reports. There are a small number of master servers, hosted at “bullet-proof” hosting centers, and they are likely managed directly by the botmaster.

    A host selects its worker or proxy role automatically. If a firewall doesn't prevent inbound communication, the infected host becomes a proxy; otherwise it becomes a worker. Because workers pull work from proxies, nothing ever needs to open a connection to a worker. Proxies, on the other hand, are contacted directly by master servers, so they must be reachable from outside and communication must be bidirectional.
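The role split and the pull model described above can be sketched as a tiny in-memory simulation. Everything here is hypothetical: the class names, the `accepts_inbound` flag, and the job strings are made up for illustration, and nothing touches a network. It only shows the queueing pattern, in which a node that can receive inbound traffic buffers work pushed from above and a node that cannot simply asks for it.

```python
from collections import deque
from typing import Optional


def choose_role(accepts_inbound: bool) -> str:
    """Reachable hosts coordinate (proxy); unreachable hosts only pull (worker)."""
    return "proxy" if accepts_inbound else "worker"


class Proxy:
    """Buffers jobs pushed from the layer above and hands them out on request."""

    def __init__(self) -> None:
        self.queue: deque = deque()

    def push_from_master(self, job: str) -> None:
        # Inbound call: only possible because a proxy is reachable.
        self.queue.append(job)

    def handle_work_request(self) -> Optional[str]:
        # Called by a worker; returns None when there is nothing to do.
        return self.queue.popleft() if self.queue else None


class Worker:
    """Never accepts connections; it only makes outbound requests to a proxy."""

    def __init__(self, proxy: Proxy) -> None:
        self.proxy = proxy
        self.completed: list = []

    def poll(self) -> Optional[str]:
        job = self.proxy.handle_work_request()
        if job is not None:
            self.completed.append(job)
        return job


proxy = Proxy()
worker = Worker(proxy)
proxy.push_from_master("job-1")
print(choose_role(True), choose_role(False))
print(worker.poll(), worker.poll())
```

The same pattern shows up in ordinary job-queue systems: workers behind NAT or a firewall can still take part because every connection they use is one they opened themselves.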

    Storm communicates using two separate protocols:
  • An encrypted version of the UDP-based Overnet protocol, used primarily as a directory service to find other nodes. Overnet is a peer-to-peer protocol that uses a distributed hash table mechanism to find peers.
  • A custom TCP-based protocol that masters use to send command and control messages to proxies and workers. Command and control traffic to the worker bots is unencrypted, which makes a man-in-the-middle attack possible and is how the researchers carried out their caper.

    According to Brandon Enright: When a peer wants to find content in the network, it computes (or is given) the hash of that content and then searches adjacent peers. Those peers respond with their adjacent peers that are closer. This is repeated until the searching peer gets close enough to the content that a node there will be able to provide a search result. This is a complicated and interesting process that the Spamalytics paper covers in much more detail, as do some of the references at the end of this post.
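The search loop Enright describes can be illustrated with a toy distributed hash table. Overnet is based on the Kademlia design, which measures closeness as the XOR of two IDs, so the sketch uses that metric. The rest is simplifying assumption: 4-bit integer IDs, a hypercube topology in which each node knows the peers that differ from it in one bit, and a single greedy path. Real lookups use much longer IDs and query several peers in parallel.

```python
from typing import Dict, List, Optional, Tuple


def distance(a: int, b: int) -> int:
    """Kademlia-style closeness: XOR of the two IDs, compared as integers."""
    return a ^ b


class Node:
    def __init__(self, node_id: int) -> None:
        self.id = node_id
        self.neighbors: List["Node"] = []
        self.store: Dict[int, str] = {}

    def closer_peers(self, key: int) -> List["Node"]:
        """Answer a search by returning known peers that are closer to the key."""
        mine = distance(self.id, key)
        return [n for n in self.neighbors if distance(n.id, key) < mine]


def build_hypercube(bits: int = 4) -> Dict[int, Node]:
    """Toy topology: each node knows the peers whose ID differs in one bit."""
    nodes = {i: Node(i) for i in range(2 ** bits)}
    for i, node in nodes.items():
        node.neighbors = [nodes[i ^ (1 << b)] for b in range(bits)]
    return nodes


def lookup(start: Node, key: int, max_hops: int = 32) -> Tuple[Optional[str], int]:
    """Hop to ever-closer peers until one holds the key or no peer is closer."""
    current = start
    for hops in range(max_hops):
        if key in current.store:
            return current.store[key], hops
        closer = current.closer_peers(key)
        if not closer:
            return None, hops  # dead end: nobody nearer knows about the key
        current = min(closer, key=lambda n: distance(n.id, key))
    return None, max_hops


nodes = build_hypercube()
nodes[13].store[13] = "value stored at the node closest to key 13"
print(lookup(nodes[2], 13))
```

In this toy network the search starting at node 2 visits nodes 10, 14, 12, and 13, halving the remaining distance or better at each hop, which is why such lookups stay short even in very large networks.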

    Storm harnesses a large, unreliable, constantly changing distributed system to do work. It's an architecture worth learning from and we'll explore some of those lessons in a later post.

    Related Articles

  • On the Spam Campaign Trail
  • Scaling Spam Eradication Using Purposeful Games: Die Spammer Die!
  • Can cloud computing smite down evil zombie botnet armies?
  • Inside the Storm: Protocols and Encryption of the Storm Botnet by Joe Stewart, GCIG Director of Malware Research, SecureWorks
  • Exposing Stormworm by Brandon Enright. A lot of excellent low level protocol details.
  • Storm Botnet
  • Global Guerrillas by John Robb - Networked tribes, systems disruption, and the emerging bazaar of violence. Resilient Communities, decentralized platforms, and self-organizing futures.

    Reader Comments (1)

    Although you measured the response rate for the downloaded emails, which, as you correctly say, is amazing at 1 in 10, it would have been more enlightening to see the actual open rate of the spam email.

    March 17, 2011 | Unregistered CommenterRich McKelvey
