This article will touch upon how Kraken.io built and scaled an image optimization platform which serves millions of requests per day, with the goal of maintaining high performance at all times while keeping costs as low as possible. We present our infrastructure as it is in its current state at the time of writing, and touch upon some of the interesting things we learned in order to get it here.
Let’s make an image optimizer
You want to start saving money on your CDN bills and generally speed up your websites by pushing less bytes over the wire to your user’s browser. Chances are that over 60% of your traffic are images alone.
Using ImageMagick (you did read ImageTragick, right?) you can slash down the quality of a JPEG file with a simple command:
$ convert -quality 70 original.jpg optimized.jpg
$ ls -la
-rw-r--r-- 1 matylla staff 5897 May 16 14:24 original.jpg
-rw-r--r-- 1 matylla staff 2995 May 16 14:25 optimized.jpg
Congratulations. You’ve just brought down the size of that JPEG by ~50% by butchering it’s quality. The image now looks like Minecraft. It can’t look like that - it sells your products and services. Ideally, images on the Web should have outstanding quality and carry no unnecessary bloat in the form of excessively high quality or EXIF metadata.
You now open your favourite image-editing software and start playing with Q levels while saving a JPEG for the Web. It turns out that this particular image you test looks great at Q76. You start saving all your JPEGs with quality set to 76. But hold on a second… Some images look terrible even with Q80 while some would look just fine even at Q60.
Ok. You decide to automate it somehow - who wants to manually test the quality of millions of images you have the “privilege” of maintaining. So you create a script that generates dozens of copies of an input image at different Q levels. Now you need a metric that will tell you which Q level is perfect for a particular image. MSE? SSIM? MS-SSIM? PSNR? You’re so desperate that you even start calculating and comparing perceptual hashes of different versions of your input image.
Some metrics perform better than others. Some work well for specific types of images. Some are blazingly fast while the others take a long time to complete. You can get away by reducing the number of loops in which you process each image but then chances are that you miss your perfect Q level and the image will either be heavier than it could be or quality degradation will be too high.
And what about product images against white backgrounds? You really want to reduce ringing/haloing artifacts around the subject. What about custom chroma-subsampling settings on per-image basis? That red dress against white background looks all washed-out now. You’ve learned that stripping EXIF metadata will bring the file size down a bit but you’ve also removed Orientation tag and now your images are all rotated incorrectly.
And that’s only the JPEG format. For your PNGs probably you’d want to re-compress your 7-Zip or Deflate compressed images with something more cutting-edge like Google’s Zopfli. You spin up your script and watch the fan on your CPU start to melt...