HotPads Shows the True Cost of Hosting on Amazon
Mather Corgan, president of HotPads, gave a great talk on how HotPads uses AWS to run their real estate search engine. I loved the presentation for a few reasons:
This is a really good example of where many companies are, or would like to be, with their applications.
Their total costs are about $11K/month, which is about what they were paying at their previous provider. I found this a little surprising, as I thought the cloud would be more expensive, but they pay only for what they need instead of having to overprovision for transient uses like testing. And some servers aren't necessary anymore: EBS handles backups, so the second set of database slave servers that existed to run backups is no longer required.
There are lots more lessons like this that I've abstracted down below.
Site: http://hotpads.com - a map-based real estate search engine, listing homes for sale, apartments, condos, and rental houses.
Stats
Platform
Costs
* $150: 2 Small HAProxy Load Balancers - 2 for failover; these hold the elastic IPs, and round robin DNS points at the elastic IPs.
* $1,200: 3-5 Large Tomcat Web Servers - an array of 3 run at night and 5 during the day.
* $1,500: 5 Large Tomcat Job Servers
* $900: 1 X-Large and 1 Large Index Servers - used to power property search; they have several GB of RAM for the JVM
* $1,200: 1 X-Large and 2 Large MySQL masters
* $1,200: 1 X-Large and 2 Large MySQL slaves
* $300: 1 Large Messaging Server (ActiveMQ) - will be replaced with SQS
* $300: 1 Large Map tile creation server (TileCache)
* $600: Development/testing/migration servers
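Adding up the line items shows how much of the roughly $11K monthly total goes to the servers listed here. This is a minimal sketch: the dictionary keys are shorthand labels chosen for illustration, and the amounts are the figures from the list above.

```python
# Monthly server line items in USD, copied from the cost list above.
# The key names are illustrative labels, not official AWS product names.
server_costs = {
    "HAProxy load balancers": 150,
    "Tomcat web servers": 1200,
    "Tomcat job servers": 1500,
    "Index servers": 900,
    "MySQL masters": 1200,
    "MySQL slaves": 1200,
    "ActiveMQ messaging server": 300,
    "TileCache map tile server": 300,
    "Dev/test/migration servers": 600,
}

TOTAL_MONTHLY_BILL = 11_000  # the "about $11K/month" figure quoted earlier

server_total = sum(server_costs.values())
server_share = server_total / TOTAL_MONTHLY_BILL

print(f"Itemized servers: ${server_total:,}/month")
print(f"Share of the total bill: {server_share:.0%}")
```

The remainder of the bill covers services that are not itemized in this list.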
Lessons Learned
* A 67 KB object (a 600 px image) is the break-even point, where the cost of putting the image into S3 about equals the cost of storing it there.
* For a 6.7 KB object (a 15 px thumbnail), the PUT cost (the small fee for putting an object into S3) is 10x the storage and transfer costs.
* In April, 330 GB of images downloaded at $.15/GB cost $49. 55mm GETs at $1/mm cost $55. 42mm PUTs at $.01/1k cost $420!
* $100 for downloads and GETs of map tiles.
* So S3 is very cheap for larger files, but watch out for lots of short-lived small files.
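The S3 numbers above can be checked with a little arithmetic. The sketch below takes the per-PUT price as $0.01 per 1,000 requests, which is what 42 million PUTs costing $420 implies. The storage price of $0.15 per GB-month is an assumption made for this sketch and is not stated in the notes, as is the use of decimal gigabytes.

```python
GB = 1_000_000_000  # decimal bytes per GB (assumption for this sketch)

# Prices taken from or implied by the April figures above.
TRANSFER_PER_GB = 0.15    # USD per GB downloaded
GET_PER_MILLION = 1.00    # USD per million GET requests
PUT_PER_THOUSAND = 0.01   # USD per 1,000 PUT requests (implied by $420 for 42mm PUTs)

# Assumption, not stated in the notes: storage at $0.15 per GB-month.
STORAGE_PER_GB_MONTH = 0.15


def put_cost() -> float:
    """Cost in USD of a single PUT request."""
    return PUT_PER_THOUSAND / 1000


def storage_cost_per_month(size_bytes: int) -> float:
    """Cost in USD of storing one object of the given size for a month."""
    return size_bytes / GB * STORAGE_PER_GB_MONTH


def put_to_storage_ratio(size_bytes: int) -> float:
    """How many months of storage one PUT costs for an object of this size."""
    return put_cost() / storage_cost_per_month(size_bytes)


# April bill components.
april_transfer = 330 * TRANSFER_PER_GB               # 330 GB downloaded
april_gets = 55 * GET_PER_MILLION                    # 55 million GETs
april_puts = 42_000_000 / 1000 * PUT_PER_THOUSAND    # 42 million PUTs

print(f"Transfer ${april_transfer:.2f}, GETs ${april_gets:.2f}, PUTs ${april_puts:.2f}")
print(f"PUT vs. storage, 67 KB object:  {put_to_storage_ratio(67_000):.1f}x")
print(f"PUT vs. storage, 6.7 KB object: {put_to_storage_ratio(6_700):.1f}x")
```

Under these assumptions the ratio is about 1x for the 67 KB image and about 10x for the 6.7 KB thumbnail, which is why request fees dominate the bill when objects are small and short lived.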
* CloudFront makes frequently viewed listings faster.
* For infrequently viewed listings, CloudFront has to go to S3 to get the file the first time, which means you pay twice for a file that will be viewed only once.
* EBS is used on the database servers because it's faster than local storage (especially for random writes), blocks of data are redundant, and it supports easy backups and versioning via cloning.
* Only 10% cost overhead.
* EBS allowed them to get rid of a second set of slaves. Backups were so CPU intensive that they needed dedicated slaves to run them; EBS can snapshot running drives, so the extra slaves are unnecessary.
* Databases are I/O bound and the CPU is vastly underutilized, so there's extra capacity when you need it.
* Reserved instances give you 1 year for the cost of 6 months and a guarantee that you can get an instance (they were denied one time).
* The con is that a reservation is tied to an instance type, and they want more flexibility to choose instance types as their software changes and to take advantage of new instance types as they are released.
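The "1 year for the cost of 6 months" figure implies a simple break-even: committing for a year pays off only for a server that would otherwise run more than half the time. The sketch below is a simplification that treats the yearly commitment as a flat cost equal to six months of pay-as-you-go usage; the $0.40 hourly rate is a hypothetical value for illustration, and actual pricing structures may differ.

```python
HOURS_PER_YEAR = 365 * 24


def pay_as_you_go_cost(hourly_rate: float, utilization: float) -> float:
    """Yearly cost when paying by the hour; utilization is the fraction of the year running."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def committed_cost(hourly_rate: float) -> float:
    """Yearly cost under the simplified '1 year for the cost of 6 months' model."""
    return hourly_rate * HOURS_PER_YEAR * 0.5


def commitment_saves_money(hourly_rate: float, utilization: float) -> bool:
    """True when the yearly commitment is cheaper than paying by the hour."""
    return committed_cost(hourly_rate) < pay_as_you_go_cost(hourly_rate, utilization)


HYPOTHETICAL_RATE = 0.40  # USD per hour, illustrative only

for utilization in (0.25, 0.50, 0.75, 1.00):
    hourly = pay_as_you_go_cost(HYPOTHETICAL_RATE, utilization)
    committed = committed_cost(HYPOTHETICAL_RATE)
    better = "commit" if commitment_saves_money(HYPOTHETICAL_RATE, utilization) else "pay by the hour"
    print(f"{utilization:>4.0%} utilization: hourly ${hourly:,.0f} vs committed ${committed:,.0f} -> {better}")
```

Servers that run around the clock, such as the databases, clear the break-even easily under this model, while servers that run only part of the day sit close to it.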