Paper: SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine

So how do you knit multiple datacenters and many thousands of phones and other clients into a single cooperating system?

Usually you don't. It's too hard. We see nascent attempts in services like Firebase and Parse. 

SwiftCloud, as described in SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine, goes two steps further, by leveraging Conflict free Replicated Data Types (CRDTs), which means "data can be replicated at multiple sites and be updated independently with the guarantee that all replicas converge to the same value. In a cloud environment, this allows a user to access the data center closer to the user, thus optimizing the latency for all users."

While we don't see these kind of systems just yet, they are a strong candidate for how things will work in the future, efficiently using resources at every level while supporting huge numbers of cooperating users.


Click to read more ...


Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

After Google cut prices on their Google Cloud Platform Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see.

Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months.

That's good news. Greater cost certainty means you can make much more aggressive build out plans. With the savings you can hire more people, handle more customers, and add those media rich features you thought you couldn't afford. Design is directly related to costs.

Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable.

As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting market share is what is important.

Amazon on the other hand has been reaping the higher margins earned from recurring customers. So Google's entrance into the early product life cycle phase is making Amazon eat into their margins and is forcing down prices over all.

But there's a floor to how low prices can go. Alen Peacock, co-founder of Space Monkey has an interesting graphic telling the story. This is Amazon's historical pricing for 1TB of storage in S3, graphed as a multiple of the historical pricing for 1TB of local hard disk:

Alen explains it this way:

Cloud prices do decrease over time, and have dropped significantly over the timespan shown in the graph, but this graph shows cloud storage prices as a multiple of hard disk prices. In other words, hard disk prices are dropping much faster than datacenter prices. This is because, right, datacenters have costs other than hard disks (power, cooling, bandwidth, building infrastructure, diesel backup generators, battery backup systems, fire suppression, staffing, etc). Most of those costs do not follow Moore's Law -- in fact energy costs are on a long trend upwards. So over time, the gap shown by the graph should continue to widen.


The economic advantages of building your own (but housed in datacenters) is there, but it isn't huge. There is also some long term strategic advantage to building your own, e.g., GDrive dropped their price dramatically at will because Google owns their datacenters, but Dropbox couldn't do that without convincing Amazon to drop the price they pay for S3.

Costs other than hardware began dominating in datacenters several years ago, Moore's Law-like effects are dampened. Energy/cooling and cooling costs do not follow Moore's Law, and those costs make up a huge component of the overall picture in datacenters. This is only going to get worse, barring some radical new energy production technology arriving on the scene.

What we're [Space Monkey] interested in, long term, is dropping the floor out from underneath all of these, and I think that only happens if you get out of the datacenter entirely.

As the size of cloud market is still growing there will still be a fight for market share. When growth slows and the market is divided between major players a collusionary pricing phase will take over. Cloud customers are sticky customers. It's hard to move off a cloud. The need for higher margins to justify the cash flow drain during the customer acquisition phase will reverse the favorable trends we are seeing now.

Until then it seems the economics indicate we are in a rent, not a buy world.

Related Articles 

  • IaaS Series: Cloud Storage Pricing – How Low Can They Go? - "For now it seems we can assume we’ve not seen the last of the big price reductions."
  • The Cloud Is Not Green
  • Brad Stone: “Bill Miller, the chief investment officer at Legg Mason Capital Management and a major Amazon shareholder, asked Bezos at the time about the profitability prospects for AWS. Bezos predicted they would be good over the long term but said that he didn’t want to repeat “Steve Jobs’s mistake” of pricing the iPhone in a way that was so fantastically profitable that the smartphone market became a magnet for competition.” 

Sponsored Post: Apple, Cloudant, CopperEgg, Logentries,, PagerDuty, HelloSign, CrowdStrike, Gengo, ScaleOut Software, Couchbase, MongoDB, BlueStripe, AiScaler, Aerospike, LogicMonitor, AppDynamics, ManageEngine, Site24x7  

Who's Hiring?

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here.
    • Enterprise Software Engineer. Apple's Emerging Technology Services group provides a Java based SOA platform for various applications to interact with each other. The platform is designed to handle millions of messages a day with very low latency. We have an immediate opening for a talented Software Engineer in a highly visible team who is passionate about exploring emerging technologies to create elegant scalable solutions. Please apply here
    • Mobile Services Software Engineer. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. Please apply here
    • Sr. Software Engineer-iOS Systems. Do you love building highly scalable, distributed web applications? Does the idea of performance tuning Java applications make your heart leap? If so, iOS Systems is looking for a highly motivated, detail-oriented, energetic individual with excellent written and oral skills who is not afraid to think outside the box and question assumptions. Please apply here
    • Senior Software Engineering Manager. As a Senior Software Engineering Manager on our team, you will be managing teams of very dedicated and talented engineering team. You will be responsible for managing the development of mobile point of sale system on iPod touch hardware. Please apply here.
    • Sr Software Engineer - Messaging Services. An exciting opportunity for a Software Engineer to join Apple's Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. We handle hundreds of millions of active users using some of the most desirable devices on the planet and several Billion iMesssages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date. Please apply here.

  • Engine Programmer - C/C++. Wargaming|BigWorld is seeking Engine Programmers to join our team in Sydney, Australia. We offer a relocation package, Australian working visa & great salary + bonus. Your primary responsibility will be to work on our PC engine. Please apply here

  • Senior Engineer wanted for large scale, security oriented distributed systems application that offers career growth and independent work environment. Use your talents for good instead of getting people to click ads at CrowdStrike. Please apply here.

  • Ops Engineer - Are you passionate about scaling and automating cloud-based websites? Love Puppet and deployment scripts? Want to take advantage of both your sys-admin and DevOps skills? Join HelloSign as our second Ops Engineer and help us scale as we grow! Apply at

  • Human Translation Platform Gengo Seeks Sr. DevOps Engineer. Build an infrastructure capable of handling billions of translation jobs, worked on by tens of thousands of qualified translators. If you love playing with Amazon’s AWS, understand the challenges behind release-engineering, and get a kick out of analyzing log data for performance bottlenecks, please apply here.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for a Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for backend component of software that manages application architectures. Apply here.

Fun and Informative Events

  • The Biggest MongoDB Event Ever Is On. Will You Be There? Join us in New York City June 23-25 for MongoDB World! The conference lineup includes Amazon CTO Werner Vogels and Cloudera Co-Founder Mike Olson for keynote addresses.  You’ll walk away with everything you need to know to build and manage modern applications. Register before April 4 to take advantage of super early bird pricing.

  • Upcoming Webinar: Practical Guide to SQL - NoSQL Migration. Avoid common pitfalls of NoSQL deployment with the best practices in this May 8 webinar with Anton Yazovskiy of Thumbtack Technology. He will review key questions to ask before migration, and differences in data modeling and architectural approaches. Finally, he will walk you through a typical application based on RDBMS and will migrate it to NoSQL step by step. Register for the webinar.

Cool Products and Services

  • The NoSQL "Family Tree" from Cloudant explains the NoSQL product landscape using an infographic. The highlights: NoSQL arose from "Big Data" (before it was called "Big Data"); NoSQL is not "One Size Fits All"; Vendor-driven versus Community-driven NoSQL.  Create a free Cloudant account and start the NoSQL goodness

  • Finally, log management and analytics can be easy, accessible across your team, and provide deep insights into data that matters across the business - from development, to operations, to business analytics. Create your free Logentries account here.

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • PagerDuty helps operations and DevOps engineers resolve problems as quickly as possible. By aggregating errors from all your IT monitoring tools, and allowing easy on-call scheduling that ensures the right alerts reach the right people, PagerDuty increases uptime and reduces on-call burnout—so that you only wake up when you have to. Thousands of companies rely on PagerDuty, including Netflix, Etsy, Heroku, and Github.

  • Aerospike Releases Client SDK for Node.js 0.10.x. This client makes it easy to build applications in Node.js that need to store and retrieve data from a high-performance Aerospike cluster. This version exposes Key-Value Store functionality - which is the core of Aerospike's In-Memory NoSQL Database. Platforms supported: CentOS 6, RHEL 6, Debian 6, Debian7, Mac OS X, Ubuntu 12.04. Write your first app:

  • consistent: to be, or not to be. That’s the question. Is data in MongoDB consistent? It depends. It’s a trade-off between consistency and performance. However, does performance have to be sacrificed to maintain consistency? more.

  • Do Continuous MapReduce on Live Data? ScaleOut Software's hServer was built to let you hold your daily business data in-memory, update it as it changes, and concurrently run continuous MapReduce tasks on it to analyze it in real-time. We call this "stateful" analysis. To learn more check out hServer.

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...


4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO

This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications.

I will cover architecture issues that show up while scaling and performance tuning large scale web application in this blog.

Lets start by defining few terms to create common understanding and vocabulary. Later on I will go through different issues that pop-up while scaling web application like

  • Architecture bottlenecks
  • Scaling Database
  • CPU Bound Application
  • IO Bound Application

Determining optimal thread pool size of an web application will be covered in next blog.


Click to read more ...


Stuff The Internet Says On Scalability For May 9th, 2014

Hey, it's HighScalability time:

NASA captures Guatemala volcano erupting from space 
  • 40,000 exabytes: from now until 2020, the digital universe will about double every two years; $650,000: amount raised by the MaydayPAC in one week.
  • Quotable Quotes:
    • @BenedictEvans: Masayoshi Son: $20m initial investment in Alibaba, current stake worth $58bn.
    • @iamdevloper: I sneezed earlier and Siri compiled it to valid Perl.
    • @cdixon: "There is not enough competition in the last mile market to allow a true market to function" 
    • @PatrickMcFadin: Get ready for some serious server density. AMD is working on K12, brand-new x86 and ARM cores. This plus 8T SSD? 

  • With age comes changing priorities. Facebook is now 10 and has grown up. They are no longer moving fast and breaking things. They are now into the stability thing. Letting developers know they are a stable platform. The play is to get all that beautiful data from developers by being the platform for the Internet. On which an ad platform is built like a castle protecting a river valley. Interesting that Twitter said No! to becoming a platform, turning away developers. What has happened to Twitter's growth? The thought processes that lead to such different conclusions about the future would be interesting to understand.

  • Better than a Tauntaun roasted over an open light saber. An ode to 17 database in 33 minutes - RailsConf2014 by tobyhede. My favorite is "MySQL - The same as PostgreSQL but controlled by an evil overlord." 

  • Well, when you explain it that way...why GNU grep is fast: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE; GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.

  • How Gilt's Insane Traffic Spikes Pushed It Off Rails To Scala. It's unusual to have your expected traffic pattern to be a 100x spike once a day for 15 minutes, but that's the life of flash sales. Started as a Rails app. That didn't scale. They switched from Java to Scala because the Java system became too monolithic. They also bought into Akka and the whole Reactive platform idea. Architecture is terms of hundreds of microservices. Microservices keep a wall between unrelated services, reduces complexity, and keeps development friction-free.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Click to read more ...


Update on Disqus: It's Still About Realtime, But Go Demolishes Python

Our last article on Disqus: How Disqus Went Realtime With 165K Messages Per Second And Less Than .2 Seconds Latency, was a little out of date, but the folks at Disqus have been busy implementing, not talking, so we don't know a lot about what they are doing now, but we do have a short update in C1MM and NGINX by John Watson and an article Trying out this Go thing.

So Disqus has grown a bit:

  • 1.3 billion unique visitors
  • 10 billion page views
  • 500 million users engaged in discussions
  • 3 million communities
  • 25 million comments

They are still all about realtime, but Go replaced Python in their Realtime system:

Click to read more ...


The Quest for Database Scale: the 1 M TPS challenge - Three Design Points and Five common Bottlenecks to avoid

This a guest post by Rajkumar Iyer, a Member of Technical Staff at Aerospike.

About a year ago, Aerospike embarked upon a quest to increase in-memory database performance - 1 Million TPS on a single inexpensive commodity server. NoSQL has the reputation of speed, and we saw great benefit from improving latency and throughput of cacheless architectures. At that time, we took a version of Aerospike delivering about 200K TPS, improved a few things - performance went to 500k TPS - and published the Aerospike 2.0 Community Edition. We then used kernel tuning techniques and published the recipe for how we achieved 1 M TPS on $5k of hardware.

This year we continued the quest. Our goal was to achieve 1 Million database transactions per second per server; more than doubling previous performance. This compares to Cassandra’s boast of 1M TPS on over 300 servers in Google Compute Engine - at a cost of $2 million dollars per year. We achieved this without kernel tuning. 

This article describes the three design points we kept in mind and the five common bottlenecks we avoided to create a simpler recipe you can follow for high performance operations with Aerospike.

Three Design Points

Click to read more ...


Stuff The Internet Says On Scalability For May 2nd, 2014

Hey, it's HighScalability time:

Google's new POWER8 server motherboard
  • 1 trillion: number of scents your nose can smell; millions of square feet: sprawling new server farms
  • Quotable Quotes:
    • Gideon Lewis-Kraus: As the engineer and writer Alex Payne put it, these startups represent “the field offices of a large distributed workforce assembled by venture capitalists and their associate institutions,” doing low-overhead, low-risk R&D for five corporate giants. In such a system, the real disillusionment isn’t the discovery that you’re unlikely to become a billionaire; it’s the realization that your feeling of autonomy is a fantasy, and that the vast majority of you have been set up to fail by design.
    • @aphyr: I was already sold on immutability, pure functions, combinators, etc. What forced me out of Haskell was the impenetrable, haphazard docs.
    • @monicalent: Going to be migrating away from Rackspace to save some cash. Recommendations? Both Digital Ocean and Linode are half the price for 1GB RAM.
    • Linus Torvalds: So while I'd love for 'make' to be super-efficient, at the same time I'd much rather optimize the kernel to do what make needs really well, and have CPU's that don't take too long either.
    • Joe Landman: when the networking revolution comes, the cheap switches will be the first ones against the wall
    • @etherealmind: Buying public cloud can say I can't afford a house so I'll buy a tent. Because that works just as well, right ?
    • Neil DeGrasse Tyson: The act of doing it perfectly is the measure of it going unnoticed.[....] when it's done perfectly it goes unnoticed or, at best, it's just taken for granted.
  • How we scaled Freshdesk (Part I) – Before Sharding. Requests per week boomed from 2 million to 65 million. They scaled vertically for as long as they could to handle the increased load. Increasing RAM, CPU and I/O. First using read slaves for their heavily read weighted traffic, then assigning queries to particular slaves. Writes still needed to scale, so they turned on MySQL partitioning. Then caching of objects and html partials. They also used different storage engines for different functions. RedShift for analytics and data mining. Redis for state and as a job queue. But in the end they had to embrace the shard.

  • This took some guts. Fouresquare uses data driven decision making to decide to deportalze their app: We looked at the session analysis and saw that only 1 in 20 sessions had both social and discovery. Why not actually just split those apart, because 19 out of 20 times, tapping on one icon or the other, you have satisfied your need completely.

  • The blockchain story is bullshit: Looking at the blockchain from a realist’s standpoint, it is not obvious that there is a need for a worse-performing database, that an unregulated oligarchy has disproportionate power over, that isn’t improved with administrator arbitration. It looks like a technology looking for a problem to solve, rather than a technology created to solve a problem.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Click to read more ...


Paper: Can Programming Be Liberated From The Von Neumann Style? 

Famous computer scientist John Backus, he's the B in BNF(Backus-Naur form) and the creator of Fortran, gave a Turing Award Lecture titled Can programming be liberated from the von Neumann style?: a functional style and its algebra of programs, that has layed out a division in programming that lives long after it was published in 1977. 

It's the now familiar argument for why functional programming is superior:

The assignment statement is the von Neumann bottleneck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer's bottleneck does.


The second world of conventional programming languages is the world of statements. The primary statement in that world is the assignment statement itself. All the other statements of the language exist in order to make it possible to perform a computation that must be based on this primitive construct: the assignment statement.

Here's a response by Dijkstra A review of the 1977 Turing Award Levgure by John Backus. And here's an interview with Dijkstra.

Great discussion on a recent Hacker News thread and an older thread. Also on Lambda the Ultimate. Nice summary of the project by David Bolton

There's nothing I can really add to the discussion as much smarter people than me have argued this endlessly. Personally, I'm more of a biology than a math and a languages are for communicating with people sort of programmer. So the argument from formal systems have never persuaded me greatly. Doing a proof of a bubble sort in school was quite enough for me. Its applicability to real complex systems has always been in doubt.

The problem of how to best utilize distributed cores is a compelling concern. Though the assumption that parallelism has to be solved at the language level and not how we've done it, at the system level, is not as compelling.

It's a passionate paper and the discussion is equally passionate. While nothing is really solved, if you haven't deep dived into this dialogue across the generations, it's well worth your time to do so. 


10 Tips for Optimizing NGINX and PHP-fpm for High Traffic Sites

Adrian Singer has boiled down 7 years of experience to a set of 10 very useful tips on how to best optimize NGINX and PHP-fpm for high traffic sites:

  1. Switch from TCP to UNIX domain sockets. When communicating to processes on the same machine UNIX sockets have better performance the TCP because there's less copying and fewer context switches.
  2. Adjust Worker Processes. Set the worker_processes in your nginx.conf file to the number of cores your machine has and  increase the number of worker_connections.
  3. Setup upstream load balancing. Multiple upstream backends on the same machine produce higher throughout than a single one.
  4. Disable access log files. Log files on high traffic sites involve a lot of I/O that has to be synchronized across all threads. Can have a big impact.
  5. Enable GZip
  6. Cache information about frequently accessed files
  7. Adjust client timeouts.
  8. Adjust output buffers.
  9. /etc/sysctl.conf tuning.
  10. Monitor. Continually monitor the number of open connections, free memory and number of waiting threads and set alerts if thresholds are breached. Install the NGINX stub_status module.

Please take a look at the original article as it includes excellent configuration file examples.

Page 1 ... 6 7 8 9 10 ... 173 Next 10 Entries »