Friday
Sep122014

Stuff The Internet Says On Scalability For September 12th, 2014

Hey, it's HighScalability time:


Each dot in this image is an entire galaxy containing billions of stars. What's in there?
  • Quotable Quotes:
    • mseepgood: Or "another language that's becoming popular, Node.js"
    • Joe Moreno: What good are billions of cycles of CPU power that make me wait. I shouldn't have to wait longer and longer due to launching, buffering, syncing, I/O and latency.
    • @stevecheney: Apple Pay is the magic that integrated hardware / software produces. No one else in the world can do this.
    • @etherealmind: Next gen Intel Xeon E5 V3 CPU includes packet processor for 40GBE, 30x increase in OpenSSL crypto, 25% increase in DPDK perf. #IDF14
    • @pbailis: There's actually an interesting question in understanding when to break "sharing" -- at core, NUMA domain, server, or cluster level?
    • @fmueller_bln: Just wait some minutes for vagrant to provision a vm with puppet and you’ll know why docker may be better option for dev machines...

  • Encryption will make fighting the spam war much costlier, reveals Mike Hearn in an awesome post: A brief history of the spam war, where he gives insightful color commentary on the punch and counterpunch between World Heavyweight Champion Google and the challenger, Clever Spammer. Mike worked in the Gmail trenches for over four years and recommends: make sending email cost money; use money to create deposits using bitcoin.

  • jeswin: No other browser can practically implement or support Dart. If they do their implementation will be slower than Google's, and will get classified as inferior. < Ignoring the merits of Dart, this is an interesting ecosystem effect. By rating sites for reasons other than the quality of their content, Google can in effect select for characteristics over which they have a comparative advantage. It's not an arm's-length transaction.

  • Dateline Seattle. Social media users execute a coordinated denial of service attack on cell networks, preventing those in need from accessing emergency services. Who are these terrorists? Football fans. City of Seattle asks people to stop streaming videos, posting photos because of football. Tweets, Instagram, YouTube, and Snapchat are overloading the cell networks so calls can't get through. Should the cell network expand capacity? Should there be an app tax to constrain demand? Should users pay per packet? As a 49ers fan I have another suggestion...move games to a different venue, perhaps the moon. That will help.

  • Are you a militant cable cutter who thinks the future of TV is the Internet? Not so fast, says Dan Rayburn in Internet Traffic Records Could Be Broken This Week Thanks To Apple, NFL, Sony, Xbox, EA and Others: Delivering video over the Internet at the same scale and quality that you do over a cable network isn’t possible. The Internet is not a cable network and if you think otherwise, you will be proven wrong this week. We’re going to see long download times, more buffering of streams, more QoS issues and ISPs that will take steps to deal with the traffic.

  • Ted Nelson takes on the impossible in How Bitcoin Actually Works (Computers for Cynics #7). And he does an excellent job, sharing his usual insight with a twist. The title is misleading, however. There's hardly any cynicism. How disappointing! Ted is clearly impressed with the design and implementation of bitcoin. For good reason. No matter what you think of bitcoin and its potential role in society, it is a very well thought out and impressive piece of technology. On par with Newton, Mr. Nelson suggests. If you watch this you'll probably realize that you don't actually understand bitcoin, even if you think you do, and that's a good thing.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Sep102014

10 Common Server Setups For Your Web Application

If you need a good overview of different ways to set up your web service, then Mitchell Anicas has written a good article for you: 5 Common Server Setups For Your Web Application.

We've even included a few additional possibilities at no extra cost.

  1. Everything on One Server. Simple. Potential for poor performance because of resource contention. Not horizontally scalable. 
  2. Separate Database Server. There's an application server and a database server. Application and database don't share resources. Can independently vertically scale each component. Increases latency because the database is a network hop away.
  3. Load Balancer (Reverse Proxy). Distribute workload across multiple servers. Native horizontal scaling. Protection against DDoS attacks using rules. Adds complexity. Can be a performance bottleneck. Complicates issues like SSL termination and sticky sessions.
  4. HTTP Accelerator (Caching Reverse Proxy). Caches web responses in memory so they can be served faster. Reduces CPU load on web server. Compression reduces bandwidth requirements. Requires tuning. A low cache-hit rate could reduce performance. 
  5. Master-Slave Database Replication. Can improve read and write performance. Adds a lot of complexity and failure modes.
  6. Load Balancer + Cache + Replication. Combines load balancing the caching servers and the application servers, along with database replication. Nice explanation in the article.
  7. Database-as-a-Service (DBaaS). Let someone else run the database for you.  RDS is one example from Amazon and there are hosted versions of many popular databases.
  8. Backend as a Service (BaaS). If you are writing a mobile application and you don't want to deal with the backend component then let someone else do it for you. Just concentrate on the mobile platform. That's hard enough. Parse and Firebase are popular examples, but there are many more.
  9. Platform as a Service (PaaS). Let someone else run most of your backend, but you get more flexibility than you have with BaaS to build your own application. Google App Engine, Heroku, and Salesforce are popular examples, but there are many more.
  10. Let Someone Else Do it. Do you really need servers at all? If you have a store then a service like Etsy saves a lot of work for very little cost. Does someone already do what you need done? Can you leverage it?
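
Item 4's warning that a low cache-hit rate could reduce performance can be illustrated with a little arithmetic. This is only a sketch: the 2 ms cache lookup and 50 ms origin latency are hypothetical numbers, not figures from the article, and the model assumes a miss pays for both the cache lookup and the origin request.

```python
def expected_latency_ms(hit_rate, cache_ms=2.0, origin_ms=50.0):
    """Average response time behind a caching reverse proxy.

    A hit costs one cache lookup; a miss costs the lookup plus the
    origin request. The default latencies are illustrative assumptions.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_ms + miss_rate * (cache_ms + origin_ms)


for rate in (0.0, 0.5, 0.9):
    print(f"hit rate {rate:.0%}: {expected_latency_ms(rate):.1f} ms")
```

Under these assumptions a cache that never hits is slower than no cache at all, because every request pays for the lookup on top of the origin's 50 ms. That is why the accelerator needs tuning before it pays off.
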
Monday
Sep082014

How Twitter Uses Redis to Scale - 105TB RAM, 39MM QPS, 10,000+ Instances 

Yao Yue has worked on Twitter’s Cache team since 2010. She recently gave a really great talk: Scaling Redis at Twitter. It’s about Redis of course, but it's not just about Redis.

Yao has worked at Twitter for a few years. She's seen some things. She’s watched the cache service at Twitter explode from serving just one project to serving nearly a hundred. That's many thousands of machines, many clusters, and many terabytes of RAM.

It's clear from her talk that she's coming from a place of real personal experience, and that shines through in the practical way she explores issues. It's a talk well worth watching.

As you might expect, Twitter has a lot of cache.

Timeline Service for one datacenter using Hybrid List:
  • ~40TB allocated heap
  • ~30MM qps
  • > 6,000 instances
Use of BTree in one datacenter:
  • ~65TB allocated heap
  • ~9MM qps
  • >4,000 instances

You'll learn more about BTree and Hybrid List later in the post.
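
The headline figures in the title (105TB RAM, 39MM QPS, 10,000+ instances) are the sums of the two lists above. Here is a quick check, along with rough per-instance averages. The dictionary layout and function names are purely illustrative, and because the instance counts are quoted as lower bounds, the per-instance averages are upper bounds at best.

```python
# Approximate figures quoted above for one datacenter.
clusters = {
    "Hybrid List (Timeline Service)": {"heap_tb": 40, "qps_millions": 30, "instances": 6000},
    "BTree": {"heap_tb": 65, "qps_millions": 9, "instances": 4000},
}


def totals(data):
    """Sum each metric across all clusters."""
    keys = ("heap_tb", "qps_millions", "instances")
    return {k: sum(c[k] for c in data.values()) for k in keys}


def per_instance(c):
    """Rough per-instance averages: heap in GB (taking 1 TB = 1000 GB) and qps."""
    heap_gb = c["heap_tb"] * 1000 / c["instances"]
    qps = c["qps_millions"] * 1_000_000 / c["instances"]
    return heap_gb, qps


print(totals(clusters))
for name, c in clusters.items():
    heap_gb, qps = per_instance(c)
    print(f"{name}: ~{heap_gb:.1f} GB heap and ~{qps:,.0f} qps per instance")
```

The averages suggest many small instances rather than a few huge ones: a few gigabytes of heap and a few thousand queries per second each.
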

A couple of points stood out:

  • Redis is a brilliant idea because it takes underutilized resources on servers and turns them into a valuable service.
  • Twitter specialized Redis with two new data types that fit their use cases perfectly. So they got the performance they needed, but it locked them into an older code base and made it hard to merge in new features. I have to wonder, why use Redis for this sort of thing? Just create a timeline service using your own data structures. Does Redis really add anything to the party?
  • Summarize large chunks of log data on the node, using your local CPU power, before saturating the network.
  • If you want something that’s high performance separate the fast path, which is the data path, away from the slow path, which is the command and control path. 
  • Twitter is moving towards a container environment with Mesos as the job scheduler. This is still a new approach so it's interesting to hear about how it works. One issue is the Mesos wastage problem that stems from the requirement to specify hard resource usage limits in a complicated runtime world.
  • A central cluster manager is really important to keep a cluster in a state that’s easy to understand.
  • The JVM is slow and C is fast. Their cache proxy layer is moving back to C/C++.
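
The point above about summarizing log data on the node can be sketched as follows. This is not Twitter's implementation. The log format (`<timestamp> <event_type> <details>`) and the function name are hypothetical, chosen only to show raw lines collapsing into a handful of counters before anything crosses the network.

```python
from collections import Counter


def summarize_locally(log_lines):
    """Collapse raw log lines into per-event counts on the node that produced them.

    Assumes a hypothetical format: "<timestamp> <event_type> <details...>".
    Lines that do not match the format are skipped.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=2)
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return dict(counts)


raw = [
    "1410500000 cache_miss key=a",
    "1410500001 cache_hit key=b",
    "1410500002 cache_hit key=c",
    "1410500003 cache_hit key=b",
]
summary = summarize_locally(raw)
print(summary)  # only this small summary would be shipped, not every raw line
```

The trade-off is that detail is lost at the source, so the summary has to be chosen with the downstream questions in mind.
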
With that in mind, let's learn more about how Redis is used at Twitter:

Why Redis?

Click to read more ...

Friday
Sep052014

Stuff The Internet Says On Scalability For September 5th, 2014

Hey, it's HighScalability time:


Telephone Tower, late 1880s, 5000 telephone lines. Switching FTW.
  • 1.3 trillion: row table in SQL Server; 100,000: galaxies in the Laniakea supercluster.
  • Quotable Quotes:
    • @pbailis: OLAP: data at rest, queries in motion. Stream processing: data in motion, queries at rest. PSoup: data in motion, queries in motion.
    • @ronpepsi: Scaling rule: addressing one bottleneck always starts the clock ticking on another one. (The same goes for weak links in chains.)
    • @utahkay: Our mental models are deterministic, and break down when you reach high utilization in a stochastic system. 

  • Instagram introduced Hyperlapse, their answer to a world that doesn't move fast enough already. And here's the story of how they did it: The Technology behind Hyperlapse from Instagram. It combines time travel and psychedelics. I think you'll enjoy it.

  • Etsy CEO to Businesses: If Net Neutrality Perishes, We Will Too. The idea of being a common carrier is old, deep, and powerful. It creates markets that grow rather than monopolies that choke economies to death. Ferries were required to be common carriers; that is, they had to ferry all people and goods at the same price. Otherwise communities would not survive. AT&T became a monopoly on the promise of universal service and of becoming a common carrier for all. The Internet is a more important version of the same idea.

  • To make lots and lots of money you need to hitch your star to a fast growing something. Google placed ads on an exponentially expanding inventory of 3rd party web content. Winner. Now Google is exploiting another phenomenon experiencing an exponential growth curve: data. This time they aren't placing ads, they are calculating functions with BigQuery. Put On Your Streaming Shoes is a story showing just why and how this jump to another fast growing something will likely succeed.

  • Just an incredible look into the structure behind PhotoGate. Notes on the Celebrity Data Theft. These aren't just script kiddies. These are sophisticated and organized groups. Are hacker networks the new roving band of Vikings looking to rape and pillage? Though it would help if the villages were better protected.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Sep032014

Strategy: Change the Problem

James T. Kirk's infamous gambit in Starfleet's impossible-to-win Kobayashi Maru test was to redefine the problem into a challenge he could beat.

Interestingly, an article titled Shifts In Algorithm Design says something like the same gambit is the modern method of solving algorithmic problems.

In the past: 

I, Dick, recall the “good old days of theory.” When I first started working in theory—a sort of double meaning—I could only use deterministic methods. I needed to get the exact answer, no approximations. I had to solve the problem that I was given—no changing the problem.

 

In the good old days of theory, we got a problem, we worked on it, and sometimes we solved it. Nothing shifty, no changing the problem or modifying the goal. 

Today:

Click to read more ...

Tuesday
Sep022014

Sponsored Post: Apple, Scalyr, Tumblr, Gawker, FoundationDB, CopperEgg, Logentries, BlueStripe, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here. 
    • Site Reliability Engineer. The iOS Systems team is building out a Site Reliability organization. In this role you will be expected to work hand-in-hand with the teams across all phases of the project lifecycle to support systems and to take ownership as they move from QA through integrated testing, certification and production.  Please apply here.
    • Server Software Engineer - Maps Community. As an engineer working on Maps Community services, your primary responsibility will be backend server software development for the services that power our data crowdsourcing efforts. You’ll be part of a small team working in Java and Scala to add new features and improve our core infrastructure, leveraging best-of-breed frameworks for scalable distributed computing. Please apply here.

  • Make Tumblr fast, reliable and available for hundreds of millions of visitors and tens of millions of users. As a Site Reliability Engineer you are a software developer with a love of highly performant, fault-tolerant, massively distributed systems. Apply here.

  • Systems & Networking Lead at Gawker. We are looking for someone to take the initiative on the lowest layers of the Kinja platform. All the way down to power and up through hardware, networking, load-balancing, provisioning and base-configuration. The goal for this quarter is a roughly 30% capacity expansion, and the goal for next quarter will be a rolling CentOS7 upgrade as well as planning/quoting/pitching our 2015 footprint and budget. For the full job spec and to apply, click here: http://grnh.se/t8rfbw

  • FoundationDB is seeking outstanding developers to join our growing team and help us build the next generation of transactional database technology. You will work with a team of exceptional engineers with backgrounds from top CS programs and successful startups. We don’t just write software. We build our own simulations, test tools, and even languages to write better software. We are well-funded, offer competitive salaries and option grants. Interested? You can learn more here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop its user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

Fun and Informative Events

  • Your event here.

Cool Products and Services

  • Better, Faster, Cheaper: Pick Three. Scalyr is your universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs”; our columnar data store enables enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – get on board!

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • Whitepaper Clarifies ACID Support in Aerospike. In our latest whitepaper, author and Aerospike VP of Engineering & Operations, Srini Srinivasan, defines ACID support in Aerospike, and explains how Aerospike maintains high consistency by using techniques to reduce the possibility of partitions. 

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com: Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
Sep012014

Let's Build Maker Cities for Maker People Around New Resources Like Bandwidth, Compute, and Atomically-Precise Manufacturing

TL;DR: There’s a lot of unused space in North America. Yet cities like San Francisco are becoming ever more expensive because of a bubble created by high tech jobs that seemingly can be done anywhere. Historically cities are built around resources that provide some service to humans. The age of infrastructure rising around physical resources is declining while the age of digital resource exploitation is rising. Cities are still valuable because they are amazing idea and problem-solving machines. How about we create thousands of new Maker Cities in the vast emptiness that is North America and build them around digital resources like bandwidth, compute power, Atomically-Precise Manufacturing (APM), and all things future and bright?

Observation Number One: There’s lots of empty space out there.

Click to read more ...

Friday
Aug292014

Stuff The Internet Says On Scalability For August 29th, 2014

Hey, it's HighScalability time:


In your best Carl Sagan voice...Billions and Billions of Habitable Planets.
  • Quotable Quotes:
    • @Kurt_Vonnegut: Another flaw in the human character is that everybody wants to build and nobody wants to do maintenance.
    • @neil_conway: "The paucity of innovation in calculating join predicate selectivities is truly astounding."
    • @KentBeck: power law walks into a bar. bartender says, "i've seen a hundred power laws. nobody orders anything." power law says, "1000 beers, please".
    • @CompSciFact: RT @jfcloutier: Prolog: thinking in proofs Erlang: thinking in processes UML: wishful thinking

  • For your acoustic listening pleasure let me present...The Orbiting Vibes playing Scaling Doesn't Matter. I don't quite understand how it relates to scaling, but my deep learning algorithm likes it. 

  • The Rise of the Algorithm. Another interesting podcast with James Allworth and Ben Thompson. Much pondering of how to finance content. Do you trust content with embedded affiliate links? Do you trust content written by writers judged on their friendliness to advertisers? Why trust at all is the bigger question. Facebook is the soft news advertisers love. Twitter is the hard news advertisers avoid. A traditional newspaper combined both. Humans are the new horses. < Capitalism doesn't care if people are employed any more than it cared about horses being employed. Employment is simply a byproduct of inefficient processes. The faith that the future will provide is deliciously ironic given the rigorous rationalism underlying most of the episodes.

  • Great reading list for Berkeley CS286: Implementation of Database Systems, Fall 2014. 

  • Is it just me or is it totally weird that all the spy systems use the same diagrams that any other project would use? It makes it seem so...normal. The Surveillance Engine: How the NSA Built Its Own Secret Google.

  • The Mathematics of Herding Sheep. My little border collie Annie embodies a very smart algorithm to herd sheep: When sheep become dispersed beyond a certain point, dogs put their effort into rounding them up, reintroducing predatory pressure into the herd, which responds according to selfish herd principles, bunching tightly into a more cohesive unit. < What's so disturbing is how well this algorithm works with people.

  • Inside Google's Secret Drone-Delivery Program. What I really want are pick-up drones, where I send my drone to pick stuff up. Or are pick-up and delivery cars a better bet? Though I can see swarms of drones delivering larger objects in parts that self-assemble.

  • Lambda Architecture at Indix: "break down the various stages in your data pipeline into the layers of the architecture and choose technologies and frameworks that satisfy the specific requirements of each layer."

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Aug272014

The 1.2M Ops/Sec Redis Cloud Cluster Single Server Unbenchmark

This is a guest post by Itamar Haber, Chief Developers Advocate, Redis Labs.

While catching up with the world the other day, I read through the High Scalability guest post by Anshu and Rajkumar from Aerospike (great job btw). I really enjoyed the entire piece and was impressed by the heavy tweaking that they did to their EC2 instance to get to the 1M mark, but I kept wondering - how would Redis do?

I could have done a full-blown benchmark. But doing a full-blown benchmark is a time- and resource-consuming ordeal. And that's without taking into account the initial difficulties of comparing apples, oranges and other sorts of fruits. A real benchmark is a trap, for it is no more than an effort deemed from inception to be backlogged. But I wanted an answer, and I wanted it quick, so I was willing to make a few sacrifices to get it. That meant doing the next best thing - an unbenchmark.

An unbenchmark is, by (my very own) definition, nothing like a benchmark (hence the name). In it, you cut every corner and relax every assumption to get a quick 'n dirty ballpark figure. Leaning heavily on the expertise of the guys in our labs, we measured the performance of our Redis Cloud software without any further optimizations. We ran our unbenchmark with the following setup:

Click to read more ...

Monday
Aug252014

MixRadio Architecture - Playing with an Eclectic Mix of Services

This is a guest repost by Steve Robbins, Chief Architect at MixRadio.

At MixRadio, we offer a free music streaming service that learns from listening habits to deliver people a personalised radio station, at the single touch of a button. MixRadio marries simplicity with an incredible level of personalization, for a mobile-first approach that will help everybody, not just the avid music fan, enjoy and discover new music. It's as easy as turning on the radio, but you're in control - just one touch of Play Me provides people with their own personal radio station.
 
The service also offers hundreds of hand-crafted expert and celebrity mixes categorised by genre and mood for each region. You can also create your own artist mix, and mixes can be saved for offline listening when there is no signal, such as during underground travel, which also reduces data use and costs.
 
Our apps are currently available on Windows Phone, Windows 8, Nokia Asha phones and the web. We’ve spent years evolving a back-end that we’re incredibly proud of, despite being British! Here's an overview of our back-end architecture.

 

Architecture Overview

Click to read more ...