Friday
Dec 18, 2015

Stuff The Internet Says On Scalability For December 18th, 2015

Hey, it's HighScalability time:


In honor of a certain event what could be better than a double-bladed lightsaber slicing through clouds? (ESA/Hubble & NASA)

 

If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.
  • 66,000 yottayears: lifetime of an electron; 3 Gbps: potential throughput for backhaul microwave networks; 1.2 trillion: yearly Google searches; $100 trillion: global investible capital; 2.5cm: range of chip powered by radio waves; 

  • Quotable Quotes:
    • @KarenMN: He's making a database / He's sorting it twice / SELECT * from contacts WHERE behavior = 'nice' / SQL Clause is coming to town
    • abrkn: Every program attempts to expand until it has an app store. Those programs which cannot so expand are replaced by ones which can.
    • Amin Vahdat: Some recent external measurements indicate that our [Google] backbone carries the equivalent of 10 percent of all the traffic on the global Internet. The rate at which that volume is growing is faster than for the Internet as a whole.
    • Prismatic:  we also learned content distribution is a tough business and we’ve failed to grow at a rate that justifies continuing to support our Prismatic News products.
    • On General Pershing: Pershing was the way he was because he knew that winning wars was in the details. Troops who paid attention to the small things would master the big things. 
    • jbob2000: Wow! A single developer working on small websites doesn't need MVC? What a revelation! I bet he doesn't have any pesky problems, such as; working in large teams, long term support, developer turn over, documentation, changing requirements, deadlines, scaling, etc. etc. Oh, but the rendered HTML looks nice!
    • Poldrack: That was totally unexpected, but it shows that being caffeinated radically changes the connectivity of your brain
    • @ValaAfshar: Uber is less than 6 years old and now valued more than 80% of S&P 500 companies.
    • @HNTitles: Scaling Pinterest - From 0 to Startup: How We Use That. What startups use to prevent concussions
    • @Carnage4Life: Top 5 qualities of successful teams at Google 1 Failure is OK 2 Dependability 3 Clear structure 4 Meaning 5 Impact
    • Ustun Ozgur: The tides have changed there too. Now, you need just two endpoints: One for serving the initial HTML, one for the API endpoints. This is the essence of web programming in the future: Two endpoints to rule them all.
    • @ErlangerNick: US: 1 brewery per 78k people, 10 new breweries per week. UK: 1 brewery per 50k people, 15 new breweries per week when scaling populations.
    • jerf: When rewriting something, you should generally strive for a drop-in replacement that does the same thing, in some cases, even matching bug-for-bug
    • @EricMinick: "We found that where code deployments are most painful, you’ll find the poorest IT performance... and culture" - 2015 Puppet State of DevOps
    • @StartupLJackson: I'm going on the record to say the killer app for Bitcoin is not turning $1 of electricity into $.50 of BTC. 
    • @nntaleb: Paris blokes missed the point that it is not just temp rising, but its volatility rising more than the average! 2nd order effect=fragility
    • Julian Dunn: Unfortunately, I believe that the “large attack surface” is a fundamental design problem with containers being an evolutionary, not a revolutionary step from VMs and bare metal.
    • The Shade Tree Developer: sharing a database is like drug abusers sharing needles.
    • Joe Young: Keurig coffee machines are the bane of my trade. They are not built to last, some rarely make it a year in our business. They have no replaceable parts, so I can not fix them.
    • wh-uws: This is why slack is winning. They took many of the concepts of what makes irc great and put a much better user experience on top. Why is that so hard for people to understand?
    • @chamath: New VC dynamics: Returns being generated by new firms. Legacy firms increasingly dated and out of touch. 

  • The Talk Show interviewed Apple senior vice president of software engineering Craig Federighi about Swift. The upshot wasn't anything technical, it was a feeling: If you were worried that Apple is going to dangle Swift, get you pot committed, and then pull it out from under you, that seems highly unlikely. It's clear from the interview Apple is using Swift, they are excited about Swift, and it's here to stay. Plan accordingly. John Siracusa is dead on in his discussion of garbage collection. Swift is using ARC instead of garbage collection, which is a bet on determinism winning over virtual machine based language approaches, which is a good bet IMHO, even in the age of more powerful mobile processors.

  • Elon Musk’s Billion-Dollar AI Plan Is About Far More Than Saving the World. Those AIs are so clever. How do you distribute AIs as deep and wide into society as possible? You make it free and open! That's how the AIs are going to take over, riding the open source meme to victory. 

  • It's odd how in software we try to reduce coupling at all costs, yet in biology every opportunity to communicate and create feedback loops is exploited. Maybe it's we who are doing it wrong? Cells send tiny parcels to each other: cells package various molecules into tiny bubble-like parcels called extracellular vesicles to send important messages - in sickness and health.

  • Now that's disaster planning! Elon Musk worries third World War would ruin Mars mission.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


Wednesday
Dec 16, 2015

How Does the Use of Docker Affect Latency?

A great question came up on the mechanical-sympathy list that many others probably have as well: 

I keep hearing about [Docker] as if it is the greatest thing since sliced bread, but I've heard anecdotal evidence that low latency apps take a hit. 

Who better to answer than Gil Tene, Vice President of Technology, CTO, and co-founder of Azul Systems? Like Stephen Curry draining a deep transition three, Gil can always be counted on for his insight.

And here's Gil's answer:

Putting aside questions of taste and style, and focusing on the effects on latency (the original question), the analysis from a pure mechanical point of view is pretty simple: Docker uses Linux containers as a means of execution, with no OS virtualization layer for CPU and memory, and with optional (even if default is on) virtualization layers for i/o. 

CPU and Memory

From a latency point of view, Docker's (and any other Linux container's) CPU and memory latency characteristics are pretty much indistinguishable from Linux itself. But the same things that apply to latency behavior in Linux apply to Docker.

If you want clean & consistent low latency, you'll have to do the same things you need to do on non-dockerized and non-containerized Linux for the same levels of consistency. E.g. if you needed to keep the system as a whole under control (no hungry neighbors), you'll have to do that at the host level for Docker as well.

If you needed to isolate sockets or cores and choose which processes end up where, expect to do the same for your docker containers and/or the threads within them.

If you were numactl'ing or doing any sort of directed numa-driven memory allocation, the same will apply.

And some of the stuff you'll need to do may seem counter-style to how some people want to deploy docker, but if you are really interested in consistent low latency, you'll probably need to break out the toolbox and use the various cgroups, tasksets and other cool stuff to assert control over how things are laid out. But if/when you do, you won't be able to tell the difference (in terms of CPU and memory latency behaviors) between a dockeriz'ed process and one that isn't.
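As a rough illustration of "asserting control over how things are laid out", the sketch below builds a `docker run` command line that restricts a container to specific cores and NUMA memory nodes. `--cpuset-cpus` and `--cpuset-mems` are existing `docker run` options; the function name, image name, and core numbers are hypothetical and would depend on your own host topology.

```python
def pinned_docker_cmd(image, cpus, numa_nodes=None):
    """Build a `docker run` argv that pins a container to given cores.

    cpus       -- core list in cpuset syntax, e.g. "2,3" or "4-7"
    numa_nodes -- optional NUMA memory node list, e.g. "0"
    """
    cmd = ["docker", "run", "--rm", f"--cpuset-cpus={cpus}"]
    if numa_nodes is not None:
        # Keep memory allocations on the same NUMA node(s) as the cores.
        cmd.append(f"--cpuset-mems={numa_nodes}")
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Hypothetical image name and core layout, for illustration only.
    print(" ".join(pinned_docker_cmd("my-latency-app:latest", "2,3", "0")))
```

This only covers placement of the container itself. Keeping other work off those cores is still a host-level job, exactly as it would be without Docker.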

I/O

Disk I/O

I/O behavior under various configurations is where most of the latency overhead questions (and answers) usually end up. I don't know enough about disk i/o behaviors and options in docker to talk about it much. I'm pretty sure the answer to anything throughput and latency sensitive for storage will be "bypass the virtualization and volumes stuff, and provide direct device access to disks and mount points".

Networking

The networking situation is pretty clear: If you want one of those "land anywhere and NAT/bridge with some auto-generated networking stuff" deployments, you'll probably pay dearly for that behavior in terms of network latency and throughput (compared to bare metal dedicated NICs on normal linux). However, there are options for deploying docker containers (again, may be different from how some people would like to deploy things) that provide either low-overhead or essentially zero-latency-overhead network links for docker. Start with host networking and/or use dedicated IP addresses and NICs, and you'll do much better than the bridged defaults. But you can go to things like Solarflare's NICs (which tend to be common in bare metal low latency environments already), and even do kernel bypass, dedicated spinning-core network stack things that will have a latency behavior no different (on Docker) than if you did the same on bare metal Linux.
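One way to see what a given networking mode costs is to run the same small round-trip probe against the same echo service on bare metal, on a bridged container, and on a host-networked container, then compare the distributions. The sketch below is an assumption-laden starting point, not a rigorous benchmark: all function names are hypothetical, it uses plain TCP with one-byte pings, and the demo only exercises loopback.

```python
import socket
import statistics
import threading
import time


def echo_server(listener):
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def measure_rtt(host, port, samples=200):
    """Return round-trip times in microseconds for one-byte TCP pings."""
    rtts = []
    with socket.create_connection((host, port)) as client:
        # Disable Nagle so each tiny write is sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(samples):
            start = time.perf_counter()
            client.sendall(b"x")
            client.recv(64)
            rtts.append((time.perf_counter() - start) * 1e6)
    return rtts


def summarize(rtts):
    """Report median, approximate 99th percentile, and max."""
    ordered = sorted(rtts)
    p99 = ordered[min(len(ordered) - 1, int(len(ordered) * 0.99))]
    return {
        "median_us": statistics.median(ordered),
        "p99_us": p99,
        "max_us": ordered[-1],
    }


def run_loopback_demo(samples=200):
    """Start a local echo server and probe it over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    thread = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    thread.start()
    rtts = measure_rtt("127.0.0.1", port, samples)
    thread.join(timeout=2)
    listener.close()
    return rtts


if __name__ == "__main__":
    print(summarize(run_loopback_demo()))
```

For a real comparison, point `measure_rtt` at an echo server on another machine and look at the tail (p99, max) rather than the median, since that is where bridging and NAT overhead tends to show up.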

 

Docker (which is "userland as a unit") is not about packing lots of things into a box. Neither is guest-OS-as-a-unit virtualization. Sure, they can both be used for that (and often are), but the biggest benefit they both give is the ability to ship around a consistent, well captured configuration. And the ability to develop, test, and deploy that exact same configuration. This later turns into being able to easily manage deployment and versioning (including roll backs), and being able to do cool things like elastic sizing, etc. There are configuration tools (puppet/chef/...) that can be used to achieve similar results on bare metal as well, of course (assuming they truly control everything in your image), but the ability to pack up your working stuff as a bunch of bits that can "just be turned on" is very appealing.

I know people who use virtualization even with a single guest-per-host (e.g. an AWS r3.8xlarge instance type is probably that right now). And people who use docker the same way (single container per host). In both cases, it's about configuration control and how things get deployed, and not at all about packing things in a smaller footprint.

The low latency thing then becomes a "does it hurt?" question. And Docker hurts a lot less than hypervisor or KVM based virtualization does when it comes to low latency, and with the right choices for I/O (dedicated NICs, cores, and devices), it becomes truly invisible.


Monday
Dec 14, 2015

Does AMP Counter an Existential Threat to Google?

When AMP (Accelerated Mobile Pages) was first announced it was right in line with Google’s long-standing project to make the web faster. Nothing seemingly out of the ordinary.

Then I listened to a great interview on This Week in Google with Richard Gingras, Head of News at Google, that made it clear AMP is more than just another forward looking initiative from Google. Much more.

What is AMP? AMP is two things. AMP is a restricted subset of HTML designed to make the web fast on mobile devices. AMP is also a strategy to counter an existential threat to Google: the mobile web is in trouble and if the mobile web is in trouble then Google is in trouble.

In the interview Richard says (approximately):

The alternative [to a strong vibrant community around AMP] is devastating. We don’t want to see a decline in the viability of the mobile web. We don’t want to see poor experiences on the mobile web propel users into proprietary platforms.

This point, or something very like it, is repeated many times during the interview. With ad blocker usage on the rise there’s a palpable sense of urgency to do something. So Google stepped up and took leadership in creating AMP when no one else was doing anything that aligned with the principles of the free and open web.

The irony for Google is that advertising helped break the web. We have fouled our own nest.

Why now? Web pages are routinely between 2MB and 10MB in size for only 80K worth of content. The blimpification of web pages comes from two general sources: beautification and advertising. Lots of code and media are used to make the experience of content more compelling. Lots of code and media are used in advertising.

The result: web pages have become very very slow. And a slow web is a dead web, especially in the parts of the world without fast or cheap mobile networks, which is much of the world. For many of these people the Internet consists of their social network, not the World Wide Web, and that’s not a good outcome for lots of people, including Google. So AMP wants to make people fall in love with the web again by speeding it up using a simple, cacheable, and open format.

Does AMP work? Pinterest found AMP pages load four times faster and use eight times less data than traditional mobile-optimized pages. So, yes.

Is AMP being adopted? Seems like it. Some of those on board are: WordPress, Nuzzel, LinkedIn, Twitter, Fox News, The WSJ, The NYT, Huffington Post, BuzzFeed, The Washington Post, BBC, The Economist, FT, Vox Media, LINE, Viber, Tango, comScore, Chartbeat, Google Analytics, Parse.ly, Network18, and many more. Content publishers clearly see value in the survival of the web. Developers like AMP too. There are over 4500 developers on the AMP GitHub project.

When will AMP start? Google will reportedly send traffic to AMP pages in Google Search starting in late February, 2016.

Will Google advantage AMP in search results? Not directly says Google, but since faster sites rank better, AMP will implicitly rank higher compared to heavier weight content. We may have a two tiered web: the fast AMP based web and the slow bloated traditional web. Non AMP pages can still be made fast of course, but all of human history argues against it.

The AMP talk featured a well balanced panel representing a wide variety of interests. Leo Laporte, famous host and founder of TWiT, represents the small content publisher. He views AMP with a generally positive yet skeptical eye. AMP is open source, but it is still controlled by Google, so is the web still the open web? Jeff Jarvis is a journalism professor and a long time innovative thinker on how journalism can stay alive in the modern era. Jeff helped inspire the idea of AMP and sees AMP as a way publishers can distribute content to users on whatever form of media users are consuming. Kevin Marks is as good a representative for the free and open web as you could ask for. Matt Cutts, as a very early employee at Google, is of course pro Google, but he also represents an engineering perspective. Richard Gingras is the driving force behind AMP at Google. He’s also a compelling evangelist for AMP and the need for a true new Web 2.0.

Here’s a gloss of the discussion. I’m not attributing who said what, just the outstanding points that help reveal AMP’s vision for the future of the open web:

Origin Story


Friday
Dec 11, 2015

Stuff The Internet Says On Scalability For December 11th, 2015

Hey, it's HighScalability time:


Cheesy Star Trek graphics? Nope. It's hot gas streaming into Pandora’s Cluster.

 

If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.

  • 100 million: John Henry as played by a conventional computer loses to a quantum computer; 400,000: cores in PayPal's OpenStack deployment; 10TB: max size of Google Cloud SQL database; 9%: Kickstarter projects that don't deliver; $2.3 trillion: worth of The Forbes 400 members; billions: worth of Spanish treasure ship;

  • Quotable Quotes:
    • Pandalicious: I actually expect that down the road most large open source projects will start distributing a standardized build environment via docker containers. 
    • @glasnt: "Optimise for speed flexibility & evolution" "Whoever is iterating faster has a huge advantage" - @adrianco #yow15 
    • @erikbryn: LIDAR goes from $75K to $500, leaves Moore's Law in the Dust
    • Henry Miller: One has to believe wholeheartedly in what one is doing, realize that it is the best one can do at the moment—forego perfection now and always!—and accept the consequences which giving birth entails.
    • @jedws: "uber is way more reliable on Saturday and Sunday because there are no engineers working on the system" #yow15
    • @samkottle: "Waffles are like kubernetes on a dish" -@rbranson
    • @brian_klaas: No server is easier to manage than no server, but are we moving all the complexity to the front-end?
    • @Carnage4Life: Death of #unbundling part 2: Facebook shutting down lab which shipped side apps like Hello, Rooms & Slingshot 
    • @carlosfairgray: Efforts to drive uncertainty out of development have only driven innovation out of development. #yow15 @DReinertsen 
    • : “Let’s legislate secure cryptographic backdoors” is the 21st century’s “let’s pass a law to make π = 3”
    • @jessitron: To call an API, or just grab it from the database? Don't tap into another team at the spine. Talk to their faces.
    • Brian Chesky: One of the keys to get to scale, is to do things that don’t scale. One other important lesson within this lesson is — 100 customers who love you > 1,000,000 users.
    • IbanezDavy: The areas of where we expect quantum computers to be faster are roughly known. There are cases where classical computers will still perform better than a quantum computer. But D Wave has been criticized of not truly having a quantum computer, so I think they are motivated in just demonstrating that they do indeed have one.
    • @tiagogriffo: "We developed the product so fast that marketing had not time to change the requirements" said a PM. From @DReinertsen talk at #yow15
    • @xaprb: push 10,000 metrics/sec at 1-sec resolution for 1000 servers for a year and see if it scales forever ;-)
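To put the @xaprb quote in numbers, here is a back-of-the-envelope sketch. The quote is ambiguous about whether 10,000 metrics/sec is the total across the 1000 servers or the rate per server, so both readings are computed. The 16 bytes per point (an 8-byte timestamp plus an 8-byte value, uncompressed, no indexes or tags) is purely an assumption for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def points_per_year(points_per_sec):
    """Number of data points written in one year at a steady rate."""
    return points_per_sec * SECONDS_PER_YEAR


def raw_terabytes(points, bytes_per_point=16):
    """Uncompressed size in TB, assuming a fixed size per point."""
    return points * bytes_per_point / 1e12


# Reading 1: 10,000 points/sec across the whole fleet.
fleet_total = points_per_year(10_000)

# Reading 2: 10,000 points/sec on each of 1000 servers.
per_server = points_per_year(10_000 * 1_000)

if __name__ == "__main__":
    for label, points in (("fleet total", fleet_total), ("per server", per_server)):
        print(f"{label}: {points:,} points/year, ~{raw_terabytes(points):,.1f} TB raw")
```

Under the first reading that is roughly 315 billion points a year; under the second it is a thousand times that, which is where "scales forever" claims start to get tested.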

  • Apple has open sourced Swift for reals, not just a code dump months too late to be of use. Swift is on GitHub, you can look at the code, see the entire version history from the very first check-in, see what's changing, contribute, file bugs, etc. So it's a real open source project. Apple is even porting key frameworks like their Foundation libraries over to Swift. If you are looking for the one language to rule them all, that can run fast enough on the server, be used for web apps, and run on mobile, Swift is making the case for being that language, which is no doubt what Apple also wants it for. Incentives align. Expect developers to quickly fill out the tool chain. How does Swift compare? Go vs Node vs Rust vs Swift. Swift is fast, but lacks language primitives for parallelism. 

  • Ruby can be much faster. 25,000+ Req/s for Rack JSON API with MRuby: MRuby is a minimal version of Ruby, that can be embedded in any system that supports C...There is a new HTTP web server called H2O, which is really, really fast...When H2O is compiled, it embeds a MRuby interpreter that can be used to run Ruby code. The result: an astonishing 28,000+ requests per second.

  • Fox guarding the chickens. U.S. states pass laws backing Uber’s view of drivers as contractors.

  • In the same way there's always a tradeoff between ASIC and white box solutions, there's also an ebb and flow between domain specific languages and general purpose languages. Google replaced Sawzall, a DSL for performing powerful, scalable analysis, with a software ecosystem built around Go. Replacing Sawzall — a case study in domain-specific language migration. The result: we’ve found that with carefully designed libraries we can get most of the benefits of Sawzall in Go while gaining the advantages of a powerful general-purpose language. The overall response of analysts to these changes has been extremely positive. Today, logs analysis is one of the most intensive users of Go at Google, and Go is the most-used language for reading logs through the logs proxy.

  • There's a new data mining Barbie. The new talking Hello Barbie doll has the mind of Siri: "Equipped with Siri-like voice-recognition software and a wi-fi connection, Hello Barbie can respond to questions from kids about everything from her favorite color to career goals." Unfortunately I can't take credit for the data mining comment, I heard it on TWiT

  • If you have 70 data caching stations around the world connected with fast links and you are already expert at caching your own content, starting your own CDN makes a lot of sense. So that's what Google did. Cloud CDN. Interestingly, Google may be trying to turn these caching stations into datacenters, so says Google's Secret Plan to Catch Up to Amazon and Microsoft in Cloud. If you could use Kubernetes to place work on the edge and combine that with some kind of multi-datacenter database, you would have yourself very low latency access to a lot of mobile devices.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


Wednesday
Dec 9, 2015

Free Red Book: Readings in Database Systems, 5th Edition

For the first time in ten years there has been an update to the classic Red Book, Readings in Database Systems, which offers "readers an opinionated take on both classic and cutting-edge research in the field of data management."

Editors Peter Bailis, Joseph M. Hellerstein, and Michael Stonebraker curated the papers and wrote pithy introductions. Unfortunately, links to the papers are not included, but a kindly wizard, Nindalf, gathered all the referenced papers together and put them in one place.

What's in it?

  • Preface 
  • Background introduced by Michael Stonebraker 
  • Traditional RDBMS Systems introduced by Michael Stonebraker 
  • Techniques Everyone Should Know introduced by Peter Bailis 
  • New DBMS Architectures introduced by Michael Stonebraker
  • Large-Scale Dataflow Engines introduced by Peter Bailis 
  • Weak Isolation and Distribution introduced by Peter Bailis 
  • Query Optimization introduced by Joe Hellerstein 
  • Interactive Analytics introduced by Joe Hellerstein 
  • Languages introduced by Joe Hellerstein 
  • Web Data introduced by Peter Bailis 
  • A Biased Take on a Moving Target: Complex Analytics by Michael Stonebraker 
  • A Biased Take on a Moving Target: Data Integration by Michael Stonebraker


Tuesday
Dec 8, 2015

Sponsored Post: StatusPage.io, Redis Labs, Jut.io, SignalFx, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Senior Devops Engineer - StatusPage.io is looking for a senior devops engineer to help us in making the internet more transparent around downtime. Your mission: help us create a fast, scalable infrastructure that can be deployed to quickly and reliably.

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI Engineer - AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data - AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

Fun and Informative Events

  • Your event could be here. How cool is that?

Cool Products and Services

  • Real-time correlation across your logs, metrics and events.  Jut.io just released its operations data hub into beta and we are already streaming in billions of log, metric and event data points each day. Using our streaming analytics platform, you can get real-time monitoring of your application performance, deep troubleshooting, and even product analytics. We allow you to easily aggregate logs and metrics by micro-service, calculate percentiles and moving window averages, forecast anomalies, and create interactive views for your whole organization. Try it for free, at any scale.

  • Turn chaotic logs and metrics into actionable data. Scalyr replaces all your tools for monitoring and analyzing logs and system metrics. Imagine being able to pinpoint and resolve operations issues without juggling multiple tools and tabs. Get visibility into your production systems: log aggregation, server metrics, monitoring, intelligent alerting, dashboards, and more. Trusted by companies like Codecademy and InsideSales. Learn more and get started with an easy 2-minute setup. Or see how Scalyr is different if you're looking for a Splunk alternative or Sumo Logic alternative.

  • SignalFx: just launched an advanced monitoring platform for modern applications that's already processing 10s of billions of data points per day. SignalFx lets you create custom analytics pipelines on metrics data collected from thousands or more sources to create meaningful aggregations--such as percentiles, moving averages and growth rates--within seconds of receiving data. Start a free 30-day trial!

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM, and ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex goes beyond monitoring and measures the system's work on your servers, providing unparalleled insight and query-level analysis. This unique approach ultimately enables your team to work more effectively, ship more often, and delight more customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...


Monday
Dec 7, 2015

The Serverless Start-up - Down with Servers!


This is a guest post by Marcel Panse and Sander Nagtegaal from Teletext.io.

In our early Peecho days, we wrote an article explaining how to build a really scalable architecture for next to nothing, using Amazon Web Services. Auto-scaling, merciless decoupling and even automated bidding on unused server capacity were the tricks we used back then to operate on a shoestring. Now, it is time to take it one step further.

We would like to introduce Teletext.io, also known as the serverless start-up - again, entirely built around AWS, but leveraging only the Amazon API Gateway, Lambda functions, DynamoDB, S3 and CloudFront.
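For readers who have not seen the pattern, here is a minimal sketch of what a function behind an API gateway can look like. It is not Teletext.io's code. It assumes the API Gateway Lambda proxy event and response shape (`pathParameters`, `statusCode`, `body`), and it uses an in-memory dictionary as a stand-in for a DynamoDB table so that it runs without any AWS account; the table contents and key names are hypothetical.

```python
import json

# Stand-in for a DynamoDB table (assumption); a real function would
# read from the database through the AWS SDK instead.
FAKE_TABLE = {"homepage.title": "Hello from a function"}


def respond(status, payload):
    """Build a response in the API Gateway proxy format."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Look up one content key and return it as JSON."""
    key = (event.get("pathParameters") or {}).get("key")
    if key not in FAKE_TABLE:
        return respond(404, {"error": "not found"})
    return respond(200, {"key": key, "value": FAKE_TABLE[key]})


if __name__ == "__main__":
    print(handler({"pathParameters": {"key": "homepage.title"}}, None))
```

There is no process to keep alive and no port to listen on: the platform invokes `handler` once per request, which is what makes the "no servers" rule workable.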

The Virtues of Constraint

We like rules. At our previous start-up Peecho, product owners had to do fifty push-ups as payment for each user story that they wanted to add to an ongoing sprint. Now, at our current company myTomorrows, our developer dance-offs are legendary: during the daily stand-ups, you are only allowed to speak while dancing - leading to the most efficient meetings ever.

This way of thinking goes all the way into our product development. It may seem counter-intuitive at first, but constraints fuel creativity. For example, all our logo design is done with technical diagramming tool Omnigraffle, so there is no way we could use hideous lens flares and such. Anyway - recently, we launched yet another initiative called Teletext.io. So, we needed a new restriction.

At Teletext.io, we are not allowed to use servers. Not even one.

It was a good choice. We will explain why.

Why Servers are Bad


Friday
Dec 4, 2015

Stuff The Internet Says On Scalability For December 4th, 2015

Hey, it's HighScalability time:


Change: Elliott $800,000 in 1960, 8K RAM, 2kHz CPU vs Raspberry Pi Zero, $5, 1GHz, 512MB

 

If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.

  • 434,000: square-feet in Facebook's new office; $62.5 billion: Uber's valuation; 11: DigitalOcean datacenters; $4.45 billion: Black Friday online sales; 2MPH: speed news traveled in 1500; 95: percent of world covered by mobile broadband; 86%: items Amazon delivers that weigh less than five pounds.

  • Quotable Quotes:
    • Jeremy Hsu: Is anybody thinking about how we’ll have to code differently to accommodate the jump from a 1-exaflop supercomputer to 10 exaflops? There is not enough attention being paid to this issue.
    • @kml: “Process drives away talent” - @adrianco at #yow15
    • capkutay: Seems like a lot of the momentum behind containers is driven by the Silicon Valley investment community.
    • @taotetek: IoT is turning homes into datacenters with no system administrators and no security team.
    • @asymco: On Thursday and early Friday, mobile traffic accounted for nearly 60% of all online shopping traffic, and 40% of all online sales
    • Mobile App Developers are Suffering: It’s just too saturated. The barriers to adoption and therefore monetization are too high. It’s easier on the web.
    • Taleb: It is foolish to separate risk taking from the risk management of ruin.
    • Maxime Chevalier-Boisvert:  I believe dynamic languages are here to stay. They can be very nimble, in ways that statically typed languages might never be able to match. We’re at a point in time where static typing dominates mainstream thought in the programming world, but that doesn’t mean dynamic languages are dead.
    • @__edorian: "Can i have a static linked binary?" - "No that's stupid, it's slower and takes more space!" - "Can i have a docker image?" - "Sure!"
    • @grzegorz_dyk: When I see people talking about fine grained #microservices I am thinking: why not use actors? #akka #erlang
    • Henry Miller: When you can’t create you can work.
    • @ValaAfshar: For the first time ever, online media consumption is bigger than TV consumption. 
    • @matthewfellows: I learned today that Airbus code is reviewed by hand... in raw assembly code #yow15 @dius_au
    • Rich Hickey: Programmers know the benefits of everything and the tradeoffs of nothing
    • Robin Harris: Cheap storage is changing the world. Whether it is in the cloud, on a dash cam, or embedded in an app, cheap – as in inexpensive – storage is enabling new relationships between individuals, and with culture, power, and groups.
    • @sustrik: libmill shows 1400x performance improvement in c10k scenarios. Wow! I love low-hanging fruit.
    • @jmckenty: At Scale: Bigger than what you’ve got now.
    • John Cage: My notion of how to proceed in a society to bring change is not to protest the thing that is evil, but rather to let it die its own death.
    • @b6n: preemptively blog about how you scaled to support the million users you don't have yet.
    • @joeweinman: When will the FCC start addressing app neutrality?
    • @ufried: i have this post about data scalability always open in a tab, just to remind me of some essentials once in a while 

  • Personalization is getting more personal and more useful. Personalized Nutrition: Healthy foods are unique to individuals: Israeli research teams have demonstrated that there exists a high degree of variability in the responses of different individuals to identical meals...Using their set of amassed data, the researchers then went a step further, applying machine-learning algorithm to their cohort of 800 participants and developing an algorithm capable of predicting individualized PPGRs (postprandial (post-meal) glycemic responses). This intricate algorithm incorporates 137 features representing meal content, daily activity, blood parameters, CGM-derived features, questionnaires, and microbiome features.

  • Now that's putting concertina wire on the walled garden fence. WhatsApp is blocking links to a competing messenger app.

  • As programming is a creative act, perhaps the ultimate creative act, this advice applies to programmers too. Ira Glass: Nobody tells this to people who are beginners, I wish someone told me. All of us who do creative work, we get into it because we have good taste. But there is this gap. For the first couple years you make stuff, it’s just not that good. It’s trying to be good, it has potential, but it’s not. But your taste, the thing that got you into the game, is still killer. And your taste is why your work disappoints you. A lot of people never get past this phase, they quit. Most people I know who do interesting, creative work went through years of this. We know our work doesn’t have this special thing that we want it to have. We all go through this. And if you are just starting out or you are still in this phase, you gotta know its normal and the most important thing you can do is do a lot of work. Put yourself on a deadline so that every week you will finish one story. It is only by going through a volume of work that you will close that gap, and your work will be as good as your ambitions. And I took longer to figure out how to do this than anyone I’ve ever met. It’s gonna take awhile. It’s normal to take awhile. You’ve just gotta fight your way through.

  • So you want a revolution, what will be the cost? It’s a Trap: Emperor Palpatine’s Poison Pill: In this case study we found that the Rebel Alliance would need to prepare a bailout of at least 15%, and likely at least 20%, of GGP in order to mitigate the systemic risks and the sudden and catastrophic economic collapse. Without such funds at the ready, it is likely the Galactic economy would enter an economic depression of astronomical proportions.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Dec012015

Deep Lessons from Google and eBay on Building Ecosystems of Microservices

When you look at large scale systems from Google, Twitter, eBay, and Amazon, their architecture has evolved into something similar: a set of polyglot microservices.

What does it look like when you are in the polyglot microservices end state? Randy Shoup, who worked in high level positions at both Google and eBay, has a very interesting talk exploring just that idea: Service Architectures at Scale: Lessons from Google and eBay.

What I really like about Randy's talk is how he is self-consciously trying to immerse you in the experience of something you probably have no experience of: creating, using, perpetuating, and protecting a large scale architecture.

In the Ecosystem of Services section of the talk Randy asks: What does it look like to have a large scale ecosystem of polyglot microservices? In the Operating Services at Scale section he asks: As a service provider what does it feel like to operate such a service? In the Building a Service section he asks: When you are a service owner what does it look like? And in the Service Anti-Patterns section he asks: What can go wrong?

A very powerful approach.

The highlight of the talk for me was the idea of aligning incentives, a consistent theme that crosscuts the entire endeavour. While never explicitly pulled out as a separate strategy, it's the motivation behind why you want small teams to develop small clean services, why a charge back model for internal services is so powerful, how architecture can evolve without an architect, how clean design can evolve from a bottom up process, and how standards can evolve without a central committee.

My takeaway is that deliberately aligning incentives is how you scale both a large, dynamic organization and a large, dynamic code base. Putting the right incentives in place nudges things into happening without explicit control, in much the same way that more work gets done in a distributed system when you remove locks, don't share state, communicate with messages, and parallelize everything.

Let's see how large scale systems are built in the modern era...

Polyglot Microservices are the End Game

Click to read more ...

Friday
Nov272015

Stuff The Internet Says On Scalability For November 27th, 2015

Hey, it's HighScalability time:


The most detailed picture of the Internet ever as compiled by an illegal 420,000-node botnet.
  • $40 billion: P2P lending in China; 20%: amount of all US margin expansion accounted for by Apple since 2010; 11: years of Saturn photos; 117: number of different steering wheels offered for a VW Golf; 1Gbps: speed of a network using a lightbulb.

  • Quotable Quotes:
    • @jaksprats: If we could compile a subset of JavaScript to Lua, JS could run on Server(Node,js), Browser, Desktop, iOS, & Android.JS could run EVERYWHERE
    • @wilkieii: Tech: "Don't roll your own crypto if you aren't an expert" *replaces nutrition with Soylent, currency with bitcoin* *puts wifi in lightbulb*
    • @brianpeddle: The architecture of one human brain would require a zettabyte of capacity. Full simulation of a human brain by 2023.
    • MarshalBanana: That can still easily be the right choice. Complex algorithms trade asymptotic performance for setup cost and maintenance cost. Sometimes the tradeoff isn't worth it.
    • kevindeasi: There are so many things to know nowadays. Backend: Sql, NoSql, NewSql, etc. Middlware: Django, NodeJs, Spring, Groovy, RoR, Symfony, etc. Client: Angular, Ember, React, Jquery, etc. I haven't even mentioned hardware, security, servers/cloud, and api. Now you also need to know about theory, UI/UX, git, deploying servers, HTTP, scrum, software development process, testing.
    • Brian Chesky~ It was better to have 100 people who loved us vs. 1M people who liked us. All movements grow this way.
    • idlewords: All the advantages of a dedicated server without the hassle of saving tons of money.
    • jorangreef: Well, how would you handle massive traffic spikes? Through a combination of vertical and horizontal scaling? Through having excess capacity? Except that I would probably want to start with something fast and inexpensive to begin with.
    • @jaykreps: "The bigger the interface, the weaker the abstraction"--@rob_pike
    • Animats: That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful.
    • @jaykreps: Distributed computing is the new normal: Mesos, K8s = dist'd processes; Cassandra, Kafka, etc = dist'd data; microservices = dist'd apps.
    • @bradfitz: OH: "Well you can add nodes to the cluster. They made that work well, but you can't remove them. It's the Hotel California of auto-scaling."
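
The Nagle and delayed-ACK interaction Animats describes above is commonly sidestepped in latency-sensitive clients by disabling Nagle's algorithm on the socket. The following is a minimal sketch, not anything taken from the quoted comment; the helper names `make_low_latency_socket` and `nagle_disabled` are made up for illustration, while `TCP_NODELAY` is the standard socket option.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY turns off Nagle's coalescing of small writes, so a small
    # write is sent right away instead of waiting for an outstanding ACK.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is set on the socket (hypothetical helper)."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

The tradeoff is more small packets on the wire, so this tends to suit request/response traffic better than bulk transfers.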

  • Creating Your Own EC2 Spot Market -- Part 2. Video encoding represents 70% of Netflix's computing needs. And Netflix has a daily peak of 12,000 unused instances. So they created their own spot market to improve encoding throughput by the equivalent of a 210% increase in encoding capacity. Using their update real-time approach they were able to perform an encoding job in 18 hours that they expected to take a few days. Great article with a lot of deep thinking on the topic.

  • Amen! We should come up with a catchy name for RAII so more languages support it because RAII is awesome and simplifies code!

  • Google as a cloud company instead of an ad company? It could happen: Google's Holzle Envisions Cloud Business Eclipsing Ads in 2020. Google announced Custom Machine Types so you can configure the number of virtual CPUs and the amount of RAM you want for your machine. I imagine this nifty feature is enabled by Google's advanced datacenter scheduling software, but it will take more than that to beat AWS and Azure. To take market share Google may need to instigate a price war. Though it looks like Google might make a lot of money charging back to Google.

  • Good explanation of what serverless computing is by Leonardo Federico: the phrase “serverless” doesn’t mean servers are no longer involved. It simply means that developers no longer have to think "that much" about them. Computing resources get used as services without having to manage around physical capacities or limits. Let's take for example AWS Lambda. "Lambda allows you to NOT think about servers. Which means you no longer have to deal with over/under capacity, deployments, scaling and fault tolerance, OS or language updates, metrics, and logging."
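
To make the "not thinking about servers" point concrete, here is a rough sketch of what a Python function for AWS Lambda can look like. It is not from Leonardo Federico's article: the function name `handler`, the `name` field in the event, and the HTTP-style response shape are all assumptions chosen for illustration. The only thing Lambda itself fixes is that the function receives an event and a context object.

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point: the platform calls this per request."""
    # The event is assumed to be a dict with an optional "name" field.
    name = event.get("name", "world")
    # No server setup, scaling, or deployment logic appears in the function;
    # the code only maps an input event to a response.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be exercised locally by calling `handler({"name": "HighScalability"}, None)` before it is ever deployed.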
