Stuff The Internet Says On Scalability For March 2nd, 2018

Hey, it's HighScalability time: 


Algorithms described like IKEA instructions. Can anyone assemble these? (Algorithms and data structures)


If you like this sort of Stuff then please support me on Patreon. And please consider recommending my new book, Explain the Cloud Like I'm 10, to the whole entire world.


  • $75 million: Dropbox saved moving out of S3; 159 million: Spotify monthly active users; 80 million: more records added to Have I Been Pwned; 9%: universe expanding faster than predicted; $2,222,279: Warren Buffett won his long bet against hedge fund managers; 60,000: Mayan houses found in Guatemala using LiDAR; $14.2 billion: PaaS revenue; ~180 million: years until first sun after the big whatever it was; $1,599: cost of stolen Extended Validation (EV) certificate; 8,000X: query speedup using GPU database; 2.4 million: Google requests to be forgotten; 6 minutes: time to IoT device attack on the internet; 103 million: tweets sent about the Olympics; 320,000: increase in Chloe Kim's twitter followers; 150 kg: acorns stored by woodpeckers in a telecom antenna; 0.14ms: Fsync performance on Intel PC-3700; Q: earliest known article on Wikipedia; 800Gbps+: memcached reflection/amplification attacks; 2M+: Google-Landmarks image training set; 3 million: graphics cards purchased by cryptocurrency miners; 30%: Uber and Lyft drivers lose money; 

  • Quotable Quotes:
    • @mikko: Interesting point raised in a reddit thread: Satoshi's original bitcoins are now a quantum canary. Once we see them moving, we’ll know that someone has a functioning advanced quantum computer. It's just too big a prize not to be the first thing you’d do with a quantum computer.
    • @brettberson: I just learned from a former longtime Amazon employee, the idea for Prime came from an IC [individual contributor] engineer. He wrote up a 6 page memo. He was inspired by the Costco membership model. It was built as a test. It's now the key pillar of Amazon. The best ideas can come from anywhere.
    • Erica Klarreich: In a statistical analysis of nearly 1,000 networks drawn from biology, the social sciences, technology and other domains, researchers found that only about 4 percent of the networks (such as certain metabolic networks in cells) passed the paper’s strongest tests. And for 67 percent of the networks, including Facebook friendship networks, food webs and water distribution networks, the statistical tests rejected a power law as a plausible description of the network’s structure.
    • @kpkelleher: 531 ICOs that appeared in 2017 have already vanished. Together, they raised $233 million.
    • @darrenrovell: In November 2013, Jamie Siminoff came on Shark Tank valuing his WiFi enabled video doorbell at $7 million. Four sharks passed & @kevinolearytv offered his typical loan/royalty deal. Siminoff passed. That company became @ring & today sold for more than $1 billion to Amazon.
    • @tyleralove: As of today @bustle has fully adopted serverless. We’re down to 15 ec2 instances mostly comprised of self-managed HA Redis. We serve upwards of a billion requests to 80 million people using SSR preact and react a month. We are a thriving example of modern JavaScript at scale. We do all of this with a relatively tiny engineering team of 12 while simultaneously building compelling to use product that was never focused on social media audience gaming or egregious engagement metric hacking.
    • abetusk: I've heard, and agree with, that 95% of programming doesn't require any deep CS knowledge. The flip side of that is 5% of the time you will and for those 1/20 times you encounter a problem that requires theory, you're dead in the water unless you know how to identify it, how to solve it or where to look for solutions to it.
    • @geofft: At which point Trustico's CEO decided to EMAIL 23,000 CUSTOMER PRIVATE KEYS to Digicert, apparently in order to trigger that clause.
    • @swardley: The ONLY reason that Amazon is as big as it is today and continuing to grow rather than being constrained (as normally happens) is because competitor executives have utterly failed to adapt. This is not a market failure, it's a failure of executives ..
    • Andrei Barysevich: Contrary to a common belief that the security certificates circulating in the criminal underground are stolen from legitimate owners prior to being used in nefarious campaigns, we confirmed with a high degree of certainty that the certificates are created for a specific buyer per request only and are registered using stolen corporate identities, making traditional network security appliances less effective.
    • @KentonVarda: After I open sourced Protocol Buffers, the promo committee denied me for promotion (from Senior to Staff) because my packet contained no peer reviews from more-senior engineers who worked closely with me. (There were no such engineers.)
    • Jared Diamond: Why is there such widespread public opposition to science and scientific reasoning in the United States, the world leader in every major branch of science?
    • @sama: I wonder how much cryptocurrency is slowing the rate of AI progress by wildly driving up the price of GPUs...
    • kmagnum~ It's a little ridiculous the makers [IBM] of the sh*tlord application called Websphere would say deploying an app should be less complicated. Let me describe to you the hello world introduction to making a websphere website
    • @adrianco: if you are building the kind of apps that traditionally needed a DR plan and backup datacenter then you should be designing in Chaos Engineering from the start. That’s my current focus.
    • @swardley: X: What do you think the future of kubernetes & containers will be? Me: They might be very important invisible subsystems of the serverless future but only if Amazon and Alibaba heavily support it. Otherwise, they'll become mere blips in technology history, a forgotten path.
    • Daniel Lemire: What is the verdict? On a skylake processor, both reach an identical speed, at 0.35 cycles per input integer. The instruction count is nearly identical. Thus it is probably not worth bothering with shifts by immediate values for performance reasons.
    • @brettberson: AdWords is another example. It was broken and not usable until Jeff Dean saw a note on a company fridge and decided to spend a weekend taking a crack at it. The story is detailed in @DanielCoyle's new book The Culture Code.
    • @ykanellopoulos: "You can use an eraser on the table or a sledgehammer in the construction site". So much for finding bugs in production and not during design or even development.  Quote from the architect Frank Lloyd Wright brought up by @r0ml in #OReillySACon.
    • @mikeloukides: Chaos experiments look a lot like integration tests. Disconnecting (or adding latency) between modules. @nora_js #OReillySACon
    • Jimmy Wales: I mean, so here's the thing. If you think about the DNA of any organization, it's very difficult to stop an organization from following the money. So Wikipedia is a nonprofit, and as a nonprofit, we could run ads - no legal prohibition on a nonprofit running ads as a means of support.
    • Vaclav Smil: Another order-of-magnitude increase is thus needed before PV will rival global hydroelectricity generation, which supplied more than 16 percent of world demand in 2016. Not even the most optimistic forecast—that of the International Renewable Energy Agency [PDF]—expects PV output to close that gap by 2030. But PV cells might be generating 10 percent of the world’s electricity by 2030.
    • @mipsytipsy: sigh.  sooner or later people are gonna realize that the only reasonable and practical tradeoffs are the ones we have made.  we didn't make this up for shits and giggles. sample, don't aggregate; events, not metrics; explore, don't dashboard; event traces, not event strings.
    • Rob Pike: These are like the ur-controls for the iPod and (later) iPhone, but anticipate the music player by almost three decades. In fact, the CERN knob is better than the click wheel: It is programmable to be smooth, indexed, or with variable turning resistance and spring return. It was inspirational to feel how it responded when turned in the various modes. Apple is very good at commercializing ideas, but big research institutions such as CERN, erstwhile Xerox PARC and Bell Labs excel at creating the ideas themselves.
    • @codinghorror: "The story I hear is that Cellebrite hires ex-Apple engineers and moves them to countries where Apple can't prosecute them under the DMCA or its equivalents"
    • @mokargas: Over a time-span of say, 4 years, any sizeable company will have projects using Grunt, Gulp, Webpack, NPM vanilla or some combination of those. Fire up any sort of legacy project, let's say, a year old, and it won't work. Require hours of troubleshooting.
    • @marcoarment: I wonder how many iOS apps are secretly burning their users’ CPUs and batteries to mine Bitcoin as a revenue stream. Whatever the number is, I bet it gets much higher this year as developers and ad networks get more desperate and it becomes “standard” practice to ad-tech people.
    • @hichaelmart: The ultimate step beyond FaaS will be invocations as a service. Sure, you can provide a function API, just like the Go Lambda runtime does, but really you're uploading an RPC server. Allow ppl to package that in a container and boot it fast, and voila.
    • @patio11: Minor heresy: "Because that's where the money is" is a perfectly valid reason to study programming or seek a job in software.
    • Jesse Allen: The researchers tested their devices for harvesting waste heat energy near room temperature. Their device produced an electrical energy of 2.3 meV per heat cycle between around 25 and 50 degrees Celsius. This result reflected an efficiency of around 1.0%, although the theoretical maximum for this device should be around 8.7%.
    • Rajeev Suri~ More and more traffic in the backbone of the Internet will be in the networks of web-scale companies, the Apples, Googles, Alibabas, Tencents, and others of the world. This is already under way and will only accelerate in the coming years. The growth in data center to data center traffic is going to outstrip the growth of mobile data. What is happening is that traffic is moving off service provider networks and nearer to the end user, and it is then moving onto the big backbone network of web-scale players. This is a fundamental change.
    • @RealSexyCyborg: Almost no Chinese homes have clothes dryers. So as standard of living here continues to improve the rational upgrade path from clotheslines is of course IOT app-enabled robot clotheslines. @internetofshit
    • freakboy2k: Not sure what larger lesson can be learned from my experience though. Maybe that small companies in areas with no local devs will take on remote contractors because they have no other choice?
    • Bill Gates: I benefited from having a great education - public schools through 6th grade and then a great private School (Lakeside). So there is a good chance I would never have gotten turned on to software and math the way I did and therefore not as successful.
    • @davewiner: Prior to Google's changes, my process for making sure I could find something in the future: 1. Write a blog post about it. I trusted Google would index it. I could search for it and find it. Over the years Google has eroded that trust. It's getting worse.
    • @CubanAnalyst: Generally this is how new ideas form in companies. It's not the CEOs or major shareholders who come up with new business dev/product/process ideas. It's usually the employees who come up with the ideas & of course implement them. Another reason why worker cooperatives make sense.
    • Swizec Teller: The correct response to “How do I scale my app?” is to wait and see and monitor. Your app will tell you where it hurts. Listen. Then fix.
    • Wm Leler: Dart took a different approach to this problem. Threads in Dart, called isolates, do not share memory, which avoids the need for most locks. Isolates communicate by passing messages over channels, which is similar to actors in Erlang or web workers in JavaScript.
    • chucker23n: I mean, that's just a long-winded argument on why they think cross-platform apps are great. We've had that debate since the 90s and we'll continue to have it, and Windows users will continue to hate iTunes as much as iOS users will continue to hate Material Design apps. But that wasn't really my point. The funny thing is that Flutter is pushed by Google, yet offers to be more iOS-like than Google's apps. That's an odd corporate inconsistency.
    • @QuinnyPig: Some days being an AWS customer feels like being in a giant polyamorous family where none of your parents are speaking to one another and you have to mediate.
    • @mikeloukides: We’re trying to build new systems that evolve very quickly. You can’t write down how it really works. Architecture docs never describe the system as built. Synoptic illegibility. @adrianco #OReillySACon
    • @ginablaber: Best thing to do as an architect: ask awkward problems.  Like: what problem are you really trying to solve? @adrianco #OReillySACon 
    • Bryana Knight: This biggest part of the GitHub codebase is an 8-year-old monolith. As a company, we’ve been fortunate enough to see a huge amount of user growth since the company started. User growth means data growth. The schema and setup that worked for GitHub early on, and very much allowed GitHub to get to where it is today with tons of features and an extremely robust API, is not necessarily the right schema and setup for the size GitHub is today. 
    • dzdt: AWS sells optionality. If you build your own data center, you are vulnerable to uncertain needs in a lot of ways. (1) your business scales at a different rate than you planned -- either faster or slower are problems! (2) you have traffic spikes, so you have to over-provision. There is then a tradeoff doing it yourself: do you pay for infrastructure you barely ever use, or do you have reliability problems at peak traffic? (3) your business plans shift or pivot. A big chunk of the Amazon price should be considered as providing flexibility in the future. It isn't fair to compare prices backwards-looking: where you know what were your actual needs and can compare what it would have cost to meet those by AWS vs in house. The valid comparison is forward looking: what will it cost to meet needs over an uncertain variety of scenarios by AWS compared to in-house. The corollary of this is, for a well-established business with predictable needs, going in-house will probably be cheaper. But for a growing or changing or inherently unpredictable business, the flexibility AWS sells makes more sense!
    • @togelius: Standard evolution strategies (invented in the 1970s, can be implemented in 10 lines of code) are comparable in performance to fancy Natural Evolution Strategies, which in turn are comparable in performance to Deep Q-learning, on ALE Atari Games.
    • dgacmu: It's going to be a very exciting multi-company arms race [in ML] -- at minimum, Google, Intel, Nvidia. Microsoft has their FPGAs, Amazon has their rumors. And there are several startups trying to enter the space. I don't think we're looking at stagnating; very much the opposite. It's going to be fantastic for the field.
    • _pdp_: Well we removed all our servers from AWS and replaced them with lambda functions and dynamodb tables which resulted in 4.5 reduction in cost and increased performance by multiple factors. I suppose it all depends on what you are building and how you are building it. If you run servers I think it is no secret that AWS is not the cheapest option around.
    • smooc: Hiring manager here. While your reasons are valid you are missing an important one: Resource scarcity: the engineers that I need allocate to infrastructure I rather have working on user facing features and improvements. Talent is scarce, being able to out source infrastructure frees up valuable engineering time. This is one of the main reasons, for example, that Spotify (I’m not working for them) is moving to google cloud.
    • @DynamicWebPaige: How is @Microsoft differentiating itself as a cloud provider? By being a leader in integrated artificial intelligence; hybrid platforms (via Azure Stack); productivity apps; and security.
    • @joeerl: No - the ultimate reason was to achieve fault tolerance - to do this needs isolation. This implies isolated processes which (accidentally) maps nicely onto multi cores. So multi core performance is a happy side effect of wanting fault tolerance.
    • Ninja Squirrel: I got a 5 months used Aorus GTX 1080 for $550. God bless you if anyone needs to buy a new graphics card these days. I can't wait until Cryptocurrency collapses, so we will be able to buy a graphics card at normal price again.
    • Max Thompson: The most common misconception/myth about programming is that the way things are done is the best way.
    • throwaway2048: You need people to manage AWS instances too, ive never seen any evidence that it actually takes less people at scale.
    • ChuckMcM: The best win is to build out in a data center with a 10G pipe into Amazon's network so that you can spin up AWS only on peaks or while you are waiting to bring your own stuff up. That gives you the best of both worlds.
    • @bradfitz: I'm halfway tempted to learn Dart, just so I don't need to learn Kotlin & Swift & the hot web platform language du jour. I know, I know. Just cry with me.
    • zlynx: Serverless really needs to work on their latency I think. Things will be going great and then there's the oddly weird 2 second delay. I guess it is bringing up a new server or container to run the lambda in. Whereas with your own (or well, Amazon's) machines you can scale up before hitting the limits and not need long pauses. Maybe one day they'll fix that.
    • ucaetano: Industries go through cycles of innovation and concentration. During innovation cycles, many new non-standard products appear with innovative solutions, the entire pie grows really fast. As growth eventually stabilizes, standards become more relevant and consolidation happens, eventually leading to a stagnation that makes the industry ripe for disruption and change again. If you look at processors, you see that with the early custom processors, followed by some standardization and copy around the IBM S/360, followed by more proprietary innovation around the PC era, resulting finally in the x86, eventually disrupted by the mobile chips, which then consolidated around ARM and so on.
    • pathorn: Datacenters are super cheap compared to EC2. (I'm not talking building your own: start by leasing space from existing datacenters). There are a surprising number of places where you can go and lease a rack or ten or a whole room and be up and running in a couple of months. I make the case that colocating pays off at just about any scale, assuming you have $10k in the bank, have a use for at least 40 cores and are able to pay upfront to handle anticipated scale. Hurricane Electric has prices online of $300/mo for a rack. On AWS, a single full c4 machine (36 threads) costs $1.591 per Hour x 24 x 30 = 1145/mo -- this is more than the cost of running a whole rack with 40 machines. Decent internet can be gotten for hundreds per month. Ok, so how about buying your own machines? E5-2630 with 20 threads is $700 x 2 = $1400 + motherboard + disk + ssd brings it to several thousand, so it will pay off in at most 6 months, and we're not even talking bandwidth or storage costs. Depending on the application you could be looking at a payoff after 2-3 months. Worried about installing or remote management? IPMI, iDRAC, etc included with basically every server make this a piece of cake. The only good case for cloud are if you may suddenly scale 10x and can't predict it; don't have $10k in the bank; or don't have 1-2 months to order machines and sign a contract for rack space.
    • walrus01: Musk is talking about the OSI layer 1 and layer 2 of the satellite network, which by definition needs to be something unique and custom. Nobody has ever built a LEO dual-satellite-CPE make-before-break architecture before. The closest thing is the o3b architecture which is intended for very large, costly customer terminals. This will still speak ipv4 and ipv6 just fine. To gain a better understanding of why this needs to be totally unique / custom / proprietary, it is helpful to first have a thorough understanding of current high capacity dedicated geostationary orbit based systems (1:1 SCPC/MCPC with dedicated kHz) and various shared-bandwidth VSAT type systems (TDMA timesliced architecture between one large earth station, one piece of satellite transponder kHz, and a number of N fixed CPE terminals within the satellite's spot beam. After understanding current geostationary architectures, dig into the "How" and "why" o3b was created and has been such a success, and its general system architecture which is proprietary. Satellite engineer here: The OSI layer 1/2 needs to be totally custom because we're dealing with a unique architecture of, just off the top of my head: a) dual satellite LEO architecture b) possible satellite-to-earth station trunk links, and satellite-to-satellite c) CPE terminals that have no moving parts and use phased array antenna systems to talk to two satellites at the same time. From the stationary point of view of a rooftop CPE, the satellite that is currently "rising" from the horizon and will be soon overhead, and the satellite that is currently overhead and will soon pass out of sight. d) Densely packed high frequency spot beams on a moving LEO satellite. The closest thing that's ever been built to this before is again the o3b satellites, but there are a great deal fewer of them, they orbit much higher, and have much larger spot beams than these small LEO high-Ka/V-band satellites will have. 
      e) Custom indoor modem RF tech to talk to the rooftop CPE and provide a standard 100/1000 copper ethernet handoff (and possibly integrated 802.11ac dual band wifi). TDMA timeslicing per CPE and bandwidth allocation - there is no way that an individual CPE will get 1:1 dedicated bandwidth 24x7x365, the amount of capacity in an individual spot beam sized area will be oversold based on standard network architecture principles that most people don't try to max out the capacity of their circuit 24x7.
    • NorseZymurgist: Having been on the ground floor a couple IBM software products, and witnessing others, I can comment on this. Usually the intentions are very good; the innovation and idea people get excited about what they're going to do. Then they start to over-engineer. "Maybe we should add this infrastructure to make it easy to add feature XYZ in the future". "We don't like those wheels, let's invent our own kinds of wheels" etc. Next time you know the product is overly complicated and bloated. Then the next step ... some manager seeking to earn their wings (and visibility) decides "This product is too big and complex, let's create a new one that's leaner and prettier" and the cycle repeats.
    • Melinda Gates: Does saving kids’ lives lead to overpopulation? We asked ourselves the same question at first. Hans Rosling, the brilliant and inspiring public health advocate who died last year, was great at answering it. I wrote about the issue at length in our 2014 letter. But it bears repeating, because it is so counterintuitive. When more children live past the age of 5, and when mothers can decide if and when to have children, population sizes don’t go up. They go down. Parents have fewer children when they’re confident those children will survive into adulthood. Big families are in some ways an insurance policy against the tragic likelihood of losing a son or a daughter. We see this pattern throughout history. All over the world, when death rates among children go down, so do birth rates. It happened in France in the late 1700s. It happened in Germany in the late 1800s. Argentina in the 1910s, Brazil in the 1960s, Bangladesh in the 1980s.
    • sh*ttyapartment: My manager treats me like a typist creating his novel. "I want a button here, and this will go here, and etc etc". That is all I do every day. I watch as his way of trying to create a system in his head fails and costs hundreds of thousands of dollars. At some point I gave up trying to even understand what he wanted and just blindly implement whatever he says now. Before I would guide his mind through to an approach that made sense, but for about 6 months, I have let his ideas twist and turn the system into a mess because I know I am getting out of here. I remember the first day on the job, I was at one of the stand up meetings and noticed all the programmers were completely lifeless and lacking in passion about anything related to their job or technology. I was just out of university at that point and didn't understand that all of the motivation had been sucked out of them until I became that person.
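
pathorn's colocation arithmetic in the quotes above is easy to re-run. The sketch below uses only the figures from that quote ($1.591 per hour for a full c4 machine, $300 per month for a rack, 40 machines per rack). The $4,000 server price is an assumption standing in for "several thousand", and bandwidth, power, and staffing costs are ignored, so treat the result as a rough check rather than a real cost model.

```python
# Rough break-even for buying a colocated server vs. renting a c4 instance.
# All figures come from the quote except SERVER_PRICE, which is an assumed
# stand-in for "several thousand" dollars.

HOURS_PER_MONTH = 24 * 30
EC2_HOURLY = 1.591        # full c4 machine, per the quote
RACK_MONTHLY = 300.0      # rack lease price, per the quote
MACHINES_PER_RACK = 40    # per the quote
SERVER_PRICE = 4000.0     # assumption, not from the source


def ec2_monthly(hourly: float = EC2_HOURLY) -> float:
    """Monthly cost of one always-on instance."""
    return hourly * HOURS_PER_MONTH


def breakeven_months(server_price: float = SERVER_PRICE) -> float:
    """Months until the purchased server is cheaper than the rented one."""
    rack_share = RACK_MONTHLY / MACHINES_PER_RACK  # rack cost per machine
    monthly_saving = ec2_monthly() - rack_share
    return server_price / monthly_saving


if __name__ == "__main__":
    print(f"EC2 per month: ${ec2_monthly():,.2f}")
    print(f"Break-even: {breakeven_months():.1f} months")
```

With these inputs the purchase pays for itself in under four months, inside the "at most 6 months" the quote claims, but the answer moves directly with the assumed server price.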

  • We used to tell stories about fending off bears and wolves. Now it's DDoS attacks. GitHub was attacked with Memcrashed - Major amplification attacks from UDP port 11211 to the tune of 1.35Tbps via 126.9 million packets per second. Memcache is really fast. February 28th DDoS Incident Report. GitHub followed the time-honoured strategy of running to safer ground when attacked by a ferocious beast: Given the increase in inbound transit bandwidth to over 100Gbps in one of our facilities, the decision was made to move traffic to Akamai, who could help provide additional edge network capacity. At 17:26 UTC the command was initiated via our ChatOps tooling to withdraw BGP announcements over transit providers and announce AS36459 exclusively over our links to Akamai. Routes reconverged in the next few minutes and access control lists mitigated the attack at their border. Also, GitHub Survived the Biggest DDoS Attack Ever Recorded and BCP38.
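
The two headline numbers imply an average packet size, which is a quick sanity check on how fat the amplified responses were. A minimal sketch, assuming the 1.35Tbps and 126.9 million packets per second figures describe the same peak and that Tbps means 10^12 bits per second:

```python
# Average packet size implied by the reported peak bandwidth and packet rate.
# Assumes both figures were measured at the same moment.

PEAK_BITS_PER_SECOND = 1.35e12      # 1.35 Tbps, from the incident summary
PEAK_PACKETS_PER_SECOND = 126.9e6   # 126.9 million pps, from the same summary


def average_packet_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Mean bytes per packet: bits per packet divided by 8."""
    return bits_per_second / packets_per_second / 8


if __name__ == "__main__":
    size = average_packet_bytes(PEAK_BITS_PER_SECOND, PEAK_PACKETS_PER_SECOND)
    print(f"Average packet size: {size:.0f} bytes")
```

That works out to roughly 1,330 bytes per packet, which is close to a full-sized packet on a standard 1,500-byte Ethernet MTU.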

  • Peter Norvig has made his $60 book Paradigms of Artificial Intelligence Programming available for free on GitHub. Or did it put itself on GitHub? 

  • Cryptocurrencies have made it profitable to scour the world searching for every unused compute cycle. Lessons from the Cryptojacking Attack at Tesla. We'll only see more of this.
    • A few months ago, the RedLock Cloud Security Intelligence (CSI) team found hundreds of Kubernetes administration consoles accessible over the internet without any password protection. 
    • The RedLock CSI team revealed that the latest victim of cryptojacking is Tesla. While the attack was similar to the ones at Aviva and Gemalto, there were some notable differences. The hackers had infiltrated Tesla’s Kubernetes console which was not password protected. Within one Kubernetes pod, access credentials were exposed to Tesla’s AWS environment which contained an Amazon S3 (Amazon Simple Storage Service) bucket that had sensitive data such as telemetry.
    • We are beginning to witness the evolution of cryptojacking as hackers recognize the massive upside of these attacks and begin to explore new variations to evade detection.
    • A sneaky thought: are the testing labs of Nvidia, AMD, and Google mining cryptocurrency, you know, as a test? How Overheated Is the GPU Market in 2018?

  • Catch-22 is a deep pattern. Why I Quit Google to Work for Myself: Google kept telling me that it couldn’t judge my work until it saw me complete a project. Meanwhile, I couldn’t complete any projects because Google kept interrupting them midway through and assigning me new ones.

  • AWS, it's hard to quit you. Dropbox saved almost $75 million over two years by building its own tech infrastructure. Dropbox still uses AWS for less than 10 percent of its storage needs. @mathewlodge: You should read their S1. They still use AWS for compute. They just run their own storage. Oh, and their gross margin doubled when they migrated off S3.

  • In the age of aggregators, algorithms are like the Terminator: they don't care who they kill. Facebook's algorithm has wiped out a once flourishing digital publisher: But Speiser said the recent algorithm shift, which Facebook has said was designed to tamp down content that is consumed passively (and would instead emphasize posts from people's friends and family), took out roughly 75% of LittleThings' organic traffic while hammering its profit margins.

  • Good experience report from Two days of fun() at Lambda Days 2018.

  • Fascinating video of how music is produced these days. The Making Of Everything Is Recorded's "Close But Not Quite" With Richard Russell | Deconstructed. As much or more goes into the engineering as the singing and playing.

  • There are a lot of unhappy programmers/managers/people out there. Too many coders work in environments where they are treated as idiot savants. An epic thread, lots of pain, lots of blame, not a lot of solutions. It might help if programmers kept in mind managers are rarely taught how to manage. It might help if managers kept in mind programmers are rarely taught how to be managed. Being a person is hard.

  • Does terroir apply to software? Can you taste the place in software? I think you can. Instead of grasses, dirt, yeast and weather you can taste tool chains and culture. 

  • Bringing GPUs To Bear On Bog Standard Relational Databases.
    • The Brytlyt GPU database is written in C and C++ with the CUDA extensions, of course, and it uses Lua to stitch different elements of the software stack together. The database has been ported to run on X86, Power, and Arm architectures, but X86 and Power are the two main commercial efforts right now and Tesla GPU accelerators with NVLink interconnects between the GPUs is a big plus for performance, according to Heyns. Given that IBM has NVLink ports on its Power9 chips and coherency across the CPU and GPU memory – something that X86 server platforms cannot offer – the Power servers from IBM could have a leg up on X86 iron when running GPU databases.
    • In early benchmark tests done ahead of the M|18 conference, the demo was a simple query running the MyISAM engine (the default engine from way back in the day with MySQL until InnoDB became the preferred one) underneath MariaDB, and that query took 7 minutes on a server with just CPU oomph. With the Brytlyt engine underneath MariaDB, that same simple query took 100 milliseconds, which works out to a factor of 8,000X speedup.

  • Combine these two trends and it's not hard to understand why big cos rarely innovate. How Stock Buybacks Cause Economic Stagnation and  @PsychoSchmitt: 80% of CEOs surveyed said they’d pass up making an investment that would fuel a decade’s worth of innovation if it meant they’d miss a quarter of earnings results.

  • So cool. Computing With Random Pulses Promises to Simplify Circuitry and Save Power.
    • Stochastic computing begins with a counterintuitive premise—that you should first convert the numbers you need to process into long streams of random binary digits where the probability of finding a 1 in any given position equals the value you’re encoding. Although these long streams are clearly digital, they mimic a key aspect of analog numbers: A minor error somewhere in the bitstream does not significantly affect the outcome. And, best of all, performing basic arithmetic operations on these bitstreams, long though they may be, is simple and highly energy efficient.
    • While our examination of circuits for retinal implants and neural networks makes us very optimistic about the prospects for stochastic computing, we still haven’t discovered the real killer app for this approach.
    • There are limits to the precision you can achieve in practice, though. That’s because to represent an n-bit binary number, stochastic computing requires the length of the bitstream to be at least 2^n. Take the case of 8-bit numbers, of which there are 256 possible values. Suppose you wanted to represent the probability 1/256 with a bitstream. You’d need a bitstream that is at the very least 256 bits long—otherwise there wouldn’t be a place for a lone 1 in a sea of 0s.
    • Stochastic computing circuits, like many biological systems, are resilient in the face of many kinds of disturbances. If, for example, a source of environmental noise causes some of the binary digits in a bitstream to flip, the number represented by that bitstream won’t change significantly.
    • Many of the signals that computers—and our brains—process are analog. And analog has some inherent advantages: If an analog signal contains small errors, it typically won’t really matter. Nobody cares, for example, if a musical note in a recorded symphony is a smidgen louder or softer than it should actually be. Nor is anyone bothered if a bright area in an image is ever so slightly lighter than reality. 
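
To make the bitstream idea concrete, here is a minimal Python sketch. It assumes the common unipolar encoding, where a value in [0, 1] is the fraction of 1s in the stream, and it multiplies two independent streams with a bitwise AND. The function names, stream length, and seed are illustrative choices, not taken from the article.

```python
import random

def min_stream_length(n_bits):
    """Shortest stream that can hold every n-bit value: 2^n positions."""
    return 2 ** n_bits

def encode(p, length, rng):
    """Encode a value p in [0, 1] as a random bitstream of the given length."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode(bits):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)

def multiply(a_bits, b_bits):
    """Multiply two independent streams with one AND per bit position."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def flip_bits(bits, count, rng):
    """Flip `count` distinct random positions to simulate noise."""
    noisy = list(bits)
    for i in rng.sample(range(len(noisy)), count):
        noisy[i] ^= 1
    return noisy

rng = random.Random(42)   # fixed seed so runs are repeatable
length = 4096
a = encode(0.5, length, rng)
b = encode(0.25, length, rng)
product = multiply(a, b)
noisy = flip_bits(product, 8, rng)

print(min_stream_length(8))                   # 256, matching the 8-bit example
print(decode(product))                        # close to 0.5 * 0.25 = 0.125
print(abs(decode(noisy) - decode(product)))   # at most 8 / 4096
```

Flipping eight bits moves the decoded value by at most 8/4096, which is the noise tolerance the excerpt describes. The price is the stream length: precision improves only as the stream gets longer.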

  • State machines are always the dream deferred. Programmers just don't seem to like the rigour. Forde's Tenth Rule, or, "How I Learned to Stop Worrying and ❤️ the State Machine". Welcome to the (unfinished) world of Statecharts. Why Developers Never Use State Machines. After some digital archeology on my hard drive I found a state machine generator I wrote in 1999. For illustration purposes only.
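
For anyone who hasn't written one lately, a table-driven state machine can be very small. The sketch below is a generic Python illustration, not code from any of the linked articles or from the 1999 generator. The states and events are hypothetical.

```python
# Hypothetical states and events, chosen only for illustration.
TRANSITIONS = {
    ("idle", "connect"): "connecting",
    ("connecting", "success"): "connected",
    ("connecting", "failure"): "idle",
    ("connected", "disconnect"): "idle",
}

class StateMachine:
    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions

    def fire(self, event):
        """Apply an event; reject anything the table does not allow."""
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"illegal event {event!r} in state {self.state!r}")
        self.state = self.transitions[key]
        return self.state

machine = StateMachine("idle", TRANSITIONS)
for event in ["connect", "failure", "connect", "success"]:
    machine.fire(event)
print(machine.state)  # connected
```

The rigour is the point: every legal transition is written down in one table, and anything else fails loudly instead of leaving the program in a half-valid state.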

  • Should the trial period for SaaS require a credit card? Freemium and Free Trial Conversion Benchmarks says conversion rates without a credit card are much higher. In The business of SaaS, Stripe says it depends: In general, requiring a credit card upfront will, on net, increase the number of new paying customers you get (it increases the trial-to-paying-customer conversion rate by more than it decreases the number of trials started). This factor reverses as a company gets increasingly sophisticated about activating free trial users (ensuring they make meaningful use of the software), typically via better in-product experiences, lifecycle email, and customer success teams.

  • How disappointing. Bill Gates is an advocate for the tabs agenda. I’m Bill Gates, co-chair of the Bill & Melinda Gates Foundation. Ask Me Anything.

  • YouTube is the most lucrative platform for creators — with Etsy and Instagram trailing behind. $3.2B YouTube. $1.4B Etsy. $538M Instagram. $230M Amazon Publishing. $208M WordPress. $169M Tumblr. $86M Twitch. $33M eBay. At least in 2016. Curious to see with all the changes in YouTube if creators are still raking it in.

  • What business advice would you [Feargus Urquhart] give to indie developers? Get known for making a type of game. Some developers, including us, try to be a one sized fits all developer; Find, then protect, your fans; Play your game more than most of your players; Understand your costs; When talking to anyone about your game, be able to convey the concept in a few words – no more than a sentence; Video and GIFs are better than screenshots, which are better than text; Figure out which hats you can wear, and which ones you can’t; Don’t dwell or hold grudges; Love games, and love making them. 

  • The internet doesn't have security today because back in the day cryptography was classified as a munition. David Rosenthal sets the record straight in Nobody cared about security.
    • The design decisions taken in the ARPAnet days made the deployment of security easier. The main reason for today's security nightmares is quite different.
    • Making the original ARPAnet work at all was a huge achievement. It was, to a large extent, made possible because its design was based on the End-to-End Principle: it is far easier to obtain reliability beyond a certain margin by mechanisms in the end hosts of a network rather than in the intermediary nodes, especially when the latter are beyond the control of, and not accountable to, the former.
    • A principle of the Internet was that security was one of the functions assigned to network services (file transfer, e-mail, Web, etc.), not to the network on which they operated.
    • The more significant reason why the ARPAnet and early Internet lacked security was not that it wasn't needed, nor that it would have made development of the network harder, it was that implementing security either at the network or the application level would have required implementing cryptography. At the time, cryptography was classified as a munition. Software containing cryptography, or even just the hooks allowing cryptography to be added, could only be exported from the US with a specific license. Obtaining a license involved case-by-case negotiation with the State Department. In effect, had security been a feature of the ARPAnet or the early Internet, the network would have to have been US-only. 
    • Thus, for the whole of the period during which the Internet was evolving from an academic network into the world's information infrastructure it was impossible, at least for US developers, to deploy comprehensive security for legal reasons. It wasn't that people didn't care about security, it was because they cared about staying out of jail.

  • The Simple Algorithm That Ants Use to Build Bridges: To see how this unfolds, take the perspective of an ant on the march. When it comes to a gap in its path, it slows down. The rest of the colony, still barreling along at 12 centimeters per second, comes trampling over its back. At this point, two simple rules kick in. The first tells the ant that when it feels other ants walking on its back, it should freeze. “As long as someone walks over you, you stay put,” Garnier said. This same process repeats in the other ants: They step over the first ant, but — uh-oh — the gap is still there, so the next ant in line slows, gets trampled and freezes in place. In this way, the ants build a bridge long enough to span whatever gap is in front of them. The trailing ants in the colony then walk over it.
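
The freeze rule is simple enough to play with in a few lines. This is a toy one-dimensional model in Python, not the researchers' simulation: it assumes each frozen ant spans exactly one body length of the gap, and it models only the first rule quoted above. The function name and parameters are made up for illustration.

```python
def march(gap_width, colony_size):
    """Toy 1-D model: return (ants frozen in the bridge, ants that crossed)."""
    frozen = 0        # ants locked in place, each spanning one body length
    crossed = 0       # ants that made it to the far side
    slowed = False    # is an ant currently paused at the open edge of the gap?
    for _ in range(colony_size):
        if frozen == gap_width:
            crossed += 1        # gap fully spanned: walk over the bridge
        elif not slowed:
            slowed = True       # first ant to reach the gap slows down
        else:
            frozen += 1         # the paused ant is walked on, so it freezes
            if frozen == gap_width:
                slowed = False
                crossed += 1    # the trampling ant reaches the far side
            # otherwise the trampling ant becomes the new paused ant
    return frozen, crossed

print(march(3, 10))  # (3, 7)
```

No ant ever measures the gap. The bridge stops growing because ants stop arriving at an open edge.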

  • Not scary at all. China using big data to detain people before "crime" is committed: report
    • Chinese police theorists have identified specific "extremist behaviours, which include if you store a large amount of food in your home, if your child suddenly quits school and so on," she said. Train a computer to look for such conduct, and "then you have a big data program modelled upon pretty racist ideas about peaceful behaviours that are part of a Uyghur identity," she said.
    • The system is being used in Xinjiang, a region whose largely Muslim Uyghur population has been accused of committing acts of terror in China and abroad. 
    • The widespread use of political re-education is the latest attempt to root out what China calls extremism. Critics call it a racially motivated campaign directed at Uyghurs, who are being forced to pledge fealty to the Chinese state, study Mandarin Chinese and participate in cultural customs of the majority Han Chinese population.

  • Awesome deep dive into Animation in the League of Legends Client: Based on our experience with these methods of accomplishing animation with a web frontend, we think that native animations are great if you have simple cases like a single element sliding or fading away. But when multiple animations have to be composited together, a library like GSAP provides key features that make creating and editing it a whole lot easier. It’s also important to consider workflow when choosing a technique. Complex native animations often result in a jumble of brittle code. Adding a library like GSAP or using pre-rendered videos simplifies the creation and maintenance of this code greatly. The Lottie library also shows promise since it allows artists to work with browser animations without working with code. Finally, it’s important not to think of the browser’s renderer as a black box. Like any graphical engine, its abstractions hide key details that affect performance in profound ways which are unintuitive without an understanding of the internals. But behind those abstractions lies a powerful GPU-accelerated rendering pipeline capable of high graphical fidelity and performance if used properly.

  • Good trip report. A Summary of the Google Zürich Algorithms & Optimization Workshop

  • 36-Way Comparison Of Amazon EC2 / Google Compute Engine / Microsoft Azure Cloud Instances vs. Intel/AMD CPUs: Google's n1-highcpu-8 instance had come out on top followed by EC2 m5.large and then Google n1-standard-8 while Microsoft finally placed in fourth position with their Azure D4s v3 instance type. But based upon your particular workloads of interest, the positioning may be different.

  • Serverless Security: What's Left to Protect?
    • How Serverless reduces security risks: No need to manage OS patches; Short-lived servers don’t stay compromised for long; Extreme elasticity means Denial of Service resistance
    • How Serverless changes security risks: Highly granular permissions offer risk and opportunity; Stateless servers require better data security; Application security rises in prominence; Vulnerable app dependencies hidden inside
    • How Serverless increases security risks: Greater dependency on third party services; Every function expands the attack surface; Ease of deployment leads to explosion of functions

  • PHP 7.2 is now faster than Facebook's HHVM on most test runs. The Definitive PHP 5.6, 7.0, 7.1, 7.2 & HHVM Benchmarks (2018): PHP 7.2 was the fastest engine in 14 out of the 20 configurations tested above; As far as WordPress is concerned, PHP 7.2 was the fastest in all tests. 

  • See Azure put through its paces. I Wanna Go Fast: Why Searching Through 500M Pwned Passwords Is So Quick: I'm really happy with the basic philosophy of this whole thing: serve as much as possible from edge node cache, go as fast as possible across the network when the request needs to go to the origin then process that request as quickly as possible. On the one hand, that's all web performance 101 but on the other hand, it's not always that readily achievable, especially at both scale and price points. There's certainly more opportunity for improvement though, for example by distributing the data and API endpoints in a more globally-accessible fashion. Putting another Azure Function in Europe with another copy of the data in Blob Storage wouldn't be hard or expensive. Cloudflare traffic manager could help geo-steer requests and folks in places like Europe could realise performance gains on requests that go all the way to the origin. Another idea I'm toying with is to use the Cloudflare Workers John mentioned earlier to plug directly into Blob Storage. Content there can be accessed easily enough over HTTP (that's where you download the full 500M Pwned Password list from) and it could take out that Azure Function layer altogether. That's something I'll investigate further a little later on as it has the potential to bring cost down further whilst pumping up performance.

  • What Does OO Afford? IMHO OO is simply about organizing code, any other magical qualities are brought by the mind of the perceiver.

  • React Native? Developing separate apps? Making the case for Flutter and Dart. Why Flutter Uses Dart
    • As direct evidence, a large project inside of Google wanted to port their mobile app to iOS. They were about to hire some iOS programmers but instead decided to try Flutter. They monitored how long it took to get developers up to speed on Flutter. Their results showed that a programmer could learn Dart and Flutter and become productive in three weeks. This compares to the five weeks they had previously observed to get programmers up to speed on Android alone (not to mention that they would have had to hire and train developers for iOS).
    • They were able to triple their productivity by using Dart and Flutter. This should be no surprise given what they were doing before. They, like many companies, were building separate apps for each platform (web, iOS and Android) utilizing separate languages, tools, and programmers. Switching to Dart meant that they no longer had to hire three different kinds of programmers. And it was easy for them to move their existing programmers to Dart.

  • Go 2017 Survey Results: For the first time, more survey respondents say they are paid to write Go than say they write it outside work. This indicates a significant shift in Go's user base and in its acceptance by companies for professional software development...Another important shift: the #1 use of Go is now writing API/RPC services (65%, up 5% over 2016), taking over the top spot from writing CLI tools in Go (63%)...When asked about the biggest challenges to their own personal use of Go, users clearly conveyed that lack of dependency management and lack of generics were their two biggest issues...Since last year there has been an increase of the percentage of people who identified "Go lacks critical features" as the reason they don't use Go more.

  • By moving from Python to Go Uber saved a lot of resources. Code Migration in Production: Rewriting the Sharding Layer of Uber’s Schemaless Datastore.
    • To investigate the optimization gains of rewriting the Schemaless sharding layer in Go, we created an experimental worker node that implemented a frequently used endpoint with high resource consumption. The results of this rewrite showed an 85 percent reduction in latency and an even greater reduction in resource consumption, and the p99 request latency decreased by 70 percent.
    • Following our Go implementation, the Schemaless CPU utilization decreased by more than 85 percent. This efficiency gain let us cut down on the number of worker nodes used across all Schemaless instances, which, based on the same QPS as before, led to improved node utilization.
    • By re-implementing the service without changing any of Schemaless’ existing clients, we were able to implement, validate, and enable an endpoint within days instead of weeks or months with zero downtime.
    • To test our implementation at scale, we set up a Schemaless test instance where test traffic simulated production traffic. In this test instance, we moved write traffic from Schemaless’ Python implementation to Frontless and ran validation to check that the cells were written correctly. Finally, once the implementation was ready for production, we slowly migrated write traffic from Schemaless’ Python implementation onto the Frontless worker by having runtime configurations that could move a percentage of the write traffic to the new implementation in seconds.
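
The last step, moving a percentage of write traffic with a runtime configuration, is a pattern worth sketching. The Python below is not Uber's code. It assumes a router that hashes each request key into one of 100 buckets and sends buckets below a configurable threshold to the new implementation. The class name, handlers, and key format are hypothetical.

```python
import zlib

class MigrationRouter:
    """Route a configurable percentage of writes to the new implementation."""

    def __init__(self, old_handler, new_handler, percent_new=0):
        self.old_handler = old_handler
        self.new_handler = new_handler
        self.percent_new = percent_new  # adjustable at runtime, 0..100

    def set_percent(self, percent_new):
        """Change the split without redeploying either implementation."""
        if not 0 <= percent_new <= 100:
            raise ValueError("percent_new must be between 0 and 100")
        self.percent_new = percent_new

    def route(self, key, payload):
        # Hash the key so the same key always lands in the same bucket (0..99).
        bucket = zlib.crc32(key.encode("utf-8")) % 100
        if bucket < self.percent_new:
            return self.new_handler(key, payload)
        return self.old_handler(key, payload)

# Stand-in handlers that only report which implementation served the write.
router = MigrationRouter(
    old_handler=lambda key, payload: "python",
    new_handler=lambda key, payload: "frontless",
)
router.set_percent(10)
served = [router.route(f"cell-{i}", b"data") for i in range(1000)]
print(served.count("frontless"))  # roughly a tenth of the writes
```

Hashing on the key rather than rolling a die per request means a given key never flips back as the percentage is raised, which makes validation and rollback easier to reason about.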

  • Circuit keeps up to 5.5-meter accuracy after 3 kilometers. Tracking dead reckoning electronically is one of those things that's always seemed like it should work, but hasn't, until now. Accurate Navigation Without GPS: During each step, the heel is anchored to the ground for about 100 milliseconds. Guo figured out how to measure this instant of stillness, and use that to correct for the false motion in drifting data from the IMU. “You reset the position calculation with every step, so you do not accumulate error,” says Young. Guo designed a flexible MEMS pressure sensor to place under the insole of a boot with an IMU. He calculated that the system needed about 1000 sensors to get accurate readings (and provide redundancy in case some sensors broke underfoot), and built a custom circuit to combine data from the IMU and the pressure sensors, and designed the necessary algorithms.
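
Resetting on every step is commonly called a zero-velocity update. The Python sketch below is a one-dimensional toy, not Guo's algorithm: it assumes a constant accelerometer bias, a made-up gait (0.4 s swing, 0.1 s stance), and that stance is already known from the sample index rather than detected by pressure sensors. All numbers are illustrative.

```python
def simulate(steps=50, dt=0.01, bias=0.05, use_zupt=True):
    """Integrate biased 1-D accelerometer data over a series of steps.

    Returns (true_position, estimated_position) in meters.
    """
    swing_samples, stance_samples = 40, 10   # 0.4 s swing, 0.1 s stance
    accel = 5.0                              # m/s^2 during swing (made up)
    true_v = true_x = est_v = est_x = 0.0
    for _ in range(steps):
        for i in range(swing_samples + stance_samples):
            if i < swing_samples // 2:
                a = accel            # foot speeds up
            elif i < swing_samples:
                a = -accel           # foot slows back to rest
            else:
                a = 0.0              # heel planted
            stance = i >= swing_samples
            true_v += a * dt
            true_x += true_v * dt
            est_v += (a + bias) * dt  # the sensor reads acceleration plus bias
            if use_zupt and stance:
                est_v = 0.0           # foot is known to be still: discard drift
            est_x += est_v * dt
    return true_x, est_x

true_x, drifting = simulate(use_zupt=False)
_, corrected = simulate(use_zupt=True)
print(abs(drifting - true_x))   # meters of drift after 25 s with no correction
print(abs(corrected - true_x))  # far smaller with the per-step reset
```

In this toy the reset clears only the velocity error, so position error still creeps up a little with each step. It no longer grows with the square of elapsed time, which is what makes uncorrected inertial navigation drift so badly.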

  • Interested in building and deploying a sensor out into the world? Here's how it's done. MicroFlo and IoT: measuring air quality: MicroFlo is a flow-based programming runtime targeting microcontrollers. Just like NoFlo graphs run inside a browser or Node.js, the MicroFlo graphs run on an Arduino or other compatible device. The result of a MicroFlo build is a firmware that can be flashed on a microcontroller, and which can be live-programmed using tools like Flowhub. ESP8266 is an Arduino-compatible microcontroller with integrated WiFi chip. This means any sensors or actuators on the device can easily connect to other systems, like we do with lots of different sensors already at c-base.

  • Good explanation of The Paxos Algorithm

  • Benchmarking Google’s new TPUv2: On ResNet-50, a single Cloud TPU (containing 4 TPUv2 chips and 64GB of RAM) is ~7.3 times faster than a single P100 and ~2.8 times faster than a V100. For InceptionV3, the speedup is almost the same (~7.6 and ~2.5, respectively). With higher precision (fp32), the V100 loses a lot of speed...On the models we tested, TPUs compare very well, both, performance-wise and economically, to the latest generations of GPUs. This stands in contrast to previous reports. Overall, the experience of using TPUs and adapting TensorFlow code is already pretty good for a beta.

  • XMPP is the chat of choice for games. League of Legends; Fortnite; now Eve Online Chat is Moving to ejabberd. Why? Mashimo: As far as i know a modern more modular design was the main reason. And thus offload some of the CPU utilisation to not-game server.

  • This non-academic approach might work better for many. How to think in graphs: An illustrative introduction to Graph Theory and its applications

  • Fun debugging adventure story. Following the Cookie Crumbs: Investigating a Performance Anomaly at Qumulo: We pulled up Qumulo’s socket code and quickly saw that when listening for connections, we always used a backlog of size 5. During cluster initialization we were creating a connected mesh network between all of the machines, so of course we had more than 5 connections created at once for any cluster of sufficient size. We were SYN flooding our own cluster from the inside! We quickly made a change to increase the backlog size that Qumulo used and all of the bad performance results disappeared: case closed!
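
The backlog in question is the argument to listen(). The Python sketch below is not Qumulo's code: it shows where that number goes and a hypothetical sizing rule for a full mesh, where every other node may connect at about the same time. The helper names are made up, and the operating system may cap the value you ask for (on Linux, via net.core.somaxconn).

```python
import socket

def backlog_for_mesh(node_count, floor=5):
    """Hypothetical sizing rule: allow for every other node connecting at once."""
    return max(floor, node_count - 1)

def make_listener(backlog, host="127.0.0.1"):
    """Create a TCP listening socket with the given accept backlog."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, 0))     # port 0 lets the OS pick a free port
    sock.listen(backlog)     # queue size for connections not yet accepted
    return sock

# A 64-node cluster asks for room for 63 simultaneous connection attempts.
with make_listener(backlog_for_mesh(64)) as listener:
    print(listener.getsockname()[1])  # the port the OS assigned
```

With a backlog of 5, a burst of 63 peers connecting at once can overflow the queue, and the clients then wait on retransmitted SYNs. That fits the self-inflicted SYN flood described in the story.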

  • You can run your own cell tower with open source software. Open BTS and An Interview with Derek Kozel

  • I was totally duped by this. Sci-Fi Short Film "Strange Beasts". Then the poignancy hits. 

  • Yep, this is pretty much how I expect the future will work. Your worst artificial intelligence nightmares were brought to life on 'The X-Files'. An AI will be given a simple command without imposed constraints and it ends up recruiting all of space and time to achieve its once modest goals.

  • 2016: The Honey Bee Algorithm: In the web hosting problem, the servers are like the nectar foragers, while the dynamic community of clients asking to use the servers – anyone using the internet – is like the changing landscape of flower patches.  The clients pay in money, while the flowers pay in nectar. Shared web hosting servers operate by switching from one application to the next based on demand for any given application. Each server can run only one application at a time (for security reasons), so switching applications – like a honey bee switching flower patches – incurs a time, and therefore revenue penalty as the server wipes itself clean and loads a new application. The best server allocation algorithm would need to respond to a highly dynamic environment and return the maximum total revenue – just like in a honey bee colony...In a test against a then state-of-the-art algorithm and two other potential methods, the researchers showed that their honey bee algorithm beat the competition. In fact, when they compared the honey bee algorithm to a totally unrealistic, omniscient algorithm that knew where future web traffic would be in advance, they found that under highly variable conditions – which are realistic for many situations on the internet – the honey bee algorithm came out ahead.

Reader Comments (4)

Your posts are becoming a bit more sprawling over time, which is making me consider whether I need to unsub. It's not that the items aren't interesting, it's that things aren't selective, so it's feeling overwhelming.

March 3, 2018 | Unregistered Commenter Scott Hess

Can't say I disagree. Every week I think there won't be enough and there's always too much!

What would you like to see less of Scott? And more of?


March 4, 2018 | Registered Commenter Todd Hoff

I'd say a bit of consolidation - in several places there's links to the same Reddit or HN thread, but different posts.
Also, maybe fewer Twitter one-liners. Threads and links and things that are more in-depth

March 6, 2018 | Unregistered Commenter Defenestrator

I'm liking the Twitter one-liners, many interesting anecdotes!

I've noticed that lately you've maybe spent a bit less time in editing per story/bullet (though it's understandable because reading the whole internet + writing about it must take an enormous time), but I wouldn't change a thing if it meant you writing less.

I'm always happy to realize it's Friday and that I'll get to read a new post in this series. You're providing a great and valuable service! I've found out about many great projects from this series, such as Traefik, CockroachDB, etc.

March 8, 2018 | Unregistered Commenter Joonas
