
Stuff The Internet Says On Scalability For February 16th, 2018

High Scalability

16 Feb 2018 — 31 min read

Hey, it's HighScalability time:

Snow? Last march of the faeries? Nope. It's 1,218 Shooting Star drones forming the Olympic symbol. *chills*

If you like this sort of Stuff then please support me on Patreon. And I'd appreciate if you would recommend my new book—Explain the Cloud Like I'm 10—to anyone who needs to understand the cloud (who doesn't?). I think they'll learn a lot, even if they're already familiar with the basics.

63.2%: Americans with one and only one cable provider; $1.5 billion: spent on chip startups last year; $7.5 billion: Uber sales; $4.5 billion: Uber loss; 180 TFLOPS: computation accessible via the TensorFlow programming model from a Google Cloud VM; 10x: more computational capability in the human brain than previously thought; 10 million: went live on Facebook sharing 47% more Live videos than the previous year; 1.7 million: HQ players during Super Bowl; 1/400: power to perform public-key encryption; 8 bit: custom CPU built from scratch; $8,500: daily take from mining Monero with your botnet; 10,000: datasets shared on Kaggle; 41%: NVIDIA revenue growth; 14x: real world 4G LTE vs. 5G bandwidth; 2: SpaceX demonstration satellites ready to launch;

Quotable Quotes:
- Packet Pushers~ The only really good protocols are in people’s minds.
- Georgia Dow~ I have an easier time getting people off of smoking and drinking than I do technology.
- Natalie Cheung: In order to create a real and lifelike version of the snowboarder with more than 1,200 drones, our animation team used a photo of a real snowboarder in action to get the perfect outline and shape in the sky.
- Certhas: tl;dr: "With the Ryzen 5 2400G, AMD has completely shut down the sub-$100 graphics card market. As a choice for gamers on a budget, those building systems in the region of $500, it becomes the processor to pick."
- AnalogOfDwarves: Better rule of thumb: Minimize the amount of covariant code. If you're repeating yourself, or thinking about doing it, ask yourself: "If I later change this in one place, will I want to change this in the other places?" And more importantly, "Are there changes I might make to this in one place that I'll want to avoid making in the other places?" If the answers to these questions are "yes" and "no", respectively, then you refactor to a common unit.
- @Carnage4Life: Azure usage continues to explode with 35% year-over-year growth compared to AWS with 15%. Microsoft as the scrappy fast follower is a look no one expected.
- Scott Aaronson: Can we program a computer to find a 10,000-bit string that encodes more actionable wisdom than any human has ever expressed?
- @tommorris: Bitcoin advocates: "We need cryptocurrencies for the farmers in sub-Saharan Africa with only a crappy Android phone and no bank account." Also Bitcoin advocates: "Oh, you lost your money? You should have stored your private key on an airgapped burner laptop."
- J. M. Korhonen: Feel free to call me a luddite or whatever. It’s just that I’ve been studying the possibilities of blockchains for business for over a year now, and while it is certainly possible that I simply lack the imagination (or chutzpah) necessary for bold proclamations, I just don’t see the possibilities the marketers seem to see.
- George Church: All these things come together in a time of exponential change. It’s not necessarily some panacea that’s full of abundance and you don’t have to think and it’s easy, but there are some win-wins to be had if we think about it deeply and we talk about it as if science was a real thing rather than something that’s inconvenient.
- russellbeattie: For what it's worth, Windows Phone was actually an amazing platform for both users and developers, and shows a fundamental rule of technology: There Is No Third Ecosystem. The most dominant hardware maker (at the time) and software/os maker teamed up with a really great product, but couldn't break the established smartphone duopoly, even though it was only a few years old by that point. I wasn't a Microsoft fan by any stretch (the opposite actually), but even I agreed with the decision at the time, especially after using Windows Phone. First mover advantage is huge, and developers only have so much bandwidth.
- Picasso: To know what you’re going to draw, you have to begin drawing.
- David A. Patterson: The ending of Moore’s law and Dennard scaling means that new innovations are needed in instruction set architectures … I think we are entering another renaissance in computer architecture.
- jules: A craftsman takes responsibility for the tools they use.
- Paul Dirac: I think there is a moral to this story, namely that it is more important to have beauty in one's equations than to have them fit experiment. If Schrodinger had been more confident of his work, he could have published it some months earlier, and he could have published a more accurate equation. That equation is now known as the Klein-Gordon equation, although it was really discovered by Schrodinger, and in fact was discovered by Schrodinger before he discovered his nonrelativistic treatment of the hydrogen atom. It seems that if one is working from the point of view of getting beauty in one's equations, and if one has really a sound insight, one is on a sure line of progress. If there is not complete agreement between the results of one's work and experiment, one should not allow oneself to be too discouraged, because the discrepancy may well be due to minor features that are not properly taken into account and that will get cleared up with further developments of the theory.
- Anil Nanduri: What you have is a complete three-dimensional viewing space, so you can create lots of interesting effects and transformations when you use that full capability. It's always easy to fly more drones for an animation and increase the perspective.
- @DinaPomeranz: Most of the increase in the global population is currently happening due to growing life expectancy, rather than growing number of children
- tuptain: We're starting to figure out how we want to design around this [Lambda Cold Starts] in our shop since we're transitioning to serverless atm. We've been considering just using a CloudWatch event to warm the main API entry point Lambda occasionally. You don't get charged for it being warmed, just for compute time.
- hinkley: I am coming to a very, very sad realization that teams that 'need a rewrite' probably don't deserve them. The desire for a do-over is a little childish to begin with, but the fact that you can't find a route from A to B means you lack a useful combination of imagination and discipline. From my personal experiences and those of my peers, I don't think you can trick people into discipline by rewriting the application and then letting them in after you've "fixed everything".
- hacksoncode: They forgot the best use of RAII: lock management. It's fantastically easy to create code paths that result in deadlocks without it.
- Geoff Tate: A huge [challenge facing the chip industry] is the growing cost and complexity of new designs. That’s what’s driving a lot of industry consolidation. Just as people had to band together because they couldn’t afford their own fabs, now they have to band together because designing a chip is getting so expensive. To design a chip you need a big team of people, so even a company that’s pretty good size has to centralize these kinds of things. The economics of scale are working toward concentration and consolidation, where everyone knows how to build switches and that the next switch chip has to be twice or four times as fast. On the flipside, we’re also seeing a bunch of chip and system companies springing up to address new stuff, such as AI and LiDAR.
- @bascule: BitGrail lost $170 million worth of Nano XRB tokens because... the checks for whether you had a sufficient balance to withdraw were only implemented as client-side JavaScript
- boulos: We fundamentally want Google Cloud to be the best place to do computing. That includes AI/ML and so you’ll see us both invest in our own hardware, as well as provide the latest CPUs, GPUs, and so on. Don’t take this announcement as “Google is going to start excluding GPUs”, but rather that we’re adding an option that we’ve found internally to be an excellent balance of time-to-trained-model and cost. We’re still happily buying GPUs to offer to our Cloud customers, and as I said elsewhere the V100 is a great chip. All of this competition in hardware is great for folks who want to see ML progress in the years to come.
- Smári McCarthy: We [Iceland] are spending tens or maybe hundreds of megawatts on producing something [bitcoin] that has no tangible existence and no real use for humans outside the realm of financial speculation. That can't be good.
- Chris Lee: LISA pathfinder three times better than required, 10 times better than expected.
- Ramez Naam: The world sucks in a lot of ways. But as a percentage of humanity, those are all at record lows. And those numbers have mostly dropped by a factor of two or three since the 1970s. Statistically this is the best time to be born human ever in the hundred-thousand year history of humanity.
- sjellis: Yeah, Fargate may be an inflection point: a lot of discussion of Docker ignores the fact that you need orchestration to make it work in prod, and orchestration is overhead.
- JohnCarter~ More crazy flying machines: CanberraUAV and ArduPilot. Most fascinating talk of Linuxconf2018 that I have seen so far… "The highly redundant network system we used in CanberraUAV was very complex, but proved its worth on the day after our relay aircraft was destroyed."
- jefe78: As a systems engineer, I struggle with this virtually every day. We're called 'DevOps' by most and anytime we encounter a new problem, everyone invariably screams for containers. Containers aren't a magic bullet. My favourite example is when our AWS TAMs offer a solution, knowing we have ZERO pipeline/infrastructure setup for supporting containers. They always push containers. We don't use containers, stop forcing them down our throat. We've tried, we've been burned, VMs work for us. Stop! When did containers become perceived as the end-all solution? I see their value and uses but they don't meet ours so why have we started ignoring the right solution for the job? I see this everywhere I go.
- @cloud_opinion: One person's opinion, but AWS customer support has become bad lately. Maybe all the aggressive hiring and the company culture did not get communicated. This is how companies become bad. Hope it's just an isolated experience.
- walrus01: I'm a network engineer with a lot of two way satellite experience. You are not only wrong but you are so grossly wrong that you should be ashamed of yourself and delete your post. Please go read some basic books on cryptography (start with Schneier) before spouting absolute nonsense. Satellite is no more or less trustworthy than licensed point to point microwave, using wifi in an urban environment, or sharing TDMA timesliced bandwidth on a DOCSIS3.0/DOCSIS3.1 coax network segment. All of which are media that can potentially be intercepted and rely on properly implemented crypto.
- thisisit: In my opinion the question is not about the service but whether it is a viable and sustainable business. Sure Uber is a better experience than taxis which are/were a nightmare. But most of the growth seems to be coming from underpriced rides. When the price normalisation happens Uber might not be a good service anymore.
- Robotbeat: That's just the initial system. And there will be, say, 30 satellites in view at any one time, giving an aggregate of 600 Gbps. At a typical 100:1 over-subscription ratio, they should be capable of serving almost a million customers per region (~500km diameter) at that speed. At more reasonable 10 or 100Mbps, it's on the order of tens of millions of customers in a region. But that's just the initial constellation. SpaceX plans to put 12,000 total satellites up, with the VLEO ones having much higher throughput. Idea is to replace them every 4-6 years with faster throughput.
- Robotbeat: The [SpaceX] network is not intended for mobile use. The frequency is too high, thus it doesn't penetrate buildings very well. The idea is to put a lunch box or pizza box sized phased array antenna on your roof. As long as you have a fairly clear view of the sky, you should be able to get good reception with a moderately sized antenna. It probably would work for vehicles, however.
- David Mack: I’ve been pretty happy with our choices: Amazon Web Services, Elastic Beanstalk, Firebase, AngularJS, Coffeescript, Kafka, Simple Queue System, SocketStream, Docker, SemaphoreCI, MySQL. Of the list, AngularJS and MySQL have been the only ones to give us scaling problems. Our monolithic AngularJS code-bundle has got too big and the initial download takes quite a while and the application is a bit too slow. MySQL (in RDS) crashes and restarts due to growing BI query complexity and it’s been hard to fix this. I appreciate now that technologies have a surprisingly short lifespan.
- sreque: I would recommend every college student learn C, and learn it from the perspective of an "abstract machine" language, how your C code will interact with the OS and underlying hardware, and what assembly a compiler may generate. I would consider learning C for pedagogical purposes to be much more important than C++.
- pmlnr: [re: GMAIL AMP] I genuinely fear the future of email. Email is still the only piece of communication you can own, from top to bottom, from running the service to owning the domain[^1] it runs on. You were able to send anyone email, regardless of rank, location, social status. Google (and recently, Outlook) is taking all of it away. It's putting mail from people not on your contact list in spam[^3]; it's by default blacklisting IPs within certain range[^2]; now it's bringing its own format as well. Is this embrace, extend, extinguish, Google style?
- topbitcoin: “There was a bug on Bitgrail where if you placed two orders you got double balance added to your account. You could then withdraw while the orders were up and steal the coins. You had negative balance in the end but you could just make a new account.”
- Jakob Gruber: TL;DR: Lazy deserialization was recently enabled by default in V8 version 6.4, reducing V8’s memory consumption by over 500 KB per browser tab on average. On average, V8’s heap size decreased by 540 KB, with 25% of the tested sites saving more than 620 KB, 50% saving more than 540 KB, and 75% saving more than 420 KB.
- shevegen: Well - times have changed. With Google taking away from what Microsoft used to have as a foothold, and MS using that to keep competitors away (embrace and extend), it did not make as much sense for Microsoft to keep on doing how they used to operate in the past. That has already been a change in strategy several years ago. I'd sorta call it the Steve Ballmer era being over.
- Alex Punnen: As of the time of writing this, relations between the Apache Foundation and Datastax, which was one of the largest contributors to Cassandra, have soured. There is a commercial version of Cassandra, Datastax Enterprise Edition, and the open source version is Apache Cassandra. The Java driver for Cassandra has two versions, the open source one and the DSE-provided one, and you cannot use the commercial driver with open source Cassandra. For other languages like Go the driver is open source.
- Steve Jobs: Everything in this world... was created by people no smarter than you.
- Carlos E. Perez: Experimental evidence reveals a new reality, even at the smallest unit of our cognition, there is a kind of conversational cognition that is going on between individual neurons that modifies each other’s behavior. Thus, not only are neurons machines with state, but neurons are also machines with an instruction set and a way to send code to each other. I’m sorry, but this is just another level of complexity.
- Dijkstra: A recent CS graduate got her first job, started in earnest on a Monday morning and was given her first programming assignment. She took pencil and paper and started to analyse the problem, thereby horrifying her manager 1.5 hours later because “she was not programming yet”. She told him she had been taught to think first. Grudgingly the manager gave her thinking permission for two days, warning her that on Wednesday she would have to work at her keyboard “like the others”! I am not making this up. And also the programming manager has found the euphemism with which to lend an air of respectability to what he does: “software engineering”.
- Mark LaPedus: the industry is pinpointing and narrowing down the transistor options for the next major nodes after 3nm. Those two nodes, called 2.5nm and 1.5nm, are slated to appear in 2027 and 2030
- baybal2: My prognosis as bit of an insider popping in and out of Shenzhen. All guns are pointed at _memory_. Memory is an uncompetitive industry, a cash cow unseen in history, comparable only to oil. The SEL empire is built not on top of galaxy notes, but on a pile of memory chips. The easiest way to get an order of magnitude improvement right away is to put more memory on die and closer to execution units and eliminate the I/O bottleneck, but no mem co. will sell you the memory secret sauce. Not only that memory is made on proprietary equipment, but decades of research were made entirely behind closed doors of Hynix/SEL/Micron triopoly hydra, unlike in the wider semi community where even Intel's process gets leaked out a bit in their research papers.
- Walid S. Saba: So what has happened? How could we have leading labs (both in industry and academia) graduating or nurturing so-called experts in language processing, ‘experts’ that are indifferent to a couple of centuries of fundamental work by some of the most penetrating minds in logic, semantics, and formal languages? It seems now that one of the most difficult problems in computing science (i.e., NLU) is thought of as a ‘data’ problem, and thus it is a problem that can be easily tackled by pulling some machine learning library, downloading lots of data, training your ‘deep’ network on that ‘big’ data, and voilà, you are another step closer to passing the Turing Test (or better yet, to passing the Winograd Schema Challenge!). This is harmful to the field.
- Saumil Mehta: Just like every starry-eyed 22 year old learns to go from black-and-white to technicolor by the age of 35, I’ve reconciled myself to the fact that the militancy of my youth was in fact naive. I’ve reconciled myself to the fact that I may in fact work at other large companies in the future and that that need not be a matter of severe identity crisis. I still miss the sheer velocity, the occasional gut-wrench, the gallows humor and the foxhole feel of early stage startups.
- Richard Garwin: Livermore, of course, had to do things differently. Teller was critical of the Los Alamos history. None of their tests ever failed. Everybody knows you’re not taking enough risk, you’re not taking big enough steps, if your tests never failed. Livermore’s tests failed frequently. In fact, at the working level, Los Alamos and Livermore got along very well, although there were lots of competition in budget and whatnot. I’ve read just recently that people at Livermore said, “Well, you know, we really didn’t know what we were doing. We had to set up some kind of pulse X-ray system for X-raying our mock bombs as we were testing how they would work.
- quadcore: The way lockstep works is that the clients give themselves a rendez-vous in the future and agree to compute one turn of the game, given the players' input of that turn, at that moment. It's very clever. The players' inputs are sent to every client and will be computed in the future in a deterministic way. So every client is computing the exact same game. Now if the clients don't compute the same game given the same players' input, the game is OOS [out of sync] and that basically should never happen because the game is dead. To detect an OOS, a client needs to compute a hash of some relevant game data (ultimately "everything" ends up being represented by the position of the units (plus a dead state)), every turn and send it with its "end of turn" message to the server/clients. If clients disagree on the value of the hash, they are OOS.
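
The lockstep description in quadcore's quote is easy to sketch in code. What follows is a minimal illustration and not anyone's actual game engine: the `Unit` and `Client` classes, the `(unit_index, dx, dy)` input format, and the choice of SHA-256 are all assumptions made for the example. Positions are integers so that every client computes bit-identical state.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass
class Unit:
    x: int
    y: int
    dead: bool = False


@dataclass
class Client:
    units: list
    turn: int = 0

    def apply_turn(self, inputs):
        # inputs is a list of (unit_index, dx, dy) tuples (hypothetical format).
        # Sorting gives every client the same application order no matter
        # what order the network delivered the inputs in.
        for idx, dx, dy in sorted(inputs):
            unit = self.units[idx]
            if not unit.dead:
                unit.x += dx
                unit.y += dy
        self.turn += 1

    def state_hash(self):
        # Hash the turn number plus each unit's position and dead flag.
        h = hashlib.sha256()
        h.update(struct.pack("<q", self.turn))
        for u in self.units:
            h.update(struct.pack("<qq?", u.x, u.y, u.dead))
        return h.hexdigest()


def is_out_of_sync(hashes):
    # Clients are OOS as soon as any two end-of-turn hashes differ.
    return len(set(hashes)) > 1


if __name__ == "__main__":
    a = Client([Unit(0, 0), Unit(5, 5)])
    b = Client([Unit(0, 0), Unit(5, 5)])
    a.apply_turn([(1, -1, 0), (0, 2, 3)])
    b.apply_turn([(0, 2, 3), (1, -1, 0)])  # same inputs, different arrival order
    print(is_out_of_sync([a.state_hash(), b.state_hash()]))  # False
    b.units[0].x += 1  # simulate a divergence on one client
    print(is_out_of_sync([a.state_hash(), b.state_hash()]))  # True
```

Each client would send its `state_hash()` along with the "end of turn" message. The hash only says that clients diverged, not where, so real games tend to add extra debugging state on top of it.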

Winter Olympics 2018: Inside the Opening Ceremonies Drone Show. I would have thought they would use flocking type logic to control the drones. Also curious is how big events today are assumed to be synthetic and asynchronous. You can only get the full viewing experience by watching a streaming service, not attending in person, the complete reversal of how reality used to work.
- Like the Super Bowl, the opening ceremony production you'll see on your TV—or streaming device—was prerecorded. That's less of a cheat than an insurance policy; tiny drones can only handle so much abuse, and Pyeongchang is a cold and windy city
- Bringing 1,218 of those drones into harmony doesn't present much more of a logistical challenge than 300, thanks to how the Shooting Star platform works. After animators draw up the show using 3-D design software, each individual drone gets assigned to act as a kind of aerial pixel, filling in the 3-D image against the night sky.
- With the animation in place, each drone operates independently, communicating with a central computer rather than any of the drones around it. Just before takeoff, that computer also decides which drone plays what role, based on the battery levels and GPS strength of each member of the fleet. The drones can typically fly for a little under 20 minutes, given the limitations of current lithium-ion battery

Here's a job title that does not suck: Natalie Cheung, Intel's general manager of drone light shows.

Eliminating competition so you can charge higher prices without the pesky need to upgrade your physical plant? Shocked, shocked I say. FCC report finds almost no broadband competition at 100Mbps speeds: Forty-one percent of developed Census blocks had one ISP offering such speeds, for a total of 85 percent with zero or one ISP...63.2 percent of developed Census blocks had one cable provider, but only 3.8 percent had two and 0.3 percent had three...DSL and satellite were in a higher percentage of developed Census blocks than cable or fiber. Also, SpaceX set to launch first prototype Starlink satellites for global internet.

Crime continues to pay. Cryptocurrency Mining Malware Infected Over Half-Million PCs Using NSA Exploit. Expect to see more of the same.
- Active since at least May 2017, Smominru botnet has already infected more than 526,000 Windows computers, most of which are believed to be servers running unpatched versions of Windows, according to the researchers.
- The botnet operators have already mined approximately 8,900 Monero, valued at up to $3.6 million, at the rate of roughly 24 Monero per day ($8,500) by stealing computing resources of millions of systems.

Reliability rarely moves the needle. Every software developer knows what the score is. How Apple Plans to Root Out Bugs, Revamp iPhone Software: Software chief Craig Federighi laid out the new strategy to his army of engineers last month, according to a person familiar with the discussion. His team will have more time to work on new features and focus on under-the-hood refinements without being tied to a list of new features annually simply so the company can tout a massive year-over-year leap, people familiar with the situation say. The renewed focus on quality is designed to make sure the company can fulfill promises made each summer at the annual developers conference and that new features work reliably and as advertised.
- arjoura: As someone who used to work on iOS at Apple, what that company honestly needs is a culture not beholden to the whims of their EPMs (project managers). They used to help organize and work with engineering to schedule things across the company’s waterfall style development. However, by the time I left, they essentially took power over engineering. Radar became the driver for the entire company and instead of thinking about a holistic product, everything became a priority number. P0 meant, emergency fix immediately, P4 meant nice to have. You get the idea. Nothing could be worked on if it wasn’t in Radar with a priority number attached and signed off by the teams’ EPM. No room for a side project or time away from your daily duties because there were always P1s to fix. If you didn’t personally have any left for the day, you’d take one from another engineer who was likely swamped with their own list of P1s.
- masklinn: It's not clear because there have been absolute sh*t-show releases going back to the early days of OSX. Even the fondest-remembered releases were only so after a ton of polish, and that was with releases slipping significantly (you might see 3 years pass between major updates, and the new version would still be completely unstable).
- exBarrelSpoiler: QA being divided between Craig and Kim didn't help, either. Nor the organizational wars.
- @stevesi: 28/ No one ever anywhere has delivered a general purpose piece of S/W+H/W at scale of 1B delivering such a broad, robust, consistent experience. We don’t have a measure for what it means to be “high quality”. I can say that in any absolute sense, Apple has exceeded everyone else.

How much to store 7,418TB? AWS S3 vs On-Premises. On Prem: $204.98/TB (first year), then $2.77/TB annually after the first year, plus operational labor. S3: $176,110.98 (monthly) * 12 = $2,113,331.76 (annual) / 7,418TB = $284.89/TB (annual) PLUS costs not reported. Conclusion: If you are a startup and only using a small amount of storage then you can easily cost justify using a public cloud provider. However, if you are an enterprise and own datacenters then it’s far better from a cost standpoint to build your own S3 on-premises and maintain control.
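
The arithmetic is simple enough to check yourself. This sketch uses only the figures quoted above; the function names are made up for illustration, and the on-prem numbers leave out operational labor, just as the S3 numbers leave out the "costs not reported."

```python
STORED_TB = 7418
S3_MONTHLY_USD = 176_110.98
ON_PREM_FIRST_YEAR_PER_TB = 204.98
ON_PREM_LATER_YEAR_PER_TB = 2.77


def s3_annual_per_tb(monthly_usd=S3_MONTHLY_USD, stored_tb=STORED_TB):
    # Monthly bill times twelve, spread across the stored terabytes.
    return monthly_usd * 12 / stored_tb


def s3_cumulative_per_tb(years):
    # Assumes a flat bill every year, which real usage will not match.
    return s3_annual_per_tb() * years


def on_prem_cumulative_per_tb(years):
    # First year carries the build-out cost; later years only the annual cost.
    return ON_PREM_FIRST_YEAR_PER_TB + ON_PREM_LATER_YEAR_PER_TB * (years - 1)


if __name__ == "__main__":
    print(f"S3 per TB per year: ${s3_annual_per_tb():.2f}")  # $284.89
    for years in (1, 3, 5):
        print(
            f"{years} yr  on-prem ${on_prem_cumulative_per_tb(years):.2f}/TB"
            f"  S3 ${s3_cumulative_per_tb(years):.2f}/TB"
        )
```

On these numbers the gap widens every year, because the on-prem cost is almost entirely front-loaded. How much of that gap survives once labor is added back in is the open question.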

Fun history tour through how the networking sausage really was made. Show 376: How Did MPLS Get Its Start? Greg Ferro wants a better networking world. If we could start over and rewrite new protocols from scratch, this could be the best of all possible networking worlds, couldn't it? Nope, say the people who were there. Technical debt is baked in from the first byte. A protocol must satisfy the makers of the protocol, that is, they must be able to make money from it, so it can't be too much like a competitor's or be too innovative. A protocol must also satisfy customers. It can't be too new or it will scare users off. It can't cause current equipment and practices to be tossed. Customers want what they already understand. The more things change the more they stay the same.
- Tunnels were not invented 6 months ago. Tunnels were invented long ago. What's changed is the control and maintenance of tunnels.
- The fundamental discovery of MPLS was tunnels and efficient tunneling by adding just one more label. The other genius advance of MPLS was not using a fixed-length header but having a label stack, so you could have as many different control protocols as you like working independently of each other. Stacking labels together lets the right thing happen throughout the network.
- Before we had the public internet we have today there was UUNET. MPLS fit in a world where routing was on IP addresses but forwarding was on a tag. They wanted to get rid of L2 switches. They also wanted to solve the problem of an overloaded link by sending some of its traffic over different links. Shortest path routing sends everything over the same link regardless of bandwidth utilization.
- The other part is how you control tunnels. We'll be iterating on how to create and control tunnels forever. Tunnels and overlays are going to be with us for quite some time.
- There's a need for simplification and optimization. Some of the solutions coming out today are variation on a theme of what went before.
- SD-WAN is L3 VPN with a controller instead of an inline embedded control plane. It's viewed as being easier because it's over the top and some of the control functions are orthogonal to the service provider.
- MPLS people are now the old Bell-heads. What changes is the protocol you consider the sucky old thing.
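
To make the label stack idea concrete, here is a small sketch of how stack entries are laid out on the wire. The 32-bit layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL) follows RFC 3032; the function names and the example label values are invented for illustration.

```python
import struct


def encode_entry(label, tc=0, bottom=False, ttl=64):
    # One 32-bit label stack entry: label(20) | tc(3) | bottom-of-stack(1) | ttl(8).
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("traffic class must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def encode_stack(labels, ttl=64):
    # labels[0] is the top (outermost) label; only the last entry sets the S bit.
    last = len(labels) - 1
    return b"".join(
        encode_entry(label, bottom=(i == last), ttl=ttl)
        for i, label in enumerate(labels)
    )


def decode_stack(data):
    # Read entries until the bottom-of-stack bit is seen.
    labels = []
    for offset in range(0, len(data), 4):
        (word,) = struct.unpack_from("!I", data, offset)
        labels.append(word >> 12)
        if word & 0x100:
            break
    return labels


if __name__ == "__main__":
    vpn_stack = [2001]                # hypothetical service label
    tunneled = [1001] + vpn_stack     # push one more label to enter a tunnel
    wire = encode_stack(tunneled)
    print(len(wire), decode_stack(wire))  # 8 [1001, 2001]
```

Entering a tunnel costs four more bytes and leaves everything beneath the top label untouched, which is why different control protocols can each own a layer of the stack without knowing about the others.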

Reflecting on Wayfair’s Conversion to React and Redux. Technology is not the only benefit of moving to a new stack: Adopting React quickly opened doors in our hiring pipeline, allowing us to recruit great engineers. It also brought cohesiveness and expertise to our codebase by shortening the time to test and deploy new features going forward.

California's outlawing of non-compete agreements is like horizontal gene transfer for companies. Inspired by Exponent 141.

Hardware is part of many projects these days. Here's some useful advice. Common Connected Hardware Blunders.
- There are primarily two different kinds of product prototypes. A stakeholder prototype, which focuses on delivering desired functionality by leveraging as many pre-existing solutions as possible. And a functional prototype, which focuses on exploring production options by honing in on core mechanics and functionality.
- Don't wait for perfection. Every day spent on product development is a day not on the market.
- Ultimately a restless BOM is indicative of teams not truly collaborating together. There are strategies to hurdle specific challenges, and even aid with hardware versioning, but if the BOM is constantly changing there is little opportunity to build valuable services on top of a hardware foundation. And, based on our experience, those value-add services are what matter the most to a connected product business.
- If you can't get your BOM down you can: go to market with the lower (desired) price point; take it out of the margin, or eat it; make it up with scale or an equally performant but lower cost v2 in the future; or account for the long-term upside of value-add cloud services (subscription, etc).

Packet loss on your network? Twilio finds HTTP/2 may not be the best choice. Discovering Issues with HTTP/2 via Chaos Testing: When there is packet loss on the network, congestion controls at the TCP layer will throttle the HTTP/2 streams that are multiplexed within fewer TCP connections. Additionally, because of TCP retry logic, packet loss affecting a single TCP connection will simultaneously impact several HTTP/2 streams while retries occur. In other words, head-of-line blocking has effectively moved from layer 7 of the network stack down to layer 4.
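
A toy model shows why multiplexing changes the blast radius of packet loss. This is not Twilio's test setup or a real network simulation. It only counts how many streams share the one TCP connection that is busy retransmitting, and the round-robin assignment and function names are assumptions for the example.

```python
def assign_streams(num_streams, num_connections):
    # Spread stream ids across connections round-robin.
    connections = [[] for _ in range(num_connections)]
    for stream_id in range(num_streams):
        connections[stream_id % num_connections].append(stream_id)
    return connections


def stalled_streams(connections, lossy_connection):
    # Every stream on the connection that is retransmitting waits with it.
    return list(connections[lossy_connection])


if __name__ == "__main__":
    requests_in_flight = 6
    one_per_connection = assign_streams(requests_in_flight, 6)  # HTTP/1.1 style
    multiplexed = assign_streams(requests_in_flight, 1)         # HTTP/2 style
    print(len(stalled_streams(one_per_connection, 0)))  # 1
    print(len(stalled_streams(multiplexed, 0)))         # 6
```

With one request per connection a loss event delays a single request. With everything multiplexed onto one connection the same loss event delays all six, which is the layer 4 head-of-line blocking the quote describes.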

If you can get over the PTSD from the experience of systems that moved all their logic into stored procedures, this just might be the future. Serverless Databases: The Future of Event-Driven Architecture. It removes all kinds of friction from the development process.
- AWS Aurora Serverless comes with an on-demand auto-scaling configuration. This means the database will start up, scale capacity as per your application’s demand, and shut down when not in use.
- Aurora Serverless saved $2.14 for just 24 hours when compared to Aurora RDS.

KPTI/KAISER Meltdown Initial Performance Regressions: The KPTI [Linux kernel page table isolation] patches to mitigate Meltdown can incur massive overhead, anything from 1% to over 800%. Where you are on that spectrum depends on your syscall and page fault rates, due to the extra CPU cycle overheads, and your memory working set size, due to TLB flushing on syscalls and context switches. I [Brendan Gregg] described these in this post, and analyzed them for a microbenchmark. Of course, nothing beats testing with real workloads, and the analysis I've included here may be more useful in explaining why performance regressed and showing opportunities to tune it, than for the estimated regressions. Practically, I'm expecting the cloud systems at my employer (Netflix) to experience between 0.1% and 6% overhead with KPTI due to our syscall rates, and I'm expecting we'll take that down to less than 2% with tuning: using 4.14 with pcid support, huge pages (which can also provide some gains), syscall reductions, and anything else we find.

Having a feeling of control in your life turns out to be essential for mental health. That's why you need to do projects just for fun. The Cranky Developer Manifesto: Rule 1: It's My Project; Rule 2: It'll Be Done When It Is; Rule 3: Go Overboard; Rule 4: Humbug To Stability; Rule 5: Set Your Own Practices; Rule 6: License At Will; Rule 7: You're Not A Diplomat.

Great explanation. How does Tor *really* work? Why did it take so long for Ross Ulbricht of Silk Road fame to get caught? Tor is secure as long as the first node and last node aren’t compromised. This is quite hard to do because normally Tor will create a circuit with the nearest fastest nodes. If you’re in India for example, it’s unlikely that the NSA / CIA / MI6 / GCHQ will have Tor nodes set up there. Even so if you are in some weird part of America it’s more likely that there exists a Tor node closer to you than the big agencies do. Ulbricht was caught through his own mistakes, such as: 1. He regularly boasted on LinkedIn of a large project he was working on, ‘creating an economic simulation’ in his words. 2. He used a real photograph of himself for a fake ID to rent servers to run his international multimillion dollar drugs marketplace. 3. He asked for advice on coding the secret website for his international multimillion dollar drugs marketplace using his real name. 4. He sought contacts in courier firms, presumably to work out how to best ship things from his international multimillion dollar drugs marketplace, on Google+, where his real name, real face and real YouTube profile were visible. 5. He allegedly paid $80,000 to kill a former employee of his international multimillion dollar drugs marketplace to a man who turned out to be an undercover cop.

Everyone wants a silver bullet. The fact is when you get a lot of people talking to each other it becomes a nightmare. So you jump to the next fix. With fewer people on the new system it works great. Then more people are added. Then even more people are added. The nightmare returns. You jump again. The cycle repeats. There is no silver bullet because werewolves don't exist. Slack is the opposite of organizational memory.

Programming is bringing order to chaos. If you think programming can be reduced to a single simple theory you've never stared deep into the maw of complexity and heard it laugh at you. We've Already Thought the Unthinkable: I recently read Tomas Petricek’s Thinking the Unthinkable, where he argues that modern PLT makes several restrictive assumptions about the nature of programming. Our reliance on mathematics in CS is not fundamental and our obsession with formal logic and algorithms keeps us from seeing other possible paradigms. He proposes two other unthinkable paradigms that are unrelatable to modern mathematical programming. I disagree with his premises: I think there’s a very valid reason to ground aspects of programming in mathematics. However, I want to focus on his core conclusion: that other paradigms are ‘unthinkable’. I’d like to argue that both of his proposed paradigms are 1) implemented, 2) at least somewhat explored, and 3) mathematically formalizable.

Scary as it is exciting. Hands-On with Skydio R1 Autonomous Drone! The comments surface subconscious fears of a lot of tech these days. What if it had guns? Hunter Seeker. Can it follow cars and not just people?

MIT has put their deep learning program online. 6.S191: Introduction to Deep Learning.

What's your coding style? In Think before you code Murat brings up that Dijkstra was a big believer in thinking everything through before coding. Extreme Programming says don't think, start coding. There's a parallel here with writing. There are discovery writers. They start writing and let the stories and characters evolve. They may have to do a lot of editing, but writing is a form of thinking and that's the way a lot of great writers work. Then there are writers who plan out every little detail. J. K. Rowling is supposed to have written her Harry Potter series that way. Others find a middle ground. They plot lightly and discovery write the rest, knowing they may have to do a lot of rewrites in the process. This is all before the editing process occurs. A good editor can cause a book to change a lot. Programmers don't have editors to pore over their software stories looking for ways to make the story better. Since I started on punch cards I generally think a lot before coding, but no plan survives contact with the enemy, so changes always occur. Coding is part of learning. So is thinking. So is testing. So is using. There's no one way to bring order from chaos.

How production engineers support global events on Facebook.
- Before the New Year's Eve planning, the segments written to storage were 10 seconds long (with a bit rate of around 1 Mb/s and approximately 1.25 MB per segment). We reduced these operations by consolidating the media segments into 20-second units, which decreased the IOPS to the storage system without negatively impacting its reliability or performance.
- Shadow traffic. This method of testing allowed us to replicate incoming production live streams on the FBLS host and multiply them into shadow streams. They then act as an additional input into the system so that an arbitrary amount of additional traffic can be generated on demand. This form of production traffic amplification was our preferred method of load-testing the system for New Year's Eve planning, because it simultaneously tests all aspects of the Facebook Live system (except for playback) as well as the dependency storage and network services. It produces an accurate load that closely matches production traffic and gives the highest-confidence measurement of the maximum traffic the system can sustain.
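
The segment arithmetic above can be checked with a short sketch. The bit rate and segment lengths come from the quote; the stream count and the assumption of one storage write per segment per stream are hypothetical, chosen only to show how doubling the segment length halves the write rate.

```python
def segment_size_mb(bitrate_mbps: float, seconds: float) -> float:
    """Segment size in megabytes for a stream encoded at `bitrate_mbps` megabits/s."""
    return bitrate_mbps * seconds / 8  # 8 bits per byte


def writes_per_second(streams: int, segment_seconds: float) -> float:
    """Average segment writes per second, assuming one write per segment per stream."""
    return streams / segment_seconds


STREAMS = 100_000  # hypothetical number of concurrent live streams

for seconds in (10, 20):
    size = segment_size_mb(1.0, seconds)
    rate = writes_per_second(STREAMS, seconds)
    print(f"{seconds}s segments: {size} MB each, {rate:.0f} writes/s")
```

The same bytes reach storage either way; only the number of operations changes, which is why the longer segments cut IOPS.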

Pricing is always a pain. Here's a good story showing how several different pricing strategies worked out before finally converging on the right price. SaaS Pricing: Lessons from 4 Pricing Changes: If we started this whole process again, we’d start with a single $50/mo plan. It’s straightforward, giving no reason for confusion...Identify your target customer by talking to people and listening to objections. From there, price with them in mind, focusing on clarity and simplicity. Once you have a good foundation, experiment! For us, scaling on a value metric is working well.

A free 325-page Azure Serverless Computing Cookbook. Topics include: Building a backend Web API using HTTP triggers; Sending an email notification to the administrator of the website using the SendGrid service; Using Cognitive Services to locate faces from the images; Testing Azure Functions; Monitoring your Azure Functions; Creating a common code repository for better manageability within a function app; Adding multiple messages to a Queue using the IAsyncCollector function.

Batch is the old world, streaming is the new. Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink. Netflix moved to this stack: Apache Flink; Apache Kafka acting as a message bus; Apache Hive providing data summarization, query, and analysis using an SQL-like interface (particularly for metadata in this case); Amazon S3 for storing data within HDFS; the Netflix OSS stack for integration into the wider Netflix ecosystem; Apache Mesos for job scheduling and execution; and Spinnaker for continuous delivery.

Possibly self-serving benchmark. NoSQL Performance Benchmark 2018 – MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB. Yet it might help you in your decision process. Good comments on HackerNews.

Do you still need on-demand instances? Replacing EC2 On-Demand Instances With New Spot Instances: The EC2 Spot instance marketplace has had a number of enhancements in the last couple months that have made it more attractive for more use cases. Putting these all together makes it easy to take instances you formerly ran on-demand and add an option to turn them into new Spot instances. They are much less likely to be interrupted than with the old spot market, and you can save a little to a lot in hourly costs, depending on the instance type, region, and availability zone.

Lots of good Lessons learned while developing Age of Empires 1 Definitive Edition. #12 is often overlooked.
- 1. Get networking and multiplayer working early; 2. Develop strong out of sync (OOS) detection tools early, and learn how to use them; 3. Do not underestimate the complexity and depth of UWP and Xbox Live development; 4. Develop clean and defensive coding practices early on; 5. Do not disable or break "old" logging code. Make sure it always compiles; 6. Add debug primitives if the engine doesn't have any; 7. Profile early and make major engine architectural decisions based off actual performance metrics; 8. Figure out early on how to split up a single-threaded engine to be multithreaded; 9. Many RTS systems rely on emergent behavior and are interdependent; 10. Automated regression testing; 11. Playtest constantly and with enough variety; 12. Assume the original developers knew what they were doing; 13. Don't waste time developing new templated containers and switching the engine to use them, but do reformat and clean up the old code.
- skittlebrau: We got way better at dealing with synced/deterministic simulations over time. Much better sync logging system, decoupling render rate from simulation rate, more institutional knowledge/intuition about all the surefire ways to break determinism (using unsynced random number generator, uninitialized variables, reading local machine state from sim, using local timers rather than synced ones, not resetting floating point flags after calling Direct3D, etc). The bad news was once we were generally avoiding all the common causes as a matter of course, the sync bugs that remained were non-trivial ones that took lots of time to diagnose.
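
The out-of-sync detection idea can be sketched as a per-tick checksum comparison. This is a toy illustration, not the Age of Empires engine: the `Sim` class, `first_desync` helper, and state layout are hypothetical. It assumes every peer starts from the same seed, applies the same command log, and keeps state in integers to sidestep floating-point differences.

```python
import hashlib
import random


class Sim:
    """Toy deterministic simulation: all randomness comes from a seeded RNG."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)  # synced RNG: same seed on every peer
        self.tick = 0
        self.units = [0, 0, 0]  # integer positions avoid floating-point drift

    def step(self, commands):
        """Advance one tick using only synced inputs (commands and synced RNG)."""
        for unit, delta in commands:
            self.units[unit] += delta
        # A synced random event, e.g. a damage roll.
        target = self.rng.randrange(len(self.units))
        self.units[target] += self.rng.randint(-1, 1)
        self.tick += 1

    def checksum(self) -> str:
        """Hash of the simulation state, exchanged between peers each tick."""
        data = repr((self.tick, self.units)).encode()
        return hashlib.sha256(data).hexdigest()


def first_desync(peer_a, peer_b, command_log):
    """Return the first tick at which the peers' checksums differ, else None."""
    for commands in command_log:
        peer_a.step(commands)
        peer_b.step(commands)
        if peer_a.checksum() != peer_b.checksum():
            return peer_a.tick
    return None


log = [[(0, 1)], [(1, 2)], [], [(2, -1)]]
print(first_desync(Sim(42), Sim(42), log))  # identical peers stay in sync

buggy = Sim(42)
buggy.units[0] = 5  # stands in for an uninitialized variable on one machine
print(first_desync(Sim(42), buggy, log))
```

Comparing checksums every tick narrows a sync bug to the tick where state first diverged, which is the starting point for the harder diagnosis skittlebrau describes.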

Some not so ancient history well told. Unix Architecture Evolution from the 1970 PDP-7 to the 2018 FreeBSD Important Milestones and Lessons. Went from thousands to millions of lines of code. Modularity has increased with code size. Software complexity rose only to self-correct and become simpler again over time. Cyclomatic complexity decreased, even without a central taskmaster making it happen. In Unix the way of working is to establish a convention and politely ask people to follow it without rigid enforcement. Unix has continually evolved, but there has been a slowdown in architectural evolution. It's hard to change a shipping system with 10s of millions of lines of code. Many features are added but not used. Architectural features change roles over time. Architectural decisions that appear reasonable at the time cause technical debt. For example, socket streams and datagrams seemed like a good idea, but didn't take. Portability as a shaping force. The ability to easily port Unix influenced the design of everything. There's a competition between architectural styles, ptrace, getrusage, and the file system interface. There's a preservation of architectural form. The modern form is stable. Growth of federated architectures, a number of architectures live in their own box, like OpenSSL and git. It's important to have a gifted team to create and select powerful abstractions.

Want to hide data in images? Use AI of course. Hiding Images using AI — Deep Steganography: Using a ConvNet will solve all the problems mentioned above. Firstly, the convnet will have a good idea about the patterns of natural images, and will be able to make decisions on which areas are redundant, and more pixels can be hidden there. By saving space on redundant areas, the amount of hidden information can be increased. Because the architecture and the weights can be randomised, the exact way in which the network will hide the information cannot be known to anybody who doesn't have the weights.

7 Simple Steps To Reduce AWS RDS Spend
- Data transfer out of RDS costs money, and the cost depends on the amount of data transferred. If you use a multi-AZ deployment, the cost doubles. There are additional costs for backup etc. As is the case with most AWS resources, AWS charges customers for the provisioned capacity and does not care whether you fully utilize the RDS instance type or the provisioned storage capacity or performance.
- In order to optimize RDS costs, it is important to make sure that rightsized RDS instances are selected. It is also important that the provisioned storage capacity and performance match the application needs.
- RDS is appropriate for sales transaction data due to its ACID property. On the other hand, product descriptions and pictures could be stored in cheaper storage systems such as S3
- Do a small pilot project to try out different instance types and collect database usage patterns.
- For data that has not been accessed for a while, have an automation tool move it off RDS to a less costly storage service such as S3. Sometimes the raw data is not needed after it is processed.
- For planned workloads, smooth out RDS utilization over the span of the workload.

Reading is fundamental. A Moth Brain Learns to Read MNIST.
- We gave MothNet the classic ML task of identifying the handwritten digits of the MNIST dataset [LeCun & Cortes (2010)]. MothNet routinely achieved 75% to 85% accuracy classifying test digits after training on 1 to 20 samples per class. In this few-samples regime it substantially out-performed standard ML methods such as Nearest-neighbors, SVM, and CNN (Fig 2). The results demonstrate that even very simple biological architectures hold novel and effective algorithmic tools applicable to ML tasks, in particular tasks constrained by few training samples or the need to add new classes without full retraining.
- cdelahunt: (paper author) You are correct that the 'natural' moth maxes out after about 20 samples/class. It is not yet clear whether this is an intrinsic limitation of the architecture (the competitive pressure on an insect is for fast and rough learning), or whether it is just an artifact of the parameters of the natural moth. For example, slowing the Hebbian growth parameters would allow the system to respond to more training samples, which should give better test-set accuracy. We're still running experiments.

Fun debugging story. A tale of performance debugging: from 1.3X slower to 48X faster than Apache Kafka: I do not think there exists a system today that can handle the load for the next generation of information flows in a cost efficient manner - Terabits/second. Low latency and high throughput will be a requirement to process drone logs, handle new security attacks, etc.

Partisan: Enabling Cloud-Scale Erlang Applications: In this work, we present an alternative distribution layer for Erlang, named Partisan. Partisan is a topology-agnostic distributed programming model and distribution layer that supports several network topologies for different application scenarios: full mesh, peer-to-peer, client-server, and publish-subscribe. Partisan allows application developers to specify the network topology at runtime, rather than encoding topology-specific concerns into application code.

Distributed Systems 3rd edition (2017): For this third edition of "Distributed Systems," the material has been thoroughly revised and extended, integrating principles and paradigms into nine chapters: 1. Introduction 2. Architectures 3. Processes 4. Communication 5. Naming 6. Coordination 7. Replication 8. Fault tolerance 9. Security A separation has been made between basic material and more specific subjects.

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures: We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace. We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment (Beattie et al., 2016)) and Atari-57 (all available Atari games in Arcade Learning Environment (Bellemare et al., 2013a)). Our results show that IMPALA is able to achieve better performance than previous agents with less data, and crucially exhibits positive transfer between tasks as a result of its multi-task approach.
