Stuff The Internet Says On Scalability For May 10th, 2019

Wake up! It's HighScalability time:

Deep-sky mosaic, created from nearly 7,500 individual exposures, provides a wide portrait of the distant universe, containing 265,000 galaxies that stretch back through 13.3 billion years of time to just 500 million years after the big bang. (hubblesite)

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 45 mostly 5 star reviews (107 on Goodreads). They'll learn a lot and hold you in awe.

Number Stuff:

  • 36%: of the world touches a Facebook app every month, adding up to about 2 years of use over a lifetime
  • $84.4: average yearly Facebook ad revenue per user in North America
  • 1%: of performers raked in 60% of all concert-ticket revenue worldwide in 2017—more than double their share in 1982
  • 175 zettabytes: size of the datasphere in 2025, up 5x from 2018, with 49% stored in public clouds
  • 45.9%: Amazon's share of U.S. online retail growth in 2018, and 20.8% of total U.S. retail sales growth
  • $4.5B: Apple's make it all go away payment to Qualcomm
  • 64 nanowatts: energy harvested per square meter from a sky battery
  • 18%: YoY drop in smartphone sales
  • 10x: size of software markets and businesses compared to 10-15 years ago, largely due to the liquidity provided by the global internet
  • 33: age of the average American gamer, who prefers to play on their smartphone and is spending 20 percent more than a year ago and 85 percent more than in 2015; the $43.4 billion spent in 2018 was mostly on content
  • 336: average lifespan of a civilization in years
  • 2/3rds: drop in M&A spending in April
  • 74%: SMBs said they “definitely would pay ransom at almost any price” to get their data back or prevent it from being stolen
  • 2.5B: devices powered by Android
  • 40%: use a hybrid cloud infrastructure
  • 50%: drop in 2019 hard disk sales, due to a combination of general market weaknesses and the transition of notebooks to SSDs
  • 2.5 million: total view count for the final Madden NFL 19 Bowl match
  • 40%: Amazon merchants based in China

Quotable Stuff:

  • @mjpt777: APIs to IO need to be asynchronous and support batching otherwise the latency of calls dominate throughput and latency profile under burst conditions. Languages need to evolve to better support asynchronous interfaces and have state machine support, not try to paper over the obvious issues with synchronous APIs. Not everyone needs high performance but the blatant waste  and energy consumption of our industry cannot continue.
  • Guido van Rossum: I did not enjoy at all when the central developers were sending me hints on Twitter questioning my authority and the wisdom of my decisions, instead of telling me in my face and having an honest debate about things.
  • Isobel Cockerell: A kind of WeChat code had developed through emoji: A half-fallen rose meant someone had been arrested. A dark moon, they had gone to the camps. A sun emoji—“I am alive.” A flower—“I have been released.”
  • @scottsantens: Australian company shifts to 4-day week with every Weds off and no decrease in pay. Result? 46% more revenue, a tripling of profits, and happier employees taking fewer sick days. Also Thurs are now much more productive. We work too much.
  • Twitter: Across the six Twitter Rules policy categories included in this report, 16,388 accounts were reported by known government entities compared to 5,461 reported during the last reported period, an increase of 17%.
  • Michael Sheetz: The Blue Moon lander can bring 3.6 metric tons to the lunar surface, according to Bezos. Bezos also unveiled the company's BE-7 rocket engine at the event. The engine will be test fired for the first time this summer, Bezos said. It's largely made of "printed" parts, he added. "We need the new engine and that's what this is," Bezos said.
  • Umich: Called MORPHEUS, the chip blocks potential attacks by encrypting and randomly reshuffling key bits of its own code and data 20 times per second—infinitely faster than a human hacker can work and thousands of times faster than even the fastest electronic hacking techniques. With MORPHEUS, even if a hacker finds a bug, the information needed to exploit it vanishes 50 milliseconds later. It’s perhaps the closest thing to a future-proof secure system.
  • Sean Illing: In some ways our dependence on the phone also makes us less independent. Americans always celebrate self-reliance as a value, but it’s very clear we don’t — even for a moment — want to be by ourselves or on our own any longer. I have mixed feelings about the whole mythology of self-reliance. But certainly, while the myth that we’re self-reliant lives on, our ability to be alone seems to be going by the wayside.
  • DSHR: If University libraries/archives spent 1% of their acquisitions budget on Web archiving, they could expand their preserved historical Web records by a multiple of 20x.
  • Alexander Rose: Probably a third of the organizations or the companies over 500 or 1,000 years old are all in some way in wine, beer, or sake production.
  • @benedictevans: Idle observation: 2/3 to 3/4 of Google and Facebook’s ad business is from companies that never bought print advertising other than Yellow Pages. And a lot of what was in print went elsewhere.
  • @stevecheney: There is so much asymmetry in the Valley it cracks me up... Distributed teams are not a new trend — they are just downstream to VCs when fundraising series A/B. We built a 50 person distributed co after YC late 2013. And are 4x more capital efficient because of it.
  • digitalcommerce360: retailers ranked Nos. 401-500 this year grew their collective web revenue by 24.3% in 2018 over 2017, faster than the 20.0% growth of Amazon, and well above the 14.1% year-over-year ecommerce growth in North America.
  • Nikita: So why was AMP needed? Well, basically Google needed to lock content providers to be served through Google Search. But they needed a good cover story for that. And they chose to promote it as a performance solution.
  • c2h5oh: Name one high profile whistleblower in the USA in the last 30 years who has not had his entire life upturned or, more often, straight up ruined as a direct result of his high moral standards. Nobody working at Boeing or the FAA right now has witnessed one during their lifetime - all they saw were cautionary tales.
  • Logan Engstrom et al.: In summary, both robust and non-robust features are predictive on the training set, but only non-robust features will yield generalization to the original test set. Thus, the fact that models trained on this dataset actually generalize to the standard test set indicates that (a) non-robust features exist and are sufficient for good generalization, and (b) deep neural networks indeed rely on these non-robust features, even in the presence of predictive robust features.
  • Andy Greenberg: SaboTor also underscored an aggressive new approach to law enforcement's dark-web operations: The agents from the Joint Criminal Opioid Darknet Enforcement team that carried it out—from the FBI, Homeland Security Investigations, Drug Enforcement Administration, Postal Service, Customs and Border Protection, and Department of Defense—now all sit together in one room of the FBI's Washington headquarters. They've been dedicated full-time to following the trail of dark-web suspects, from tracing their physical package deliveries to following the trail of payments on Bitcoin's blockchain.
  • Dharmesh Thakker: The future of open source is in the cloud, and the future of cloud is heavily influenced by open source. Going forward, I believe the diamond standard in infrastructure software will be building a legendary open-source brand that is adopted by thousands of users, and then delivering a cloud-native, full-service experience to commercialize it. Along the way, non- open-source companies that use cloud “time-to-value” effectively, as well as hybrid open-source solutions delivered on multi-cloud and on-premise systems, will continue to thrive. This is the new OpenCloud paradigm, and I am excited about the hundreds of transformational companies that will be formed in the coming years to take advantage of it.
  • RcouF1uZ4gsC: There seems to be a trend of people making a lot of money designing/building stuff that erodes privacy and ethics and then leaving the company where they made that money and talking about privacy and ethics. Take for example Justin Rosenstein who invented the Like button.
  • A. Nonymous: On the fateful day, a switch crashed. The crash condition resulted in a repeated sequence of frames being sent at full wire speed. The repeated frames included broadcast traffic in the management VLAN, so every control-plane CPU had to process them. Network infrastructure CPUs at 100% all over the data center including core switches, routing adjacencies down, etc. The entire facility could not process for ~3.5 hours. No stretched L2, so damage was contained to a single site. This was a reasonably well-managed site, but had some dumb design choices. Highly bridged networks don’t tolerate dumb design choices.
  • Kevin Fogarty: Despite the moniker, 5G is more of a statement of direction than a single technology. The sub-6GHz version, which is what is being rolled out today, is more like 4.5G. Signal attenuation is modest, and these devices behave much like cell phones today. But when millimeter wave technology begins rolling out—current projections are 2021 or 2022—everything changes significantly. This slice of the spectrum is so sensitive that it can be blocked by clothing, skin, windows, and sometimes even fog.
  • DSHR: Why did "cloud service providers" have an "inventory build-up during calendar 2018"? Because the demand for storage from their customers was even further from insatiable than the drive vendors expected. Even the experts fall victim to the "insatiable demand" myth.
  • Eric Budish: In particular, the model suggests that Bitcoin would be majority attacked if it became sufficiently economically important — e.g., if it became a “store of value” akin to gold — which suggests that there are intrinsic economic limits to how economically important it can become in the first place.
  • Kalev Leetaru: In fact, much of the bias of deep learning comes from the reliance of the AI community on free data rather than paying to create minimally biased data. Putting this all together, as data science matures it must become far more like the hard sciences, especially a willingness to expend the resources to collect new data and ask the hard questions, rather than its current usage of merely lending a veneer of credibility to preordained conclusions.
  • Joel Hruska: AMD picked up one percentage point of unit share in the overall x86 market in Q1 2019 compared with the previous quarter and 4.7 percentage points of market share compared with Q1 2018. This means AMD increased its market share by 1.54x in just one year — a substantial improvement for any company.
  • @awsgeek: <- Meet the latest AWS Lambda Layers fanboy. I love how I can now move common dependencies into shared layers & reduce Lambda package sizes, which allows me to continue developing & debugging functions in the Lambda console. Yes, I love VIM, but I'm still a sucker for a GUI!
  • Chen: It’s not hard to believe that someone, maybe an employee, could be convinced to add a rogue element, a tiny little capacitor or something, to a board. There was a bug we heard about that looked like a generic Ethernet jack, and it worked like one, but it had some additional cables. The socket itself is the Trojan and the relevant piece is inside the case, so it’s hard to see.
  • @aallan: "The future is web apps, and has been since Steve Jobs told the WWDC audience in 2007 he had a 'sweet solution' to their desire to put apps on the iPhone – the web! – and was greeted by the stoniest of silences…" Yup, this! The future is never web apps.
  • @tmclaughbos: The biggest divide in the ops community isn't Old v. DevOps v. SRE, k8s, v. serverless, or whatever. It is "How do I run infrastructure?" v. "How do I not run infrastructure?".
  • @PaulDJohnston: The serverless shift * From Code to Configuration * From High LoC towards Low LoC (preferably zero) * From Building Services to Consuming Services * From Owning Workloads to Disowning Workloads
  • Vishal Gurbuxani: Facebook is not a social network anymore. It is a completely re-written internet, where a consumer spends their time, money, attention to buy products/services for their day-to-day life, as well as connect with people/groups, etc. I hope we can all take a stand and realize that our humanity is being lost by Facebook, when they choose to use algorithms to police 2.7 billion people.
  • @mattklein123: The thing I find most ironic about the C++ is dead narrative is that C++ IS one of the (several) reasons that Envoy has blown up. Google/Apple would not have touched Envoy if not C++, and their support was critical in early 2017, lending both expertise and resources. 1/ Winning in OSS is about product market fit, "hiring" contributors, and community building. The bigger the community, the more expertise and the more production usage, and this creates a compounding virtuous cycle. 2/
  • @clintsharp: It is nearly impossible to imbue an algorithm with *judgement*. And ultimately, we are paying operators of complex systems for their judgement. When to page out, when to escalate, when to bring in the developers. No algorithm is going to solve that for you.
  • Rudraksh Tuwani et al.: Our analysis reveals copy-mutation as a plausible mechanism of culinary evolution. As the world copes with the challenges of diet-linked disorders, knowledge of the key determinants of culinary evolution can drive the creation of novel recipe generation algorithms aimed at dietary interventions for better nutrition and health.
  • ellius: After fixing a recent bug, I asked my client company what if any postmortem process they had. I informally noted about 8 factors that had driven the resolution time to ~8 hours from what probably could have been 1 or 2. Some of them were things we had no control over, but a good 4-5 were things in the application team's immediate control or within its orbit. These are issues that will definitely recur in troubleshooting future bugs, and doing a proper postmortem could easily save 250+ man hours over the course of a year. What's more, fixing some of these issues would also aid in application development. So you're looking at immediate cost savings
  • MITTR: The limiting factor for new machines is no longer the hardware but the power available to keep them humming. The Summit machine already requires a 14-megawatt power supply. That’s enough to light up an entire medium-sized town. “To scale such a system by 10x would require 140 MW of power, which would be prohibitively expensive,” say Villalonga and co. By contrast, quantum computers are frugal. Their main power requirement is the cooling for superconducting components. So a 72-qubit computer like Google’s Bristlecone, for example, requires about 14 kW. “Even as qubit systems scale up, this amount is unlikely to significantly grow,” say Villalonga and co.
  • Patient0: In the ~15 years I spent building software in C++ I don't recall a single time that I wished for garbage collection. By using RAII techniques, it was always possible (nay, easy!) to write code that cleaned up after itself automatically. I always found it easier to reason about and debug programs because I knew when something was supposed to be freed, and in which thread. In contrast, in the ~10 years I spent working in Java, I frequently ran into problems with programs which needed an excessive amount of memory to run. I spent countless frustrating hours debugging and "tuning" the JVM to try to reduce the excessive memory footprint (never entirely successfully). Garbage collection is an oversold hack - I concede there are probably some situations where it is useful - but it has never lived up to the claims people made about it, especially with respect to it supposedly increasing developer productivity.
  • dan.j.newhouse: I just went through migrating our production machines from m5 and r5 instances to z1d over the last couple months. I’m a big fan of the z1d instance family now. Where I work, our workloads are very heavy on CPU, in addition to wanting a good chunk of RAM (what database doesn’t, though?). The m5 and r5 instances don’t cut it in the CPU department, and the c5 family is just poor RAM per dollar. While this blog post is highlighting the CPU, the z1d also has the instance storage NVMe ssd (as does the r5d). Set the database service to automatic delayed start, and toss TempDB on that disk. That local NVMe ssd is great in multiple ways. First, it’s already included in the price of the EC2 instance. Secondly, I’ve seen throughput in the neighborhood of 750 MB/s against it (YMMV). Considering the cost of an io1 volume with a good chunk of provisioned IOPS is NOT cheap, plus you need an instance large enough to support that level of throughput to EBS in the first place, this is a big deal. If you’ve got the gp2-blues, with TempDB performing poorly, or worse, even experiencing buffer latch wait timeouts for our good buddy database ID 2, making a change to a z1d (or r5d, if you don’t need that CPU) to leverage that local ssd is really something to consider.
  • _nothing: My opinion is different nowadays. Instagram is surely a place where exists a lot of true beauty and expression, but it's also a place full of people largely driven by societal and monetary reward, to an extent that I've come to consider unhealthy. We are being influenced, and we are influencing. And we like that-- my social brain wants to know what society considers beautiful, it enjoys training itself on what society considers beautiful, it wants to be affirmed in the beauty of its own body. Instagram gave me exactly what I wanted. But (and I don't mean to criticize anyone here at all, considering I was and still am subject to the same pressures and influences) I don't think what I thought I wanted was healthy. I want to be happy. A constant stream of corgi videos and bikini photos and travel porn gives me little ups, but it also shapes my brain in ways I think could be damaging.

Useful Stuff:

  • Another example of specialization being the key to scalability and efficiency. It's fascinating to see all the knobs Dropbox can tune because they do one thing well. How we optimized Magic Pocket for cold storage
    • kdkeyser: This article is about single-region storage vs. multi-region storage (and how to reduce the cost in this case). There is very little public info available about distributed storage systems in multi-region setup with significant latency between the sites.
    • preslavle: In our approach the additional codebase for cold storage is extremely small relative to the entire Magic Pocket codebase and importantly does not mutate any data in the live write path: data is written to the warm storage system and then asynchronously migrated to the cold storage system. This provides us an opportunity to hold data in both systems simultaneously during the transition and run extensive validation tests before removing data from the warm system. We use the exact same storage zones and codebase for storing each cold storage fragment as we use for storing each block in the warm data store. It’s the same system storing the data, just for a fragment instead of a block. In this respect we still have multi-zone protections since each fragment is stored in multiple zones.
    • Over 40% of all file retrievals in Dropbox are for data uploaded in the last day, over 70% for data uploaded in the last month, and over 90% for data uploaded in the last year. Dropbox has unpredictable delete patterns so we needed some process to reclaim space when one of the blocks gets deleted.
    • This system is already designed for a fairly cold workload. It uses spinning disks, which have the advantage of being cheap, durable, and relatively high-bandwidth. We save the solid-state drives (SSDs) for our databases and caches. Magic Pocket also uses different data encodings as files age. When we first upload a file to Magic Pocket we use n-way replication across a relatively large number of storage nodes, but then later encode older data in a more efficient erasure coded format in the background
    • Dropbox’s network stack is already heavily optimized for transferring large blocks of data over long distances. We have a highly tuned network stack and gRPC-based RPC framework, called Courier, that is multiplexing requests over HTTP/2 transport. This all results in warm TCP connections with a large window size that allows us to transfer a multi-megabyte block of data with a single round-trip.
    • One beautiful property of the cold storage tier is that it’s always exercising the worst-case scenario. There is no plan A and plan B. Regardless of whether a region is down or not, retrieving data always requires a reconstruction from multiple fragments. Unlike our previous designs or even the warm tier, a region outage does not result in major shifts in traffic or increase of disk I/O in the surviving regions. This made us less worried about hitting unexpected capacity limits during emergency failover at peak hours.
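    • The fragment-and-reconstruct idea is easy to picture with a toy example. Below is a minimal sketch (not Dropbox's actual code or encoding; Magic Pocket uses a far more sophisticated erasure code) that splits a block into data fragments plus one XOR parity fragment, so the block can be rebuilt if any single fragment's zone is unavailable:

      # Toy illustration of fragments + reconstruction (NOT Dropbox's scheme).
      # A block becomes k data fragments plus one XOR parity fragment, each of
      # which would live in a different zone; any single missing fragment can
      # be rebuilt from the survivors.
      def encode(block, k=3):
          frag_len = -(-len(block) // k)  # ceiling division
          frags = [block[i*frag_len:(i+1)*frag_len].ljust(frag_len, b"\0")
                   for i in range(k)]
          parity = bytes(a ^ b ^ c for a, b, c in zip(*frags))  # k == 3 here
          return frags + [parity]

      def reconstruct(frags):
          missing = [i for i, f in enumerate(frags) if f is None]
          assert len(missing) <= 1, "XOR parity tolerates one lost fragment"
          if missing:
              survivors = [f for f in frags if f is not None]
              frags[missing[0]] = bytes(a ^ b ^ c for a, b, c in zip(*survivors))
          return frags[:-1]  # drop the parity fragment; data fragments remain

      fragments = encode(b"a multi-megabyte block, in miniature")
      fragments[1] = None  # pretend the zone holding fragment 1 is down
      data = b"".join(reconstruct(fragments)).rstrip(b"\0")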

  • If you come to Silicon Valley should you work for a consumer-oriented company or an enterprise SaaS company? The answer for a long time has been to target the consumer space. Consumer has been sexy, where the innovation is, where a new business model can win, where opportunity can be found. Exponent Episode 170 — A Perfect Meal argues that's no longer true. Consumer and enterprise SaaS have switched roles. Consumer is now controlled by monopolies. It's hard for a new entrant to gain a foothold in the consumer space. Enterprise SaaS is where a new product can win on merit. The examples given are Zoom and Slack. Both Zoom and Slack have won because they are better than their competitors. Can you say the same about many recent consumer products? The change is driven by the same trends we've seen drive the consumer market. Bring your own device in the enterprise has made users more of a driving force in deciding what software an enterprise adopts. The role of the gatekeeper has diminished. You only need to convince an individual employee at a company to give your product a try, which is perfect for software as a service. Anyone at a company can sign up for a SaaS product at no risk. A sales team doesn't have to build relationships to drive sales. Employees drive adoption. Once in a company, especially if your product has a viral component, you can land and expand sales, something both Zoom and Slack have mastered. This drives down customer acquisition costs dramatically. Once you have an individual on board that individual can infect others. And your sales team, after seeing a company has a number of users, can call that company to try and get the entire company on board. The pitch can be that you're relieving pain by offering a managed service for the entire company instead of the pain of each team managing a service for themselves. If you want to build a product where the best product wins then enterprise is the new sexy. The competitive dynamics in the enterprise reward being the better company in a way that consumer no longer does.

  • A radical rethinking of the stack. Fast key-value stores: An idea whose time has come and gone: We argue that the time of the RInK [Remote, in-memory key-value] store has come and gone: their domain-independent APIs (e.g., PUT/GET) push complexity back to the application, leading to extra (un)marshalling overheads and network hops. Instead, data center services should be built using stateful application servers or custom in-memory stores with domain-specific APIs, which offer higher performance than RInKs at lower cost.
    • SerDes is always a huge waste: in ProtoCache prior to its rearchitecture, 27% of latency was due to (un)marshalling. In our experiments (Section 3), (un)marshalling accounts for more than 85% of CPU usage. We also found (un)marshalling a 1KB protocol buffer to cost over 10us, with all data in the L1 cache. A third-party benchmark [5] shows that other popular serialization formats (e.g., Thrift [27]) are equally slow
    • Extra network hops also have a cost:  prior to its rearchitecture, ProtoCache incurred an 80 ms latency penalty simply to transfer large records from a remote store, despite a high speed network.
    • What they want instead: Stateful application servers couple full application logic with a cache of in-memory state linked into the same process (Fig. 2b). This architecture effectively merges the RInK with the application server; it is feasible when a RInK is only accessed by a single application and all requests access a single key. Latency is 29% to 57% better (at the median), with relative improvement increasing with object size. (A toy sketch of the contrast appears at the end of this list.)
    • This is really a back to the future model of services. Services were stateful at one time. Then we went stateless to scale and added in caches to mitigate the performance penalty for separating state from logic. It would be interesting to see something like Lambda distribute stateful actors instead of functions.
    • Good discussion on HN.
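    • A minimal sketch of the contrast the paper draws, using hypothetical names (an illustration, not code from the paper): a domain-specific, in-process store answers application questions directly, while a RInK-style remote store forces a network hop plus (un)marshalling of the whole value on every access.

      import json

      class RemoteKVClient:
          """Stand-in for a remote PUT/GET store: every access pays a network
          hop plus (un)marshalling because the store only holds opaque bytes."""
          def __init__(self):
              self._store = {}  # imagine this sitting behind a socket
          def put(self, key, obj):
              self._store[key] = json.dumps(obj).encode()  # marshal going in
          def get(self, key):
              return json.loads(self._store[key])          # unmarshal coming out

      class StatefulCartServer:
          """Stateful application server: requests for a user are routed to the
          process that owns that user's cart, so logic touches live objects."""
          def __init__(self):
              self._carts = {}  # user_id -> list of items, no SerDes, no hop
          def add_item(self, user_id, item):
              self._carts.setdefault(user_id, []).append(item)
          def item_count(self, user_id):
              return len(self._carts.get(user_id, []))

      # RInK style: fetch and unmarshal the whole cart just to count the items.
      rink = RemoteKVClient()
      rink.put("u1", {"items": ["guitar", "strap"]})
      count = len(rink.get("u1")["items"])

      # Stateful style: the domain-specific API answers the question directly.
      server = StatefulCartServer()
      server.add_item("u1", "guitar"); server.add_item("u1", "strap")
      count = server.item_count("u1")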

  • We don't need no stinkin' OS. But of course the interfaces they talk about are really just another OS. I/O Is Faster Than the CPU – Let’s Partition Resources and Eliminate (Most) OS Abstractions: I/O is getting faster in servers that have fast programmable NICs and non-volatile main memory operating close to the speed of DRAM, but single-threaded CPU speeds have stagnated. Applications cannot take advantage of modern hardware capabilities when using interfaces built around abstractions that assume I/O to be slow. We therefore propose a structure for an OS called parakernel, which eliminates most OS abstractions and provides interfaces for applications to leverage the full potential of the underlying hardware. The parakernel facilitates application-level parallelism by securely partitioning the resources and multiplexing only those resources that are not partitioned. Great discussion on HN. We've seen all this before. 

  • We should keep this lesson in mind when it comes to the use of biological weapons. Anything put out in the world can be captured, analysed, 3D printed en masse, and sent right back at the attacker. How Chinese Spies Got the N.S.A.’s Hacking Tools, and Used Them for Attacks: Chinese intelligence agents acquired National Security Agency hacking tools and repurposed them in 2016 to attack American allies and private companies in Europe and Asia, a leading cybersecurity firm has discovered. The episode is the latest evidence that the United States has lost control of key parts of its cybersecurity arsenal.

  • Real Time Lambda Cost Analysis Using Honeycomb: All of the services that support our web and mobile applications at Fender Digital are built using AWS Lambda. With Lambda’s cost-per-use billing model we have cut the cost of hosting our services by approximately 90%...While we have been tagging our resources diligently to calculate the cost of each service, the AWS Cost Explorer does not allow us to delve into the configuration of the function and the actual resources it has consumed versus the actual invocation times....Log aggregation to the rescue! We aggregate all of the Cloudwatch log groups for our Lambda functions into a single Kinesis stream, which then parses out the structured JSON logs and publishes them to honeycomb.io. I cannot recommend that tool highly enough for analyzing log data. It is a great product from a great group of people. AWS adds its own log lines as well, including when the function starts an invocation, when the invocation ends, and a report of the invocation that can be used to calculate the cost of the invocation...The function is currently configured for 512 MB, and since most of the time it spends is in network I/O on calling app store APIs, we can reduce the configured memory. If we were to reduce it to 384 MB we would see an ~25% reduction in cost. The average invocation time may be slightly higher, but as this function is invoked off of a DynamoDB stream it has no direct impact on user experience. What are the actual costs incurred? Assuming we’ve already consumed the free tier for Lambda, during the 3 day period we are looking at there were 1,405,640 requests x $0.0000002 per request = $0.28 and 140,057.8 gb-seconds of compute time x $0.00001667 = $2.33. For those three days, this function cost $2.61, leading to a potential savings of $0.65. That’s not a lot, but as our use of lambda steadily grows we can ensure we are not over-provisioned.
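    The arithmetic above is easy to reproduce. Here's a small sketch using the same published Lambda prices quoted in the post; the 384 MB scenario assumes invocation duration stays roughly flat because the function is mostly waiting on network I/O (the post's own assumption):

      # Reproducing the cost math from the post with the Lambda prices it quotes.
      # Savings assume duration is unchanged at the lower memory setting.
      PRICE_PER_REQUEST = 0.0000002     # USD
      PRICE_PER_GB_SECOND = 0.00001667  # USD

      def lambda_cost(requests, gb_seconds):
          return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

      requests = 1_405_640
      gb_seconds_512 = 140_057.8
      cost_512 = lambda_cost(requests, gb_seconds_512)   # ~$2.61 over 3 days

      # Dropping from 512 MB to 384 MB scales GB-seconds by 384/512 = 0.75
      gb_seconds_384 = gb_seconds_512 * 384 / 512
      cost_384 = lambda_cost(requests, gb_seconds_384)
      print(f"512 MB: ${cost_512:.2f}, 384 MB: ${cost_384:.2f}, "
            f"saved: ${cost_512 - cost_384:.2f}")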

  • Measuring MySQL Performance in Kubernetes: You can see the numbers vary a lot, from 10,550 tps to 20,196 tps, with most of the time being in the 10,000 tps range. That’s quite disappointing. Basically, we lost half of the throughput by moving [MySQL] to the Kubernetes node. To improve your experience you need to make sure you use Guaranteed QoS. Unfortunately, Kubernetes does not make it easy. You need to manually set the number of CPU threads, which is not always obvious if you use dynamic cloud instances. With Guaranteed QoS there is still a performance overhead of 10%, but I guess this is the cost we have to accept at the moment.

  • If your company runs multiple lines of business (think Gmail vs. Google Docs vs. Google Calendar, etc.), how can you tell how much of your hardware and infrastructure spend is attributed to each LOB? Embracing context propagation: When the request enters our system, we typically already know which LOB it represents, either from the API endpoint or even directly from the client apps. We can use context (baggage) to store the LOB tag and use it anywhere in the call graph to attribute measurements of resource usage to a specific LOB, such as number of reads/writes in the storage, or number of messages processed by the messaging platform.
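    The mechanics are easy to sketch without committing to a particular tracing library: stamp the LOB into request-scoped context at the edge, and let components deep in the stack read it when they record resource usage. A minimal Python sketch using stdlib contextvars (all names hypothetical):

      import contextvars
      from collections import Counter

      current_lob = contextvars.ContextVar("current_lob", default="unknown")
      storage_reads_by_lob = Counter()

      def handle_request(lob, work):
          """Edge of the system: the LOB is already known from the endpoint."""
          token = current_lob.set(lob)
          try:
              work()
          finally:
              current_lob.reset(token)

      def storage_read(key):
          """Deep in the stack: attribute the read to whatever LOB is in context."""
          storage_reads_by_lob[current_lob.get()] += 1
          return f"value-for-{key}"

      handle_request("mail", lambda: [storage_read(k) for k in ("a", "b")])
      handle_request("calendar", lambda: storage_read("c"))
      print(storage_reads_by_lob)  # Counter({'mail': 2, 'calendar': 1})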

  • Attack of the Killer Microseconds: A new breed of low-latency I/O devices, ranging from faster datacenter networking to emerging non-volatile memories and accelerators, motivates greater interest in microsecond-scale latencies. Existing system optimizations targeting nanosecond- and millisecond-scale events are inadequate for events in the microsecond range. New techniques are needed to enable simple programs to achieve high performance when microsecond-scale latencies are involved, including new microarchitecture support. 

  • The fundamental issue with humanity is one person's idea of utopia is another's dystopia. We spend our lives navigating the resulting maelstrom of the conflict. This vision is not Humanity Unchained, it's Humanity Limited by its own creation. Do you really want humanity limited by an AI enforcing all the "standard problems of political philosophy"? We would be stuck in the past rather than moving forward. Stephen Wolfram: But what will be possible with this? In a sense, human language was what launched civilization. What will computational language do? We can rethink almost everything: democracy that works by having everyone write a computational essay about what they want, that’s then fed to a big central AI—which inevitably has all the standard problems of political philosophy. New ways to think about what it means to do science, or to know things. Ways to organize and understand the civilization of the AIs. A big part of this is going to start with computational contracts and the idea of autonomous computation—a kind of strange merger of the world of natural law, human law, and computational law. Something anticipated three centuries ago by people like Leibniz—but finally becoming real today. Finally a world run with code.

  • A very well written Post-mortem and remediations for [Matrix] Apr 11 security incident. When you hear this in a movie you know exactly what happens next: What can we trust if not our own servers? And that's what happened: If there is one lesson everyone should learn from this whole mess, it is: SSH agent forwarding is incredibly unsafe, and in general you should never use it. Not only can malicious code running on the server as that user (or root) hijack your credentials, but your credentials can in turn be used to access hosts behind your network perimeter which might otherwise be inaccessible. All it takes is someone to have snuck malicious code on your server waiting for you to log in with a forwarded agent, and boom, even if it was just a one-off ssh -A. TL;DR: keep your services patched; lock down SSH; partition your network; and there's almost never a good reason to use SSH agent forwarding.

  • You have a great IoT idea, but you just can't get past the lack of an M2M network. Good news: AT&T went live with their NarrowBand Internet of Things (NB-IoT) network. Bodyport, for example, uses the LTE-M network to connect a smart scale that transmits patients’ cardiovascular data to remote care teams in near real-time. They're working with suppliers to certify $5 modules that connect devices to NB-IoT and pricing plans are available for as low as $5/year/device. You need a revenue model, but at least it's possible.

  • Meet programmers from a more civilized age. Brian Kernighan interviews Ken Thompson at Vintage Computer Festival East 2019

  • 6 new ways to reduce your AWS bill with little effort: AWS introduced AMD-powered EC2 instances that are 10% cheaper compared to the Intel-powered Instances. They provide the same resources (CPU, memory, network bandwidth) and run the same AMIs; Use VPC endpoints instead of NAT gateways; Convertible Reserved EC2 Instances - Saving potential: Additional 25% over On-Demand (assuming you can now go from 1-year terms to 3-year terms); EC2 Spot Instances - Saving potential: 70-90% over On-Demand; S3 Intelligent-Tiering.

  • Maybe this would work for how to communicate within a team? Mr. Rogers’ Nine Rules for Speaking to Children (1977): State the idea you wish to express as clearly as possible, and in terms preschoolers can understand; Rephrase in a positive manner; Rephrase your idea to eliminate all elements that could be considered prescriptive, directive, or instructive; Rephrase any element that suggests certainty; Rephrase your idea to eliminate any element that may not apply to all children; Add a simple motivational idea that gives preschoolers a reason to follow your advice; Rephrase your new statement, repeating the first step; Rephrase your idea a final time, relating it to some phase of development a preschooler can understand.

  • How are a blockchain and end-to-end encryption totally owned by Facebook any more private? Top 3 Takeaways from Facebook: Blockchain will be at the center of Facebook’s Strategy for their entire platform and payments; Building out infrastructure on the data-level to have security from the ground up; Code privacy and data use principles as first class concepts into the infrastructure. This is one of the primary use cases of a Blockchain-based network; I didn’t see anyone catch this sound bite from Mark, but he basically said that they are rewriting all of Facebook’s back-end code to be more user-centric, which is a distributed ledger and access control.

  • Do you want to go serverless and survive AWS region level outages? Here's a huge amount of practical detail as well as tips and gotchas. Disaster Tolerance Patterns Using AWS Serverless Services: S3 resilience: Use versioning and cross region replication for S3 buckets; Use CloudFront origin failover for read access to replicated S3 buckets; DynamoDB resilience: Use global tables for DynamoDB tables; API Gateway and Lambda resilience: Use a regional API Gateway and associated Lambda functions in each region; Use Route 53 latency or failover routing with health checks in front of API Gateways; Cognito User Pools resilience: Create custom sync solution for now.
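    As one concrete illustration of the API Gateway + Route 53 pattern, here is a hedged boto3 sketch of failover alias records in front of two regional endpoints. Domain names, zone IDs, and the health check ID are placeholders, not values from the article; check the Route 53 docs before relying on the exact fields:

      import boto3

      route53 = boto3.client("route53")

      def failover_alias(role, set_id, target_dns, target_zone_id, health_check_id=None):
          # Build one failover routing record pointing at a regional endpoint.
          record = {
              "Name": "api.example.com.",
              "Type": "A",
              "SetIdentifier": set_id,
              "Failover": role,  # "PRIMARY" or "SECONDARY"
              "AliasTarget": {
                  "HostedZoneId": target_zone_id,  # zone of the regional endpoint
                  "DNSName": target_dns,
                  "EvaluateTargetHealth": True,
              },
          }
          if health_check_id:
              record["HealthCheckId"] = health_check_id
          return {"Action": "UPSERT", "ResourceRecordSet": record}

      route53.change_resource_record_sets(
          HostedZoneId="Z_PUBLIC_ZONE_PLACEHOLDER",
          ChangeBatch={"Changes": [
              failover_alias("PRIMARY", "us-east-1",
                             "d-abc123.execute-api.us-east-1.amazonaws.com.",
                             "Z_REGIONAL_ZONE_1", health_check_id="hc-primary"),
              failover_alias("SECONDARY", "us-west-2",
                             "d-def456.execute-api.us-west-2.amazonaws.com.",
                             "Z_REGIONAL_ZONE_2"),
          ]},
      )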

  • The Story Behind an Instacart Order, Part 1: Building a Digital Catalog: Fun fact: While partners can send us inventory data at any point in the day, we receive most data dumps around 10 pm local time. Certain individual pieces of our system (like Postgres) weren’t configured to handle these 10 pm peak load times efficiently — we didn’t originally build with elastic scalability in mind. To solve this we began a “lift and shift” of our catalog infrastructure from our artisanal system to a SQL-based interface, running on top of a distributed system with inexpensive storage. We’ve decoupled compute from that storage, and in this new system, we rely on Airflow as our unified scheduler to orchestrate that work. Re-building our infrastructure now not only helps us deal with load times efficiently, it saves cost in the long run and ensures that we can make more updates every night to the catalog as we receive more product attributes and our data prioritization models evolve.
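    Since the post names Airflow as the unified scheduler, here is a hedged sketch of the kind of nightly DAG that pattern implies. Task names and callables are hypothetical, not Instacart's, and the exact operator import path varies by Airflow version:

      from datetime import datetime
      from airflow import DAG
      from airflow.operators.python import PythonOperator

      def ingest_partner_dumps():
          pass  # pull the night's inventory files into cheap object storage

      def rebuild_catalog():
          pass  # run the SQL transforms on the decoupled compute tier

      with DAG(
          dag_id="nightly_catalog_refresh",
          start_date=datetime(2019, 5, 1),
          schedule_interval="30 22 * * *",  # shortly after the ~10 pm dumps land
          catchup=False,
      ) as dag:
          ingest = PythonOperator(task_id="ingest_partner_dumps",
                                  python_callable=ingest_partner_dumps)
          rebuild = PythonOperator(task_id="rebuild_catalog",
                                   python_callable=rebuild_catalog)
          ingest >> rebuild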

  • Some risks of coordinating only sometimes: Within AWS, we are starting to settle on some patterns that help constrain the behavior of systems in the worst case. One approach is to design systems that do a constant amount of coordination, independent of the offered workload or environmental factors. This is expensive, with the constant work frequently going to waste, but worth it for resilience. Another emerging approach is designing explicitly for blast radius, strongly limiting the ability of systems to coordinate or communicate beyond some limited radius. We also design for static stability, the ability for systems to continue to operate as best they can when they aren’t able to coordinate.
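    The "constant amount of coordination" idea is simple to sketch: instead of pushing deltas whenever something changes (work that scales with churn), a follower fetches and applies the full state on a fixed cadence, so its load looks the same on a quiet day and during an incident. A toy sketch with hypothetical names, not AWS code:

      import time

      def fetch_full_config():
          """Imagine a GET of one well-known object holding the complete state."""
          return {"routes": {"a": 1, "b": 2}, "version": 42}

      def apply_config(config):
          """Idempotent: applying the same config twice is harmless."""
          print("applied version", config["version"])

      def run_follower(poll_interval_seconds=10):
          while True:
              apply_config(fetch_full_config())  # constant work, regardless of churn
              time.sleep(poll_interval_seconds)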

  • O'Reilly has branded something they call the Next Architecture: The growth we’ve seen on our online learning platform in cloud topics, in orchestration and container-related terms such as Kubernetes and Docker, and in microservices is part of a larger trend in how organizations plan, code, test, and deploy applications that we call the Next Architecture. This architecture allows fast, flexible deployment, feature flexibility, efficient use of programmer resources, and rapid adapting, including scaling, to unpredictable resource requirements. These are all goals businesses feel increasingly pressured to achieve to keep up with nimble competitors. There are four aspects of the Next Architecture: Decomposition, Cloud, Containers, and Orchestration.

  • We’ve learned from running Azure Functions that only 30% of our executions are coming from HTTP events. KEDA: bringing event-driven containers and functions to Kubernetes: With the release of KEDA, any team can now deploy function apps created using those same [Microsoft] tools directly to Kubernetes. This allows you to run Azure Functions on-premises or alongside your other Kubernetes investments without compromising on the productive serverless development experience. The open source Azure Functions runtime is available to every team and organization, and brings a world-class developer experience and programming model to Kubernetes. The combination of flexible hosting options and an open source toolset gives teams more freedom and choice. If you choose to take advantage of the full benefits of a managed serverless service, you can shift responsibility and publish your apps to Azure Functions.

  • Why would you pick Fargate over Lambda? How Far Out is AWS Fargate: If I were to describe Fargate, I’d describe it as clusterless container orchestration. Much like “serverless” means an architecture where the server has been abstracted away...Lambda is an additional layer of abstraction where if your workload can be expressed as a function and complete its work in 15 minutes or less, then it’s a great choice, especially if your workload leans towards the sporadic. But if you need more control or the limits imposed by Lambda’s abstractions pose a problem for your workload, then Fargate is worth a close look. You don’t really need to choose one or the other as they very much complement each other...Fargate is inherently simpler than Kubernetes because it only does one thing: container orchestration. And it does this very well. Everything else is provided by an external AWS service.

  • You've heard of autonomous self-driving cars? How about autonomous databases? It's a DBMS that can deploy, configure, and tune itself automatically without any human intervention. Advanced Database Systems 2019 #25: Self-Driving Database Engineering (slides): Personnel is ~50% of the TCO of a DBMS. Average DBA Salary (2017): $89,050. The scale and complexity of DBMS installations have surpassed humans. Replace DBMS components with ML models trained at runtime. True autonomous DBMSs are achievable in the next decade. You should think about how each new feature can be controlled by a machine.

  • Good explanation with a really cool white board. Latency Under Load: HBM2 vs. GDDR6. A traffic analogy is used: the more lanes you have to memory, the higher the bandwidth. Use HBM2 for the highest bandwidth and best power efficiency; the downside is that it's harder to design with and costs more. GDDR6 is a compromise, giving great performance and a wide pathway to memory.

  • microsoft/Quantum: These samples demonstrate the use of the Quantum Development Kit for a variety of different quantum computing tasks. Most samples are provided as a Visual Studio 2017 C# or F# project under the QsharpSamples.sln solution.

  • Vanilla JS: a fast, lightweight, cross-platform framework for building incredible, powerful JavaScript applications.

  • sirixdb/sirix: facilitates effective and efficient storing and querying of your temporal data through snapshotting (only ever appends changed database pages) and a novel versioning approach called sliding snapshot, which versions at the node level. Currently we support the storage and querying of XML- and JSON-documents in our binary encoding. 

  • Book of Proceedings from Internet Identity Workshop 27. There's a huge number of topics and 47 pages of notes. 

  • Gorilla: A Fast, Scalable, In-Memory Time Series Database: Gorilla optimizes for remaining highly available for writes and reads, even in the face of failures, at the expense of possibly dropping small amounts of data on the write path. To improve query efficiency, we aggressively leverage compression techniques such as delta-of-delta timestamps and XOR’d floating point values to reduce Gorilla’s storage footprint by 10x. This allows us to store Gorilla’s data in memory, reducing query latency by 73x and improving query throughput by 14x when compared to a traditional database (HBase)-backed time series data.
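    The delta-of-delta trick is easy to see in a few lines: metrics usually arrive at a fixed interval, so the delta between consecutive timestamps is nearly constant and the delta of the deltas is almost always zero, which packs into very few bits. A toy encoder/decoder follows (integer deltas only, no bit-packing; not Gorilla's actual format):

      def encode_timestamps(timestamps):
          header, prev, prev_delta, dods = timestamps[0], timestamps[0], 0, []
          for t in timestamps[1:]:
              delta = t - prev
              dods.append(delta - prev_delta)  # usually 0 on a steady interval
              prev, prev_delta = t, delta
          return header, dods

      def decode_timestamps(header, dods):
          out, prev, prev_delta = [header], header, 0
          for dod in dods:
              delta = prev_delta + dod
              out.append(prev + delta)
              prev, prev_delta = out[-1], delta
          return out

      ts = [1557500000, 1557500060, 1557500120, 1557500181, 1557500240]
      header, dods = encode_timestamps(ts)  # dods == [60, 0, 1, -2]
      assert decode_timestamps(header, dods) == ts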

  • Uber created a site collecting all their research papers in one place. You might be surprised at all the topics they cover.