Stuff The Internet Says On Scalability For September 6th, 2019

Wake up! It's HighScalability time:


Coolest or most coolest thing ever?

Do you like this sort of Stuff? I'd love your support on Patreon. I wrote Explain the Cloud Like I'm 10 for people who need to understand the cloud. And who doesn't these days? On Amazon it has 54 mostly 5 star reviews (125 on Goodreads). They'll learn a lot and likely add you to their will.

Number Stuff:

  • lots: programmers who can't actually program.
  • 2x: faster scheduling of jobs across a datacenter using reinforcement learning, a trial-and-error machine-learning technique, to tailor scheduling decisions to specific workloads in specific server clusters.
  • 300 msecs: time it takes a proposed Whole Foods biometric payment system to scan your hand and process your transaction.
  • $8 million: Slack revenue loss from 2 hours of downtime. (catchpoint email)
  • 8.4 million+: websites participating in Google's user tracking/data gathering network. It broadcasts personal data about visitors to these sites to 2,000+ companies, hundreds of billions of times a day
  • 20x: BlazingSQL faster than Apache Spark on Google Cloud Platform using NVIDIA’s T4 GPUs by loading data directly into GPU memory using GPU DataFrame (GDF).
  • 405: agencies with access to Ring data.
  • middle: age at which entrepreneurs are most successful. Youth is not a key trait of successful entrepreneurs.
  • 5: years until we have carbon nanotube chips in our computers.
  • 5 billion: DVDs shipped by Netflix over 21 years.
  • 51%: chance the world as we know it will not end by 2050.
  • 1,100: US business email compromise scams per month at a cost of $300 million.

Quotable Stuff:

  • @kennwhite: Merkle trees aren't gonna fix a low-bid state contractor unpatched 2012 IIS web server
  • Werner Vogels: To succeed in using application development to increase agility and innovation speed, organizations must adopt five elements, in any order: microservices; purpose-built databases; automated software release pipelines; a serverless operational model; and automated, continuous security. The common thing we have seen, though, is that customers who build modern applications see benefits across their entire businesses, especially in how they allocate time and resources. They spend more time on the logic that defines their business, scale up systems to meet peak customer demand easily, increase agility, and deliver new features to market faster and more often.
  • @Carnage4Life: This post points out that rents consume $1 out of every $8 of VC investment in the Bay Area.
  • @kentonwilliston: Too little, too late. RISC-V has already cornered the "open" core market IMO, and if I wanted a second option it's hard to see why I'd go with Power over others like MIPS Open
  • echopom: > Why Does Developing on Kubernetes Suck? IMHO because we are in a phase of transition. Having worked for years in the software industry, I'm convinced we are halfway to a much bigger transformation for software engineers, SREs, developers, etc. I work in a neobank (N26, Revolut, etc.); we are currently in the process of re-writing our entire Core Banking System with microservices on top of Kubernetes with Kafka. Not a single day passes without engineers needing to have an exchange about defining basically all of the terms that exist within the K8s/Docker/Kafka world: What's a Pod? How does a pod behave if Kafka goes down? Do we really need ZooKeeper? Their workflow is insanely complex and requires hours if not a day to deploy a single change... obviously let's not even talk about the amount of work our SREs have in the pipe to "package" the entire stack of 150+ services in K8s through a single YAML file.
  • millerm: I have had this thought for many years. Where is all the perfectly designed, bug free, maintenance-bliss, fully documented, fully tested, future-proofed code located so we can all marvel at its glory?
  • @dvassallo: I agree with the advice. Still, I like these PaaS experiments. There’s a big opportunity for “conceptual compression” on AWS, and I bet one day we’ll see a good PaaS/framework that would be a good choice for the average Twitter for Pets app. And I doubt that would come from AWS.
  • JPL: Atomic clocks combine a quartz crystal oscillator with an ensemble of atoms to achieve greater stability. NASA's Deep Space Atomic Clock will be off by less than a nanosecond after four days and less than a microsecond (one millionth of a second) after 10 years. This is equivalent to being off by only one second every 10 million years.
  • Nathan Schneider: Pursuing decentralization at the expense of all else is probably futile, and of questionable usefulness as well. The measure of a technology should be its capacity to engender more accountable forms of trust.
  • @tef_ebooks: docker is just static linking for millenials
  • @Hacksterio: "But it’s not until we look at @TensorFlow Lite on the @Raspberry_Pi 4 that we see the real surprise. Here we see a 3X-4X increase in inferencing speed between our original TensorFlow benchmark, and the new results using TensorFlow Lite..."
  • @cmeik: I used to think, and have for many years, that partial failure was the fundamental thing and that's what needed to be surfaced. I'm not sure I believe that anymore, I'm starting to think it's more about uncertainty instead. But, I don't know.
  • @benedictevans: Fun with maths: The Moto MC68000 CPU in the original Mac had 68k transistors. Apple sold 372k units in 1984. 68k x 372k=25.3bn The A12X SoC in an iPad Pro has 10bn transistors. So, if you’re inclined to really unfair comparisons: 3 iPads => all Macs sold in the first year
  • atombender: This meme needs to die. Kubernetes is not overkill for non-Google workloads. In my current work, we run several Kubernetes clusters via GKE on Google Cloud Platform. We're a tiny company — less than 20 nodes running web apps, microservices and search engines — but we're benefiting hugely from the operational simplicity of Kubernetes. Much, much, much better than the old fleet of Puppet-managed VMs we used to run. Having surveyed the competition (Docker Swarm, Mesos/Marathon, Rancher, Nomad, LXD, etc.), I'm also confident that Kubernetes was the right choice. Kubernetes may be a large and complex project, but the problem it solves is also complex. Its higher-level cluster primitives are vastly better adapted to modern operations than the "simple" Unix model of daemons and SSH and what not. The attraction isn't just the encapsulation that comes with containers, but the platform that virtualizes physical nodes and allows containers to be treated as ephemeral workloads, along with supporting primitives like persistent volumes, services, ingresses and secrets, and declarative rules like horizontal autoscalers and disruption budgets. Given this platform, you have a "serverless"-like magically scaling machine full of tools at your fingertips. You don't need a huge workload to benefit from that.
  • cryptica: I'm starting to think that many of the most successful tech companies of the past decade are not real monopolies but succeeded purely because the centralization of capital made it difficult for alternative projects to compete for a limited period of time. Even projects with strong network effects are unlikely to last forever.
  • Code Lime: [Hitting the same database from several microservices] almost refutes the whole philosophy of microservice architecture. They should be independent and self-contained. They should own their data and have complete freedom on how it is persisted. They are abstractions that help de-couple processes. Obviously, they come with a fair amount of overhead for this flexibility. Yet, flexibility is what you should aim for.
  • gervase: When I was running hiring at a previous startup, we ran into this issue often. When I proposed adding FizzBuzz to our screening process, I got a fair amount of pushback from the team that it was a waste of the candidates' time. Once we'd actually started using it, though, we found it filtered between 20-30% of our applicant pool, even when we let them use literally any language they desired, presumably their strongest.
  • @jensenharris: There’s no such thing as a “startup inside of a big company.” This misnomer actively misleads both big company employees working in such teams as well as people toiling in actual startups. Despite all best efforts to create megacorp “startups”, they can never exist. Here's why: 1) The most fundamental, pervasive background thread of an early-stage startup is that when it fails, everyone has to find a new job. The company is gone, kaput, relegated to the dustbin of Crunchbase. The company literally lives & dies on the work every employee does every day.
  • @math_rachel: "A company can easily lose sight of its strategy and instead focus strictly on the metrics that are meant to represent it... Wells Fargo never actually had a cross-selling strategy. It had a cross-selling metric."
  • @ben11kehoe: Don’t put your processed data in the same bucket as the raw ingested data—different lifecycle and backup requirements #sdmel19
  • Lauren Feiner: The proposed solutions focus on removing weaker players from the ecosystem and undermining the hate clusters from within. Johnson and his team suggest that, rather than attacking a highly vocal and powerful player, social media platforms remove smaller clusters and randomly remove individual members. Removing just 10% of members from a hate cluster would cause it to begin to collapse, the researchers say.
  • @mathiasverraes: Philosophy aside, the important questions are, does exposing persisted events have the same practical downsides (in the long term) as exposing state? If so, are there better mitigations? Are the downsides outweighing the upsides? I'm leaning to no, yes, no.
  • @jessitron: “building software isn't at all like assembling a car. In terms of managing growth, it's more like raising a child or tending a garden.” @KevinSimler
  • Kevin Simler: In a healthy piece of code, entropic decay is staved off by dozens of tiny interventions — bug fixes, test fixes, small refactors, migrating off a deprecated API
  • streetcat1: First, I must say that the inventors of UML saw it as the last layer. The grand vision was complete code generation from UML diagrams. And this was the overall grand vision that drove OO in general. I think this is what's happening now with the "low code" startups. The whole idea is to separate the global decisions (which are hard to change), e.g. the architecture, which classes exist, what each class does, from the local ones (e.g. which data structure to use). So you would use UML for the global decisions, and then make programming the classes almost mechanical.
  • Yegor Bugayenko: Our empirical evidence suggests even expert programmers really learn to program within a given domain. When expert programmers switch domains, they do no better than a novice. Expertise in programming is domain-specific. We can teach students to represent problems in a form the computer could solve in a single domain, but to teach them how to solve in multiple domains is a big-time investment. Our evidence suggests students graduating with a four-year undergraduate degree don't have that ability. Solving problems with a computer requires skills and knowledge different from solving them without a computer. That's computational thinking. We will never make the computer completely disappear. The interface between humans and computers will always have a mismatch, and the human will likely have to adapt to the computer to cover that mismatch. But the gap is getting smaller all the time. In the end, maybe there's not really that much to teach under this definition of computational thinking. Maybe we can just design away the need for computational thinking.
  • @aphyr: If you do this, it makes life so much easier. Strict serializability and linearizability become equivalent properties over histories of txns on maps. If you insist on making the individual r/w micro-ops linearizable, it *breaks* serializability, as we previously discussed.
  • Jennifer Riggins: Datadog itself conducts regular game days where it kills a certain service or dependency to learn what threatens resiliency. These game days are partnerships between the people building whatever’s being tested — as they know best and are initially on-call if it breaks — and a site reliability engineer. This allows the team to test monitoring and alerting, making sure that dashboards are in place and there are runbooks and docs to follow, making sure that the site reliability engineer is equipped to eventually take over.
  • dragonsh: Indeed, Uber did try to enter Indonesia and failed. They were out of most of Southeast and East Asia because they didn't have enough engineering talent to build a system for those specific countries; local companies like Grab in Singapore, Gojek in Indonesia, and Didi in China beat them. So why would you think those companies don't have the talent to build systems better suited to their own environment than Uber?
  • blackoil: We handle peaks of 800k tps in a few systems. It is for an analytical platform. Partition in Kafka by some evenly distributed key, create simple apps that read from a partition and process it, commit the offset. Avoid communication between processes/threads. Repartition using Kafka only. For some cases we had to implement sampling where the use case required highly skewed partitions. (A minimal consumer-per-partition sketch in Python follows this list of quotes.)
  • chihuahua: I was working at Amazon when the 2-pizza team idea was introduced. A week or two later, we thought "we're a 2-pizza team now, let's order some pizza". That's when we found out that there was no budget for pizza; it was merely a theoretical concept. At the time the annual "morale budget" (for food and other items) was about $5 per person. These days I think the morale budget is a bit higher; in 2013 there were birthday cakes once a month.
  • kator: Another thing that often gets overlooked is the concept of "Single Threaded Owner". I'm an STO on a topic, that means I write and communicate the known truth and our strategy and plans, I participate in discussions around that topic, I talk to customers about it, I read industry news and leverage my own experience in that topic. Others know me as that STO and reach out to me with related topics if something makes sense to me in my topic area then I try to address it, if not I connect the person with another STO I think would be interested in their idea or problem. Success at Amazon is deeply driven by networking, we have an internal tool called Phonetool which allows you to quickly navigate the company and find people who are close to the topic you have in mind. I keep thinking it's like the six degrees of separation concept, if somebody doesn't know the topic they know someone who is closer to the topic, within a couple of emails you are in a conversation with someone on the other side of the company who is passionate, fired up and knows more about the topic than you thought could be known. They're excited to talk to you about their topic and teach you or learn from your new idea related to their area of focus.
  • Const-me: You know what is a waste of my time? When I wrote a good answer to a question which I think is fine, which then goes to oblivion because some other people, who often have absolutely no clue what’s asked, decide the question is not good enough.
  • Matt Parsons: Names can’t transmit meaning. They can transmit a pointer, though, which might point to some meaning. If that meaning isn’t the right meaning, then the recipient will misunderstand. Misunderstandings like this can be difficult to track down, because our brains don’t give us a type error with a line and column number to look at. Instead, we just feel confused, and we have to dig through our concept graph to figure out what’s missing or wrong.
  • Avast: The findings from the analysis of the obtained snapshot of the C&C server were quite surprising. All of the executable files on the server were infected with the Neshta file infector. The authors of Retadup accidentally infected themselves with another malware strain. This only proves a point that we have been trying to make – in good humor – for a long time: malware authors should use robust antivirus protection.
  • @greenbirdIT: According to 1 study, computational resources required to train large AI models is doubling every three to four months.
  • @ACLU: Amazon wants to connect doorbell cameras to facial recognition databases, with the ability to call the police if any “suspicious” people are detected.
  • @esh: Kira just discovered the joy of increasing the AWS Lambda MemorySize from the default of 128 to 1792, resulting in the use of a full CPU and a much faster response time. Her Slack command now answers in 2.5 seconds instead of 35 seconds. And the cool thing is that it costs the same to run it faster. (The GB-second arithmetic behind "costs the same" is sketched after this list of quotes.)
  • @johncutlefish: “We know something is working when we spend longer on it, instead of shorter. Her team was delivering into production daily/weekly. They could have easily bragged about how “quickly” they “move things to done”. But she didn’t”
  • tilolebo: Isn't it possible to just set the haproxy maxconn to a slightly lower value than what the backend can deal with, and then let the reverse proxy retry once with another backend? Or even queue it for some hundreds of milliseconds before that? This way you avoid overloading backends. Also, haproxy provides tons of real-time metrics, including the high-water mark for concurrent and queued connections.
  • ignoramous: Scheduled tasks are a great way to brown-out your downstream dependencies. In one instance, mdadm RAID checks caused P99 latency spikes on the first Sunday of every month [0] (the default setting). It caused a lot of pain to our customers until the check was IO throttled, which meant spikes weren't as high, but lasted for a longer time. Scheduled tasks are a great way to brown-out yourself.
  • @sfiscience: "The universe is not a menu. There's no reason to think it's full of planets just waiting for humans to turn up. For most of Earth's history, it hasn't been comfortable for humans." - Olivia Judson
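
  A couple of the quotes above are concrete enough to sketch in code. First, blackoil's pattern: partition by an evenly distributed key, run one small consumer app per partition, and commit offsets only after processing. This is a minimal sketch assuming kafka-python; the topic name, key field, and process() body are placeholders, not the poster's actual system.

      from kafka import KafkaConsumer, KafkaProducer, TopicPartition
      import json

      def process(event):
          pass  # business logic here; keep it free of cross-process coordination

      def produce(producer: KafkaProducer, record: dict):
          # Partition by an evenly distributed key so load spreads across partitions.
          producer.send("events", key=record["user_id"].encode(),
                        value=json.dumps(record).encode())

      def consume(partition: int):
          # One simple, single-threaded app per partition: read, process, commit.
          consumer = KafkaConsumer(
              bootstrap_servers="localhost:9092",
              group_id="analytics",
              enable_auto_commit=False,  # commit only after processing succeeds
          )
          consumer.assign([TopicPartition("events", partition)])
          for msg in consumer:
              process(json.loads(msg.value))
              consumer.commit()  # at-least-once delivery, no shared state between workers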
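  Second, the arithmetic behind @esh's "costs the same to run it faster": Lambda bills by GB-second, and 128 MB for 35 seconds is exactly the same number of GB-seconds as 1792 MB for 2.5 seconds. A quick check (the per-GB-second rate is the published 2019 on-demand price quoted from memory, so treat it as approximate):

      gb_s_slow = (128 / 1024) * 35.0   # 4.375 GB-seconds
      gb_s_fast = (1792 / 1024) * 2.5   # 4.375 GB-seconds
      price_per_gb_s = 0.00001667       # USD, approximate 2019 Lambda pricing
      print(gb_s_slow * price_per_gb_s == gb_s_fast * price_per_gb_s)  # True: same bill, 14x faster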

Useful Stuff:

  • Always on the lookout for examples from different stacks. Here's a new power couple: Using Backblaze B2 and Cloudflare Workers for free image hosting. It looks pretty straightforward and even better: "Everything I've mentioned in the post is 100% free, assuming you stay within reasonable limits." Backblaze B2 includes 10GB of storage for free and charges $0.005/GB/month thereafter. Cloudflare Workers also offers a free tier, which includes 100,000 requests every 24 hours, with a maximum of 1,000 requests every 10 minutes. Also, Migrating 23TB from S3 to B2 in just 7 hours

  • Anything C can do Rust can do better. Well, not quite yet. Intel's Josh Triplett on what it would take for Rust to be comparable to C. Rust is 4 years old. Rust needs full parity with C to support the long tail of systems software. Rust has automatic memory management without GC; calls to free are inserted by the compiler at compile time. Like C, Rust does not have a runtime. Unlike C, Rust has safe concurrent programming; the memory safety makes it easier to implement safe concurrency. Rust would have prevented 73% of Mozilla's security bugs. Rust needs better C interoperability. Rust needs to improve code size by not linking in unused code. It needs to support inline assembly. It needs safe SIMD intrinsics. It needs to support bfloat16 to minimize storage space and bandwidth for floating point calcs.

  • Apparently we now need 2FA all the way down. Fraudsters Used AI to Mimic CEO’s Voice in Unusual Cybercrime Case: Criminals used artificial intelligence-based software to impersonate a chief executive’s voice and demand a fraudulent transfer of €220,000 ($243,000) in March in what cybercrime experts described as an unusual case of artificial intelligence being used in hacking.

  • What's Mars Solar Conjunction, and Why Does It Matter? For 10 days we won't talk to devices on Mars. Why? "because Mars and Earth will be on opposite sides of the Sun, a period known as Mars solar conjunction. The Sun expels hot, ionized gas from its corona, which extends far into space. During solar conjunction, this gas can interfere with radio signals when engineers try to communicate with spacecraft at Mars, corrupting commands and resulting in unexpected behavior from our deep space explorers. To be safe, engineers hold off on sending commands when Mars disappears far enough behind the Sun's corona that there's increased risk of radio interference." This period when commands are not sent is called a "command moratorium." Talk about a maintenance window! This is the kind of thing Delay-tolerant networking has to take into account. Machines will need enough native intelligence to survive without human guiding hands.

  • Ready for a new aaS? iPaaS is integration platform as a service: iPaaS lets you connect anything to anything, in any way, and anywhere. iPaaS works very well in huge enterprise environments that need to integrate a lot of on-premises applications and cloud-based applications or data providers.

  • Living life on the edge. Macquarie Bank replaced 60 EC2 instances with code running at Lambda@Edge for lower latency and an 80% cost savings. At the edge, before a response goes back to the client, they inject a few headers: HSTS to require encryption, and X-Frame-Options to prevent pages from being loaded in an iframe, which protects against clickjacking attacks. They also validate the JWT token and redirect to a login page if it's invalid. WAF and Shield are also used for protection.
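
    The write-up doesn't include code, but the header-injection step is tiny. Here's a minimal sketch of what it might look like as a Lambda@Edge origin-response handler in Python; the exact header values and the choice of the Python runtime are assumptions, and the JWT check would live in a separate viewer-request handler.

        def handler(event, context):
            # Runs at the CloudFront edge on the origin-response trigger:
            # add security headers before the response returns to the viewer.
            response = event["Records"][0]["cf"]["response"]
            headers = response["headers"]
            headers["strict-transport-security"] = [{
                "key": "Strict-Transport-Security",
                "value": "max-age=63072000; includeSubDomains; preload",
            }]
            headers["x-frame-options"] = [{"key": "X-Frame-Options", "value": "DENY"}]
            return response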

  • I had no idea you could and should prune lambda versions. The Dark Side of AWS Lambda: Lambda versions every function. When you couple CI/CD with rapid development and Lambda functions, you get many versions. Hundreds even. And Lambda code storage is limited to 75GB. We hit that limit, and we hit it hard. AWS does allow you to delete specified versions of functions that are no longer in use. 
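
    A hedged sketch of the pruning idea with boto3: list a function's versions, keep $LATEST, alias targets, and the most recent few, and delete the rest. The function name and retention count are placeholders; verify nothing else references old versions before running anything like this.

        import boto3

        def prune_versions(function_name, keep_latest=5):
            """Delete old versions of one Lambda function, sparing $LATEST,
            alias targets, and the newest `keep_latest` numbered versions."""
            lam = boto3.client("lambda")
            aliased = {a["FunctionVersion"]
                       for a in lam.list_aliases(FunctionName=function_name)["Aliases"]}
            versions = []
            for page in lam.get_paginator("list_versions_by_function").paginate(
                    FunctionName=function_name):
                versions.extend(v["Version"] for v in page["Versions"])
            numbered = sorted((v for v in versions if v != "$LATEST"), key=int)
            for version in numbered[:-keep_latest]:
                if version not in aliased:
                    lam.delete_function(FunctionName=function_name, Qualifier=version)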

  • Was Etsy too good to be true? Platforms follow a life cycle: 
    • Most platform users don't earn a living: Though he once dreamed of Etsy sellers making their livings selling things they made themselves, he knows now that was never really what happened for the vast majority. Even when he was CEO and things were small and maybe idyllic, only a fraction of a percentage of sellers were making more than $30,000 a year. 
    • The original platform niche is abandoned as the platform searches for more revenue by broadening its audience: “It’s just a place to sell now,” Topolski says, delineating her personal relationship with the platform that built her business and helped her find the community that makes up much of her world.
    • Platform costs shift to users: “I get it, Etsy as a whole needs to be competitive in a marketplace that’s completely shifted toward being convenient,” she tells me. “But it’s a financial issue for people like me whose products are extremely expensive to ship. All of a sudden my items are $10 to $15 more expensive, but I didn’t add any value to justify that pricing.” 
    • User margins become a source of platform profits: before Silverman took over, an Etsy executive told Forbes that more than 50 percent of Etsy’s revenue comes from seller services, like its proprietary payment processing system, which takes a fee of 3 percent, plus 25 cents per US transaction (the company made it the mandatory default option in May, removing the option for sellers to use individual PayPal accounts). New advertising options and customer support features in Etsy Plus — available to sellers willing to pay a $10 monthly fee — expand on that.
    • An edifice complex often signals the end: One moment that sticks out in her mind: a tour of Etsy’s new nine-story, 200,000-square-foot offices in Brooklyn’s Dumbo neighborhood, which opened in the spring of 2016. “I remember immediately getting this sinking feeling that none of it was for us,” she says. It didn’t seem like the type of place she could show up for a casual lunch. It was nice that the building was environmentally-friendly, that it was big and beautiful. It was weird that there was so much more security and less crafting, replaced by the sleek lines of a grown-up startup.
    • Users become just another metric/kpi: “We’re the heart of the company, creating literally all content and revenue,” she says, “and suddenly we weren’t particularly welcome anymore.”
    • Pay to play. The platform starts charging for once organic features. For example, on Amazon you now have to pay for advertising to have your product surfaced during search. This reduces margins.
    • Obey or pay. The platform exerts control by requiring users to behave in ways that benefit the platform. Failure to obey results in penalties.
    • Private label brands. The platform introduces its own products to compete with yours. And since platform products don't need to follow the same rules or incur the same costs as platform users they cannibalize user sales.
    • Allow knock-off products. Since increasing sales volume is what matters most, the platform is disincentivized from removing knock-off products.
    • Monetization over customer experience. Money becomes the most important design driver and the customer experience suffers.

  • koreth: I have been working on an ES/CQRS system for about 4 years and enjoy it...it’s a payment-processing service. 
    • What have the costs been? The message-based model is kind of viral. Ramping up new engineers takes longer, because many of them also have never seen this kind of system before. Debugging gets harder because you can no longer look at a simple stack trace.  I’ve had to explain why I’m spending time hacking on the framework rather than working on the business logic.
    • What have the benefits been? The fact that the inputs and outputs are constrained makes it phenomenally easier to write meaningful, non-brittle black-box unit tests. Having the ability to replay the event log makes it easy to construct new views for efficient querying. Debugging gets easier because you have an audit trail of events and you can often suck the relevant events into a development environment and replay them. Almost nothing had to change in the application code when we went from a single-node-with-hot-standby configuration to a multiple-active-nodes configuration. The audit trail is the source of truth, not a tacked-on extra thing that might be wrong or incomplete. 
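
    For anyone who hasn't seen this style, here's a toy sketch of the core mechanics koreth is describing: an append-only event log as the source of truth, with read views rebuilt by replay. Purely illustrative Python; the event names and the balance view are invented and have no relation to koreth's payment system.

        from dataclasses import dataclass, field

        @dataclass
        class Event:
            kind: str      # e.g. "PaymentCaptured"
            payload: dict

        @dataclass
        class EventStore:
            log: list = field(default_factory=list)  # append-only; doubles as the audit trail

            def append(self, event: Event):
                self.log.append(event)

        class BalanceView:
            """A read model; cheap to rebuild at any time by replaying the log."""
            def __init__(self):
                self.balances = {}

            def apply(self, event: Event):
                if event.kind == "PaymentCaptured":
                    acct = event.payload["account"]
                    self.balances[acct] = self.balances.get(acct, 0) + event.payload["amount"]

            def replay(self, store: EventStore):
                self.balances.clear()
                for event in store.log:
                    self.apply(event)

    Replay is what makes the benefits above cheap: a new view is just a new apply() over the same log, and debugging means pulling the relevant events into a development environment and applying them again.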

  • How long before SSDs replace HDDs? DSHR says a lot longer than you might think. The power and cooling costs over an HDD's service life come to only around 20% of its purchase cost, so speed isn't as important as low $/TB. Speed in nearline is nice, but it isn't what the nearline tier is for. At 5x the $/TB of HDDs, SSDs' cost won't justify wholesale replacement of the nearline tier. The recent drop in SSD prices reflects the transition to 3D flash. The transition to 4D flash is far from imminent, so this is a one-time effect.

  • As soon as you have the concept of a transaction --- a group of read and write operations --- you need to have rules for what happens during the timeline between the first of the operations of the group and the last of the operations of the group. An explanation of the difference between Isolation levels vs. Consistency levels: Database isolation refers to the ability of a database to allow a transaction to execute as if there are no other concurrently running transactions (even though in reality there can be a large number of concurrently running transactions). The overarching goal is to prevent reads and writes of temporary, incomplete, aborted, or otherwise incorrect data written by concurrent transactions. Database consistency is defined in different ways depending on the context, but when a modern system offers multiple consistency levels, they define consistency in terms of the client view of the database. If two clients can see different states at the same point in time, we say that their view of the database is inconsistent (or, more euphemistically, operating at a “reduced consistency level”). Even if they see the same state, but that state does not reflect writes that are known to have committed previously, their view is inconsistent with the known state of the database. 
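
    A toy illustration of the distinction (not from the article): every write below is a committed, fully isolated single-key transaction, yet a client reading the lagging replica sees a state that doesn't reflect a known committed write, i.e. a reduced consistency level. The half-second replication lag is an invented stand-in for whatever asynchrony a real system has.

        import threading, time

        class Replica:
            def __init__(self):
                self.data = {}

        class LaggyReplicatedStore:
            """Primary plus an asynchronously replicated replica."""
            def __init__(self, lag=0.5):
                self.primary, self.replica, self.lag = Replica(), Replica(), lag

            def write(self, key, value):  # a committed, perfectly isolated transaction
                self.primary.data[key] = value
                threading.Timer(self.lag, self.replica.data.__setitem__, (key, value)).start()

            def read_primary(self, key):
                return self.primary.data.get(key)

            def read_replica(self, key):
                return self.replica.data.get(key)

        store = LaggyReplicatedStore()
        store.write("x", 1)                 # client A commits x=1
        print(store.read_primary("x"))      # client A sees 1
        print(store.read_replica("x"))      # client B sees None: the two views are inconsistent
        time.sleep(0.6)
        print(store.read_replica("x"))      # replication catches up: 1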

  • How We Manage a Million Push Notifications an Hour. Key idea: Each time we found a point which needed to handle multiple implementations of the same core logic, we put it behind a dedicated service: Multiple devices for a user were put behind the token service. Multiple applications were given a common interface on the notification server. Multiple providers were handled by individual job queues and notification workers. Also, Rust at OneSignal
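
    The shape of that idea, as a rough Python sketch (OneSignal's workers are in Rust per the linked post, and these class and method names are invented): one interface per point of variation, with a token service hiding "multiple devices per user" behind a single call.

        from abc import ABC, abstractmethod

        class PushProvider(ABC):
            """One implementation per vendor; each gets its own queue and worker
            so a slow provider can't stall the others."""
            @abstractmethod
            def send(self, device_token: str, payload: dict) -> None: ...

        class ApnsProvider(PushProvider):
            def send(self, device_token, payload):
                ...  # call Apple's APNs API here

        class FcmProvider(PushProvider):
            def send(self, device_token, payload):
                ...  # call Firebase Cloud Messaging here

        def notification_worker(queue, provider: PushProvider, token_service):
            # The token service hides how many devices a user has registered.
            for user_id, payload in queue:
                for token in token_service.tokens_for(user_id):
                    provider.send(token, payload)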

  • Having attended more than a few Hadoop meetups, this was like reading that a young friend was moving into a retirement home. What happened to Hadoop.
    • Something happened within the big data world to erode Hadoop’s foundation of a distributed file system (HDFS) coupled with a compute engine for running MapReduce (the original Hadoop programming model) jobs: 
      • Mobile phones became smartphones and began generating streams of real-time data.
      • Companies were reminded that they had already invested untold billions in relational database and data warehouse technologies
      • Competitive or, at least, alternative projects such as Apache Spark began to spring up from companies, universities, and web companies trying to push Hadoop, and the whole idea of big data, beyond its early limitations.
      • Venture capital flowed into big data startups. 
      • Open source, now very much in the mainstream of enterprise IT, was getting better
      • Cloud computing took over the world, making it easier not just to virtually provision servers, but also to store data cheaply and to use managed services that tackle specific use cases.
      • Docker and Kubernetes were born. Together, they opened people’s eyes to a new way of packaging and managing applications and infrastructure
      • Microservices became the de facto architecture for modern applications
    • What are the new trends?
      • Streaming data and event-driven architectures are rising in popularity. 
      • Apache Kafka is becoming the nervous system for more data architectures.
      • Cloud computing dominates infrastructure, storage, and data-analysis and AI services.
      • Relational databases — including data warehouses — are not going anywhere.
      • Kubernetes is becoming the default orchestration layer for everything.

  • 6 Lessons we learned when debugging a scaling problem on GitLab.com: But the biggest lesson is that when large numbers of people schedule jobs at round numbers on the clock, it leads to really interesting scaling problems for centralized service providers like GitLab. If you're one of them, you might like to consider putting in a random sleep of maybe 30 seconds at the start, or pick a random time during the hour and put in the random sleep, just to be polite and fight the tyranny of the clock.
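
    The "random sleep" advice is a one-liner; here's a minimal sketch for a cron-style job, with the 30-second bound taken from the article and everything else illustrative:

        import random, time

        def run_scheduled_job(job, max_jitter_seconds=30):
            # Be polite to shared services: don't fire exactly on the hour
            # along with everyone else's cron jobs.
            time.sleep(random.uniform(0, max_jitter_seconds))
            job()
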
  • Federated GraphQL Server at Scale: Zillow Rental Manager Real-time Chat Application: We share how we try to achieve developer productivity and synergy between different teams by having a federated GraphQL server...we decided to go with a full-fledged GraphQL, Node, React-Typescript application which would form the frontend part of the satellite...Both Rental Manager and Renter Hub talk to the Satellite GraphQL server (an express-graphql server), which maps requests to the appropriate endpoint in the Satellite API after passing through the authentication service for each module...We implemented a layered approach where each module houses multiple features and each feature has its own schema, resolvers, tests, and services. This strategy allows us to isolate each feature into its own folder and then stitch everything together at the root of our server. Each feature has its own schema and is written in a file with a .graphql extension so that we can leverage all the developer tooling around GraphQL.
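
    Zillow's implementation is Node (express-graphql), but the layering idea, each feature owning its own .graphql schema and resolvers that get stitched together at the server root, translates to other stacks. A rough sketch using the Python GraphQL library Ariadne; the file layout, field names, and satellite_api client are invented for illustration:

        from ariadne import load_schema_from_path, make_executable_schema, QueryType
        from ariadne.asgi import GraphQL

        # Each feature folder owns its schema.graphql, resolvers, tests, and services.
        type_defs = [
            load_schema_from_path("features/listings/schema.graphql"),
            load_schema_from_path("features/chat/schema.graphql"),
        ]

        query = QueryType()

        @query.field("conversations")
        def resolve_conversations(_, info, renterId):
            # Forward to the satellite API after the auth service has vetted the request.
            return info.context["satellite_api"].conversations_for(renterId)

        schema = make_executable_schema(type_defs, query)
        app = GraphQL(schema)  # mount behind the existing authentication middleware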

Soft Stuff:

  • cloudstateio/cloudstate: a standards effort defining a specification, protocol, and reference implementation, aiming to extend the promise of Serverless and its Developer Experience to general-purpose application development. CloudState builds on and extends the traditional stateless FaaS model by adding support for long-lived addressable stateful services and a way of accessing mapped well-formed data via gRPC, while allowing for a range of different consistency models—from strong to eventual consistency—based on the nature of the data and how it should be processed, managed, and stored.
  • TULIPP (article): makes it possible to develop energy-efficient embedded image processing systems more quickly and less expensively, with a drastic reduction in time-to-market. The results are impressive: the processing, which originally took several seconds to analyze a single image on a high-end PC, can now run on the drone in real time, i.e. approximately 30 images are analyzed per second. The speed of the pedestrian detection algorithm could be increased by a factor of 100: now the system can analyze 14 images per second compared to one image every seven seconds. Enhancement of X-ray image quality by applying noise-removing image filters allowed reducing the intensity of radiation during surgical operations to one fourth of the previous level. At the same time, energy consumption could be significantly reduced for all three applications.

Pub Stuff:

  • A link layer protocol for quantum networks: Here, we take the first step from a physics experiment to a quantum internet system. We propose a functional allocation of a quantum network stack, and construct the first physical and link layer protocols that turn ad-hoc physics experiments producing heralded entanglement between quantum processors into a well-defined and robust service. This lays the groundwork for designing and implementing scalable control and application protocols in platform-independent software.
  • How Chemistry Computes: Language Recognition by Non-Biochemical Chemical Automata. From Finite Automata to Turing Machines: Our Turing machine uses the Belousov-Zhabotinsky chemical reaction and checks the same symbol in an Avogadro's number of processors. Our findings have implications for chemical and general computing, artificial intelligence, bioengineering, the study of the origin and presence of life on other planets, and for artificial biology.
  • Choosing a cloud DBMS: architectures and tradeoffs: My key takeaways as a TL;DR: Store your data in S3; Use portable data format that gives you future flexibility to process it with multiple different systems (e.g. ORC or Parquet); Use Athena for workloads it can support (Athena could not run 4 of the 22 TPC-H queries, and Spectrum could not run 2 of them), especially if you are doing less frequent ad-hoc queries.
  • The Art Of PostgreSQL: is the new edition of my previous release, Mastering PostgreSQL in Application Development. It contains mostly fixes to the old content, a new title, and a new book design (PDF and paperback). Content wise, The Art of PostgreSQL also comes with a new whole chapter about PostgreSQL Extensions.
  • TeaVaR: Striking the Right Utilization-Availability Balance in WAN Traffic Engineering: We advocate a novel approach to this challenge that draws inspiration from financial risk theory: leverage empirical data to generate a probabilistic model of network failures and maximize bandwidth allocation to network users subject to an operator-specified availability target. Our approach enables network operators to strike the utilization-availability balance that best suits their goals and operational reality. We present TeaVaR (Traffic Engineering Applying Value at Risk), a system that realizes this risk management approach to traffic engineering (TE).