Monday
Apr 16, 2018

Google: A Collection of Best Practices for Production Services

This excerpt—Appendix B - A Collection of Best Practices for Production Services—is from Google's awesome book on Site Reliability Engineering. Worth reading if it hasn't been on your radar. And it's free!

 

Fail Sanely

Sanitize and validate configuration inputs, and respond to implausible inputs by both continuing to operate in the previous state and alerting to the receipt of bad input. Bad input often falls into one of these categories:

Incorrect data

Validate both syntax and, if possible, semantics. Watch for empty data and partial or truncated data (e.g., alert if the configuration is N% smaller than the previous version).

Delayed data

This may invalidate current data due to timeouts. Alert well before the data is expected to expire.

Fail in a way that preserves function, possibly at the expense of being overly permissive or overly simplistic. We’ve found that it’s generally safer for systems to continue functioning with their previous configuration and await a human’s approval before using the new, perhaps invalid, data.
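The idea of rejecting implausible input while continuing to serve the previous configuration can be sketched as follows. This is a minimal illustration, not Google's implementation: the JSON format, the `ConfigHolder` and `validate_candidate` names, and the 50% shrink threshold are all assumptions made for the example, and the human-approval step mentioned above is not modeled.

```python
import json
import logging

logger = logging.getLogger("config_guard")

# Hypothetical threshold: reject a candidate that is more than 50% smaller
# than the configuration currently in use (possible truncation).
MAX_SHRINK_FRACTION = 0.5


class ConfigRejected(Exception):
    """Raised when a candidate configuration fails a sanity check."""


def validate_candidate(raw: str, previous_raw: str) -> dict:
    """Return the parsed candidate, or raise ConfigRejected."""
    if not raw.strip():
        raise ConfigRejected("candidate is empty")
    try:
        parsed = json.loads(raw)  # syntax check
    except json.JSONDecodeError as exc:
        raise ConfigRejected(f"syntax error: {exc}") from exc
    if not isinstance(parsed, dict) or not parsed:
        raise ConfigRejected("candidate has no entries")
    # Partial or truncated data check: compare size against the previous version.
    if len(raw) < len(previous_raw) * (1 - MAX_SHRINK_FRACTION):
        raise ConfigRejected("candidate is much smaller than the previous version")
    return parsed


class ConfigHolder:
    """Keeps serving the last good configuration when a new one is rejected."""

    def __init__(self, initial_raw: str):
        self.raw = initial_raw
        self.config = json.loads(initial_raw)

    def offer(self, candidate_raw: str) -> bool:
        """Try to adopt a new configuration; return True if it was accepted."""
        try:
            parsed = validate_candidate(candidate_raw, self.raw)
        except ConfigRejected as reason:
            # Alert on bad input, and keep operating in the previous state.
            logger.error("rejected new configuration, keeping previous: %s", reason)
            return False
        self.raw, self.config = candidate_raw, parsed
        return True
```

A caller would construct `ConfigHolder` once with a known-good configuration and pass every new file through `offer()`. A rejected file leaves `holder.config` untouched, so the service keeps functioning on its previous data.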

Examples

In 2005, Google’s global DNS load- and latency-balancing system received an empty DNS entry file as a result of file permissions. It accepted this empty file and served NXDOMAIN for six minutes for all Google properties. In response, the system now performs a number of sanity checks on new configurations, including confirming the presence of virtual IPs for google.com, and will continue serving the previous DNS entries until it receives a new file that passes its input checks.

In 2009, incorrect (but valid) data led to Google marking the entire Web as containing malware [May09]. A configuration file containing the list of suspect URLs was replaced by a single forward slash character (/), which matched all URLs. Checks for dramatic changes in file size and checks to see whether the configuration is matching sites that are believed unlikely to contain malware would have prevented this from reaching production.
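The two checks suggested for the 2009 incident (a dramatic size change, and matches against sites believed to be clean) might look like the sketch below. Everything here is hypothetical: the canary URLs, the thresholds, and the use of plain substring matching are assumptions for illustration, since the source does not describe how the real system matched URLs.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical canary list: sites believed unlikely to contain malware.
KNOWN_GOOD_URLS = (
    "https://www.google.com/",
    "https://en.wikipedia.org/wiki/Main_Page",
)

# Hypothetical bounds on how much the pattern list may shrink or grow.
MIN_SIZE_RATIO = 0.1
MAX_SIZE_RATIO = 10.0


def canary_hits(
    patterns: Iterable[str], canaries: Sequence[str] = KNOWN_GOOD_URLS
) -> List[Tuple[str, str]]:
    """Return (pattern, url) pairs where a suspect pattern matches a canary.

    Matching is assumed to be a substring test, so "/" (or an empty
    pattern) matches every URL and is reported here.
    """
    return [(p, u) for p in patterns for u in canaries if p in u]


def size_change_ok(new_size: int, old_size: int) -> bool:
    """Reject dramatic changes in total size relative to the old list."""
    if old_size == 0:
        return True  # nothing to compare against
    ratio = new_size / old_size
    return MIN_SIZE_RATIO <= ratio <= MAX_SIZE_RATIO


def safe_to_deploy(new_patterns: Iterable[str], old_patterns: Iterable[str]) -> bool:
    """Semantic sanity check for a syntactically valid pattern list."""
    new_list, old_list = list(new_patterns), list(old_patterns)
    new_size = sum(len(p) for p in new_list)
    old_size = sum(len(p) for p in old_list)
    return size_change_ok(new_size, old_size) and not canary_hits(new_list)
```

With these assumptions, a list consisting only of `"/"` fails both checks: it is far smaller than any realistic previous list, and it matches every canary URL.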

Progressive Rollouts

Click to read more ...

Friday
Apr 13, 2018

Stuff The Internet Says On Scalability For April 13th, 2018

Hey, it's HighScalability time:

 

Bathroom tile? Grandma's needlepoint? Nope. It's a diagram of the dark web. Looks surprisingly like a tumor.

If you like this sort of Stuff then please support me on Patreon. And I'd appreciate if you would recommend my new book—Explain the Cloud Like I'm 10—to anyone who needs to understand the cloud (who doesn't?). I think they'll learn a lot, even if they're already familiar with the basics. 

  • $23 billion: Amazon spend on R&D in 2017; $0.04: cost to unhash your email address; $35: build your own LIDAR; 66%: links to popular sites on Twitter come from bots; 60.73%: companies report JavaScript as primary language; 11,000+: object dataset provides real objects with associated depth information; 150 years: age of the idea of privacy; ~30%: AV1's better video compression; 100s of years: rare-earth materials found underneath Japanese waters; 67%: better image compression using Generative Adversarial Networks; 1000 bit/sec: data exfiltrated from air-gapped computers through power lines using conducted emissions;

  • Quotable Quotes:
    • @Susan_Hennessey: Less than two months ago, Apple announced its decision to move mainland Chinese iCloud data to state-run servers.
    • @PaulTassi: Ninja's New 'Fortnite' Twitch Records: 5 Million Followers, 250,000 Subs, $875,000+ A Month via @forbes
    • @iamtrask: Anonymous Proof-of-Stake and Anonymous, Decentralized Betting markets are fundamentally rule by the rich. If you can write a big enough check, you can cause anything to happen. I fundamentally disagree that these mechanisms create fair and transparent markets.
    • David Rosenthal: The redundancy needed for protection is frequently less than the natural redundancy in the uncompressed file. The major threat to stored data is economic, so compressing files before erasure coding them for storage will typically reduce cost and thus enhance data survivability.
    • @mjpt777: The more I program with threads the more I come to realise they are a tool of last resort.
    • JPEG XS~ For the first time in the history of image coding, we are compressing less in order to better preserve quality, and we are making the process faster while using less energy. Expected to be useful for virtual reality, augmented reality, space imagery, self-driving cars, and professional movie editing.
    • Martin Thompson: 5+ years ago it was pretty common for folks to modify the Linux kernel or run cut down OS implementations when pushing the edge of HFT. These days the really fast stuff is all in FPGAs in the switches. However there is still work done on isolating threads to their own exclusive cores. This is often done by exchanges or those who want good predictable performance but not necessarily be the best. A simple way I have to look at it. You are either predator or prey. If predator then you are mostly likely on FPGAs and doing some pretty advanced stuff. If prey then you don't want to be at the back of the herd where you get picked off. For the avoidance of doubt if you are not sure if you are prey or predator then you are prey. ;-)
    • Brian Granatir: serverless now makes event-driven architecture and microservices not only a reality, but almost a necessity. Viewing your system as a series of events will allow for resilient design and efficient expansion. DevOps is dead. Serverless systems (with proper non-destructive, deterministic data management and testing) means that we’re just developers again! No calls at 2am because some server got stuck? 
    • @chrismunns: I think almost 90% of the best practices of #serverless are general development best practices. be good at DevOps in general and you'll be good at serverless with just a bit of effort
    • David Gerard: Bitcoin has failed every aspiration that Satoshi Nakamoto had for it. 
    • @joshelman: Fortnite is a giant hit. Will be bigger than most all movies this year. 
    • @swardley: To put it mildly, the reduction in obscurity of cost through serverless will change the way we develop, build, refactor, invest, monitor, operate, organise & commercialise almost everything. Micro services is a storm in a tea cup compared to this category 5.
    • James Clear: The 1 Percent Rule is not merely a reference to the fact that small differences accumulate into significant advantages, but also to the idea that those who are one percent better rule their respective fields and industries. Thus, the process of accumulative advantage is the hidden engine that drives the 80/20 Rule.
    • Ólafur Arnalds: MIDI is the greatest form of art.
    • Abraham Lincoln: Give me six hours to chop down a tree and I will spend the first four sharpening the axe.
    • @RichardWarburto: Pretty interesting that async/await is listed as essentially a sequential programming paradigm.
    • @PatrickMcFadin: "Most everyone doing something at scale is probably using #cassandra" Oh. Except for @EpicGames and @FortniteGame They went with MongoDB. 
    • Meetup: In the CloudWatch screenshot above, you can see what happened. DynamoDB (the graph on the top) happily handled 20 million writes per hour, but our error rate on Lambda (the red line in the graph on the bottom) was spiking as soon as we went above 1 million/hour invocations, and we were not being throttled. Looking at the logs, we quickly understood what was happening. We were overwhelming the S3 bucket with PUT requests
    • Sarah Zhang: By looking at the polarization pattern in water and the exact time and date a reading was taken, Gruev realized they could estimate their location in the world. Could marine animals be using these polarization patterns to navigate through the ocean? 
    • Vinod Khosla: I have gone through an exercise of trying to just see if I could find a large innovation coming out of big companies in the last twenty five years, a major innovation (there’s plenty of minor innovations, incremental innovations that come out of big companies), but I couldn’t find one in the last twenty five years.
    • Click through for lots more quotes.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Apr 10, 2018

Sponsored Post: InMemory.Net, Educative, Triplebyte, Exoscale, Loupe, Etleap, Aerospike, Scalyr, Domino Data Lab, MemSQL

Who's Hiring? 

  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • 5 Signs You’ve Outgrown DynamoDB. Companies often select a database that seems to be the best choice at first glance, as well as the path of least resistance, and then are subsequently surprised by cost overruns and technology limitations that quickly hinder productivity and put the business at risk. This seems to be the case with many enterprises that chose Amazon Web Services’ (AWS) DynamoDB. In this white paper we’ll cover elements of costing as well as the results of benchmark-based testing. Read 5 Signs You’ve Outgrown DynamoDB to determine if your organization has outgrown this technology.

  • Advertise your event here!

Cool Products and Services

  • Datadog combines metrics, distributed tracing, and logs to monitor cloud-scale infrastructure and applications all in one place. With out-of-the-box dashboards and seamless integrations with over 200 technologies, Datadog provides end-to-end visibility into your systems’ health and performance at scale. Build your own rich dashboards, set alerts to identify anomalies, and collaborate with your team to troubleshoot and fix issues fast. Start a free trial and try it yourself.

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data and supports standard SQL for querying data. http://InMemory.Net
  • For heads of IT/Engineering responsible for building an analytics infrastructure, Etleap is an ETL solution for creating perfect data pipelines from day one. Unlike older enterprise solutions, Etleap doesn’t require extensive engineering work to set up, maintain, and scale. It automates most ETL setup and maintenance work, and simplifies the rest into 10-minute tasks that analysts can own. Read stories from customers like Okta and PagerDuty, or try Etleap yourself.

  • Educative provides interactive courses for software engineering interviews created by engineers from Facebook, Microsoft, eBay, and Lyft. Prepare in programming languages like Java, Python, JavaScript, C++, and Ruby. Design systems like Uber, Netflix, Instagram and more. More than 10K software engineers have used Coderust and Grokking the System Design Interview to get jobs at top tech companies like Facebook, Google, Amazon, Microsoft, etc. Ace your software engineering interviews today. Get started now

  • Gartner’s 2018 Magic Quadrant for Data Science and Machine Learning Platforms. Read Gartner’s most recent 2018 release of the Magic Quadrant for Data Science and Machine Learning Platforms. A complimentary copy of this important research report into the data science platforms market is offered by Domino. Download the report to learn: 
    • How Gartner defines the Data Science Platform category, and their perspective on the evolution of the data science platform market in 2018. 
    • Which data science platform is right for your organization. 
    • Why Domino was named a Visionary in 2018.

  • Exoscale GPU Cloud Servers. Powerful on-demand GPU. Perfect for your machine learning, artificial intelligence, and encoding workloads. GPU instances work exactly like other instances: they are billed by the minute and integrate seamlessly with your existing infrastructure. Tap the GPU's full power with direct passthrough access. Speed up TensorFlow or any other Deep Learning, Big Data, AI, or Encoding workload. Start your GPU instances via our API or with your existing deployment management tools. Add parallel computational power to your stack with no effort. Get Started

  • .NET developers dealing with Errors in Production: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Managers want to know what’s wrong right away, users don’t want to provide log data, and you spend more time gathering information than you do fixing the problem. To fix all that, Loupe was built specifically as a .NET logging and monitoring solution. Loupe notifies you about any errors and tells you all the information you need to fix them. It tracks performance metrics, identifies which errors cause the greatest impact, and pinpoints the root causes. Learn more and try it free today.

  • Enterprise-Grade Database Architecture. The speed and enormous scale of today’s real-time, mission critical applications has exposed gaps in legacy database technologies. Read Building Enterprise-Grade Database Architecture for Mission-Critical, Real-Time Applications to learn: Challenges of supporting digital business applications or Systems of Engagement; Shortcomings of conventional databases; The emergence of enterprise-grade NoSQL databases; Use cases in financial services, AdTech, e-Commerce, online gaming & betting, payments & fraud, and telco; How Aerospike’s NoSQL database solution provides predictable performance, high availability and low total cost of ownership (TCO)

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • MemSQL envisions a world of adaptable databases and flexible data workloads - your data anywhere in real time. Today, global enterprises use MemSQL as a real-time data warehouse to cost-effectively ingest data and produce industry-leading time to insight. MemSQL works in any cloud, on-premises, or as a managed service. Start a free 30 day trial here: memsql.com/download/.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Make Your Job Search O(1) — not O(n)

Triplebyte is unique because they're a team of engineers running their own centralized technical assessment. Companies like Apple, Dropbox, Mixpanel, and Instacart now let Triplebyte-recommended engineers skip their own screening steps.

We found that High Scalability readers are about 80% more likely to be in the top bracket of engineering skill.

Take Triplebyte's multiple-choice quiz (system design and coding questions) to see if they can help you scale your career faster.


The Solution to Your Operational Diagnostics Woes

Scalyr gives you instant visibility of your production systems, helping you turn chaotic logs and system metrics into actionable data at interactive speeds. Don't be limited by the slow and narrow capabilities of traditional log monitoring tools. View and analyze all your logs and system metrics from multiple sources in one place. Get enterprise-grade functionality with sane pricing and insane performance. Learn more today


Monday
Apr 9, 2018

Give Meaning to 100 billion Events a Day - The Analytics Pipeline at Teads

This is a guest post by Alban Perillat-Merceroz, Software Engineer at Teads.tv.

In this article, we describe how we orchestrate Kafka, Dataflow and BigQuery together to ingest and transform a large stream of events. When adding scale and latency constraints, reconciling and reordering them becomes a challenge. Here is how we tackle it.


Teads for Publisher, one of the webapps powered by Analytics

 

In digital advertising, day-to-day operations generate a lot of events that we need to track in order to transparently report campaign performance. These events come from:

  • Users’ interactions with the ads, sent by the browser. These events are called tracking events and can be standard (start, complete, pause, resume, etc.) or custom events coming from interactive creatives built with Teads Studio. We receive about 10 billion tracking events a day.
  • Events coming from our back-ends, regarding ad auctions’ details for the most part (real-time bidding processes). We generate more than 60 billion of these events daily, before sampling, and should double this number in 2018.

In this article, we focus on tracking events, as they are on the most critical path of our business.

Simplified overview of our technical context with the two main event sources

 

Tracking events are sent by the browser over HTTP to a dedicated component that, amongst other things, enqueues them in a Kafka topic. Analytics is one of the consumers of these events (more on that below).

Our Analytics team's mission is to take care of these events. It is defined as follows:

We ingest the growing amount of logs,
We transform them into business-oriented data,
Which we serve efficiently and tailored for each audience.

To fulfill this mission, we build and maintain a set of processing tools and pipelines. Due to the organic growth of the company and new product requirements, we regularly challenge our architecture.

Why we moved to BigQuery

Click to read more ...

Friday
Apr 6, 2018

Stuff The Internet Says On Scalability For April 6th, 2018

Hey, it's HighScalability time:

 

Programmable biology - engineered cells execute programmable multicellular full-adder logics. (Programmable full-adder computations)

  • $1: AI turning MacBook into a touchscreen; $2000/month: BMW goes subscription; 20MPH: 15′ Tall, 8000 Pound Mech Suit; 1,511,484 terawatt hours: energy use if bitcoin becomes world currency; $1 billion: Fin7 hacking group; 1.5 million: ethereum TPS, sort of; 235x: AWK faster than Hadoop cluster; 37%: websites use a vulnerable Javascript library; $0.01: S3, 1 Gig, 1 AZ; 

  • Quotable Quotes:
    • Huang’s Law~ GPU technology advances 5x per year because the whole stack can be optimized. 
    • caseysoftware: Metcalfe lives here in Austin and is involved in the local startup community in a variety of ways.  One time I asked him how he came up with the law and he said something close to: "It's simple! I was selling network cards! If I could convince them it was more valuable to buy more, they'd buy more!" As an EE who studied networks, etc in college, it was jarring but audacious and impressive.  He was either BSing all of us ~40 years ago or in that conversation a few years ago.. but either way, he helped make our industry happen.
    • Adaptive nodes: the consensus that the learning process is attributed solely to the synapses is questioned. A new type of experiments strongly indicates that a faster and enhanced learning process occurs in the neuronal dendrites, similarly to what is currently attributed to the synapses
    • @dwmal1: Spotted this paper via @fanf. The idea is amazing: "We offer a new metric for big data platforms, COST, or the Configuration that Outperforms a Single Thread", and find that several frameworks fail to beat a single core even when given 128 cores.
    • David Rosenthal: But none of this addresses the main reason that flash will take a very long time to displace hard disk from the bulk storage layer. The huge investment in new fabs that would be needed to manufacture the exabytes currently shipped by hard disk factories, as shown in the graph from Aaron Rakers. This investment would be especially hard to justify because flash as a technology is close to the physical limits, so the time over which the investment would have to show a return is short.
    • @asymco: There are 400 million registered bike-sharing users and 23 million shared bikes in China. There were approximately zero of either in 2016. Fastest adoption curve I’ve ever seen (and I’ve seen 140).
    • @anildash: Google’s decision to kill Google Reader was a turning point in enabling media to be manipulated by misinformation campaigns. The difference between individuals choosing the feeds they read & companies doing it for you affects all other forms of media.
    • The Memory Guy: has recently been told that memory makers’ research teams have found a way to simplify 3D NAND layer count increases.
    • @JohnONolan: First: We seem to be approaching (some would argue, long surpassed) Slack-team-saturation. It’s just not new or shiny anymore, and where Slack was the “omg this is so much better than before” option just a few years ago — it has now become the “ugh… not another one” thing
    • Memory Guy: The industry has  moved a very long way over the last 40 years, but I need not mention this to anyone who’s involved in semiconductors.  In 1978 a Silicon Valley home cost about $100,000, or about the cost of a gigabyte of DRAM.  Today, 40 years later, the average Silicon Valley home costs about $1 million and a gigabyte of DRAM costs about $7.
    • CockroachDB: A three-node, fully-replicated, and multi-active CockroachDB 2.0 cluster achieves a maximum throughput of 16,150 tpmC on a TPC-C dataset. This is a 62% improvement over our 1.1 release. Additionally, the latencies on 2.0 dropped by up to 81% compared to 1.1, using the same workload parameters. That means that our response time improved by 544%.
    • @Carnage4Life: Interesting thread about moving an 11,000 user community from Slack to Discourse. It's now quite clear that Slack is slightly better than email for small groups but is actually worse than the alternatives for large groups
    • Paul Barham: You can have a second computer once you’ve shown you know how to use the first one.
    • Nate Kupp~ petabyte hadoop cluster Apple uses to understand battery life on iphone and ipad looking at logging data coming off those devices.
    • More quotes. More stuff. Go get it.

Click to read more ...

Thursday
Apr 5, 2018

Do you have too many microservices? - Five Design Attributes that can Help

This is a guest post by Jake Lumetta, Founder and CEO, ButterCMS, an API-first CMS. For more content like this, follow @ButterCMS on Twitter and subscribe to our blog.

Are your microservices too small or too tightly coupled? Are you confident in your decision-making about service boundaries? In interviews, dozens of experienced CTOs described the design attributes they consider when creating a set of microservices. This article distills that wisdom into five key principles to help you better design microservices.

The importance of microservice boundaries

The design attributes discussed below matter because reaping the benefits of microservices requires designing thoughtful microservice boundaries.

One of the major challenges in creating a new system with a microservice architecture came up when I mentioned that one of the core benefits of developing new systems with microservices is that the architecture allows developers to build and modify individual components independently, but problems can arise when it comes to minimizing the number of callbacks between each API. The solution, according to McFadden, is to apply the appropriate service boundaries.

But in contrast to the sometimes difficult-to-grasp and abstract concept of domain-driven design (DDD), a framework for microservices, I’ll be as practical as I can in this chapter as I discuss the need for well-defined microservice boundaries with some of our industry’s top CTOs.

First, avoid arbitrary rules

Click to read more ...

Monday
Apr 2, 2018

How ipdata serves 25M API calls from 10 infinitely scalable global endpoints for $150 a month

This is a guest post by Jonathan Kosgei, founder of ipdata, an IP Geolocation API. 

I woke up on Black Friday last year to a barrage of emails from users reporting 503 errors from the ipdata API.

Our users typically call our API on each page request on their websites to geolocate their users and localize their content. So this particular failure was directly impacting our users’ websites on the biggest sales day of the year. 

I only lost one user that day but I came close to losing many more.

This sequence of events and their inexplicable nature (CPU, memory, and I/O were nowhere near capacity), as well as concerns about how well (if at all) we would scale given our outage, were a big wake-up call to rethink our existing infrastructure.

Our Tech stack at the time

Click to read more ...

Friday
Mar 30, 2018

Stuff The Internet Says On Scalability For March 30th, 2018

Hey, it's HighScalability time:

 

“Objective painting is not good painting unless it is good in the abstract sense. A hill or tree cannot make a good painting just because it is a hill or tree. It is lines and colors put together so that they may say something.” – Georgia O’Keeffe

 

  • 6,000: new viri spotted by AI; 300,000: Uber requests per second; 10TB & 600 years: new next-gen optical disk; 32,000: sites running Coinhive’s JavaScript miner code; $1 billion: Uber loss per quarter; 3.5%: global NAND flash output lost to power outage; 100TB: new SSD; 48TB: RAM on one server; 200 million: Telegram monthly active users; 2,000: days Curiosity Rover on Mars; 225: emerging trends; 4,425: SpaceX satellites approved; 

  • Quotable Quotes:
    • @msuriar: Uber's worst outage ever: - Master log replication to S3 failed. - Logs backup up on primary. - Alerts fire but are ignored. - Disk full on primary. - Engineer deletes unarchived WAL files. - Config error prevents failover/promotion. #SREcon
    • @thecomp1ler: Most powerful Xeon is the 28 core Platinum 2180 at $10k RSP and >200W TDP. Due to the nature of Intel turbo boost it almost always operates at TDP under load. Our ARM is the Centriq 2452 at less than $1400 RSP, 46 cores and 120W TDP, that it never hits. Beats Xeon 9/10 workloads.
    • Arjun Narayan: This is also finally the year when people start to wake up and realize they care about serializability, because Jeff Dean said it's important. Michael Stonebraker, meanwhile, has been shouting for about 20 years and is very frustrated that people apparently only care when The Very Important Googlers say it's important.
    • @xleem: Simple Recipe: 1) Identify system boundaries 2) Define capabilities exposed 3) Plain english definitions of availability 4) Define technical SLO 5) measure baseline 6) set targets 7) iterate #SREcon
    • Jordan Ellenberg~ For seven years, a group of students from MIT exploited a loophole in the Massachusetts State Lottery’s Cash WinFall game to win drawing after drawing, eventually pocketing more than $3 million. How did they do it? How did they get away with it? And what does this all have to do with mathematical entities like finite geometries, variance of probability distributions, and error-correcting codes?
    • Jeff Dean: ML hardware is at its infancy. Even faster systems and wider deployment will lead to many more breakthroughs across a wide range of domains. Learning in the core of all of our computer systems will make them better/more adaptive. There are many opportunities for this.
    • Teller: Here's a compositional secret. It's so obvious and simple, you'll say to yourself, "This man is bullshitting me." I am not. This is one of the most fundamental things in all theatrical movie composition and yet magicians know nothing of it. Ready? Surprise me.
    • @kcoleman: OH (from an awesome Lyft driver): “Today has been great. I’ve been blessed by the algorithm.” Immediately had an eerie feeling that this could become an increasingly common way to describe a day.
    • Jim Whitehurst [Red Hat chief executive]: We added hundreds of customers in the last year, while Pivotal only added 44 new customers. Their average deal size is $1.5 million, quite large. So they are more the top-down, big company kind of focus. We have over 650 customers [for OpenShift], we added hundreds this past year, and we are growing faster than Pivotal. We thought we were performing favorably compared to them, but this is the first time we had the data to really compare.
    • @danctheduck: My GDC takeaway: Everyone who is making games-as-a-service is getting most of their actual traction by building co-op MMOs. But very few of them realize this is what they are doing. So they keep sabotaging their communities with bizarro design philosophies. Short version (1/2) - People want to do fun activities with friends. The higher order bit. Super sticky. - Core gameplay collapses to this. Ex: Team vs Team plus match making is just another way of make 'fair' Team vs Environment (aka coop) - We focus a lot on PvP, competition or esports. But many of those are *aspirational* for players. Not actually desirable. - Or we fixate on single player genre tropes. Which may be a familiar reason to *start* playing, but aren't always key to why people *continue* playing.
    • @iamtrask: Lots of folks are optimistic about #blockchain. I recently came across a difficult question...If a zero-knowledge proof can prove to users that a centralized service performed honest computation, why decentralize it? We live in free markets... I see a correction coming...
    • Katherine Bourzac: Shanbhag thinks it’s time to switch to a design that’s better suited for today’s data-intensive tasks. In February, at the International Solid-State Circuits Conference (ISSCC), in San Francisco, he and others made their case for a new architecture that brings computing and memory closer together. The idea is not to replace the processor altogether but to add new functions to the memory that will make devices smarter without requiring more power. Industry must adopt such designs, these engineers believe, in order to bring artificial intelligence out of the cloud and into consumer electronics.
    • @dylanbeattie: When npm was first released in 2010, the release cycle for typical nodeJS package was 4 months, and npm restore took 15-30 seconds on an average project. By early 2018, the average release cycle for a JS package was 11 days, and the average npm restore step took 3-4 minutes. 1/11
    • @davemark: THREAD:  J.C.R. Licklider was one of the true pioneers of computer science. Back in about 1953, Licklider built something called a Watermelon Box. If it heard the word watermelon, it would light up an LED. That’s all it did. But it was the start of a huge wave. //@reneritchie
    • smudgymcscmudge: I have to admit that the switch from “free software” to “open source” worked on me. Early in my career I was intrigued by the idea, but couldn’t get past how “free” software was a sustainable model. I started to get it at around the same time the terminology changed.
    • @msuriar: CPU attack: spin up something that burns 100% CPU. (openssl or something). What do you expect to happen? What actually happens? #SREcon
    • Forrest Brazeal: The way I describe it is: functions as a service are cloud glue. So if I’m building a model airplane, well, the glue is a necessary part of that process, but it’s not the important part. Nobody looks at your model airplane and says: “Wow, that’s amazing glue you have there.” It’s all about how you craft something that works with all these parts together, and FaaS enables that.
    • You want more quotes? There are lots more. Can you handle the truth? Click through and test yourself.

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

Click to read more ...

Tuesday
Mar272018

Sponsored Post: Educative, Clover, Triplebyte, Exoscale, Symbiont, Loupe, Etleap, Aerospike, Scalyr, Domino Data Lab, MemSQL

Who's Hiring? 

  • Clover is looking for seasoned software engineers to help us solve the most complicated problem in the world: healthcare. We're using sophisticated data analytics, custom software, and machine learning to coordinate care and build a clearer model of our members' health and risk factors. We are on a mission to help seniors and low-income members live healthier while keeping costs down. This is an opportunity for those who want to be at the intersection of health and technology and who thrive in a collaborative environment with the freedom of self-direction. If you're interested, please directly apply here!

  • Triplebyte now hires software engineers for top tech companies and hundreds of the most exciting startups like Apple, Dropbox, Mixpanel, and Instacart. They identify your strengths from an online coding quiz and let you skip resume and recruiter screens at multiple companies at once. It's free, confidential, and background-blind. Apply here.

  • Symbiont is a New York-based financial technology company building new kinds of computer networks to connect independent financial institutions together and allow them to share business logic and data in real time. This involves developing a distributed system which is also decentralized, and which allows for the creation of smart contracts, self-executing cryptographic agreements among counterparties. To do so, we're using a lot of techniques in blockchain technology, as well as those from traditional distributed systems, programming language design and cryptography. We are hiring for a number of roles, from entry-level to expert, including Haskell Backend Engineer, Database Engineer, Product Engineer, Site Reliability Engineer (SRE), Programming Language Engineer and SecOps Engineer. To find out more, just e-mail us your resume.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • 5 Signs You’ve Outgrown DynamoDB. Companies often select a database that seems to be the best choice at first glance, as well as the path of least resistance, and then are subsequently surprised by cost overruns and technology limitations that quickly hinder productivity and put the business at risk. This seems to be the case with many enterprises that chose Amazon Web Services’ (AWS) DynamoDB. In this white paper we’ll cover elements of costing as well as the results of benchmark-based testing. Read 5 Signs You’ve Outgrown DynamoDB to determine if your organization has outgrown this technology.

  • Advertise your event here!

Cool Products and Services

  • For heads of IT/Engineering responsible for building an analytics infrastructure, Etleap is an ETL solution for creating perfect data pipelines from day one. Unlike older enterprise solutions, Etleap doesn’t require extensive engineering work to set up, maintain, and scale. It automates most ETL setup and maintenance work, and simplifies the rest into 10-minute tasks that analysts can own. Read stories from customers like Okta and PagerDuty, or try Etleap yourself.

  • Educative provides interactive courses for software engineering interviews created by engineers from Facebook, Microsoft, eBay, and Lyft. Prepare in programming languages like Java, Python, JavaScript, C++, and Ruby. Design systems like Uber, Netflix, Instagram and more. More than 10K software engineers have used Coderust and Grokking the System Design Interview to get jobs at top tech companies like Facebook, Google, Amazon, Microsoft, etc. Ace your software engineering interviews today. Get started now

  • Gartner’s 2018 Magic Quadrant for Data Science and Machine Learning Platforms. Read Gartner’s most recent 2018 release of the Magic Quadrant for Data Science and Machine Learning Platforms. A complimentary copy of this important research report into the data science platforms market is offered by Domino. Download the report to learn: 
    • How Gartner defines the Data Science Platform category, and their perspective on the evolution of the data science platform market in 2018. 
    • Which data science platform is right for your organization. 
    • Why Domino was named a Visionary in 2018.

  • Exoscale GPU Cloud Servers. Powerful on-demand GPU. Perfect for your machine learning, artificial intelligence, and encoding workloads. GPU instances work exactly like other instances: they are billed by the minute and integrate seamlessly with your existing infrastructure. Tap the GPU's full power with direct passthrough access. Speed up TensorFlow or any other Deep Learning, Big Data, AI, or Encoding workload. Start your GPU instances via our API or with your existing deployment management tools. Add parallel computational power to your stack with no effort. Get Started

  • .NET developers dealing with Errors in Production: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Managers want to know what’s wrong right away, users don’t want to provide log data, and you spend more time gathering information than you do fixing the problem. To fix all that, Loupe was built specifically as a .NET logging and monitoring solution. Loupe notifies you about any errors and tells you all the information you need to fix them. It tracks performance metrics, identifies which errors cause the greatest impact, and pinpoints the root causes. Learn more and try it free today.

  • Enterprise-Grade Database Architecture. The speed and enormous scale of today’s real-time, mission critical applications have exposed gaps in legacy database technologies. Read Building Enterprise-Grade Database Architecture for Mission-Critical, Real-Time Applications to learn: Challenges of supporting digital business applications or Systems of Engagement; Shortcomings of conventional databases; The emergence of enterprise-grade NoSQL databases; Use cases in financial services, AdTech, e-Commerce, online gaming & betting, payments & fraud, and telco; How Aerospike’s NoSQL database solution provides predictable performance, high availability and low total cost of ownership (TCO)

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • MemSQL envisions a world of adaptable databases and flexible data workloads - your data anywhere in real time. Today, global enterprises use MemSQL as a real-time data warehouse to cost-effectively ingest data and produce industry-leading time to insight. MemSQL works in any cloud, on-premises, or as a managed service. Start a free 30 day trial here: memsql.com/download/.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Scale your Job Search with Triplebyte

Triplebyte is unique because they're a team of engineers running their own centralized technical interview. The evaluation quality is so good that companies like Apple, Dropbox, Mixpanel, and Instacart now let every engineer Triplebyte recommends skip steps in the application process.

They give personal assistance to discover which roles you're most excited about, schedule your final interviews back-to-back, and help you negotiate with multiple companies at once.

Triplebyte now works with top tech companies and hundreds of the most exciting pre-screened startups.

It's free, confidential, and background-blind for engineers. Take Triplebyte's online coding quiz to see if they can help you scale your career faster. (Engineers with architecture and system design experience tend to do especially well.)


The Solution to Your Operational Diagnostics Woes

Scalyr gives you instant visibility of your production systems, helping you turn chaotic logs and system metrics into actionable data at interactive speeds. Don't be limited by the slow and narrow capabilities of traditional log monitoring tools. View and analyze all your logs and system metrics from multiple sources in one place. Get enterprise-grade functionality with sane pricing and insane performance. Learn more today



Friday
Mar162018

Stuff The Internet Says On Scalability For March 16th, 2018

Hey, it's HighScalability time:

 

Hermetic symbolism was an early kind of programming. Symbols explode into layers of other symbols, like a programming language, only the instruction set is the mind.

 

If you like this sort of Stuff then please support me on Patreon. And I'd appreciate it if you would recommend my new book, Explain the Cloud Like I'm 10, to anyone who needs to understand the cloud (who doesn't?). I think they'll learn a lot, even if they're already familiar with the basics.

 

  • ~30: AWS services used by iRobot; 450,000: Shopify S3 operations per second; $240: yearly value of your data; ~day: time to load a terabyte from Postgres into BigQuery; 5 million: viewers for top Amazon Prime shows; 130,000: Airbusians move from Microsoft Office to Google Suite; trillion: rows per second processed by MemSQL; 38 million: Apple Music paid members; 4 million: Microsoft git commits for a Windows release

  • Quotable Quotes:
    • Stephen Hawking: Although I cannot move and I have to speak through a computer, in my mind I am free.
    • Roger Penrose: Despite [Stephen Hawking's] terrible physical circumstance, he almost always remained positive about life. He enjoyed his work, the company of other scientists, the arts, the fruits of his fame, his travels. He took great pleasure in children, sometimes entertaining them by swivelling around in his motorised wheelchair. Social issues concerned him. He promoted scientific understanding. He could be generous and was very often witty. On occasion he could display something of the arrogance that is not uncommon among physicists working at the cutting edge, and he had an autocratic streak. Yet he could also show a true humility that is the mark of greatness.
    • Raymond Wong: The most revealing part of the report exposes how Apple didn't even have plans to integrate Siri into HomePod until after the Amazon Echo launched
    • @ajaynairthinks: When I tell people I am the founding PM on Lambda, the question I often get is how the idea for #AWS Lambda/#Serverless  came about in the first place. The truth is, its way too hard to point to one person/event as the defining moment.
    • @ImerM1: Just did some budget analysis. Turns out we've managed to reduce our AWS costs for RNA-seq by 90%, by using Lambda, Batch and Step Functions 
    • Daniel Lemire: My numbers are clear: in my tests, it is three times faster to sum up the values in a LinkedHashSet.
    • @Tr0llyTr0llFace: The Bitcoin network is processing 200,000 transactions per day at a cost of $3B per year. Visa is processing 150,000,000 transactions per day at a cost of $8B per year. Visa also does insurance, credit, customer support...With Bitcoin your funds are lost if you forget your PIN.
    • Linus Torvalds: It looks like the IT security world has hit a new low. 
    • Steve Jobs~ When you're the janitor, reasons matter. Somewhere between the janitor and the CEO, reasons stop mattering. That Rubicon is crossed when you become a VP.
    • @ByRosenberg: The San Jose Mercury News, Oakland Tribune, Contra Costa Times and their sister papers had 1,000 editorial employees in 2000. Now they're down to 100. Devastating to see Bay Area news coverage decimated in one of the world's most important places to cover 
    • @jimwebber: Amazed at the medical/genomics paper I've just reviewed where the whole thing was built in Neo4j and processed via Cypher. Our user community is extraordinary.
    • @maria_fibonacci: Mansplainings aside, I build things with k8s as my day job, and I build things with Elixir because it's my fav lang. Even if they're different types of software, the reasons people end up using them are basically the same, yet the additional layer of complexity matters.
    • @gwenshap: Things I learned at #StrataData last week. I only attended one session, but talked to 100+ attendees. So, it is the "hallway" view. 1. Machine learning is sexier than ever. Lots of talk. I got the impression that organizations built around ML can do it, but existing business still didn't make the leap. My bet is that in 2-3 years we'll start seeing the "early majority".
    • Jim Handy: In a nutshell Mark [Thirsk] is telling us that there may be some stress on DRAM and NAND flash wafer supplies, but the companies that will feel the greatest impact will be tier 2 chip makers who purchase the lowest cost wafers. 
    • michaelt: So I'm a mid-level manager at a company with a few hundred developers. The more developers you hire, the higher the chance you'll start collecting people with obscure (but usually reasonably easy to satisfy) tool preferences. You know, the guy who changes his IDE to emacs key bindings, the guy who does all his e-mail using mutt when everyone else uses gmail, the girl who uses a Dvorak keyboard layout, the guy who insists he works best on a 1024x768 screen, and so on. Having tried a variety of industry tools and thought about how you work best is usually a good sign[1]. Their favourite tools aren't my favourite tools, but they work for me so if they're not happy, I'm not happy.
    • Paul Kunert: Airbus will organise information around “teams, topics and programmes” and “let people go to the information that they need for their jobs… almost the opposite from an environment that is based on email where you receive whatever it is that others decide you can receive.”
    • Geoff Huston: However, I also suspect that the intelligence agencies are already focussing elsewhere. If the network is no longer the rich vein of data that it used to be, then the data collected by content servers is a more than ample replacement. If the large content factories have collected such a rich profile of my activities, then it seems entirely logical that they will be placed under considerable pressure to selectively share that profile with others. So, I’m not optimistic that I have any greater level of personal privacy than I had before. Probably less. Meet the new boss. Same as the old boss.
    • @PaulDJohnston: Counter prediction: very few software engineers will need to know about k8s et al because #serverless. It's code that matters and orchestration is becoming a commodity. (Sounding like @swardley). Also .zip is a more robust deployment artifact
    • Joel Hruska: In all cases, the pirated version of [Final Fantasy XV] was faster, by 5 percent to a whopping 33 percent, depending on the scene...The implications of these findings are straightforward: The piracy protections baked into the game are hitting overall performance, causing a significant set of issues. Companies regularly deny it happens, but tests like this punch holes in such claims.
    • Mathieu Ripert: we [Instacart] found out that with quantile regression we were able to plan deliveries closer to their due time without increasing late percentage. This effect allowed us to explore more trip combinations in our fulfillment engine and therefore increase efficiency (one of our most important metric) by 4%.
    • @vijaypande: We argue that the future of predicting the interactions between a drug and its prospective target demands more than simply applying deep learning algorithms from other domains, like vision and natural language, to molecules. 
    • John Allspaw: The increasing significance of our systems, the increasing potential for economic, political, and human damage when they don’t work properly, the proliferation of dependencies and associated uncertainty — all make me very worried. And, if you look at your own system and its problems, I think you will agree that we need to do more than just acknowledge this — we need to embrace it.
    • Kevlin Henney: Move Slow and Mend Things
    • Read on for more quotes.

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

Click to read more ...