Friday
Nov 17, 2017

Stuff The Internet Says On Scalability For November 17th, 2017

Hey, it's HighScalability time: 


The BOSS Great Wall. The largest structure yet found in the universe. Contains 830 galaxies. A billion light years across. 10,000 times the mass of the Milky Way.

 

If you like this sort of Stuff then please support me on Patreon. And there's my new book, Explain the Cloud Like I'm 10, for complete cloud newbies. 


  • $25 billion: Alibaba's Singles' Day sales; 6+ million: Slack daily active users; 4ms: boot time for a unikernel based VM; 1 billion: out of date Android devices; 10-20%: increase in RAM prices; 8 million: lines of code in F-35; $3 million: lost by Isaac Newton in the stock market; 30: it's RAID's birthday!; thousands: bugs fixed with Pentagon hackathon; 6+ terabytes: earth satellite data downloaded per day; 

  • Quotable Quotes:
    • Berners-Lee: When I invented the web, I didn’t have to ask Vint Cerf [the ‘father of the internet’] for permission to use the internet
    • Germaine de Stael: Ridicule dries up the imagination.
    • Alex Hudson: A lot of technical write-ups focus on scaling, performance and large-scale systems. It’s definitely interesting to see what problems Netflix have, and how they respond to them. It’s important to understand why Google take decisions in the way they do. However, most of their problems don’t apply to anyone else, and therefore many of the solutions may or may not be appropriate.
    • @jpetazzo: Step functions: they're great, but they don't support dynamic fan out (i.e. invoking an arbitrary number of "sub-lambdas" in parallel).
    • parasubvert: Perhaps one of the lessons of architecture that is missing is to teach people how to evaluate tradeoffs, or in other words, “taste”. I don’t think we’ve ever really had good taste as an industry. Buzzword bingo has always ruled, with some exceptions.
    • Calvin Biesecker: The cost to change one line of code on a piece of avionics equipment is $1 million, and it takes a year to implement. For Southwest Airlines, whose fleet is based on Boeing’s 737, it would “bankrupt” them if a cyber vulnerability was specific to systems on board 737s, he said, adding that other airlines that fly 737s would also see their earnings hurt.
    • @QConSF: @natekupp shares some of Thumbtack's learnings on their journey to scale: from a PHP/PostgreSQL monolith with a self-managed Hadoop cluster, to Dockerized #microservices paired with managed/serverless data infrastructure #qconsf
    • Bail Bloc: Mine Monero, waste electricity, generate CO2 and send less money to a charity than you could have just sent directly! What’s not to like?
    • @Xof: The notion that the only way to be a good programmer is to let it consume your life is toxic.
    • @swardley: AMZN is now worth an IBM + Oracle + CISCO and you'd still have enough  change left over to buy most of VMware. Not bad for a decade of growth.
    • @swardley: I've been a bit gobsmacked by who is using Lambda recently ... there was me thinking that big / traditional enterprise would be testing the waters slowly. How wrong.
    • @crichardson: If GoLang becomes #1 it will primarily be due to fashion rather than fitness for purpose. It's far too low level/lacking in expressiveness for many kinds of applications. Eg. Enterprise/business applications. It is not what Java's successor should be.
    • Dropbox: IPv6 does show slightly better performance over IPv4. However, without detailed client-side and network information, it is hard to say definitely where the IPv6 performance gain is from.
    • Stack Overflow: Two tags stand out in this analysis, both with tremendous growth, and they have something in common. Swift is Apple’s language for developing iOS apps that is a successor to Objective-C, and the angular tag
    • @Falkvinge: I've said it before and I'm saying it again and again: in order to beat old-world banking, crypto must be at least an order of magnitude better. Old-world banking offers free instant tx between private accts, and 15-cent txs to merchant accounts. Beat that or be obsoleted.
    • Alex Hudson: I want to hear more about projects that deferred decisions and put off architecting until much later in the process. I want to hear more about delivery at real speed. Small pieces of software that are not necessarily interesting but deliver business value are the real heroes in our industry, and the developers who create them the real stars. I especially want to hear more about developers working with systems that have constraints. I want to hear from people pushing standard stuff beyond its limits. I think we grossly underestimate what off-the-shelf systems can do, and grossly overestimate the capabilities of the things we develop ourselves. It’s time to talk much more about real-world, practical, medium-enterprise software architecture.
    • David Gerard: BTC is very clogged at the moment, with around 100,000 unconfirmed transactions as I write this, and peaks of 160,000 a few days ago. Transaction fees peaked at around $20 just to get your transaction through. This wasn’t helped by long delays between blocks, as mining capacity moved to BCH — the time between blocks peaking at 63 minutes a few days ago, on 11 November. fork.lol, which charts the relative profitability of the two, was overloaded and inaccessible. If shutdowns of mining progress in China, then whoever remains in mining will become the power. This is currently divided between Iceland, India, Japan, Georgia and the Czech Republic.
    • linkmotif: There’s a very common ethos that if people just focused on shipping they would somehow magically ship but that’s not how software works. You can’t just will shipping. You need to know what you’re doing.
    • sp527: I had a serious epiphany when I read that Braintree managed to vertically scale a two node (“HA”) Postgres setup to transaction volume in the millions and a massive valuation. Stack Overflow has had a similarly lean footprint for much of its history.
    • @ben11kehoe: This graphic from @googlecloud App Engine is nonsense. GAE literally makes you select instance sizes
    • @danielbryantuk: "Any change made to a complex adaptive system is a gamble. We mitigate risks, but we can't eliminate them" @relix42 #qconsf
    • ivanstepin: Flickr implemented Lanczos algorithm while Discord uses near-neighbor ( much less resource-consuming, but with slightly less quality ) algo. It may turn out that the gpu mem<->cpu mem data transfer can eat all the benefits for such simple algo as near-neighbor scaling.
    • There's more. Lots more.

Click to read more ...

Monday
Nov 13, 2017

Cassandra NoSQL Data Model Design 

We at Instaclustr recently published a blog post on the most common data modelling mistakes that we see with Cassandra. This post was very popular and led me to think about what advice we could provide on how to approach designing your Cassandra data model so as to come up with a quality design that avoids the traps.

There are a number of good articles around with rules and patterns to fit your data model into: 6 Step Guide to Apache Cassandra Data Modelling and Data Modelling Recommended Practices.

However, we haven’t found a step by step guide to analysing your data to determine how to fit in these rules and patterns. This white paper is a quick attempt at filling that gap.

Phase 1: Understand the data

This phase has two distinct steps that are both designed to gain a good understanding of the data that you are modelling and the access patterns required.

Define the data domain

The first step is to get a good understanding of your data domain. As someone very familiar with relational data modelling, I tend to sketch (or at least think) ER diagrams to understand the entities, their keys and relationships. However, if you’re familiar with another notation then it would likely work just as well. The key things you need to understand at a logical level are:

• What are the entities (or objects) in your data model?
• What are the primary key attributes of the entities?
• What are the relationships between the entities (i.e. references from one to the other)?
• What is the relative cardinality of the relationships (i.e. if you have a one to many is it one to 10 or one to 10,000 on average)?

Basically, these are the same things you’d expect from a logical ER model (although we probably don’t need a complete picture of all the attributes) along with a more complete understanding of the cardinality of relationships than you’d normally need for a relational model. An understanding of the demographics of key attributes (cardinality, distribution) will also be useful in finalising your Cassandra model. Also, understand which key attributes are fixed and which change over the life of a record.
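As a rough illustration of the checklist above, the logical model can be written down as plain data before any Cassandra tables are designed. This is only a sketch: the `Entity` and `Relationship` classes, the `wide_relationships` helper, the example domain and the threshold of 1,000 are all hypothetical and do not come from the white paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """One entity (object) in the logical data model."""
    name: str
    primary_key: List[str]
    # Key attributes expected to change over the life of a record.
    changing_attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A one-to-many reference from `parent` to `child`."""
    parent: str
    child: str
    avg_children: int  # relative cardinality: one to 10, or one to 10,000?


def wide_relationships(relationships, threshold):
    """Return the relationships whose average fan-out exceeds `threshold`."""
    return [r for r in relationships if r.avg_children > threshold]


# Hypothetical example domain, for illustration only.
entities = [
    Entity("customer", primary_key=["customer_id"]),
    Entity("order", primary_key=["order_id"], changing_attributes=["status"]),
    Entity("page_view", primary_key=["customer_id", "viewed_at"]),
]
relationships = [
    Relationship("customer", "order", avg_children=10),
    Relationship("customer", "page_view", avg_children=10_000),
]

# Flag the high fan-out relationships for a closer look during modelling.
for r in wide_relationships(relationships, threshold=1_000):
    print(f"{r.parent} -> {r.child}: about {r.avg_children} per {r.parent}")
```

Recording the average fan-out next to each relationship keeps the "one to 10 or one to 10,000" question visible while the rest of the model is worked out.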

Define the required access patterns

Click to read more ...

Friday
Nov 10, 2017

Stuff The Internet Says On Scalability For November 10th, 2017

Hey, it's HighScalability time: 


Ah, the good old days. This is how the FBI stored fingerprints in 1944. (Alex Wellerstein). How much data? Estimates range from 30GB to 2TB.

 

If you like this sort of Stuff then please support me on Patreon. Also, there's my new book, Explain the Cloud Like I'm 10, for complete cloud newbies. 


  • 1 million: times we touch our phones per year; 13 million: lines of Javascript @ Facebook; 256K: RAM needed for TensorFlow on a microcontroller; 2,502%: increase in the sale of ransomware on the dark web; 800 million: monthly Instagram users; 40%: VMs in Azure run Linux; 40%: improved GCP network latency from new SDN stack; 50%: fat content of a woolly mammoth; 

  • Quotable Quotes:
    • Sean Parker: And that means that we [Facebook] need to sort of give you a little dopamine hit every once in a while, because someone liked or commented on a photo or a post or whatever. And that's going to get you to contribute more content, and that's going to get you ... more likes and comments
    • David Gerard: I spent yesterday afternoon on Twitter and /r/buttcoin, giggling. It was a popcorn overload moment for every acerbic cryptocurrency sceptic who ever thought that immutable, unfixable smart contracts were an obviously stupid idea that would continue to end in tears and massive losses, as they so often had previously.
    • @jessfraz: I remember now why I put everything into containers in the first place, it's because all software is 💩
    • Amin Vahdat: What we have found running our applications at Google is that latency is as important, or more important, for our applications than relative bandwidth. It is not just latency, but predictable latency at the tail of the distribution. If you have a hundred or a thousand applications talking to one another on some larger task, they are chatty with one another, exchanging small messages, and what they care about is making a request and getting a response back quickly, and doing so across what might be a thousand parallel requests.
    • @SteveBellovin: Why anyone with any significant programming experience--and hence experience with bugs--ever liked smart contracts is a mystery to me.
    • Neha Bagri: Startups worship the young. But research shows people are most innovative when they’re older
    • @manisha72617183: OH: I no longer tolerate complicated programming languages. My mental space is like Silicon Valley; rent is high and space is at a premium
    • @atoonk: On days like today, we're yet again reminded that the Internet is held together with duct tape.. #rockSolid #BGP #comcast #outage
    • @bradfitz: 0 days since last high impact bug in an experimental programming language on the Ethereum VM affecting millions of dollars.
    • TheScientist: The genetic, molecular, and morphological diversity of the brain leads to a functional diversification that is likely necessary for the higher-order cognitive processes that are unique to humans.
    • Woods' Theorem: As the complexity of a system increases, the accuracy of any single agent's own model of that system decreases rapidly.
    • Carlos E. Perez: The brain performs compensation when it encounters something it does not expect. It learns how to correct itself through perturbative methods. That’s what Deep Learning systems also do, and it’s got nothing to do with calculating probabilities. It’s just a whole bunch of “infinitesimal” incremental adjustments.
    • @erickschonfeld: “What can one expect of a few wretched wires?”—telegraph skeptic, 1841
    • @ErikVoorhees: The average Bitcoin transaction fee ($10.17) is now more than twice the cost of Bitcoin itself when I first learned of it ($5) in 2011 :(
    • LightShadow: StackOverflow should be one of the first internet companies to accept cryptocurrency micro payments. All they'd have to do is skim a small percentage from people tipping each other pennies for good answers
    • @lworonowicz: I feel like I killed a family dog - had to decommission an old #Solaris server with uptime of 6519 days.
    • Google: Andromeda 2.1 latency improvements come from a form of hypervisor bypass that builds on virtio, the Linux paravirtualization standard for device drivers. Andromeda 2.1 enhancements enable the Compute Engine guest VM and the Andromeda software switch to communicate directly via shared memory network queues, bypassing the hypervisor completely for performance-sensitive per-packet operations.
    • iAfrikan News: The first-ever fiber optic cable with a route between the U.S. and India via Brazil and South Africa will soon be a reality. This is according to a joint provisioning agreement entered into by Seaborn Networks ("Seaborn") and IOX Cable Ltd ("IOX").
    • @iamdevloper: 1969: -what're you doing with that 2KB of RAM? -sending people to the moon 2017: -what're you doing with that 1.5GB of RAM? -running Slack
    • Eric Schmidt: Bob Taylor invented almost everything in one form or another that we use today in the office and at home.
    • @ben11kehoe: I am so on board with CRDT-based data stores providing state to FaaS at the edge. 
    • VMG: “Code is Law” fails again.
    • Paul Frazee: In Bitcoin, acceptance of a change is signaled by the miners - once some percent of the miners agree, the change is accepted. This means that hashing power is used as a measure of voting power, and so the political system is essentially plutocratic. How is that significantly better than the board of a publicly traded company?
    • gtrubetskoy: Professor Tanenbaum is one of the most respected computer scientists alive, and for Intel to include Minix in their chip and not let him know is kind of unprofessional and not very nice to say the least. That is his only (and quite fair) point.
    • jsolson: Both approaches have tradeoffs, although I think even with ENA AWS hits ~70µs typical round-trip-times while GCE gets down to ~40µs. Amazon's largest VMs in some families do advertise higher bandwidth than GCE does currently.
    • @brendangregg: AWS put lots of work into optimizing Xen, including net & disk SR-IOV (direct metal access). But their new optimized KVM is even better.
    • @wheremattisat: “Facebook and Google are proto-AIs and we are their microbiome. The objective function of those AIs today is to make more money” @timoreilly
    • @ossia: "Weeks of programming can save you hours of planning." - Anonymous
    • @sallamar: For kicks, we run over 6.2 billion requests a month on lambda (450% yoy) at @ExpediaEng. Still cheaper than renting an apartment for a year.
    • @Joab_Jackson: At this point,IBM #openwhisk is the most viable #open source #serverless platform—@ryan_sb @thecloudcastnet #podcast
    • Polvi: I think PaaS is dead. That's why you see OpenShift and Cloud Foundry and everyone pivoting to Kubernetes. What's going to happen is PaaS will be reborn as serverless on the other side of the Kubernetes transition.
    • mmgutz: We're running our Debian farm on Azure thanks to startup perks. It's been up 100% for us the last 2.5 years. Azure service is no less or better than AWS.
    • zzzeek: I switch between multiple versions of MySQL and MariaDB all day long. If you aren't using specific things like MySQL's JSON type or NDB storage engine or expecting CHECK constraints to enforce on MySQL (oddly omitted from this feature comparison!), there is nothing different at all from a developer point of view, beyond the default values of flags which honestly change more between MySQL releases than anything else.
    • SEJeff: They [Azure] allow you to have native RDMA[1] for your VMs, something neither amazon or google will give you. As an oldhat Linux/Unix guy, it is somewhat amusing to think of Microsoft's cloud offering as the high perf one, but the facts don't lie. If you have true HPC style workloads such as bioinformatics, oil/natgas exploration, finance, etc, the extra node to node communication bits are necessary. The QDR fabric they have has a native speed of 40 Gbps. It is a shame they don't have FDR (56G) or EDR (100G), but still is quite impressive depending on your app. This also could be a game changer for large MPI jobs.
    • johnnycarcin: I've honestly yet to see a customer moving to Azure who has more than 50% Windows based systems. Almost everyone I've worked with only uses Windows Server for their SQL Server services, outside of that it's RHEL, CentOS or Ubuntu.
    • lurchedsawyer: So to answer your question as to what is needed for Azure to become a viable alternative to AWS: I would say about 10 years.
    • @mjpt777: If Google thinks latency trumps bandwidth then they should look to software before hardware for the main source of latency.
    • Ben Kehoe: Like so many things in life, serverless is not an all-or-nothing proposition. It’s a spectrum — and more than that, it has multiple dimensions along which the degree of serverlessness can vary
    • zzzeek: If I was doing brand new development somewhere I'm sure I'd use Postgresql, since from a developer point of view it's the most consistent and flexible. While for the last few years I've worked way more with MySQL / MariaDB and at the moment the MySQL side of things is a bit more familiar to me, I still appreciate PG's vastly superior query planner and index features.
    • There's more. Much more. Click through for more. More. More. More.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Nov 7, 2017

Sponsored Post: Loupe, Etleap, Aerospike, Stream, Scalyr, VividCortex, Domino Data Lab, MemSQL, Zohocorp

Who's Hiring? 

  • Symbiont is a New York-based financial technology company building new kinds of computer networks to connect independent financial institutions together and allow them to share business logic and data in real time. This involves developing a distributed system which is also decentralized, and which allows for the creation of smart contracts, self-executing cryptographic agreements among counterparties. To do so, we're using a lot of techniques in blockchain technology, as well as those from traditional distributed systems, programming language design and cryptography. We are hiring for a number of roles, from entry-level to expert, including Haskell Backend Engineer, Database Engineer, Product Engineer, Site Reliability Engineer (SRE), Programming Language Engineer and SecOps Engineer. To find out more, just e-mail us your resume

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • On-demand Webinar. Fast & Frictionless - The Decision Engine for Seamless Digital Business. In this session, guest speakers Michele Goetz, Principal Analyst at Forrester Research and Matthias Baumhof, VP Worldwide Engineering at ThreatMetrix, discuss: How risk-based authentication leveraging digital identities is key to empowering customer transactions; How real-time customer trust decisions can reduce fraud and improve customer satisfaction; How a high performance Hybrid Memory Architecture (HMA) database helps continuously evaluate across a multitude of factors to drive decisioning at the lowest operational cost. View now

  • Advertise your event here!

Cool Products and Services

  • .NET developers dealing with Errors in Production: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Managers want to know what’s wrong right away, users don’t want to provide log data, and you spend more time gathering information than you do fixing the problem. To fix all that, Loupe was built specifically as a .NET logging and monitoring solution. Loupe notifies you about any errors and tells you all the information you need to fix them. It tracks performance metrics, identifies which errors cause the greatest impact, and pinpoints the root causes. Learn more and try it free today.

  • Enterprise-Grade Database Architecture. The speed and enormous scale of today’s real-time, mission critical applications has exposed gaps in legacy database technologies. Read Building Enterprise-Grade Database Architecture for Mission-Critical, Real-Time Applications to learn: Challenges of supporting digital business applications or Systems of Engagement; Shortcomings of conventional databases; The emergence of enterprise-grade NoSQL databases; Use cases in financial services, AdTech, e-Commerce, online gaming & betting, payments & fraud, and telco; How Aerospike’s NoSQL database solution provides predictable performance, high availability and low total cost of ownership (TCO)

  • The Practical Guide to Managing Data Science at Scale. The ability to manage, scale, and accelerate an entire data science discipline increasingly separates successful organizations from those falling victim to hype and disillusionment. Download this practical guide for data science management, if you're currently, or aspiring to be, a data science manager. The paper demystifies and elevates the current state of data science management.

  • Etleap is a Redshift ETL tool that lets you bring all the data everyone wants into Redshift. It's easy enough for analysts to add and manage data connections on their own, without inundating IT/Engineering with requests for help. It takes just minutes to add new connections such as MySQL, Salesforce, S3, and many others, then you can "set it and forget it." Learn more about Redshift ETL with Etleap.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring DevOps and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL envisions a world of adaptable databases and flexible data workloads - your data anywhere in real time. Today, global enterprises use MemSQL as a real-time data warehouse to cost-effectively ingest data and produce industry-leading time to insight. MemSQL works in any cloud, on-premises, or as a managed service. Start a free 30 day trial here: memsql.com/download/.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Click to read more ...

Monday
Nov 6, 2017

Birth of the NearCloud: Serverless + CRDTs @ Edge is the New Next Thing


Kuhiro 10X Faster than Amazon Lambda

 

This is a guest post by Russell Sullivan, founder and CTO of Kuhirō.

Serverless is an emerging Infrastructure-as-a-Service solution poised to become an Internet-wide ubiquitous compute platform. In 2014 Amazon Lambda started the Serverless wave and a few years later Serverless has extended to the CDN-Edge and beyond the last mile to mobile,  IoT, & storage.

This post examines recent innovations in Serverless at the CDN Edge (SAE). SAE is a sea change and a really big deal: it marks the beginning of moving business logic from a single Cloud-region out to the edges of the Internet, which may eventually penetrate as far as servers running inside cell phone towers. When 5G arrives, SAE will be only a few milliseconds away from billions of devices, and the Internet will be transformed into a global-scale real-time compute-platform.

The journey of being a founder and then selling a NOSQL company, along the way architecting three different NOSQL data-stores, led me to realize that computation is currently confined to either the data-center or the device: the vast space between the two is largely untapped. So I teamed up with some smart people and we created the startup Kuhirō: a company dedicated to incrementally pushing pieces of the Cloud out to the edge, gradually creating a decentralized cloud very close to end-users, a NearCloud.

We decided the foundations of this NearCloud would be compute & data so we are beginning with a stateful SAE system which will serve as a springboard for subsequent offerings (e.g. ML inference, real-time-analytics, etc…). At many CDN edges, we run customer business logic as functions which read and write real-time customer-data. We put in the effort to make a CRDT-based data-layer that (for the first time ever) delivers low-latency dynamic web-processing on shared-global-data from the CDN edge. Kuhirō enables customers to move the dynamic latency-sensitive parts of their app from the cloud to the edge; customer apps become global-scale real-time applications, with Kuhirō handling the operations and scaling.
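To give a feel for why CRDTs suit shared data written at many edges, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest textbook CRDTs. It is not Kuhirō's implementation, and the post does not say which CRDT types its data-layer uses; the `GCounter` class and the edge names are assumptions made for illustration.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments recorded by that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Fold another replica's state into this one."""
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)


# Two hypothetical edge locations accept writes independently...
tokyo = GCounter("edge-tokyo")
paris = GCounter("edge-paris")
tokyo.increment(3)
paris.increment(2)

# ...and converge once they exchange state, in any order, any number of times.
tokyo.merge(paris)
paris.merge(tokyo)
paris.merge(tokyo)  # merging again changes nothing (idempotent)
print(tokyo.value(), paris.value())
```

Because the merge is commutative, associative and idempotent, each edge can apply writes locally with low latency and reconcile with the others later without coordination.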

Serverless at the Edge Architecture

Click to read more ...

Friday
Nov 3, 2017

Stuff The Internet Says On Scalability For November 3rd, 2017

Hey, it's HighScalability time: 


Luscious visualization of a neural network as a large directed graph. It's a full layout of the ResNet-50 training graph, a neural network with ~3 million nodes, and ~10 million edges, using Gephi for the graph layout, to output a 25000x25000 pixel image. (mattfyles)

 

If you like this sort of Stuff then please support me on Patreon. And take a look at Explain the Cloud Like I'm 10, my new book for complete cloud newbies. Thanks for your support! It means a lot to me.

 

  • 96.4%: adversarial algorithm fools Google's image recognition; 70%+: GOOG and FB influence over internet traffic; 99%: bird reduction on farms using tuned laser guns; 52.45%: people playing my indie game have pirated it; 371,642: open-source projects depend on React; 6: words needed to ID you in email; 2x: node.js speed increase with Turbofan; 3: students who discovered 'Dieselgate'; 215KWh: energy consumed per bitcoin xaction, enough for a car to travel 1,000 miles; 33%: increase in Alphabet's quarterly profits; $1 billion: Amazon's ad business; 1 pixel: all it takes to fool our AI overlords; 61%: increase in Alibaba revenue; 1 minute: time it takes a 1964 acoustic coupler modem to load a Wikipedia page; $30,000: bitcoin lost for want of a PIN; 40%: increased learning speed using direct current stimulation; 1 million: IoT devices owned by Reaper; $52.6 billion: Apple quarterly revenue; 

  • Quotable Quotes: 
    • @pasiphae_goals: #33: go’s compiler is extremely fast, giving you ample time to debug data races and deadlocks.
    • Tim O'Reilly: we can choose instead to lift each other up, to build an economy where people matter, not just profit. We can dream big dreams and solve big problems. Instead of using technology to replace people, we can use it to augment them so they can do things that were previously impossible.
    • André Staltz: These are no longer the same companies as 4 years ago. GOOG is not anymore an internet company, it’s the knowledge internet company. FB is not an internet company, it’s the social internet company. They used to attempt to compete, and this competition kept the internet market diverse. Today, however, they seem mostly satisfied with their orthogonal dominance of parts of the Web, and we are losing diversity of choices. Which leads us to another part of the internet: e-commerce and AMZN...In the Trinet, we will have even more vivid exchange of information between people, but we will sacrifice freedom. Many of us will wake up to the tragedy of this tradeoff only once it is reality.
    • @math_rachel: Data is not neutral. Data will always bear the marks of its history. 
    • Jeff Barr: p3.16xlarge is 2.37 billion times faster, so the poor little PC would be barely halfway through a calculation that would run for 1 second today
    • @krishnan: I criticized Google calling their App Engine Serverless during their cloud conference Here @ben11kehoe says the same
    • F5: The obvious lesson is that the state of IoT security is still incredibly poor, and we need to do a better job of threat modeling the Internet of Things.
    • Jakub Kasztalski: I just downloaded one of the pirated copies of my game. They asked me to remove my ad-blocker to keep their site running. Oh the sweet irony
    • Dart: The car would start drifting to side of the road and not know how to recover. The reason for the car’s instability was that no data was collected on the side of the road. During the data collection the supervisor always drove along the center of the road; however, if the robot began to drift from the demonstrations, it would not know how to recover because it saw no examples.
    • Tim O'Reilly: The company wanted to know how to get more developers for its platform. David asked a key question: “Do any of them play with it after work, on their own time?” The answer was no. David told them that until they fixed that problem, reaching out to external developers was wasted effort.
    • @swardley: X: Why do you think Amazon is so dangerous? Don't you think they will slow down?
      Me: No, they'll get faster.
    • @whispersystems: Signal is back after a brief service interruption. We appreciate your patience as we added more capacity to resolve connection errors.
    • lwansbrough: What the hell is taking up all that RAM? Slack is using over 800MB of RAM on my computer right now, and it's completely idle. I've got a C# process on my computer right now handling thousands of messages per second and consuming 1/20th the RAM. If you rebuild Slack in Xamarin you'd probably drop your RAM consumption to under 50MB, just a guess. JS fiends are gonna tell me C# ain't JS, and I agree. TypeScript ain't JS either, and it's a hell of a lot closer to C# than it is JS.
    • @vgill: Large-scale distributed systems are less about cool algorithms and more about the relentless hunting down of 6-sigma bugs.
    • Lots more genius quotes that can make you look even smarter. If that interests you, click through for enlightenment.

    Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

    Click to read more ...

Friday
Oct272017

Stuff The Internet Says On Scalability For October 27th, 2017

Hey, it's HighScalability time: 


Perfect! Now, imagine a little dog snuck under Big Dog's cone of shame and covered the food with its own cone of shame, and it won't leave. That's deadlock. Imagine a stream of little dogs sneaking under Big Dog's cone so Big Dog never gets a bite. That's livelock.

 

If you like this sort of Stuff then please support me on Patreon.

 

  • $100 billion: projected 2021 combined app store spend; 11 TB: SSD; 16: billion dollar disasters in the US this year; 8: meter long 3D printed bridge; 43%: employees who worry about losing their job due to their age; 125 TFLOPS: new AWS EC2 P3 instances; 7%: global Internet traffic flowing over QUIC; 50%: improvement in new in-package DRAM cache-management scheme; 43%: CockroachDB speed improvement executing parallel SQL statements; 325 billion: hours spent in Android apps in Q3; 4.5 million: C++ programmers; 3 trillion: ops per second in Pixel's Image Processing Unit; 80%: drop in Facebook referrals; 1,300 years: longest running business in the world; 40: age when tech workers start worrying about age discrimination; 400GE: first test by China Telecom Guangzhou and Huawei; 

  • Quotable Quotes:
    • @DynamicWebPaige: "One of our customers has 1 billion invocations daily; they told us their [Azure] bill was only $72 due to serverless computing." #RedShirtDevTour
    • Tim O'Reilly: “He realized then that history is a wave that moves through time slightly faster than we do.” If we are honest with ourselves, each of us has many such moments, when we realize that the world has moved on and we are stuck in the past.
    • @ehashdn: A primer: Site Reliability Engineers = sysadmins with Go / DevOps Engineers = sysadmins with Ruby / Systems Administrators = sysadmins with Perl
    • @ramez: Exponential gains in compute power produces only linear gains in AI accuracy. --> No runaway intelligence explosion 
    • @abnerg: $AMZN AWS reports YoY growth of 41.9% to $4.58B for the Q. $MSFT says Azure was up 90%. The ☁️ is on 🔥
    • @ryan_sb: Programming training doesn't need to start at birth. It's like plumbing: it's a skill, anyone can start anytime, it has hard and easy parts
    • Chuck Hollis: [Oracle] has re-implemented its entire business on a modern cloud platform – SaaS, PaaS and IaaS. Remember we’re talking a ~$200bn market cap company here – no easy trick.  The fun thing is that I’m part of a project to document the before and after around a whole raft of internal business metrics. The comparison is stunning, to say the least.
    • psyc: I can't be sure yet, but I'm starting to be concerned. I turned 40 recently. My current job search has lasted about 5x longer than any previous job search. I've been turned down for nonsensical reasons, such as not having enough experience in a specific language that I have a lot of experience in. (That was an assertion by the interviewer, not the result of technical questioning.) I've been interviewed several times by managers and directors 10 years younger than myself. I've noticed a distinct pattern where they'll ask very basic questions, I'll give a detailed answer that I know to be correct and insightful, and they'll say it's wrong - and I'm just dumbfounded, like what can I say? I'm not going to argue with them.
    • jquery: So even if there’s not explicit bias against older devs there is implicit bias by favoring quick whiteboard speed (memorization/practice) over the practiced thoughtfulness of older devs. And a one-off interview focused heavily on algos can sink anyone. It only takes one.
      This isn’t limited to FB. The $Elite companies I got offers from are the ones where I lucked through that one tricky interview by knowing it offhand. I worry as I get older, even as I become a stronger developer, I will become less and less able to marathon through these interviews. No wonder so many older devs switch to management.
    • Jonathan Solórzano-Hamilton: “You will never be able to understand any of what I’ve created. I am Albert F***ing Einstein and you are all monkeys scrabbling in the dirt.”
    • Franz Faerber: Hekaton achieves a roughly 15.7X performance improvement at 12 cores, while the scalability of the traditional engine is limited due to the overheads inherent in a disk-based architecture running a memory-bound workload.
    • @mhall119: I'm convinced that 90% of good software development is knowing what code not to write
    • Mohit Kumar: Coinhive has been hacked — a popular browser-based service that offers website owners to embed a JavaScript to utilise their site visitors' CPUs power to mine the Monero cryptocurrency for monetisation.
    • @timallenwagner: Couple updates: local execution is available via SAM Local. Lambda is also HIPAA eligible and PCI compliant.
    • @erictartanson: Amazon has 540K employees, nearly 7x that of Google and 4x that of Microsoft, 2nd largest US employer 
    • @NetflixUIE: Removing client-side React.js (but keeping it on the server) resulted in a 50% performance improvement on our landing page
    • Jonathan Lin Ern Sheong: Stack Overflow takes a hybrid approach, where links to questions are of the form https://stackoverflow.com/questions/42764046/ses-port-is-blocked-in-gcp with a perma ID and a vanity portion. It is friendly to both humans and computers. The URL remains valid even if the vanity portion changes when the Question is edited.
    • Vincent Lanaria: Alphabet says that deploying Project Loon in Puerto Rico is the first time it has used "machine learning powered algorithms" to ensure the balloons are over Puerto Rico. In other words, it hasn't found the optimal way of going at it just yet.
    • There are so many more quotes. Click through to read them all.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Oct242017

Sponsored Post: Loupe, Etleap, Aerospike, Stream, Scalyr, VividCortex, Domino Data Lab, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • On-demand Webinar. Fast & Frictionless - The Decision Engine for Seamless Digital Business. In this session, guest speakers Michele Goetz, Principal Analyst at Forrester Research and Matthias Baumhof, VP Worldwide Engineering at ThreatMetrix, discuss: How risk-based authentication leveraging digital identities is key to empowering customer transactions; How real-time customer trust decisions can reduce fraud and improve customer satisfaction; How a high performance Hybrid Memory Architecture (HMA) database helps continuously evaluate across a multitude of factors to drive decisioning at the lowest operational cost. View now

  • Advertise your event here!

Cool Products and Services

  • .NET developers dealing with Errors in Production: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Managers want to know what’s wrong right away, users don’t want to provide log data, and you spend more time gathering information than you do fixing the problem. To fix all that, Loupe was built specifically as a .NET logging and monitoring solution. Loupe notifies you about any errors and tells you all the information you need to fix them. It tracks performance metrics, identifies which errors cause the greatest impact, and pinpoints the root causes. Learn more and try it free today.

  • Enterprise-Grade Database Architecture. The speed and enormous scale of today’s real-time, mission critical applications have exposed gaps in legacy database technologies. Read Building Enterprise-Grade Database Architecture for Mission-Critical, Real-Time Applications to learn: Challenges of supporting digital business applications or Systems of Engagement; Shortcomings of conventional databases; The emergence of enterprise-grade NoSQL databases; Use cases in financial services, AdTech, e-Commerce, online gaming & betting, payments & fraud, and telco; How Aerospike’s NoSQL database solution provides predictable performance, high availability and low total cost of ownership (TCO)

  • What engineering and IT leaders need to know about data science. As data science becomes more mature within an organization, you may be pulled into leading, enabling, and collaborating with data science teams. While there are similarities between data science and software engineering, well intentioned engineering leaders may make assumptions about data science that lead to avoidable conflict and unproductive workflows. Read the full guide to data science for Engineering and IT leaders.

  • Etleap is a Redshift ETL tool that lets you bring all the data everyone wants into Redshift. It's easy enough for analysts to add and manage data connections on their own, without inundating IT/Engineering with requests for help. It takes just minutes to add new connections such as MySQL, Salesforce, S3, and many others, then you can "set it and forget it." Learn more about Redshift ETL with Etleap.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips.  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL envisions a world of adaptable databases and flexible data workloads - your data anywhere in real time. Today, global enterprises use MemSQL as a real-time data warehouse to cost-effectively ingest data and produce industry-leading time to insight. MemSQL works in any cloud, on-premises, or as a managed service. Start a free 30 day trial here: memsql.com/download/.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...

Monday
Oct232017

One model at a time: Integrating and running Deep Learning models in production at EyeEm

This is a guest post by Michele Palmia of @EyeEm.

We’ve now been running computer vision models in production at EyeEm for more than three years - on literally billions of images. As an engineer involved in building the infrastructure behind it from scratch, I both enjoyed and suffered the many technical challenges this task raised. This journey has also taught me a lot about managing processes and relationships with different teams, tasks of an especially challenging nature in a dynamic startup environment.

What follows is an attempt to consolidate the computer vision pipeline history at EyeEm, some of the challenges we had to face, some of the learning we’ve gained, and a glimpse into its future.

Index the world’s photos

Click to read more ...

Monday
Oct232017

New Book: Explain the Cloud Like I'm 10

What is the cloud? Why is it called a cloud? How does the cloud work? What does it mean when something is 'in the cloud'?

I wrote a new book: Explain the Cloud Like I'm 10, answering those questions for the complete beginner. It makes the perfect gift for Halloween. And Thanksgiving. And Christmas. Oh, and birthdays too!

The irony is, if you read HighScalability, you're not the target audience :-) Explain the Cloud Like I'm 10 is for people who hear about the cloud every day and have wondered what it is.

Talking with people outside the tech bubble I've found the cloud is still a mystery. I think that's because almost every explanation of the cloud I could find was a rewording of the same unhelpful technobabble.

In Explain the Cloud Like I'm 10 I've used a lot of pictures and a lot of examples. I go slow and easy. I try really hard to build up an intuitive understanding of what the cloud is and how it works.

If you know of anyone who might benefit from a book like this, I'd appreciate it if you'd pass it on.

thanks!