Monday
Oct232017

One model at a time: Integrating and running Deep Learning models in production at EyeEm

This is a guest post by Michele Palmia, now @EyeEm, good times @IBM, @UniPd and @UCC.

We’ve now been running computer vision models in production at EyeEm for more than three years - on literally billions of images. As an engineer involved in building the infrastructure behind it from scratch, I both enjoyed and suffered the many technical challenges this task raised. This journey has also taught me a lot about managing processes and relationships with different teams, tasks of an especially challenging nature in a dynamic startup environment.

What follows is an attempt to consolidate the history of the computer vision pipeline at EyeEm, some of the challenges we had to face, some of the lessons we’ve learned, and a glimpse into its future.

Index the world’s photos

Click to read more ...

Monday
Oct232017

New Book: Explain the Cloud Like I'm 10

What is the cloud? Why is it called a cloud? How does the cloud work? What does it mean when something is 'in the cloud'?

I wrote a new book: Explain the Cloud Like I'm 10, answering those questions for the complete beginner. It makes the perfect gift for Halloween. And Thanksgiving. And Christmas. Oh, and birthdays too!

The irony is, if you read HighScalability, you're not the target audience :-) Explain the Cloud Like I'm 10 is for people who hear about the cloud every day and have wondered what it is.

Talking with people outside the tech bubble I've found the cloud is still a mystery. I think that's because almost every explanation of the cloud I could find was a rewording of the same unhelpful technobabble.

In Explain the Cloud Like I'm 10 I've used a lot of pictures and a lot of examples. I go slow and easy. I try really hard to build up an intuitive understanding of what the cloud is and how it works.

If you know of anyone who might benefit from a book like this, I'd appreciate it if you'd pass it on.

thanks! 

 

Friday
Oct202017

Stuff The Internet Says On Scalability For October 20th, 2017

Hey, it's HighScalability time: 

 

Cassini's last image of Saturn, stitched together from 11 color composites, each a stack of three images taken in red, green, and blue channels. (Jason Major)

 

If you like this sort of Stuff then please support me on Patreon.

  • 21 million: max bitcoins ever; #2: Alibaba's cloud?; 1M MWh: Amazon Wind Farm Texas with 100 Turbines is live; $1000: cost to track someone with mobile ads; 20%: ebook sales of total; 17: qubit chip; 30%: Uber deep learning speedup using RDMA; 

  • Quotable Quotes:
    • Tim O'Reilly: So what makes a real unicorn of this amazing kind? 1.  It seems unbelievable at first. 2.  It changes the way the world works. 3.  It results in an ecosystem of new services, jobs, business models, and industries.
    • @rajivpant: AlphaGo has already beaten two of the world's best players. But the new AlphaGo Zero began with a blank Go board and no data apart from the rules, and then played itself. Within 72 hours it was good enough to beat the original AlphaGo by 100 games to zero!
    • @swardley: Containers aka winning battle but losing the war. AMZN joining CNCF is misdirection. Lambda is given a free pass to entire software industry
    • vlucas: Event-Driven architecture sounds like somewhat of a panacea with no worries about hard-coded or circular dependencies, more de-coupling with no hard contracts, etc. but in practice it ends up being your worst debugging nightmare. I would much rather have a hard dependency fail fast and loudly than trigger an event that goes off into the ether silently.
    • @xaprb: Monitoring tells you whether the system works. Observability lets you ask why it's not working.
    • Brenon Daly: Startups are increasingly stuck. The well-worn path to riches – selling to an established tech giant – isn’t providing nearly as many exits as it once did. In fact, based on 451 Research calculations, 2017 will see roughly 100 fewer exits for VC-backed companies than any year over the past half-decade. This current crimp in startup deal flow, which is costing billions of dollars in VC distributions, could have implications well beyond Silicon Valley.
    • @krishnan: The battle lines are IaaS+ with Kubernetes vs Platforms like OpenShift or Pivotal CF Vs Serverless. No one is a winner yet
    • @fredwilson: "We now have 30,000 data scientists .... 100x more than any other hedge fund ... we are not yet two years old"
    • Elon Musk: Our goal is get you there and ensure the basic infrastructure for propellant production and survival is in place. A rough analogy is that we are trying to build the equivalent of the transcontinental railway. A vast amount of industry will need to be built on Mars by many other companies and millions of people.
    • Samburaj Das: ‘Dubai Blockchain Strategy’, which aims to record and process 100% of all documents and transactions on a blockchain by the year 2020. The sweeping blockchain mandate was announced by Hamdan bin Mohammed, the crown prince of Dubai, in October 2016.
    • @danielbryantuk: "Defence in depth can add 15% to dev cost. Some think this is a lot, but does it compare to data being exposed?" @eoinwoodz #OReillySACon
    • @cmeik: It's solving the issues around gossip floods, it's providing a more reliable infrastructure, it's alleviating head of line blocking.  These are the problems that when solved *enable* large clusters.
    • Benedict Evans: the four leading tech companies of the current cycle (outside China), Google, Apple, Facebook and Amazon, or ‘GAFA’, have together over three times the revenue of Microsoft and Intel combined (‘Wintel’, the dominant partnership of the previous cycle), and close to six times that of IBM. They have far more employees, and they invest far more.
    • Alan Andersen: At Canopy Tax we have been using dockerized micro-serviced Java containers using vertx with RxJava and have found it to be highly performant and memory efficient.
    • David Gerard: as of June 2017 the Bitcoin network was running 5,500,000,000,000,000,000 (5.5×10^18, or 5.5 quintillion) hashes per second, or 3.3×10^21 (3.3 sextillion) per ten minutes
    • @CodeWisdom: "One accurate measurement is worth a thousand expert opinions." - Grace Hopper
    • Hiten Shah: if you want to build the next $1B+ SaaS app, you have to remember to constantly evolve on top of your product. Don’t build around any one single feature that the competition can rip out. Alternatively, make that one single feature so deeply integrated into everyone else’s product that it’s pointless for anyone else to copycat it.
    • @fmanjoo: Every number in Netflix’s earnings report is just extraordinary. $8 billion on content, 80 movies by next year!
    • Bloomberg: self-driving technology is a huge power drain. Some of today’s prototypes for fully autonomous systems consume two to four kilowatts of electricity -- the equivalent of having 50 to 100 laptops continuously running in the trunk
    • @swardley: I view that once again, Amazon has been given too long to freely invade a space .. and Lambda is gunning for the entire software industry.
    • Greg Ferro: First, Just about everything in this submission [from Oracle] sounds self serving and protecting their own revenue. Its hard to see this as a genuine attempt to add value to the process. Second, the author of the letter has adopted a tone that is approximately like a father berating a child for uppity behaviour.
    • Daniel Lemire: What I am really looking forward, as the next step, is not human-level intelligence but bee-level intelligence. We are not there yet, I think.
    • Richard Branson: People used to raid banks and trains for smaller amounts - it’s frightening to think how easy it is becoming to pull off these crimes for larger amounts.
    • tbarbugli: Disclaimer: CTO of Stream here. We experimented writing Cython code to remove bottlenecks, it worked for some (eg. make UUID generation and parsing faster) and think that’s indeed good advice to try that before moving to a different language. We still decided to drop Python and use Go for some parts of our infrastructure 
    • Paul Johnston: The Serverless conversation is shifting from Functions to Analytics and Monitoring.
    • Gui Cavalcanti: Eagle Prime has modular weapons systems. The chainsaw is absolutely my favorite. I had no idea it could do as much damage as it does
    • @sheeshee: I have a web app with a postgres database running in a kubernetes cluster. I am very hip now.
    • There are so many more good quotes. Click to read more.
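The two Bitcoin hash-rate figures in the David Gerard quote above are the same number at different time scales. A minimal sketch of that conversion, assuming only the quoted per-second rate (the function and constant names are made up for illustration):

```python
# Per-second hash rate quoted by David Gerard for June 2017.
HASHES_PER_SECOND = 5.5e18

# Ten minutes, the window used in the quote, expressed in seconds.
SECONDS_PER_TEN_MINUTES = 10 * 60


def hashes_in_window(rate_per_second: float, seconds: int) -> float:
    """Total hashes computed at a constant rate over a window of time."""
    return rate_per_second * seconds


total = hashes_in_window(HASHES_PER_SECOND, SECONDS_PER_TEN_MINUTES)
print(f"{total:.1e} hashes per ten minutes")  # prints "3.3e+21 hashes per ten minutes"
```

The sketch treats the rate as constant over the window, which is a simplification; the real network rate fluctuates.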

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Monday
Oct162017

ButterCMS Architecture: a Mission-Critical API Serving Millions of Requests per Month

This is a guest post by Jake Lumetta, co-founder and CEO of ButterCMS.

ButterCMS lets developers add a content management system to any website in minutes. Our business requires us to deliver near-100% uptime for our API, but after multiple outages that nearly crippled our business, we became obsessed with eliminating single points of failure. In this post, I’ll discuss how we use Fastly’s edge cloud platform and other strategies to make sure we keep our customers’ websites up and running.

At its core, ButterCMS offers:

ButterCMS Tech Stack

Click to read more ...

Friday
Oct132017

Stuff The Internet Says On Scalability For October 13th, 2017

Hey, it's HighScalability time: 

 

Tech is transforming how food is being grown. Lots of opportunity for local nerdy production. Greenhouses even look like datacenters! (This Tiny Country Feeds the World)

 

If you like this sort of Stuff then please support me on Patreon.

 

  • 320 trillion: ops/second in Nvidia driverless-car computer; .25%: Lambda invocations impacted by cold starts; $30,000: monthly take hijacking computers to mine cryptocurrency; 400 Gbps: Ethernet standard to be ratified this year; 2.1 million: MySQL 8.0 query/second; 100,000: Kiva robots owned by Amazon; 50,000: greenhouses in Egypt's new farm city; 100 petabytes: new hard drives ordered by Backblaze; 20 million: max Bitcoin users per month; 662 million: unused vacation days in US; 92 billion: Pornhub views per year; 1,000: new Facebook hires to review ads; 12 million: Tinder matches per day; $1 billion: Google training grants; 

  • Quotable Quotes: 
    • @toddmotto: Space X sends a rocket up into space. Lands back on its feet back on earth 7minutes later. I can't even run an npm install in that time.
    • nappy-doo: Years ago, I started at Google, and was in Charlie's cafe, eating alone. I'm sitting there, and up walks Ken Thompson. He sits down, introduces himself as Ken, and asks me what I work on. We sat there for a good 40 minutes just chatting. One of my coolest memories of working at Google was that time. He was so down to earth, never bothered to talk up about who he was (even though I knew). I really appreciated that.
    • @asymco: The popularity of iPhone with US teens at all-time high. Android is at 13% and flat. Implies Apple taking share from non-consumption.
    • @brianleroux: “We have leapfrogged containers which are a disaster for security”—@marknca on serverless #ServerlessConf
    • batmansmk: In every way, [Tensorflow] reminds me of Angular.io project. A failed promise to be true multi-language, failing to use the expressiveness of python, with a super large API that tries to do things we didn't ask it to do and a lack of a general sounding architecture.
    • @bodil: Fun fact: the Erlang runtime is implemented in C, which is an untrue computer language.
    • @Werner: There is no compression algorithm for experience.
    • @jaksprats: Ramcloud: JS faster than C running untrusted code due to sand boxing overheads … ... Cloudflare workers nailed it :)
    • sig: It is amazing to think that only a few years ago you could take an old laptop, download a miner, stick it in a closet and it would spit out something now worth a quarter of a million dollars every few days. There are a lot of problems with Bitcoin. If you look at my comment history you will see I am pretty down on it for all sort of reasons I won't go into at the moment. However sometimes you just have to take a step back and admire how crazy impressive it is that Bitcoin has reached this point.
    • Linda Nichols~ How can you go serverless without vendor lock-in? Linda proposes two possibilities: containers and multi-provider frameworks.
    • @DivineOps: I don't think anyone can *afford* inventory. It's just that if everyone is moving slow, you can get away with moving slow
    • mrb: The bug behind BIP 50 caused a fork, however the bugfix wasn't a hard-fork. By definition a hard-fork is a fork that require all Bitcoin nodes to be updated. In the case of that bugfix only some nodes had to be updated (the ones run by miners making up a majority of the hash power) then the rest of the non-updated nodes automatically reorg'd to the right chain, the one with the most work.
    • @datawireio: 100+ Million members. 100s of #microservices. Hundreds of thousands of instances. <10 Core SREs.
    • @rbranson: ... but single-rack systems are an increasingly rare situation, won’t practically exist in 10 years.
    • @ryan_sb: Not that it's always right to follow Google/FB/Twitter, but note that *all* of them have kept monorepos through massive growth
    • Bob Frankston: I hate the word coding; it’s like calling writing, typing.
    • cletus: People also overestimate their needs. They rush to create Hadoop clusters and distributed NoSQL solutions because, you know, relational DBs can't keep up with their "Big Data" (which means, millions of rows) when in fact you can dump billions of rows into a single MySQL instance.
    • Bob Frankston: Algorithms are the new bureaucracy
    • Thomas Ryan: Coffeelake is a good chip and a clear improvement over both Skylake-X and Kabylake. It’s not a massive leap, but it’s a generation of products that appears to be solidly better than the last. It has extended Intel’s lead in the areas where they were beating AMD and largely closed the gaps in the areas that they weren’t.
    • @alexlovelltroy: Oooh. Describing serverless microservices as state machines simplifies defining microservice boundaries. #ServerlessConf
    • kevin42: I used a GCE to test some image processing software I wrote a while ago (it runs on a very large dataset). I configured a 64 core machine with 128gb of memory. It ran perfectly, although it cost about $200 to run the test for a day. Sure, it wasn't the highest performance per CPU, but I didn't have to buy the bare metal, I can scale up the number of cores if need be, and I can fire one up whenever I want one.
    • godzillabrennus: I’ve been a [Backblaze] customer for years and recently had a catastrophic failure of a computer and it’s direct attached backup drive. I have spent the last four days waiting for backblaze to create a restore a backup for a computer on and last I checked it was at 9%.
    • hashtagframework: I wonder if the author realizes that the Airbus A380 is itself an IT project with 120 million lines of code, and 330 miles worth of 100,000 individual wires that perform 1,150 different tasks. IT gets no credit... the wings are doing all the work.
    • @troyhunt: 23 hours and 42 minutes from initial private disclosure to @disqus to public notification and impacted accounts proactively protected
    • @bglick: 2 chained functions with 90% performance guarantees have an 81% performance guarantee. Chain 7 and you're < 50%. #Serverlessconf
    • @faunadb: We're moving from a product stack to more utility based architectures/practices (aka #serverless) whether you like it or not, so get on board. "It's not a question of if, but when." @swardley #serverlessconf 
    • @FrankPasquale: “Software problems accounted for nearly 15% of US car recalls in 2015, up from less than 5% in 2011"
    • @jessitron: Octopuses do distributed decisionmaking. a tentacle can see and decide what color to be, locally.
    • @faunadb: "When you're on the cutting edge of the cutting edge of a new technology, you have to realize there's a very long tail of adoption in a large organization" @marknca #serverlessconf
    • @Joab_Jackson: CQRS (Command/Query Responsibility Seperation): Fancy name for separating reads & writes into seperate channels @ben11kehoe #ServerlessConf
    • @EconCharlesRead: 40% of Europe’s domestic freight goes by sea, but just 2% does in America due to protectionist laws from 1920
    • There are many more quotes. Click through to read it all.
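The arithmetic behind the @bglick quote above (two chained functions at 90% give 81%, seven give under 50%) is a product of per-function guarantees. A minimal sketch, assuming every function in the chain has the same guarantee and that misses are independent; both assumptions and all names here are illustrative, not from the original tweet:

```python
def chained_guarantee(per_function: float, chain_length: int) -> float:
    """Guarantee for a chain in which every function must meet its own
    guarantee, assuming independent functions with identical guarantees."""
    return per_function ** chain_length


def longest_chain_at_or_above(per_function: float, floor: float) -> int:
    """Largest chain length whose combined guarantee is still >= floor."""
    if not 0.0 < per_function < 1.0:
        raise ValueError("per_function must be strictly between 0 and 1")
    length = 0
    while chained_guarantee(per_function, length + 1) >= floor:
        length += 1
    return length


for n in (1, 2, 7):
    print(n, round(chained_guarantee(0.9, n), 3))
```

Under these assumptions, a chain of six 90% functions is the longest that stays at or above 50%; the seventh pushes the combined figure to roughly 48%.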

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Oct102017

Sponsored Post: Loupe, Etleap, Aerospike, Stream, Scalyr, VividCortex, Domino Data Lab, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • On-demand Webinar. Fast & Frictionless - The Decision Engine for Seamless Digital Business. In this session, guest speakers Michele Goetz, Principal Analyst at Forrester Research and Matthias Baumhof, VP Worldwide Engineering at ThreatMetrix, discuss: How risk-based authentication leveraging digital identities is key to empowering customer transactions; How real-time customer trust decisions can reduce fraud and improve customer satisfaction; How a high performance Hybrid Memory Architecture (HMA) database helps continuously evaluate across a multitude of factors to drive decisioning at the lowest operational cost. View now

  • Advertise your event here!

Cool Products and Services

  • .NET developers dealing with Errors in Production: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Managers want to know what’s wrong right away, users don’t want to provide log data, and you spend more time gathering information than you do fixing the problem. To fix all that, Loupe was built specifically as a .NET logging and monitoring solution. Loupe notifies you about any errors and tells you all the information you need to fix them. It tracks performance metrics, identifies which errors cause the greatest impact, and pinpoints the root causes. Learn more and try it free today.

  • Enterprise-Grade Database Architecture. The speed and enormous scale of today’s real-time, mission critical applications has exposed gaps in legacy database technologies. Read Building Enterprise-Grade Database Architecture for Mission-Critical, Real-Time Applications to learn: Challenges of supporting digital business applications or Systems of Engagement; Shortcomings of conventional databases; The emergence of enterprise-grade NoSQL databases; Use cases in financial services, AdTech, e-Commerce, online gaming & betting, payments & fraud, and telco; How Aerospike’s NoSQL database solution provides predictable performance, high availability and low total cost of ownership (TCO)

  • What engineering and IT leaders need to know about data science. As data science becomes more mature within an organization, you may be pulled into leading, enabling, and collaborating with data science teams. While there are similarities between data science and software engineering, well intentioned engineering leaders may make assumptions about data science that lead to avoidable conflict and unproductive workflows. Read the full guide to data science for Engineering and IT leaders.

  • Etleap is a Redshift ETL tool that lets you bring all the data everyone wants into Redshift. It's easy enough for analysts to add and manage data connections on their own, without inundating IT/Engineering with requests for help. It takes just minutes to add new connections such as MySQL, Salesforce, S3, and many others, then you can "set it and forget it." Learn more about Redshift ETL with Etleap.

  • InMemory.Net provides a .NET native in-memory database for analysing large amounts of data. It runs natively on .NET, and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL envisions a world of adaptable databases and flexible data workloads - your data anywhere in real time. Today, global enterprises use MemSQL as a real-time data warehouse to cost-effectively ingest data and produce industry-leading time to insight. MemSQL works in any cloud, on-premises, or as a managed service. Start a free 30 day trial here: memsql.com/download/.

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...

Monday
Oct092017

What will programming look like in the future?

 

Maybe programming will look something like the above video. Humans and AIs working together to produce software better than either can separately.

The computer as a creative agent, working in tandem with a human partner, to produce software, in a beautiful act of co-creation.

The alternative vision—The Coming Software Apocalypse—is a dead end. Better requirements and better tools have already been tried and found wanting. Requirements are a trap. They don't work. Requirements are no less complex and undiscoverable than code. Tools are another trap. Tools are just code that encode an inflexible solution to a problem that's already been solved.

Admittedly, I'm cheating. I have no idea how any of this will work, but here are the seeds of how it has already started:

Here's what we do know: neither tools nor requirements are a silver bullet. They are methods of incrementally improving software quality; they do nothing to increase the quantity of software produced.

What we need is a manufacturing process that puts software production on an exponential curve. The only conceivable tool we have at the moment to put software on an exponential production curve is AI. That's the only way software can truly eat the world.

Right now, limited as we are by human programmers using methods that haven't changed much in 30 years, software is just nibbling at the world. And that won't scale. We need more software. A lot more software. And humans are the bottleneck.

Are humans and AIs working together to co-create software the solution? I don't know, but what else is there?

 


Friday
Oct062017

Stuff The Internet Says On Scalability For October 6th, 2017

Hey, it's HighScalability time: 

 

LiDAR sees an enchanted world. (Luminar)

 

If you like this sort of Stuff then please support me on Patreon.

 

  • 14TB: Western Digital Hard Drive; 3B: Yahoo's perfidy; ~80%: companies traded on U.S. stock market 1950-2009 were gone by 2009; 21%: conversion increase with AI-enabled site personalisation; $1 billion: US Air Force jets off the cloud; 1 billion: iOS devices in use; 1000x: new DeepMind WaveNet model produces 20 seconds of higher quality audio in 1 second; 96: vCPUs on new GCE machine type, with 624GB of memory; 

  • Quotable Quotes:
    • fusiongyro: The amount of incipient complexity in programming has been growing, not going down. What's more complex, "hello, world" to the console in Python, or "hello world" in a browser with the best and newest web stack? Mobility and microservices create lots of new edge cases and complexity; do non-programmers seem particularly well-equipped to handle edge cases to you? The problem has never really been the syntax; if it were, non-programmers would have made great strides with Applescript and SQL, and we'd all be building PowerBuilder libraries for a living. The problem is that programming requires a mode of thinking which is difficult. Lots of people, even people who do it daily, who are trained to do it and exercise great care and use great tools, are not great at it. This is not a syntax problem or a lack of decent libraries problem. We have simple programming languages with huge bodies of libraries. What's hard is the actual programming.
    • @troyhunt: 1 person didn’t patch Struts, got Equifax breached, sold shares & created dodgy search site with bad results. Right?
    • @rob_pike: Once in a while I need to build some large system written in C or C++ and am reminded why we made Go. #golang
    • @adam_chal: Me before #strangeloop: I'm not a real programmer unless I know Haskell Me after #strangeloop: I'm not a real programmer unless I knit
    • Julian Squires: I make a petty point about premature optimization; don't go out and rewrite your switch statements as binary searches by hand; maybe do rewrite your jump tables as switch statements, though.
    • @GossiTheDog: Re this - vuln scanners only find the vuln if you point them at a Struts URL. If you just point them at hostname or IP, it won’t find vuln.
    • @stevesi: Yes very much. Not unlike Wells Fargo trying to find a mid-level manager who signed people up for credit cards independent of metrics/execs.
    • @patio11: We would laugh out of the room a CEO who said "The reason that we didn't file our taxes last year was an employee forgot to buy a stamp."
    • @swardley: In general, the reasons for hybrid cloud have nothing to do with economics & everything to do with executives justifying past purchases 
    • @asymco: Changes in Android propagate to users over six years. iOS propagates in about three months.
    • bb611: It isn't luck [re: Incident: France A388 over Greenland on Sep 30th 2017, fan and engine inlet separated]. It is the result of millions of engineering hours spent on the development of highly reliable and resilient passenger aircraft, an emphasis on public identification and dissemination of design weaknesses, errors, and failures, and an unwavering focus by industry regulators on safety.
    • @mipsytipsy: "I would rather have a system that's 75% 'down' but users are fine, than a system 99.99% 'up' but user experience is impacted." #strangeloop
    • psyc: A huge proportion of the ICOs I investigate turn out to be pure facade. It's amazing to me just how quickly this con was honed and formalized, but I guess people have always been good at aping when it comes to get-rich-quick bandwagons. The standard ICO consists solely of: 1) A slick website. 2) A well-produced video. 3) A whitepaper that discusses trivially standard blockchain features and goals. No differentiation necessary. 4) The appearance that prominent or well-credentialed people are working on the "technology". That's all. The "product" is vapor. The real product is another pump & dump vehicle to satisfy the insatiable demand for pump & dump vehicles. This product is sold to the "investors" during the ICO. Said "investors" are even explicitly awarded more coins for shilling the pump everywhere by creating amateurish articles and YouTube videos.
    • nameless912~ As a developer at a company that's trying to shove Lambda down our throats for EVERYTHING...AWS needs to get better at a few key things before Lambda/serverless become viable enough that I'll actually consider integrating them into my services: 1. Permissions are a nightmare. 2. Networking is equally nightmarish. 3. If the future of compute is serverless, then Lambda, Google Cloud Functions, and whatever half-baked monstrosity Azure has cooked up are going to have to get together and define a common runtime for these environments.
    • @erikstmartin: “OS’s are dinosaurs. Let them rest” - @nicksrockwell #velocityconf
    • @bridgetkromhout: Thought experiment: what if all your systems restart at once? How long does it take you to recover? *Can* you? @whereistanya #velocityconf
    • Eric Hammond: Some services, like API Gateway, are far more complicated, difficult to use, and expensive than I expected before trying. Other services, like Amazon Kinesis Streams, are simpler, cheaper, and far more useful than I expected.
    • nameless912: please, please chop off my hands and pull out my eyeballs if cloud computing becomes yet another workflow engine. I thought we killed those off in the 90s.
    • MIT: The proof-of-principle experiment that Neill and Roushan and co have pulled off is to make a chip with nine neighboring loops and show that the superconducting qubits they support can represent 512 numbers simultaneously.
    • @swardley: Equifax: We're a security nightmare! Adobe: Hold my beer Deloitte: Hold my beer Yahoo: Amateurs. Learn from a pro -
    • @somic: seeing more & more indicators these days that devops as a unifying idea is now dead. devops appears now to be ops who can write simple code
    • @slightlylate: So true. I'll trade 10 devs who are high on abstractions and metaprogramming for one who gives a damn about the user.
    • @postwait: I am just a single data point, but I use about 10% of my CS education (CS/MSe/~PhD) daily; about 50% of it monthly. I value it immensely.
    • @mweagle: The Go compiler will likely slow down your first sprint. It will radically improve your marathon performance.
    • There are many more quotes. Click through to the full article to read them. Or not. Up to you.

  • The Coming Software Apocalypse. After all these years it's still strange to see people fall into the "if we only had complete requirements we could finally make reliable systems, what's wrong with these idiots?" tarpit. Requirements are a trap. We went through all of this with waterfall and big design up front. It doesn't work. Requirements are no less complex and undiscoverable than code. Tools are another trap. Tools are code. Tools encode one perspective on a solution space and if there's anything the real world is good at, it's destroying perspective. IMHO, our most likely future is to treat programming as an act of computational creativity. Human programmers will work with AIs to co-create software systems. We'll work together to produce better software than a human can on their own or an AI can produce on its own. We're better together, which is why I'm not afraid AI will replace programmers. Here's an example in music, A.I. Experiments: A.I. Duet, where a computer accompanies a piano player. Here's a better example, Ripples - A piano duet for improvising musician and generative software, where the AI piano player riffs off a human in real-time. You can imagine this is how software will be built in the future. Here's a hint at the productivity gain, though it isn't a complete example, because what I'm talking about doesn't exist yet: @DynamicWebPaige: Blue lines: @Google's old Translate program, 500k lines of stats-focused code. Green: now, 500 lines of @tensorflow. See also, Jeff Dean On Large-Scale Deep Learning At Google and Peter Norvig on Machine Learning Driven Programming: A New Programming For A New World


Monday
Oct022017

Ripple: The Most (Demonstrably) Scalable Blockchain

 

This is a guest post by Mark Travis, Performance Engineer at Ripple.

Ripple’s XRP Ledger is a blockchain-based payment network that transfers funds between any type of currency within a few seconds with average transaction costs of a fraction of a penny. The core of this peer-to-peer network is an open source C++ application called rippled. Ripple’s goal is to supplant the world’s existing legacy payment networks. As such, scalability is a continuous goal. This document describes how the rippled team has integrated performance engineering into its development processes, and how this has contributed to throughput gains of over 1000%.

Performance engineering practices deliver benefits in addition to measurable performance gains. These include the ability to report on the capabilities of the software so that users can feel confident that their needs will be met by the system. Performance engineering informs capacity planning and optimal configuration of environments to support the application. Many performance problems are caught and addressed before customers notice them. As process automation improves, each change to the software can be quickly assessed for improvement or regression. This methodology also makes better use of developer time by helping choose the most effective tasks for improving performance. Any software project serious about supporting global scale should integrate performance engineering into its development cycle.
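
The idea that each change can be quickly assessed for improvement or regression could look something like the sketch below. It is not Ripple's tooling: the function name `assess_change`, the use of median throughput samples, and the 5% tolerance are all assumptions chosen for illustration.

```python
from statistics import median


def assess_change(baseline_tps, candidate_tps, tolerance=0.05):
    """Compare throughput samples (transactions/sec) from two builds.

    Hypothetical helper: returns a label and the relative change of the
    candidate's median throughput against the baseline's median.
    """
    base = median(baseline_tps)
    cand = median(candidate_tps)
    change = (cand - base) / base
    if change > tolerance:
        return "improvement", change
    if change < -tolerance:
        return "regression", change
    # Differences inside the tolerance band are treated as noise.
    return "neutral", change


if __name__ == "__main__":
    label, change = assess_change([100, 102, 98], [120, 118, 122])
    print(f"{label}: {change:+.1%}")
```

Using the median rather than the mean is a design choice that keeps a single noisy run from flipping the verdict; a real pipeline would likely need more samples and a statistical test.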

Performance Engineering Method


Friday
Sep292017

Stuff The Internet Says On Scalability For September 29th, 2017

Hey, it's HighScalability time: 

 

Latency Numbers Every Programmer Should Know plotted over time. Click on and move the slider to see changes. There were a lot more blocks in 1990.

 

If you like this sort of Stuff then please support me on Patreon.

 

  • 1040: undergrads enrolled in Stanford's machine learning class; 39: minutes to travel from New York to Shanghai on Elon's rocket ride; $625,000: in stolen electronic-grade polysilicon; 160: terabits of data per second for Microsoft's new Trans-Atlantic Subsea Cable; 8K: people in Microsoft's AI group; 110%: increase in ICS/SCADA attacks from 2016 to 2017; 2 million: advertisers on Instagram; ~70%: savings using new Spot instance checkpointing; 10,000: nuts a year stored by a fox squirrel; $22.1 billion: IaaS market in 2016; 

  • Quotable Quotes:
    • @patio11: Wife: "Hold hands when crossing the street." *2 year old grabs own hands* "OK Mommy." Me: "Oh you're going to be so good at programming."
    • Charlie Demerjian: Intel’s “new” 8th Gen CPUs are a stopgap OEM placation to cover for a failed process, but they do bring some advances. As SemiAccurate sees it, Intel took .023 steps forward with the hardware and their messaging took three steps back.
    • Richard Dawkins: If AI Ran the World, Maybe it Would Be a Better Place
    • @swardley: No-one should be in any doubt that AWS is gunning for entire software stack (all of it) over next decade. Lambda, one code to rule them all.
    • @mstine: "you are as reliable as the weakest component in your stack...and people are a really weak component." @adrianco #CloudNativeLondon
    • @swardley: The vertical depth play will be found wanting as Amazon uses ecosystems to chew up horizonal components and move up the value chain.
    • @swardley: I assume Goldmans is bricking itself that Amazon might come into its industry and with good reason. The fattened slug wouldn't last long.
    • reacweb: I have a baremetal server and 99% of my admin task is apt-get update, apt-get upgrade. I have a diary where I write all the other admin tasks (the most complex one was configuring apache). When I buy a new server, I reread my notes to do some copy/paste. The freedom of a bare server is priceless ;-)
    • tedu: if you’re going to retry automatically, be damn sure the operation either failed or is idempotent. Or next week you can be the lucky author of the blog post about what happens when your billing database reverts to readonly mode, preventing any transactions from being marked paid, sending the payment service into a loop where it charges customers their monthly bill every 10 minutes for ten hours.
    • Jeff Barr: You can now resume workloads on spot instances and fleets. As long as they checkpoint to disk, workloads that aren’t time sensitive just got a whole heck of a lot cheaper for you.
    • Jamie Condiffe: The experiment uses drones to shuttle parcels of up to 4 pounds from a distribution center to vans—at least, when they’re parked at one of four rendezvous points around the city, anyway. The vans have a special landing zone on their roof, which allows the drone to set down and and drop off its payload. The driver of the vehicle is then tasked with actually delivering the package to a customer.
    • @codinghorror: Absolutely monstrous http://browserbench.org/Speedometer/  numbers for iPhone 8. That is over 1.4x the iPhone 7
    • @EdSwArchitect: "Logstash is not going to go to 500,000 log entries / second". Kafka does. #StrataData - Streams & Containers talk
    • endymi0n: Managing financial matters on AWS is such a royal PITA, I'm so glad we switched 90% of our stack to Google.
    • shub: Good luck finding anything public about graph processing on a dataset too large to fit on a single machine. I can launch an AWS instance with 128 cores and 4 TB RAM--how many triples is too many for that monster? Tens of billions? Hundreds of billions?
    • @pzfreo: @adrianco #CloudNativeLondon By the time you've decided on your container orch'n system, you could have the whole thing done in #Serverless
    • @danielbryantuk: "The easy win to get started with chaos engineering is to run a game day with what you currently have" @adrianco #CloudNativeLondon
    • @Tulio_de_Souza: CloudNative principle: "pay for what you used last month and not what what you guess you will need next year" #cloudnativelondon @adrianco
    • Yasmin Anwar: fox squirrels apparently organize their stashes of nuts by variety, quality and possibly even preference
    • @kopertop: Unfortunately @randybias, the most important thing @awscloud did cant be replicated by *any* Software. Marketing, innovation, and support.
    • Sarty: I guess the argument [for using Filecoin] is that I should trust a single behemoth like Amazon less than I should trust an arbitrary number of nameless, faceless on-the-cheap suppliers on the premise that a nebulous algorithm that I (the average user) don't totally understand will stochastically cause those suppliers to lose their contract if they lose too much data, but that's okay because a different nebulous algorithm I don't totally understand can reconstruct the data as long as most of those nameless, faceless suppliers are on the up-and-up, all on the fly and completely decentralized? Yeah, sure, sign me up. What could possibly go wrong?
    • danudey: The biggest change for us was SSDs coming down in price. Whereas before I might need four read slaves to ensure that at peak load I'm handling all my transactions within X ms, now I can guarantee it on one server. More importantly, in our industry where we're vastly more write-constrained than read-constrained and we're faced with e.g. MySQL not being able to easily spread writes over multiple servers simply, the appeal of something like MongoDB or Cassandra with built-in sharding and rebalancing to spread out both reads and writes sounds very appealing. And again, I can move from a giant complicated, expensive, heat-producing multi-disk raid10 to a pair of SSDs in RAID1 (or better) and easily meet my iops requirements. Without being able to upgrade to SSDs I think we would have been looking into other systems like Cassandra a lot sooner, but right now we can pretty easily throw some money at the problem and it goes away.
    • LibertarianLlama: Perhaps millennia from now we will be building Dyson Spheres around stars to use the energy to mine bitcoin.
    • happymellon: I work with satellite imagery processing which is quite large in it raw data format, and after a decade we are not dealing with petabytes of active data, hundreds of gigs for a full earth coverage. Before that I have held positions in finance, dealing with realtime transaction processing. We did not work in petabytes. If you are working in petabytes you are storing crap in your production database, and 99% of that data is wasted.
    • @karpathy: Kaggle competitions need some kind of complexity/compute penalty. I imagine I must be at least the millionth person who has said this.
    • unclebucknasty: I think I'm one of those graybeards. I see it in so many things tech. It's a pattern, and once you've seen it repeat a half-dozen times and also gain a depth of experience over that time, you can actually recognize when something represents genuine progress vs yet another passing fad. Spoiler alert: those that are most rabidly promoted are often the latter. But, if you try to raise the point in the midst of the latest fad, you generally get shouted down. So, you wait until the less-jaded figure it out...again. It was plainly obvious for NoSQl, just as it now is for SPAs (or at least our current approach). Don't believe me? Wait 5 years.
    • CBobRobison: Outsourcing Attacks are prevented by implementing Proof-of-Spacetime (PoST). With PoST a node is required to put a deposit down based on the amount of storage it's providing. It then has to continuously hash the stored data against public nonces and occasionally upload it's solution to the network to prove the data was there the whole time. If it doesn't actually have the data, it doesn't hash correctly, and it fails to provide PoST. As a negative consequences, the node forfeits its deposit.
    • Antonio Garcia-Martinez: Zuckerberg’s proposes, shockingly, a solution that involves total transparency. Per his video, Facebook pages will now show each and every post, including dark ones (!), that they’ve published in whatever form, either organic or paid. It’s not entirely clear if Zuckerberg intends this for any type of ad or just those from political campaigns, but it’s mindboggling either way. Given how Facebook currently works, it would mean that a visitor to a candidate’s page—the Trump campaign, for instance, once ran 175,000 variations on its ads in a single day—would see an almost endless series of similar content.
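
tedu's warning above about automatic retries can be made concrete with a small sketch. One common way to make a retried operation safe is an idempotency key: the client sends the same key on every attempt and the server applies the charge at most once per key. Everything here (`PaymentService`, `charge_with_retry`, the in-memory dictionary, the simulated lost reply) is hypothetical and stands in for a real billing system.

```python
import uuid


class PaymentService:
    """Hypothetical in-memory stand-in for a billing backend."""

    def __init__(self, lose_first_reply=False):
        self.charges = {}  # idempotency key -> amount charged
        self.lose_next_reply = lose_first_reply

    def charge(self, key, amount):
        # A repeated key is recognised, so the customer is not charged twice.
        if key not in self.charges:
            self.charges[key] = amount
        if self.lose_next_reply:
            # Simulate the dangerous case: the charge succeeded but the
            # caller never hears about it.
            self.lose_next_reply = False
            raise ConnectionError("reply lost after the charge was recorded")
        return self.charges[key]


def charge_with_retry(service, amount, attempts=3, key=None):
    """Retry a charge, reusing one idempotency key for every attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    key = key or str(uuid.uuid4())
    last_error = None
    for _ in range(attempts):
        try:
            return service.charge(key, amount)
        except ConnectionError as err:
            last_error = err
    raise last_error


if __name__ == "__main__":
    service = PaymentService(lose_first_reply=True)
    charge_with_retry(service, 25)
    print(sum(service.charges.values()))
```

Without the shared key, the second attempt in this scenario would record a second charge, which is exactly the failure mode the quote describes.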
