advertise
Friday
Apr282017

Stuff The Internet Says On Scalability For April 28th, 2017

Hey, it's HighScalability time:

 

Do you understand the power symbol? I always think of O as a circuit being open, or off, and the | as the circuit being closed, or on. Wrong! Really the symbols are binary, 0 for false, or off, 1 for true, or on. Mind blown.

If you like this sort of Stuff then please support me on Patreon.
  • 220,000-Core: largest Google Compute Engine job; 100 million: Netflix subscribers; 1.3M: Sling TV subscribers; 200: Downloadable Modern Art Books; 25%: Americans Won't Subscribe To Traditional Cable; 84%: image payload savings using smart CDN; 10^5: number of world-wide cloud data centers needed; 63%: more Facebook clicks using personality targeting; 2.5 million: red blood cells created per second; 

  • Quotable Quotes:
    • Silicon Valley~ The only reason Gilfoyle and I stayed up 48 f*cking straight hours was to decrease server load, not keep it the same. 
    • Robert Graham: In other words, if the entire Mirai botnet of 2.5 million IoT devices was furiously mining bitcoin, it's total earnings would be $0.25 (25 cents) per day.
    • @BoingBoing: John Deere just told US Copyright office that only corporations can own property, humans merely license it
    • mattbillenstein: Lin Clark's talk makes this sound like they implemented a scheduler in React -- basically JS is single-threaded, so they're implementing their own primitives and a scheduler for executing those on that main thread.
    • Robert M. Pirsig: When analytic thought, the knife, is applied to experience, something is always killed in the process.
    • @vornietom: I honestly feel bad for the people on the Placebo March who thought they were at the Science March but double blind testing is important
    • MIT: we can capture and monitor human breathing and heart rates by relying on wireless reflections off the human body.
    • Mohamed Zahran~ Surprisingly enough traditional homogenous multi-core are really heterogeneous. Why is that? Every core is running at its own frequency. Many processors are now a traditional core and a GPU. FPGAs are already with us. Automata Processor is a specialized processor that can execute non-deterministic finite automata (regular expressions) orders of magnitude faster than a GPU.  Neuromorphic brain inspired chips. Fancy GPUs. 
    • @craigbuj: amazing how fast China Internet companies can scale: ofo: 10+ million daily rides in China Uber: ~6 million daily rides globally
    • knz: CockroachDB's architecture is an emergent property of its source code. 
    • @Jason: Good news: over 70b spent on digital ads in 2016.  Terrifying news: 89% of growth was Facebook & Google. Via @iab
    • @swardley: I think we need to stop thinking about AMZN as a future $1T biz and more think about it as a future $10T biz, possibly much more.
    • @timoreilly: "Algorithms are opinions embedded in code." @mathbabedotorg #TED2017 
    • Google: I think we [Google Cloud] have a pretty good shot at being No. 1 in five years
    • limitless__: Folks who think programmer skill declines when you're 40+ are 100% wrong. What declines is your willingness to put up with stupidity and what increases is your willingness and ability to tell someone to fly a kite when they tell you to work stupid hours and do stupid things.
    • @nicusX: "Don't worry about X. X is transparently managed for you". Reads: "When things go wrong you'll never be able to fix it" #mechanicalSympathy
    • defined: What's up is the rampant ageism in the industry - the perception that you are washed up as a "dinosaur" developer after a certain age, maybe 40 or so, and belong in management. We "dinosaurs" - we happy few - are living evidence to the contrary.
    • user5994461: AWS Spot Instances are under bid. The highest bidder takes the instances, the price changes all the time. Google Spot Instances (preemptibles) are 80% off and that's it. It's simple.
    • James Hamilton: in 10 years, ML will be more than 1/2 the worlds server side footprint.
    • qnovo: if we examine the average capacity in smartphones over the past 5 years, we see that it has grown at about 8% annually. A battery in a 2017 smartphone contains about 40 – 50% more capacity (mAh) than it did in 2012.
    • StorageMojo: Bottom line: the NVRAM market is heating up. And that’s a very good thing for the IT industry.
    • Crazycontini: We need a lot more help to clean up the world’s crypto mess.
    • Pramati Muthalaxe: Irrespective of what Facebook says, all of them have one objective — to get more money out of potential advertisers. That requires a constant decay of your reach.
    • danluu: It looks like, for a particular cache size, the randomized algorithms do better when miss rates are relatively high and worse when miss rates are relatively low,
    • There's just too much. To see all Quotable Quotes please click through to the full article.

  • Is Kubernetes the next OpenStack? The Cloudcast #296. No. The core architecture team for Kubernetes ensures there's a consistency accross the project...

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Apr252017

Sponsored Post: Etleap, Pier 1, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team!  Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports.  Driving us to innovate more ways to surprise and delight her expectations as a Premier Home and Decor retailer.  We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website site performance, development and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 

Fun and Informative Events

  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...

Friday
Apr212017

Stuff The Internet Says On Scalability For April 21st, 2017

Hey, it's HighScalability time:

 

Which do you see: Machines freeing people? Lost jobs? Slavery? Hyperactive Skittles?

If you like this sort of Stuff then please support me on Patreon.
  • year 1899: “Nobody has to use the Internet”; 12MPH: Speed news of Lincoln's assassination traveled the US; $200 million: Lyft tips; 500: data structures and algorithms interview questions; %0.00244140625: Odds of 13 straight male Dr. Who regens; 100: gigafactories could power the world; 100K: bots on Messenger; 1 million: containers Netflix lanched in one week; 5.2 trillion: 2014 US revenue; 52,129: iterations to converge on NFL schedule; 36 Gbps: Facebook's network in the sky; 

  • Quotable Quotes:
    • @mipsytipsy: "That doesn't sound hard. I could build that in a weekend."
    • @Noahpinion: The Elon Musk Future is the good future. The Peter Thiel Future is the bad future. But honestly you'll probably get the Jeff Bezos Future.
    • @BenedictEvans: In 2007 Google, Apple, Facebook & Amazon had maybe 50k staff between them. Today it's more like 400k.
    • @AWSonAir: @Expedia inserting 70,000 rows per second of hotel data with Amazon Aurora.
    • @swardley: STOP! If you're thinking of moving to cloud today (as in IaaS), you are so late that you need to consider moving to serverless ->
    • David Rosenthal: Silicon Valley would not exist but for Ph.D.s leaving research to create products in industry.
    • @cmeik: Distributed applications today treat the database like shared memory, and that's why we love things like Spanner.  This is a flawed design.
    • @Jason: Apple's cash hoard swells to five Teslas / four Ubers / 25 Twitters 😂 (aka $246.09 billion)
    • Founder Collective: I don’t imagine these founders aspired to start businesses in video rental, orthodonture, or email deliverability as kids. The point is that passion can be found in a lot of places.
    • @randomfrequency: OH: "We're doing soviet style agile - every two weeks there's a new five year plan" // Will Whittaker
    • ragsoflight: It's worth pointing out that the combine is a notoriously bad predictor of how a prospect will perform in the NFL. It's not a stretch to posit the same thing about technical interviews that stress rote memorization and minutia.
    • fbonetti: I'm currently working at a place that has virtually no process, which has it's own challenges, but I'm happy that I'm not arguing about burn down charts anymore :)
    • ryg: The good news is absolute DRAM latencies have gone down since the 80s – by a factor of about 4-5 or so. The bad news is that clock rates have increased by about a factor of 3000
    • @peterbourgon: "Programming is made up of 2 activities: making decisions (95%), and typing (5%)"
    • Steven Levy: Cannabis is to indoor growing as porn was to the internet
    • @codinghorror: for GPU hash cracking, one 1080 Ti is worth more than three (!) AWS G2.8xlarge instances ($2.60/hour)
    • Phillip Manwaring: Because everything outside of [solving your business’ problems] — load balancing, capacity planning, deployments, five 9 availability engineering — is undifferentiated heavy lifting
    • @EnterprisingA: Anyone who takes a good look at AWS Greengrass without going "holy sh-t" doesn't get it
    • @philnash: I got a 200-300ms improvement on render time using rel="preload" for fonts on philna.sh after reading @addyosmani's 
    • uiri: The people whom cdixon talks to that say they want to be '"working at or founding a startup"' are lying. They are lying to themselves first and foremost and only secondarily to cdixon. Their actions speak much louder than those words. Those actions say that they want to safety and security of the job that they don't enjoy rather than the risk and uncertainty of a startup.
    • crusso: The phrase "this is why we can't have nice things" surfaces time and again in the face of those 1 in 20 sociopaths who interact with you as a business owner.
    • empressplay: A core fundamental here is that people inherently misjudge the competency of their peers / the state of other projects / components / departments. Thus when they have a "problem" they make an assumption that their single point of failure won't be catastrophic (because everything else is okay), and that there's no need to sound an alarm over it (and potentially jeopardise their career).
    • incapsula: there’s been a major shift over time in the motivation of the people behind the DDoS attacks. Instead of simply trafficking in spam, botnet operators have figured out a way to monetize their efforts through extortion or by launching a DDoS-for-hire platform like Mirai.
    • rdtsc: Don't handle errors locally. Build a supervision tree where some of part of the system does just the work it is are intended to (ex.: handling a client's request), and other (isolated part) does the monitoring and error handling. Have one process monitor others, one machine monitor another etc.

  • The private cloud takes a hit. Intel Pulls Out of OpenStack Effort It Founded with Rackspace
    • vishvananda, one of the original OpenStack developers, on the two mistakes of OpenStack: 1. We thought that private clouds were generally valuable. 2. We focused on community building by supporting all use cases. The end result of these mistakes OpenStack is good for certain use-cases. It does really well in large companies that need a public-cloud like environment to manage their infrastructure and can hire a team of people to manage it (e.g. comcast, verizon, e-bay, wal-mart). I don't know that the exodus is due to endemic problems so much as the market finally waking up and realizing that public cloud is the future.
    • jldugger: IMO, OpenStack was sort of a consortium effort to compete with VMWare, AWS and Salesforce, but it's operational model is closest to AWS. Nearly all the big enterprise IT companies tied their rafts together hoping to stave the flow of customers. It doesn't appear to have worked.

    • ...

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Friday
Apr142017

Stuff The Internet Says On Scalability For April 14th, 2017

Hey, it's HighScalability time:

 

After 20 years, Cassini will not go gently into that good night, it will burn and rave at close of day. (nasa)

If you like this sort of Stuff then please support me on Patreon.
  • 10^15: synapses activated per second in human brain (2/3rds fail); $4.5B: Amazon spend on video (Netflix $6 billion); 22,000: AWS database migrations served; ~15%: Dropbox reduced CPU usage using Brotli; $3.5 trillion: IT spending in 2017; 10%: reduction in QoQ hard drive shipments; 33.3%: Nginx share of webserver market; 37.2 trillion: human cells in a Cell Atlas; 6.2 miles: journey to the center of the earth; 200: lines of code for blockchain; 95%: Wikipedia pages end up at philosophy; 1.2 billion: Messenger monthly users; 

  • Quotable Quotes:
    • Jeff Bezos: Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.
    • Bob Schmidt: If debugging is the process of removing errors from a design, then designing must be the process of putting errors into a design!
    • @swardley: the gap between where the cutting edge is and where the majority are just seems to increase year on year.
    • Riot Games: We need to provide resources when it's time to grow, we need to react when it gets sick, and we need to do it all as fast as possible at a global scale.
    • masklinn: High-performance native code already does these specialisation, generally on a per-project basis (some projects include multiple allocators for different bits of data), and possibly using a non-OS allocator in the first place
    • @erikbryn: MT: @DKThomp : there are 950k warehouse workers —6X the number of steel workers and miners combined
    • Joeri: The challenge of a rewrite is not in mapping the core architecture and core use case, it's mapping all the edge cases and covering all the end user needs. You need people intimately familiar with the old system to make sure the new system does all the weird stuff which nobody understood but had good reasons that was in the corners of the old system's code. 
    • @redblobgames: 2016 GDC Diablo talk: let's switch from turn-based to real-time 2017 GDC Civilization talk: let's switch from real-time to turn-based
    • @random_walker: Encrypted traffic has a fingerprint—enough to distinguish among 200 Netflix vids with 99.5% accuracy in < 2.5 mins.
    • Sophie Wilson: You’re going to buy a 10-way, 18-way multi-core processor that’s the latest, all because we told you you could buy it and made it available, and we’re going to turn some of those processors off most of the time. So you’re going to pay for logic and we’re going to turn it off so you can’t use it.
    • qq66: But is there anything more personal than a computer programmer writing a bot to send messages for him?
    • Anu Hariharan: Unlike other social products, WeChat does not only measure growth by number of users or messages sent. Instead they also focus on measuring how deeply is the product engaged in every aspect of daily life (e.g., the number of tasks WeChat can help with in a day).
    • @fredwilson: "The real issue here is Facebook’s market power. And we face similar market power issues in search (Google) and commerce (Amazon)"
    • There are so many quotable quotes I couldn't include them all here. Click through to read the full article.

  • Luna Duclos on Game Development and Rebuilding Microservices. Switching from PHP/Python to Go. Go is much faster and uses less CPU. As big as the switch to Go is the switch from Google App Engine to VMs. GAE servers are small and CPU constrained despite the relatively high cost. Their Go cluster runs in the Google Cloud on Google Container Engine.

  • Werner Against the Machine. Wait, aren't you the machine now?

  • Kwabena Boahe on Stanford Seminar: Neuromorphic Chips: Addressing the Nanostransistor Challenge. A dollar bought more and more transistors until 2014, when for the first time the price for transistors went up. Fundamental constraints at the physical level is the cause. The challenge is to continually shrink the footprint of the transistor so it occupies less space. A traffic metaphor is used to explain the difficulty of continually shrinking transistors. Shrinking gives you fewer lanes and electrons can block a lane by being trapped in a pothole. When you get down to one lane and electron is trapped the current flows slowly. Our brains work with ultimately scaled devices...

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Apr112017

Sponsored Post: Pier 1, Aerospike, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team!  Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports.  Driving us to innovate more ways to surprise and delight her expectations as a Premier Home and Decor retailer.  We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website site performance, development and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 

Fun and Informative Events

  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...

Monday
Apr102017

Five things we’ve learned about monitoring containers and their orchestrators

This is a guest post by Apurva Davé, who is part of the product team at Sysdig.

Having worked with hundreds of customers on building a monitoring stack for their containerized environments, we’ve learned a thing or two about what works and what doesn’t. The outcomes might surprise you - including the observation that instrumentation is just as important as the application when it comes to monitoring.

In this post, I wanted to cover some details around what it takes to build a scale-out, highly reliable monitoring system to work across tens of thousands of containers. I’ll share a bit about what our infrastructure looks like, the design choices we made, and tradeoffs. The five areas I’ll cover:

  • Instrumenting the system

  • Relating your data to your applications, hosts, and containers.

  • Leveraging orchestrators

  • Deciding what to data to store

  • How to enable troubleshooting in containerized environments

For context, Sysdig is the container monitoring company. We’re based on the open source Linux troubleshooting project by the same name. The open source project allows you to see every single system call down to process, arguments, payload, and connection on a single host. The commercial offering turns all this data into thousands of metrics for every container and host, aggregates it all, and gives you dashboarding, alerting, and an htop-like exploration environment.

Ok, let’s get into the details, starting with the impact containers have had on monitoring systems.

Why do containers change the rules of the monitoring game?

Click to read more ...

Friday
Apr072017

Stuff The Internet Says On Scalability For April 7th, 2017

Hey, it's HighScalability time:

 

Visualization of the magic system behind software infrastructure. (eyezmaze@ThePracticalDev

If you like this sort of Stuff then please support me on Patreon.
  • 10-20: aminoacids can be made per second; 64800x: faster DDL Aurora vs MySQL; 25 TFLOPS: cap for F1 simulations; 15x to 30x: Tensor Processing Unit faster than GPUs and CPUs; 100 Million: Intel transistors per square millimeter; 25%: Internet traffic generated by Google; $1 million: Tim Berners-Lee wins Turing Award; 43%: phones FBI couldn't open because of crypto;

  • Quotable Quotes:
    • @adulau: To summarize the discussions of yesterday. All tor exit nodes are evil except the ones I operate.
    • @sinavaziri: Let's say a data center costs $1-2B. Then the TPU saved Google $15-30B of capex?
    • Vinton G. Cerf: While it would be a vast overstatement to ascribe all this innovation to genetic disposition, it seems to me inarguable that much of our profession was born in the fecund minds of emigrants coming to America and to the West over the past century.
    • Alan Bundy: AI systems are not just narrowly focused by design, because we have yet to accomplish artificial general intelligence, a goal that still looks distant. 
    • JamesBarney: Soo much this, just worked on a project that sacrificed reliability, maintainability, and scalability to use a real time database to deal with loads that were on the order of 70 values or 7 writes a second.
    • bobdole1234: 3.5x faster than CPU doesn't sound special, but when you're building inference capacity by the megawatt, you get a lot more of that 3.5x faster TPU inside that hard power constraint.
    • Eugenio Culurciello: As we have been predicting for 10 years, in SoC you can achieve > 10x more performance that current GPUs and > 100x more performance per watt.
    • Google: The TPU’s deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. 
    • visarga: TPU excited me too at first, but when I realized that it is not related to training new networks (research) and is useful only for large scale deployment, I toned down my enthusiasm a little. 
    • Julian Friedman: Kube is being designed by system administrators who like distributed systems, not for programmers who want to focus on their apps.
    • shadowmint: Given what I've seen, I'd argue that clojure has an inherent complexity that results in poor code quality outcomes during the software maintenance cycle.
    • weberc2: I like Go, but it's not dramatically faster than Java. Any contest between the two of them will probably just be a back and forth of optimizations. They share pretty much the same upper bound.
    • adrianratnapala: All this means is that we should stop thinking of this stuff as RAM. Only the L1 cache is really RAM. Everything else is just a kind of fast, volatile, solid state disk that just happens to share an address space with the RAM.
    • pbreit: Getting a million users is infinitely harder than scaling a system to handle a million users. Most systems could run comfortably on a Raspberry Pi.
    • @sustrik: If you want your protocol to be fully reliable in the face of either peer shutting down, the terminal handshake has to be asymmetric. As we've seen above, TCP protocol has symmetric termination algorithm and thus can't, by itself, guarantee full reliability.
    • @damonedwards: Unit tests are critical for good dev, but aren't really ops concern. Integration tests are critical for good ops. Ops wants more int tests.
    • mannigfaltig: the brain appears to spend about 4.7 bits per synapse (26 discernible states, given the noisy computation environment of the brain); so it seems to be plenty enough for general intelligence. This could, of course, merely be a biological limit and on silicon more fine-grained weights might be the optimum.
    • marwanad: The main power of GraphQL is for client developers and lies in the decoupling it provides between the client and server and the ability to fulfill the client needs in a single round trip. This is great for mobile devices with slower networks.
    • kyleschiller: As a pretty good rule of thumb, a system that fails 1/nth of the time and has n opportunities to fail has ~.63 probability of failure, where n is more than ~10.
    • jjirsa: databases aren't where you want to have hipster tech. You want boring things that work. For me, Cassandra is the boring thing that works. 
    • @etherealmind: "rule #1 of Enterprise IT: easier to spend 10 million on equipment than 100k for a person. A third person would increase capacity by 30%"
    • @SwiftOnSecurity: “Just pick a good VPN” is like telling thirsty people to “go to a store and drink clear liquid.” They drank bleach, but at least you helped.
    • falsedan: There's 2 secrets to scaling to millions of users: 1. You aren't going to have millions of users so any work you do to support it is stopping you from delivering features that will make your existing 10 clients happier. 2. Write code that can be replaced (i.e. design for change). 
    • X86BSD: Have you tested running it on a FreeBSD box with ZFS? It has lz4 compression by default and makes such a great storage solution for PG. You get compression, snapshots, replication (not quite realtime but close), self healing, etc etc in a battled hardened and easy to manage filesystem and storage manager. I've found you can't beat ZFS and PG for most applications. Edge cases exist of course everywhere.

  • Worried about too much infrastructure? Only 2% of DNA codes for proteins, the other 98% codes for RNA. Harry Noller Lecture. Maybe lots of infrastructure is not a bad thing. One of they key differences in programming and biology is how in biology form completely determines function. Just amazing to watch in action: mRNA Translation (Advanced). Programming is the complete opposite.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Friday
Mar312017

Stuff The Internet Says On Scalability For March 31st, 2017

Hey, it's HighScalability time:

 

What lies beneath? Networks...of blood vessels. (Wellcome Image Awards)

If you like this sort of Stuff then please support me on Patreon.
  • 5000: node (150,000 pod) clusters in Kubernetes 1.6; 15 years: time to @spacex launch with a recycled rocket booster; 174 mbps: Internet speed in Dublin; 10 nm: Intel’s new Moore approved process; 30 minutes: to create Samsung's S8; 50 billion: of your cells replaced each day; 2 million: new red blood cells per second; 3dbm: attenuation of human body, same as a wall; 12: hours of tardis sounds; 350: pages to stop a bullet; 2: meters of DNA pack in a space .000006m wide; 

  • Quotable Quotes:
    • @swardley: Having met many "leaders" in technology & business, I wouldn't bet on the future survival of humanity. If anything AI might help the odds
    • Francis Pouliot: Any contentious hard fork of the Bitcoin blockchain shall be considered an alternative cryptocurrency (altcoin), regardless of the relative hashing power on the forked chain.
    • @coda: WhatsApp: 900M users, built w/ < 35 devs, using #erlang Krispy Kreme: 1004 locations, 3700 employees, original glazed is 190 #calories
    • @BenedictEvans: Still think it's interesting Instagram shifted emphasis from interests to friends. Is that a law of nature for social if you want scale?
    • @johnrobb: "each robot per thousand workers decreased employment by 6.2 workers and wages by 0.7 percent"
    • Alex Woodie: The Hadoop dream of unifying data and compute in a distributed manner has all but failed in a smoking heap of cost and complexity, according to technology experts and executives who spoke to Datanami.
    • @RichRogersIoT: "First you learn the value of abstraction, then you learn the cost of abstraction, then you are ready to engineer." - @KentBeck
    • @codemanship: Don't explain code quality to execs. Explain high cost of change. Explain slowing down of innovation. Explain longer cycle times.
    • @malwareunicorn: Bad malware pickup lines: Hey girl, I heard you like sandboxes. I would never try to escape yours ;)
    • dkhenry: The selling of data isn't the policy you need to fight. The monopoly power of ISP's is the problem you must push back on. 
    • @MaxWendkos: An SEO expert walks into a bar, bars, pub, tavern, public house, Irish pub, drinks, beer, alcohol
    • Barry Lampert: the point of Amazon isn't to offer a consumer the absolute lowest price possible; it's to offer the lowest price possible given the convenience that Amazon offers
    • Daniel Lemire: Let us make the statement precise: Most performance or memory optimizations are useless.
    • @sarahmei: People run into trouble with DRY because it doesn't tell you *what* not to repeat. People assume syntax, but it's actually concepts.
    • Dan Rayburn: China suffers from 9.2% transfer failure rate (similar to Malaysia, India and Brazil), and a high packet loss.  These two parameters have severe impact on content download time and overall performance.
    • Daniel Lemire: I submit to you that it is no accident if the StackOverflow list of top-paying programming languages is made of obscure languages. They are comparing the average of a niche against the average of a large population
    • For even more Quotable Quotes please click through to the main article.

  • For good WiFi you don't necessarily need one big powerful router bristling with antenna like a radiation mutated ant. 802.eleventy what? A deep dive into why Wi-Fi kind of suck and New Screen Savers (@20 min). You want a true mesh network (Plume). WiFi should whisper, use 5G to create pools of WiFi in each room so signals don't penetrate between rooms. Lots of little access points can automatically find a path through your house. Use a wired backhaul for best performance. Raw throughput isn't the best measure. How does it perform with many people using many devices? Roaming isn't always well supported. Consider how well the system hands-off devices as you walk through the house. 

  • BloomCON 2017 Videos are now available. You might like Honey, I Stole Your C2 [Command-and-control] Server: A dive into attacker infrastructure.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Mar292017

How to speed up your MySQL with replication to in-memory database

Original article available at https://habrahabr.ru/company/mailru/blog/323870/

I’d like to share with you an article based on my talk at Tarantool Meetup(the video is in Russian, though). It’s a short story of why Mamba, one of the biggest dating websites in the world and the largest one in Russia, started using Tarantool. Why did we decide to busy ourselves with MySQL-to-Tarantool replication?

First, we had to migrate to MySQL 5.7 at some point, but this version didn’t have HandlerSocket that was being actively used on our MySQL 5.6 servers. We even contacted the Percona team — and they confirmed MySQL 5.6 is the last version to have HandlerSocket.

Second, we gave Tarantool a try and were pleased with its performance. We compared it against Memcached as a key-value store and saw the speed double from 0.6 ms to 0.3 ms on the same hardware. In relative terms, Tarantool’s twice as fast as Memcached. In absolute terms, it’s not that cool, but still impressive.

Third, we wanted to keep the whole existing architecture. There’s a MySQL master server and its slaves — we didn’t want to change anything in this structure. Can MySQL 5.6 slaves with HandlerSocket be replaced with something else without having to make significant architectural changes?

We learned that the Mail.Ru Group team has a replicator they created for their own purposes. The idea of replicating data from MySQL to Tarantool belongs to them. We asked the team to share the source code, which they did. We had to rewrite the code, though, since it worked with MySQL 5.1 and Tarantool 1.5, not 1.7. The replicator uses libslave, an open-source solution for reading events from a MySQL master server, and is built statically without any of MySQL’s system libraries. It’s been open-sourcedunder the BSD license, so anyone can use it for free.

Replication constraints

Click to read more ...

Wednesday
Mar292017

Sponsored Post: ButterCMS, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 

Fun and Informative Events

  • Analyst Webinar: Forrester Study on Hybrid Memory NoSQL Architecture for Mission-Critical, Real-Time Systems of Engagement. Thursday, March 30, 2017 | 11 AM PT / 2 PM ET. In today’s digital economy, enterprises struggle to cost-effectively deploy customer-facing, edge-based applications with predictable performance, high uptime and reliability. A new, hybrid memory architecture (HMA) has emerged to address this challenge, providing real-time transactional analytics for applications that require speed, scale and a low total cost of ownership (TCO). Forrester recently surveyed IT decision makers to learn about the challenges they face in managing Systems of Engagement (SoE) with traditional database architectures and their adoption of an HMA. Join us as our guest speaker, Forrester Principal Analyst Noel Yuhanna, and Aerospike’s VP Marketing, Cuneyt Buyukbezci, discuss the survey results and implications for your business. Learn and register

  • Advertise your event here!

Cool Products and Services

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • ButterCMS is an API-based CMS that seamlessly drops into your app or website. Great for blogs, dynamic pages, knowledge bases, and more. Butter works with any language/framework including Ruby, Rails, Node.js, .NET, Python, Django, Flask, React, Angular, Go, PHP, Laravel, Elixir, Phoenix, and Meteor.

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to ad a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...