advertise
Tuesday
Jan032017

Sponsored Post: Loupe, New York Times, ScaleArc, Aerospike, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring?

  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com

Fun and Informative Events

  • Your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • ScaleArc's database load balancing software empowers you to “upgrade your apps” to consumer grade – the never down, always fast experience you get on Google or Amazon. Plus you need the ability to scale easily and anywhere. Find out how ScaleArc has helped companies like yours save thousands, even millions of dollars and valuable resources by eliminating downtime and avoiding app changes to scale. 

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Click to read more ...

Monday
Jan022017

Efficient storage: how we went down from 50 PB to 32 PB

As the Russian rouble exchange rate slumped two years ago, it drove us to think of cutting hardware and hosting costs for the Mail.Ru email service. To find ways of saving money, let’s first take a look at what emails consist of.

Indexes and bodies account for only 15% of the storage size, whereas 85% is taken up by files. So, files (that is attachments) are worth exploring in more detail in terms of optimization. At that time, we didn’t have file deduplication in place, but we estimated that it could shrink the total storage size by 36% since many users receive the same messages, such as price lists from online stores or newsletters from social networks containing images and so on. In this article, I’m going to describe how we implemented a deduplication system under the supervision of Albert Galimov.

Click to read more ...

Friday
Dec232016

Stuff The Internet Says On Scalability For December 23rd, 2016

Hey, it's HighScalability time:

 

A wondrous ethereal mix of technology and art. Experience of "VOID"

If you like this sort of Stuff then please support me on Patreon.

  • 2+ billion: Google lines of code distributed over 9+ million source files; $3.6 bn: lower Google taxes using Dutch Sandwich; $14.6 billion: aggregate value of all cryptocurrencies; 2x: graphene-fed silkworms produce silk that conducts electricity; < 100: scientists looking for extraterrestrial life; 48: core Qualcomm server SoC; 455: original TV series in 2016;

  • Quotable Quotes:
    • Ben Thompson~ It's so easy to think of tech with an 80s mindset with all the upstarts. We still glorify people in garages. The garage is gone...Our position in the world is not the scrappy upstart. It is the establishment.
    • The Attention Merchants: True brand advertising is therefore an effort not so much to persuade as to convert. At its most successful, it creates a product cult, whose loyalists cannot be influenced by mere information
    • @seldo: Speed of development always wins. Performance problems will (eventually) get engineered away. This is nearly always how technology changes.
    • @evgenymorozov: How Silicon Valley can support basic income: give everyone a bot farm so that we can make advertising $ from fake traffic to their platforms
    • @avdi: Apple has 33 Github repos and 56 contributors. Microsoft now has ~1,200 repos and 2,893 contributors.
    • Peter Norvig: Understanding the brain is a fascinating problem but I think it’s important to keep it separate from the goal of AI which is solving problems ... If you conflate the two it’s like aiming at two mountain peaks at the same time—you usually end up in the valley between them .... We don’t need to duplicate humans ... We want humans and machines to partner and do something that they cannot do on their own.
    • Brave New Greek: Unbounded anything—whether its queues, message sizes, queries, or traffic—is a resilience engineering anti-pattern. Without explicit limits, things fail in unexpected and unpredictable ways. Remember, the limits exist, they’re just hidden. By making them explicit, we restrict the failure domain giving us more predictability, longer mean time between failures, and shorter mean time to recovery at the cost of more upfront work or slightly more complexity.
    • Naren Shankar (Expanse): Everybody feels like they can look at the show and find parts of themselves in it. When you can give people collective ownership of the creative product you get the best from people. At the end of the day it shows. People work their asses off and accomplish the impossible.
    • Richard Jones: a corollary of Moore’s law (sometimes called Rock’s Law). This states that the capital cost of new generations of semiconductor fabs is also growing exponentially
    • Waterloo: His [Napoleon] strategy was simple. It was to divide his enemies, then pin one down while the other was attacked hard and, like a boxing match, the harder he punched the quicker the result. Then, once one enemy was destroyed, he would turn on the next. The best defense for Napoleon in 1815 was attack, and the obvious enemy to attack was the closest.
    • Daniel Lemire: beyond a certain point, reducing the possibility of a fault becomes tremendously complicated and expensive… and it becomes far more economical to minimize the harm due to expected faults
    • @greglinden: “For some products at Baidu, the main purpose is to acquire data from users, not revenue.” — @stuhlmueller
    • strebler:  Deep Learning has made some very fundamental advances, but that doesn't mean it's going to make money just as magically!
    • sulam: Twitter clearly doesn't have growth magic (or they'd be growing faster) -- but is that an engineer's fault? At the end of the day, any user facing engineering is beholden to the product team. Engineers at Twitter can run experiments, but they can't get those experiments shipped unless a PM is behind it.
    • Gil Tene: The right way to read "99%'ile latency of a" is "1 or a 100 of occurrences of 'a' took longer than this. And we have no idea how long". That is the only information captured by that metric. It can be used to roughly deduce "what is the likelihood that 'a' will take longer than that?". But deducing other stuff from it usually simply doesn't work.
    • @esh: Unheralded tiny features like AWS Lambda inside Kinesis Firehose streams replace infrastructure monstrosities with a few lines of code
    • @postwait: Listening to this twitter caching talk... *so* glad my OS doesn't even contemplate OOMs. How is that shit still in Linux? A literal WTF.
    • SomeStupidPoint: Mostly, it was just a choice to save $1-2k on a laptop (every 1-2 years) and spend the money on cellphone data and lattes.
    • @timbray: Oracle trying to monetize Java... Golang/Rust/Elixir all looking better. Assume all JVM langs are potential targets.
    • Kathryn S. McKinley: In programming languages research, the most revolutionary change on the horizon is probabilistic programming, in which developers produce models that estimate the real world and explicitly reason about uncertainty in data and computations. 
    • cindy sridharan: Four Golden Signals 1) Latency 2) Traffic 3) Errors 4) Saturation
    • @FioraAeterna: as a tech company grows in size, the probability of it developing its own in-house bug tracking system approaches 1
    • The Attention Merchants: In 1928, Paley made a bold offer to the nation’s many independent radio stations. The CBS network would provide any of them all of its sustaining content for free—on the sole condition that they agree to carry the sponsored content as well

  • philips: Essentially I see the world broken down into four potential application types: 1) Stateless applications: trivial to scale at a click of a button with no coordination. These can take advantage of Kubernetes deployments directly and work great behind Kubernetes Services or Ingress Services. 2) Stateful applications: postgres, mysql, etc which generally exist as single processes and persist to disks. These systems generally should be pinned to a single machine and use a single Kubernetes persistent disk. These systems can be served by static configuration of pods, persistent disks, etc or utilize StatefulSets. 3) Static distributed applications: zookeeper, cassandra, etc which are hard to reconfigure at runtime but do replicate data around for data safety. These systems have configuration files that are hard to update consistently and are well-served by StatefulSets. 4) Clustered applications: etcd, redis, prometheus, vitess, rethinkdb, etc are built for dynamic reconfiguration and modern infrastructure where things are often changing. They have APIs to reconfigure members in the cluster and just need glue to be operated natively seemlessly on Kubernetes, and thus the Kubernetes Operator concept

  • Top 5 uses for Redis: content caching; user session store; job & queue management; high speed transactions; notifications.

  • Is machine learning being used in the wild? The answer appears to be yes. Ask HN: Where is AI/ML actually adding value at your company? Many uses you might expect and some unexpected: predicting if a part scanned with an acoustic microscope has internal defects; find duplicate entries in a large, unclean data set; product recommendations; course recommendations; topic detection; pattern clustering; understand the 3D spaces scanned by customers; dynamic selection of throttle threshold; EEG interpretation; predict which end users are likely to churn for our customers; automatic data extraction from web pages; model complex interactions in electrical grids in order to make decisions that improve grid efficiency;sentiment classification; detecting fraud; credit risk modeling; Spend prediction; Loss prediction; Fraud and AML detection; Intrusion detection; Email routing; Bandit testing; Optimizing planning/ task scheduling; Customer segmentation; Face- and document detection; Search/analytics; Chat bots; Topic analysis; Churn detection; phenotype adjudication in electronic health records; asset replacement modeling; lead scoring;  semantic segmentation to identify objects in the users environment to build better recommendation systems and to identify planes (floor, wall, ceiling) to give us better localization of the camera pose for height estimates; classify bittorrent filenames into media classify bittorrent filenames into media categories; predict how effective a given CRISPR target site will be; check volume, average ticket $, credit score and things of that nature to determine the quality and lifetime of a new merchant account; anomaly detection; identify available space in kit from images; optimize email marketing campaigns; investigate & correlate events, initially for security logs; moderate comments; building models of human behavior to provide interactive intelligent agents with a conversational interface; automatically grading kids' essays; Predict probability of car accidents based on the sensors of your smartphone; predict how long JIRA tickets are going to take to resolve; voice keyword recognition; produce digital documents in legal proceedings; PCB autorouting.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Dec212016

Sponsored Post: Loupe, New York Times, ScaleArc, Aerospike, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring?

  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com

Fun and Informative Events

  • Your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • ScaleArc's database load balancing software empowers you to “upgrade your apps” to consumer grade – the never down, always fast experience you get on Google or Amazon. Plus you need the ability to scale easily and anywhere. Find out how ScaleArc has helped companies like yours save thousands, even millions of dollars and valuable resources by eliminating downtime and avoiding app changes to scale. 

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Click to read more ...

Monday
Dec192016

Has Amazon Overthrown Apple as the 'I Hate Buttons' Leader?

 

Steve Jobs is notorious for hating buttons. Here's Jobs explaining the foulness of buttons during his famous iPhone introduction:

What's wrong with their [other phones] user interface? The problem with them is really sort of in the bottom 40. They all have these keyboard that are there whether you need them or not to be there. And they all have these control buttons that are fixed in plastic and are the same for every application. Well every application wants a slightly different user interface, a slightly optimized set of buttons just for it. And what happens if you think of a great idea six months from now? You can't run around and add a button to these things. They're already shipped. So what do you do? It doesn't work because the buttons and the controls can't change.

The iPhone solved the button problem with a new multi-touch screen and by using your finger as the pointing device (not a nasty nasty stylus). We all know how this works now, but it was novel back in the olden days.

The iPhone was one of three new products based on revolutionary user interface development: the mouse and the Macintosh; the click-wheel and the iPod; multi-touch and the iPhone.

 

UI innovation is not enough on its own. Creating a new product category requires a combination of advanced hardware and new supporting software. The Mac was a completely new everything. The iPod paired with iTunes. And the iPhone leveraged OS X, iTunes, and a lot of very smart code for dealing with touch.

That's the history lesson.

Something curious has happened. Amazon. Amazon has escaped the land of misfit phones and has developed three brilliant new products based on revolutionary UIs and sophisticated software systems:

Click to read more ...

Friday
Dec162016

Stuff The Internet Says On Scalability For December 16th, 2016

Hey, it's HighScalability time:

 

This is the entire internet. In 1973! David Newbury found the map going through his dad's old papers.

If you like this sort of Stuff then please support me on Patreon.

  • 2.5 billion+: smartphones on earth; $36,000: loss making a VR game; $1 million: spent playing Game of War; 2000 terabytes: saved downloading Font Awesome's fonts per day; 14TB: new hard drives; 19: Systems We Love talks; 4,600Mbps: new 802.11ad Wi-Fi standard; 

  • Quotable Quotes:
    • Thomas Friedman: [John] Doerr immediately volunteered to start a fund that would support creation of applications for this device by third-party developers, but Jobs wasn’t interested at the time. He didn’t want outsiders messing with his elegant phone.
    • Fastly: For every problem in computer networking there is a closed-box solution that offers the correct abstraction at the wrong cost. 
    • ben stopford: The Data Dichotomy. Data systems are about exposing data. Services are about hiding it.
    • Ernie: just as Amazon invaded the CDN ecosystem with CloudFront and S3, CDNs are going to invade the cloud compute space of AWS.
    • The Attention Merchants: When not chronicling death in its many forms, Bennett loved to gain attention for his paper by hurling insults and starting fights. Once he managed in a single issue to insult seven rival papers and their editors. He was perhaps the media’s first bona fide “troll.” As with contemporary trolls, Bennett’s insults were not clever.
    • @swardley: "Serving 2.1 million API requests for $11" not bad at all. My company site used to cost £19 pcm
    • hibikir: I don't know about Uber, but I've worked at a lot of places that had sensitive data. A common patterns is to fail to treat employees like attackers, and protect data in ways that are very beatable by a motivated employee. 
    • @davecheney: OH: lambdas are stored procedures for millenials.
    • @jamesurquhart: This. Containers will play a huge role in low-level service deployments, but not user facing (e.g. “consumer”) app deployments (5-7 years).
    • theptip: Geo-redundancy seems like a luxury, until your entire site comes down due to a datacenter-level outage. (E.g. the power goes down, or someone cuts the internet lines when doing construction work on the street outside).
    • Resilience Thinking: The ruling paradigm-that we can optimize components of a system in isolation of the rest of the system-is proving inadequate to deal with the dynamic complexity of the real world.
    • Eliezer Steinbock: Disconnect users when they’ve just left their tab open. It’s so simple to do and saves precious resources
    • @ieatkillerbees: In 20 years of engineering I've never said, "thank goodness we hired someone who can reverse a b tree on a whiteboard while strangers watch"
    • Rushkoff: I think as people realize they can’t get jobs in this highly centralized digital economy, as companies realize that it might be better to beat them than join them, I think we will see the retrieval of some of these earlier networking values.
    • Darren Cibis: I think BigQuery is the better product at this stage, however, it’s had a big head start over Athena which has a lot of catching up to do.
    • Fastly: Over the span of a day, IoT devices were probed for vulnerabilities 800 times per hour by attackers from across the globe.
    • Quantum Gravity Research Could Unearth the True Nature of Time: somehow, you can emerge time from timeless degrees of freedom using entanglement.
    • @SystemsWeLove: "You can think of the OS as the bouncer at Club CPU: if a VIP comes in and buys up the place, you're out." -- @arunthomas #systemswelove
    • Erik Darling: When starting to index temp tables, I usually start with a clustered index, and potentially add nonclustered indexes later if performance isn’t where I want it to be.
    • Customers Don’t Give a Shit About Your Data Centers: My youngest daughter co-developed an Alexa skill called PotterHead. By taking advantage of the templates and how-to instructions, the skill was designed, developed, tested, and deployed within 24 hours — without a data center or any knowledge of ansible, git, jenkins, chef, or kubernetes.

  • In summary: mobile is [still] eating the world, everything is changing, nobody knows where it will all end up. And people are scared. Interesting observation on the new scale: Facebook, Amazon, Apple, and Google are 10x bigger than Microsoft & Intel when they were changing the world. 

  • Bigger is not always better when it comes to datacenters. AWS re:Invent 2016: Tuesday Night Live with James Hamilton. Amazon could easily build 200 megawatt (MW) facilities, yet they choose to build mostly 32MW facilities. Why? The data tells them to. What does the data say? The law of diminishing returns. The cost savings don't justify having a larger failure domain. When you start small and scale up a datacenter you get really big gains in cost advantage. As you get bigger and bigger it's a logarithm. The gains of going bigger are relatively small. The negative gain of a big datacenter is linear. If you have a 32MW datacenter tha's about 80k servers it's bad if it goes down, but it can be handled so that it's unnoticeable. If a datacenter with 500K server goes down the amount of network traffic needed to heal all the problems is difficult to handle.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Dec142016

Ask High Scalability: How to build anonymous blockchain communication?

This question came in over the Internets. If you have any ideas please consider sharing them if you have the time...

I am building a 2 way subscription model I am working on a blockchain project where in I have to built a information/data portal where in I will have 2 types of users data providers and data recievers such that there should be anonimity between both of these.

Please guide me how can I leverage blockchain (I think Etherium would be useful in this context but not sure) so that data providers of my system can send messages to data receivers anonymously and vice versa data receivers can request for data through my system to data providers.

I believe, it work if we can create a system where in if a user has data, it will send description to the server, The system will host this description about data without giving the data provider details.

Simultaneously server will store info which user has the data. When data receiver user logs in to system and wants and sees the description of data and wants to analyze that data, it will send request to server for that data. This request is stored in the server and it will allow access to data without receiver knowing who wants to access that data, but it will trigger a message to receiver that an anonymous user wants to access data and would data.

Can you please guide me how to build architecture of this system and how to proceed to do a POC?

Tuesday
Dec132016

A Scalable Alternative to RESTful Communication: Mimicking Google’s Search Autocomplete with a Single MigratoryData Server

This is a guest post by Mihai Rotaru, CTO of MigratoryData.

Using the RESTful HTTP request-response approach can become very inefficient for websites requiring real-time communication. We propose a new approach and exemplify it with a well-known feature that requires real-time communication, and which is included by most websites: search box autocomplete.

Google, which is one of the most demanding web search environments, seems to handle about 40,000 searches per second according to an estimation made by Internet Live Stats. Supposing that for each search, a number of 6 autocomplete requests are made, we show that MigratoryData can handle this load using a single 1U server.

More precisely, we show that a single MigratoryData server running on a 1U machine can handle 240,000 autocomplete requests per second from 1 million concurrent users with a mean round-trip latency of 11.82 milliseconds.

The Current Approach and Its Limitations

Click to read more ...

Friday
Dec092016

Stuff The Internet Says On Scalability For December 9th, 2016

Hey, it's HighScalability time:

 

Here's a 1 TB hard drive in 1937. Twenty workers operated the largest vertical letter file in the world. 4000 SqFt. 3000 drawers, 10 feet long. (from @BrianRoemmele)

If you like this sort of Stuff then please support me on Patreon.

  • 98%~ savings in green house gases using Gmail versus local servers; 2x: time spent on-line compared to 5 years ago; 125 million: most hours of video streamed by Netflix in one day; 707.5 trillion: value of trade in one region of Eve Online; $1 billion: YouTube's advertisement pay-out to the music industry; 1 billion: Step Functions predecessor state machines run per week in AWS retail; 15.6 million: jobs added over last 81 months;

  • Quotable Quotes:
    • Gerry Sussman~ in the 80s and 90s, engineers built complex systems by combining simple and well-understood parts. The goal of SICP was to provide the abstraction language for reasoning about such systems...programming today is more like science. You grab this piece of library and you poke at it. You write programs that poke it and see what it does. And you say, ‘Can I tweak it to do the thing I want?
    • @themoah: Last year Black Friday weekend: 800 Windows servers with .NET. This year: 12 Linux servers with Scala/Akka. #HighScalability #Linux #Scala
    • @swardley: If you're panicking over can't find AWS skills / need to go public cloud - STOP! You missed the boat. Focus now on going serverless in 5yrs.
    • @jbeda: Nordstrom is running multitenant Kubernetes cluster with namespace per team. Using RBAC for security.
    • Tim Harford: What Brailsford says is, he is not interested in team harmony. What he wants is goal harmony. He wants everyone to be focused on the same goal. He doesn’t care if they like each other and indeed there are some pretty famous examples of people absolutely hating each other. 
    • @brianhatfield: SUB. MILLISECOND. PAUSE. TIME. ON. AN. 18. GIG. HEAP. (Trying out Go 1.8 beta 1!)
    • haberman: If you can make your system lock-free, it will have a bunch of nice properties: - deadlock-free - obstruction-free (one thread getting scheduled out in the middle of a critical section doesn't block the whole system) - OS-independent, so the same code can run in kernel space or user space, regardless of OS or lack thereof 
    • Neil Gunther: The world of performance is curved, just like the real world, even though we may not always be aware of it. What you see depends on where your window is positioned relative to the rest of the world. Often, the performance world looks flat to people who always tend to work with clocked (i.e., deterministic) systems, e.g., packet networks or deep-space networks.
    • @yoz: I liked Westworld, but if I wanted hours of watching tech debt and no automated QA destroy a virtual world, I’d go back to Linden Lab
    • @adrianco: I think we are seeing the usual evolution to utility services, and new higher order (open source) functionality emerges /cc @swardley
    • Neil Gunther: a buffer is just a queue and queues grow nonlinearly with increasing load. It's queueing that causes the throughput (X) and latency (R) profiles to be nonlinear.
    • Juho Snellman: I think [QUIC] encrypting the L4 headers is a step too far. If these protocols get deployed widely enough (a distinct possibility with standardization), the operational pain will be significant.
    • @Tobarja: "anyone who is doing microservices is spending about 25% of their engineering effort on their platform" @jedberg 
    • @cdixon: 2016 League of Legends finals: 43M viewers 2016 NBA finals: 30.8M viewers 
    • @mikeolson: 7 billion people on earth; 3 billion images shared on social media every day. @setlinger at #StrataHadoop
    • @swardley: When you think about AWS Lambda, AWS Step Functions et al then you need to view this through the lens of automating basic doctrine i.e. not just saying it and codifying in maps and related systems but embedding it everywhere. At scale and at the speed of competition that I expect us to reach then this is going to be essential.
    • Jakob Engblom: hardware accelerators for particular  common expensive tasks seems to be the right way to add performance at the smallest cost in silicon area and power consumption.
    • Joe Duffy: The future for our industry is a massively distributed one, however, where you want simple individual components composed into a larger fabric. In this world, individual nodes are less “precious”, and arguably the correctness of the overall orchestration will become far more important. I do think this points to a more Go-like approach, with a focus on the RPC mechanisms connecting disparate pieces
    • @cmeik: AWS Lambda is cool if you never had to worry about consistency, availability and basically all of the tradeoffs of distributed systems.
    • prions: As a Civil Engineer myself, I feel like people don't realize the amount of underlying stuff that goes into even basic infrastructure projects. There's layers of planning, design, permitting, regulations and bidding involved. It usually takes years to finally get to construction and even then there's a whole host of issues that arise that can delay even a simple project. 
    • Netflix: If you can cache everything in a very efficient way, you can often change the game. 
    • The Attention Merchants: One [school] board in Florida cut a deal to put the McDonald’s logo on its report cards (good grades qualified you for a free Happy Meal). In recent years, many have installed large screens in their hallways that pair school announcements with commercials. “Take your school to the digital age” is the motto of one screen provider: “everyone benefits.” What is perhaps most shocking about the introduction of advertising into public schools is just how uncontroversial and indeed logical it has seemed to those involved.

  • Just how big is Netflix? The story of the tape is told in Another Day in the Life of a Netflix Engineer. Netflix runs way more than 100K EC2 instances and more than 80,000 CPU cores. They use both predictive and reactive autoscaling, aiming for not too much or too little, just the right amount. Of those 100K+ instances they will autoscale up and down 20% of that capacity everyday. More than 50Gbps ELB traffic per region. More than 25Ggps is telemetry data from devices sending back customer experience data. At peak Netflix is responsible over 37% of Internet traffic. The monthly billing file for Netflix is hundreds of megabytes with over 800 million lines of information. There's a hadoop cluster at Amazon whose only purpose is to load Netflix's bill. Netflix considers speed of innovation to be a strategic advantage. About 4K code changes are put into production per day. At peak over 125 million hours of video were streamed in a day. Support for 130 countries was added in one day. That last one is the kicker. Reading about Netflix over all these years you may have got the idea Netflix was over engineered, but going global in one day was what it was all about. Try that if you are racking and stacking. 

  • Oh how I miss stories that began Once upon a time. The start of so many stories these days is The attack sequence begins with a simple phishing scheme. This particular cautionary tale is from Technical Analysis of Pegasus Spyware, a very, almost lovingly, detailed account of the total ownage of the "secure" iPhone. The exploit made use of three zero-day vulnerabilities: CVE-2016-4657: Memory Corruption in WebKit, CVE-2016-4655: Kernel Information Leak, CVE-2016-4656: Kernel Memory corruption leads to Jailbreak. Do not read if you would like to keep your Security Illusion cherry intact. 

  • Composing RPC calls gets harder has the graph of calls and dependencies explodes. Here's how Twitter handles it. Simplify Service Dependencies with Nodes. Here's their library on GitHub. It's basically just a way to setup a dependency graph in code and have all the RPCs executed to the plan. It's interesting how parallel this is to setting up distributed services in the first place. They like it: We have saved thousands of lines of code, improved our test coverage and ended up with code that’s more readable and friendly for newcomers. Also, AWS Step Functions.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Dec062016

Sponsored Post: Loupe, New York Times, ScaleArc, Aerospike, Scalyr, Gusto, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring?

  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com

  • IT Security Engineering. At Gusto we are on a mission to create a world where work empowers a better life. As Gusto's IT Security Engineer you'll shape the future of IT security and compliance. We're looking for a strong IT technical lead to manage security audits and write and implement controls. You'll also focus on our employee, network, and endpoint posture. As Gusto's first IT Security Engineer, you will be able to build the security organization with direct impact to protecting PII and ePHI. Read more and apply here.

Fun and Informative Events

  • Your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • ScaleArc's database load balancing software empowers you to “upgrade your apps” to consumer grade – the never down, always fast experience you get on Google or Amazon. Plus you need the ability to scale easily and anywhere. Find out how ScaleArc has helped companies like yours save thousands, even millions of dollars and valuable resources by eliminating downtime and avoiding app changes to scale. 

  • Scalyr is a lightning-fast log management and operational data platform.  It's a tool (actually, multiple tools) that your entire team will love.  Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. .  Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex measures your database servers’ work (queries), not just global counters. If you’re not monitoring query performance at a deep level, you’re missing opportunities to boost availability, turbocharge performance, ship better code faster, and ultimately delight more customers. VividCortex is a next-generation SaaS platform that helps you find and eliminate database performance problems at scale.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Click to read more ...

Page 1 ... 4 5 6 7 8 ... 213 Next 10 Entries »