Wednesday
Dec 17, 2014

The Big Problem is Medium Data

This is a guest post by Matt Hunt, who leads open source projects for Bloomberg LP R&D. 

“Big Data” systems continue to attract substantial funding, attention, and excitement. As with many new technologies, they are neither a panacea, nor even a good fit for many common uses. Yet they also hold great promise. The question is, can systems originally designed to serve hundreds of millions of requests for something like web pages also work for requests that are computationally expensive and have tight tolerances?

Modern era big data technologies are a solution to an economics problem faced by Google and other Internet giants a decade ago. Storing, indexing, and responding to searches against all web pages required tremendous amounts of disk space and computer power. Very powerful machines, fast SAN storage, and data center space were prohibitively expensive. The solution was to pack cheap commodity machines as tightly together as possible with local disks.

This addressed the space and hardware cost problem, but introduced a software challenge. Writing distributed code is hard, and with many machines come many failures. So a framework was also required to handle such problems automatically for the system to be viable.

Hadoop

Right now, the computing industry is in a transition phase that began with the arrival of Hadoop and its community in 2004. Understanding why and how these systems were created also offers insight into some of their weaknesses.

At Bloomberg, we don't have a big data problem. What we have is a “medium data” problem -- and so does everyone else. Systems such as Hadoop and Spark are generally less efficient and less mature for these typical low-latency enterprise uses. High core counts, SSDs, and large RAM footprints are common today - but many of the commodity platforms have yet to take full advantage of them, and challenges remain. A number of distributed components are further hampered by Java, which creates its own complications for low-latency performance.

A practical use case

Click to read more ...

Tuesday
Dec 16, 2014

Multithreaded Programming has Really Gone to the Dogs

Taken from Multithreaded programming - theory and practice on reddit, which also has some very funny comments. If anything, this is way too organized. 

 What's not shown? All the little messes that have to be cleaned up after...

Monday
Dec 15, 2014

The Machine: HP's New Memristor Based Datacenter Scale Computer - Still Changing Everything

The end of Moore’s law is the best thing that’s happened to computing in the last 50 years. Moore’s law has been a tyranny of comfort. You were assured your chips would see a constant improvement. Everyone knew what was coming and when it was coming. The entire semiconductor industry was held captive to delivering on Moore’s law. There was no new invention allowed in the entire process. Just plod along on the treadmill and do what was expected. We are finally breaking free of these shackles and entering what is the most exciting age of computing that we’ve seen since the late 1940s. Finally we are in a stage where people can invent and those new things will be tried out and worked on and find their way into the market. We’re finally going to do things differently and smarter.

-- Stanley Williams (paraphrased)

HP has been working on a radically new type of computer, enigmatically called The Machine (not this machine). The Machine is perhaps the largest R&D project in the history of HP. It’s a complete rebuild of both hardware and software from the ground up. A massive effort. HP hopes to have a small version of their datacenter scale product up and running in two years.

The story began when we first met HP’s Stanley Williams about four years ago in How Will Memristors Change Everything? In the latest chapter of the memristor story, Mr. Williams gives another incredible talk: The Machine: The HP Memristor Solution for Computing Big Data, revealing more about how The Machine works.

The goal of The Machine is to collapse the memory/storage hierarchy. Computation today is energy inefficient. Eighty percent of the energy and vast amounts of time are spent moving bits between hard disks, memory, processors, and multiple layers of cache. Customers end up spending more money on power bills than on the machines themselves. So The Machine has no hard disks, DRAM, or flash. Data is held in power-efficient memristors, an ion-based nonvolatile memory, and moved over a photonic network, another very power-efficient technology. When a bit of information leaves a core, it leaves as a pulse of light.

On graph processing benchmarks The Machine reportedly performs two to three orders of magnitude better on energy efficiency and one order of magnitude better on time. There are no details on these benchmarks, but that's the gist of it.

The Machine puts data first. The concept is to build a system around nonvolatile memory with processors sprinkled liberally throughout the memory. When you want to run a program, you send it to a processor near the memory, do the computation locally, and send the results back. Computation uses a wide range of heterogeneous multicore processors. By transmitting only the program and the results, rather than moving terabytes or petabytes of data around, the savings are enormous.
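
To make that bytes-moved argument concrete, here is a rough back-of-the-envelope sketch. The dataset, program, and result sizes below are invented purely for illustration; they are not HP's figures.

```typescript
// Hypothetical sizes, for illustration only.
const TB = 1e12;
const MB = 1e6;
const datasetBytes = 10 * TB;  // data sitting in nonvolatile memory
const programBytes = 2 * MB;   // the code we want to run against it
const resultBytes = 100 * MB;  // the answer we want back

// Classic approach: drag the data to a central processor.
const dataToCompute = datasetBytes;

// The Machine's approach: push the program to processors near the memory,
// compute locally, and move only the results back.
const computeToData = programBytes + resultBytes;

console.log(`data-to-compute moves ${dataToCompute / TB} TB`);   // 10 TB
console.log(`compute-to-data moves ${computeToData / MB} MB`);   // 102 MB
console.log(`roughly ${Math.round(dataToCompute / computeToData)}x less traffic`);
```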

The Machine is not targeted at standard HPC workloads. It's not a LINPACK buster. The problem HP is trying to solve is one where a customer wants to run a query and find the answer by searching through a gigantic pile of data: problems that need to store lots of data and analyze it in real time as new data comes in.

Why is a very different architecture needed for building a computer? Computer systems can't keep up with the flood of data that's coming in. HP is hearing from their customers that they need the ability to handle ever greater amounts of data. The number of bits being collected is growing exponentially faster than the rate at which transistors are being manufactured. Information collection is also growing faster than the rate at which hard disks are being manufactured. HP estimates there are 250 trillion DVDs worth of data that people really want to do something with. Vast amounts of data being collected in the world are never even looked at.

So something new is needed. That’s at least the bet HP is making. While it’s easy to get excited about the technology HP is developing, it won’t be for you and me, at least until the end of the decade. These will not be commercial products for quite a while. HP intends to use them for their own enterprise products, internally consuming everything that’s made. The idea is we are still very early in the tech cycle, so high cost systems are built first, then as volumes grow and processes improve, the technology will be ready for commercial deployment. Eventually costs will come down enough that smaller form factors can be sold.

What is interesting is that HP is essentially building its own cloud infrastructure, but instead of leveraging commodity hardware and software, they are building their own best-of-breed custom hardware and software. A cloud typically makes available vast pools of memory, disk, and CPU, organized around instance types and connected by fast networks. Recently there's been a move to treat these resource pools as independent of the underlying instances, which is why high-level scheduling software like Kubernetes and Mesos is becoming a bigger force in the industry. HP has to build all this software themselves, solving many of the same problems, while also exploiting the opportunities provided by specialized chips. You can imagine programmers writing very specialized applications to eke out every ounce of performance from The Machine, but what is more likely is that HP will have to create a very sophisticated scheduling system to optimize how programs run on top of it. Schedule optimization is the next frontier being explored on the cloud.

The talk is organized in two broad sections: hardware and software. Two-thirds of the project is software, but Mr. Williams is a hardware guy, so hardware makes up the majority of the talk. The hardware section is built around the idea of matching each function to the physics best suited for it: electrons compute; ions store; photons communicate.

Here's my gloss on Mr. Williams' talk. As usual with such a complex subject, much can be missed. Also, Mr. Williams tosses huge, interesting ideas around like pancakes, so viewing the talk is highly recommended. But until then, let's see The Machine HP thinks will be the future of computing….

Click to read more ...

Friday
Dec 12, 2014

Stuff The Internet Says On Scalability For December 12th, 2014

Hey, it's HighScalability time:


We've had a wee bit of a storm in the bay area.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Dec 10, 2014

Reactive prefetch speeds Google's mobile search by 100-150 milliseconds.

Increasing responsiveness by parallelizing and prefetching content using hints and dependency graphs is an old concept, but seldom do we see such a nice, tight example of the benefit as the one given by the great Ilya Grigorik in this G+ post:

The insight here is that we're initiating the fetch for the HTML and its critical resources in parallel... which requires that the page initiating the navigation knows which critical resources are being used on the target page.

This is a powerful pattern and one that you can use to accelerate your site as well. The key insight is that we are not speculatively prefetching resources and do not incur unnecessary downloads. Instead, we wait for the user to click the link and tell us exactly where they are headed, and once we know that, we tell the browser which other resources it should fetch in parallel - aka, reactive prefetch!

As you can infer, implementing the above strategy requires a lot of smarts both in the browser and within the search engine... First, we need to know the list of critical resources that may delay rendering of the destination page for every page on the web! No small feat, but the Search team has us covered - they're good like that. Next, we need a browser API that allows us to invoke the prefetch logic when the click occurs: the search page listens for the click event, and once invoked, dynamically inserts prefetch hints into the search results page. Finally, this is where Chrome comes in: as the search results page is unloaded, the browser begins fetching the hinted resources in parallel with the request for the destination page. The net result is that the critical resources are fetched much sooner, allowing the browser to render the destination page 100-150 milliseconds earlier.
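
To make the pattern concrete, here is a minimal page-side sketch of reactive prefetch. It assumes a results page whose links carry a hypothetical data-critical-resources attribute listing the destination page's critical resource URLs; Google's actual implementation is not public, so treat this only as an illustration of the idea.

```typescript
// Reactive prefetch sketch: nothing is fetched speculatively. Hints are
// inserted only after the click tells us where the user is headed.
document.addEventListener("click", (event) => {
  const target = event.target as HTMLElement | null;
  const link = target?.closest<HTMLAnchorElement>("a[data-critical-resources]");
  if (!link) return;

  // Tell the browser which resources to fetch in parallel with the navigation.
  for (const url of (link.dataset.criticalResources ?? "").split(",")) {
    const hint = document.createElement("link");
    hint.rel = "prefetch";
    hint.href = url.trim();
    document.head.appendChild(hint);
  }
  // Navigation proceeds normally; the prefetches ride along as the page unloads.
});
```

The key property from the post is preserved: no unnecessary downloads are incurred, because the hints are added only after the click reveals the destination.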

Tuesday
Dec 9, 2014

In Memory: Grace Hopper to Programmers: Mind Your Nanoseconds!

This is an article published a few years ago, but as today is Grace Hopper's birthday, I thought it would be a good time to share again an amazing talk from this amazing woman.

Computing pioneer Grace Hopper, inventor of the compiler, searched for a concrete way to create an intuitive understanding of just how fast a nanosecond, a billionth of a second, really is - the speed of their new computer circuits. As an illustration she settled on a length of wire equal to the distance light travels in one nanosecond. The length is a very portable 11.8 inches. A microsecond's worth of wire is still portable, but a much bulkier 984 feet. In one millisecond light travels 186 miles, which only Hercules could carry. In today's terms, at a 3.06 GHz clock speed, there's .33 nanoseconds between ticks, or 3.73 inches of light travel.
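
If you want to check the arithmetic yourself, here is a tiny sketch that reproduces those distances from the speed of light; the constants are standard physical values, not figures taken from the talk.

```typescript
// How far light travels in a nanosecond, microsecond, and millisecond,
// plus the time between ticks of a 3.06 GHz clock.
const C_METERS_PER_SECOND = 299_792_458;

const nsInInches = C_METERS_PER_SECOND * 1e-9 * 39.3701;   // ≈ 11.8 inches
const usInFeet = C_METERS_PER_SECOND * 1e-6 * 3.28084;     // ≈ 984 feet
const msInMiles = C_METERS_PER_SECOND * 1e-3 / 1609.344;   // ≈ 186 miles
const nsPerTick = 1e9 / 3.06e9;                            // ≈ 0.33 ns between ticks

console.log({ nsInInches, usInFeet, msInMiles, nsPerTick });
```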

Understanding the profligate ways of programmers, she suggested that every programmer wear a necklace of a microsecond's worth of wire so they know what they are wasting when they throw away microseconds. And if a general is busting your chops about satellite messages taking too long to send, you can bust out your piece of wire and explain that there are a lot of nanoseconds between here and there.

Here's a short, witty, and wise video of her famous nanosecond demonstration. An amazing lady, great innovator, an engaging speaker, and an inspiring teacher.

Related Articles

Tuesday
Dec 9, 2014

Sponsored Post: Campanja, Hypertable, Sprout Social, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Campanja is an Internet advertising optimization company born in the cloud, and today we are one of the Nordics' bigger AWS consumers. The time has come for us to embrace the next generation of cloud infrastructure. We believe in immutable infrastructure, container technology, and microservices; we hope to use PaaS when we can get away with it, but consume at the IaaS layer when we have to. Please apply here.

  • Performance and Scale Engineer at Sprout Social. You will be like a physical trainer for the Sprout social media management platform: you will evaluate and make improvements to keep our large, diverse tech stack happy, healthy, and, most importantly, fast. You'll work up and down our back-end stack - from our RESTful API through to our myriad data systems and into the Java services and Hadoop clusters that feed them - searching for SPOFs, performance issues, and places where we can shore things up. Apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, a leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend components that manage application architectures. Apply here.

Fun and Informative Events

  • Sign Up for New Aerospike Training Courses. Aerospike now offers two certified training courses: Aerospike for Developers and Aerospike for Administrators & Operators, to help you get the most out of your deployment. Find a training course near you. http://www.aerospike.com/aerospike-training/

Cool Products and Services

  • Aerospike Hits 1M writes per second with 6x Fewer Servers than Cassandra. A new Google Compute Engine benchmark demonstrates how the Aerospike database hit 1 million writes per second with just 50 nodes - compared to Cassandra's 300 nodes. Read the benchmark: http://www.aerospike.com/blog/1m-wps-6x-fewer-servers-than-cassandra/

  • Hypertable Inc. Announces New UpTime Support Subscription Packages. The developer of Hypertable, an open-source, high-performance, massively scalable database, announces three new UpTime support subscription packages – Premium 24/7, Enterprise 24/7 and Basic. 24/7/365 support packages start at just $1995 per month for a ten node cluster -- $49.95 per machine, per month thereafter. For more information visit us on the Web at http://www.hypertable.com/. Connect with Hypertable: @hypertable--Blog.

  • FoundationDB launches SQL Layer. SQL Layer is an ANSI SQL engine that stores its data in the FoundationDB Key-Value Store, inheriting its exceptional properties like automatic fault tolerance and scalability. It is best suited for operational (OLTP) applications with high concurrency. Users of the Key-Value Store will have free access to SQL Layer. SQL Layer is also open source; you can get started with it on GitHub as well.

  • Diagnose server issues from a single tab. The Scalyr log management tool replaces all your monitoring and analysis services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. It's a universal tool for visibility into your production systems. Log aggregation, server metrics, monitoring, alerting, dashboards, and more. Not just “hosted grep” or “hosted graphs,” but enterprise-grade functionality with sane pricing and insane performance. Trusted by in-the-know companies like Codecademy – try it free!

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you, there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Friday
Dec 5, 2014

Stuff The Internet Says On Scalability For December 5th, 2014

Hey, it's HighScalability time:


InfoSec Taylor Swift is wise...haters gonna hate.

 

  • 6 billion+: Foursquare checkins; 25000: allocs for every keystroke in Chrome's Omnibox
  • Quotable Quotes:
    • @wattersjames: Pretty convinced that more value is created by networks of products in today's world than 'stacks'--'stack' model seems outdated.
    • @ChrisLove: WOW 70% of http://WalMart.com  traffic over the holidays was from a 'mobile' device! #webperf #webdevelopment #html5
    • @Nick_Craver: No compelling reason - we can run all of #stackexchange on one R610 server (4yr old) @ 5% CPU. #redis is incredibly efficient.
    • @jehiah: The ticker on http://bitly.com  rolled past 20 BILLION Bitlinks today. Made possible by reliable mysql clusters + NSQ.
    • @tonydenyer: "micro services how to turn your ball of mud into a distributed ball of mud" #xpdaylon
    • @moonpolysoft: containers are the new nosql b/c all are dimly aware of a payoff somewhere, and are more than willing to slice each other up to get there.
    • @shipilev: Shipilev's First Law of Performance Issues: "It is almost always something very simple and embarrassing to admit once you found it"
    • Gérard Berry: We discovered that this whole hypothesis of being infinitely fast both simplified programming and helped us to be faster than others, which was fun.
    • @randybias: OH: “The religion of technology is featurism.”  [ brilliant observation ]
    • @rolandkuhn: ACID 2.0: associative commutative idempotent distributed #reactconf @PatHelland
    • @techmilind: @adrianco @wattersjames @cloudpundit Intra-region latency of <2ms is the killer feature. That's what makes Aurora possible.
    • @timreid: async involves a higher level of initial essential complexity, but a greatly reduced level of continual accidental complexity #reactconf
    • @capotribu: Docker Machine + Swarm + (proposed) Compose = multi-containers apps on clusters in 1 command #DockerCon
    • @dthume: "Some people say playing with thread affinity is for wimps. Those people don't make money" - @mjpt777 at #reactconf
    • @jamesurquhart: Reading a bunch of apparently smart people remain blind to the lessons of complexity. #rootcauseismerelyaclue
    • Facebook: the rate of our machine-to-machine traffic growth remains exponential, and the volume has been doubling at an interval of less than a year.

  • In the US we tend to be practical mobile users instead of personal and social fun users. Innovation is happening elsewhere as is clearly shown in Dan Grover's epic exploration of Chinese Mobile App UI Trends: using voice instead of text for messaging; QR codes for everything; indeterminate badges to indicate there's something interesting to look at; a discover menu item that contains "changing menagerie of fun"; lots of app stores; using phone numbers for login, even on websites; QR code logins; chat as the universal UI; more damn stickers; each app has a wallet; use of location in ways those in the US might find creepy; tight integration with offline consumption; common usage of the assistive touch feature on the iPhone; cutesy mascots in loading and error screens; pollution widgets; full ad splash screen when an app starts; theming of apps. 

  • Awesome analysis. A really deep dive with great graphics on Facebook's new network architecture. Facebook Fabric Networking Deconstructed: the new Facebook Network Fabric is in fact a Fat-Tree with 3-levels.

  • Just a tiny thing. AMD, Numascale, and Supermicro Deploy Large Shared Memory System: The Numascale system, installed over the past two weeks, consists of 5184 CPU cores and 20.7 TBytes of shared memory housed in 108 Supermicro 1U servers connected in a 3D torus with NumaConnect, using three cabinets with 36 servers apiece in a 6x6x3 topology. Each server has 48 cores across three AMD Opteron 6386 CPUs and 192GB of memory, providing a single system image and 20.7 TB to all 5184 cores.

  • Quit asking why something happened. The question that must be answered is how. The Infinite Hows (or, the Dangers Of The Five Whys)

  • What do customers want? Answers. Greg Ferro talks about how a company that hired engineers to answer sales inquiries doubled their sales. All people wanted was to have their questions answered. Once answered, they would place an order. No complex, time-wasting sales cycle required. Technology has replaced the information-gathering part of the sales cycle. Customers already know everything about a product before making contact. Now what they need are answers.

  • Docker Networking is as simple as a reverse 4 and a half somersault piked from a 3 metre board into a black hole.

  • How is that Google Cloud Platform thingy working out? Pretty well says Aerospike. 1M Aerospike reads/sec for just $11.44/hour, on 50 nodes, with  linear scalability for 100% reads and 100% writes.

  • Your bright human mind can solve a maze. So what? Fatty acid chemistry can too: This study demonstrates that the Marangoni flow in a channel network can solve maze problems such as exploring and visualizing the shortest path and finding all possible solutions in a parallel fashion. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Dec 3, 2014

All employees should be limited only by their ability rather than an absence of resources.

James Hamilton hid a pearl of wisdom inside Why Renewable Energy (Alone) Won't Fully Solve the Problem that I think is well worth prying out:

I’ve long advocated the use of economic incentives to drive innovative uses of computing resources inside the company while preventing costs from spiraling out of control.  Most IT departments control costs by having computing resources in short supply and only buying more resources slowly and with considerable care. Effectively computing is a scarce resource so it needs to get used carefully. This effectively limits IT cost growth and controls wastage but it also limits overall corporate innovation and the gains driven by the experiments that need these additional resources.

I’m a big believer in making effectively infinite computing resources available internally and billing them back precisely to the team that used them. Of course, each internal group needs to show the customer value of their resource consumption. Asking every group to effectively be a standalone profit center is, in some ways, complex in that the “product” from some groups is hard to quantitatively measure. Giving teams the resources they need to experiment and then allowing successful experiments to progress rapidly into production encourages innovation, makes for a more exciting place to work, and the improvements brought by successful experiments help the company be more competitive and better serve its customers.

I argue that all employees should be limited only by their ability rather than an absence of resources or an inability to argue convincingly for more. This is one of the most important yet least discussed advantages of cloud computing: taking away artificial resource limitations in support of light-weight experimentation and rapid innovation. Making individual engineers and teams responsible to deliver more value for more resources consumed makes it possible to encourage experimentation without fear that costs will rise without sufficient value being produced.