Friday
May 23, 2014

Gone Fishin' 2014

Well, not exactly Fishin', but I'll be on a month-long vacation starting today.

I won't be posting new content, so we'll all have a break. Disappointing, I know.

If you've ever wanted to write an article for HighScalability this would be a great time :-) I'd be very interested in your experiences with containers vs VMs if you have some thoughts on the subject.

So if the spirit moves you, please write something.

See you on down the road...

Wednesday
May 21, 2014

9 Principles of High Performance Programs

Arvid Norberg on the libtorrent blog has put together an excellent list of principles of high performance programs, obviously derived from hard-won experience programming on BitTorrent:

Two fundamental causes of performance problems:

  1. Memory Latency. A big performance problem on modern computers is the latency of SDRAM. The CPU waits idle for a read from memory to come back.
  2. Context Switching. When a CPU switches context "the memory it will access is most likely unrelated to the memory the previous context was accessing. This often results in significant eviction of the previous cache, and requires the switched-to context to load much of its data from RAM, which is slow."

Rules to help balance the forces of evil:

Click to read more ...

Tuesday
May 20, 2014

It's Networking. In Space! Or How E.T. Will Phone Home.

What will the version of the Internet that follows us to the stars look like? Yes, people are really thinking seriously about this sort of thing. Specifically the InterPlanetary Networking Special Interest Group (IPNSIG).

Ansible-like faster-than-light communication it isn't. There's no magical warp drive. Nor is it a network of telepaths acting as a 'verse-spanning telegraph system.

It's more mundane than that. And in many ways more interesting, as it's sort of like the old Internet on steroids, the one that was based on UUCP and dial-up connections, but over vastly longer distances and with much longer delays:

Click to read more ...

Monday
May 19, 2014

A Short On How the Wayback Machine Stores More Pages than Stars in the Milky Way

How does the Wayback Machine work? Now with over 400 billion webpages indexed, allowing the Internet to be browsed all the way back to 1996, it's an even more compelling question. I've looked several times but I've never found a really good answer.

Here's some information from a thread on Hacker News. It starts with mmagin, a former Archive employee:

Click to read more ...

Friday
May 16, 2014

Stuff The Internet Says On Scalability For May 16th, 2014

Hey, it's HighScalability time:


Cross Section of an Undersea Cable. It's industrial art. The parts. The story.
  • 400,000,000,000: Wayback Machine pages indexed; 100 billion: Google searches per month; 10 million: Snapchat monthly user growth.
  • Quotable Quotes:
    • @duncanjw: The Great Rewrite - many apps will be rewritten not just replatformed over next 10 years says @cote #openstacksummit
    • @RFFlores: The Openstack conundrum. If you don't adopt it, you will regret it in the future. If you do adopt it, you will regret it now
    • elementai: I love Redis so much, it became like a superglue where "just enough" performance is needed to resolve a bottleneck problem, but you don't have resources to rewrite a whole thing in something fast.
    • @antirez: "when software engineering is reduced to plumbing together generic systems, software engineers lose their sense of ownership"
    • Tom Akehurst: Microservices vs. monolith is a false dichotomy.
    • @joestump: “Keep in mind that any piece of butt-based infrastructure can fail at any time. Plan your infrastructure accordingly.” Ain’t that the truth?
    • @SalesforceEng: Check out the scale of Kafka @LinkedInEng. @bonkoif says these numbers are about a month old. 3.25 million msgs/sec. 
    • Don Neufeld: The first is to look deeply into the stack of implicit assumptions I’m working with. It’s often the unspoken assumptions that are the most important ones. The second flows from the first and it’s to focus less on building the right thing and more how we’re going to meet our immediate needs.
    • Dan Gillmor: We’re in danger of losing what’s made the Internet the most important medium in history – a decentralized platform where the people at the edges of the networks – that would be you and me – don’t need permission to communicate, create and innovate.

  • If you think of a Hotel as an app, hotels have been doing in-app purchases for a long time. They lead with a teaser rate and then charge for anything that might cross a desire-money threshold. Wifi, that's extra. Gym, that's extra. The bar, a cover charge. Drinks, so so expensive. The pool, extra. A lounge by the pool is double extra extra. To go all the way hotels just need to let you stay for free and then fully monetize all the gamification points.

  • Apple: We handle hundreds of millions of active users using some of the most desirable devices on the planet and several billion iMessages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date.

  • It's a data prison for everyone! Comcast plans data caps for all customers in 5 years, could be 500GB. Or just a few 4K movies.

  • From the future of everything to the verge of extinction. The Slow Decline of Peer-to-Peer File Sharing: People have shifted their activities to streaming over file sharing. Subscribers get quality content at a reasonable price and it's dead simple to use, whereas torrenting or file sharing is a little more complicated.

  • I don't think people understand how hard this is to do in practice. European Court Lets Users Erase Records on Web. Once data is stored on tape deleting takes rewriting all the non-deleted data to another tape. So it's far more efficient to forget indexes to data than delete the data. Which goes against the point I'd imagine.

  • How is a strategy tax hands off? @parislemon: Instagram's decision to use Facebook's much worse place database over Foursquare's has made the product worse. Stupid.

  • Excellent detailed example of the SEDA architecture in action. Guide to Cassandra Thread Pools. Follow the regal message as it flows from thread pool to thread pool, transforming as it makes its way to its final resting place.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Click to read more ...

Thursday
May 15, 2014

Paper: SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine

So how do you knit multiple datacenters and many thousands of phones and other clients into a single cooperating system?

Usually you don't. It's too hard. We see nascent attempts in services like Firebase and Parse. 

SwiftCloud, as described in SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine, goes two steps further, by leveraging Conflict-free Replicated Data Types (CRDTs), which means "data can be replicated at multiple sites and be updated independently with the guarantee that all replicas converge to the same value. In a cloud environment, this allows a user to access the data center closer to the user, thus optimizing the latency for all users."

While we don't see these kinds of systems just yet, they are a strong candidate for how things will work in the future, efficiently using resources at every level while supporting huge numbers of cooperating users.
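
To make the convergence guarantee in that quote concrete, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest textbook CRDTs. This is not SwiftCloud's API; the class name, replica names, and numbers are made up for illustration. Each replica only increments its own slot, and merging takes the per-slot maximum, so replicas can sync in any order, any number of times, and still agree.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by per-slot max."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever bumps its own slot, so updates never conflict.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Per-slot max is commutative, associative, and idempotent,
        # which is what lets replicas converge regardless of sync order.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Hypothetical replicas: a phone and a datacenter updating independently.
phone = GCounter("phone")
datacenter = GCounter("dc-eu")
phone.increment(3)
datacenter.increment(2)

# Sync in both directions; repeating a merge changes nothing.
phone.merge(datacenter)
datacenter.merge(phone)
datacenter.merge(phone)

print(phone.value(), datacenter.value())
```

Both replicas end up with the same value even though neither asked permission before updating, which is the property that makes CRDTs attractive for clients that are often offline.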

Abstract:

Click to read more ...

Wednesday
May 14, 2014

Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

After Google cut prices on their Google Cloud Platform, Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see.

Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months.
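
As a back-of-the-envelope illustration of what "halve every 18-24 months" implies, here is a small sketch. The starting price of 100 is a made-up unit, not a real quote from any provider, and the function name is just for this example.

```python
def projected_price(price_today: float, months: float, halving_months: float) -> float:
    """Price after `months` if it halves every `halving_months` (simple exponential decay)."""
    return price_today * 0.5 ** (months / halving_months)


START = 100.0  # hypothetical price in arbitrary units

for halving in (18, 24):
    after_3_years = projected_price(START, 36, halving)
    print(f"halving every {halving} months: {after_3_years:.1f} after 3 years")
```

With an 18 month halving period the price falls to a quarter in three years; with 24 months it falls to a bit over a third. Either way, a three year build out plan looks very different at the end than at the start.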

That's good news. Greater cost certainty means you can make much more aggressive build out plans. With the savings you can hire more people, handle more customers, and add those media rich features you thought you couldn't afford. Design is directly related to costs.

Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable.

As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting market share is what is important.

Amazon, on the other hand, has been reaping the higher margins earned from recurring customers. So Google's entrance into the early product life cycle phase is making Amazon eat into their margins and is forcing down prices overall.

But there's a floor to how low prices can go. Alen Peacock, co-founder of Space Monkey has an interesting graphic telling the story. This is Amazon's historical pricing for 1TB of storage in S3, graphed as a multiple of the historical pricing for 1TB of local hard disk:

Alen explains it this way:

Cloud prices do decrease over time, and have dropped significantly over the timespan shown in the graph, but this graph shows cloud storage prices as a multiple of hard disk prices. In other words, hard disk prices are dropping much faster than datacenter prices. This is because, right, datacenters have costs other than hard disks (power, cooling, bandwidth, building infrastructure, diesel backup generators, battery backup systems, fire suppression, staffing, etc). Most of those costs do not follow Moore's Law -- in fact energy costs are on a long trend upwards. So over time, the gap shown by the graph should continue to widen.

 

The economic advantage of building your own (but housed in datacenters) is there, but it isn't huge. There is also some long term strategic advantage to building your own, e.g., GDrive dropped their price dramatically at will because Google owns their datacenters, but Dropbox couldn't do that without convincing Amazon to drop the price they pay for S3.

Costs other than hardware began dominating in datacenters several years ago, so Moore's Law-like effects are dampened. Energy and cooling costs do not follow Moore's Law, and those costs make up a huge component of the overall picture in datacenters. This is only going to get worse, barring some radical new energy production technology arriving on the scene.

What we're [Space Monkey] interested in, long term, is dropping the floor out from underneath all of these, and I think that only happens if you get out of the datacenter entirely.
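
Alen's argument can be illustrated with a toy model. Assume, purely for illustration, that a provider's cost per TB is a disk component that halves every two years plus a non-disk component (power, cooling, staff and so on) that stays flat, and that price tracks cost. Every number below is hypothetical; none comes from Amazon's or Space Monkey's actual data.

```python
def ratio_over_time(years: int,
                    disk_cost: float = 100.0,      # hypothetical starting disk cost per TB
                    overhead_cost: float = 100.0,  # hypothetical flat non-disk cost per TB
                    disk_halving_years: float = 2.0) -> list:
    """Return the cloud-to-raw-disk cost multiple for each year from 0 to `years`."""
    ratios = []
    for year in range(years + 1):
        # Only the disk component follows a Moore's Law-like curve.
        disk = disk_cost * 0.5 ** (year / disk_halving_years)
        ratios.append((disk + overhead_cost) / disk)
    return ratios


ratios = ratio_over_time(6)
for year, ratio in enumerate(ratios):
    print(f"year {year}: cloud costs {ratio:.1f}x raw disk")
```

Even though the total cost falls every year in this model, the multiple over raw disk keeps climbing, which is the shape Alen describes. If the non-disk costs rise instead of staying flat, the gap widens faster.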

As the cloud market is still growing, there will still be a fight for market share. When growth slows and the market is divided between major players, a collusive pricing phase will take over. Cloud customers are sticky customers. It's hard to move off a cloud. The need for higher margins to justify the cash flow drain during the customer acquisition phase will reverse the favorable trends we are seeing now.

Until then it seems the economics indicate we are in a rent, not a buy world.

Related Articles 

  • IaaS Series: Cloud Storage Pricing – How Low Can They Go? - "For now it seems we can assume we’ve not seen the last of the big price reductions."
  • The Cloud Is Not Green
  • Brad Stone: “Bill Miller, the chief investment officer at Legg Mason Capital Management and a major Amazon shareholder, asked Bezos at the time about the profitability prospects for AWS. Bezos predicted they would be good over the long term but said that he didn’t want to repeat “Steve Jobs’s mistake” of pricing the iPhone in a way that was so fantastically profitable that the smartphone market became a magnet for competition.” 

Tuesday
May 13, 2014

Sponsored Post: Apple, Cloudant, CopperEgg, Logentries, Wargaming.net, PagerDuty, HelloSign, CrowdStrike, Gengo, ScaleOut Software, Couchbase, MongoDB, BlueStripe, AiScaler, Aerospike, LogicMonitor, AppDynamics, ManageEngine, Site24x7  

Who's Hiring?


  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here.
    • Enterprise Software Engineer. Apple's Emerging Technology Services group provides a Java based SOA platform for various applications to interact with each other. The platform is designed to handle millions of messages a day with very low latency. We have an immediate opening for a talented Software Engineer in a highly visible team who is passionate about exploring emerging technologies to create elegant scalable solutions. Please apply here
    • Mobile Services Software Engineer. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. Please apply here
    • Sr. Software Engineer-iOS Systems. Do you love building highly scalable, distributed web applications? Does the idea of performance tuning Java applications make your heart leap? If so, iOS Systems is looking for a highly motivated, detail-oriented, energetic individual with excellent written and oral skills who is not afraid to think outside the box and question assumptions. Please apply here
    • Senior Software Engineering Manager. As a Senior Software Engineering Manager on our team, you will be managing very dedicated and talented engineering teams. You will be responsible for managing the development of a mobile point of sale system on iPod touch hardware. Please apply here.
    • Sr Software Engineer - Messaging Services. An exciting opportunity for a Software Engineer to join Apple's Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. We handle hundreds of millions of active users using some of the most desirable devices on the planet and several billion iMessages/day, 40 billion push notifications/day, 16+ trillion push notifications sent to date. Please apply here.

  • Engine Programmer - C/C++. Wargaming|BigWorld is seeking Engine Programmers to join our team in Sydney, Australia. We offer a relocation package, Australian working visa & great salary + bonus. Your primary responsibility will be to work on our PC engine. Please apply here

  • Senior Engineer wanted for large scale, security oriented distributed systems application that offers career growth and independent work environment. Use your talents for good instead of getting people to click ads at CrowdStrike. Please apply here.

  • Ops Engineer - Are you passionate about scaling and automating cloud-based websites? Love Puppet and deployment scripts? Want to take advantage of both your sys-admin and DevOps skills? Join HelloSign as our second Ops Engineer and help us scale as we grow! Apply at http://www.hellosign.com/info/jobs

  • Human Translation Platform Gengo Seeks Sr. DevOps Engineer. Build an infrastructure capable of handling billions of translation jobs, worked on by tens of thousands of qualified translators. If you love playing with Amazon’s AWS, understand the challenges behind release-engineering, and get a kick out of analyzing log data for performance bottlenecks, please apply here.

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

Fun and Informative Events


  • The Biggest MongoDB Event Ever Is On. Will You Be There? Join us in New York City June 23-25 for MongoDB World! The conference lineup includes Amazon CTO Werner Vogels and Cloudera Co-Founder Mike Olson for keynote addresses.  You’ll walk away with everything you need to know to build and manage modern applications. Register before April 4 to take advantage of super early bird pricing.

  • Upcoming Webinar: Practical Guide to SQL - NoSQL Migration. Avoid common pitfalls of NoSQL deployment with the best practices in this May 8 webinar with Anton Yazovskiy of Thumbtack Technology. He will review key questions to ask before migration, and differences in data modeling and architectural approaches. Finally, he will walk you through a typical application based on RDBMS and will migrate it to NoSQL step by step. Register for the webinar.

Cool Products and Services

  • The NoSQL "Family Tree" from Cloudant explains the NoSQL product landscape using an infographic. The highlights: NoSQL arose from "Big Data" (before it was called "Big Data"); NoSQL is not "One Size Fits All"; Vendor-driven versus Community-driven NoSQL.  Create a free Cloudant account and start the NoSQL goodness

  • Finally, log management and analytics can be easy, accessible across your team, and provide deep insights into data that matters across the business - from development, to operations, to business analytics. Create your free Logentries account here.

  • CopperEgg. Simple, Affordable Cloud Monitoring. CopperEgg gives you instant visibility into all of your cloud-hosted servers and applications. Cloud monitoring has never been so easy: lightweight, elastic monitoring; root cause analysis; data visualization; smart alerts. Get Started Now.

  • PagerDuty helps operations and DevOps engineers resolve problems as quickly as possible. By aggregating errors from all your IT monitoring tools, and allowing easy on-call scheduling that ensures the right alerts reach the right people, PagerDuty increases uptime and reduces on-call burnout—so that you only wake up when you have to. Thousands of companies rely on PagerDuty, including Netflix, Etsy, Heroku, and Github.

  • Aerospike Releases Client SDK for Node.js 0.10.x. This client makes it easy to build applications in Node.js that need to store and retrieve data from a high-performance Aerospike cluster. This version exposes Key-Value Store functionality - which is the core of Aerospike's In-Memory NoSQL Database. Platforms supported: CentOS 6, RHEL 6, Debian 6, Debian 7, Mac OS X, Ubuntu 12.04. Write your first app: https://github.com/aerospike/aerospike-client-nodejs.

  • consistent: to be, or not to be. That’s the question. Is data in MongoDB consistent? It depends. It’s a trade-off between consistency and performance. However, does performance have to be sacrificed to maintain consistency? more.

  • Do Continuous MapReduce on Live Data? ScaleOut Software's hServer was built to let you hold your daily business data in-memory, update it as it changes, and concurrently run continuous MapReduce tasks on it to analyze it in real-time. We call this "stateful" analysis. To learn more check out hServer.

  • LogicMonitor is the cloud-based IT performance monitoring solution that enables companies to easily and cost-effectively monitor their entire IT infrastructure stack – storage, servers, networks, applications, virtualization, and websites – from the cloud. No firewall changes needed - start monitoring in only 15 minutes utilizing customized dashboards, trending graphs & alerting.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
May 12, 2014

4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO

This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications.

In this post I will cover architecture issues that show up while scaling and performance tuning large-scale web applications.

Let's start by defining a few terms to create a common understanding and vocabulary. Later on I will go through different issues that pop up while scaling web applications, like:

  • Architecture bottlenecks
  • Scaling Database
  • CPU Bound Application
  • IO Bound Application

Determining the optimal thread pool size of a web application will be covered in the next post.

Performance

Click to read more ...

Friday
May 9, 2014

Stuff The Internet Says On Scalability For May 9th, 2014

Hey, it's HighScalability time:


NASA captures Guatemala volcano erupting from space 
  • 40,000 exabytes: from now until 2020, the digital universe will about double every two years; $650,000: amount raised by the MaydayPAC in one week.
  • Quotable Quotes:
    • @BenedictEvans: Masayoshi Son: $20m initial investment in Alibaba, current stake worth $58bn.
    • @iamdevloper: I sneezed earlier and Siri compiled it to valid Perl.
    • @cdixon: "There is not enough competition in the last mile market to allow a true market to function" 
    • @PatrickMcFadin: Get ready for some serious server density. AMD is working on K12, brand-new x86 and ARM cores. This plus 8T SSD? 

  • With age comes changing priorities. Facebook is now 10 and has grown up. They are no longer moving fast and breaking things. They are now into the stability thing. Letting developers know they are a stable platform. The play is to get all that beautiful data from developers by being the platform for the Internet. On which an ad platform is built like a castle protecting a river valley. Interesting that Twitter said No! to becoming a platform, turning away developers. What has happened to Twitter's growth? The thought processes that lead to such different conclusions about the future would be interesting to understand.

  • Better than a Tauntaun roasted over an open light saber. An ode to 17 database in 33 minutes - RailsConf2014 by tobyhede. My favorite is "MySQL - The same as PostgreSQL but controlled by an evil overlord." 

  • Well, when you explain it that way...why GNU grep is fast: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE; GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.

  • How Gilt's Insane Traffic Spikes Pushed It Off Rails To Scala. It's unusual to have your expected traffic pattern be a 100x spike once a day for 15 minutes, but that's the life of flash sales. Gilt started as a Rails app. That didn't scale. They moved to Java, and then from Java to Scala because the Java system became too monolithic. They also bought into Akka and the whole Reactive platform idea. The architecture is in terms of hundreds of microservices. Microservices keep a wall between unrelated services, reduce complexity, and keep development friction-free.
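
On the GNU grep item above: the way a search can avoid looking at every input byte is a Boyer-Moore style skip, which is what the linked post credits. Below is a minimal sketch of the simpler Horspool variant, not GNU grep's actual implementation. The function name and the test data are made up for illustration, and the comparison counter exists only to show how few bytes get examined.

```python
def horspool_find(haystack: bytes, needle: bytes):
    """Return (index of first match or -1, number of haystack bytes compared)."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0, 0
    # For each byte in the needle (except the last), how far the window may
    # slide when that byte is the last byte of the current window.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    examined = 0
    pos = 0
    while pos + m <= n:
        # Compare the window against the needle from right to left.
        j = m - 1
        while j >= 0:
            examined += 1
            if haystack[pos + j] != needle[j]:
                break
            j -= 1
        if j < 0:
            return pos, examined
        # A byte that never occurs in the needle lets us skip a full needle length.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1, examined


haystack = b"x" * 1000 + b"needle"
index, examined = horspool_find(haystack, b"needle")
print(f"found at {index}, compared {examined} of {len(haystack)} bytes")
```

On this input the search compares only a fraction of the haystack, because most windows are rejected after a single byte. The longer the needle, the bigger the skips.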

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so keep on going)...

Click to read more ...