advertise
Wednesday
Oct222014

Paper: Actor Model of Computation: Scalable Robust Information Systems

With Reactive Systems becoming the new old hotness, it will help to have a thorough grounding in the Actor Model. Here's a good start. Carl Hewitt in Actor Model of Computation: Scalable Robust Information Systems gives a very thorough and relatively concise explanation of the Actor model.

Here's the abstract.

The Actor model is a mathematical theory that treats "Actors" as the universal primitives of concurrent digital computation. The model has been used both as a framework for a theoretical understanding of concurrency, and as the theoretical basis for several practical implementations of concurrent systems. Unlike previous models of computation, the Actor model was inspired by physical laws. It was also influenced by the programming languages Lisp, Simula 67 and Smalltalk-72, as well as ideas for Petri Nets, capability-based systems and packet switching. The advent of massive concurrency through client-cloud computing and many-core computer architectures has galvanized interest in the Actor model.


Actor technology will see significant application for integrating all kinds of digital information for individuals, groups, and organizations so their information usefully links together. Information integration needs to make use of the following information system principles:
    * Persistence. Information is collected and indexed.
    * Concurrency: Work proceeds interactively and concurrently, overlapping in time.
    * Quasi-commutativity: Information can be used regardless of whether it initiates new work or become relevant to ongoing work.
    * Sponsorship: Sponsors provide resources for computation, i.e., processing, storage, and communications.
    * Pluralism: Information is heterogeneous, overlapping and often inconsistent.
    * Provenance: The provenance of information is carefully tracked and recorded

The Actor Model is intended to provide a foundation for inconsistency robust information integration.

Related Articles

Monday
Oct202014

Facebook Mobile Drops Pull For Push-based Snapshot + Delta Model

We've learned mobile is different. In If You're Programming A Cell Phone Like A Server You're Doing It Wrong we learned programming for a mobile platform is its own specialty. In How Facebook Makes Mobile Work At Scale For All Phones, On All Screens, On All Networks we learned bandwidth on mobile networks is a precious resource. 

Given all that, how do you design a protocol to sync state (think messages, comments, etc.) between mobile nodes and the global state holding servers located in a datacenter?

Facebook recently wrote about their new solution to this problem in Building Mobile-First Infrastructure for Messenger. They were able to reduce bandwidth usage by 40% and reduced by 20% the terror of hitting send on a phone.

That's a big win...that came from a protocol change.

Facebook Messanger went from a traditional notification triggered full state pull:

Click to read more ...

Friday
Oct172014

Stuff The Internet Says On Scalability For October 17th, 2014

Hey, it's HighScalability time:


What could this be? Swarms of drones painting 3D light sculptures against the night sky!
  • Quotable Quotes:
    • Visnja Zeljeznjak: Steve Jobs' product pricing formula: cost of materials x 3 + 33%
    • Benedict Evans: We now have over 2bn iOS and Android devices on earth, and this will grow in the next few years to well over 3bn.
    • @ClearStoryData: It's true! Avg beer drinker attracts 4.4% more Mosquitos than water drinker #Strataconf
    • Leslie Lamport: The core idea of the problem of that notion of causality came about because of my familiarity with special relativity...where one event could causally effect another depended on weather or not information from one could physically reach the other.
    • @laurelatoreilly: Fascinating session about cargo ships going dark to shift market prices #IoT #strataconf "your decisions are only as good as your data"
    • @muratdemirbas: Distributed/decentralized coordination is expensive & hard to scale. Centralized coordination is cheap & scales easily using hierarchies.
    • @froidianslip: ”Kafka is awesome. We heard it cures cancer." -- @gwenshap #Strataconf
    • @timoreilly: RT @grapealope: The self-driving car has 6000 sensors, and takes readings at 4Hz. That's a lot of data. @MCSrivas #strataconf #MapR
    • @froidianslip: Love the paraphrase borrowed from Ray Bradbury, "Any sufficiently complex configuration is indistinguishable from code." #Strataconf
    • @matei_zaharia: Spark shatters MapReduce's 100 TB and 1 PB sort records... with 10x fewer nodes
    • @msallstr: “Synchronous calls in this environment are the crystal meth of programming”  @mjpt777 on the new   reactive manifesto 
    • @postwait: “If you put them under enough stress, perfectly rational people will panic and start believing in science” #priceless
    • Ilya Grigorik: It's great to see access from mobile is around 30% faster compared to last year.
    • @ryandotsmith: Recently migrated an async system to SQS. Much simple. Tiny latency. Here is the code (maybe a gem?)

  • People just don't appreciate the power of messy. The problematic culture of "Worse is Better". There's an implied notion here that people can't recognize better when they see it. Better is not a platonic ideal. It can't be proved by argument. Better, like evolution, is something that works itself out in practice. Like evolution, Worse is Better is an algorithm for stepping through a possibility space by jumping from one working phenotype to the next more adapted working phenotype. And for many, that's better. Not Ideal, but Better.

  • The Times They Are a-Changin'. Docker and Microsoft partner to drive adoption of distributed applications. What's the goal? nickstinemates: Package your Windows app in a docker container, use same tooling you would otherwise use to deploy to a docker engine running on a Windows host. Package your Linux app in a docker container, use same tooling you would otherwise use to deploy to a docker engine running on a Linux host.

  • Leandro Pereira writes a fine autobiography in Life of a HTTP request, as seen by my toy web server. All the stages of life are there. Socket creation. Acceptance. Scheduling. Coroutines. Reading requests. Parsing requests. All the way to the reply and the death of the connection. A lot to learn if you want to look at the simplified internals of a service.

  • Wonderful talk: Call Me Maybe: Carly Rae Jepsen and the Perils of Network Partitions. Kyle Kingsbury takes a detailed look at different partition problems in different databases. There are split brains. Masters dying. Lost data. General network mayhem. It's great. The lesson: what's written down in the marketing documentation is not always what you get. Test your application and see what really happens. The world is not simple. A dumb solution where you understand the failure modes can be a good choice.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Oct152014

Using a SSD Cache in Front of EBS Boosted Throughput by 50%, for Free

Using EBS has lots of advantages--reliability, snapshotting, resizing--but overcoming the performance problems by using Provisioned IOPS is expensive. 

Swrve, an integrated marketing and A/B testing and optimization platform for mobile apps, did something clever. They are using the c3.xlarge EC2 instances, that have two 40GB SSD devices per instance, as a cache.

They found through testing RAID-0 striping using a 4-way stripe along with enhanceio, effectively increased throughput by over 50%, for free. With no filesystem corruption problems.

How is it free? "We were planning on upgrading to the C3 class of instance anyway, and sticking with EBS as the backing store. Once you’re using an instance which has SSD ephemeral storage, there are no additional fees to use that hardware."

For great analysis, lots of juicy details, graphs, and configuration commands, please take a look at How we increased our EC2 event throughput by 50%, for free

Tuesday
Oct142014

Sponsored Post: Apple, Hypertable, VSCO, Gannett, Sprout Social, Scalyr, FoundationDB, AiScaler, Aerospike, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here. 
    • Senior Engineer: Mobile Services. The Emerging Technologies/Mobile Services team is looking for a proactive and hardworking software engineer to join our team. The team is responsible for a variety of high quality and high performing mobile services and applications for internal use. We seek an accomplished server-side engineer capable of delivering an extraordinary portfolio of features and services based on emerging technologies to our internal customers. Please apply here.
    • Apple Pay Automation Engineer. The Apple Pay group within iOS Systems is looking for a outstanding automation engineer with strong experience in building client and server test automation. We work in an agile software development environment and are building infrastructure to move towards continuous delivery where every code change is thoroughly tested by push of a button and is considered ready to be deployed if we choose so. Please apply here
    • Site Reliability Engineer. As a member of the Apple Pay SRE team, you’re expected to not just find the issues, but to write code and fix them. You’ll be involved in all phases and layers of the application, and you’ll have a direct impact on the experience of millions of customers. Please apply here.
    • Software Engineering Manager. In this role, you will be communicating extensively with business teams across different organizations, development teams, support teams, infrastructure teams and management. You will also be responsible for working with cross-functional teams to delivery large initiatives. Please apply here

  • VSCO. Do you want to: ship the best digital tools and services for modern creatives at VSCO? Build next-generation operations with Ansible, Consul, Docker, and Vagrant? Autoscale AWS infrastructure to multiple Regions? Unify metrics, monitoring, and scaling? Build self-service tools for engineering teams? Contact me (Zo, zo@vs.co) and let’s talk about working together. vs.co/careers.

  • Gannett Digital is looking for talented Front-end developers with strong Python/Django experience to join their Development & Integrations team. The team focuses on video, user generated content, API integrations and cross-site features for Gannett Digital’s platform that powers sites such as http://www.usatoday.com, http://www.wbir.com or http://www.democratandchronicle.com. Please apply here.

  • Platform Software Engineer, Sprout Social, builds world-class social media management software designed and built for performance, scale, reliability and product agility. We pick the right tool for the job while being pragmatic and scrappy. Services are built in Python and Java using technologies like Cassandra and Hadoop, HBase and Redis, Storm and Finagle. At the moment we’re staring down a rapidly growing 20TB Hadoop cluster and about the same amount stored in MySQL and Cassandra. We have a lot of data and we want people hungry to work at scale. Apply here.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for a Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for backend component of software that manages application architectures. Apply here.

Fun and Informative Events

  • Sign Up for New Aerospike Training Courses.  Aerospike now offers two certified training courses; Aerospike for Developers and Aerospike for Administrators & Operators, to help you get the most out of your deployment.  Find a training course near you. http://www.aerospike.com/aerospike-training/

  • November TokuMX Meetups for Those Interested in MongoDB. Join us in one of the following cities in November to learn more about TokuMX and hear TokuMX use cases. 11/5 - London;11/11 - San Jose; 11/12 - San Francisco. Not able to get to these cities? Check out our website for other upcoming Tokutek events in your area - www.tokutek.com/events.

Cool Products and Services

  • Hypertable Inc. Announces New UpTime Support Subscription Packages. The developer of Hypertable, an open-source, high-performance, massively scalable database, announces three new UpTime support subscription packages – Premium 24/7, Enterprise 24/7 and Basic. 24/7/365 support packages start at just $1995 per month for a ten node cluster -- $49.95 per machine, per month thereafter. For more information visit us on the Web at http://www.hypertable.com/. Connect with Hypertable: @hypertable--Blog.

  • FoundationDB launches SQL Layer. SQL Layer is an ANSI SQL engine that stores its data in the FoundationDB Key-Value Store, inheriting its exceptional properties like automatic fault tolerance and scalability. It is best suited for operational (OLTP) applications with high concurrency. Users of the Key Value store will have free access to SQL Layer. SQL Layer is also open source, you can get started with it on GitHub as well.

  • Diagnose server issues from a single tab. Scalyr replaces all your monitoring and log management services with one, so you can pinpoint and resolve issues without juggling multiple tools and tabs. Engineers say it's powerful and easy to use. Customer support teams use it to troubleshoot user issues. CTO's consider it a smart alternative to Splunk, with enterprise-grade functionality, sane pricing, and human support. Trusted by in-the-know companies like Codecademy – learn more!

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Cloud deployable. Free instant trial, no sign-up required.  http://aiscaler.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
Oct132014

How League of Legends Scaled Chat to 70 million Players - It takes Lots of minions.

How would you build a chat service that needed to handle 7.5 million concurrent players, 27 million daily players, 11K messages per second, and 1 billion events per server, per day?

What could generate so much traffic? A game of course. League of Legends. League of Legends is a team based game, a multiplayer online battle arena (MOBA), where two teams of five battle against each other to control a map and achieve objectives.

For teams to succeed communication is crucial. I learned that from Michal Ptaszek, in an interesting talk on Scaling League of Legends Chat to 70 million Players (slides) at the Strange Loop 2014 conference. Michal gave a good example of why multiplayer team games require good communication between players. Imagine a basketball game without the ability to call plays. It wouldn’t work. So that means chat is crucial. Chat is not a Wouldn’t It Be Nice feature.

Michal structures the talk in an interesting way, using as a template the expression: Make it work. Make it right. Make it fast.

Making it work meant starting with XMPP as a base for chat. WhatsApp followed the same strategy. Out of the box you get something that works and scales well...until the user count really jumps. To make it right and fast, like WhatsApp, League of Legends found themselves customizing the Erlang VM. Adding lots of monitoring capabilities and performance optimizations to remove the bottlenecks that kill performance at scale.

Perhaps the most interesting part of their chat architecture is the use of Riak’s CRDTs (commutative replicated data types) to achieve their goal of a shared nothing fueled massively linear horizontal scalability. CRDTs are still esoteric, so you may not have heard of them yet, but they are the next cool thing if you can make them work for you. It’s a different way of thinking about handling writes.

Let’s learn how League of Legends built their chat system to handle 70 millions players...

Stats

Click to read more ...

Friday
Oct102014

Stuff The Internet Says On Scalability For October 10th, 2014

Hey, it's HighScalability time:


Social climber: Instagram explorer scales to new heights in New York.

 

  • 11 billion: world population in 2100; 10 petabytes: Size of Netflix data warehouse on S3; $600 Billion: the loss when a trader can't type; 3.2: 0-60 mph time of probably not my next car.
  • Quotable Quotes:
    • @kahrens_atl: Last week #NewRelic Insights wrote 618 billion events and ran 237 trillion queries with 9 millisecond response time #FS14
    • @sustrik: Imagine debugging on a quantum computer: Looking at the value of a variable changes its value. I hope I'll be out of business by then.
    • Arrival of the Fittest: Solving Evolution's Greatest Puzzle: Every cell contains thousands of such nanomachines, each of them dedicated to a different chemical reaction. And all their complex activities take place in a tiny space where the molecular building blocks of life are packed more tightly than a Tokyo subway at rush hour. Amazing.
    • Eric Schmidt: The simplest outcome is we're going to end up breaking the Internet," said Google's Schmidt. Foreign governments, he said, are "eventually going to say, we want our own Internet in our country because we want it to work our way, and we don't want the NSA and these other people in it.
    • Antirez: Basically it is neither a CP nor an AP system. In other words, Redis Cluster does not achieve the theoretical limits of what is possible with distributed systems, in order to gain certain real world properties.
    • @aliimam: Just so we can fathom the scale of 1B vs 1M: 1,000,000 seconds is 11.5 days. 1,000,000,000 seconds is 31.6 YEARS
    • @kayousterhout: 92% of catastrophic failures in distributed data-intensive systems caused by incorrect error handling https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf … #osdi14
    • @DrQz: 'The purpose of computing is insight, not numbers.' (Hamming) Sometimes numbers ARE the insight so, make them accesible too. (Me)

  • Robert Scoble on the Gillmor Gang said that because of the crush of signups, ello had to throttle invites. Their single PostgreSQL server couldn't handle it captain.

  • Containers are getting much larger with new composite materials. Not that kind of container. Shipping containers. High oil costs have driven ships carrying 5000 containers to evolve. Now they can carry 18,000 and soon 19,000 containers!

  • If you've wanted to make a network game then this is a great start. Making Fast-Paced Multiplayer Networked Games is Hard: Fast-paced multiplayer games over the Internet are hard, but possible. First understanding your constraints then building within them is essential. I hope I have shed some light on what those constraints are and some of the techniques you can use to build within them. No doubt there are other ways out there and ways yet to be used. Each game is different and has its own set of priorities. Learning from what has been done before could help a great deal.

  • Arrival of the Fittest: Solving Evolution's Greatest Puzzle: Environmental change requires complexity, which begets robustness, which begets genotype networks, which enable innovations, the very kind that allow life to cope with environmental change, increase its complexity, and so on, in an ascending spiral of ever-increasing innovability...is the hidden architecture of life.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Oct082014

That's Not My Problem - I'm Renting Them

Scott Hanselman gives a hilarious and insightful talk in Virtual Machines, JavaScript and Assembler, a keynote at Velocity Santa Clara 2014. The topic of his talk is an intuitive understanding of the cloud and why it's the best thing ever. 

At about 6:30 into the video Scott is at his standup comic best when he recounts a story of a talk Adrian Cockroft gave on Netflix’s move to SSDs. An audience member energetically questioned the move to SSDs saying they had high failure rates and how moving to SSDs was a stupid idea.

To which Mr. Cockroft replies:

That's not my problem, I'm renting them.

Scott selected the ideal illustration of the high level of abstraction the cloud provides. If you are new to the cloud that's a very hard idea to grasp. "That's not my problem, I'm renting them" is the perfect mantra when you find yourself worried about things you don't need to be worried about anymore.

Monday
Oct062014

How Clay.io Built their 10x Architecture Using AWS, Docker, HAProxy, and Lots More

This is a guest repost by Zoli Kahan from Clay.io. 

This is the first post in my new series 10x, where I share my experiences and how we do things at Clay.io to develop at scale with a small team. If you find these things interesting, we're hiring - zoli@clay.io.

The Cloud

CloudFlare

CloudFlare

CloudFlare handles all of our DNS, and acts as a distributed caching proxy with some additional DDOS protection features. It also handles SSL.

Amazon EC2 + VPC + NAT server

Click to read more ...

Friday
Oct032014

Stuff The Internet Says On Scalability For October 3rd, 2014

Hey, it's HighScalability time:


Is the database landscape evolving or devolving?

 

  • 76 million: once more through the data breach; 2016: when a Zettabyte is transfered over the Internet in one year
  • Quotable Quotes:
    • @wattersjames: Words missing from the Oracle PaaS keynote: agile, continuous delivery, microservices, scalability, polyglot, open source, community #oow14
    • @samcharrington: At last count, there were over 1,000,000 million containers running in the wild. http://stats.openvz.org  @jejb_ #ccevent
    • @mappingbabel: Oracle's cloud has 30,000 computers. Google has about two million computers. Amazon over a million. Rackspace over 100,000.
    • Andrew Auernheimer: The world should have given the GNU project some money to hire developers and security auditors. Hell, it should have given Stallman a place to sleep that isn't a couch at a university. There is no f*cking justice in this world.
    • John Nagle: The right answer is to track wins and losses on delayed and non-delayed ACKs. Don't turn on ACK delay unless you're sending a lot of non-delayed ACKs closely followed by packets on which the ACK could have been piggybacked. Turn it off when a delayed ACK has to be sent. I should have pushed for this in the 1980s.
    • @neil_conway: The number of < 15 node Hadoop clusters is >> the number of > 15 node Hadoop clusters. Unfortunately not reflected in SW architecture.

  • In the meat world Google wants devices to talk to you. The Physical Web. This will be better than Apple's beacons because Apple is severely limiting the functionality of beacons by requiring IDs be baked into applications. It's a very static and controlled world. In other words, it's very Apple. By using URLs Google is supporting both the web and apps; and adding flexibility because a single app can dynamically and generically handle the interaction from any kind of device. In other words, it's very Google. Apple has the numbers though, with hundreds of millions of beacon enabled phones in customer hands. Since it's just another protocol over BLE it should work on Apple devices as well.

  • Did Netflix survive the great AWS rebootathon? The Chaos Monkey says yes, yes they did: Out of our 2700+ production Cassandra nodes, 218 were rebooted. 22 Cassandra nodes were on hardware that did not reboot successfully. This led to those Cassandra nodes not coming back online. Our automation detected the failed nodes and replaced them all, with minimal human intervention. Netflix experienced 0 downtime that weekend. 

  • Google Compute Engine is following Moore's Law by announcing a 10% discount. Bandwidth is still expensive because networks don't care about silly laws. And margin has to come from somewhere.

  • Software is eating...well you've heard it before. Mesosphere cofounder envisions future data center as ‘one big computer’: The data center of the future will be fully virtualized, with everything from power supplies to storage devices consolidated into a single pool and managed by software, according to an executive whose company intends to lead the way.

  • Companies, startups, hacker spaces, teams are all intentional communities. People choose to work together towards some end. A consistent group killer is that people can be really sh*tty to each other. There's a lot of work that has been done around how to make intentional communities work. Holacracy is just one option. Here's a really interesting interview with Diana Leafe Christian on what makes communities work. It requires creating Community Glue, Good Process and Communication Skill, Effective Project Management, and good Governance and Decision making. Which is why most communities fail. Did you know there's even something called Non-Defensive Communication? If followed the Internet would collapse.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...