Stuff The Internet Says On Scalability For July 1, 2011

Submitted for your scaling pleasure:

  • Twitterers tweet 200 million tweets a day. Popular topics are eclectic, ranging from Swine Flu to Rebecca Black. Twitter has a really cool video on the global flow of tweets. Worth watching. It looks like a rainbow arcing across the northern hemisphere.
  • Amazon Cloud Now Stores 339 Billion Objects, more than doubling last year's volume.
  • Quotable quotes for independence Alex:
    • n8foo: My fav part about the new #AWS pricing announcement - 500TB is the level where they say 'contact us'.
    • RoeyYaniv: Scalability guidelines - Technology can and will fail. The business should not. 
    • stevedekorte: Are the folks advocating FP for scalability unaware of the Von Neumann bottleneck?
    • nivertech: #Hadoop is a guy with a machete in front of a jungle - it made a trail, but there are new better #BigData middleware offerings in the jungle
    • lhazlewood: I don't think I've ever had a Love/Hate relationship like I've had with NoSQL. It's all awesome. And it all sucks.
    • dmccreary: #nosql is like teenage sex. Everyone is talking about it but few have actually done it.
  • Free is good, says Amazon: No Inbound Data Transfer Fees and slightly lower outbound fees. Will Amazon make up for this with volume? Maybe S3 sales (which are not cheaper)? It's true that most sites upload little data compared to their outbound traffic, but for some use cases it's a real win: email, scraper bots, chat, video, and backup.
  • TDL4 is the new cage match Bot Net champion, and it teaches us many lessons on how to build for high availability. Maybe we've reached the same stage for computer viruses that we have for the biological variety: they will always be with us...
  • If data centers are the new computers, then why are they still using commodity computers that mimic the internet in form when they are nothing like the internet? Matt Welsh talks about why we still don't have low latency networking. And Packet Pushers talks about collapsing networking layers in the data center, using a single tier architecture with any-to-any communication.
  • DjangoCon Europe 2011 videos are now available.
  • Heroku is building a bulwark against Software Erosion. Impressively, their oldest app still works on their most recent infrastructure. They could achieve that by never changing anything, but that's not what they do; constant improvements are being made across their entire PaaS stack. They make explicit contracts that provide a strong separation between the app and platform.
  • UsenetDHT: A low-overhead design for Usenet. This paper presents the design and implementation of UsenetDHT, a Usenet system that allows a set of cooperating sites to keep a shared, distributed copy of Usenet articles. 
  • The world you see depends on how fast you sample it. Possibly entire cultures could arise and die in a bacterial second, but we would never see them. Kyle Brandt in Per Second Measurements Don’t Cut It finds something similar when looking for dropped packets on their network interfaces. A whole new world opened up under a microscope.
  • PayPal Hits 100 Million Active Users
  • Controllability of complex networks. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system’s entire dynamics.
  • Tim O'Reilly with an inspiring talk on innovation and change. Look for people who want to have fun, not make money; money makers don't initiate innovation.
  • Weaving a New ‘Net: A Mesh-Based Solution for Democratizing Networked Communications. It is essential that we develop and maintain a communications infrastructure that will enable individuals and communities (especially those in danger of political repression) to participate and contribute fully and actively to the public sphere, and to communicate confidently in private.
  • Humans are not more complicated because they have more code; they may have less code in the right places. We are the children of the lost DNA. Less can be a lot more.
  • Announcing Ozma: extending Scala with Oz concurrency. If the current language stacks bore you, here's something completely different. Ozma is an attempt at making the concurrency concepts of Oz available to a larger public. Ozma implements the full Scala specification and runs on the Mozart VM. It can therefore be seen as a new implementation of Scala. Ozma extends Scala with dataflow variables (allowing tail-recursive list functions), declarative (deterministic) concurrency, lazy declarative concurrency, and message-passing concurrency based on ports. For a great talk on Scala take a look at this Interview with Venkat Subramaniam.
  • Learn more about Spinning with Dmitry Vyukov. Not the bicycle kind, but the processor kind.
  • FlumeJava: easy, efficient data-parallel pipelines. A Java library that makes it easy to develop, test, and run efficient data-parallel pipelines. FlumeJava first optimizes the execution plan, and then executes the optimized operations on appropriate underlying primitives (e.g., MapReduces). 
  • C-MR. A continuous MapReduce framework for stream-oriented applications. C-MR supports low-latency stream processing using the familiar MapReduce interface with the addition of integrated window semantics.
  • If you dreamed of making a viable product charging next to nothing, think again: You Can’t Make Money Charging $1 Per Month.
  • Adding Scale to ASP.NET Applications in the Cloud. Nice set of strategies for scaling on Azure: Increase the number of Azure instances; Add Table or Blob Storage; Add AppFabric Caching; Use an asynchronous work process; Offload static or semi-static content to the Azure Content Delivery Network; Take advantage of Traffic Manager.
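
The sampling point in the Per Second Measurements Don't Cut It item above can be illustrated with a small sketch. Everything here is made up for illustration: the traffic numbers, the per-millisecond capacity, and the function names are assumptions, not figures from Kyle Brandt's post. The idea is only that a short burst can overflow a buffer and drop packets while the per-second average still looks comfortably low.

```python
# Hypothetical capacity: packets the interface can absorb per millisecond.
CAPACITY_PER_MS = 500


def make_traffic(seconds=5, base=10, burst_second=2, burst_len_ms=20, burst=2000):
    """Return packet counts per millisecond with one short synthetic burst."""
    counts = [base] * (seconds * 1000)
    start = burst_second * 1000
    for i in range(start, start + burst_len_ms):
        counts[i] = burst
    return counts


def per_second_average(counts):
    """Average packets per millisecond, computed over each one-second bucket."""
    return [sum(counts[i:i + 1000]) / 1000 for i in range(0, len(counts), 1000)]


def dropped(counts, capacity=CAPACITY_PER_MS):
    """Packets exceeding the per-millisecond capacity (assumed to be dropped)."""
    return sum(max(0, c - capacity) for c in counts)


traffic = make_traffic()
averages = per_second_average(traffic)

print("per-second averages (packets/ms):", averages)
print("any second above capacity?", max(averages) > CAPACITY_PER_MS)
print("peak packets in one ms:", max(traffic))
print("packets dropped:", dropped(traffic))
```

With these invented numbers, no one-second bucket comes close to the capacity, yet the 20 ms burst still overflows it. Only the millisecond-level view shows where the drops come from.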