Monday
Nov 11, 2013

Ask HS: What is a good OLAP database choice with node.js?

This question was asked over email and I thought a larger audience might want to take a whack at it.

With a business associate, I am trying to develop financial software that handles the financial reports of listed companies. We managed to create a database with all the data necessary to do financial analysis. My associate is a Business Intelligence specialist, so he is keen to use OLAP databases like Microsoft Analysis Services or Jedox Palo, which enable in-memory calculations and very fast aggregation, slicing and dicing of data, and write-backs.

At the same time I took Startup Engineering, an online course (MOOC) from Stanford (CS184), which talked a lot about JavaScript and especially node.js as the language of the future for servers.

As I am keen to use open-source technologies (and would like to avoid MS SSAS) for the development of a website to access this financial data, and there are so many choices for databases out there (PostgreSQL, MongoDB, MySQL, etc., though I don't think they have OLAP features), do you know of resources, blogs, or people knowledgeable on the matter that discuss combining node.js with OLAP databases? What is the best use of a particular system with node.js?

Thanks for your input.

Friday
Nov 8, 2013

Stuff The Internet Says On Scalability For November 8th, 2013

Hey, it's HighScalability time:


Robot elephant from 1950, which consisted of 9000 parts and could walk 27 mph
  • Quotable Quotes:
    • Brandon Downey: F*ck these guys. 
    • @IEEEorg: Every second 21.6 people get their first mobile device. Mobile is growing 5 times faster than the human population. 
    • @littleidea: a distributed system to deploy a distributed systems to deploy a distributed system, bring your own turtles
    • @BenedictEvans: Photos shared/day: Facebook - 350m Snapchat - 350m Whatsapp - 400m Instagram: 55m.
    • @SciencePorn: Price of 1gb of storage over time: 1981 $300000, 1987 $50000, 1990 $10000, 1994 $1000, 1997 $100, 2000 $10, 2004 $1, 2012 $0.10
    • @kellabyte: If I hear “network partitions are rare on even hundred(s) of node clusters” again I’m going to lose my shit. This fallacy needs to die.
    • @danielbilling: PT had some great lines. "Logic merely enables one to be wrong with authority" is a particular favourite.
    • @aphyr: "Making things implicit in distributed systems is a good way to f*ck yourself"
    • @solarce: "Backpressure should be required" #riconwest
    • @mrb_bk: "Unbounded queues are AWFUL! They will F*CK YOU UP!!" - @jmhodges

  • Growth Hacking sounds a bit like cancer, but it's subtly different. In this case it helps explain the bewildering idea that Snapchat has a valuation of $3.5 Billion. The insight behind an ephemeral service is cool. When Facebook is building entire dead cities for pictures nobody will ever see again, the pure honesty of saying those pictures aren't really worth saving is refreshing. It took root in high schools, grew by word of mouth, allows free expression and creativity, and is easy to use, mobile first, and group oriented, with the thrill of random rewards, etc. But I think it goes deeper. Kids are by their nature ahistoric and Snapchat is a perfect match for that nature. Adults are all about memory and Facebook perfectly reflects that ethos.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Thursday
Nov 7, 2013

Paper: Tempest: Scalable Time-Critical Web Services Platform

An interesting and different implementation approach: Tempest: Scalable Time-Critical Web Services Platform

Tempest is a new framework for developing time-critical web services. Tempest enables developers to build scalable, fault-tolerant services that can then be automatically replicated and deployed across clusters of computing nodes. The platform automatically adapts to load fluctuations, reacts when components fail, and ensures consistency between replicas by repairing when inconsistencies do occur. Tempest relies on a family of epidemic protocols and on Ricochet, a reliable time critical multicast protocol with probabilistic guarantees.

Tempest is built around a novel storage abstraction called the TempestCollection in which application developers store the state of a service. Our platform handles the replication of this state across clones of the service, persistence, and failure handling. To minimize the need for specialized knowledge on the part of the application developer, the TempestCollection employs interfaces almost identical to those used by the Java Collections standard. Elements can be accessed on an individual basis, but it is also possible to access the full set by iterating over it, just as in a standard Collection. The hope is that we can free developers from the complexities of scalability and fault-tolerance, leaving them to focus on application functionality.

Traditionally, services relying on a transactional database backend offer a strong data consistency model in which every read operation returns the result of the latest update that occurred on a data item. With Tempest we take a different approach by relaxing the model such that services offer sequential consistency [10]: Every replica of the service sees the operations on the same data item in the same order, but the order may be different from the order in which the operations were issued. Later, we will see that this is a non-trivial design decision; Tempest services can sometimes return results that would be erroneous were we using a more standard transactional execution model. For applications where these semantics are adequate, sequential consistency buys us scheduling flexibility that enables much better real-time responsiveness.
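The relaxed model in the excerpt above can be made concrete with a toy sketch. This is not Tempest's implementation (the paper relies on epidemic protocols and Ricochet); `Replica` and `agree_on_order` are hypothetical names, and the shuffle is only a stand-in for whatever ordering the platform settles on. It shows the property as the excerpt states it: every replica applies the same order, and that order need not be the order of issue.

```python
import random


class Replica:
    """Toy replica holding one value per key (hypothetical, not Tempest's API)."""

    def __init__(self):
        self.state = {}
        self.log = []

    def apply(self, op):
        key, value = op
        self.state[key] = value
        self.log.append(op)


def agree_on_order(issued_ops, seed):
    """Stand-in for an ordering protocol: returns one agreed permutation.

    The agreed order may differ from the order in which ops were issued.
    """
    ordered = list(issued_ops)
    random.Random(seed).shuffle(ordered)
    return ordered


# Three updates to the same data item, listed in the order they were issued.
issued = [("price", 10), ("price", 11), ("price", 12)]
agreed = agree_on_order(issued, seed=7)

# Every replica applies the single agreed order.
replicas = [Replica() for _ in range(3)]
for replica in replicas:
    for op in agreed:
        replica.apply(op)

all_same_order = all(r.log == agreed for r in replicas)
all_same_state = all(r.state == replicas[0].state for r in replicas)
print(all_same_order, all_same_state)  # True True
```

A strongly consistent read would return 12, the latest issued update. In this sketch the replicas agree with each other but end on whichever update came last in the agreed order, which is the kind of result the excerpt warns can look erroneous under a transactional model.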

Tuesday
Nov 5, 2013

10 Things You Should Know About AWS

Authored by Chris Fregly:  Former Netflix Streaming Platform Engineer, AWS Certified Solution Architect and Purveyor of fluxcapacitor.com.

Ahead of the upcoming 2nd annual re:Invent conference, inspired by Simone Brunozzi’s recent presentation at an AWS Meetup in San Francisco, and collected from a few of my recent Fluxcapacitor.com consulting engagements, I’ve compiled a list of 10 useful time and clock-tick saving tips about AWS.

1) Query AWS resource metadata

 

Can’t remember the EBS-Optimized IO throughput of your c1.xlarge cluster?  How about the size limit of an S3 object on a single PUT?  awsnow.info is the answer to all of your AWS-resource metadata questions.  Interested in integrating awsnow.info with your application?  You’re in luck.  There’s now a REST API, as well!

Note:  These are default soft limits and will vary by account.

2) Tame your S3 buckets

 

Delete an entire S3 bucket with a single CLI command:  

aws s3 rb s3://<bucket-name> --force

Recursively copy a local directory to S3:

aws s3 cp <local-dir-name> s3://<bucket-name> --region <region-name> --recursive

3) Understand AWS cross-region dependencies

Click to read more ...

Monday
Nov 4, 2013

ESPN's Architecture at Scale - Operating at 100,000 Duh Nuh Nuhs Per Second

ESPN went on air in 1979. In those 30+ years think of the wonders we’ve seen! When I think of ESPN I think of a worldwide brand that is the very definition of prime time. And it shows in their stats. ESPN.com peaks at 100,000 requests per second. Their peak event is, not surprisingly, the World Cup. But would you be surprised to learn ESPN is powered by only a few hundred servers and a couple of dozen engineers? I was.

And would you be surprised to learn ESPN is undergoing a fundamental transition from an Enterprise architecture to one capable of handling web scale loads driven by increasing mobile usage, personalization, and a service orientation? Again, thinking ESPN was just about watching sports on TV, I was surprised. ESPN is becoming much more than that. ESPN is becoming a sports platform. 

How does ESPN handle all of this complexity, responsibility, change, and load? The fascinating story of ESPN’s architecture is told by Manny Pelarinos, Senior Director, Engineering at ESPN in the InfoQ presentation Architecture at Scale at ESPN. Information from Max Protect: Scalability and Caching at ESPN.com has also been folded in. 

Starting in a pre-personal computer era, ESPN developed an innovative cable and satellite TV sports empire. From an initial 30-minute program reviewing the day’s sports, they went on to make deals with the NBA, USFL, NHL, and what would become the big fish of all sports in the US, the National Football League.

Sport by sport deals were made to bring sports data in from all possible sources so ESPN could report scores, play film clips, and generally become one stop shopping for all things sports on TV and later the web.

It’s a complex system to understand. They have a lot going on with Television & Broadcasting, live scoring, editing and publishing, Digital Media, giving sports scores, web and mobile, personalization, fantasy games, and they also want to expand API access to 3rd party developers. Unlike most every profile on HighScalability ESPN has an enterprise heritage. It’s a Java Enterprise stack, so you’ll see Oracle databases, JMS brokers, Java Beans, and Hibernate.

Some of the most important lessons we’ll learn about: 

  • Platform changes everything. ESPN sees themselves as a content provider. These days content is accessed through multiple paths. It can be on TV, or on ESPN.com, or on mobile, but content is also being consumed by more and more internal applications, like Fantasy Games. And they also want to provide an external API so developers can build on ESPN resources. ESPN wants to become a walled garden built on a sports content platform that centralizes access to their prime advantage over everyone else, which is unprecedented access to sports related content and data. The walled garden approach that Facebook has made work for social, Apple has made work for apps, and Google has made work for AI, is what ESPN wants to do for sports. The problem is transitioning from an enterprise architecture to a platform based on APIs and services is a tough change to make. They can do it. They are doing it. But it will be hard.

  • Web scale changes everything. Many web properties today use Java as their standard backend development environment, but ESPN.com, which grew up in the Java Enterprise era, went all in for the canonical Enterprise architecture. And it has worked quite well. Until there was a sort of phase transition from enterprise class loads experienced by a relatively predictable ESPN.com to a world dominated by high mobile traffic, mass customization, and platform concerns. Many of the architecture choices we see in native web properties must now be used by ESPN.com.

  • Personalization changes everything. The cache that once saved your database is now much less useful when all content becomes dynamically constructed for each user and must follow you on every mode of access (.com, mobile, TV). 

  • Mobile changes everything. It puts pressure everywhere on your architecture. When there was just the web, architecture didn’t matter as much because there were fewer users and fewer servers. In the mobile age, with so many more users and servers, these kinds of architecture decisions make a huge difference. 

  • Partnerships are power. ESPN can create a walled garden because over the years they have developed partnerships that give them special access to data that nobody else has. It’s good to be firstest with the mostest. Individual sports like the NFL and MLB are seeking to capture this value with their own networks, which lessens this advantage somewhat, but the forces are such that everyone needs to get along, which puts ESPN in the middle of a powerful platform play, if they can execute.
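The personalization point in the list above can be illustrated with a rough sketch. This is not ESPN's caching layer: the `hit_ratio` helper, the user names, the page paths, and the request mix are all made up for illustration. It assumes an unbounded cache and compares keying pages by URL alone with keying them by (user, URL), as fully personalized content would require.

```python
from itertools import product


def hit_ratio(requests, key_fn):
    """Fraction of requests served from an unbounded cache keyed by key_fn."""
    cache = set()
    hits = 0
    for request in requests:
        key = key_fn(request)
        if key in cache:
            hits += 1
        else:
            cache.add(key)  # first request for this key is a miss
    return hits / len(requests)


# Hypothetical traffic: 100 users each view the same two pages once.
users = [f"user{i}" for i in range(100)]
pages = ["/nfl/scores", "/nba/scores"]
requests = list(product(users, pages))

# Shared content: one cache entry per page serves every user.
shared = hit_ratio(requests, key_fn=lambda r: r[1])

# Personalized content: the cache key must include the user.
personalized = hit_ratio(requests, key_fn=lambda r: r)

print(shared, personalized)  # 0.99 0.0
```

With shared pages only the first request for each URL misses. Once every response is built per user, no two requests in this toy workload share a key, so the cache never gets a hit and the backend sees the full load.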

Lights. Camera. Action. Let’s learn how ESPN scales...

Click to read more ...

Thursday
Oct 31, 2013

Paper: Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask

Awesome paper on how particular synchronization mechanisms scale on multi-core architectures: Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask.

The goal is to pick a locking approach that doesn't degrade as the number of cores increases. Like everything else in life, that doesn't appear to be generically possible:

None of the nine locking schemes we consider consistently outperforms any other one, on all target architectures or workloads. Strictly speaking, to seek optimality, a lock algorithm should thus be selected based on the hardware platform and the expected workload

Abstract:

Click to read more ...

Wednesday
Oct 30, 2013

Strategy: Use Your Quantum Computer Lab to Tell Intentional Blinks from Involuntary Blinks

Oh, you don't have a Quantum Computer Lab staffed with researchers? Well, Google does. Here they are on G+. To learn what they are up to, the Verge has A first look inside Google's futuristic quantum lab. The lab is a partnership between NASA and Google, built around a 512-qubit D-Wave Two quantum computer.

One result from the lab is:

The first practical application has been on Google Glass, as engineers put the quantum chips to work on Glass's blink detector, helping it to better distinguish between intentional winks and involuntary blinks. For engineering reasons, the quantum processor can never be installed in Glass, but together with Google's conventional server centers, it can point the way to a better blink-detecting algorithm. That would allow the Glass processor to detect blinks with better accuracy and using significantly less power. If successful, it could be an important breakthrough for wink-triggered apps, which have struggled with the task so far.

Google thinks quantum computing has a major role in machine learning:

Click to read more ...

Tuesday
Oct 29, 2013

Sponsored Post: Apple, NuoDB, ScaleOut, FreeAgent, CloudStats.me, Intechnica, MongoDB, Stackdriver, BlueStripe, Booking, Rackspace, AiCache, Aerospike, New Relic, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Apple is hiring for multiple positions. Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly.
    • Sr. Software Engineer. You will primarily work with the domain team including project managers and engineers, as well as a large team of consultants in California and India. You will also work with many cross-functional and infrastructural teams during software delivery life cycle. Assignments can include design, delivery and oversight of incremental functionality, as well as a multi-year re-architecture of a complex in-flight application. Please apply here.
    • Enterprise Software Engineer.  Ability to develop detailed design and deliver a scalable implementation. Hands-on development of new code and/or managing existing code as part of a group and/or alone. Evaluating products including open-source modules and if need be incorporating them into projects. Should be able to lead a small group of developers to develop and maintain systems. Please apply here
    • Senior Cocoa Engineer. Apple is seeking a senior Cocoa engineer to join the IS&T Client Frameworks team. Are you someone looking to solve technically challenging problems involving a wide range of technologies (client, server, web)? Please apply here.
    • Web Application Engineer. We are looking for a team player with focus on designing and developing WWDR’s web-based applications. The successful candidate must have the ability to take minimal business requirements and work pro-actively with cross functional teams to obtain clear objectives. Please apply here.
    • Senior Web Developer: Worldwide Developer Relations. Responsible for the architecture, design and development of the user interface of WWDR’s web applications. Work with UI designer and marketing to analyze business requirements and contribute to functional requirements. Collaborate with server-side software engineers on design, document technical specifications, and implement proper solutions. Please apply here
    • Sr Software Engineer. The iOS Systems Team is looking for a Software Engineer to work on operations, tools development and support of worldwide iOS Device sales and activations. Please apply here
    • Sr. Software Engineer. The Identity Management Services team at Apple is in search of a motivated Senior Software Engineer who is self-driven and has a proven track record in design and development of complex, highly available and scalable systems. Please apply here
    • SQE and Operations Manager, iOS Systems. The iOS Systems team is looking for an experienced hands-on manager to lead the Quality Engineering, Build and Release Engineering team. Please apply here
    • Senior Engineer: Emerging Technology. Apple’s Emerging Technology group is looking for a senior engineer passionate about exploring emerging technologies to create paradigm shifting cloud based solutions. Please apply here. 
    • Senior Storage Engineer. Software Engineering Operations (SEO) is seeking an experienced storage engineer to join our team. This role will focus on designing, deploying and maintaining critical SAN and NAS storage solutions. Please apply here

  • UI Engineer. AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data. AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (all levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

  • FreeAgent are looking for a talented Operations Engineer to come and work on the FreeAgent app, internal services and supporting infrastructure. You'll be working alongside our Ops team squashing single points of failure, fixing bottlenecks, profiling load and solving interesting scaling and automation problems. Please apply here

  • Intechnica is looking for Performance Architects, Performance Engineers, a Lead Automation Engineer, and a Solution Assurance Analyst. If making super-fast systems is your forte, send your CV with covering letter to careers@intechnica.co.uk.

  • Stackdriver is looking for systems + cloud + dev + ops guru to serve as our liaison within the DevOps community. If you are passionate about monitoring and automation, enjoy working on open source, and are excited by the prospect of sharing your expertise with your peers, get in touch with us today! http://bit.ly/143ARmy

  • We need awesome people @ Booking.com - We want YOU! Come design next generation interfaces, solve critical scalability problems, and hack on one of the largest Perl codebases. Please apply online.

  • LogicMonitor is looking for a Front End developer to have a huge impact, be valued, realize their dreams, and help us realize ours. We are looking for someone to own the code that delivers the design and usability of LogicMonitor's enterprise SaaS application(s). Please apply online

  • New Relic is looking for a Java Instrumentation Engineer, Java Scalability Engineer,  Distributed Systems Engineer and Android app engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter? 

Fun and Informative Events

  • Your event here.

Cool Products and Services

  • NuoDB Blackbirds Release 2.0 Birthday. They grow up so fast these days! What people love about NuoDB is that it’s stable, always there for you, and it’s flexible. Which is why it’s winning all kinds of popularity competitions, from “Most Likely to Succeed” through “Least Likely To Fall Over Sharding” to “Most Likely to Be ACID Compliant”. 

  • Rapidly Develop Hadoop MapReduce Code. With ScaleOut hServer™ you can use a subset of your Hadoop data and run your MapReduce code in seconds for fast code development. You don’t need to load and manage the Hadoop software stack; it's a self-contained Hadoop MapReduce execution environment. To learn more check out www.scaleoutsoftware.com/prototypehadoop/

  • CloudStats.me - Monitor all your VPS, Dedicated and Cloud servers from one place. Whether you have only one server or hundreds of them, you will be able to check their status in seconds from the dashboard. Try server monitoring now for free.

  • MongoDB Backup Free Usage Tier Announced. We're pleased to introduce the free usage tier to MongoDB Management Service (MMS). MMS Backup provides point-in-time recovery for replica sets and consistent snapshots for sharded systems with minimal performance impact. Start backing up today at mms.mongodb.com.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • NEW! Aerospike 3 - Download FREE. Introducing the new Aerospike 3 database that builds off of Aerospike's legacy of speed, scale, and reliability, adding an extensible data model that supports complex data types, large data types, queries using secondary indexes, user defined functions (UDFs) and distributed aggregations using Stream UDFs for real-time data.

  • The Rackspace Cloud Application Programming Interface (API)  has changed the game allowing customers to easily modify their cloud configuration with just a few lines of code. The API is a powerful tool and something everyone should know about, regardless of your level of technical ability.

  • aiScaler, aiProtect, aiMobile integrated solutions for Dynamic Site Acceleration, Denial of Service Protection and Simplifying Mobile Content. Free instant trial, no sign-up required. http://aiscaler.com/

  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...


Monday
Oct 28, 2013

Design Decisions for Scaling Your High Traffic Feeds

Guest post by Thierry Schellenbach, Founder/CTO of Fashiolista.com, follow @tschellenbach on Twitter and Github

Fashiolista started out as a hobby project which we built on the side. We had absolutely no idea it would grow into one of the largest online fashion communities. The entire first version took about two weeks to develop and our feed implementation was dead simple. We’ve come a long way since then and I’d like to share our experience with scaling feed systems.

Feeds are a core component of many large startups such as Pinterest, Instagram, Wanelo and Fashiolista. At Fashiolista the feed system powers the flat feed, aggregated feed and the notification system. This article will explain the troubles we ran into when scaling our feeds and the design decisions involved with building your own solution. Understanding the basics of how these feed systems work is essential as more and more applications rely on them.

Furthermore we’ve open sourced Feedly, the Python module powering our feeds. Where applicable I’ll reference how to use it to quickly build your own feed solution.

Introduction to Feeds

The problem of scaling feed systems has been widely discussed, but let me start by clarifying the basics:

Click to read more ...

Friday
Oct 25, 2013

Stuff The Internet Says On Scalability For October 25th, 2013

Hey, it's HighScalability time:


Test your sense of scale. Is this image of something microscopic or macroscopic? Find out.
  • $465m: Amount lost in 45 minutes due to a software bug. Where? Where else...the finance industry.
  • Quotable Quotes:
    • FCC: Fiber-to-the-home, on average, has the best performance in terms of latency, with 18 ms average during the peak period, with cable having 26 ms latency and DSL 44 ms latency.
    • @CompSciFact: "About 1,000 instructions is a reasonable upper limit for the complexity of problems now envisioned." -- John von Neumann, 1946
    • @anildash: healthcare.gov got 20M unique visitors in 20 days, faster than Google+ launch. Took Pinterest 2 years & BuzzFeed 4 years to hit 20M.
    • Thomas A. Edison: I start where the last man left off.
    • @brycebaril: I've never had a tech conference toy with my emotions like this year's #realtimeconf

  • Great explanation of the Netflix people don't know, their CDN. Chaos Kong is Coming: A Look At The Global Cloud and CDN Powering Netflix:  Netflix sees about 2 billion requests per day to its API, which serves as the “front door” for devices requesting videos, and routes the requests to the back-end services that power Netflix. That activity generates about 70 to 80 billion data points each day that are logged by the system.

  • Healthcare.gov Didn’t Work in Tests, Launched Anyway. This is just getting silly. When have projects built and released like this ever worked? Especially under huge huge initial loads. Never (or close to). This stuff is complicated for many reasons on every level. To compare such a product with a website is the height of technical ignorance. I recall Captain Renault: I'm shocked, shocked to find that gambling is going on in here!

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...