Wednesday
Oct 15, 2008

Sun Customer Ready HPC Cluster: Reference Configurations with Sun Fire X2200 M2 and X2100 M2 Servers

The reference configurations described in this blueprint are starting points for building Sun Customer Ready HPC Clusters configured with Sun Fire X2100 M2 and X2200 M2 servers. The configurations define how Sun Systems Group products can be configured in a typical grid rack deployment. This document describes in detail configurations using Sun Fire X2100 M2 and X2200 M2 servers with a Gigabit Ethernet data fabric, as well as configurations using Sun Fire X2200 M2 servers with a high-speed InfiniBand fabric. These configurations focus on single-rack solutions, with external connections through the uplink ports of the switches. The reference configurations have been architected using expertise Sun has gained in real-world installations. Within certain constraints, described in later sections, the system can be tailored to customer needs. Certain system components described in this document are available only through Sun's factory integration. Although the information contained here could be used during an on-site integration, the optimal benefit is achieved through Sun Customer Ready System integration.

Click to read more ...

Wednesday
Oct 15, 2008

Hadoop - A Primer

Hadoop is a distributed computing platform written in Java. It incorporates features similar to those of the Google File System and of MapReduce to process vast amounts of data. "Hadoop is a Free Java software framework that supports data intensive distributed applications running on large clusters of commodity computers. It enables applications to easily scale out to thousands of nodes and petabytes of data" (Wikipedia).

What platform does Hadoop run on?
  • Java 1.5.x or higher, preferably from Sun
  • Linux
  • Windows for development
  • Solaris
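
To give a flavor of the MapReduce model Hadoop implements, here is a minimal sketch of the mapper half of the canonical word-count job, written against Hadoop's classic org.apache.hadoop.mapred API; a matching reducer would sum the emitted counts for each word:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Mapper for the canonical word-count job: emits (word, 1) for every
    // word on each input line. Hadoop groups the pairs by word and hands
    // them to a reducer, which sums the counts.
    public class WordCountMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

      private static final IntWritable ONE = new IntWritable(1);
      private final Text word = new Text();

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
          word.set(tokens.nextToken());
          output.collect(word, ONE);
        }
      }
    }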

Click to read more ...

Tuesday
Oct 14, 2008

Sun N1 Grid Engine Software and the Tokyo Institute of Technology Super Computer Grid

One of the world's leading technical institutes, the Tokyo Institute of Technology (Tokyo Tech) created the fastest supercomputer in Asia, and one of the largest outside the United States. Using Sun x64 servers and data servers deployed in a grid architecture, Tokyo Tech built a cost-effective, flexible supercomputer that meets the demands of compute- and data-intensive applications. Built in just 35 days, the TSUBAME grid includes hundreds of systems incorporating thousands of processor cores and terabytes of memory, and delivers 47.38 trillion floating-point operations per second (TeraFLOPS) of sustained LINPACK benchmark performance and 1.1 petabytes of storage to users running common off-the-shelf applications. Based on the deployment architecture, the grid is expected to reach 100 TeraFLOPS in the future. This Sun BluePrints article, the third in a series on the TSUBAME grid, provides an overview of the grid's overall system architecture, as well as a detailed look at the configuration of the Sun N1 Grid Engine software that makes the grid accessible to users.

Click to read more ...

Tuesday
Oct 14, 2008

Implementing the Lustre File System with Sun Storage: High Performance Storage for High Performance Computing

Much of the focus of high performance computing (HPC) has centered on CPU performance. However, as computing requirements grow, HPC clusters are demanding higher rates of aggregate data throughput. Today's clusters feature larger numbers of nodes with increased compute speeds. The higher clock rates and operations per clock cycle create increased demand for local data on each node. In addition, InfiniBand and other high-speed, low-latency interconnects increase the data throughput available to each node. Traditional shared file systems such as NFS have not been able to scale to meet this growing demand for data throughput on HPC clusters. Scalable cluster file systems that can provide parallel data access to hundreds of nodes and petabytes of storage are needed to provide the high data throughput required by large HPC applications, including manufacturing, electronic design, and research. This paper describes an implementation of the Sun Lustre file system as a scalable storage cluster using Sun Fire servers, high-speed/low-latency InfiniBand interconnects, and additional networking and storage devices. Furthermore, this paper explores the use of the Sun Lustre file system at a shared government and education research site, including configuration information and details on testing that was performed on-site to evaluate the performance of Sun's scalable storage solution.

Click to read more ...

Tuesday
Oct 14, 2008

Sun Storage and Archive Solution for HPC

When designing data storage solutions for High Performance Computing (HPC) environments, IT architects strive to balance complex and often conflicting requirements. The need to manage skyrocketing amounts of data, along with the goals of controlling cost and keeping data immediately available, can make it difficult to meet HPC application demands within the constraints of today's IT budgets. To help customers address an almost bewildering set of architectural challenges, Sun has developed the Sun Storage and Archive Solution for HPC, a reference architecture that can be easily customized to meet specific application goals and business requirements. This article is intended for IT managers and storage architects familiar with HPC applications and data requirements in their organizations. It assumes that the audience has a technical background and some familiarity with the issues surrounding the task of configuring systems and storage.

Click to read more ...

Monday
Oct 13, 2008

Challenges from large scale computing at Google

From Greg Linden, on a talk Google Fellow Jeff Dean gave last week at University of Washington Computer Science titled "Research Challenges Inspired by Large-Scale Computing at Google": Coming away from the talk, the biggest points for me were the considerable interest in reducing costs (especially reducing power costs), the suggestion that the Google cluster may eventually contain 10M machines at 1k locations, and the call to action for researchers on distributed systems and databases to think orders of magnitude bigger than they often do: not about running on hundreds of machines in one location, but hundreds of thousands of machines across many locations.

Click to read more ...

Monday
Oct 13, 2008

SQL Server 2008 Database Performance and Scalability

Microsoft SQL Server 2008 incorporates the tools and technologies that are necessary to implement relational databases, reporting systems, and data warehouses of enterprise scale, and provides optimal performance and responsiveness.
With SQL Server 2008, you can take advantage of the latest hardware technologies while scaling up your servers to support server consolidation. SQL Server 2008 also enables you to scale out your largest data solutions.

This white paper describes the performance and scalability capabilities of Microsoft® SQL Server® 2008 and explains how you can use these capabilities to:
* Optimize performance for any size of database with the tools and features that are available for the database engine, analysis services, reporting services, and integration services.
* Scale up your servers to take full advantage of new hardware capabilities.
* Scale out your database environment to optimize responsiveness and to move your data closer to your users.


Read the entire article about SQL Server 2008 Database Performance and Scalability at MyTestBox.com - web software reviews, news, tips & tricks.

Click to read more ...

Friday
Oct 10, 2008

Useful Corporate Blogs that Talk About Scalability

Some intrepid company blogs are posting their technical challenges and how they solve them. I wish more would open up and talk about what they are doing as it helps everyone move forward. Here are a few blogs documenting their encounters with the bleeding edge:

  • Flickr
  • Digg
  • LinkedIn
  • Facebook
  • Amazon Web Services blog
  • Twitter blog
  • Reddit blog
  • Photobucket blog
  • Second Life blog
  • PlentyofFish blog
  • Joyent's Blog

Any others that should be added?

Click to read more ...

Friday
Oct 10, 2008

The Art of Capacity Planning: Scaling Web Resources

Update 3: The book was released! Find it on Amazon at The Art of Capacity Planning.

Update 2: Maybe the iPhone could use a little capacity planning? From What's Behind the iPhone 3G Glitches: one source says Apple programmed the Infineon chip to demand a more powerful 3G signal than the iPhone really requires. So if too many people try to make a call or go on the Internet in a given area, some of the devices will decide there's insufficient power and switch to the slower network, even if there is enough 3G bandwidth available.

Update: To get a taste of what will be served, MySQL DBA has a nice post titled Capacity Planning, Architecture, Scaling, Response time, Throughput. You learn how to figure out when your application will break by fitting a 3rd-order polynomial to your performance measurements. Cool stuff!

John Allspaw, the Operations Engineering Manager at Flickr, is about to publish a book with O'Reilly. There are not many details so far, but it seems interesting and relevant to High Scalability. Allspaw combines personal anecdotes from many phases of Flickr's growth with insights from his colleagues in many other industries to give you solid guidelines for measuring your growth, predicting trends, and making cost-effective preparations. Topics include:

  • Evaluating tools for measurement and deployment
  • Capacity analysis and prediction for storage, database, and application servers
  • Designing architectures to easily add and measure capacity
  • Handling sudden spikes
  • Predicting exponential and explosive growth
  • How cloud services such as EC2 can fit into a capacity strategy

The Art of Capacity Planning: Scaling Web Resources is available for pre-order on amazon.com.
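
To make the polynomial idea from the MySQL DBA post concrete, here is a toy Java sketch: it evaluates a hypothetical cubic fitted to response-time measurements and scans forward to estimate the load at which an SLA would break. All coefficients and the threshold are invented for illustration:

    // Sketch: given coefficients of a 3rd-order polynomial fitted offline
    // to measured data (response time as a function of requests/sec),
    // scan forward to estimate the load at which an SLA is crossed.
    public class CapacityProjection {
      // r(x) = a0 + a1*x + a2*x^2 + a3*x^3, with hypothetical coefficients
      static double responseTimeMillis(double x) {
        double a0 = 120, a1 = 0.8, a2 = 0.002, a3 = 0.000004;
        return a0 + a1 * x + a2 * x * x + a3 * x * x * x;
      }

      public static void main(String[] args) {
        double slaMillis = 2000; // response-time budget
        for (double load = 0; load <= 10000; load += 10) {
          if (responseTimeMillis(load) > slaMillis) {
            System.out.printf("Projected to break the SLA near %.0f req/s%n", load);
            return;
          }
        }
        System.out.println("SLA holds across the projected range");
      }
    }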

Click to read more ...

Wednesday
Oct 8, 2008

Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest

This strategy is stated perfectly by Flickr's Myles Grant: The Flickr engineering team is obsessed with making pages load as quickly as possible. To that end, we’re refactoring large amounts of our code to do only the essential work up front, and rely on our queuing system to do the rest. Flickr uses a queuing system to process 11 million tasks a day. Leslie Michael Orchard also does a great job explaining the queuing meme in his excellent post Queue everything and delight everyone. Asynchronous work queues are how you scalably solve problems that are too big to handle in real-time. The process:

  • Identify the minimum feedback the client (UI, API) needs to know an operation succeeded. It's enough, for example, to update the client's view when a message is posted to a microblogging service. The client probably isn't aware of all the other steps that happen when a message is added, and doesn't really care when they happen as long as the obvious cases happen in an appropriate period of time.
  • Queue all work not on the critical path to a job queueing system so the critical path remains unblocked. Work is then load balanced across a cluster and completed as resources permit. The more sharded your architecture is, the more work can be done in parallel, which minimizes total completion time. This approach makes it much easier to bound response latencies as features scale. A minimal sketch of the pattern appears below.
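
Here is a minimal sketch of the pattern in Java, assuming a hypothetical message-posting service; a plain java.util.concurrent executor stands in for a real distributed job queue such as Gearman or beanstalkd, and all the service and method names are invented for illustration:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Do the essential work inline, queue everything else. An in-process
    // executor stands in for a real distributed job queue.
    public class PostService {
      private final ExecutorService jobQueue = Executors.newFixedThreadPool(8);

      public long post(long userId, String text) {
        long messageId = saveMessage(userId, text); // critical path: store and ack

        // Everything below is off the critical path; the client has
        // already been told the post succeeded.
        jobQueue.submit(() -> updateSearchIndex(messageId));
        jobQueue.submit(() -> fanOutToFollowers(userId, messageId));
        jobQueue.submit(() -> notifySubscribers(userId, messageId));
        return messageId;
      }

      // Stubs standing in for real implementations.
      private long saveMessage(long userId, String text) { return 42L; }
      private void updateSearchIndex(long messageId) {}
      private void fanOutToFollowers(long userId, long messageId) {}
      private void notifySubscribers(long userId, long messageId) {}
    }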

Queues Give You Lots of New Knobs to Play With

As features are added, data consumers multiply, so throwing a new task into a sequential process has a good chance of blowing latencies. Queueing gives you much more control and flexibility over the performance of a system. With queues, some of the advanced strategies at your disposal are:
  • Horizontal scaling. Add more processing resources to do more work in parallel.
  • Priority order processing. Paying customers can be processed first, for example (see the sketch after this list). Take measures to avoid starvation.
  • Aggregation. Work sitting on the same queue for the same user can be aggregated together and processed as a batch.
  • Work canceling. A request later in the queue can cancel work earlier in the queue, which can simply be dropped.
  • CPU limiting. Jobs with unbounded CPU time destroy latency for other jobs sitting in the queue. Bounding CPU time per job evens out latency for everyone.
  • Low priority work dropping. Under load, low priority jobs can be dropped. Just make sure you have background sweep processes that catch work that should have been done and redo it.
  • Admission control. Under load, clients can be told when to retry. This is the best form of flow control: end-to-end flow control with the client. We want to push back on work as high up the stack as we can. Stop the client from pushing work to you and you've accomplished something. Blind retries and timeouts put immense pressure on the whole system. These ideas have been employed in embedded real-time systems forever, and now it seems they'll move into web services as well.
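
As one illustration of the priority-ordering knob, here is a sketch using Java's PriorityBlockingQueue; the job fields are hypothetical, and a real system would also need the anti-starvation measures mentioned above (the enqueue-time tie-break here is only a start):

    import java.util.concurrent.PriorityBlockingQueue;

    // Sketch of priority-order processing: paying customers' jobs are
    // dequeued before free-tier jobs. Field names are invented.
    public class PriorityJobQueue {
      static class Job implements Comparable<Job> {
        final int priority;        // lower value = served first
        final long enqueuedAtMs;   // FIFO tie-break to reduce starvation
        final Runnable work;

        Job(int priority, Runnable work) {
          this.priority = priority;
          this.enqueuedAtMs = System.currentTimeMillis();
          this.work = work;
        }

        public int compareTo(Job other) {
          if (priority != other.priority) return Integer.compare(priority, other.priority);
          return Long.compare(enqueuedAtMs, other.enqueuedAtMs);
        }
      }

      private final PriorityBlockingQueue<Job> queue = new PriorityBlockingQueue<>();

      public void submit(boolean payingCustomer, Runnable work) {
        queue.put(new Job(payingCustomer ? 0 : 1, work));
      }

      // Worker loop: always take the highest-priority job and run it.
      public void workerLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
          queue.take().work.run();
        }
      }
    }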

What Can You Do with Your Queue?

The options are endless, but here are some uses I found in the wild:
  • Backfill jobs. Backfill is what Flickr calls asynchronous jobs that alter database tables in preparation for a new feature, fix existing features, or perform other operations that touch a lot of accounts, photos, or groups. For example, a sharded approach means related data is spread across many different shards, so deleting a user account requires visiting each shard to delete that user's data. Each of those deletes would be queued so they could be done in parallel (see the sketch after this list). Now let's say a bug prevented some of the user data from being deleted. After the bug was fixed, the deletes for all the impacted user accounts would have to be scheduled again.
  • Low latency function call router.
  • Scatter/gather calls in parallel.
  • Defer expensive library calls.
  • Parallelize database queries.
  • Job queue system for a cluster. Efficiently use your entire pool of CPU power.
  • Send scheduled, mail-merged emails.
  • Create guest hosts.
  • Put heavy code on the backend instead of the web server.
  • Call a cron script to update topic hits and popular article hits.
  • Clean outdated, useless data from the database.
  • Resize photos.
  • Run daily reports.
  • Update search indexes.
  • Speed up batch jobs by running them in parallel.
  • SpamAssassin spamtraps.
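
A sketch of the backfill-style fan-out described above: deleting a user's data across shards by queueing one job per shard so the deletes run in parallel. The shard count and the per-shard delete call are invented for illustration:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Fan one logical operation (delete a user's data) out into one
    // queued job per shard so the work runs in parallel.
    public class AccountDeleteBackfill {
      private static final int SHARD_COUNT = 16; // hypothetical

      public static void deleteAccount(long userId, ExecutorService jobQueue) {
        for (int shard = 0; shard < SHARD_COUNT; shard++) {
          final int s = shard;
          // Each per-shard delete is an independent, retryable job; a
          // background sweep can re-enqueue shards that failed.
          jobQueue.submit(() -> deleteUserDataOnShard(userId, s));
        }
      }

      private static void deleteUserDataOnShard(long userId, int shard) {
        // ... issue the deletes against this shard's database ...
      }

      public static void main(String[] args) throws InterruptedException {
        ExecutorService jobQueue = Executors.newFixedThreadPool(SHARD_COUNT);
        deleteAccount(1234L, jobQueue);
        jobQueue.shutdown();
        jobQueue.awaitTermination(1, TimeUnit.MINUTES);
      }
    }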

Queuing Implies an Event Driven State Machine Based Client Architecture

Moving to queuing has architectural implications. The client and server are no longer connected in a direct request-response way. Instead, the server continually sends events to clients, and the client is event driven instead of request-response driven. Internally, clients often simulate the request-response model even though Ajax is asynchronous. It might be better to drop the request-response illusion and just make the client an event-driven state machine. An event can come from a request or from an asynchronous job, or events can be generated by others performing activities that a client should see. Each client has an event channel that the system puts events on for the client to consume. The client is responsible for making sense of each event in its current context and is capable of handling any event regardless of its original source.
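
A minimal sketch of such a client in Java: all updates arrive on one event channel, whether they were triggered by this client's own requests, by background jobs, or by other users' activity, and the client updates its state the same way regardless. The event types and handlers are invented for illustration:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Event-driven client state machine consuming a single event channel.
    public class EventDrivenClient {
      enum Type { PHOTO_UPLOADED, COMMENT_ADDED, JOB_FINISHED } // hypothetical

      static class Event {
        final Type type;
        final String payload;
        Event(Type type, String payload) { this.type = type; this.payload = payload; }
      }

      private final BlockingQueue<Event> channel = new LinkedBlockingQueue<>();

      // Called by whatever layer pushes server events to this client.
      public void deliver(Event e) { channel.add(e); }

      // The client loop: consume events and update the view, regardless
      // of what originally caused each event.
      public void run() throws InterruptedException {
        while (true) {
          Event e = channel.take();
          switch (e.type) {
            case PHOTO_UPLOADED: refreshPhotoStream(e.payload); break;
            case COMMENT_ADDED:  refreshComments(e.payload); break;
            case JOB_FINISHED:   clearPendingIndicator(e.payload); break;
          }
        }
      }

      private void refreshPhotoStream(String p) {}
      private void refreshComments(String p) {}
      private void clearPendingIndicator(String p) {}
    }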

Queuing Systems

If you are in the market for a queuing system, take a look at:
  • Gearman - Open Source Message Queuing System
  • Amazon's SQS. The latencies for this service tend to be high and variable so it may not be appropriate for all tasks.
  • beanstalkd.
  • Apache ActiveMQ.
  • Spread Queue
  • RabbitMQ
  • OpenAMQ
  • The Schwartz
  • Starling
  • Simple MQ
  • Roll your own.

    Related Articles

  • Flickr Engineers Do It Offline by Myles Grant
  • Queue everything and delight everyone by Leslie Michael Orchard.
  • Gearman - Open Source Message Queuing System
  • GridGain: One Compute Grid, Many Data Grids

Click to read more ...