Wednesday
Jan 16, 2008

Strategy: Asynchronous Queued Virus Scanning

Atif Ghaffar has a nice strategy for virus-checking uploads:

  • Upload item into a safe area. If necessary, the uploader blocks waiting for a result.
  • Queue a work order into a job system so all the work can be distributed throughout your cluster.
  • A service in your cluster performs the virus scan and informs the uploader of the result.
  • Move the vetted item into your system.

This removes the CPU bottleneck from your web servers and distributes it through your cluster. Keep your web servers providing prompt service to users. Let your cluster do the heavy lifting. This minimizes response time and maximizes throughput. A similar system can be used for creating thumbnails, transcoding, copyright checks, updating indexes, event notification or any other kind of intensive work.
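The four steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: an in-process `queue.Queue` and worker threads stand in for a real distributed job system, the `safe_area` and `system` dictionaries stand in for real storage, and `scan` is a hypothetical byte-pattern check, not a real virus scanner.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical signature; a real deployment would call an actual scanner here.
SIGNATURE = b"FAKE-VIRUS-SIGNATURE"

def scan(data: bytes) -> bool:
    """Return True when the payload looks clean (illustrative check only)."""
    return SIGNATURE not in data

@dataclass
class ScanJob:
    name: str
    data: bytes
    clean: Optional[bool] = None
    done: threading.Event = field(default_factory=threading.Event)

jobs: queue.Queue = queue.Queue()   # stand-in for the cluster's job system
safe_area: Dict[str, bytes] = {}    # quarantine for unvetted uploads
system: Dict[str, bytes] = {}       # vetted items only

def worker() -> None:
    """Scan service: take work orders off the queue and report results."""
    while True:
        job = jobs.get()
        if job is None:             # sentinel: shut this worker down
            break
        job.clean = scan(job.data)
        job.done.set()              # inform the waiting uploader

def start_workers(count: int) -> List[threading.Thread]:
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads

def stop_workers(threads: List[threading.Thread]) -> None:
    for _ in threads:
        jobs.put(None)
    for thread in threads:
        thread.join()

def upload(name: str, data: bytes, timeout: float = 5.0) -> bool:
    """Run the four steps for one upload; return True if the item was accepted."""
    safe_area[name] = data              # 1. upload into the safe area
    job = ScanJob(name, data)
    jobs.put(job)                       # 2. queue a work order
    if not job.done.wait(timeout):      # 3. block waiting for the scan result
        raise TimeoutError(f"scan of {name!r} did not finish in time")
    item = safe_area.pop(name)
    if job.clean:
        system[name] = item             # 4. move the vetted item into the system
    return bool(job.clean)
```

Calling `start_workers(2)` and then `upload("report.txt", b"hello")` would leave the item in `system`, while a payload containing the hypothetical signature is dropped from the safe area and never reaches it. The web-facing code only queues and waits; the scanning cost lands on whichever worker picks up the job.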

    Click to read more ...

    Tuesday
    Jan 15, 2008

    Does Sun Buying MySQL Change Your Scaling Strategy?

    Sun is buying MySQL for $1 billion. The MySQL team has worked long and hard so I don't begrudge them their pay day. Strike while the iron is offering a lot of cash I say. And I have nothing against Sun. Yet I can't help but think this changes the mental calculation of what database to use. When Oracle acquired Innobase a new independent storage engine was needed for MySQL. How is this different? Does this change your thinking any? Would Martha say it's a good thing? Like Luke I've searched my feelings, but the force is not with me and I don't really know how I feel about it.

    Click to read more ...

    Tuesday
    Jan 15, 2008

    Sun to Acquire MySQL

    So what are we announcing today? That in addition to acquiring MySQL, Sun will be unveiling new global support offerings into the MySQL marketplace. We'll be investing in both the community and the marketplace, to accelerate the industry's phase change away from proprietary technology to the new world of open web platforms. Read more on Jonathan Schwartz's blog. What do you think about this?

    Click to read more ...

    Monday
    Jan 14, 2008

    OpenSpaces.org community site launched - framework for building scale-out applications

    GigaSpaces launched OpenSpaces.org, a community web site for developers who wish to utilize and contribute to the open source OpenSpaces development framework. OpenSpaces extends the Spring Framework for enterprise Java development, and leverages the GigaSpaces eXtreme Application Platform (XAP) for data caching, messaging and as the container for application business logic. It is designed for building highly-available, scale-out applications in distributed environments, such as SOA, cloud computing, grids and commodity servers.

    OpenSpaces is widely used in a variety of industries, including financial services, telecommunications, manufacturing and retail -- and across the web in e-commerce, Web 2.0 applications such as social networking sites, search and more. OpenSpaces.org already lists more than two dozen projects submitted by the developer community, including GigaSpaces customers, partners and employees. Innovative projects include an instant messaging platform, integration with PHP, configuration via JRuby, an implementation of Spring Batch and a scalable dynamic RSS feed delivery system.

    GigaSpaces recently announced the OpenSpaces Developer Challenge, a developer competition with $25,000 in total prizes and a $10,000 grand prize. The prizes will be awarded to the most innovative applications built using the OpenSpaces framework or plug-ins that extend it. The Challenge deadline is April 2, 2008 and 'early bird' prizes are available for those who submit their concepts by February 13, 2008. Additionally, in November of 2007 GigaSpaces launched its Start-Up Program, which provides free software licenses for qualifying individuals and companies.

    Click to read more ...

    Sunday
    Jan 13, 2008

    Google Reveals New MapReduce Stats

    The Google Operating System blog has an interesting post on Google's scale based on an updated version of Google's paper about MapReduce. The input data for some of the MapReduce jobs run in September 2007 was 403,152 TB (terabytes), the average number of machines allocated for a MapReduce job was 394, and the average completion time was six and a half minutes. The paper mentions that Google's indexing system processes more than 20 TB of raw data. Niall Kennedy calculates that the average MapReduce job runs across a $1 million hardware infrastructure, assuming that Google still uses the same cluster configurations from 2004: two 2 GHz Intel Xeon processors with Hyper-Threading enabled, 4 GB of memory, two 160 GB IDE hard drives and a gigabit Ethernet link. Greg Linden notes that Google's infrastructure is an important competitive advantage: "Anyone at Google can process terabytes of data. And they can get their results back in about 10 minutes, so they can iterate on it and try something else if they didn't get what they wanted the first time." It is interesting to compare this to Amazon EC2:

    • $0.40 Large Instance price per hour x 400 instances x 10 minutes (1/6 hour) = $26.67
    • 1 TB data transfer in at $0.10 per GB = $100
    For a hundred bucks you could also process a TB of data!
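The arithmetic behind the two bullets can be checked with a short script. It uses only the figures quoted above; treating 1 TB as 1,000 GB and prorating the hourly price to 10 minutes are assumptions carried over from the post's own numbers, and actual billing granularity is not modeled.

```python
INSTANCE_PRICE_PER_HOUR = 0.40   # Large Instance price quoted in the post
INSTANCES = 400
RUN_MINUTES = 10
TRANSFER_PRICE_PER_GB = 0.10     # inbound transfer price quoted in the post
DATA_GB = 1000                   # assumption: 1 TB counted as 1,000 GB

def compute_cost(price_per_hour: float, instances: int, minutes: float) -> float:
    """Instance cost, prorated to the minute as in the post's estimate."""
    return price_per_hour * instances * minutes / 60

def transfer_cost(price_per_gb: float, gigabytes: float) -> float:
    """Cost of moving the input data in."""
    return price_per_gb * gigabytes

compute = compute_cost(INSTANCE_PRICE_PER_HOUR, INSTANCES, RUN_MINUTES)
transfer = transfer_cost(TRANSFER_PRICE_PER_GB, DATA_GB)
total = compute + transfer

print(f"compute:  ${compute:.2f}")
print(f"transfer: ${transfer:.2f}")
print(f"total:    ${total:.2f}")
```

Under these assumptions the data transfer, not the compute time, dominates the bill.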

    Click to read more ...

    Sunday
    Jan 13, 2008

    A Note on How to Create Teasers When Posting 

    I fully and enthusiastically encourage anyone who wants to share a relevant topic to register and post. People have added a lot of good and useful content. Don't be shy.

    It's been asked how a teaser is created when posting so the full article doesn't display on the front page. A teaser is a paragraph interesting enough to convince readers to click on the "read more" link to get the full article. Creating a teaser in Drupal is accomplished by inserting <!--break--> on a separate line directly after the text you want to be the teaser. So your post looks like:

        Teaser Content
        <!--break-->
        Rest of Content

    It's a bit kludgey, but it works.
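To make the effect of the marker concrete, here is a minimal sketch of splitting a post at the break tag. It is not Drupal's code, only an illustration of the idea; the function name `split_teaser` is hypothetical, and returning the whole post as the teaser when no marker is present is an assumption of this sketch.

```python
from typing import Tuple

TEASER_MARKER = "<!--break-->"   # the tag described above, written without spaces

def split_teaser(post: str) -> Tuple[str, str]:
    """Split a post into (teaser, rest) at the first break marker."""
    teaser, marker, rest = post.partition(TEASER_MARKER)
    if not marker:
        # Assumption of this sketch: with no marker, the whole post is the teaser.
        return post.strip(), ""
    return teaser.strip(), rest.strip()

post = "Teaser Content\n<!--break-->\nRest of Content"
teaser, rest = split_teaser(post)
print(teaser)
print(rest)
```

Everything before the marker is what would appear on the front page; everything after it is only shown on the full article page.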

    Click to read more ...

    Saturday
    Jan 12, 2008

    Gandi.net, French registrar, launches granular server resources.

    Gandi.net, a French domain registrar, has launched a very flexible VPS service with dynamically allocated resources.

    Click to read more ...

    Friday
    Jan 11, 2008

    FTP Sanity: Redundancy, archiving, consolidation.

    Easy FTP redundancy and consolidation with the open source project Generic-FTP. It probably works with any Linux FTP server (ProFTPD is the only one tested). Get rid of some single points of failure. A very easy to set up solution using scripts written in PHP. Tested thoroughly in a production environment.

    Click to read more ...

    Thursday
    Jan 10, 2008

    Sharding with Cookie-Based Session Storage

    In a recent project, I used RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.

    Click to read more ...

    Thursday
    Jan 10, 2008

    MONO ASP.NET. Will it make the web???

    I was wondering whether it is already possible to scale a Mono ASP.NET website. I cannot find any real websites (by "real" I mean highly visited) running Mono. Will Mono ASP.NET scale? Is it worth planning a site to run on Mono ASP.NET, or should we leave it for the future? What do you think?

    Click to read more ...