Entries in Distributed Computing (4)


22 Recommendations for Building Effective High Traffic Web Software

This is a guest post by Ashwanth Fernando, Software Engineer from the trenches at large scale internet companies.

Inspired by the book "Effective Java" by Joshua Bloch, I wanted to share my holistic recommendations on building high traffic web software (i.e. web applications/services that serve high traffic loads). Some of these items may not be just about software design but also around surrounding areas such as the engineering organization, culture etc.

Two disclaimers up front:

1) This is my opinion.
2) There will be real world situations where the below principles will be wrong as in all things "software". Please use common sense all the time.

Consider using more than one datacenter

There have been numerous horror stories about businesses, ahem going out of business because they just had a single datacenter. Its really important to have more than one data center if you want to protect yourself from natural disasters or electrical supply failures. Run all your datacenters in active-active configuration. It may cost extra money, but its well worth it rather than having an active passive configuration and then finding out at the end that for some pieces of data, your passive hardware was not consistent with the active one.

Consider a sparse datacenter deployment

Click to read more ...


10 Things You Should Know About AWS

Authored by Chris Fregly:  Former Netflix Streaming Platform Engineer, AWS Certified Solution Architect and Purveyor of

Ahead of the upcoming 2nd annual re:Invent conference, inspired by Simone Brunozzi’s recent presentation at an AWS Meetup in San Francisco, and collected from a few of my recent consulting engagements, I’ve compiled a list of 10 useful time and clock-tick saving tips about AWS.

1) Query AWS resource metadata


Can’t remember the EBS-Optimized IO throughput of your c1.xlarge cluster?  How about the size limit of an S3 object on a single PUT? is the answer to all of your AWS-resource metadata questions.  Interested in integrating with your application?  You’re in luck.  There’s now a REST API, as well!

Note:  These are default soft limits and will vary by account.

2) Tame your S3 buckets


Delete an entire S3 bucket with a single CLI command:  

aws s3 rb s3://<bucket-name> --force

Recursively copy a local directory to S3:

aws s3 cp <local-dir-name> s3://<bucket-name> --region <region-name> --recursive

3) Understand AWS cross-region dependencies

Click to read more ...


100 Node Hazelcast cluster on Amazon EC2

Deploying, running and monitoring application on a big cluster is a challenging task. Recently Hazelcast team deployed a demo application on Amazon EC2 platform to show how Hazelcast p2p cluster scales and screen recorded the entire process from deployment to monitoring.

Hazelcast is open source (Apache License), transactional, distributed caching solution for Java. It is a little more than a cache though as it provides distributed implementation of map, multimap, queue, topic, lock and executor service. 

Details of running 100 node Hazelcast cluster on Amazon EC2 can be found here. Make sure to watch the screencast!


Big Data on Grids or on Clouds? 

 Contributed by Wolfgang Gentzsch:

Now that we have a new computing paradigm, Cloud Computing, how can Clouds help our data? Replace our internal data vaults as we hoped Grids would? Are Grids dead now that we have Clouds? Despite all the promising developments in the Grid and Cloud computing space, and the avalanche of publications and talks on this subject, many people still seem to be confused about internal data and compute resources, versus Grids versus Clouds, and they are hesitant to take the next step. I think there are a number of issues driving this uncertainty.

read more at: