Before you proceed make sure you have physical volume(something like /dev/sda1, /dev/sda4, etc) with no data. This is going to be the gfs volume which you will export to other nodes. It should be on the node which is going to be your gnbd server. If you dont have such volume create one using fdisk. I used mounted gfs volume as a DOCUMENT ROOT for my Apache server nodes(Load Balanced). I tried it on FC4 64-bit. If you plan to try it on any other distribution or 32-bit arch.. still the procedure remains same. Since I built it from source but not RPMs, you may have to simply supply config options with a different CFLAGS. Full details at http://linuxsutra.chakravaka.com/redhat-cluster/2006/11/01/howto-gfs-gnbd
All, I'm looking for a faster/reliable solution for DR/BC as well as for sclability for my web/db servers. I came across VMWare Infrastructure and other products. The I/O performance concerns me to go with virtual servers. I'm also looking into imaging software such as Acrnois. Could anyone share their thoughts on how it's being done with bigger names such as google/youtube etc..? Thank you, Regards, Janakan Rajendran.
From FRESH Ports and their website:
ISPman is an ISP management software written in perl, using an LDAP
backend to manage virtual hosts for an ISP. It can be used to manage,
DNS, virtual hosts for apache config, postfix configuration, cyrus
mail boxes, proftpd etc.
ISPMan was written as a management tool for the network at 4unet where
between 30 to 50 domains are hosted and the number is crazily growing.
Managing these domains and their users was a little time consuming,
and needed an Administrator who knows linux and these daemons
fluently. Now the help-desk can easily manage the domains and users.
LDAP data can be easily replicated site wide, and mail box server can
be scaled from 1 to n as required. An LDAP entry called maildrop
tells the SMTP server (postfix) where to deliver the mail. The SMTP
servers can be loadbalanced with one of many load balancing
techniques. The program is written with scalability and High
availability in mind.
This may not be the right software for you if you want to run a small
ISP on a single box or if you want to use this software as an LDAP
editor or a DNS management software by itself.
ISPMan is written mostly in Perl and is based on four major components. All these components are based on open standards and are easily customizable.
Windows and SQL Server : Receive so much negativity in terms of the Highly Available, Scalable Platform..
I remain neutral, but time and again, when people talk Windows or SQL Server, they seem to consider them unreliable with limits around scalability, performance and availability. And then you start looking at some of the big boys you have listed here in the architectural section and most of them are on Linux, MySQL,Oracle platforms that we dont see Windows and SQL Server in there.. What are your thoughts ?
Where do you draw the line between scalability vs Performance vs High Availability vs Reliability? I guess at the end of the day, we all want to be highly available, great performance and always reliable. So is it safe to say that scalability is the answer ? Also when do you start to think scale out vs scale up ?
Update: Google added videos on Cluster Computing and MapReduce. There are five lectures: Introduction, MapReduce, Distributed File Systems, Clustering Algorithms, and Graph Algorithms. Advanced website design depends on deep distributed system design knowledge. Where do you get this knowledge? Try Google. They have a a whole Code for Educators program with tutorials and lectures on AJAX programming, distributed systems, and web security. Looks pretty nice.
Hi gurus, I'm totally new to this high scalability thing. I'm trying to create a website with scalability in mind (personal project). In my application I'll have forums for different groups of people (each group will have their own forums, members of groups can still post in other groups' forums but each group will mainly be using their forums most of the time). Now, I'm going to start with about 2000 groups with the potential of reaching up to 10000 groups (this is the maximum due to the nature of my application). I was thinking that having all posts in one table will be way too much for one table (esp. that some groups are expected to post hundreds or even thousands times per day, let's say about 500 of the groups, the rest of the groups won't be that active though) as I'll have to index the PostID, ParentPostID, GroupID and PostDate which can produce large indexes (consequentially causing slow inserts) if having everything in one table. So, I'm thinking of a way to divide the posts in many tables, here are some of the things I thought of: 1. Creating a separate table for every group e.g. ForumsPosts_x, where x is the GroupID (which has its own pros and cons, some of the pros that I can have small indexes and also use identity columns, I also assume it should be easy to move the tables to other databases should the application grow. Well, I posted this idea on some other forums and most people told me it's a sign of bad design if I have thousands of tables in my database. I was also concerned how to design my DAL if I do this. Should I use sprocs with dynamic SQL or use SQL text directly in my DAL code and what about the query plan caching if having a large number of tables .. so many problems here!) 2. Put everything in one table and if the site grows move some of the groups to another database (I'm concerned though about having many databases on the same machine, will it affect performance? of course I won't have hundreds of databases on the same machine but may be about 5 or even 10 databases on the same machine) I also have some other questions: I'm going to use ASP.NET for this project, I was planning initially to use SQL Server as a database but I'm worried about the SQL Server part and the cost of growth, should I consider an alternative like MySQL? But how will it perform with ASP.NET though in a high scalability scenario? Any suggestions are highly appreciated...
Update: A fun exploration of applied searching in How to search for the word "pen1s" in 185 emails every second. When indexOf doesn't cut it you just trie harder. Has a drunken friend ever inspired you to create a first of its kind internet service that is loved by millions, deemed subversive by thousands, all while handling over 1.2 billion emails a year on one rickity old server? That's how Paul Tyma came to build Mailinator. Mailinator is a free no-setup web service for thwarting evil spammers by creating throw-away registration email addresses. If you don't give web sites you real email address they can't spam you. They spam Mailinator instead :-) I love design with a point-of-view and Mailinator has a big giant harry one: performance first, second, and last. Why? Because Mailinator is free and that allows Paul to showcase his different perspective on design. While competitors buy big Iron to handle load, Paul uses a big idea instead: pick the right problem and create a design to fit the problem. No more. No less. The result is a perfect system architecture sonnet, beauty within the constraints of form. How does Mailinator carry out its work as a spam busting super hero? Site: http://mailinator.com/
Hi, First of all; thanks for a creating a GREAT resource on high scalability architecture. For us building high scalability solutions from the west coast of (tiny) Norway good input on the subject isn't always abundant. Which leads me to my next question; Are there any events or conferences on high scalability / SaaS in the US or internationally that any of you would recommend architects or data center managers to attend?
From Wikipedia: Hyperic HQ is a popular open source IT Operations computer system and network monitoring application software. It auto-discovers all system resources and their metrics, including hardware, operating systems, virtualization, databases, middleware, applications, and services. It watches hosts and services that you specify, alerting you when things go bad and again when they get better. It also provides historical charting and event correlation for faster problem identification. The Hyperic HQ server is a distributed J2EE application that runs on top of the open source JBoss Application Server. It is written in Java and portable C code and runs on Linux, Windows, Solaris, HP-UX and Mac OS X. Hyperic HQ Portal is a Java and AJAX User Interface that includes: * Inventory/Application Model & host hierarchy * Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP) * Monitoring of host resources (processor load, disk usage, system logs) * Remote monitoring supported through SSH or SSL encrypted tunnels. * Continuous Auto-Discovery of system resources including hardware, software and services * Track log & configuration data * Remote resource control for corrective actions such as starting and stopping services, vacuum database table, or snapshotting a VM * Ability to define event handlers to be run during service or host events for proactive problem resolution * Problem Resource Identification & Root Cause Analysis * Event Correlation * Alerting when service or host problems occur or get resolved via email, pager, Text messaging, RSS * Security/Access Control * Simple plug-in design that allows users to easily develop their own service checks depending on needs, by using the tools of choice (XML, J2EE, Bash, C++, Perl, Ruby, Python, PHP, C#, etc.) I met Javier Soltero, the CEO of Hyperic at the Velocity Web Performance and Operations dinner. I hit him with my best stuff and he didn't flinch a bit. Javier showed a deep understanding of the issues, a real passion for his product and the space, and the knowing good humor of someone who has been through a few wars and learned a little something along the way. I don' know if that translates to an excellent product, but it would at least make me take a look.