Paper: Designing Disaster Tolerant High Availability Clusters
A very detailed (339-page) paper on using HP products to build a highly available, disaster-tolerant cluster. It is somewhat dated and concentrates on HP products, but the information is still useful.
Table of contents:
1. Disaster Tolerance and Recovery in a Serviceguard Cluster
2. Building an Extended Distance Cluster Using ServiceGuard
3. Designing a Metropolitan Cluster
4. Designing a Continental Cluster
5. Building Disaster-Tolerant Serviceguard Solutions Using Metrocluster with Continuous Access XP
6. Building Disaster Tolerant Serviceguard Solutions Using Metrocluster with EMC SRDF
7. Cascading Failover in a Continental Cluster
1. Disaster Tolerance and Recovery in a Serviceguard Cluster
 Evaluating the Need for Disaster Tolerance
 What is a Disaster Tolerant Architecture?
 Types of Disaster Tolerant Clusters
 Extended Distance Clusters
 Metropolitan Cluster
 Continental Cluster
 Continental Cluster With Cascading Failover
 Disaster Tolerant Architecture Guidelines
 Protecting Nodes through Geographic Dispersion
 Protecting Data through Replication
 Using Alternative Power Sources
 Creating Highly Available Networking
 Disaster Tolerant Cluster Limitations
 Managing a Disaster Tolerant Environment
 Using this Guide with Your Disaster Tolerant Cluster Products
2. Building an Extended Distance Cluster Using ServiceGuard
 Types of Data Link for Storage and Networking
 Two Data Center Architecture
 Two Data Center FibreChannel Implementations
 Advantages and Disadvantages of a Two-Data-Center Architecture
 Three Data Center Architectures
 Rules for Separate Network and Data Links
 Guidelines on DWDM Links for Network and Data
3. Designing a Metropolitan Cluster
 Designing a Disaster Tolerant Architecture for use with Metrocluster Products
 Single Data Center
 Two Data Centers and Third Location with Arbitrator(s)
 Additional EMC SRDF Configurations
 Setting up Hardware for 1 by 1 Configurations
 Setting up Hardware for M by N Configurations
 Worksheets
 Disaster Tolerant Checklist
 Cluster Configuration Worksheet
 Package Configuration Worksheet
 Next Steps
4. Designing a Continental Cluster
 Understanding Continental Cluster Concepts
 Mutual Recovery Configuration
 Application Recovery in a Continental Cluster
 Monitoring over a Wide Area Network
 Cluster Events
 Interpreting the Significance of Cluster Events
 How Notifications Work
 Alerts
 Alarms
 Creating Notifications for Failure Events
 Creating Notifications for Events that Indicate a Return of Service
 Performing Cluster Recovery
 Notes on Packages in a Continental Cluster
 How Serviceguard commands work in a Continentalcluster
 Designing a Disaster Tolerant Architecture for use with Continentalclusters
 Mutual Recovery
 Serviceguard Clusters
 Data Replication
 Highly Available Wide Area Networking
 Data Center Processes
 Continentalclusters Worksheets
 Preparing the Clusters
 Setting up and Testing Data Replication
 Configuring a Cluster without Recovery Packages
 Configuring a Cluster with Recovery Packages
 Building the Continentalclusters Configuration
 Preparing Security Files
 Creating the Monitor Package
 Editing the Continentalclusters Configuration File
 Checking and Applying the Continentalclusters Configuration
 Starting the Continentalclusters Monitor Package
 Validating the Configuration
 Documenting the Recovery Procedure
 Reviewing the Recovery Procedure
 Testing the Continental Cluster
 Testing Individual Packages
 Testing Continentalclusters Operations
 Switching to the Recovery Packages in Case of Disaster
 Receiving Notification
 Verifying that Recovery is Needed
 Using the Recovery Command to Switch All Packages
 How the cmrecovercl Command Works
 Forcing a Package to Start
 Restoring Disaster Tolerance
 Restore Clusters to their Original Roles
 Primary Packages Remain on the Surviving Cluster
 Primary Packages Remain on the Surviving Cluster using cmswitchconcl
 Newly Created Cluster Will Run Primary Packages
 Newly Created Cluster Will Function as Recovery Cluster for All Recovery Groups
 Maintaining a Continental Cluster
 Adding a Node to a Cluster or Removing a Node from a Cluster
 Adding a Package to the Continental Cluster
 Removing a Package from the Continental Cluster
 Changing Monitoring Definitions
 Checking the Status of Clusters, Nodes, and Packages
 Reviewing Messages and Log Files
 Deleting a Continental Cluster Configuration
 Renaming a Continental Cluster
 Checking Java File Versions
 Next Steps
 Support for Oracle RAC Instances in a Continentalclusters Environment
 Configuring the Environment for Continentalclusters to Support Oracle RAC
 Initial Startup of Oracle RAC Instance in a Continentalclusters Environment
 Failover of Oracle RAC Instances to the Recovery Site
 Failback of Oracle RAC Instances After a Failover
5. Building Disaster-Tolerant Serviceguard Solutions Using Metrocluster with Continuous Access XP
 Files for Integrating XP Disk Arrays with Serviceguard Clusters
 Overview of Continuous Access XP Concepts
 PVOLs and SVOLs
 Device Groups and Fence Levels
 Creating the Cluster
 Preparing the Cluster for Data Replication
 Creating the RAID Manager Configuration
 Defining Storage Units
 Configuring Packages for Disaster Recovery
 Completing and Running a Metrocluster Solution with Continuous Access XP
 Maintaining a Cluster that uses Metrocluster/CA
 XP/CA Device Group Monitor
 Completing and Running a Continental Cluster Solution with Continuous Access XP
 Setting up a Primary Package on the Primary Cluster
 Setting up a Recovery Package on the Recovery Cluster
 Setting up the Continental Cluster Configuration
 Switching to the Recovery Cluster in Case of Disaster
 Failback Scenarios
 Maintaining the Continuous Access XP Data Replication Environment
6. Building Disaster Tolerant Serviceguard Solutions Using Metrocluster with EMC SRDF
 Files for Integrating ServiceGuard with EMC SRDF
 Overview of EMC and SRDF Concepts
 Preparing the Cluster for Data Replication
 Installing the Necessary Software
 Building the Symmetrix CLI Database
 Determining Symmetrix Device Names on Each Node
 Building a Metrocluster Solution with EMC SRDF
 Setting up 1 by 1 Configurations
 Grouping the Symmetrix Devices at Each Data Center
 Setting up M by N Configurations
 Configuring Serviceguard Packages for Automatic Disaster Recovery
 Maintaining a Cluster that Uses Metrocluster/SRDF
 Managing Business Continuity Volumes
 R1/R2 Swapping
 Building a Continental Cluster Solution with EMC SRDF
 Setting up a Primary Package on the Primary Cluster
 Setting up a Recovery Package on the Recovery Cluster
 Setting up the Continental Cluster Configuration
 Switching to the Recovery Cluster in Case of Disaster
 Failback Scenarios
 Maintaining the EMC SRDF Data Replication Environment
 R1/R2 Swapping
7. Cascading Failover in a Continental Cluster
 Overview
 Symmetrix Configuration
 Using Template Files
 Data Storage Setup
 Setting Up Symmetrix Device Groups
 Setting up Volume Groups
 Testing the Volume Groups
 Primary Cluster Package Setup
 Recovery Cluster Package Setup
 Continental Cluster Configuration
 Data Replication Procedures
 Data Initialization Procedures
 Data Refresh Procedures in the Steady State
 Data Replication in Failover and Failback Scenarios