SPARCcluster HA Server Software Administration Guide

Introduction

1

This chapter offers a high-level overview of the functionality included with Solstice HA. The interactions between the various parts of Solstice HA are also discussed.
Use the following list to locate specific information in this chapter:
  • Solstice HA Overview
  • Elements of Solstice HA
  • Solstice HA Commands
  • System Files Associated With Solstice HA

1.1 Solstice HA Overview

Solstice HA is an unbundled software product that supports specific dual-server hardware configurations. When properly configured, the hardware and software together provide highly available data services.
Solstice HA provides an environment in which data services remain available after any single hardware or software point of failure has occurred in the configuration.
The hardware configuration is called SPARCcluster High Availability, while the software is referred to as Solstice High Availability. The configuration includes two servers, multi-host disks, Solstice DiskSuite software, and the Solstice HA software. The data services available include Solstice HA-NFS and Solstice HA-ORACLE.

1.1.1 Hardware Overview

Each server in a Solstice HA configuration has two or more disks which are accessible only from that server. These are called local disks. They contain the Solaris software environment, the Solstice HA packages, and optionally other local data.
Disks in the configuration that are accessible from either of the servers are called multi-host disks. Multi-host disks are organized into one or two disksets during configuration. These disksets contain the data (information) for highly available services.
The servers in the configuration communicate via two private network connections. Solstice HA configuration and status information is communicated across these links.
The servers also have one or more public network connections that provide communication to clients of the highly available services.
The servers are referred to as being siblings of each other.

1.1.2 Software Overview

The Solstice HA and Solstice DiskSuite packages and the Solaris 2.4 distribution are installed on both servers in the configuration.
The Solstice HA software has the following components:
  • Membership monitor
  • Fault monitor
  • Programs used by the membership monitor and fault monitor
  • Various administration commands
  • Solstice DiskSuite software package
The membership monitor, fault monitor, and associated programs allow one server to take over when hardware or software fails.
When a takeover occurs, the server assuming control becomes the I/O master for the failed server's disksets and redirects the clients of the failed server to itself. The takeover also includes actions that are specific to the HA-NFS and HA-ORACLE data services.
Administrators can use the same mechanism to manually direct one server to take over the data services for the sibling server. This is referred to as switchover and is performed using the haswitch(1M) command.
A switchover allows administrators to take a server offline for maintenance and to bring a previously offline server back online.
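The relationship between takeover and switchover can be pictured with a small model of diskset mastership. This is a sketch only: the server and diskset names are hypothetical, and the functions illustrate the bookkeeping described above, not how haswitch(1M) or the takeover programs are implemented.

```python
# Toy model of diskset mastership in a two-server configuration.
# All names are hypothetical; this is not the Solstice HA implementation.

def switchover(masters, diskset, target):
    """Return a new mastership map in which `target` masters `diskset`."""
    if diskset not in masters:
        raise KeyError(f"unknown diskset: {diskset}")
    updated = dict(masters)
    updated[diskset] = target
    return updated


def takeover(masters, failed, survivor):
    """Return a new map in which `survivor` masters every diskset of `failed`."""
    return {ds: (survivor if owner == failed else owner)
            for ds, owner in masters.items()}


# Symmetric configuration: each server masters one diskset.
masters = {"diskset_a": "server_a", "diskset_b": "server_b"}

# Take server_a offline for maintenance, then bring it back online.
offline = switchover(masters, "diskset_a", "server_b")
restored = switchover(offline, "diskset_a", "server_a")
```

In this model a takeover and a switchover differ only in what triggers them: a failure in the first case, an administrator in the second.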
Solstice DiskSuite is required for Solstice HA operations. Solstice DiskSuite provides the following functionality:
  • Diskset management
  • Disk mirroring
  • Disk concatenation
  • Disk striping
  • Hot spare pool device management
  • UNIX file system logging
  • Management of metadevice status and configuration of database replicas
After the Solaris software environment is installed, Solstice DiskSuite and Solstice HA software are installed on each server's local disks. The hasetup(1M) command provides an interface to set up the configuration. Either Solstice DiskSuite commands or the Solstice DiskSuite Tool (metatool(1M)) graphical user interface can then be used to create concatenations, stripes, mirrors, hot spares, and UFS logs. Solstice DiskSuite Tool provides an interface that makes large disk configurations easier to manage.
Administrators will use the hastat(1M), haload(1M), and metastat(1M) commands to monitor the status of the Solstice High Availability configuration and the Solstice DiskSuite metadevices.

1.2 Elements of Solstice HA

The Solstice HA product consists of a set of programs which provide the ability to:
  • Monitor the Solstice HA configuration
  • Configure the software in the Solstice HA configuration
  • Monitor the services running on Solstice HA servers
  • Monitor the status of the Solstice HA configurations
  • Manipulate disksets

Figure 1-1

Figure 1-1 illustrates how Solstice HA fits on top of Solstice DiskSuite and Solaris 2.4. Discussions of each of these elements are given in the following subsections.

1.2.1 Solstice High Availability

Solstice High Availability is a software and hardware package that enables two servers to act as a highly available data facility. Solstice HA is built on Solstice DiskSuite, which provides mirroring, concatenation, striping, hot spares, and UFS logging.
Each Solstice HA server acts as an I/O master for its respective diskset and runs data services that export data on that diskset.
In a symmetric configuration, each server is also a backup for the sibling server's data services. Solstice HA provides programs used by each server to monitor the status of data services running on itself as well as the data services running on the sibling machine in the configuration.
Solstice HA automates the decision to take over when the sibling server has a software or hardware failure. Takeover processing includes common actions, such as assuming I/O mastery of the failed server's diskset and redirecting the failed server's clients to itself. Takeover also includes actions that are specific to the data service.
Solstice HA additionally provides for administratively initiated switchover, which is the graceful switch of a diskset from one functional server to the sibling to reconfigure or bring a server back online.
Solstice HA is broken into three major layers: service, command, and management, as explained in the following three subsections.

1.2.1.1 Service Layer

Solstice HA supports two data services, HA-NFS and HA-ORACLE. The HA-ORACLE data service uses the ORACLE version 7.0 relational database management system (DBMS).

1.2.1.2 Command Layer

Solstice HA provides utilities for configuring and administering the highly available data facility. These utilities allow:
  • Configuring of the two Solstice HA servers (hasetup(1M))
  • Checking the configuration to ensure high availability (hacheck(1M))
  • Editing the vfstab.logicalhost and dfstab.logicalhost files (hafstab(1M))
  • Monitoring the status of the configuration (hastat(1M))
  • Transferring the data services from one server in the configuration to the sibling (haswitch(1M))
  • Monitoring the load on the servers (haload(1M))
  • Verifying the HA-ORACLE installation (haoracle(1M))

1.2.1.3 Management Layer

The Solstice HA management layer includes the membership monitor and the fault monitor.
The membership monitor detects which of the two servers in the Solstice HA configuration is running and which of the two servers has failed.
The principal functions of the membership monitor are to make sure the servers are in sync and to coordinate the configuration of the applications and services when the state of the configuration changes.
The membership monitor provides the following features:
  • Reliability. No single point of failure in the servers can cause membership monitor failure.
  • Fault detection. Detection of a server crash within the Solstice HA configuration.
  • Server removal. Removal of the failed server from the Solstice HA configuration using a reliable fail-fast mechanism.
While the membership monitor detects total failure of a system in the Solstice HA configuration, the fault monitor detects failures of individual services.
The fault monitor consists of the fault daemon and the programs used to probe various parts of the data service. These probes are executed periodically by the fault daemon to ensure the services are working. The types of probes include:
  • Probes of both the public and private networks
  • Probes of both the local and remote exported NFS service
  • Probes of both the local and remote ORACLE service
If the probe detects a service failure, the fault monitor may try to restart the service. If the service does not restart, the fault monitor probe initiates a takeover.
For HA-NFS service, the fault monitor checks the availability of each of the exported highly available NFS file systems.
Under certain circumstances the fault monitor will not initiate a takeover even though there has been an interruption of a service. These interruptions can include:
  • The exported NFS file system is being checked with fsck(1M).
  • The NFS file system is locked via lockfs(1M).
  • The name service is not working. Because the client side of HA-NFS depends on the name service database (NIS or NIS+), the HA-NFS services are only as reliable as the name service. The name service exists outside the Solstice HA configuration, so you must take the necessary measures to ensure its reliability. These measures may include use of an uninterruptible power supply (UPS) on the name service servers. Refer to the SPARCcluster High Availability Server Service Manual for additional information. Because of the changes to /etc/nsswitch.conf, the server side of HA-NFS has a network name service dependency only on netgroup.
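The decision flow described above (probe, attempt a restart, then take over, unless one of the listed conditions applies) can be summarized in a short sketch. The function name, parameters, and return strings are hypothetical and chosen for illustration; the real fault monitor logic is not documented here and may differ.

```python
def fault_monitor_action(probe_ok, restart_ok=False, fsck_running=False,
                         fs_locked=False, name_service_down=False):
    """Return the action a fault monitor might take after one probe.

    Hypothetical model of the behavior described in the text:
      - a healthy probe needs no action;
      - an interruption caused by fsck, lockfs, or a name service outage
        does not lead to a takeover;
      - otherwise a restart is tried first, and a takeover follows only
        if the restart fails.
    """
    if probe_ok:
        return "none"
    if fsck_running or fs_locked or name_service_down:
        return "no-takeover"
    if restart_ok:
        return "restarted"
    return "takeover"
```

For example, `fault_monitor_action(False, fs_locked=True)` models a file system locked with lockfs(1M): the probe fails, but no takeover is initiated.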

Note - Do not change any of the programs or files associated with the fault monitor daemon or probe. You can, however, change some of the parameters using Solstice HA commands.

1.2.2 Solstice DiskSuite

Solstice DiskSuite 4.0 is a software package that offers a metadisk driver and several UNIX file system enhancements that provide better performance, greater capacity, and improved availability.
The metadisk driver is the basic element of the Solstice DiskSuite product. This driver is implemented as a set of loadable, pseudo device drivers. The metadisk driver uses other physical device drivers to pass I/O requests to and from the underlying devices.
An overview of the metadisk driver elements is presented in the following subsections. For a complete discussion, refer to the Solstice DiskSuite 4.0 Administration Guide, which is included with the Solstice High Availability product.

1.2.2.1 Metadevices

Metadevices are the basic functional unit of the metadisk driver. After you create metadevices, you can use them like physical disk slices. A metadevice can be made up of one or more slices, and can be configured to use a single device, a concatenation of stripes, or a stripe of devices.

1.2.2.2 Metadevice State Database Replicas

Metadevice state database replicas provide the nonvolatile memory necessary to keep track of configuration and status information for mirrors, submirrors, concatenations, stripes, UFS logs, and hot spares. The replicas also keep track of error conditions that have occurred. A majority of metadevice state database replicas must be preserved in the event a disk or SPARCstorage Array chassis fails.
The metaset command automatically places the replicas on disks in the disksets. You must have at least three controllers attached to a server; with fewer, a single controller failure can leave you without a majority of replicas.
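The reasoning behind the three-controller requirement can be checked with simple arithmetic. The sketch below assumes that "majority" means strictly more than half of all replicas and that replicas are counted per controller; the function name and the controller names are hypothetical.

```python
def survives_any_controller_failure(replicas_per_controller):
    """Return True if a strict majority of replicas remains after the
    failure of any single controller.

    `replicas_per_controller` maps a controller name to the number of
    metadevice state database replicas on disks behind that controller.
    """
    total = sum(replicas_per_controller.values())
    if total == 0:
        return False
    for lost in replicas_per_controller.values():
        remaining = total - lost
        if remaining * 2 <= total:  # half or fewer: no majority
            return False
    return True


two_controllers = {"c1": 2, "c2": 2}
three_controllers = {"c1": 2, "c2": 2, "c3": 2}
```

With replicas spread evenly over two controllers, losing either one leaves exactly half of the replicas, which is not a majority. With three controllers, two thirds remain.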

1.2.2.3 Disksets

A diskset is a pair of hosts and disk drives in which all the drives must be accessible by both hosts. There are one or two disksets in a Solstice HA configuration.
Only one server can master a diskset at any point in time. There is one metadevice state database per diskset and one per local diskset. You are instructed to create three metadevice state database replicas on the local disks during installation and configuration of Solstice HA. Numerous replicas are automatically placed on the disks in each diskset. The number and placement of the replicas on disks in the disksets are automatically determined by the metaset(1M) command.

1.2.2.4 Concatenations and Stripes

Each metadevice is either a concatenation or a stripe of slices. Concatenations work much the way the cat(1) command is used to concatenate two or more files together to create one larger file. When slices are concatenated, the addressing of the component blocks is done on the components sequentially. The file system can use the entire concatenation.
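Sequential addressing can be illustrated by mapping a logical block number to a component and an offset within it. This is a minimal sketch with a hypothetical function name; it models the addressing idea only, not the metadisk driver.

```python
def concat_locate(block, component_sizes):
    """Map a logical block of a concatenation to (component index, block
    within that component). Components are filled one after another.

    `component_sizes` lists the size of each component in blocks.
    """
    if block < 0:
        raise ValueError("block number must be non-negative")
    for index, size in enumerate(component_sizes):
        if block < size:
            return index, block
        block -= size
    raise ValueError("block is beyond the end of the concatenation")
```

For a concatenation of a 100-block slice and a 50-block slice, logical blocks 0 through 99 fall on the first slice and blocks 100 through 149 on the second.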
Striping is similar to concatenation except the addressing of the metadevice blocks is interlaced on the components, rather than addressed sequentially. When stripes are defined, an interlace size may be specified. The interlace size is a number followed by k for kilobytes, m for megabytes, or b for 512-byte blocks (for example, 8m, 16k, or 512b).
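The interlace notation and interlaced addressing can be sketched as follows. The function names are hypothetical, and the sketch assumes that k means 1024 bytes and m means 1024 × 1024 bytes, which the text does not state explicitly; b is 512 bytes as given above.

```python
# Bytes per unit suffix. The k and m values are assumptions (binary units).
UNIT_BYTES = {"k": 1024, "m": 1024 * 1024, "b": 512}


def parse_interlace(text):
    """Convert an interlace size such as '16k', '8m', or '512b' to bytes."""
    number, unit = text[:-1], text[-1:].lower()
    if unit not in UNIT_BYTES or not number.isdigit() or int(number) == 0:
        raise ValueError(f"invalid interlace size: {text!r}")
    return int(number) * UNIT_BYTES[unit]


def stripe_locate(block, components, interlace_blocks):
    """Map a logical block of a stripe to (component index, block within
    that component). Consecutive chunks of `interlace_blocks` blocks are
    placed on successive components in round-robin order.
    """
    chunk, within = divmod(block, interlace_blocks)
    row, component = divmod(chunk, components)
    return component, row * interlace_blocks + within
```

With two components and an interlace of four blocks, logical blocks 0 through 3 land on the first component, blocks 4 through 7 on the second, and block 8 returns to the first component at offset 4.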

1.2.2.5 Mirrors

All multi-host data must be placed on mirrored metadevices. This is necessary for the server to tolerate single-component failures.
To set up mirroring, you first create a metamirror. A metamirror is a special type of metadevice made up of one or more other metadevices. Each metadevice within a metamirror is called a submirror.

1.2.2.6 Hot Spares

The hot spare facility enables automatic replacement of failed submirror components, provided that a suitable spare component is available and reserved. Hot spares are temporary fixes, used until failed components are either repaired or replaced. Hot spares provide further security from downtime due to hardware failures.
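The idea of a "suitable spare" can be sketched as a lookup in a pool. The sketch assumes that a spare is suitable when it is at least as large as the failed component, and it picks the smallest such spare; both the suitability rule and the selection policy are assumptions made for illustration, and the names are hypothetical.

```python
def assign_hot_spare(failed_size, pool):
    """Pick a spare for a failed submirror component.

    `pool` maps spare names to sizes in blocks. Returns the name of the
    smallest spare that is at least `failed_size` blocks and removes it
    from the pool, or returns None if no spare is large enough.
    """
    candidates = [(size, name) for name, size in pool.items()
                  if size >= failed_size]
    if not candidates:
        return None
    _, name = min(candidates)
    del pool[name]
    return name


pool = {"spare_small": 1000, "spare_large": 4000}
first = assign_hot_spare(2000, pool)   # uses the only spare big enough
second = assign_hot_spare(2000, pool)  # no suitable spare is left
```

The second call shows why a hot spare is only a temporary fix: once the pool has no suitable spare, a further failure is not covered until the failed component is repaired or replaced.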

1.2.2.7 UNIX File System Logging

UFS logging records UFS updates in a log before the updates are applied to the file system. UFS logging speeds up reboots, provides faster local directory operations, and decreases the time necessary for synchronous disk writes.
UFS logging eliminates file system checking at boot time because changes from unfinished system calls are discarded. A pseudo device, called the trans device, manages the contents of the log. Deciding on the placement of the log on multi-host disks in Solstice HA configurations is very important, because selecting the wrong location can decrease performance.
When using UFS logs in Solstice HA configurations, follow these guidelines:
  • Set up one log per file system. Logs should not be shared between file systems.
  • If you have heavy writing activity on a file system, use separate disks for the log and master.
  • The recommended size for a UFS log is one Mbyte per 100 Mbytes of file system size (one percent). The maximum useful log size is 64 Mbytes, which is reached for file systems of 6.4 Gbytes or larger.
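The sizing guideline reduces to a one-line calculation. The function name below is hypothetical; the one percent ratio and the 64 Mbyte ceiling come from the guideline above.

```python
def recommended_log_mbytes(fs_mbytes):
    """Return the recommended UFS log size in Mbytes for a file system of
    `fs_mbytes` Mbytes: one percent of the file system, capped at 64."""
    if fs_mbytes <= 0:
        raise ValueError("file system size must be positive")
    return min(fs_mbytes / 100, 64)
```

A 1000 Mbyte file system gets a 10 Mbyte log, while any file system of 6400 Mbytes or more gets the 64 Mbyte maximum.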

1.3 Solstice HA Commands

This subsection describes the commands associated with Solstice HA. A printed copy of each man page is provided in Appendix B, "Man Pages."
  • hacheck(1M) - Validates Solstice HA configurations. This program ensures the configuration has been set up correctly and that the software and hardware provide highly available data services.
  • hafstab(1M) - Provides a method of editing and distributing dfstab(4) and vfstab(4) files to the two servers in a Solstice HA configuration.
  • haload(1M) - Monitors the load on the pair of Solstice HA servers. Monitoring is necessary because there must be some excess capacity between the two servers. If there is no excess capacity and a takeover occurs, the remaining server will be unable to take care of the combined workload.
  • haoracle(1M) - Verifies the HA-ORACLE installation. The haoracle command is also used to maintain the list of monitored databases in the HA-ORACLE configuration file, haoracle_databases(4).
  • hasetup(1M) - Provides an interface that allows initial configuration of the Solstice HA servers. The information entered on one of the Solstice HA servers is automatically updated on the other server. hasetup first attempts to discover most information about the configuration without user input.
You are asked about additional public network names, the type of configuration (symmetric or asymmetric), the data services being used, space for UFS logging, and placement of disks in disksets. The program then updates the Solstice HA configuration files with the information.
  • hastat(1M) - Displays the current state of the Solstice HA configuration. In the default mode, the status of all the components of the configuration is displayed only once; the program then exits.
  • haswitch(1M) - Transfers the specified diskset along with its associated data services and IP addresses to the specified sibling server.
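The excess-capacity requirement mentioned under haload(1M) can be expressed as a simple check. This sketch assumes two identical servers and loads measured as a percentage of one server's capacity; the function name and the measure of load are hypothetical and are not taken from haload itself.

```python
def can_absorb_takeover(load_a_percent, load_b_percent, capacity_percent=100):
    """Return True if one server could carry the combined workload of both.

    Loads are percentages of a single server's capacity. After a takeover
    the surviving server must handle the sum of both loads.
    """
    return load_a_percent + load_b_percent <= capacity_percent
```

Two servers running at 40 and 50 percent leave enough headroom for a takeover; two servers running at 70 and 50 percent do not.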
You also use Solstice DiskSuite commands when performing administration procedures on Solstice HA configurations. These man pages are included with the Solstice DiskSuite distribution. Printed copies of the man pages can be found in the Solstice DiskSuite 4.0 Administration Guide.

1.4 System Files Associated With Solstice HA

There are several system files associated with Solstice HA. You can edit the md.tab, vfstab.logicalhost, and dfstab.logicalhost files. Do not edit the other files.
  • /etc/opt/SUNWmd/md.tab - This file is used by the metainit and metadb commands as an optional input file. Each metadevice must have a unique entry in the file. Tabs, spaces, comments (using the pound sign (#) character), and line continuations (using the backslash (\) character) can be used in the file.

Note - The md.tab file is not automatically updated when the configuration is changed, so you must manually change the md.tab file. This manual does provide instructions and advice on using the md.tab file for initial configuration of Solstice DiskSuite metadevices.

  • /etc/opt/SUNWhadf/hadf/vfstab.logicalhost - In a symmetric configuration, there are two instances of this file, one for each logical host. An asymmetric configuration has only one instance of this file. The vfstab.logicalhost files list the file systems mounted for the logical host.
  • /etc/opt/SUNWhadf/nfs/dfstab.logicalhost - In a symmetric configuration there are two instances of this file, one for each logical host. An asymmetric configuration has only one instance of this file. This file is present only if you are running HA-NFS.
  • /etc/opt/SUNWhadf/hadf/cmm_confcdb - This file contains configuration information for the membership monitor. Among other things, it identifies the two hosts of a Solstice HA configuration, private network connections, and membership monitor states and transitions.
  • /usr/opt/SUNWhadf/hadf/hadfconfig - This file is read by the reconfiguration programs as part of the initial step of membership reconfiguration.
  • /logicalhost/statmon - The statmon directory contains files that record lock manager and status monitor states when HA-NFS is running on a logical host.
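The md.tab conventions described above (pound-sign comments, backslash line continuations, and tabs or spaces as separators) can be illustrated with a small reader. This is a sketch, not the parser used by metainit or metadb; the function name is hypothetical and the sample entries are illustrative only and do not come from a real configuration.

```python
def logical_lines(text):
    """Yield each md.tab entry as a list of fields.

    Comments starting with '#' are dropped, lines ending in a backslash
    are joined with the following line, and blank lines are skipped.
    """
    pending = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if fields:
            yield fields
    if pending.split():
        yield pending.split()


# Illustrative entries only; the device names are made up.
sample = """# example md.tab fragment
d10 1 2 c1t0d0s0 \\
        c2t0d0s0 -i 32k
d11 1 1 c1t1d0s0   # trailing comment
"""

entries = list(logical_lines(sample))
```

Each item in `entries` is one logical entry, so a check such as "every metadevice name appears only once" can be written against the first field of each item.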