Chapter 1 Introduction to Administering Sun Cluster
This chapter provides information on preparing to administer the cluster and the procedures for using Sun Cluster administration tools.
Administering Sun Cluster Overview
Sun Cluster's highly-available environment ensures that critical applications are available to end users. The system administrator's job is to make sure that Sun Cluster is stable and operational.
Familiarize yourself with the planning information in the Sun Cluster Software Installation Guide for Solaris OS and the Sun Cluster Concepts Guide for Solaris OS before beginning administration tasks. Sun Cluster administration is organized into tasks among the following manuals.
For the most part, you can perform Sun Cluster administration tasks while the cluster is operational, with the impact on cluster operation limited to a single node. For procedures that require the entire cluster to be shut down, schedule the downtime outside normal working hours to minimize the impact on the system. If you plan to take down the cluster or a cluster node, notify users ahead of time.
Administration Tools
You can perform administrative tasks on Sun Cluster by using a Graphical User Interface (GUI) or by using the command line. The following section provides an overview of the GUI and command-line tools.
Graphical User Interface
Sun Cluster supports Graphical User Interface (GUI) tools that you can use to perform various administrative tasks on your cluster.
These GUI tools are SunPlex™ Manager and, if you are using Sun Cluster on a SPARC-based system, Sun Management Center. See Chapter 10, Administering Sun Cluster With the Graphical User Interfaces, for more information and for procedures about configuring SunPlex Manager and Sun Management Center. For specific information about how to use these tools, see the online help for each GUI.
Command-line Interface
You can perform most Sun Cluster administration tasks interactively through the scsetup(1M) utility. Whenever possible, administration procedures in this guide are described using scsetup.
You can administer the following Main Menu items through the scsetup utility.
You can administer the following Resource Group Menu items through the scsetup utility.
- Create a resource group
- Add a network resource to a resource group
- Add a data service resource to a resource group
- Online/Offline or Switchover a resource group
- Enable/Disable a resource
- Change properties of a resource group
- Change properties of a resource
- Remove a resource from a resource group
- Remove a resource group
- Clear the stop_failed error flag from a resource
Table 1-1 lists other commands that you use to administer Sun Cluster. See the man pages for more detailed information.
Table 1–1 Sun Cluster Command-Line Interface Commands
| Command | Description |
| ccp(1M) | Starts remote console access to the cluster. |
| if_mpadm(1M) | Switches IP addresses from one adapter to another in an IP Network Multipathing group. |
| sccheck(1M) | Checks and validates the Sun Cluster configuration to ensure that the basic configuration required for a cluster to function is in place. |
| scconf(1M) | Updates a Sun Cluster configuration. The -p option lists cluster configuration information. |
| scdidadm(1M) | Provides administrative access to the device ID configuration. |
| scgdevs(1M) | Runs the global device namespace administration script. |
| scinstall(1M) | Installs and configures Sun Cluster software. The command can be run interactively or non-interactively. The -p option displays release and package version information for the Sun Cluster software. |
| scrgadm(1M) | Manages the registration of resource types, the creation of resource groups, and the activation of resources within a resource group. The -p option displays information on installed resources, resource groups, and resource types. Note: Resource type, resource group, and resource property names are case insensitive when executing scrgadm. |
| scsetup(1M) | Runs the interactive cluster configuration utility, which generates the scconf command and its various options. |
| scshutdown(1M) | Shuts down the entire cluster. |
| scstat(1M) | Provides a snapshot of the cluster status. |
| scswitch(1M) | Performs changes that affect node mastery and states for resource groups and disk device groups. |
In addition, use commands to administer the volume manager portion of Sun Cluster. These commands depend on the specific volume manager used in your cluster: Solstice DiskSuite™, VERITAS Volume Manager, or Solaris Volume Manager™.
Preparing to Administer the Cluster
This section describes what to do to prepare for administering your cluster.
Documenting a Sun Cluster Hardware Configuration
Document the hardware aspects that are unique to your site as your Sun Cluster configuration is scaled. Refer to your hardware documentation when you change or upgrade the cluster to save administration labor. Labeling cables and connections between the various cluster components can also make administration easier.
Reduce the time required by a third-party service provider when servicing your cluster by keeping records of your original cluster configuration and subsequent changes.
Using an Administrative Console
You can use a dedicated SPARC workstation, known as the administrative console, to administer the active cluster. Typically, you install and run the Cluster Control Panel (CCP) and graphical user interface (GUI) tools on the administrative console. For more information on the CCP, see How to Log In to Sun Cluster Remotely. For instructions on installing the Cluster Control Panel module for Sun Management Center and SunPlex Manager GUI tools, see the Sun Cluster Software Installation Guide for Solaris OS.
The administrative console is not a cluster node. The administrative console is used for remote access to the cluster nodes, either over the public network or through a network-based terminal concentrator.
If your SPARC cluster consists of a Sun Enterprise™ 10000 server, you must log in from the administrative console to the System Service Processor (SSP). Connect using the netcon(1M) command. The default method for netcon to connect with a Sun Enterprise 10000 domain is through the network interface. If the network is inaccessible, you can use netcon in “exclusive” mode by setting the -f option. You can also send ~* during a normal netcon session. Either of these solutions gives you the option of toggling to the serial interface if the network becomes unreachable.
Sun Cluster does not require a dedicated administrative console, but using a console provides these benefits:
Backing Up the Cluster
Back up your cluster on a regular basis. Even though Sun Cluster provides an HA environment, with mirrored copies of data on the storage devices, Sun Cluster is not a replacement for regular backups. Sun Cluster can survive multiple failures, but does not protect against user or program error, or catastrophic failure. Therefore, you must have a backup procedure in place to protect against data loss.
The following information should be included as part of your backup.
- All file system partitions
- All database data if you are running DBMS data services
- Disk partition information for all cluster disks
- The md.tab file if you are using Solstice DiskSuite/Solaris Volume Manager as your volume manager
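The last two items in the list can be captured with a few lines of shell. The following is a minimal sketch, not a complete backup procedure. The function names, the backup directory, and the example device and md.tab paths are assumptions for illustration; prtvtoc(1M) is the standard Solaris command for printing a disk's partition table.

```shell
#!/bin/sh
# Copy the volume manager's md.tab file into a backup directory,
# adding the current date to the file name.
# Usage: save_md_tab MD_TAB_PATH BACKUP_DIR
save_md_tab() {
    src=$1
    dest=$2
    [ -f "$src" ] || { echo "md.tab not found: $src" >&2; return 1; }
    mkdir -p "$dest" && cp -p "$src" "$dest/md.tab.$(date +%Y%m%d)"
}

# Save the partition table (VTOC) of one raw disk device.
# Usage: save_vtoc RAW_DEVICE BACKUP_DIR
save_vtoc() {
    dev=$1
    dest=$2
    mkdir -p "$dest" && prtvtoc "$dev" > "$dest/$(basename "$dev").vtoc"
}

# Example invocation on a cluster node (all paths are assumptions):
#   save_md_tab /etc/lvm/md.tab /var/tmp/cluster-backup
#   save_vtoc /dev/rdsk/c0t0d0s2 /var/tmp/cluster-backup
```

Run save_vtoc once for each cluster disk, and store the resulting files with the rest of your backup rather than on the cluster itself.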
Beginning to Administer the Cluster
Table 1–2 provides a starting point for administering your cluster.
Table 1–2 Sun Cluster 3.1 4/04 Administration Tools
How to Log In to Sun Cluster Remotely
The Cluster Control Panel (CCP) provides a launch pad for cconsole(1M), crlogin(1M), and ctelnet(1M) tools. All three tools start a multiple-window connection to a set of specified nodes. The multiple-window connection consists of a host window for each of the specified nodes and a common window. Input to the common window is sent to each of the host windows, allowing you to run commands simultaneously on all nodes of the cluster. See the ccp(1M) and cconsole(1M) man pages for more information.
- Verify that the following prerequisites are met before starting the CCP.
  - Make sure the PATH variable on the administrative console includes the Sun Cluster tools directory, /opt/SUNWcluster/bin, as well as /usr/cluster/bin. You can specify an alternate location for the tools directory by setting the $CLUSTER_HOME environment variable.
  - Configure the clusters file, the serialports file, and the nsswitch.conf file if using a terminal concentrator. The files can be either /etc files or NIS/NIS+ databases. See clusters(4) and serialports(4) for more information.
- Determine if you have a Sun Enterprise 10000 server platform.
- Start the CCP launch pad.
  From the administrative console, run the ccp(1M) command.
  The CCP launch pad is displayed.
- To start a remote session with the cluster, click either the cconsole, crlogin, or ctelnet icon in the CCP launch pad.
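The PATH prerequisite and the launch step can be scripted on the administrative console. The following is a sketch only: the variable CLUSTER_TOOL_DIRS and the function start_ccp are names invented for this example, and the cluster name passed to ccp is a placeholder that you supply from your own clusters file.

```shell
#!/bin/sh
# Directories named in the prerequisites above.
CLUSTER_TOOL_DIRS="/opt/SUNWcluster/bin /usr/cluster/bin"

# Append each directory to PATH unless it is already listed.
for dir in $CLUSTER_TOOL_DIRS; do
    case ":$PATH:" in
        *":$dir:"*) ;;                # already on PATH
        *) PATH="$PATH:$dir" ;;
    esac
done
export PATH

# Start the CCP launch pad for one cluster, if ccp is installed.
# Usage: start_ccp CLUSTER_NAME   (the name is a placeholder you supply)
start_ccp() {
    if command -v ccp >/dev/null 2>&1; then
        ccp "$1" &
    else
        echo "ccp not found on PATH" >&2
        return 1
    fi
}
```

Placing the PATH lines in the administrator's shell startup file avoids repeating them in every session.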
Where to Go From Here
You can also start cconsole, crlogin, or ctelnet sessions from the command line.
How to Access the scsetup Utility
The scsetup(1M) utility enables you to interactively configure quorum, resource group, cluster transport, private hostname, device group, and new node options for the cluster.
- Become superuser on any node in the cluster.
- Start the scsetup utility.
  # scsetup
  The Main Menu is displayed.
- Make your configuration selection from the menu. Follow the onscreen instructions to complete a task.
See the scsetup online help for more information.
How to Display Sun Cluster Release and Version Information
You do not need to be logged in as superuser to perform these procedures.
Display the Sun Cluster patch numbers.
% showrev -p
Sun Cluster update releases are identified by the main product patch number plus the update version.
Display the Sun Cluster release number and version strings for all Sun Cluster packages.
% scinstall -pv
Example—Displaying the Sun Cluster Release Number
The following example displays the cluster's release number.
% showrev -p | grep 110648
Patch: 110648-05 Obsoletes: Requires: Incompatibles: Packages:
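Because update releases are identified by the patch number plus its revision, it can be convenient to extract just the revision. The following sketch reads showrev -p style output on standard input. The function name patch_revision is an assumption, and the code assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk rather than the legacy /usr/bin/awk).

```shell
#!/bin/sh
# Print the revision of a given patch from "showrev -p" style input on stdin.
# Usage: showrev -p | patch_revision 110648
patch_revision() {
    awk -v id="$1" '$1 == "Patch:" {
        split($2, parts, "-")          # "110648-05" -> "110648", "05"
        if (parts[1] == id) print parts[2]
    }'
}

# Demonstration with the sample line from the example above.
printf 'Patch: 110648-05 Obsoletes: Requires: Incompatibles: Packages:\n' |
    patch_revision 110648
```

If several revisions of the same patch are installed, the function prints one line per revision.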
Example—Displaying Sun Cluster Release and Version Information
The following example displays the cluster's release information and version information for all packages.
% scinstall -pv
SunCluster 3.1
SUNWscr: 3.1.0,REV=2000.10.01.01.00
SUNWscdev: 3.1.0,REV=2000.10.01.01.00
SUNWscu: 3.1.0,REV=2000.10.01.01.00
SUNWscman: 3.1.0,REV=2000.10.01.01.00
SUNWscsal: 3.1.0,REV=2000.10.01.01.00
SUNWscsam: 3.1.0,REV=2000.10.01.01.00
SUNWscvm: 3.1.0,REV=2000.10.01.01.00
SUNWmdm: 4.2.1,REV=2000.08.08.10.01
How to Display Configured Resource Types, Resource Groups, and Resources
You can also accomplish this procedure by using the SunPlex Manager GUI. Refer to Chapter 10, Administering Sun Cluster With the Graphical User Interfaces. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
Display the cluster's configured resource types, resource groups, and resources.
% scrgadm -p
Example—Displaying Configured Resource Types, Resource Groups, and Resources
The following example shows the resource types (RT Name), resource groups (RG Name), and resources (RS Name) configured for the cluster schost.
% scrgadm -p
RT Name: SUNW.SharedAddress
RT Description: HA Shared Address Resource Type
RT Name: SUNW.LogicalHostname
RT Description: Logical Hostname Resource Type
RG Name: schost-sa-1
RG Description:
RS Name: schost-1
RS Description:
RS Type: SUNW.SharedAddress
RS Resource Group: schost-sa-1
RG Name: schost-lh-1
RG Description:
RS Name: schost-3
RS Description:
RS Type: SUNW.LogicalHostname
RS Resource Group: schost-lh-1
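When a cluster has many resources, a short filter can reduce this listing to a resource-to-group map. The sketch below assumes the label layout shown in the example above (RS Name followed later by RS Resource Group); the function name resource_groups_by_resource is invented for illustration, and output from your release may use different labels.

```shell
#!/bin/sh
# Read "scrgadm -p" style output on stdin and print one line per resource:
# resource name, a tab, then the resource group that contains it.
resource_groups_by_resource() {
    awk -F': *' '
        /^[ \t]*RS Name:/           { name = $2 }
        /^[ \t]*RS Resource Group:/ { printf "%s\t%s\n", name, $2 }'
}

# Demonstration with lines taken from the example above.
resource_groups_by_resource <<'EOF'
RG Name: schost-sa-1
RS Name: schost-1
RS Type: SUNW.SharedAddress
RS Resource Group: schost-sa-1
RG Name: schost-lh-1
RS Name: schost-3
RS Type: SUNW.LogicalHostname
RS Resource Group: schost-lh-1
EOF
```

On a live cluster the same function would be fed directly, for example with scrgadm -p piped into it.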
How to Check the Status of Cluster Components
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
Check the status of cluster components.
% scstat -p
Example—Checking the Status of Cluster Components
The following example provides a sample of status information for cluster components returned by scstat(1M).
% scstat -p
-- Cluster Nodes --
Node name Status
--------- ------
Cluster node: phys-schost-1 Online
Cluster node: phys-schost-2 Online
Cluster node: phys-schost-3 Online
Cluster node: phys-schost-4 Online
------------------------------------------------------------------
-- Cluster Transport Paths --
Endpoint Endpoint Status
-------- -------- ------
Transport path: phys-schost-1:qfe1 phys-schost-4:qfe1 Path online
Transport path: phys-schost-1:hme1 phys-schost-4:hme1 Path online
...
------------------------------------------------------------------
-- Quorum Summary --
Quorum votes possible: 6
Quorum votes needed: 4
Quorum votes present: 6
-- Quorum Votes by Node --
Node Name Present Possible Status
--------- ------- -------- ------
Node votes: phys-schost-1 1 1 Online
Node votes: phys-schost-2 1 1 Online
...
-- Quorum Votes by Device --
Device Name Present Possible Status
----------- ------- -------- ------
Device votes: /dev/did/rdsk/d2s2 1 1 Online
Device votes: /dev/did/rdsk/d8s2 1 1 Online
...
-- Device Group Servers --
Device Group Primary Secondary
------------ ------- ---------
Device group servers: rmt/1 - -
Device group servers: rmt/2 - -
Device group servers: schost-1 phys-schost-2 phys-schost-1
Device group servers: schost-3 - -
-- Device Group Status --
Device Group Status
------------ ------
Device group status: rmt/1 Offline
Device group status: rmt/2 Offline
Device group status: schost-1 Online
Device group status: schost-3 Offline
------------------------------------------------------------------
-- Resource Groups and Resources --
Group Name Resources
---------- ---------
Resources: test-rg test_1
Resources: real-property-rg -
Resources: failover-rg -
Resources: descript-rg-1 -
...
-- Resource Groups --
Group Name Node Name State
---------- --------- -----
Group: test-rg phys-schost-1 Offline
Group: test-rg phys-schost-2 Offline
...
-- Resources --
Resource Name Node Name State Status Message
------------- --------- ----- --------------
Resource: test_1 phys-schost-1 Offline Offline
Resource: test_1 phys-schost-2 Offline Offline
-----------------------------------------------------------------
-- IPMP Groups --
Node Name Group Status Adapter Status
--------- ----- ------ ------- ------
IPMP Group: phys-schost-1 sc_ipmp0 Online qfe1 Online
IPMP Group: phys-schost-2 sc_ipmp0 Online qfe1 Online
------------------------------------------------------------------
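The Quorum Summary in this output can be checked mechanically. In the sample, 4 of 6 possible votes are needed, which matches a simple majority (half of the possible votes, rounded down, plus one). The following sketch reads the summary lines from standard input; the function names quorum_ok and votes_needed are assumptions made for this example, and a POSIX awk is assumed.

```shell
#!/bin/sh
# Read "scstat" style output on stdin and succeed only if the number of
# quorum votes present is at least the number needed.
quorum_ok() {
    awk -F': *' '
        /Quorum votes needed:/  { needed = $2 }
        /Quorum votes present:/ { present = $2 }
        END {
            if (needed == "" || present == "") exit 2   # no summary found
            exit ((present + 0 >= needed + 0) ? 0 : 1)
        }'
}

# Majority rule: more than half of the possible votes.
votes_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Demonstration with the Quorum Summary lines from the example above.
if quorum_ok <<'EOF'
Quorum votes possible: 6
Quorum votes needed: 4
Quorum votes present: 6
EOF
then
    echo "quorum established"
fi
votes_needed 6
```

The exit status makes the function usable in a monitoring script: 0 means enough votes are present, 1 means too few, and 2 means no summary was found in the input.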
How to Check the Status of the Public Network
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
To check the status of the IP Network Multipathing groups, use the scstat(1M) command.
- Check the status of the IP Network Multipathing groups.
  % scstat -i
Example—Checking the Public Network Status
The following example provides a sample of status information for IP Network Multipathing groups returned by scstat -i.
% scstat -i
-----------------------------------------------------------------
-- IPMP Groups --
Node Name Group Status Adapter Status
--------- ----- ------ ------- ------
IPMP Group: phys-schost-1 sc_ipmp1 Online qfe2 Online
IPMP Group: phys-schost-1 sc_ipmp0 Online qfe1 Online
IPMP Group: phys-schost-2 sc_ipmp1 Online qfe2 Online
IPMP Group: phys-schost-2 sc_ipmp0 Online qfe1 Online
------------------------------------------------------------------
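To spot problems quickly, the listing can be filtered for any group or adapter that is not Online. This sketch relies on the column order shown in the example above (node, group, group status, adapter, adapter status). The function name ipmp_problems is invented for illustration, and the Offline line in the demonstration is hypothetical input, not output from a real cluster.

```shell
#!/bin/sh
# Print node, group, and adapter for every IPMP line in "scstat -i" output
# (read from stdin) whose group or adapter status is not "Online".
ipmp_problems() {
    awk '$1 == "IPMP" && $2 == "Group:" && ($5 != "Online" || $7 != "Online") {
        print $3, $4, $6
    }'
}

# Demonstration: the second line is a hypothetical failure case.
ipmp_problems <<'EOF'
-- IPMP Groups --
IPMP Group: phys-schost-1 sc_ipmp1 Online qfe2 Online
IPMP Group: phys-schost-2 sc_ipmp0 Online qfe1 Offline
EOF
```

No output means that every listed group and adapter reported Online.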
How to View the Cluster Configuration
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
View the cluster configuration.
% scconf -p
To display more information using the scconf command, use the verbose options. See the scconf(1M) man page for details.
Example—Viewing the Cluster Configuration
The following example lists the cluster configuration.
% scconf -p
Cluster name: cluster-1
Cluster ID: 0x3908EE1C
Cluster install mode: disabled
Cluster private net: 172.16.0.0
Cluster private netmask: 255.255.0.0
Cluster new node authentication: unix
Cluster new node list: <NULL - Allow any node>
Cluster nodes: phys-schost-1 phys-schost-2 phys-schost-3
phys-schost-4
Cluster node name: phys-schost-1
Node ID: 1
Node enabled: yes
Node private hostname: clusternode1-priv
Node quorum vote count: 1
Node reservation key: 0x3908EE1C00000001
Node transport adapters: hme1 qfe1 qfe2
Node transport adapter: hme1
Adapter enabled: yes
Adapter transport type: dlpi
Adapter property: device_name=hme
Adapter property: device_instance=1
Adapter property: dlpi_heartbeat_timeout=10000
...
Cluster transport junctions: hub0 hub1 hub2
Cluster transport junction: hub0
Junction enabled: yes
Junction type: switch
Junction port names: 1 2 3 4
...
Junction port: 1
Port enabled: yes
Junction port: 2
Port enabled: yes
...
Cluster transport cables
Endpoint Endpoint State
-------- -------- -----
Transport cable: phys-schost-1:hme1@0 hub0@1 Enabled
Transport cable: phys-schost-1:qfe1@0 hub1@1 Enabled
Transport cable: phys-schost-1:qfe2@0 hub2@1 Enabled
Transport cable: phys-schost-2:hme1@0 hub0@2 Enabled
...
Quorum devices: d2 d8
Quorum device name: d2
Quorum device votes: 1
Quorum device enabled: yes
Quorum device name: /dev/did/rdsk/d2s2
Quorum device hosts (enabled): phys-schost-1
phys-schost-2
Quorum device hosts (disabled):
...
Device group name: schost-3
Device group type: SVM
Device group failback enabled: no
Device group node list: phys-schost-3, phys-schost-4
Diskset name: schost-3
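The scconf -p listing is long, so extracting a single attribute per node is a common need. The following sketch pairs each node name with its private hostname, assuming the labels shown in the example above. The function name node_private_hostnames is an assumption made for this example.

```shell
#!/bin/sh
# Read "scconf -p" style output on stdin and print each cluster node name
# followed by its private hostname.
node_private_hostnames() {
    awk -F': *' '
        /Cluster node name:/     { node = $2 }
        /Node private hostname:/ { print node, $2 }'
}

# Demonstration with lines taken from the example above.
node_private_hostnames <<'EOF'
Cluster node name: phys-schost-1
Node ID: 1
Node enabled: yes
Node private hostname: clusternode1-priv
EOF
```

The same pattern works for other per-node attributes, such as the quorum vote count, by changing the second label.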
How to Validate a Basic Cluster Configuration
The sccheck(1M) command runs a set of checks to validate the basic configuration required for a cluster to function properly. If no checks fail, sccheck returns to the shell prompt. If a check fails, sccheck produces reports in either the specified or the default output directory. If you run sccheck against more than one node, sccheck produces a report for each node and a report for multi-node checks.
The sccheck command runs in two steps: data collection and analysis. Data collection can be time consuming, depending on the system configuration. You can invoke sccheck in verbose mode with the -v1 flag to print progress messages, or you can use the -v2 flag to run sccheck in highly verbose mode, which prints more detailed progress messages, especially during data collection.
Note –
Run sccheck after performing an administration procedure that might result in changes to devices, volume management components, or the Sun Cluster configuration.
- Become superuser on any node in the cluster.
- Verify the cluster configuration.
  # sccheck
Example—Checking the Cluster Configuration With All Checks Passing
The following example shows sccheck being run in verbose mode against nodes phys-schost-1 and phys-schost-2 with all checks passing.
# sccheck -v1 -h phys-schost-1,phys-schost-2
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished
#
Example—Checking the Cluster Configuration With a Failed Check
The following example shows the node phys-schost-2 in the cluster suncluster missing the mount point /global/phys-schost-1. Reports are created in the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE : Global filesystem /etc/vfstab entries are not consistent across
all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across
all nodes in this cluster.
Analysis indicates:
FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the
filesystem(s) in question.
...
#
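When several reports are produced, a one-line summary per failed check is easier to scan than the full analysis text. The sketch below assumes the report layout shown in the example above, where each failure has a CHECK ID line followed by a SEVERITY line. The function name failed_checks is invented for illustration.

```shell
#!/bin/sh
# Print "CHECK-ID SEVERITY" for each failure in one or more sccheck
# report files, or in report text read from stdin when no file is given.
failed_checks() {
    awk -F' *: *' '
        /^CHECK ID/ { id = $2 }
        /^SEVERITY/ { print id, $2 }' "$@"
}

# Demonstration with the failure from the example above.
failed_checks <<'EOF'
CHECK ID : 3065
SEVERITY : HIGH
FAILURE : Global filesystem /etc/vfstab entries are not consistent across
all Sun Cluster 3.x nodes.
EOF
```

On a cluster node the function would typically be pointed at every report in the output directory, for example failed_checks /var/cluster/sccheck/myReports/*.txt.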
How to Check the Global Mount Points
The sccheck(1M) command includes checks that examine the /etc/vfstab file for configuration errors with the cluster file system and its global mount points.
Note –
Run sccheck after making cluster configuration changes that have affected devices or volume management components.
- Become superuser on any node in the cluster.
- Verify the cluster configuration.
  # sccheck
Example—Checking the Global Mount Points
The following example shows the node phys-schost-2 of the cluster suncluster missing the mount point /global/phys-schost-1. Reports are being sent to the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE : Global filesystem /etc/vfstab entries are not consistent across
all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across
all nodes in this cluster.
Analysis indicates:
FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the
filesystem(s) in question.
...
#
# cat /var/cluster/sccheck/myReports/sccheck-results.phys-schost-1.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 1398
SEVERITY : HIGH
FAILURE : An unsupported server is being used as a Sun Cluster 3.x node.
ANALYSIS : This server may not been qualified to be used as a Sun Cluster 3.x node.
Only servers that have been qualified with Sun Cluster 3.x are supported as
Sun Cluster 3.x nodes.
RECOMMEND: Because the list of supported servers is always being updated, check with
your Sun Microsystems representative to get the latest information on what servers
are currently supported and only use a server that is supported with Sun Cluster 3.x.
...
#
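The kind of inconsistency that check 3065 reports can be illustrated by comparing the global mount points of two vfstab files directly. This is a rough sketch of the idea and not how sccheck itself works. It assumes the standard seven-field vfstab layout with the mount point in field 3 and the mount options in field 7, it needs a shell that supports $( ) and the mktemp command, and the function names and sample vfstab entry are hypothetical.

```shell
#!/bin/sh
# Print, sorted, the mount points of vfstab entries whose mount options
# include "global". Comment lines and short lines are skipped.
global_mounts() {
    awk '$1 !~ /^#/ && NF >= 7 && $7 ~ /(^|,)global(,|$)/ { print $3 }' "$1" | sort
}

# Print global mount points present in the first vfstab file but
# missing from the second.
missing_global_mounts() {
    a=$(mktemp) && b=$(mktemp) || return 1
    global_mounts "$1" > "$a"
    global_mounts "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Demonstration with a hypothetical entry that exists on one node only.
tmp=$(mktemp -d)
cat > "$tmp/vfstab.node1" <<'EOF'
/dev/md/schost-1/dsk/d0 /dev/md/schost-1/rdsk/d0 /global/phys-schost-1 ufs 2 yes global,logging
EOF
: > "$tmp/vfstab.node2"
missing_global_mounts "$tmp/vfstab.node1" "$tmp/vfstab.node2"
rm -rf "$tmp"
```

To compare real nodes, copy /etc/vfstab from each node to one machine first, then run the comparison in both directions.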