Chapter 1 Introduction to Administering Sun Cluster
This chapter provides information on preparing to administer
the cluster and the procedures for using Sun Cluster administration tools.
Administering Sun Cluster Overview
Sun Cluster's highly available environment ensures that critical applications
are available to end users. The system administrator's job is to make sure
that Sun Cluster is stable and operational.
Familiarize yourself with the planning information in the Sun Cluster Software Installation Guide for Solaris
OS and the Sun Cluster Concepts Guide for Solaris OS before beginning
administration tasks. Sun Cluster administration is organized into tasks among
the following manuals.
For the most part, you can perform Sun Cluster administration tasks while
the cluster is operational, with the impact on cluster operation limited to
a single node. For procedures that require the entire cluster to be shut
down, schedule the downtime for off hours to minimize the impact on the
system. If you plan to take down the cluster or a cluster node, notify users
ahead of time.
Administration Tools
You can perform administrative tasks on Sun Cluster by using a graphical
user interface (GUI) or by using the command line. The following section provides
an overview of the GUI and command-line tools.
Graphical User Interface
Sun Cluster supports graphical user interface (GUI)
tools that you can use to perform various administrative tasks on your cluster.
These GUI tools are SunPlex™ Manager and,
if you are using Sun Cluster on a SPARC-based system, Sun Management Center.
See Chapter 10, Administering Sun Cluster With the Graphical User Interfaces for more information and for procedures about
configuring SunPlex Manager and Sun Management Center. For specific information
about how to use these tools, see the online help for each GUI.
Command-line Interface
You can perform most Sun Cluster administration tasks interactively through
the scsetup(1M)
utility. Whenever possible, administration procedures in this guide are described
using scsetup.
You can administer the following Main Menu items through the scsetup utility.
You can administer the following Resource Group Menu items through the scsetup utility.
- Create a resource group
- Add a network resource to a resource group
- Add a data service resource to a resource group
- Online/Offline or Switchover a resource group
- Enable/Disable a resource
- Change properties of a resource group
- Change properties of a resource
- Remove a resource from a resource group
- Remove a resource group
- Clear the stop_failed error flag from a resource
Table 1-1 lists other commands that you use to administer Sun Cluster. See
the man pages for more detailed information.
Table 1–1 Sun Cluster Command-Line Interface Commands
| Command | Description |
|---|---|
| ccp(1M) | Starts remote console access to the cluster. |
| if_mpadm(1M) | Switches IP addresses from one adapter to another in an IP Network Multipathing group. |
| sccheck(1M) | Checks and validates the Sun Cluster configuration to ensure that the basic configuration required for a cluster to function is in place. |
| scconf(1M) | Updates a Sun Cluster configuration. The -p option lists cluster configuration information. |
| scdidadm(1M) | Provides administrative access to the device ID configuration. |
| scgdevs(1M) | Runs the global device namespace administration script. |
| scinstall(1M) | Installs and configures Sun Cluster software. The command can be run interactively or non-interactively. The -p option displays release and package version information for the Sun Cluster software. |
| scrgadm(1M) | Manages the registration of resource types, the creation of resource groups, and the activation of resources within a resource group. The -p option displays information on installed resources, resource groups, and resource types. Note: Resource type, resource group, and resource property names are case insensitive when executing scrgadm. |
| scsetup(1M) | Runs the interactive cluster configuration utility, which generates the scconf command and its various options. |
| scshutdown(1M) | Shuts down the entire cluster. |
| scstat(1M) | Provides a snapshot of the cluster status. |
| scswitch(1M) | Performs changes that affect node mastery and states for resource groups and disk device groups. |
In addition, you use volume manager commands to administer the volume manager portion of Sun Cluster.
These commands depend on the specific volume manager used in your cluster:
Solstice DiskSuite™, VERITAS Volume Manager, or Solaris Volume Manager™.
Preparing to Administer the Cluster
This section describes what to do to prepare for administering your
cluster.
Documenting a Sun Cluster Hardware Configuration
Document the hardware aspects that are unique to your site as your Sun Cluster
configuration is scaled. Referring to this hardware documentation when you change
or upgrade the cluster saves administration labor. Labeling cables and connections
between the various cluster components can also make administration easier.
Keeping records of your original cluster configuration and of subsequent
changes also reduces the time that a third-party service provider needs when
servicing your cluster.
Using an Administrative Console
You can use a dedicated SPARC workstation, known as the administrative console, to administer the active cluster. Typically,
you install and run the Cluster Control Panel (CCP) and graphical user interface
(GUI) tools on the administrative console. For more information on the CCP,
see How to Log In to Sun Cluster Remotely. For instructions on installing the Cluster
Control Panel module for Sun Management Center and SunPlex Manager GUI tools,
see the Sun Cluster
Software Installation Guide for Solaris OS.
The administrative console is not a cluster node. The administrative
console is used for remote access to the cluster nodes, either over the public
network or through a network-based terminal concentrator.
If your SPARC cluster consists of a Sun Enterprise™ 10000 server, you must log in from the administrative console
to the System Service Processor (SSP). Connect by using the netcon(1M) command. By default,
netcon connects to a Sun Enterprise 10000 domain
through the network interface. If the network is inaccessible, you can
use netcon in “exclusive” mode by setting the -f option. You can also send ~* during a normal netcon session. Either of these solutions gives you the option
of toggling to the serial interface if the network becomes unreachable.
Sun Cluster does not require a dedicated administrative console, but using
a console provides these benefits:
Backing Up the Cluster
Back up your cluster on a regular basis. Even though Sun Cluster
provides an HA environment, with mirrored copies of data on the storage devices,
Sun Cluster is not a replacement for regular backups. Sun Cluster can survive multiple
failures, but does not protect against user or program error, or catastrophic
failure. Therefore, you must have a backup procedure in place to protect against
data loss.
The following information should be included as part of your backup.
- All file system partitions
- All database data if you are running DBMS data services
- Disk partition information for all cluster disks
- The md.tab file if you are using Solstice DiskSuite/Solaris Volume Manager as your volume manager
Beginning to Administer the Cluster
Table 1–2 provides a starting point for administering
your cluster.
Table 1–2 Sun Cluster 3.1 4/04 Administration Tools
How to Log In to Sun Cluster Remotely
The Cluster Control Panel (CCP) provides a launch pad for cconsole(1M), crlogin(1M),
and ctelnet(1M)
tools. All three tools start a multiple-window connection to a set of specified
nodes. The multiple-window connection consists of a host window for each of
the specified nodes and a common window. Input to the common window is sent
to each of the host windows, allowing you to run commands simultaneously on
all nodes of the cluster. See the ccp(1M) and cconsole(1M) man pages for more information.
- Verify that the following prerequisites are met before starting the CCP.
  - Make sure the PATH variable on the administrative console includes the Sun Cluster tools directory, /opt/SUNWcluster/bin, and /usr/cluster/bin. You can specify an alternate location for the tools directory by setting the $CLUSTER_HOME environment variable.
  - Configure the clusters file, the serialports file, and the nsswitch.conf file if using a terminal concentrator. The files can be either /etc files or NIS/NIS+ databases. See clusters(4) and serialports(4) for more information.
- Determine if you have a Sun Enterprise 10000 server platform.
- Start the CCP launch pad.
  From the administrative console, run the ccp(1M) command.
  The CCP launch pad is displayed.
- To start a remote session with the cluster, click the cconsole, crlogin, or ctelnet icon in the CCP launch pad.
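The PATH prerequisite in this procedure can be checked before you start the CCP. The following is a minimal sketch, not part of Sun Cluster: the function name missing_path_dirs is hypothetical, and the sketch assumes a POSIX shell on the administrative console. It prints each listed directory that is absent from a PATH-style string.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: missing_path_dirs PATHSTRING DIR...
# Prints every DIR that does not appear as a component of PATHSTRING.
missing_path_dirs() {
    path_value=$1
    shift
    for dir in "$@"; do
        # Wrap both sides in colons so only whole components match.
        case ":$path_value:" in
            *":$dir:"*) ;;                    # present: print nothing
            *) printf '%s\n' "$dir" ;;        # missing: report it
        esac
    done
}
```

For example, `missing_path_dirs "$PATH" /opt/SUNWcluster/bin /usr/cluster/bin` prints nothing when both directories are already on the PATH.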
Where to Go From Here
You can also start cconsole, crlogin,
or ctelnet sessions from the command line.
How to Access the scsetup Utility
The scsetup(1M) utility enables you to interactively
configure quorum, resource group, cluster transport, private hostname, device
group, and new node options for the cluster.
- Become superuser on any node in the cluster.
- Enter the scsetup utility.
  # scsetup
  The Main Menu is displayed.
- Make your configuration selection from the menu. Follow the onscreen instructions to complete a task.
See the scsetup online help for more information.
How to Display Sun Cluster Release and Version Information
You do not need to be logged in as superuser to perform these procedures.
Display the Sun Cluster patch numbers.
% showrev -p
Sun Cluster update releases are identified by the main product patch
number plus the update version.
Display the Sun Cluster release number and version strings for all Sun Cluster
packages.
% scinstall -pv
Example—Displaying the Sun Cluster Release Number
The following example displays the cluster's release number.
% showrev -p | grep 110648
Patch: 110648-05 Obsoletes: Requires: Incompatibles: Packages:
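To pull only the patch revision out of the showrev output, a small filter such as the following can help. This is a sketch under the assumption that each patch line has the form shown in the preceding example (the word Patch: followed by the patch ID). The function name patch_revisions is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: showrev -p | patch_revisions BASE_ID
# Reads showrev-style lines on stdin and prints the full patch ID
# (base number plus revision) for every patch whose ID starts with BASE_ID-.
patch_revisions() {
    base=$1
    awk -v base="$base" '
        $1 == "Patch:" && index($2, base "-") == 1 { print $2 }
    '
}
```

With the example line above as input, `patch_revisions 110648` prints the ID together with its revision suffix.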
Example—Displaying Sun Cluster Release and Version Information
The following example displays the cluster's release information and
version information for all packages.
% scinstall -pv
SunCluster 3.1
SUNWscr: 3.1.0,REV=2000.10.01.01.00
SUNWscdev: 3.1.0,REV=2000.10.01.01.00
SUNWscu: 3.1.0,REV=2000.10.01.01.00
SUNWscman: 3.1.0,REV=2000.10.01.01.00
SUNWscsal: 3.1.0,REV=2000.10.01.01.00
SUNWscsam: 3.1.0,REV=2000.10.01.01.00
SUNWscvm: 3.1.0,REV=2000.10.01.01.00
SUNWmdm: 4.2.1,REV=2000.08.08.10.01
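When you compare package versions across nodes, it can be convenient to reduce the scinstall -pv listing to one package and version per line. The following sketch assumes the line format shown in the preceding example (package name, colon, version, comma, REV stamp). The function name pkg_versions is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: scinstall -pv | pkg_versions
# Prints "PACKAGE VERSION" for every SUNW* package line on stdin.
pkg_versions() {
    awk '$1 ~ /^SUNW[A-Za-z0-9]*:$/ {
        pkg = substr($1, 1, length($1) - 1)   # drop the trailing colon
        split($2, parts, ",")                 # parts[1] = version, parts[2] = REV stamp
        print pkg, parts[1]
    }'
}
```

Saving this reduced listing from each node and comparing the files with diff is one way to spot a node whose packages differ.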
How to Display Configured Resource Types, Resource Groups, and Resources
You can also accomplish this procedure by using the SunPlex Manager
GUI. Refer to Chapter 10, Administering Sun Cluster With the Graphical User Interfaces. See the SunPlex Manager online help
for more information.
You do not need to be logged in as superuser to perform this procedure.
Display the cluster's configured resource types, resource groups, and
resources.
% scrgadm -p
Example—Displaying Configured Resource Types, Resource Groups,
and Resources
The following example shows the resource types (RT Name),
resource groups (RG Name), and resources (RS Name) configured for the cluster schost.
% scrgadm -p
RT Name: SUNW.SharedAddress
RT Description: HA Shared Address Resource Type
RT Name: SUNW.LogicalHostname
RT Description: Logical Hostname Resource Type
RG Name: schost-sa-1
RG Description:
RS Name: schost-1
RS Description:
RS Type: SUNW.SharedAddress
RS Resource Group: schost-sa-1
RG Name: schost-lh-1
RG Description:
RS Name: schost-3
RS Description:
RS Type: SUNW.LogicalHostname
RS Resource Group: schost-lh-1
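The scrgadm -p listing can be condensed into a simple map of resource groups to resources. The following sketch assumes the RS Name: and RS Resource Group: labels exactly as they appear in the preceding example. The function name resources_by_group is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: scrgadm -p | resources_by_group
# Prints "GROUP RESOURCE" for each resource found on stdin.
resources_by_group() {
    awk '
        # Remember the most recent resource name.
        $1 == "RS" && $2 == "Name:" { rs = $3 }
        # The group line that follows tells us where that resource lives.
        $1 == "RS" && $2 == "Resource" && $3 == "Group:" { print $4, rs }
    '
}
```

Applied to the example output, the filter pairs schost-1 with schost-sa-1 and schost-3 with schost-lh-1.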
How to Check the Status of Cluster Components
You can also accomplish this procedure by using the SunPlex Manager GUI. See the
SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
Check the status of cluster components.
% scstat -p
Example—Checking the Status of Cluster Components
The following example provides a sample of status information for cluster
components returned by scstat(1M).
% scstat -p
-- Cluster Nodes --
Node name Status
--------- ------
Cluster node: phys-schost-1 Online
Cluster node: phys-schost-2 Online
Cluster node: phys-schost-3 Online
Cluster node: phys-schost-4 Online
------------------------------------------------------------------
-- Cluster Transport Paths --
Endpoint Endpoint Status
-------- -------- ------
Transport path: phys-schost-1:qfe1 phys-schost-4:qfe1 Path online
Transport path: phys-schost-1:hme1 phys-schost-4:hme1 Path online
...
------------------------------------------------------------------
-- Quorum Summary --
Quorum votes possible: 6
Quorum votes needed: 4
Quorum votes present: 6
-- Quorum Votes by Node --
Node Name Present Possible Status
--------- ------- -------- ------
Node votes: phys-schost-1 1 1 Online
Node votes: phys-schost-2 1 1 Online
...
-- Quorum Votes by Device --
Device Name Present Possible Status
----------- ------- -------- ------
Device votes: /dev/did/rdsk/d2s2 1 1 Online
Device votes: /dev/did/rdsk/d8s2 1 1 Online
...
-- Device Group Servers --
Device Group Primary Secondary
------------ ------- ---------
Device group servers: rmt/1 - -
Device group servers: rmt/2 - -
Device group servers: schost-1 phys-schost-2 phys-schost-1
Device group servers: schost-3 - -
-- Device Group Status --
Device Group Status
------------ ------
Device group status: rmt/1 Offline
Device group status: rmt/2 Offline
Device group status: schost-1 Online
Device group status: schost-3 Offline
------------------------------------------------------------------
-- Resource Groups and Resources --
Group Name Resources
---------- ---------
Resources: test-rg test_1
Resources: real-property-rg -
Resources: failover-rg -
Resources: descript-rg-1 -
...
-- Resource Groups --
Group Name Node Name State
---------- --------- -----
Group: test-rg phys-schost-1 Offline
Group: test-rg phys-schost-2 Offline
...
-- Resources --
Resource Name Node Name State Status Message
------------- --------- ----- --------------
Resource: test_1 phys-schost-1 Offline Offline
Resource: test_1 phys-schost-2 Offline Offline
-----------------------------------------------------------------
-- IPMP Groups --
Node Name Group Status Adapter Status
--------- ----- ------ ------- ------
IPMP Group: phys-schost-1 sc_ipmp0 Online qfe1 Online
IPMP Group: phys-schost-2 sc_ipmp0 Online qfe1 Online
------------------------------------------------------------------
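The Quorum Summary in the preceding output can be checked mechanically by comparing the votes present with the votes needed. The following sketch assumes the Quorum votes needed: and Quorum votes present: labels exactly as shown in the example. The function name quorum_met is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: scstat -p | quorum_met
# Exit status: 0 if votes present >= votes needed,
#              1 if fewer votes are present than needed,
#              2 if the quorum summary lines were not found.
quorum_met() {
    awk '
        /Quorum votes needed:/  { needed = $NF;  have_needed = 1 }
        /Quorum votes present:/ { present = $NF; have_present = 1 }
        END {
            if (!have_needed || !have_present) exit 2
            if (present + 0 >= needed + 0) exit 0
            exit 1
        }
    '
}
```

For the example output, which reports 4 votes needed and 6 present, the function returns 0.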
How to Check the Status of the Public Network
You can also accomplish this procedure by using the SunPlex Manager GUI. See the
SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
To check the status of the IP Network Multipathing groups, use the scstat(1M)
command.
- Check the status of the IP Network Multipathing groups.
  % scstat -i
Example—Checking the Public Network Status
The following example provides a sample of status information for the
IP Network Multipathing groups returned by scstat -i.
% scstat -i
-----------------------------------------------------------------
-- IPMP Groups --
Node Name Group Status Adapter Status
--------- ----- ------ ------- ------
IPMP Group: phys-schost-1 sc_ipmp1 Online qfe2 Online
IPMP Group: phys-schost-1 sc_ipmp0 Online qfe1 Online
IPMP Group: phys-schost-2 sc_ipmp1 Online qfe2 Online
IPMP Group: phys-schost-2 sc_ipmp0 Online qfe1 Online
------------------------------------------------------------------
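On clusters with many adapters, it can help to show only the IPMP entries that need attention. The following sketch assumes the column order shown in the preceding example (node, group, group status, adapter, adapter status) and treats any status other than Online as worth reporting. The function name ipmp_not_online is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: scstat -i | ipmp_not_online
# Prints "NODE GROUP GROUP_STATUS ADAPTER ADAPTER_STATUS" for every
# IPMP line whose group status or adapter status is not "Online".
ipmp_not_online() {
    awk '$1 == "IPMP" && $2 == "Group:" && ($5 != "Online" || $7 != "Online") {
        print $3, $4, $5, $6, $7
    }'
}
```

The filter prints nothing for the example output, because every group and adapter in it is Online.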
How to View the Cluster Configuration
You can also accomplish this procedure by using the
SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
View the cluster configuration.
% scconf -p
To display more information using the scconf command,
use the verbose options. See the scconf(1M) man page for details.
Example—Viewing the Cluster Configuration
The following example lists the cluster configuration.
% scconf -p
Cluster name: cluster-1
Cluster ID: 0x3908EE1C
Cluster install mode: disabled
Cluster private net: 172.16.0.0
Cluster private netmask: 255.255.0.0
Cluster new node authentication: unix
Cluster new node list: <NULL - Allow any node>
Cluster nodes: phys-schost-1 phys-schost-2 phys-schost-3
phys-schost-4
Cluster node name: phys-schost-1
Node ID: 1
Node enabled: yes
Node private hostname: clusternode1-priv
Node quorum vote count: 1
Node reservation key: 0x3908EE1C00000001
Node transport adapters: hme1 qfe1 qfe2
Node transport adapter: hme1
Adapter enabled: yes
Adapter transport type: dlpi
Adapter property: device_name=hme
Adapter property: device_instance=1
Adapter property: dlpi_heartbeat_timeout=10000
...
Cluster transport junctions: hub0 hub1 hub2
Cluster transport junction: hub0
Junction enabled: yes
Junction type: switch
Junction port names: 1 2 3 4
...
Junction port: 1
Port enabled: yes
Junction port: 2
Port enabled: yes
...
Cluster transport cables
Endpoint Endpoint State
-------- -------- -----
Transport cable: phys-schost-1:hme1@0 hub0@1 Enabled
Transport cable: phys-schost-1:qfe1@0 hub1@1 Enabled
Transport cable: phys-schost-1:qfe2@0 hub2@1 Enabled
Transport cable: phys-schost-2:hme1@0 hub0@2 Enabled
...
Quorum devices: d2 d8
Quorum device name: d2
Quorum device votes: 1
Quorum device enabled: yes
Quorum device name: /dev/did/rdsk/d2s2
Quorum device hosts (enabled): phys-schost-1
phys-schost-2
Quorum device hosts (disabled):
...
Device group name: schost-3
Device group type: SVM
Device group failback enabled: no
Device group node list: phys-schost-3, phys-schost-4
Diskset name: schost-3
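Individual settings can be extracted from the scconf -p listing with a short filter. The following sketch pairs each node with its private hostname and assumes the Cluster node name: and Node private hostname: labels exactly as they appear in the preceding example. The function name private_hostnames is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: scconf -p | private_hostnames
# Prints "NODE PRIVATE_HOSTNAME" for each node section found on stdin.
private_hostnames() {
    awk '
        /Cluster node name:/     { node = $NF }        # start of a node section
        /Node private hostname:/ { print node, $NF }   # value inside that section
    '
}
```

Applied to the example output, the filter reports clusternode1-priv for phys-schost-1.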
How to Validate a Basic Cluster Configuration
The sccheck(1M) command runs a set of checks to validate
the basic configuration required for a cluster to function properly. If no
checks fail, sccheck returns to the shell prompt. If a
check fails, sccheck produces reports in either the specified
or the default output directory. If you run sccheck against
more than one node, sccheck will produce a report for each
node and a report for multi-node checks.
The sccheck command runs in two steps: data collection
and analysis. Data collection can be time consuming, depending on the system
configuration. You can invoke sccheck in verbose mode with
the -v1 flag to print progress messages, or you can use the -v2 flag to run sccheck in highly verbose mode
which prints more detailed progress messages, especially during data collection.
Note –
Run sccheck after performing an administration
procedure that might result in changes to devices, volume management components,
or the Sun Cluster configuration.
- Become superuser on any node in the cluster.
- Verify the cluster configuration.
  # sccheck
Example—Checking the Cluster Configuration With All Checks Passing
The following example shows sccheck being run in
verbose mode against nodes phys-schost-1 and phys-schost-2 with all checks passing.
# sccheck -v1 -h phys-schost-1,phys-schost-2
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished
#
Example—Checking the Cluster Configuration With a Failed Check
The following example shows the node phys-schost-2
in the cluster suncluster missing the mount point /global/phys-schost-1. Reports are created in the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE : Global filesystem /etc/vfstab entries are not consistent across
all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across
all nodes in this cluster.
Analysis indicates:
FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the
filesystem(s) in question.
...
#
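If you run sccheck from a script, the summary line in its verbose output can be reduced to a number for later comparison. The following sketch assumes the summary wording shown in the preceding example ("The greatest severity of all check failures was N"). The function name greatest_severity is hypothetical.

```shell
# Hypothetical helper (not a Sun Cluster tool).
# Usage: sccheck -v1 -h NODE1,NODE2 2>&1 | greatest_severity
# Prints the numeric severity from the summary line, or nothing
# when the output contains no such line (for example, all checks passed).
greatest_severity() {
    sed -n 's/.*greatest severity of all check failures was \([0-9][0-9]*\).*/\1/p'
}
```

For the failed-check example above, the filter prints 3. Redirecting standard error into the pipe is a precaution, because this sketch does not assume which stream sccheck writes its messages to.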
How to Check the Global Mount Points
The sccheck(1M)
command includes checks which examine the /etc/vfstab
file for configuration errors with the cluster file system and its global
mount points.
Note –
Run sccheck after making cluster configuration
changes that have affected devices or volume management components.
- Become superuser on any node in the cluster.
- Verify the cluster configuration.
  # sccheck
Example—Checking the Global Mount Points
The following example shows the node phys-schost-2
of the cluster suncluster missing the mount point /global/phys-schost-1. Reports are being sent to the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE : Global filesystem /etc/vfstab entries are not consistent across
all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across
all nodes in this cluster.
Analysis indicates:
FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the
filesystem(s) in question.
...
#
# cat /var/cluster/sccheck/myReports/sccheck-results.phys-schost-1.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 1398
SEVERITY : HIGH
FAILURE : An unsupported server is being used as a Sun Cluster 3.x node.
ANALYSIS : This server may not been qualified to be used as a Sun Cluster 3.x node.
Only servers that have been qualified with Sun Cluster 3.x are supported as
Sun Cluster 3.x nodes.
RECOMMEND: Because the list of supported servers is always being updated, check with
your Sun Microsystems representative to get the latest information on what servers
are currently supported and only use a server that is supported with Sun Cluster 3.x.
...
#