Chapter 4 Frequently Asked Questions
This chapter includes answers to the most frequently asked questions
about the SunPlex system. The questions are organized by topic.
High Availability FAQs
-
What exactly is
a highly available system?
The SunPlex system defines high availability (HA) as the ability of a
cluster to keep an application up and running, even though a failure has occurred
that would normally make a server system unavailable.
-
What is the process by which the cluster
provides high availability?
Through a process known as failover, the cluster framework provides
a highly available environment. Failover is a series of steps performed by
the cluster to migrate data service resources from a failing node to another
operational node in the cluster.
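The failover steps described above can be pictured with a small model. This is only an illustrative sketch, not the cluster framework's implementation: the `Node` class, the `fail_over` function, and the resource name `nfs-rg` are hypothetical, and the "first surviving node" choice is an assumption made to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node and the resources it hosts."""
    name: str
    up: bool = True
    resources: list[str] = field(default_factory=list)


def fail_over(failing: Node, nodes: list[Node]) -> Node:
    """Migrate every resource from the failing node to an operational node.

    Picking the first survivor is an assumption of this sketch.
    """
    failing.up = False
    survivors = [n for n in nodes if n.up and n is not failing]
    if not survivors:
        raise RuntimeError("no operational node left to take over")
    target = survivors[0]
    target.resources.extend(failing.resources)
    failing.resources.clear()
    return target


node1 = Node("node1", resources=["nfs-rg"])
node2 = Node("node2")
target = fail_over(node1, [node1, node2])
print(target.name, target.resources)
```

After the call, the resource that ran on `node1` is hosted by `node2`, which mirrors the migration from a failing node to another operational node.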
-
What is the difference between a failover
and scalable data service?
There are two types of highly available data services, failover and
scalable.
A failover data service runs an application on only one primary node
in the cluster at a time. Other nodes might run other applications, but each
application runs on only a single node. If a primary node fails, the applications
running on the failed node fail over to another node and continue running.
A scalable service spreads an application across multiple nodes to create
a single, logical service. Scalable services leverage the number of nodes
and processors in the entire cluster on which they run.
For each application, one node hosts the physical interface to the cluster.
This node is called a Global Interface (GIF) Node. There can be multiple GIF
nodes in the cluster. Each GIF node hosts one or more logical interfaces that
can be used by scalable services. These logical interfaces are called global interfaces. One GIF node hosts a global interface for all
requests for a particular application and dispatches them to multiple nodes
on which the application server is running. If the GIF node fails, the global
interface fails over to a surviving node.
If any of the nodes on which the application is running fails, the application
continues to run on the other nodes with some performance degradation until
the failed node returns to the cluster.
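The behavior of a scalable service can be sketched as a toy dispatcher. The `ScalableService` class below is hypothetical. It assumes round-robin dispatch and assumes that the GIF node is also one of the application nodes; neither assumption comes from this chapter, and the real load-balancing policy may differ.

```python
class ScalableService:
    """Toy model: one GIF node receives requests and spreads them over app nodes."""

    def __init__(self, nodes, gif_node):
        self.nodes = list(nodes)   # nodes on which the application is running
        self.gif_node = gif_node   # node that hosts the global interface
        self._next = 0

    def dispatch(self, request):
        # Round-robin is an assumption of this sketch, not the documented policy.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node

    def node_failed(self, node):
        # The service keeps running on the remaining nodes (with less capacity).
        self.nodes.remove(node)
        if not self.nodes:
            raise RuntimeError("no node left to run the application")
        # If the GIF node failed, the global interface moves to a surviving node.
        if node == self.gif_node:
            self.gif_node = self.nodes[0]


svc = ScalableService(["node1", "node2", "node3"], gif_node="node1")
before = [svc.dispatch(i) for i in range(3)]
svc.node_failed("node1")
after = [svc.dispatch(i) for i in range(4)]
print(before, after, svc.gif_node)
```

Requests continue to be served after `node1` fails, by fewer nodes, and the global interface is now hosted on a survivor.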
File Systems FAQs
-
Can I run one or more of the cluster
nodes as highly available NFS server(s) with other cluster nodes as clients?
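No. A cluster node that mounts a file system served by the cluster's own
highly available NFS service is doing a loopback mount. Do not do this.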
-
Can I use a cluster file system for
applications that are not under Resource Group Manager control?
Yes. However, without RGM control, the applications need to be
restarted manually after the failure of the node on which they are running.
-
Must all cluster file systems have
a mount point under the /global directory?
No. However, placing cluster file systems under the same mount point,
such as /global, enables better organization and management
of these file systems.
-
What are the differences between using
the cluster file system and exporting NFS file systems?
There are several differences:
-
The cluster file system supports global devices. NFS does
not support remote access to devices.
-
The cluster file system has a global namespace. Only one mount
command is required. With NFS, you must mount the file system on each node.
-
The cluster file system caches files in more cases than does
NFS, for example, when a file is being accessed from multiple nodes for read,
write, file locks, or asynchronous I/O.
-
The cluster file system is built to exploit future
fast cluster interconnects that provide remote DMA and zero-copy functions.
-
If you change the attributes on a file (using chmod(1), for example) in a cluster file system, the change is reflected
immediately on all nodes. With an exported NFS file system, this can take
much longer.
-
The file system /global/.devices/node@<nodeID>
appears on my cluster nodes. Can I use this file system to store data that
I want to be highly available and global?
These file systems store the global device namespace. They are not intended
for general use. Although they are global, they are never accessed in a global
manner: each node accesses only its own global device namespace. If a node
is down, other nodes cannot access the namespace for that node.
These file systems are not highly available. They should not be used to store
data that needs to be globally accessible or highly available.
Volume Management FAQs
-
Do I need to mirror all disk devices?
For a disk device to be considered highly available, it must be mirrored,
or use RAID-5 hardware. All data services should use either highly available
disk devices, or cluster file systems mounted on highly available disk devices.
Such configurations can tolerate single disk failures.
-
Can I use one volume manager for the
local disks (boot disk) and a different volume manager for the multihost disks?
SPARC: This configuration is supported with the Solaris Volume Manager software managing
the local disks and VERITAS Volume Manager managing the multihost disks. No other combination
is supported.
x86: No, this configuration is not supported, as only Solaris Volume Manager is supported
in x86 based clusters.
Data Services FAQs
-
What SunPlex data services are available?
The list of supported data services is included in the Sun Cluster Release Notes.
-
What application versions are supported
by SunPlex data services?
The list of supported application versions is included in the Sun Cluster Release Notes.
-
Can I write my own data service?
Yes. See the Sun Cluster Data Services Developer's Guide and the Data Service Enabling
Technologies documentation provided with the Data Service Development Library
API for more information.
-
When creating network resources, should
I specify numeric IP addresses or hostnames?
The preferred method for specifying network resources is to use the
UNIX hostname rather than the numeric IP address.
-
When creating network resources, what
is the difference between using a logical hostname (a LogicalHostname resource)
or a shared address (a SharedAddress resource)?
Except in the case of Sun Cluster HA for NFS, wherever the documentation calls for
the use of a LogicalHostname resource in a Failover mode resource group, a SharedAddress resource
or LogicalHostname resource may be used interchangeably.
The use of a SharedAddress resource incurs some additional
overhead because the cluster networking software is configured for a SharedAddress but not for a LogicalHostname.
A SharedAddress is advantageous when you are
configuring both scalable and failover data services and want
clients to be able to access both services using the same hostname. In this
case, the SharedAddress resource(s) along with the failover
application resource are contained in one resource group, while the scalable
service resource is contained in a separate resource group and configured
to use the SharedAddress. Both the scalable and failover
services may then use the same set of hostnames/addresses which are configured
in the SharedAddress resource.
Public Network FAQs
-
What public network adapters does
the SunPlex system support?
Currently, the SunPlex system supports Ethernet (10/100BASE-T and 1000BASE-SX
Gb) public network adapters. Because new interfaces might be supported in
the future, check with your Sun sales representative for the most current
information.
-
What is the role of the MAC address
in failover?
When a failover occurs, new Address Resolution Protocol (ARP) packets
are generated and broadcast to the world. These ARP packets contain the new
MAC address (of the new physical adapter to which the node failed over) and
the old IP address. When another machine on the network receives one of these
packets, it flushes the old MAC-IP mapping from its ARP cache and uses the
new one.
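The cache update described above can be sketched as follows. This is a simplified illustration, not an implementation of ARP: the function name and the IP and MAC addresses are hypothetical values chosen for the example.

```python
def handle_arp_announcement(arp_cache, ip, mac):
    """Replace any existing MAC-IP mapping with the announced one.

    Returns the MAC address that was flushed, or None if there was none.
    """
    old_mac = arp_cache.get(ip)
    arp_cache[ip] = mac
    return old_mac


# A client's cache before failover: the IP address maps to the old adapter.
client_cache = {"192.0.2.10": "08:00:20:aa:aa:aa"}

# After failover, the broadcast carries the same IP address with the new MAC.
flushed = handle_arp_announcement(client_cache, "192.0.2.10", "08:00:20:bb:bb:bb")
print(flushed, client_cache)
```

The IP address stays the same for clients; only the MAC address that it resolves to changes.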
-
Does the SunPlex system support setting local-mac-address?=true?
Yes. In fact, IP Network Multipathing requires that local-mac-address? be set to true.
You can set local-mac-address? with eeprom(1M), at the OpenBoot PROM ok prompt in a SPARC based cluster, or with the SCSI utility that
you optionally run after the BIOS boots in an x86 based cluster.
-
How much delay can I expect when IP Network Multipathing
performs a switchover between adapters?
The delay could be several minutes. When an IP Network Multipathing
switchover is done, a gratuitous ARP is sent out. However, there
is no guarantee that the router between the client and the cluster will use
the gratuitous ARP. So, until the ARP cache entry for this IP address on the
router times out, it is possible that it could use the stale MAC address.
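To see why the delay depends on the router, consider a minimal model of a router ARP cache that ignores gratuitous ARPs. The `RouterArpCache` class, the 300-second timeout, and the addresses are assumptions for illustration; actual router behavior and timeout values vary.

```python
class RouterArpCache:
    """Toy ARP cache whose entries expire a fixed time after they are learned."""

    def __init__(self, timeout_s, honors_gratuitous_arp):
        self.timeout_s = timeout_s
        self.honors_gratuitous_arp = honors_gratuitous_arp
        self.entries = {}  # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac, now):
        self.entries[ip] = (mac, now)

    def gratuitous_arp(self, ip, mac, now):
        # A router is not guaranteed to use a gratuitous ARP.
        if self.honors_gratuitous_arp:
            self.learn(ip, mac, now)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.timeout_s:
            # Entry expired: the router must resolve the address again.
            del self.entries[ip]
            return None
        return mac


router = RouterArpCache(timeout_s=300, honors_gratuitous_arp=False)
router.learn("192.0.2.10", "old-mac", now=0)
router.gratuitous_arp("192.0.2.10", "new-mac", now=100)  # switchover happens
stale = router.lookup("192.0.2.10", now=200)    # still the stale MAC address
expired = router.lookup("192.0.2.10", now=300)  # entry has timed out
print(stale, expired)
```

Until the entry ages out, this router keeps sending traffic to the stale MAC address, which is the source of the delay.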
-
How fast are failures of a network
adapter detected?
The default failure detection time is 10 seconds. The algorithm tries
to meet the failure detection time, but the actual time depends on the network
load.
Cluster Member FAQs
-
Do all cluster members need to have
the same root password?
You are not required to have the same root password on each cluster
member. However, you can simplify administration of the cluster by using the
same root password on all nodes.
-
Is the order in which nodes are booted
significant?
In most cases, no. However, the boot order is important to prevent amnesia
(refer to Quorum and Quorum Devices for details on amnesia). For example,
if node two was the owner of the quorum device and node one is down, and then
you bring node two down, you must bring up node two before bringing back node
one. This order prevents you from accidentally bringing up a node with out-of-date
cluster configuration information.
-
Do I need to mirror local disks in
a cluster node?
Yes. Though this mirroring is not a requirement, mirroring the cluster
node's disks prevents a non-mirrored disk failure from taking down the
node. The downside to mirroring a cluster node's local disks is more system
administration overhead.
-
What are the cluster member backup
issues?
You can use several backup methods for a cluster. One method is to have
a node as the backup node with a tape drive/library attached. Then use the
cluster file system to back up the data. Do not connect this node to the shared
disks.
See the Sun Cluster System Administration Guide for additional information on
backup and restore procedures.
-
When is a node healthy enough to be
used as a secondary node?
After a reboot, a node is healthy enough to be a secondary node when
the node displays the login prompt.
Cluster Storage FAQs
-
What makes multihost storage highly
available?
Multihost storage is highly available because it can survive the loss
of a single disk, due to mirroring (or due to hardware-based RAID-5 controllers).
Because a multihost storage device has more than one host connection, it can
also withstand the loss of a single node to which it is connected. In addition,
redundant paths from each node to the attached storage provide tolerance for
the failure of a host bus adapter, cable, or disk controller.
Cluster Interconnect FAQs
-
What cluster interconnects does the SunPlex
system support?
Currently, the SunPlex system supports Ethernet (100BASE-T Fast Ethernet
and 1000BASE-SX Gb) cluster interconnects in both SPARC based and x86 based
clusters. The SunPlex system supports the SCI network interface cluster interconnect
in SPARC based clusters only.
-
What is the difference between a “cable”
and a transport “path?”
Cluster transport cables are configured using transport adapters and
switches. Cables join adapters and switches on a component-to-component basis.
The cluster topology manager uses available cables to build end-to-end transport
paths between nodes. A cable does not map directly to a transport path.
Cables are statically “enabled” and “disabled”
by an administrator. Cables have a “state” (enabled or disabled)
but not a “status.” If a cable is disabled, it is as if it were
unconfigured. Cables that are disabled cannot be used as transport paths.
They are not probed, so their status cannot be known.
The state of a cable can be viewed using scconf -p.
Transport paths are dynamically established by the cluster topology
manager. The “status” of a transport path is determined by the
topology manager. A path can have a status of “online” or “offline.”
The status of a transport path can be viewed using scstat(1M).
Consider the following example of a two-node cluster with four cables.
node1:adapter0 to switch1, port0
node1:adapter1 to switch2, port0
node2:adapter0 to switch1, port1
node2:adapter1 to switch2, port1
There are two possible transport paths that can be formed from these
four cables.
node1:adapter0 to node2:adapter0
node1:adapter1 to node2:adapter1
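The relationship between cables and paths in this example can be sketched in code. The function below is a simplified, hypothetical stand-in for what the cluster topology manager does: it pairs adapters on different nodes that reach the same switch over enabled cables. The tuple layout and the `transport_paths` name are assumptions for illustration.

```python
from itertools import combinations

# (node, adapter, switch, port, enabled) for the four cables in the example
cables = [
    ("node1", "adapter0", "switch1", "port0", True),
    ("node1", "adapter1", "switch2", "port0", True),
    ("node2", "adapter0", "switch1", "port1", True),
    ("node2", "adapter1", "switch2", "port1", True),
]


def transport_paths(cables):
    """Return end-to-end adapter pairs that can be formed from enabled cables."""
    usable = [c for c in cables if c[4]]  # disabled cables are never used
    paths = []
    for a, b in combinations(usable, 2):
        same_switch = a[2] == b[2]
        different_nodes = a[0] != b[0]
        if same_switch and different_nodes:
            paths.append((f"{a[0]}:{a[1]}", f"{b[0]}:{b[1]}"))
    return paths


for left, right in transport_paths(cables):
    print(f"{left} to {right}")
```

Disabling any one of the four cables in this model leaves only one path, which illustrates why a cable does not map directly to a transport path.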
Client Systems FAQs
-
Do I need to consider any special
client needs or restrictions for use with a cluster?
Client systems connect to the cluster as they would any other server.
In some instances, depending on the data service application, you might need
to install client-side software or perform other configuration changes so
that the client can connect to the data service application. See individual
chapters in Sun Cluster Data Services Planning and Administration Guide for more information on client-side
configuration requirements.
Administrative Console FAQs
-
Does the SunPlex system require an
administrative console?
Yes.
-
Does the administrative console have
to be dedicated to the cluster, or can it be used for other tasks?
-
The SunPlex system does not require a dedicated
administrative console, but using one provides these benefits:
-
Does the administrative console need
to be located “close” to the cluster itself, for example, in the
same room?
Check with your hardware service provider. The provider might require
that the console be located in close proximity to the cluster itself. No technical
reason exists for the console to be located in the same room.
-
Can an administrative console serve
more than one cluster, as long as any distance requirements are also first
met?
Yes. You can control multiple clusters from a single administrative
console. You can also share a single terminal concentrator between clusters.
Terminal Concentrator and System Service Processor FAQs
-
Does the SunPlex system require a terminal
concentrator?
No. Starting with Sun Cluster 3.0, no software release requires a terminal
concentrator to run. Unlike the Sun Cluster 2.2 product, which required a terminal
concentrator for failure fencing, later products do not depend on the terminal
concentrator.
-
I see that most SunPlex servers use
a terminal concentrator, but the Sun Enterprise E10000 server does not. Why is that?
The terminal concentrator is effectively a serial-to-Ethernet converter
for most servers. Its console port is a serial port. The Sun Enterprise E10000 server does not
have a serial console. The System Service Processor (SSP) is the console,
through either an Ethernet or JTAG port. For the Sun Enterprise E10000 server,
you always use the SSP for consoles.
-
What are the benefits of using a terminal
concentrator?
Using a terminal concentrator provides console-level access to each
node from a remote workstation anywhere on the network, including when the
node is at the OpenBoot PROM (OBP) on a SPARC based node or a boot subsystem
on an x86 based node.
-
If I use a terminal concentrator not
supported by Sun, what do I need to know to qualify the one that I want to
use?
The main difference between the terminal concentrator supported by Sun
and other console devices is that the Sun terminal concentrator has special
firmware that prevents the terminal concentrator from sending a break to the
console when it boots. If you use a console device that can send a break,
or a signal that might be interpreted as a break, to the console,
that signal shuts down the node.
-
Can I free a locked port on the terminal
concentrator supported by Sun without rebooting it?
Yes. Note the port number that needs to be reset and type the following
commands:
telnet tc
Enter Annex port name or number: cli
annex: su -
annex# admin
admin : reset port_number
admin : quit
annex# hangup
#
Refer to the Sun Cluster System Administration Guide for more information about
configuring and administering the terminal concentrator supported by Sun.
-
What if the terminal concentrator
itself fails? Must I have another one standing by?
No. You do not lose any cluster availability if the terminal concentrator
fails. You do lose the ability to connect to the node consoles until the concentrator
is back in service.
-
If I do use a terminal concentrator,
what about security?
Generally, the terminal concentrator is attached to a small network
used by system administrators, not a network that is used for other client
access. You can control security by limiting access to that particular network.
-
SPARC: How do I use dynamic reconfiguration
with a tape or disk drive?
-
Determine whether the disk or tape drive is part of an active
device group. If the drive is not part of an active device group, you can
perform the DR remove operation on it.
-
If the DR remove-board operation would affect an active disk
or tape drive, the system rejects the operation and identifies the drives
that would be affected by the operation. If the drive is part of an active
device group, go to SPARC: DR Clustering Considerations for Disk and Tape Drives.
-
Determine whether the drive is a component of the primary
node or the secondary node. If the drive is a component of the secondary node,
you can perform the DR remove operation on it.
-
If the drive is a component of the primary node, you must
switch the primary and secondary nodes before performing the DR remove operation
on the device.
 Caution – If the current primary node fails while you are performing
the DR operation on a secondary node, cluster availability is impacted. The
primary node has no place to fail over until a new secondary node is provided.
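The decision steps above can be condensed into a small helper. This is a sketch of the checklist only; it does not call any dynamic reconfiguration or cluster command, and the function name and return strings are hypothetical.

```python
REMOVE = "perform the DR remove operation"
SWITCH_THEN_REMOVE = "switch primary and secondary nodes, then perform the DR remove operation"


def dr_remove_action(in_active_device_group: bool, on_primary_node: bool) -> str:
    """Return the action to take for a disk or tape drive, following the steps above."""
    if not in_active_device_group:
        # The drive is not part of an active device group: remove it directly.
        return REMOVE
    if not on_primary_node:
        # Active device group, but the drive belongs to the secondary node.
        return REMOVE
    # Active device group and the drive belongs to the primary node.
    return SWITCH_THEN_REMOVE


print(dr_remove_action(in_active_device_group=True, on_primary_node=True))
```

The caution still applies: while the operation runs on a secondary node, the primary node has no place to fail over.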