Dynamic Reconfiguration User's Guide

DR Configuration Issues

2

Configuring the System for DR

This chapter describes how to configure a domain for all DR operations and capabilities. The DR features are enabled only when the OpenBoot™ PROM (OBP) environment variable dr-max-mem is set to a non-zero value. The sections in this chapter include more information about dr-max-mem.

Note - DR features are disabled on domains that have less than 256MB of memory.

Configuring for DR Attach

This section describes how to configure DR before performing an attach operation.

I/O Devices


Warning - Be careful when choosing the slot into which a board is inserted to prevent disk controller re-numbering. For more information, see "Reconfiguration After a DR Operation" on page 2-7.

Memory: dr-max-mem

The kernel has a number of memory-related data structures such as page structures, which are statically allocated at boot time and are based on the amount of physical memory in the domain at that time. Use DR Attach to dynamically add a board and its physical memory after the domain is booted. However, this extra memory can be supported by the kernel only if enough memory data structures are allocated at boot time to support it; these structures cannot be added dynamically after boot time.
To reserve enough memory data structures to support DR attach operations, each domain supports the OBP environment variable, dr-max-mem, which the kernel reads at boot time. dr-max-mem specifies the maximum number of megabytes to which the domain can grow without requiring a reboot. Each domain has its own unique copy of dr-max-mem.
To calculate the optimum value for dr-max-mem, combine the amount of memory most likely to be added during all DR Attaches and the current amount of memory present in the domain. Then, set dr-max-mem to the total.
Note that if dr-max-mem is too large relative to the memory in the domain, its size can impact the performance of the operating system. Therefore, the system limits the maximum value of dr-max-mem at boot time, as follows:
Table 2-1 dr-max-mem

  Current Physical Memory    dr-max-mem Maximum Value
  256 MB                     0
  512 MB                     16 GB
  1 GB                       32 GB
  2 GB                       32 GB

If the value of dr-max-mem is smaller than the amount of physical memory present when the domain is booted, the operating system sets its working copy of dr-max-mem to the current memory size. You cannot attach more memory, but you can detach, then re-attach memory. The maximum amount of memory you can re-attach in this manner is the amount present when the domain was booted. Note that dr-max-mem is not modified in this situation.
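As a rough illustration of the sizing rule and the boot-time adjustment described above, the following POSIX shell sketch does the arithmetic for a planned configuration. The function names are hypothetical and are not part of DR or OBP. The sketch does not model the maximum values in Table 2-1, and it assumes a shell with POSIX arithmetic expansion.

```shell
# Hypothetical helpers for planning dr-max-mem; all sizes are in megabytes.
# They only do arithmetic and do not read or change any OBP variable.

# Candidate dr-max-mem: memory present now plus memory expected to be attached.
suggest_dr_max_mem() {
    echo $(( $1 + $2 ))
}

# Working copy the operating system would use for a given dr-max-mem setting.
# A setting below the memory present at boot is raised to that amount.
# A setting of 0 is passed through unchanged because it disables DR.
working_dr_max_mem() {
    if [ "$1" -ne 0 ] && [ "$1" -lt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

suggest_dr_max_mem 2048 4096   # 2 GB present, 4 GB expected: prints 6144
working_dr_max_mem 1024 2048   # setting below boot memory: prints 2048
```

In the second case no additional memory can be attached, which matches the behavior described above: memory can only be detached and then re-attached up to the amount present at boot.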

Caution - Set dr-max-mem high enough so that all anticipated new memory can be dynamically attached, but no higher. If you set it too low and later attach a board whose memory combined with domain memory exceeds the value of dr-max-mem, the memory on that board will not be attached. If you set the value of dr-max-mem too high, you over-allocate data structures, which can waste available memory and adversely affect system performance. If you set it to zero, the DR functions are disabled.

dr-max-mem must be set before the domain is booted.
To set the dr-max-mem environment variable, bring up the OBP prompt for the domain and enter:

  ok# setenv-dr dr-max-mem NNN  

where NNN is the number of megabytes of memory to be supported by the domain after the boards are attached. The value of dr-max-mem persists across domain reboots, and is only applicable to that particular domain.
If the dr-max-mem variable is non-zero, the following messages are displayed at boot time in the system msgbuf (/var/adm/messages):

  DR: current memory size is XXX MBytes  
  DR: capacity to allow an additional YYY MBytes of memory  

where XXX represents the amount of physical memory available to the operating system and is effectively the same as the operating system variable, physinstalled; and where YYY is the difference between dr-max-mem and XXX.
When a board with memory is successfully attached or detached, another message is displayed:

  DR: capacity to allow an additional ZZZ MBytes of memory  

where ZZZ represents the updated amount of memory that can still be attached.
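To find the most recent capacity value without reading the whole log, a small filter along the following lines could be used. This is a sketch only: the function name is hypothetical, and it assumes that the message text matches the wording shown above exactly.

```shell
# Print the most recent "additional capacity" value (in MBytes) found in
# log text read from standard input. Prints nothing if no message matches.
dr_remaining_capacity() {
    sed -n 's/.*DR: capacity to allow an additional \([0-9][0-9]*\) MBytes of memory.*/\1/p' | tail -n 1
}

# Example with sample text; on a domain the input would be /var/adm/messages.
printf '%s\n' \
    'DR: current memory size is 2048 MBytes' \
    'DR: capacity to allow an additional 4096 MBytes of memory' \
    'DR: capacity to allow an additional 2048 MBytes of memory' |
    dr_remaining_capacity   # prints 2048
```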

Processors

Prior to attaching a board, diagnostics are run on the board. This requires the presence of at least one processor on the board. The processor must not be blacklisted.

Configuring for DR Detach

This section describes how to configure DR before performing a detach operation.

Enabling DR Detach

The DR Detach feature requires that the OBP variable dr-max-mem is set to a non-zero value. This setting is required at the time the domain is booted.

I/O Devices

The DR Detach board feature relies on the Alternate Pathing feature or Solstice DiskSuite mirroring, or a mixture of the two, when used to detach a board that hosts I/O controllers attached to vital system resources. If, for example, the root or /usr partition is on a disk attached to a controller on the board, the board cannot be detached unless there is a hardware alternate path to the disk or the disk is mirrored. The alternate path and/or the mirrors must be hosted by other boards in the system. The same applies to network controllers. The board that hosts the Ethernet controller that connects the SSP to the Enterprise 10000 platform cannot be detached unless an alternate path exists to an Ethernet controller on another board for this network connection.
The system swap space should be configured as multiple partitions on disks attached to controllers hosted by different boards. With this kind of configuration, a particular swap partition is not a vital resource since swap partitions can be added and deleted dynamically. See swap(1M) for more information.

Note - When memory (swapfs) or disk swap space is detached, there must be enough memory or swap disk space remaining in the system to accommodate currently running programs.

A board hosting non-vital system resources can be detached whether or not there are alternate paths to the resources. There is a domain disruption penalty associated with the detach operation since all of the board's devices must be closed before the board can be detached. All of its file systems must be unmounted and its swap partitions must be deleted. You may have to kill processes that have open files or devices, or place a hard lock on the file systems (using lockfs(1M)) before unmounting them.
All I/O device drivers must support the DDI_DETACH option in the driver detach entry point. This option releases all system resources associated with that device or adapter.

Memory


Note - The information in this section applies only to systems in which memory is not interleaved between boards.

If you use memory interleaving between system boards, those system boards cannot be detached. This is because DR does not yet support inter-board interleaving. By default, hpost(1M) does not set up boards with interleaved memory. Look for the following line in the hpost(1M) file .postrc (see postrc(4)):

  mem_board_interleave_ok  

If mem_board_interleave_ok is present, you may not be able to detach a board that contains memory.
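The check for this directive can be scripted. The sketch below is illustrative only: the function name is hypothetical, the location of .postrc is site-specific, and treating lines that begin with # as comments is an assumption that should be confirmed against postrc(4).

```shell
# Return success (0) if the given file has an uncommented line that mentions
# mem_board_interleave_ok. Treating '#' as the comment character is an assumption.
interleave_enabled() {
    grep -v '^[[:space:]]*#' "$1" | grep 'mem_board_interleave_ok' >/dev/null
}

# Example with a sample file; the real path of .postrc is site-specific.
sample=$(mktemp)
printf '%s\n' '# sample directives' 'mem_board_interleave_ok' > "$sample"
if interleave_enabled "$sample"; then
    echo "inter-board interleaving may be enabled"
fi
rm -f "$sample"
```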
Before a board can be detached, the memory on that board must be vacated by the operating system. Vacating a board means flushing its pageable memory to swap space and copying its permanent memory--that is, non-pageable kernel and OBP memory--to another memory board. When permanent memory is on the detaching board, the operating system must find other memory to receive the copy.
You can use the dr(1M) command drshow(1M), as follows, to determine whether a board's memory is pageable:

  % dr  
  dr> drshow board_number mem  

Alternatively, you can determine whether a board's memory is pageable by looking at the DR Memory Configuration window, which is available when you perform a detach operation within Hostview. The DR Memory Configuration window is described in "Viewing System Information" on page 3-23.

Target Memory Constraints

When permanent memory is detached, DR chooses a target memory area to receive a copy of the memory. The target memory area must adhere to the following rules. The DR software automatically checks for total adherence, and disallows the DR memory operation if it cannot verify that. The rules are shown here to illustrate the possible reasons why a DR memory operation might be disallowed.
  • It must be large enough to hold a copy of the non-pageable memory.
  • It must not be interleaved with memory on other boards.
If no target board is found, the detach operation is refused and DR displays the following messages on the system console:

  Jul 28 06:00:00 unix: WARNING:dr_build_adg_detach_list:no target  
  memory board found  

Swap Space

The system swap configuration consists of the swap devices and swapfs (memory). The system must contain enough swap space so it can flush pageable memory. For example, if you want to remove 1 GB of memory from a 2 GB system, you will need up to 1 GB of swap space, depending on the load.
Insufficient swap space prevents DR from completing the detach of a board that contains memory. If this happens, the memory drain phase of the detach operation does not complete, and you must abort the operation.

Reconfiguration After a DR Operation

This section describes how to reconfigure your system after you have attached or detached a system board.

I/O Devices

The DR user interface enables you to reconfigure the system after a DR Attach or DR Detach operation. The reconfiguration sequence is the same as the reconfiguration boot sequence (boot -r):

  drvconfig; devlinks; disks; ports; tapes;  

When the reconfiguration sequence is executed after a board is attached, device path names not previously seen by the system are entered into the /etc/path_to_inst file. The same path names are also added to the /devices hierarchy and links to them are created in the /dev directory.

Disk Devices

Disk controllers are numbered consecutively as they are encountered by the disks(1M) program. All disk partitions are assigned /dev names according to the disk controller number that disks(1M) assigns. For example, all disk partitions accessible via disk controller 1 are named /dev/dsk/c1tLdMsN, where c1 is the disk controller number and, in most cases, L corresponds to the disk target number, M to the logical unit number, and N to the partition number. When the reconfiguration sequence is executed after a board is detached, the /dev links for all the disk partitions on that board are deleted. The remaining boards retain their current numbering. Disk controllers on a newly inserted board are assigned the next available lowest number by disks(1M).
For example, suppose there are four system boards numbered 0 to 3, and you detach boards 1 and 2, which are then removed from the system. Your service provider repairs board 2 and reinserts it, and you attach it; board 1 is still out.
If you now execute disks(1M), controller numbers from board 1 are reassigned to controllers on board 2 if the old board 1 controller numbers are the next available lowest numbers.
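The "next available lowest number" rule can be pictured with a short shell sketch. This is a simplified model for illustration, not the actual logic of disks(1M); the function name and the controller numbers in the example are hypothetical.

```shell
# Print the lowest non-negative controller number that is not in the list
# of numbers given as arguments (the numbers currently in use).
next_free_controller() {
    candidate=0
    while :; do
        in_use=no
        for n in "$@"; do
            if [ "$n" -eq "$candidate" ]; then
                in_use=yes
                break
            fi
        done
        if [ "$in_use" = no ]; then
            echo "$candidate"
            return 0
        fi
        candidate=$((candidate + 1))
    done
}

# Boards 0 and 3 still hold controllers 0, 1, 4, and 5; the numbers once
# used by the removed boards are free, so the reattached board gets 2.
next_free_controller 0 1 4 5   # prints 2
```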

Note - The disk controller number is part of the /dev link name used to access the disk. If that number changes during the reconfiguration sequence, the /dev link name also changes. This change may affect file system tables and software, such as Solstice DiskSuite™, which uses the /dev link names. Update /etc/vfstab files and execute other administrative actions necessary due to the changes in the /dev link names.

When to Reconfigure

You should reconfigure the system under several conditions, as described below.
  • Board Addition - When adding a board to a domain, you must execute the reconfiguration sequence to configure the I/O devices associated with the board.
  • Board Deletion - If you remove a board that is not to be replaced, you may (but do not have to) execute the reconfiguration sequence to clean up the /dev links for disk devices.
  • Board Replacement - If you remove a board then reinsert it in a different slot, or replace a board with another board that has different I/O devices, you must execute the reconfiguration sequence to configure the I/O devices associated with the board. However, if you replace a board with another board that hosts the same set of I/O devices, inserting the replacement into the same slot, you do not need to execute the reconfiguration sequence. But be sure to insert a replacement into the same slot that was vacated to retain the original /dev link names.

DR and AP Interaction

DR notifies the Alternate Pathing (AP) subsystem when system boards are attached, detached, or placed in the detach drain state. In addition, DR queries AP about which controllers are in the AP database and their status (active or inactive). This communication occurs between the dr_daemon(1M) and ap_daemon(1M). If the ap_daemon(1M) is not present, an error message is placed in the system messages buffer and DR operations continue without error. To disable this interaction, use the -a option when invoking dr_daemon(1M). See the dr_daemon(1M) man page in the Solaris Reference Manual for SMCC-Specific Software.
For more information, also see the Alternate Pathing 2.0 User's Guide.

RPC Timeout or Loss of Connection

dr_daemon(1M) (which runs in each domain) communicates with Hostview and the dr(1M) shell application (both of which run on the SSP) via remote procedure calls (RPC). If an RPC timeout or connection failure is reported during a DR operation, check the domain. The daemon must be configured in the domain's /etc/inetd.conf file. The following line (which appears on a single line) must be present in the file:

  300326/4 tli rpc/tcp wait root /usr/platform/sun4u1/sbin/dr_daemon dr_daemon  

If the DR daemon is configured in /etc/inetd.conf, kill the dr_daemon(1M) if it is currently running. In addition, send a HUP signal to the inetd(1M) daemon to cause it to re-read the inetd.conf(4) configuration file:

  # kill dr_daemon_pid  
  # kill -HUP inetd_pid  

where dr_daemon_pid is the process ID of the DR daemon, and inetd_pid is the process ID of the inetd(1M) daemon. You can check /var/adm/messages for possible error messages from inetd(1M) if it's having trouble starting the dr_daemon(1M). The DR daemon executable file should exist in /usr/platform/sun4u1/sbin/dr_daemon.
At this point you should try the DR operation again, starting from the beginning.
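Before retrying, the configuration check can be automated with a sketch such as the following. The function name is hypothetical. The test it applies (first field 300326/4, last field dr_daemon, line not commented out) is an assumption about what counts as a valid entry, based on the line shown above.

```shell
# Return success (0) if the given inetd.conf-style file contains an active
# (uncommented) entry whose first field is 300326/4 and whose last field
# is dr_daemon.
dr_daemon_configured() {
    awk '$1 == "300326/4" && $NF == "dr_daemon" { found = 1 } END { exit !found }' "$1"
}

# Example with a sample file; on a domain the argument would be /etc/inetd.conf.
sample=$(mktemp)
printf '%s\n' '300326/4 tli rpc/tcp wait root /usr/platform/sun4u1/sbin/dr_daemon dr_daemon' > "$sample"
if dr_daemon_configured "$sample"; then
    echo "dr_daemon entry found"
else
    echo "dr_daemon entry missing or commented out"
fi
rm -f "$sample"
```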

Operating System Quiesce

During a DR detach operation on a system board with non-pageable OBP or kernel memory, the operating system is briefly quiesced; that is, all operating system and device activity on the domain centerplane must cease for a few seconds during a critical phase of the operation. The quiesce only affects the target domain; other domains in the system are not affected.
Before it can quiesce, the operating system must temporarily suspend all processes, processors, and device activities. If the operating system cannot quiesce, it displays its reasons, which may include the following:
  • A user thread in the domain did not suspend.
  • Real-time processes are running in the domain.
  • A device that cannot be quiesced by the operating system (i.e., a suspend-unsafe device) is open.
The conditions that cause processes to fail to suspend are generally temporary in nature. You can retry the operation until the quiesce succeeds.
A quiesce failure due to real-time processes or open suspend-unsafe devices is known as a forcible condition. You have the option of performing either a retry or forced retry. When you force the quiesce, you give the operating system permission to continue with the quiesce even if forcible conditions are still present.

Caution - Exercise care when using the force option.

If a real-time process is running, determine whether suspending the process would have an adverse effect on the functions performed by the process. If not, you can force the operating system to quiesce. (To force a quiesce, you can either choose the Force button within Hostview as described in "To Detach a Board with Hostview" on page 3-18, or enter the complete_detach command with the force option within the dr(1M) shell application.) Otherwise, you can abort the operation and try again later.
If any suspend-unsafe device in the domain is open and cannot be closed, you can manually suspend the device, and then force the operating system to quiesce. After the operating system resumes, you can manually resume the device. See "Suspend-Safe / Suspend-Unsafe Devices" on page 2-11.
If the operating system fails to quiesce, pay close attention to the reasons for the failure. If the operating system encountered a transient condition--a failure to suspend a process--you can try the operation again. If, however, the condition(s) require your approval (e.g. a real-time process is running) or intervention (e.g. a suspend-unsafe device open), you can force the operating system to quiesce.
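The decision described in this section can be summarized as a small lookup. The sketch below is only a memory aid: the function name and the reason keywords are hypothetical labels and are not strings that DR prints.

```shell
# Map a quiesce failure reason to the responses described in the text.
# The reason keywords are hypothetical labels, not strings printed by DR.
quiesce_response() {
    case "$1" in
        thread-not-suspended)
            # Transient condition: simply try again.
            echo "retry" ;;
        realtime-process|suspend-unsafe-open)
            # Forcible condition: needs approval or manual intervention.
            echo "retry, force, or abort" ;;
        *)
            echo "unknown reason" ;;
    esac
}

quiesce_response thread-not-suspended   # prints: retry
quiesce_response suspend-unsafe-open    # prints: retry, force, or abort
```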

Suspend-Safe / Suspend-Unsafe Devices

A suspend-safe device is one that does not access the domain centerplane (e.g., does not access memory or interrupt the system) while the operating system is quiesced. A driver is considered suspend-safe if it supports operating system quiescence (suspend/resume) and guarantees that when a suspend request is successfully completed, the device that the driver manages will not attempt to access the domain centerplane, even if the device is open when the suspend request is made. All other I/O devices are suspend-unsafe when open.
The drivers released by Sun Microsystems that are known to be suspend-safe are sd, isp, esp, fas, hme (Sun FastEthernet™), nf (NPI-FDDI), qe (Quad Ethernet), le (Lance Ethernet), and the SSA drivers (soc, pln, and ssd). See sd(7D), isp(7D), esp(7D), and ssd(7D) in the SunOS Reference Manual. The known suspend-unsafe drivers are the tape-related drivers (sga and st). These lists will change over time. To add a driver to either list, see "Adding Suspend-Safe Devices" on page 2-14.
The operating system refuses a quiesce request if a suspend-unsafe device is open. If you can manually suspend the device, you can force the operating system to quiesce. To manually suspend the device, you may have to close the device by killing the processes that have it open, ask users not to use the device, or disconnect the cables. For example, if a device that allows asynchronous unsolicited input is open, you can disconnect its cables prior to quiescing the operating system and reconnect them after the operating system resumes. Doing this prevents traffic from arriving at the device and, thus, the device has no reason to access the domain centerplane. If you cannot make a device suspend its accesses to the domain centerplane, you should not force the operating system to quiesce. Doing so could cause a domain to crash or hang. Instead, postpone the DR operation until the suspend-unsafe device is no longer open.

Warning - If you attempt to do a forced quiesce operation when suspend-unsafe devices are present, the domain may hang if any activity occurs on any suspend-unsafe device during the period of system quiescence. This will not affect other running domains on the Ultra Enterprise 10000 system.

Special Quiesce Handling for Tape Devices

The sequential nature of tape devices prevents them from being reliably suspended in the middle of an operation, and then resumed. Therefore, all tape drivers are suspend-unsafe and cannot be quiesced. Before executing a DR operation that quiesces the operating system, make sure all tape devices are closed or not in use.

DR and DDI

Not all drivers support the Ultra Enterprise 10000 system's Dynamic Reconfiguration (DR) features. To support DR, a driver must be able to perform two basic DDI/DKI (Device Driver Interface/Driver-Kernel Interface) functions: DDI_DETACH and DDI_SUSPEND/DDI_RESUME. These two functions impact DR in different ways.

DR and DDI_DETACH

You can detach a system board that hosts a device only if the driver for that device supports the DDI_DETACH interface, or is not currently loaded. DDI_DETACH provides the ability to detach a particular instance of a driver without impacting other instances that are servicing other devices. A driver that supports DDI_DETACH is called detach-safe; a driver that does not support DDI_DETACH is called detach-unsafe. See "DR Detach-Safe Devices" on page 2-16.

· To Detach a Detach-Unsafe Driver that is Loaded

  1. Stop all usage of the controller for the detach-unsafe device, and all other controllers of the same type on all boards in the domain.

    Since the detach-unsafe driver must be modunloaded (next step), you must stop use of that controller type on all system boards in the domain. The remaining controllers can be used again after the DR Detach is complete.

  2. Use standard Solaris interfaces to manually close and modunload all such drivers on the board.

    See modload(1M) in the SunOS Reference Manual.

  3. Detach the system board in the normal fashion.

If you cannot execute the above steps, you may reboot your domain with the board blacklisted (see blacklist(4)), and the board can later be removed.

Note - Many third-party drivers (those purchased from vendors other than Sun Microsystems) do not properly support the standard Solaris modunload interface. Conditions that invoke the functions occur infrequently during normal operation and the functions are sometimes missing or work improperly. Sun Microsystems strongly suggests that you test these driver functions during the qualification and installation phases of any third-party device.

DR and DDI_SUSPEND/DDI_RESUME

To perform a DR detach of a board that contains non-pageable OBP or kernel memory, the domain must be quiesced. Memory can be detached only when all drivers throughout the entire domain (not just on the board being detached) either support the DDI_SUSPEND/DDI_RESUME driver interface, or are closed. Drivers that support these DDI functions are called suspend-safe; drivers that do not are called suspend-unsafe. See "Adding Suspend-Safe Devices" on page 2-14. Note that a quiesce is required only when detaching a board that contains non-pageable memory.
The most straightforward way to quiesce a domain is to close any suspend-unsafe devices. For each network driver you must execute the ifconfig(1M) command with its down parameter, then again with its unplumb parameter. See ifconfig(1M).
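The two-step sequence for each interface can be generated with a short dry-run helper. This sketch only prints the commands so that they can be reviewed before being run by hand. The function name and the interface names in the example are hypothetical.

```shell
# Print, without running them, the ifconfig(1M) commands that would close
# each named interface: first "down", then "unplumb".
print_unplumb_commands() {
    for ifname in "$@"; do
        echo "ifconfig $ifname down"
        echo "ifconfig $ifname unplumb"
    done
}

# xyz0 and xyz1 stand in for interfaces of a suspend-unsafe network driver.
print_unplumb_commands xyz0 xyz1
```

Printing rather than executing keeps the helper harmless on a production domain, where taking an interface down has immediate effects.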

Note - It should be possible to unplumb all network drivers. However, this action is rarely tested in normal environments and may result in driver error conditions. If you intend to use DR, Sun Microsystems strongly suggests that you test these driver functions during the qualification and installation phases of any suspend-unsafe device.

If the system refuses to quiesce because a suspend-unsafe driver is open, you can force the operating system to quiesce. Doing so forces the system to permit the detach. Note that, although a detach can be forced to proceed when there are open suspend-unsafe devices in the system, it is not possible to force a detach when a detach-unsafe device resides on the board and its driver is loaded.
To successfully force the operating system to quiesce, you must manually quiesce the controller. Procedures to do that, if any, are device-specific. The device must not transfer any data, reference memory, or generate interrupts during the operation. Be sure to test any procedures used to quiesce the controller while it is open prior to executing them on a production system.

Caution - Using the force option to quiesce the operating system without first successfully quiescing the controller can result in a domain failure and subsequent reboot.

Adding Suspend-Safe Devices

Before each Enterprise 10000 system is shipped, Sun Microsystems configures the DR driver (dr) to recognize those devices that support DDI_SUSPEND/DDI_RESUME; that is, that can be safely quiesced. The Note in "Suspend-Safe / Suspend-Unsafe Devices" on page 2-11 lists the known suspend-safe and suspend-unsafe devices at the time this document was printed.
If you want to add a device to your system and the device and its driver support DDI_SUSPEND/DDI_RESUME, configure the DR driver to recognize the device as suspend-safe by placing an entry in the /etc/system file as described below. This file enables you to append the list already maintained in the system. No harm results from a device being listed multiple times. If you are not sure whether a device supports DDI_SUSPEND/DDI_RESUME, ask your service provider or the manufacturer of the device.
If a listed device is open when a quiesce is requested, the device is suspended and resumed prior to the quiesce. Tape devices are not suspend-safe; do not append such devices to the suspend-safe list via the /etc/system file.

Note - Previously, the suspend-safe list was called the dr_safe list. You can still use the old name, but when the dr module is first loaded, the following messages are displayed:
  NOTICE: dr: using old style safe/bypass list (dr_safe_listx)  
  NOTICE: dr: next time use new style (suspend_safe_listx)  


To add new devices that support quiescing to the /etc/system file, use the following format, where device names represent device driver module names:

  set hswp:suspend_safe_list1="device1 device2 ... devicen"  
  set hswp:suspend_safe_list2="device1 device2 ... devicen"  
  set hswp:suspend_safe_list3="device1 device2 ... devicen"  
  set hswp:suspend_safe_list4="device1 device2 ... devicen"  
  set hswp:suspend_safe_list5="device1 device2 ... devicen"  


Note - The /etc/system file can contain up to five suspend-safe strings, each no more than 128 characters long.
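To avoid exceeding these limits, an entry can be built and checked before it is pasted into /etc/system. The following is a sketch under stated assumptions: the function name and the driver names are hypothetical, and the 128-character limit is assumed to apply to the quoted list of driver names rather than to the whole line.

```shell
# Print one /etc/system line for suspend-safe list number $1 (1-5) containing
# the driver names in $2. Fails if the number or the length is out of range.
suspend_safe_line() {
    case "$1" in
        [1-5]) ;;
        *) echo "list number must be 1 through 5" >&2; return 1 ;;
    esac
    # Assumption: the 128-character limit applies to the quoted string.
    if [ "${#2}" -gt 128 ]; then
        echo "driver list is longer than 128 characters" >&2
        return 1
    fi
    printf 'set hswp:suspend_safe_list%s="%s"\n' "$1" "$2"
}

# drv_a and drv_b are placeholder driver module names.
suspend_safe_line 1 "drv_a drv_b"
```

The helper writes to standard output only; it never edits /etc/system itself.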

Adding Suspend-Bypass Devices

The Ultra Enterprise 10000 system has a preset list of devices that it ignores during the quiesce process, making no attempt to quiesce them. These devices, which include pseudo devices, do not perform I/O operations and do not need to be suspended during the quiesce.

Caution - Do not add suspend-unsafe devices to the suspend-bypass list.

You can add devices to the /etc/system file which do not support quiescing, but which can be safely ignored during the quiesce process. Do so in the following format, where device names represent device driver module names.

  set hswp:suspend_bypass_list1="device1 device2 ... devicen"  
  set hswp:suspend_bypass_list2="device1 device2 ... devicen"  
  set hswp:suspend_bypass_list3="device1 device2 ... devicen"  
  set hswp:suspend_bypass_list4="device1 device2 ... devicen"  
  set hswp:suspend_bypass_list5="device1 device2 ... devicen"  


Note - The /etc/system file can contain up to five suspend-bypass strings, each no more than 128 characters long.

DR Detach-Safe Devices

Before each Ultra Enterprise 10000 system is shipped, DR is configured to recognize those devices that can be safely detached. A driver is safe for detaching if it fully supports the DDI/DKI DDI_DETACH function in the driver's detach entry point. Normally, such DR-capable drivers must also support DDI_SUSPEND and DDI_RESUME, as described in "DR and DDI_SUSPEND/DDI_RESUME" on page 2-13. However, some exceptions do exist, such as tape devices that can be detach-safe while they are inherently suspend-unsafe.
If you want to add a device to your system and the device and its driver can be safely detached, be sure to add the device name to the detach-safe list in the /etc/system file. This file enables you to append the list already maintained in the system. No harm results when a device is listed multiple times. If you are not sure whether a device can be safely detached, ask your service provider.
If a DR Detach operation fails because the board hosts a device that is not included in the detach-safe list, and the corresponding driver is loaded, the system displays a message similar to the following:

  WARNING: DR: driver (xxx) not known to support DDI_DETACH  

where xxx is the name of the driver module as it would reside under /kernel/drv and named in /etc/name_to_major.

Adding Detach-Safe Devices

Add new devices that support DR Detach to the /etc/system file in the following format, where device names represent device driver module names:

  set dr:detach_safe_list1="device1 device2 ... devicen"  
  set dr:detach_safe_list2="device1 device2 ... devicen"  
  set dr:detach_safe_list3="device1 device2 ... devicen"  
  set dr:detach_safe_list4="device1 device2 ... devicen"  
  set dr:detach_safe_list5="device1 device2 ... devicen"  


Note - The /etc/system file can contain up to five detach-safe strings, each no more than 128 characters long.