Chapter 6 OS Distribution and Deployment Problems
This chapter lists problems that can occur during OS distribution creation and OS deployment, along with their causes and the solution for each problem. The following topics are discussed:
Possible Causes for Distribution and Deployment Failure
OS deployments might fail or fail to complete if any of the following conditions occur:
Possible Causes for Windows Distribution and Deployment Failure
Provisioning a Windows distribution to a managed server can fail for several reasons:
Deploying Solaris OS 9 Update 7 or Previous Distributions From a Linux OS Management Server Fails
The inability to deploy Solaris OS 9 Update 7 and previous Solaris OS 9 distributions to manageable servers from a Linux OS management server is usually due to a problem with NFS mounts. To solve this problem, you need to apply a patch to the mini-root of the Solaris OS 9 distribution. The instructions differ according to the management and patch server configuration scenarios listed in the following table. The patch is not required if you are deploying Solaris OS 9 Update 8 or later.
Table 6–1 Task Map for Patching a Solaris OS 9 Distribution
Invalid Management Server Netmask
If the target server cannot access DHCP information or mount the distribution directories on the management server during a Solaris OS 10 deployment, you might have network problems caused by an invalid netmask. The console output might be similar to the following:
To fix the problem, set the management server netmask value to 255.255.255.0. See To Configure the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

Linux OS Deployment Stops
If you are deploying a Linux OS and the deployment stops, check the console of the target server to see whether the installer is in interactive mode. If the installer is in interactive mode, the deployment timed out because of a delay in the transmission of data from the management server to the target server. This delay usually occurs because the switch or switches connecting the two machines have spanning tree enabled. Either turn off spanning tree on the switch or disable spanning tree for the ports that are connected to the management server and the target server. If spanning tree is already disabled and OS deployment stops, a problem might exist with your network.
Note – For Red Hat installations to work with some networking configurations, you must enable spanning tree.

Management Server Reboots During load os Operations
If the IP address range specified for discovery includes the management server IP addresses, and the management server service processor port is connected to the management network, the discovery process discovers the management server. A subsequent load os operation that includes the discovered management server might then attempt to load an OS onto the management server, which causes the management server to reboot.
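One way to catch this condition before running discovery is to test whether any management server address falls inside the planned discovery range. The following is a minimal sketch using Python's standard ipaddress module. The function name and all addresses are hypothetical and are not part of N1 System Manager.

```python
import ipaddress

def range_contains(start, end, address):
    """Return True if address lies within the inclusive range start..end."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    return first <= ipaddress.ip_address(address) <= last

# Hypothetical discovery range and management server addresses.
discovery_start = "192.168.0.1"
discovery_end = "192.168.0.254"
management_ips = ["192.168.0.10", "10.0.0.5"]

# Addresses listed here would be found by discovery and should be excluded.
overlapping = [ip for ip in management_ips
               if range_contains(discovery_start, discovery_end, ip)]
print(overlapping)
```

Any address reported by the check is a candidate for accidental inclusion in a later load os operation.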
Solution:

Mount Point Issues
Distribution copy failures might also occur if there are file systems on the /mnt mount point. Move all file systems off the /mnt mount point before attempting create os command operations.

OS Deployment Fails on a Sun Fire V20z or V40z With internal error Message
If OS deployment fails on a Sun Fire V20z or a V40z server with the internal error occurred message provided in the job results, direct the platform console output to the service processor. If the platform console output cannot be directed to the service processor, reboot the service processor. To reboot the service processor, log in to the service processor and run the sp reboot command. To check the console output, log in to the service processor, and run the platform console command. Examine the output during OS deployment to resolve the problem.

OS Deployment Fails on a Sun Blade X8400 Server Blade That Has Correct Firmware
Provisioning an OS distribution to a Sun Blade X8400 server blade will fail if the following conditions are met:
To provision an OS distribution to a Sun Blade X8400 server blade, use either of the following two methods:
For further information about the bootnetworkdevice and the networkdevice options, see Chapter 3, Provisioning Sun Blade X8400 Server Modules in the Sun Blade 8000 Chassis, in Sun N1 System Manager 1.3.1 What’s New.

OS Distribution Creation Fails With a Copying Files Error
If the creation of an OS distribution fails with a copying files error, check the size of the ISO image and ensure that it is not corrupted. You might see output similar to the following example in the job details:
In the above case, try copying a different set of distribution files to the management server. See To Copy an OS Distribution From CDs or a DVD in Sun N1 System Manager 1.3 Operating System Provisioning Guide or To Copy an OS Distribution From ISO Files in Sun N1 System Manager 1.3 Operating System Provisioning Guide.

Red Hat Linux OS Deployment Fails on a Sun Blade X8400 Server Blade With Factory-Default Firmware or After a Firmware Update
Linux OS deployment fails after some Sun Blade X8400 server blade BIOS firmware upgrades and on factory default blades. Some BIOS firmware upgrades may cause CMOS checksum errors, prompting you to restore the default CMOS settings after the server blade resets. The default BIOS settings will not work with Linux. To resolve this problem:
Red Hat Linux OS Profile Creation Fails
Building Red Hat OS profiles on the N1 System Manager might require additional analysis to avoid failures. If you have a problem with a custom OS profile, perform the following steps while the problem deployment is still active.
The failed_ks_cfg file will contain all of the KickStart parameters, including those that you customized. Verify that the parameters stated in the configuration file are appropriate for the current hardware configuration. Correct any errors and try the deployment again.

Restarting NFS to Resolve Boot Failed Errors
Boot Failed messages occur when the management server cannot access files during a Load OS operation, and appear similar to the following example.
Note – The message differs depending on the OS that is being deployed.
Stale NFS file handles are the most common cause of this problem. Log in to the management server as root (su - root) and restart NFS.
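Before restarting NFS, it can be useful to confirm that stale handles are really present. The following is a minimal sketch, assuming Python is available on a machine that mounts the management server's exported directories over NFS. The function name and the example path are hypothetical. A stale handle surfaces as an OSError whose errno is ESTALE when the path is accessed.

```python
import errno
import os

def stale_nfs_paths(paths):
    """Return the paths whose stat() fails with ESTALE (stale NFS file handle)."""
    stale = []
    for path in paths:
        try:
            os.stat(path)
        except OSError as exc:
            # Other errors (missing path, permissions) are not stale handles.
            if exc.errno == errno.ESTALE:
                stale.append(path)
    return stale

# Hypothetical usage: replace with the NFS-mounted directories to check.
# print(stale_nfs_paths(["/path/to/mounted/distribution"]))
```

An empty result does not rule out other NFS problems. It indicates only that none of the listed paths returned a stale handle at the time of the check.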
Solaris OS Deployment Job Times Out or Stops
If you attempt to load a Solaris OS profile and the OS Deploy job times out or stops, check the output in the job details to ensure that the target server completed a PXE boot. For example:
If the PXE boot fails, the /etc/dhcpd.conf file on the management server might have erroneous network interface connection entries, which can occur if incorrect information is specified during the N1 System Manager configuration process.
Note – The best diagnostic tool is to open a console window on the target machine and then run the deployment. See Connecting to the Serial Console for a Managed Server in Sun N1 System Manager 1.3 Discovery and Administration Guide.
If you suspect that the /etc/dhcpd.conf file was configured incorrectly, perform the following steps to modify the configuration.
Solaris OS Profile Installation Fails
OS profiles that install only the Core System Support distribution group do not load successfully. Specify “Entire Distribution plus OEM Support” as the value for the distributiongroup parameter. This setting configures a profile that installs the version of SSH and the other tools that are required for servers to be managed by the N1 System Manager.

SuSE OS Profile Fails to Load on a Sun Fire V20z or Sun Fire V40z
Loading a SuSE OS profile on a Sun Fire X4000 series server modifies the associated SuSE OS distribution, which makes the distribution unusable by Sun Fire V20z and V40z servers. To avoid this problem, create separate SuSE Linux Enterprise Server 9 and SuSE Linux Enterprise Server 9 SP1 OS distributions and profiles: one set for the Sun Fire V20z and V40z servers, and one set for the Sun Fire X4000 series servers.

Windows Deployment Fails after Upgrade from N1 System Manager 1.3 to 1.3.1
The upgrade from N1 System Manager 1.3 to 1.3.1 does not upgrade the scripts and drivers in the Windows RIS server C:\N1SM directory. To upgrade the RIS server for N1 System Manager 1.3.1 you must perform the following tasks:
The following procedure provides the specific steps required to update the RIS server for N1 System Manager 1.3.1.
Add, Delete, or Modify Windows RIS server? ([n]/y) y
CURRENT RIS Servers:
ID: 1
Name: default
HostName:
IP: 192.168.0.100
Subnet_Address: 192.168.0.0
OSP_Location: C:\\\\N1SM
RIS_Share_Path: D:\\RemoteInstall
Active_dir_domain: mularis.sfbay.sun.com
Active_dir_user: n1smuser
ssh_user: n1smuser
Delete this RIS server? ([n]/y)
Type y to delete the RIS server from N1 System Manager.
Respond to the remaining prompts as appropriate for your network and N1 System Manager configuration.
Add the RIS server to N1 System Manager as follows.
Type n1smconfig.
The current N1 System Manager configuration is displayed.
You are notified that only options that can be changed will be displayed.
Type y to continue.
Respond to each prompt as appropriate for your network and N1 System Manager configuration.
Type y when prompted Add, Delete, or Modify Windows RIS server? ([n]/y).
Respond to each prompt, specifying the values that were displayed in Step 3, substep a.
After RIS server configuration is completed, respond to the remaining prompts as appropriate for your network and N1 System Manager configuration.
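Because the RIS server values shown during deletion must be typed in again when the server is added back, it helps to record them first. The following is a minimal sketch that extracts the key and value pairs from saved console text. The function name is hypothetical, and the sample text is abbreviated from the output shown earlier in this procedure. It assumes that each setting appears as a single "Key: value" line and that setting names contain no spaces.

```python
def parse_ris_settings(console_text):
    """Collect 'Key: value' lines from saved n1smconfig console output."""
    settings = {}
    for line in console_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip prompts (no colon) and headings such as "CURRENT RIS Servers:".
        if not sep or not key or " " in key:
            continue
        settings[key] = value.strip()
    return settings

SAMPLE = r"""CURRENT RIS Servers:
ID: 1
Name: default
HostName:
IP: 192.168.0.100
RIS_Share_Path: D:\\RemoteInstall
Delete this RIS server? ([n]/y)
"""

for name, value in parse_ris_settings(SAMPLE).items():
    print(name, "=", value)
```

Splitting on the first colon only keeps Windows paths such as the RIS share path intact, and empty values such as HostName are preserved as empty strings.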