Chapter 5
This chapter contains information and tasks about using the Logical Domains software that are not described in the preceding chapters.
The following sections describe the restrictions on entering names in the Logical Domains Manager CLI.
The logical domain configuration name (config_name) that you assign to a configuration stored on the system controller must have no more than 64 characters.
This section shows the syntax usage for the ldm subcommands, defines some output terms, such as flags and utilization statistics, and provides examples of the output.
If you are creating scripts that use ldm list command output, always use the -p option to produce the machine-readable form of the output. See To Generate a Parseable, Machine-Readable List (-p) for more information.
The following flags can be shown in the output for a domain:
| - | placeholder |
| c | control domain |
| d | delayed reconfiguration |
| n | normal |
| s | starting or stopping |
| t | transition |
| v | virtual I/O domain |
If you use the long (-l) option for the command, the flags are spelled out. If not, you see the letter abbreviation.
The list flag values are position dependent. Following are the values that can appear in each of the five columns from left to right:
| Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
| s or - | n or t | d or - | c or - | v or - |
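A script that reads the FLAGS field can decode it by position. The following Python sketch is illustrative only and is not part of the Logical Domains software; the names COLUMN_FLAGS and decode_flags are assumptions made for this example, and the column assignments follow the flag list and the sample outputs in this chapter.

```python
# Hypothetical helper: decode the five-column FLAGS value shown by ldm list.
# Column assignments are taken from the flag list in this section.
COLUMN_FLAGS = [
    {"s": "starting or stopping"},        # column 1
    {"n": "normal", "t": "transition"},   # column 2
    {"d": "delayed reconfiguration"},     # column 3
    {"c": "control domain"},              # column 4
    {"v": "virtual I/O domain"},          # column 5
]


def decode_flags(flags):
    """Return the meanings of the flags set in a five-character FLAGS string."""
    if len(flags) != len(COLUMN_FLAGS):
        raise ValueError("expected %d flag columns, got %r" % (len(COLUMN_FLAGS), flags))
    meanings = []
    for position, (char, allowed) in enumerate(zip(flags, COLUMN_FLAGS), start=1):
        if char == "-":  # placeholder: nothing is set in this column
            continue
        if char not in allowed:
            raise ValueError("unexpected flag %r in column %d" % (char, position))
        meanings.append(allowed[char])
    return meanings


print(decode_flags("-n-cv"))  # ['normal', 'control domain', 'virtual I/O domain']
print(decode_flags("s----"))  # ['starting or stopping']
```

Because the values are position dependent, the sketch rejects a letter that appears in the wrong column instead of guessing its meaning.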
The per virtual CPU utilization statistic (UTIL) is shown on the long (-l) option of the ldm list command. The statistic is the percentage of time, since the last statistics display, that the virtual CPU spent executing on behalf of the guest operating system. A virtual CPU is considered to be executing on behalf of the guest operating system except when it has been yielded to the hypervisor. If the guest operating system does not yield virtual CPUs to the hypervisor, the utilization of CPUs in the guest operating system will always show as 100%.
The utilization statistic reported for a logical domain is the average of the virtual CPU utilizations for the virtual CPUs in the domain.
To view the current software versions installed, do the following and you receive a listing similar to the following.
primary$ ldm list
NAME     STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
primary  active  -t-cv        4     1G      0.5%  3d 21h 7m
ldg1     active  -t---  5000  8     1G      23%   2m

primary# ldm list-domain ldg1
NAME  STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
ldg1  active  -t---  5000  8     1G      0.3%  2m

primary$ ldm list-variable boot-device ldg1
boot-device=/virtual-devices@100/channel-devices@200/disk@0:a
To the Logical Domains Manager, constraints are one or more resources you want to have assigned to a particular domain. You either receive all the resources you ask to be added to a domain or you get none of them, depending upon the available resources. The list-constraints subcommand lists those resources you requested assigned to the domain.
An ldm stop-domain command can time out before the domain completes shutting down. When this happens, an error similar to the following is returned by the Logical Domains Manager:
LDom ldg8 stop notification failed |
However, the domain could still be processing the shutdown request. Use the ldm list-domain command to verify the status of the domain. For example:
# ldm list-domain ldg8
NAME  STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
ldg8  active  s----  5000  22    3328M   0.3%  1d 14h 31m
The preceding list shows the domain as active, but the s flag indicates that the domain is in the process of stopping. This should be a transitory state.
The following example shows the domain has now stopped:
# ldm list-domain ldg8
NAME  STATE  FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
ldg8  bound  -----  5000  22    3328M
There is no way to determine the Solaris OS network interface name on a guest, corresponding to a given virtual device, directly from the output provided by the ldm list-* commands. However, you can do this by using a combination of the output from ldm list -l command and the entries under /devices on the Solaris OS guest.
In this example, guest domain ldg1 contains two virtual network devices, net-a and net-c. To find the Solaris OS network interface name in ldg1 that corresponds to net-c, do the following.
Use the ldm command to find the virtual network device instance for net-c.
# ldm list -l ldg1
...
NETWORK
    NAME   SERVICE               DEVICE     MAC
    net-a  primary-vsw0@primary  network@0  00:14:4f:f8:91:4f
    net-c  primary-vsw0@primary  network@2  00:14:4f:f8:dd:68
...
#
To find the corresponding network interface on ldg1, log into ldg1 and find the entry for this instance under /devices.
# uname -n
ldg1
# find /devices/virtual-devices@100 -type c -name network@2\*
/devices/virtual-devices@100/channel-devices@200/network@2:vnet1
#
The network interface name is the part of the entry after the colon; that is, vnet1.
Plumb vnet1 to see that it has the MAC address 00:14:4f:f8:dd:68 as shown in the ldm list -l output for net-c in Step 1.
# ifconfig vnet1
vnet1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:14:4f:f8:dd:68
#
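The lookup in this procedure can also be scripted. The following Python sketch is a minimal illustration under stated assumptions: the function names interface_for_device and normalize_mac are hypothetical, and the input is the list of paths printed by the find command. It extracts the interface name after the colon and compares MAC addresses after normalizing them, because ifconfig omits leading zeros (0:14:4f:...) whereas ldm list -l prints two digits per octet.

```python
def normalize_mac(mac):
    """Return a MAC address with two lowercase hex digits per octet."""
    return ":".join("%02x" % int(octet, 16) for octet in mac.split(":"))


def interface_for_device(device, device_paths):
    """Return the interface name for a device instance such as 'network@2'.

    device_paths is assumed to be the output lines of the find command
    run under /devices on the guest domain.
    """
    for path in device_paths:
        leaf = path.rsplit("/", 1)[-1]           # for example 'network@2:vnet1'
        node, sep, name = leaf.partition(":")
        if node == device and sep and name:
            return name                          # the part after the colon
    return None


paths = ["/devices/virtual-devices@100/channel-devices@200/network@2:vnet1"]
print(interface_for_device("network@2", paths))  # vnet1
print(normalize_mac("0:14:4f:f8:dd:68") == normalize_mac("00:14:4f:f8:dd:68"))  # True
```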
You must have enough media access control (MAC) addresses to assign to the number of logical domains, virtual switches, and virtual networks you are going to use. You can have the Logical Domains Manager automatically assign MAC addresses to a logical domain, a virtual network (vnet), and a virtual switch (vswitch), or you can manually assign MAC addresses from your own pool of assigned MAC addresses. The ldm subcommands that set MAC addresses are add-domain, add-vsw, set-vsw, add-vnet, and set-vnet. If you do not specify a MAC address in these subcommands, the Logical Domains Manager assigns one automatically.
The advantage to having the Logical Domains Manager assign the MAC addresses is that it utilizes the block of MAC addresses dedicated for use with logical domains. Also, the Logical Domains Manager detects and prevents MAC address collisions with other Logical Domains Manager instances on the same subnet. This frees you from having to manually manage your pool of MAC addresses.
MAC address assignment happens as soon as a logical domain is created or a network device is configured into a domain. In addition, the assignment is persistent until the device, or the logical domain itself, is removed.
The following topics are addressed in this section:
Logical domains have been assigned the following block of 512K MAC addresses:
00:14:4F:F8:00:00 ~ 00:14:4F:FF:FF:FF
The lower 256K addresses are used by the Logical Domains Manager for automatic MAC address allocation, and you cannot manually request an address in this range:
00:14:4F:F8:00:00 - 00:14:4F:FB:FF:FF
You can use the upper half of this range for manual MAC address allocation:
00:14:4F:FC:00:00 - 00:14:4F:FF:FF:FF
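To check which half of the block a given address falls in, you can compare it numerically against the block boundaries. The following Python sketch is illustrative only; the function names are assumptions made for this example, and the boundary between the two halves is computed from the block limits stated above rather than taken from the Logical Domains Manager.

```python
BLOCK_START = 0x00144FF80000   # 00:14:4F:F8:00:00
BLOCK_END = 0x00144FFFFFFF     # 00:14:4F:FF:FF:FF
HALF = (BLOCK_END - BLOCK_START + 1) // 2   # 256K addresses in each half


def mac_to_int(mac):
    """Convert a colon-separated MAC address to an integer."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("not a MAC address: %r" % mac)
    return int("".join(octet.zfill(2) for octet in octets), 16)


def classify_mac(mac):
    """Return 'automatic', 'manual', or 'outside' for a MAC address."""
    value = mac_to_int(mac)
    if BLOCK_START <= value < BLOCK_START + HALF:
        return "automatic"   # lower half, reserved for automatic allocation
    if BLOCK_START + HALF <= value <= BLOCK_END:
        return "manual"      # upper half, available for manual assignment
    return "outside"         # not part of the logical domains block


print(classify_mac("00:14:4f:f8:dd:68"))  # automatic
print(classify_mac("00:14:4F:FC:00:01"))  # manual
print(classify_mac("8:0:20:ab:27:12"))    # outside
```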
When you do not specify a MAC address when creating a logical domain or a network device, the Logical Domains Manager automatically allocates and assigns a MAC address to that logical domain or network device. To obtain this MAC address, the Logical Domains Manager iteratively attempts to select an address and then checks for potential collisions.
Before selecting a potential address, the Logical Domains Manager first looks to see if it has a recently freed, automatically assigned address saved in a database for this purpose (see Freed MAC Addresses). If so, the Logical Domains Manager selects its candidate address from the database.
If no recently freed addresses are available, the MAC address is randomly selected from the 256K range of addresses set aside for this purpose. The MAC address is selected randomly to lessen the chance of a duplicate MAC address being selected as a candidate.
The address selected is then checked against other Logical Domains Managers on other systems to prevent duplicate MAC addresses from actually being assigned. The algorithm employed is described in Duplicate MAC Address Detection. If the address is already assigned, the Logical Domains Manager iterates, choosing another address, and again checking for collisions. This continues until a MAC address is found that is not already allocated, or a time limit of 30 seconds has elapsed. If the time limit is reached, then the creation of the device fails, and an error message similar to the following is shown:
Automatic MAC allocation failed. Please set the vnet MAC address manually. |
To prevent the same MAC address from being allocated to different devices, one Logical Domains Manager checks with other Logical Domains Managers on other systems by sending a multicast message over the control domain's default network interface. The message includes the address that the Logical Domains Manager wants to assign to the device. The Logical Domains Manager attempting to assign the MAC address waits one second for a response. If a different device on another LDoms-enabled system has already been assigned that MAC address, the Logical Domains Manager on that system sends back a response containing the MAC address in question. If the requesting Logical Domains Manager receives a response, it knows the chosen MAC address has already been allocated, chooses another, and iterates.
By default, these multicast messages are sent only to other managers on the same subnet; the default time-to-live (TTL) is 1. The TTL can be configured using the Service Management Facility (SMF) property ldmd/hops.
Each Logical Domains Manager is responsible for:
If the Logical Domains Manager on a system is shut down for any reason, duplicate MAC addresses could occur while the Logical Domains Manager is down.
Automatic MAC allocation occurs at the time the logical domain or network device is created and persists until the device or the logical domain is removed.
When a logical domain or a device associated with an automatic MAC address is removed, that MAC address is saved in a database of recently freed MAC addresses for possible later use on that system. These MAC addresses are saved to prevent the exhaustion of Internet Protocol (IP) addresses from a Dynamic Host Configuration Protocol (DHCP) server. When DHCP servers allocate IP addresses, they do so for a period of time (the lease time). The lease duration is often configured to be quite long, generally hours or days. If network devices are created and removed at a high rate without the Logical Domains Manager reusing automatically allocated MAC addresses, the number of MAC addresses allocated could soon overwhelm a typically configured DHCP server.
When a Logical Domains Manager is requested to automatically obtain a MAC address for a logical domain or network device, it first looks in the freed MAC address database to see if there is a previously assigned MAC address it can reuse. If a MAC address is available from this database, the duplicate MAC address detection algorithm is run. If the MAC address has not been assigned elsewhere since it was freed, it is reused and removed from the database. If a collision is detected, the address is simply removed from the database. The Logical Domains Manager then either tries the next address in the database or, if none is available, randomly picks a new MAC address.
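The selection loop described in this section can be summarized in code. The following Python sketch is a simplified model and not the Logical Domains Manager implementation: the function allocate_mac and its parameters are hypothetical, and the is_in_use callback stands in for the multicast duplicate-detection query, which the sketch does not perform.

```python
import random
import time

AUTO_START = 0x00144FF80000   # 00:14:4F:F8:00:00
AUTO_END = 0x00144FFBFFFF     # 00:14:4F:FB:FF:FF


def format_mac(value):
    """Format an integer as a colon-separated MAC address."""
    return ":".join("%02x" % ((value >> shift) & 0xFF) for shift in range(40, -1, -8))


def allocate_mac(freed, is_in_use, time_limit=30.0):
    """Pick a MAC address, preferring recently freed ones.

    freed      -- list of recently freed addresses (integers); candidates are
                  removed from it whether or not they turn out to be usable
    is_in_use  -- hypothetical stand-in for duplicate MAC address detection
    time_limit -- seconds to keep trying before giving up
    """
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        if freed:
            candidate = freed.pop(0)                          # reuse a freed address first
        else:
            candidate = random.randint(AUTO_START, AUTO_END)  # otherwise pick at random
        if not is_in_use(candidate):
            return candidate
    raise RuntimeError("Automatic MAC allocation failed. "
                       "Please set the vnet MAC address manually.")


freed_addresses = [0x00144FF80001, 0x00144FF80002]
taken = {0x00144FF80001}                     # pretend another system owns this one
chosen = allocate_mac(freed_addresses, lambda mac: mac in taken)
print(format_mac(chosen))                    # 00:14:4f:f8:00:02
```

In this model a freed address that collides is dropped from the list and the next one is tried, which mirrors the behavior described for the freed MAC address database.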
The following procedure tells you how to create a manual MAC address.
Convert the subnet portion of the IP address of the physical host into hexadecimal format and save the result.
# grep $hostname /etc/hosts | awk '{print $1}' | awk -F. '{printf("%x",$4)}'
27
Determine the number of domains present excluding the control domain.
# /opt/SUNWldm/bin/ldm list-domain
NAME     STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
primary  active  -n-cv  SP    4     768M    0.3%  4h 54m
myldom1  active  -n---  5000  2     512M    1.9%  1h 12m
There is one guest domain, and you need to include the domain you want to create, so the domain count is 2.
Append the converted IP address (27) to the vendor string (0x08020ab) followed by 10 plus the number of logical domains (2 in this example), which equals 12.
0x08020ab and 27 and 12 = 0x08020ab2712 or 8:0:20:ab:27:12 |
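The arithmetic in this procedure can be expressed as a short function. The following Python sketch is illustrative only; the function name manual_mac and the host address 192.168.1.39 are assumptions made for this example (39 is 27 in hexadecimal, matching the value shown in the procedure).

```python
VENDOR_PREFIX = "8:0:20:ab"   # the vendor string 0x08020ab from the procedure


def manual_mac(host_ip, domain_count):
    """Build a manual MAC address as described in the procedure.

    host_ip      -- IPv4 address of the physical host (hypothetical input)
    domain_count -- existing guest domains plus the one being created
    """
    octets = host_ip.split(".")
    if len(octets) != 4:
        raise ValueError("not an IPv4 address: %r" % host_ip)
    host_part = "%x" % int(octets[3])    # last octet in hexadecimal, as awk prints it
    suffix = 10 + domain_count           # written as digits, for example 12
    if not 10 <= suffix <= 99:
        raise ValueError("domain count out of range for a two-digit suffix")
    return "%s:%s:%d" % (VENDOR_PREFIX, host_part, suffix)


print(manual_mac("192.168.1.39", 2))  # 8:0:20:ab:27:12
```

The sketch limits the suffix to two digits so that the result remains a six-octet address; the procedure itself does not state what to do beyond that range.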
The Solaris Fault Management Architecture (FMA) reports CPU errors in terms of physical CPU numbers and memory errors in terms of physical memory addresses.
If you want to determine within which logical domain an error occurred and the corresponding virtual CPU number or real memory address within the domain, then you must perform a mapping.
The domain and the virtual CPU number within the domain, which correspond to a given physical CPU number, can be determined with the following procedures.
The domain and the real memory address within the domain, which correspond to a given physical memory address (PA), can be determined as follows.
Generate a long parseable list for all domains.
primary$ ldm ls -l -p |
Look for the line in the list's MEMORY sections where the PA falls within the inclusive range pa to (pa + size - 1); that is, pa <= PA <= (pa + size - 1).
Here pa and size refer to the values in the corresponding fields of the line.
Suppose you have a logical domain configuration as shown in EXAMPLE 5-16, and you want to determine the domain and the virtual CPU corresponding to physical CPU number 5, and the domain and the real address corresponding to physical address 0x7e816000.
Looking through the VCPU entries in the list for the one with the pid field equal to 5, you can find the following entry under logical domain ldg1:
|vid=1|pid=5|util=29|strand=100
Hence, the physical CPU number 5 is in domain ldg1 and within the domain it has virtual CPU number 1.
Looking through the MEMORY entries in the list, you can find the following entry under domain ldg2:
ra=0x8000000|pa=0x78000000|size=1073741824 |
Here 0x78000000 <= 0x7e816000 <= (0x78000000 + 1073741824 - 1); that is, pa <= PA <= (pa + size - 1). Hence, the PA is in domain ldg2 and the corresponding real address is 0x8000000 + (0x7e816000 - 0x78000000) = 0xe816000.
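The two lookups in this example can be scripted against the parseable output. The following Python sketch is a minimal illustration; the function names are hypothetical, and the sketch assumes you have already collected the VCPU and MEMORY lines for each domain as (domain, line) pairs. Only the two entries quoted in this example are used as input.

```python
def parse_fields(entry):
    """Split a '|key=value|key=value' line into a dict of strings."""
    fields = entry.strip().strip("|").split("|")
    return dict(field.split("=", 1) for field in fields)


def find_vcpu(pid, vcpu_entries):
    """Return (domain, vid) for a physical CPU number, or None."""
    for domain, entry in vcpu_entries:
        fields = parse_fields(entry)
        if int(fields["pid"], 0) == pid:
            return domain, int(fields["vid"], 0)
    return None


def find_real_address(pa, memory_entries):
    """Return (domain, real address) for a physical address, or None."""
    for domain, entry in memory_entries:
        fields = parse_fields(entry)
        base = int(fields["pa"], 0)
        size = int(fields["size"], 0)
        if base <= pa <= base + size - 1:            # inclusive range check
            return domain, int(fields["ra"], 0) + (pa - base)
    return None


vcpus = [("ldg1", "|vid=1|pid=5|util=29|strand=100")]
memory = [("ldg2", "ra=0x8000000|pa=0x78000000|size=1073741824")]

print(find_vcpu(5, vcpus))                       # ('ldg1', 1)
domain, ra = find_real_address(0x7e816000, memory)
print(domain, hex(ra))                           # ldg2 0xe816000
```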
The PCI Express (PCI-E) bus on a Sun UltraSPARC T1-based server consists of two ports with various leaf devices attached to them. These are identified on a server with the names pci@780 (bus_a) and pci@7c0 (bus_b). In a multidomain environment, the PCI-E bus can be programmed to assign each leaf to a separate domain using the Logical Domains Manager. Thus, you can enable more than one domain with direct access to physical devices instead of using I/O virtualization.
When the Logical Domains system is powered on, the control (primary) domain uses all the physical device resources, so the primary domain owns both the PCI-E bus leaves.
The example shown here is for a Sun Fire T2000 server. This procedure can also be used on other Sun UltraSPARC T1-based servers, such as a Sun Fire T1000 server and a Netra T2000 server. The instructions for different servers might vary slightly from these, but you can obtain the basic principles from the example. Mainly, you need to retain the leaf that has the boot disk, remove the other leaf from the primary domain, and assign it to another domain.
Verify that the primary domain owns both leaves of the PCI Express bus.
primary# ldm list-bindings primary
...
IO
    DEVICE   PSEUDONYM  OPTIONS
    pci@780  bus_a
    pci@7c0  bus_b
...
Determine the device path of the boot disk, which needs to be retained.
primary# df /
/                  (/dev/dsk/c1t0d0s0 ): 1309384 blocks   457028 files
Determine the physical device to which the block device c1t0d0s0 is linked.
primary# ls -l /dev/dsk/c1t0d0s0
lrwxrwxrwx 1 root root 65 Feb 2 17:19 /dev/dsk/c1t0d0s0 -> ../../devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a
In this example, the physical device for the boot disk for domain primary is under the leaf pci@7c0, which corresponds to our earlier listing of bus_b. This means that we can assign bus_a (pci@780) of the PCI-Express bus to another domain.
Check /etc/path_to_inst to find the physical path of the onboard network ports.
primary# grep e1000g /etc/path_to_inst
Remove the leaf that does not contain the boot disk (pci@780 in this example) from the primary domain.
primary# ldm remove-io pci@780 primary |
Add this split PCI configuration (split-cfg in this example) to the system controller.
primary# ldm add-config split-cfg |
This configuration (split-cfg) is also set as the next configuration to be used after the reboot.
Note - Currently, there is a limit of 8 configurations that can be saved on the SC, not including the factory-default configuration. |
Reboot the primary domain so that the change takes effect.
primary# shutdown -i6 -g0 -y |
Add the leaf (pci@780 in this example) to the domain (ldg1 in this example) that needs direct access.
primary# ldm add-io pci@780 ldg1
Notice: the LDom Manager is running in configuration mode. Any configuration changes made will only take effect after the machine configuration is downloaded to the system controller and the host is reset.
If you have an Infiniband card, you might need to enable the bypass mode on the pci@780 bus. See Enabling the I/O MMU Bypass Mode on a PCI Bus for information on whether you need to enable the bypass mode.
Reboot domain ldg1 so that the change takes effect.
All domains must be inactive for this reboot. If you are configuring this domain for the first time, the domain will be inactive.
ldg1# shutdown -i6 -g0 -y |
Confirm that the correct leaf is still assigned to the primary domain and the correct leaf is assigned to domain ldg1.
primary# ldm list-bindings primary
NAME     STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
primary  active  -n-cv  SP    4     4G      0.4%  18h 25m
...
IO
    DEVICE   PSEUDONYM  OPTIONS
    pci@7c0  bus_b
...
----------------------------------------------------------------
NAME  STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
ldg1  active  -n---  5000  4     2G      10%   35m
...
IO
    DEVICE   PSEUDONYM  OPTIONS
    pci@780  bus_a
...
This output confirms that the PCI-E leaf bus_b and the devices below it are assigned to domain primary, and bus_a and its devices are assigned to ldg1.
If you have an Infiniband Host Channel Adapter (HCA) card, you might need to turn the I/O memory management unit (MMU) bypass mode on. By default, Logical Domains software controls PCI-E transactions so that a given I/O device or PCI-E option can only access the physical memory assigned within the I/O domain. Any attempt to access memory of another guest domain is prevented by the I/O MMU. This provides a higher level of security between the I/O domain and all other domains. In the rare case where a PCI-E or PCI-X option card does not load or operate with the I/O MMU bypass mode off, you can turn the bypass mode on. If you do so, however, memory accesses from the I/O domain are no longer protected by hardware.
The bypass=on option turns on the I/O MMU bypass mode. This bypass mode should be enabled only if the respective I/O domain and I/O devices within that I/O domain are trusted by all guest domains. This example turns on the bypass mode.
primary# ldm add-io bypass=on pci@780 ldg1 |
The virtual network terminal server daemon, vntsd(1M), enables you to provide access for multiple domain consoles using a single TCP port. At the time of domain creation, the Logical Domains Manager assigns a unique TCP port to each console by creating a new default group for that domain’s console. The TCP port is then assigned to the console group as opposed to the console itself. The console can be bound to an existing group using the set-vcons subcommand.
Bind the consoles for the domains into one group.
The following example shows binding the console for three different domains (ldg1, ldg2, and ldg3) to the same console group (group1).
primary# ldm set-vcons group=group1 service=primary-vcc0 ldg1
primary# ldm set-vcons group=group1 service=primary-vcc0 ldg2
primary# ldm set-vcons group=group1 service=primary-vcc0 ldg3
Connect to the associated TCP port (localhost at port 5000 in this example).
# telnet localhost 5000
primary-vnts-group1: h, l, c{id}, n{name}, q:
List the domains within the group by selecting l (list).
primary-vnts-group1: h, l, c{id}, n{name}, q: l
DOMAIN ID    DOMAIN NAME    DOMAIN STATE
0            ldg1           online
1            ldg2           online
2            ldg3           online
You can move a logical domain that is not running from one server to another. The move is easier if you set up the same domain on both servers beforehand. In fact, you do not have to move the domain itself; you only have to stop and unbind the domain on one server and then bind and start the domain on the other server.
Create a domain with the same name on two servers; for example, create domainA1 on serverA and serverB.
Add a virtual disk server device and a virtual disk to both servers. The virtual disk server opens the underlying device for export as part of the bind.
Bind the domain only on one server; for example, serverA. Leave the domain inactive on the other server.
This section describes how to remove all guest domains and revert to a single OS instance that controls the whole server.
List all the logical domain configurations on the system controller.
primary# ldm ls-config |
Remove all configurations (config_name) previously saved to the system controller (SC). Use the following command for each such configuration.
primary# ldm rm-config config_name |
After you remove all the configurations previously saved to the SC, the factory-default configuration is the next one used when the control domain (primary) is rebooted.
Stop all guest domains using the -a option.
primary# ldm stop-domain -a |
List all domains to see all the resources attached to guest domains.
primary# ldm ls |
Release all the resources attached to guest domains. To do this, use the ldm unbind-domain command for each guest domain (ldom) configured in your system.
Note - You might not be able to unbind an I/O domain in a split-PCI configuration if it is providing services required by the control domain. In this situation, skip this step. |
primary# ldm unbind-domain ldom |
primary# shutdown -i1 -g0 -y |
Power-cycle the system controller so that the factory-default configuration is reloaded.
sc> poweroff
sc> poweron
This section describes the changes in behavior in using the Solaris OS that occur once a configuration created by the Logical Domains Manager is instantiated; that is, domaining is enabled.
Note - Any discussion about whether domaining is enabled pertains only to Sun UltraSPARC T1–based platforms. Otherwise, domaining is always enabled. |
If domaining is enabled, the OpenBoot firmware is not available after the Solaris OS has started, because it is removed from memory.
To reach the ok prompt from the Solaris OS, you must halt the domain. You can use the Solaris OS halt command to halt the domain.
Whenever performing any maintenance on a system running LDoms software that requires power-cycling the server, you must save your current logical domain configurations to the SC first.
The OpenBoot power-off command does not power down a system. To power down a system while in OpenBoot firmware, use your system controller's or system processor's poweroff command. The OpenBoot power-off command displays the following message:
NOTICE: power-off command is not supported, use appropriate
NOTICE: command on System Controller to turn power off.
If domaining is not enabled, the Solaris OS normally goes to the OpenBoot prompt after a break is issued. The behavior described in this section is seen in two situations:
You press the L1-A key sequence when the input device is set to keyboard.
You enter the send break command when the virtual console is at the telnet prompt.
If domaining is enabled, you receive the following prompt after these types of breaks.
c)ontinue, s)ync, r)eboot, h)alt? |
Type the letter that represents what you want the system to do after these types of breaks.
The following table shows the expected behavior of halting or rebooting the control (primary) domain.
Note - The question in TABLE 5-1 regarding whether domaining is enabled pertains only to the Sun UltraSPARC T1 processors. Otherwise, domaining is always enabled. |
The Solaris OS format(1M) command does not work in a guest domain with virtual disks:
Some subcommands, such as label, verify, or inquiry fail with virtual disks.
The format(1M) command crashes when you select a virtual disk that has an extensible firmware interface (EFI) disk label.
When running the format(1M) command in a guest domain, all virtual disks are seen as unformatted, even when they are correctly formatted and have a valid disk label.
For getting or setting the volume table of contents (VTOC) of a virtual disk, use the prtvtoc(1M) command and fmthard(1M) command instead of the format(1M) command. You also can use the format(1M) command from the service domain on the real disks.
This section describes what you need to be aware of when using Advanced Lights Out Manager (ALOM) chip multithreading (CMT) with the Logical Domains Manager. For more information about using the ALOM CMT software, refer to the Advanced Lights Out Management (ALOM) CMT v1.3 Guide.
An additional option is available to the existing ALOM CMT command.
bootmode [normal|reset_nvram|bootscript=string|config="config-name"]
The config="config-name" option enables you to set the configuration used on the next power on to another configuration, including the factory-default shipping configuration.
You can invoke the command whether the host is powered on or off. It takes effect on the next host reset or power on.