Writing Device Drivers

Debugging

13

This chapter describes how to debug a device driver. This includes how to set up a tip(1) connection to the test machine, how to prepare for a crash, how to use existing memory drivers, and some hints for coding the device driver. It also introduces the system debugging tools that are available, and gives hints on how to test the device driver.

Note - The information presented in this chapter is specific to the release of the operating system, and is subject to change.

Machine Configuration

Setting Up a tip(1) Connection

The serial ports on one system (the host system) can be used to connect to a driver debugging and test machine using tip(1). This allows a window on the host system, called a tip window, to be used as the console of the test machine. See tip(1) for additional information.

Note - A second machine is not required to debug a SunOS device driver. It is only required for the use of tip(1).

Using a tip window is very helpful:
  • It lets the window system assist in interactions with the boot PROM or kadb. For example, the window can keep a log of the session, which is very handy if the driver crashes the test system.
  • It allows the test machine to be remote. It is reached by logging into a host machine (often called a tip host) and using tip(1) to connect to the test machine.

Setting Up the Host System

A simple setup for connecting serial port A on the host (running Solaris 2.x) to serial port A on the test machine (a SPARC system with an Open Boot PROM) is:
  1. Connect the host system to the test machine using either serial port; the example in this section uses port A. This connection must be made with a null modem cable, which connects the signal Receive to Transmit, and Ground to Ground. This cable can be constructed by the developer, or a null modem adapter can be found at electronics stores.

  2. On the host system, make an entry in /etc/remote for the connection if it is not already there (see remote(4)). The terminal entry must match the serial port being used. Solaris 2.x comes with the correct entry for serial port B, but one must be added for serial port A:


  debug:\  
       :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:  

  3. In a shell window on the host, run tip(1) and specify the name of the entry:


  test% tip debug  
  connected  

The shell window is now a tip window directed to the test machine.

Setting Up the Test System

A quick way to set up the test machine is to unplug its keyboard before turning it on. It then automatically uses serial port A as the console. Another way to do this is to use boot PROM commands to make serial port A the console:
  1. On the test machine, enter the boot PROM (ok prompt). Direct I/O to the serial line, indicating the correct serial port. In this example, the test machine is using serial port A, so the command is ttya io. Pressing Return in the tip window should get a boot PROM prompt.


Caution - Do not use L1-A on the host machine to send a break to stop the test machine. This actually stops the host machine. To send a break to the test machine, type ~# in the tip window. Tilde commands such as this are recognized only if they are the first characters on a line, so press the Return key or Control-U first if there is no effect.

  2. To make the test machine always come up with serial port A as the console, set the environment variables input-device and output-device:


  ok setenv input-device ttya  
  ok setenv output-device ttya  

On x86 platforms, the test machine needs to set console = 1 in /etc/system. This causes a switch to COM1 during reboot.

Preparing for the Worst

It is possible that the driver will render the system unbootable; this is most likely if the driver is for the boot device. If a complete system reinstallation is to be avoided, some advance work must be done to prepare for this possibility.

Boot Off a Backup Root Partition

One way to deal with this is to have another bootable root file system. Use format(1M) to make a partition the exact size of the original, then (from SunOS) use dd(1M) to copy it. Do this from single-user mode so that there is as little file system activity as possible, and run fsck(1M) on the new file system to ensure its integrity.
Later, if the system cannot boot from the original root partition, boot the backup partition and use dd(1M) to copy the backup partition onto the original one. If the system will not boot but the root file system is undamaged (just the boot block or boot program was destroyed), boot off the backup partition with the ask (-a) option, then specify the original filesystem as the root filesystem.

Boot Off the Network

If the system is attached to a network, the test machine can be added as a client of a server. If a problem occurs, the system can be booted off the network. The local disks can then be mounted and fixed.

Critical System Files

There are a number of driver-related system files that are difficult, if not impossible, to reconstruct. Files such as /etc/name_to_major could be corrupted if the driver crashes the system during installation (see add_drv(1M)).
To be safe, once the test machine is in the proper configuration, make a backup copy of the root file system.

Recreating /devices and /dev

If the /devices or /dev directories are damaged (most likely if the driver crashes during attach(9E)), they may be recreated in the following way. Boot the system from somewhere else (another disk, an installation CD, or the network), and run fsck(1M) to repair the damaged root filesystem. Then, mount the root filesystem and recreate /devices by running drvconfig(1M) and specifying the devices directory on the mounted disk. The /dev directory can be repaired by running devlinks(1M), disks(1M), tapes(1M), and ports(1M) on the dev directory of the mounted disk.
For example, if the damaged disk is /dev/dsk/c0t3d0s0, and an alternate boot disk is /dev/dsk/c0t1d0s0, do the following:
ok boot disk1
...
Rebooting with command: disk1
Boot device: /sbus/esp@0,800000/sd@1,0   File and args:
SunOS Release 5.4 Version Generic [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1994, Sun Microsystems, Inc.
...
# fsck /dev/dsk/c0t3d0s0
** /dev/dsk/c0t3d0s0
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1478 files, 9922 used, 29261 free(141 frags, 3640 blocks, 0.4% fragmentation)
# mount /dev/dsk/c0t3d0s0 /mnt
# drvconfig -r /mnt/devices
# devlinks -r /mnt
# disks -r /mnt
# tapes -r /mnt
# ports -r /mnt


Caution - Fixing /devices and /dev may allow the system to boot, but other parts of the system may still be corrupted. This may only be a temporary fix to allow saving of information (such as system core dumps) before reinstalling the system.

Booting an Alternate Kernel

A kernel other than /kernel/unix can be booted by specifying it as the boot file. In fact, backup copies of all the system drivers in /kernel can be made and used in the event the originals fail (this is probably more useful if more than one driver is being debugged). For example:

  # cp -r /kernel /kernel.orig  

To boot the original system, boot /kernel.orig/unix. By default, the first module directory in the module directory path is the one the kernel resides in. By booting /kernel.orig/unix, the module directory path becomes "/kernel.orig /usr/kernel".

  ok boot disk1 /kernel.orig/unix  
  ...  
  Rebooting with command: disk1 /kernel.orig/unix  
  Boot device: /sbus/esp@0,800000/sd@1,0   File and args:/kernel.orig/unix  
  SunOS Release 5.4 Version Generic [UNIX(R) System V Release 4.0]  
  Copyright (c) 1983-1994, Sun Microsystems, Inc.  
  ...  

For more complete control, boot with the ask (-a) option; this allows alternate files to be specified (such as /etc/system.orig if that is the original "clean" system file).

  ok boot disk1 -a  
  ...  
  Rebooting with command: disk1 -a  
  Boot device: /sbus/esp@0,800000/sd@1,0   File and args: -a  
  Enter filename [/kernel/unix]: /kernel.orig/unix  
  SunOS Release 5.4 Version Generic [UNIX(R) System V Release 4.0]  
  Copyright (c) 1983-1994, Sun Microsystems, Inc.  
  Name of system file [etc/system]:        etc/system.orig  
  Name of default directory for modules [/kernel.orig /usr/kernel]: <CR>  
  root filesystem type [ufs]: <CR>  
  Enter physical name of root device  
  [/sbus@1,f8000000/esp@0,800000/sd@1,0:a]: <CR>  

Coding Hints

During development, debugging the driver should be a constant consideration. Since the driver is operating much closer to the hardware, without the protection of the operating system, debugging kernel code is harder than debugging user-level code. A stray pointer access can crash the entire system. This section provides some information that may be used to make the driver easier to debug.

Process Layout

SunOS 5.x operating system processes follow the definition given in the System V Application Binary Interface, SPARC Processor Supplement (also known as the ABI). A standard process looks similar to this:

[Figure: standard process layout]

The ABI specifies the system portion of a process' virtual address space is in the high end, and may occupy no more than 512 megabytes. In other words, all kernel addresses will be 0xE0000000 or higher. Some implementations may use less kernel space, and so begin at a higher address. This fact can be used when debugging: if pointers point below the address 0xE0000000, they probably are user addresses.

System Support

The system provides a number of routines that can aid in debugging; they are documented in Section 9F of the man Pages(9F): DDI and DKI Kernel Functions.

cmn_err( )

cmn_err(9F) is the function to use to print messages to the console from the kernel. See cmn_err(9F) and "Printing Messages" on page 55 for more information on its use.

Note - Though printf() and uprintf() currently exist, they should not be used if the driver is to be Solaris DDI-compliant.

An example from the probe(9E) routine might be to print a message if the device is not found. Normally, probe(9E) routines should not print messages if the device is not there.
if (ddi_pokec(dip, &regp->csr, ENABLE_INTERRUPTS) != DDI_SUCCESS) {
     cmn_err(CE_NOTE, "%s not found.", ddi_get_name(dip));
     return (DDI_PROBE_FAILURE);
}

A handy format for printing device register bits is %b. See cmn_err(9F) for information on how to use it.

ASSERT( )

void ASSERT(int expression)

ASSERT(9F) can be used to assure that a condition is true at some point in the program. It is a macro, and what it does depends on whether or not the symbol DEBUG is defined (from <sys/debug.h>). If DEBUG is not defined, the macro expands to nothing and the expression is not evaluated. If DEBUG is defined, the expression is evaluated, and if the value is zero a message is printed and the system panics.
For example, if at a point in the driver a pointer should be non-NULL--and if it is not, something is seriously wrong--the following assertion could be used:
    ASSERT(ptr != NULL);

If compiled with DEBUG defined, and the assertion fails, a panic occurs:
panic: assertion failed: ptr != NULL, file: driver.c, line: 56


Note - Because ASSERT(9F) uses DEBUG, it is suggested that any conditional debugging code should also be based on DEBUG, rather than with a driver symbol (such as MYDEBUG). Otherwise, for ASSERT(9F) to function properly, DEBUG must be defined whenever MYDEBUG is defined.

Assertions are an extremely valuable form of active documentation.

mutex_owned( )

int mutex_owned(kmutex_t *mp);

A significant portion of driver development involves properly handling multiple threads. Comments should always be used when a mutex is acquired, and are even more useful when an apparently necessary mutex is not acquired. To determine if a mutex is held by a thread, use mutex_owned(9F) within ASSERT(9F):
void helper(void)
{
    /* this routine should always be called with the mu mutex held */
    ASSERT(mutex_owned(&xsp->mu));
    ...
}

Future releases of Solaris may only support the use of mutex_owned(9F) within ASSERT(9F) by not defining mutex_owned(9F) unless the preprocessor symbol DEBUG is defined.

Conditional Compilation and Variables

There are two common ways to place debugging code in a driver: conditionally compiling code based on a preprocessor symbol such as DEBUG, or using a global variable. Conditional compilation has the advantage that unnecessary code can be removed in the production driver. Using a variable allows the amount of debugging output to be chosen at run time, such as by setting a debugging level at run time with an I/O control or through a debugger. Commonly, these two methods are combined.
The following example relies on the compiler to remove unreachable code (the code following the always-false test of zero), and also provides a local variable that can be set in /etc/system or patched by a debugger.

  #ifdef DEBUG  
  /* comments on values of xxdebug and what they do */  
  static int xxdebug;  
  #define dcmn_err if (xxdebug) cmn_err  
  #else  
  #define dcmn_err if (0) cmn_err  
  #endif  
  ...  
       dcmn_err(CE_NOTE, "Error!\n");  

This method handles the fact that cmn_err(9F) has a variable number of arguments. Another method relies on the macro having one argument, a parenthesized argument list for cmn_err(9F), which the macro removes. It also removes the reliance on the optimizer by expanding the macro to nothing if DEBUG is not defined.

  #ifdef DEBUG  
  /* comments on values of xxdebug and what they do */  
  static int xxdebug;  
  #define dcmn_err(X) if (xxdebug) cmn_err X  
  #else  
  #define dcmn_err(X) /* nothing */  
  #endif  
       ...  
       dcmn_err((CE_NOTE, "Error!"));  

This can be extended in many ways, such as by having different messages from cmn_err(9F) depending on the value of xxdebug, but be careful not to obscure the code with too much debugging information.
Another common scheme is to write an xxlog() function, and have it use vsprintf(9F) or vcmn_err(9F) to handle variable argument lists.

The Optimizer and volatile

The volatile keyword must be used when declaring any variable that will reference a device register, or the optimizer may optimize important accesses away. This is very important since not using volatile can result in bugs that are very difficult to track down. See "volatile" on page 69 for more information.

Using Existing Drivers

Using existing drivers with a user program is a good way to see if the kernel sees the device. This allows the device to be debugged without the need for the device-specific driver, and separates device debugging from driver debugging. Depending on the driver and device, mmap(2), or read(2) and write(2) (possibly with lseek(2)) may be used.
The "mem" and "kmem" drivers access physical memory and kernel memory, respectively. /dev/mem is used to provide a core memory image to debuggers such as adb(1).
To examine kernel memory from a user program, consider using the libkvm routines (see kvm_open(3K)) for a (slightly) portable way. Be aware that kernel structures change frequently, so any code that examines the kernel is likely to need changes in future releases or on other platforms.
For devices, there are two bus space drivers, "vmemem" and "sbusmem". These allow access to devices on the bus without a driver. After verifying that the device is accessible through the PROM (see "The PROM on SPARC Machines" on page 26), these drivers can verify that it is accessible by SunOS. However, special handling requiring knowledge of the device (such as interrupt handling and DMA) cannot be performed by these drivers. Be careful not to compromise system security (such as by giving non-root users access to the special files for these drivers), or system integrity (by accessing other devices).
This program opens the "sbusmem" driver for the slot the bwtwo is in and performs the same operations that were done previously with the PROM (see "Reading and Writing" on page 33).
Code Example 13-1 Accessing bwtwo with the "sbusmem" driver
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
void
fill(char *base, int count, char val)
{
    register int i;

    for(i=0; i < count; i++)
        *base++ = val;
    sleep(2);
}
int
main(int argc, char *argv[])
{
    int     fd;
    caddr_t base;
    size_t fb_len = 0x20000;
    off_t   fb_offset = 0x800000;
    fd = open("/devices/sbus@1,f8000000/sbusmem@3,0:slot3",
        O_RDWR);
    if (fd == -1) {
        perror("open /devices/sbus@1,f8000000/sbusmem@3,0:slot3");
        return (1);
    }
    base = mmap(NULL, fb_len, PROT_READ | PROT_WRITE, MAP_SHARED,
        fd, fb_offset);
    if (base == (caddr_t)-1) {
        perror("mmap of SBus slot 3");
        return (1);
    }
    close(fd);
    fill(base, 0x20000, 0xff);
    fill(base, 0x20000, 0);
    fill(base, 0x18000, 0x55);
    fill(base, 0x15000, 0x3);
    fill(base, 0x10000, 0x5);
    fill(base, 0x5000, 0xf9);
    return (0);
}

Debugging Tools

This section describes some programs and files that can be used to debug the driver at run time.

/etc/system

The /etc/system file is read once while the kernel is booting. It is used to set various kernel options. After modifying this file, the system must be rebooted for the changes to take effect. If a change in the file causes the system not to work, boot with the ask (-a) option and specify /dev/null as the system file.
The path the kernel uses when looking for modules can be set by changing the moddir variable in the system file. If the driver module is in a working area, such as /home/driver, the following example adds that directory to the module path:
moddir: /kernel /usr/kernel /home/driver


Caution - Do not allow non-root users to write to the module directory.

The set command is used to set integer variables. To set module variables, the module name must also be specified:
set module:variable=value

For example, to set the variable xxdebug in the driver "xx", use the following set command:
set xx:xxdebug=1

To set a kernel integer variable, omit the module name. Other assignments are also supported, such as bitwise ORing a value into an existing value:
set moddebug | 0x80000000

See system(4) for more information.

Note - Most kernel variables are not guaranteed to be present in subsequent releases.

moddebug is a bit field that controls the module loading process. See <sys/modctl.h> for all its possible values. Here are a few useful ones:
0x80000000 - Print messages to the console when loading/unloading modules.
0x40000000 - Give more detailed error messages.
0x20000000 - Print more detail when loading/unloading (such as including the address and size).
0x00001000 - No autounloading drivers: the system will not attempt to unload the device driver when the system resources become low.
0x00000080 - No autounloading streams: the system will not attempt to unload the streams module when the system resources become low.

modload and modunload

Since the kernel automatically loads needed modules, and unloads unused ones, these two commands are now obsolete. However, they can be used for debugging.
modload(1M) can be used to force a module into memory. The kernel may unload it subsequently, but modload(1M) may be used to ensure that the driver has no unresolved references when loaded.
modunload(1M) can be used to unload a module, given a module ID (which can be determined with modinfo(1M)). Unloading a module does not necessarily remove it from memory. To unload all unloadable modules and forcibly remove them from memory (so that they will be reloaded from the actual object file), use module ID zero:

  # modunload -i 0  


Note - modload(1M) and modunload(1M) may be removed in a future release.

Saving System Core Dumps

When the system panics, it writes the interesting portions of memory to the dump device (which is usually the swap device). This is a system core dump, similar to the core dumps generated by applications.
To save a core dump, there must be enough space in the swap area to contain it. To be safe, the primary swap area should be at least the size of main memory (all the information is in main memory, though not all of it is dumped).
savecore(1M) is used to copy the system's core image to a file. Normally, the system does not examine the swap area for core dumps when it boots. This must be enabled in /etc/init.d/sysetup. Change the lines that read:

  ##  
  ## Default is to not do a savecore  
  ##  
  #if [ ! -d /var/crash/`uname -n` ]  
  #then mkdir -p /var/crash/`uname -n`  
  #fi  
  #                echo 'checking for crash dump...\c '  
  #savecore /var/crash/`uname -n`  
  #                echo ''  

To:

  ##  
  ## Default is to not do a savecore  
  ##  
  if [ ! -d /var/crash/`uname -n` ]  
  then mkdir -p /var/crash/`uname -n`  
  fi  
                  echo 'checking for crash dump...\c '  
  savecore /var/crash/`uname -n`  
                  echo ''  

When savecore(1M) runs, it makes a copy of the kernel that was running (called unix.n) and dumps a core file (called vmcore.n) in the specified directory, normally /var/crash/machine_name. There must be enough space in /var/crash to contain the core dump or it will be truncated. The file will appear larger than it actually is, since it contains holes, so avoid copying it. adb(1) can then be used on the core dump and the saved kernel.

Note - savecore(1M) can be prevented from filling the file system if there is a file called minfree in the directory in which the dump will be saved. This file contains a number of kilobytes to remain free after savecore(1M) has run. However, if not enough space is available, the core file is not saved.

adb and kadb

adb(1) can be used to debug applications or the kernel, though it cannot debug the kernel interactively (such as by setting breakpoints). To interactively debug the kernel, use kadb(1M). Both adb(1) and kadb(1M) share a common command set.
adb(1) is a very terse debugger. It does not normally prompt for input (though kadb(1M) does).

Starting adb

The command for starting adb to debug a kernel core dump is:

  % adb -k /var/crash/hostname/unix.n /var/crash/hostname/vmcore.n  

To start adb on a live system, use (as root):

  # adb -k /dev/ksyms /dev/mem  

/dev/ksyms is a special driver that provides an image of the kernel's symbol table. This can be used to examine the debugging information (traces) the driver has left in the memory.
When adb(1) responds with 'physmem XXX', it is ready for a command.

Starting kadb

The system must be booted under kadb(1M) before kadb(1M) can be used. From the Open Boot PROM, use:

  ok boot kadb  
  ...  
  Boot device: /sbus/esp@0,800000/sd@3,0   File and args: kadb  
  kadb: /kernel/unix  
  Size: 673348+182896+46008 bytes  
  /kernel/unix loaded - 0x125000 bytes used  
  SunOS Release 5.4 Version Generic [UNIX(R) System V Release 4.0]  
  Copyright (c) 1983-1994, Sun Microsystems, Inc.  
  ...  

By default, kadb(1M) boots (and debugs) /kernel/unix. It can be passed a file name as an argument to boot a different kernel, or -d can be passed to have kadb(1M) prompt for the kernel name. This flag also causes kadb(1M) to provide a prompt after it has loaded the kernel, so breakpoints can be set.

  ok boot kadb -d  
  ...  
  Boot device: /sbus/esp@0,800000/sd@3,0   File and args: kadb -d  
  kadb: /kernel/unix  
  kadb: /kernel/unix  
  Size: 673348+182896+46008 bytes  
  /kernel/unix loaded - 0x125000 bytes used  
  kadb[0]:  

At this point you can set breakpoints or continue with the :c command.

Note - kadb(1M) passes on any kernel flags to the booted kernel. For example, the flags -r, -s and -a can be passed to /kernel/unix with the command boot kadb -ras.

Once the system is booted, sending a break passes control to kadb(1M). A break is generated with L1-A, or by ~# if the console is connected through a tip window.

  ...  
  The system is ready.  
  
  test console login: ~stopped at 0xfbd01028: ta  0x7d  
  kadb[0]:  

The number in brackets is the CPU that kadb(1M) is currently executing on; the remaining CPUs are halted. The CPU number is zero on a uniprocessor. Halting the system will also drop to a kadb(1M) prompt.

Warning - Before rebooting or shutting off the power, always halt the system cleanly (with init 0 or shutdown). Buffers may not be flushed otherwise. If the shutdown must occur from the boot PROM, make sure to flush buffers with sync (on the OBP), or 'g 0' (on SunMon).

To continue back to SunOS, use :c.

  kadb[0]: :c  
  
  test console login:  

Exiting

To exit either adb(1) or kadb(1M), use $q.

  kadb[0]: $q  
  Type 'go' to resume  
  ok  

kadb(1M) can be continued by typing go (on the OBP) or c (on SunMon).

Warning - No other commands can be performed from the PROM if the system is to be continued. PROM commands other than go (or c on SunMon) may change system state that SunOS depends on.

Staying at the kadb(1M) prompt for too long may cause the system to lose track of the time of day, and cause network connections to time out.

Commands

The general form of an adb(1)/kadb(1M) command is:
[ address ] [ ,count ] command [;]

If address is omitted, the current location is used ('.' also stands for the current location). The address can be a kernel symbol. If count is omitted, it defaults to 1.
Commands to adb consist of a verb followed by a modifier or list of modifiers. Verbs can be:
?                Print locations starting at address in the executable.
/                Print locations starting at address in the core file.
=                Print the value of address itself.
>                Assign a value to a variable or register.
<                Read a value from a variable or register.
RETURN           Repeat the previous command with a count of 1. Increment '.'.
With ?, /, and =, output format specifiers can be used. Lowercase letters normally print 2 bytes, uppercase letters print 4 bytes:
o, O             Octal
d, D             Decimal
x, X             Hexadecimal
u, U             Unsigned decimal
f, F             4, 8 byte floating point
c                Print the addressed character
C                Print the addressed character using ^ escape notation.
s                Print the addressed string.
S                Print the addressed string using ^ escape notation.
i                Print as machine instructions (disassemble)
a                Print the value of '.' in symbolic form.
w, W             Write a 2/4 byte value

Note - Understand exactly what sizes the objects are, and what effects changing them might have, before making any changes.

For example, to set a bit in the moddebug variable when debugging the driver, first examine the value of moddebug, then OR in the desired bit.

  kadb[0]: moddebug/X  
  moddebug:  
  moddebug:    0  
  kadb[0]: moddebug/W 0x80000000  
  moddebug:    0x0 = 0x80000000  

Routines can be disassembled with the 'i' command. This is useful when tracing crashes, since the only information may be the program counter at the time of the crash. The output has been formatted for readability:

  kadb[0]: strcmp,4?i  
  strcmp:  
  strcmp: ba       strcmp + 0x20  
           ldsb    [%o1], %o5  
           add     %o0, 0x1, %o0  
           orcc    %g0, %o5, %g0  

To show the addresses also, specify symbolic notation with the 'a' command:

  kadb[0]: strcmp,4?ai  
  strcmp: strcmp: ba        strcmp + 0x20  
  strcmp+4:        ldsb     [%o1], %o5  
  strcmp+8:        add      %o0, 0x1, %o0  
  strcmp+0xc:      orcc     %g0, %o5, %g0  

Register Identifiers

Machine or kadb(1M) internal registers are identified with the '<' command, followed by the register of interest. The following register names are recognized:
.                "dot," the current location
i0-7             input registers to current function
o0-7             output registers for current function
l0-7             local registers
g0-7             global registers
psr              Processor Status Register
tbr              Trap Base Register
wim              Window Invalid Mask
g7 always contains the current thread pointer. For more information on how these registers are normally used, see The SPARC Architecture Manual, Version 8, and the System V Application Binary Interface, SPARC Processor Supplement.
The following command displays the PSR as a 4-byte hexadecimal value:

  kadb[0]: <psr=X  
                  400cc3  

The individual bits of the PSR are defined in <sys/psw.h>. More information is available in The SPARC Architecture Manual, Version 8 and The SPARC Assembly Language Reference Manual.

Display and Control Commands

The following commands display and control the status of adb(1)/kadb(1M):
$b               Display all breakpoints
$c               Display stack trace
$d               Change default radix to value of dot
$q               Quit
$r               Display registers
$M               Display built-in macros
'$c' is very useful with crash dumps: it shows the call trace and arguments at the time of the crash. It is also useful in kadb(1M) when a breakpoint is reached, but is usually not useful if kadb(1M) is entered at a random time. The number of arguments to print can be passed following the '$c' ('$c 2' for two arguments).

Breakpoints

In kadb(1M), breakpoints can be set, which will automatically drop back into kadb when reached. The standard form of a breakpoint command is:
addr [, count]:b [command]

addr is the address at which the program will be stopped and the debugger will receive control, count is the number of times that the breakpoint address occurs before stopping, and command is almost any adb(1) command. Other breakpoint commands are:
:c               continue execution
:d               delete breakpoint
:s               single step
:e               single step, but step over function calls
:u               stop after return to caller of current function
:z               delete all breakpoints
Here is an example of setting a breakpoint in a commonly used routine, scsi_transport(9F). Upon reaching the breakpoint, '$c' is used to get a stack trace. The top of stack is the first function printed. Note that kadb(1M) does not know how many arguments were passed to the function; it always prints six.

  kadb[0]: scsi_transport:b  
  kadb[0]: :c  
  
  test console login: root  
  Password:  
  breakpoint scsi_transport: save %sp, -0x60, %sp  
  kadb[0]: $c  
  scsi_transport(0xff09c400,0x3,0x3,0x1,0xff09c534,0xff0a0690)  
  sdstrategy(0xff0a0690,0x3,0xff09c440,0x170,0xff09c534,0xff09c400) + 3d8  
  bwrite(0xff0a0690,0xff1ed400,0x1,0xb,0xff0a06f0,0xf017d9d8) + bc  
  sbupdate(0xf00dcfe8,0xff20f400,0xff0a0690,0x400,0x0,0xff03b000)+ 9c  
  ufs_update(0xff034c80,0x3,0xff034cac,0xff2ac764,0xff03b000,0xf00dfe8) + 198  
  ufs_sync(0x0,0x10000,0xff007980,0x1e,0xf00d8e38,0xf00d8e38) + 8  
  fsflush(0xf00dd330,0xf00e0830,0x1af,0x2,0xf00f3ab8,0xf00b8254) + 568  
  kadb[0]: :s  
  stopped at scsi_transport+4: ld [%i0 + 0x4], %o0  
  kadb[0]: $b  
  breakpoints  
  count   bkpt              command  
  1       scsi_transport  
  kadb[0]: scsi_transport:d  
  kadb[0]: :c  

Conditional Breakpoints

Breakpoints can also be set to occur only if a certain condition is met. When a command is provided, the breakpoint is taken only if the count is reached or the command returns zero. For example, a breakpoint that occurs only on certain I/O controls could be set in the driver's ioctl(9E) routine. Here is an example of breaking in the sdioctl() routine only if the DKIOCGVTOC (get volume table of contents) I/O control occurs.

  kadb[0]: sdioctl+4,0:b <i1-0x40B  
  kadb[0]: $b  
  breakpoints  
  count   bkpt          command  
  0       sdioctl+4     <i1-0x40B  
  kadb[0]: :c  

Adding four to sdioctl skips to the second instruction in the routine, bypassing the save instruction that establishes the stack. The '<i1' refers to the first input register, which is the second parameter to the routine (the cmd argument of ioctl(9E)). The count of zero is impossible to reach, so it stops only when the command returns zero, which is when 'i1 - 0x40B' is zero. This means i1 contains 0x40B (the value of the ioctl command, determined by examining the header file).
To force the breakpoint to be reached, the prtvtoc(1M) command is used. It is known to issue this I/O control:

  # prtvtoc /dev/rdsk/c0t3d0s0  
  breakpoint sdioctl+4: mov %i0, %o0  
  kadb[0]: $c  
  sdioctl(0x800018,0x40b,0xeffffc24,0x1,0xff22fa80,0xf01e9918) + 4  
  ioctl(0xf01e9e90,0xf01e9918,0x1,0x40b,0xff2ab380,0xff0894b4) + 1ec  
  syscall(0xf00c1c54) + 4d4  
  .syscall(0x3) +8c  
  ?(?) + 7fffffff  
  Syssize(0x3,0xeffffc24,0xeffffd6c,0x5403148,0x0,0x5452ea0) + 20338  
  Syssize(0x3,0xefffff7c,0xeffffc24,0x80,0x3,0x0)+ fb70  
  Syssize(0xefffff7c,0x2000,0x1,0x1,0x1,0x3) + f51c  
  Syssize(0x2,0xeffffee4,0xeffffef0,0x22c00,0x0,0xffffffff) + eb8c  

kadb(1M) cannot always determine where the bottom of the stack is. In the above example, the calls to Syssize and ?(?) are not part of the stack.

Macros

adb(1) and kadb(1M) support macros. adb(1) macros are in /usr/kvm/lib/adb, while kadb(1M)'s macros are built-in and can be displayed with $M. Most of the existing macros are for private kernel structures. New macros for adb can be created with adbgen(1).
Macros are used in the form:
[ address ] $<macroname

threadlist is a useful macro that displays the stacks of all the threads in the system. This macro does not take an address, and it can generate a lot of output, so be ready to use Control-S and Control-Q to stop and restart the output if necessary (this is another good reason to use a tip window). Control-C can be used to abort the listing:

  kadb[0]: $<threadlist  
           thread_id f0141ee0  
  ?() + 1e  
  data address not found  
           thread_id f0165ee0  
  ?(0xf00e24e0,0xf00e24e0,0xff004000,0xc,0x0,0x4000e0) + 1e  
  callout_thread(0xff004090,0xf00d7d9a,0xf00e24e0,0xf00ac6c0,0x0,0xf004000) + 2c  
           thread_id f016bee0  
  ?(0xf00e04d0,0xf00e04d0,0x80b5f7ff,0x1,0x0,0x4000e0) + 1e  
  background(0x0,0x0,0x0,0xf00e23ac,0x0,0xf00e04d0) + 64  
           thread_id f016eee0  
  ?(0xf00dd884,0xf00dd884,0x80b777ff,0x1,0x0,0x4000e0) + 1e  
  freebs(0x0,0x0,0xf00e2388,0xff21d270,0xf00e24a0,0xf00dd884) + 2c  
  ^C  

Another useful macro is thread. Given a thread ID, it prints the corresponding thread structure. This can be used to look at a certain thread found with the threadlist macro, to look at the owner of a mutex, or to look at the current thread.

  kadb[0]: <g7$<thread  
  0xf0141ee0:  
           link    stk           stksize  
           0       f0141ee0      ee0  
  0xf0141eec:  
           affinity     affcnt bind_cpu  
           1            1        -1  
  ...  


Note - There is no type information kept in the kernel, so using a macro on an inappropriate object will result in garbage output.

Macros do not necessarily output all the fields of the structures, nor is the output necessarily in the order given in the structure definition. Occasionally, memory may need to be dumped for certain structures, and then matched with the structure definition in the kernel header files.
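When a raw memory dump has to be matched against a structure definition, it can help to print the byte offset of each field first. The following is a minimal sketch of that idea. The structure sketch_state is hypothetical and stands in for whatever structure is being examined, and the offsets are only meaningful if the program is compiled with the same compiler and ABI as the kernel being debugged.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical structure, standing in for one taken from a kernel header. */
struct sketch_state {
	uint32_t flags;
	uint32_t count;
	uint32_t bufsize;
	uint32_t error;
};

/* Print one line per field: byte offset within the structure, then name. */
#define SHOW_OFFSET(type, field) \
	printf("+0x%02lx  %s\n", (unsigned long)offsetof(type, field), #field)

static void
print_sketch_layout(void)
{
	SHOW_OFFSET(struct sketch_state, flags);
	SHOW_OFFSET(struct sketch_state, count);
	SHOW_OFFSET(struct sketch_state, bufsize);
	SHOW_OFFSET(struct sketch_state, error);
}
```

Each word of a dump taken with a command such as address/4X can then be assigned to a field by adding the printed offset to the starting address.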

Warning - The driver should have knowledge only of headers and structures listed in Section 9S of the man Pages(9S): DDI and DKI Data Structures, even though interesting knowledge may be uncovered while debugging.

Example: adb on a Core Dump

During the development of the example ramdisk driver, the system crashes with a data fault when running mkfs(1M).

  test# mkfs -F ufs -o nsect=8,ntrack=8,free=5 /devices/pseudo/ramdisk:0,raw 1024  
  BAD TRAP  
  mkfs: Data fault  
  kernel read fault at addr=0x4, pme=0x0  
  Sync Error Reg 80<INVALID>  
  pid=280, pc=0xff2f88b0, sp=0xf01fe750, psr=0xc0, context=2  
  g1-g7: ffffff98, 8000000, ffffff80, 0, f01fe9d8, 1, ff1d4900  
  Begin traceback... sp = f01fe750  
  Called from f0098050,fp=f01fe7b8,args=1180000 f01fe878 ff1ed280 ff1ed280 2 ff2f8884  
  Called from f0097d94,fp=f01fe818,args=ff24fd40 f01fe878 f01fe918 0 0 ff2c9504  
  Called from f0024e8c,fp=f01fe8b0,args=f01fee90 f01fe918 2 f01fe8a4 f01fee90 3241c  
  Called from f0005a28,fp=f01fe930,args=f00c1c54 f01fe98c 1 f00b9d58 0 3  
  Called from 15c9c,fp=effffca0,args=5 3241c 200 0 0 7fe00  
  End traceback...  
  panic: Data fault  

savecore(1M) was not enabled. After enabling it (see "Saving System Core Dumps" on page 247), the system is rebooted. The crash is then recreated by running mkfs(1M) again. When the system comes up, it saves the kernel and the core file, which can then be examined with adb(1):

  # cd /var/crash/test  
  # ls  
  bounds    unix.0    vmcore.0  
  # adb -k unix.0 vmcore.0  
  physmem ac0  

The first step is to examine the stack to determine where the system was when it crashed:

  $c  
  complete_panic(0x0,0x1,0xf00b6c00,0x7d0,0xf00b6c00,0xe3) + 114  
  do_panic(0xf00be7ac,0xf0269750,0x4,0xb,0xb,0xf00b6c00) + 1c  
  die(0x9,0xf0269704,0x4,0x80,0x1,0xf00be7ac) + 5c  
  trap(0x9,0xf0269704,0x4,0x80,0x1,0xf02699d8) + 6b4  

This stack trace is not very helpful initially, since the ramdisk routines are not on the stack trace. However, there is a useful bit of information: the call to trap(). The first argument to trap() is the trap type--in this case 9--which is a T_DATA_FAULT trap (from <sys/trap.h>). See The SPARC Architecture, Version 8 manual for more information.
The second argument to trap() is a pointer to a regs structure containing the state of the registers at the time of the trap.

  0xf0269704$<regs  
  0xf0269704: psr       pc           npc  
                c0      ff2dd8b0      ff2dd8b4  
  0xf0269710: y             g1            g2           g3  
                e0000000     ffffff98     8000000      ffffff80  
  0xf0269720: g4            g5            g6           g7  
                0            f02699d8     1            ff22c800  
  0xf0269730: o0            o1            o2           o3  
                f02697a0     ff080000     19000        ef709000  
  0xf0269740: o4            o5            o6           o7  
                8000         0            f0269750     7fffffff  

Note that the PC was ff2dd8b0 when the trap occurred. The next step is to determine which routine that is in:

  ff2dd8b0/i  
  rd_write+0x2c: ld [%o2 + 0x4], %o3  

That PC corresponds to rd_write(), which is a routine in the ramdisk driver. The bug is in the ramdisk write routine, and occurs during an ld (load) instruction. This load instruction is dereferencing the value of o2+4, so the next step is to determine the value of o2.

Note - Using the $r command to examine the registers is inappropriate because the registers have been reused in the trap routine. Instead, examine the value of o2 from the regs structure.

o2 has the value 19000 in the regs structure. Valid kernel addresses are constrained to be above 0xE0000000 by the ABI, so this address is probably a user one. The ramdisk does not deal with user addresses though, so this is something the ramdisk write routine should not be dereferencing.
The next step is to determine where this instruction falls within the routine as a whole, so that the assembly language can be matched to the C code. To do this, the routine is disassembled up to the problem instruction, which occurs 0x2c bytes into the routine. Each instruction is 4 bytes in size, so 0x2c/4, or 0xb, additional instructions should be displayed after the first, for a count of 0xc:

  rd_write,c/i  
  rd_write:  
  rd_write:    sethi    %hi(0xfffffc00), %g1  
                add     %g1, 0x398, %g1 ! ffffff98  
                save    %sp, %g1, %sp  
                st      %i0, [%fp + 0x44]  
                st      %i1, [%fp + 0x48]  
                st      %i2, [%fp + 0x4c]  
                ld      [%fp + 0x44], %o0  
                call    getminor  
                nop  
                st      %o0, [%fp - 0x4]  
                ld      [%fp - 0x8], %o2  
                ld      [%o2 + 0x4], %o3  
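The count given to the 'rd_write,c/i' command follows from simple arithmetic on the fault offset. The sketch below spells it out; insns_to_display() is a hypothetical helper written for this example, and it relies only on the fact that SPARC instructions are 4 bytes long.

```c
#include <stdint.h>

#define SPARC_INSN_BYTES 4u   /* every SPARC V8 instruction is 4 bytes */

/*
 * Given the byte offset of the faulting instruction within a routine,
 * return how many instructions to disassemble so that the listing ends
 * on the faulting instruction itself.
 */
static uint32_t
insns_to_display(uint32_t fault_offset)
{
	return (fault_offset / SPARC_INSN_BYTES + 1);
}
```

For the offset 0x2c reported as rd_write+0x2c, the result is 0xc: the first instruction of the routine plus the 0xb that follow it.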

The crash occurs a few instructions after a call to getminor(9F). In the ramdisk.c source file, these lines stand out in rd_write():
int instance = getminor(dev);
rd_devstate_t *rsp;
if (uiop->uio_offset >= rsp->ramsize)
    return (EINVAL);

Notice that rsp is never initialized. This is the problem. It is fixed by including the correct call to ddi_get_soft_state(9F) (since the ramdisk driver uses the soft state routines to do state management):
int instance = getminor(dev);
rd_devstate_t *rsp = ddi_get_soft_state(rd_state, instance);
if (uiop->uio_offset >= rsp->ramsize)
    return (EINVAL);
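ddi_get_soft_state(9F) returns NULL when no state has been allocated for the requested instance, so a more defensive variant of the fix would test the pointer before using it. The following user-level sketch shows only the shape of that check. Everything prefixed with sketch_ is a hypothetical stand-in for the kernel interfaces, and returning ENXIO for a missing instance is a common convention rather than something taken from the ramdisk source.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's soft state; not the DDI itself. */
typedef struct rd_devstate {
	long ramsize;            /* size of the ramdisk in bytes */
} rd_devstate_t;

#define SKETCH_MAX_INSTANCES 4
static rd_devstate_t *sketch_states[SKETCH_MAX_INSTANCES];

/* Mimics ddi_get_soft_state(9F): NULL when the instance has no state. */
static rd_devstate_t *
sketch_get_soft_state(int instance)
{
	if (instance < 0 || instance >= SKETCH_MAX_INSTANCES)
		return (NULL);
	return (sketch_states[instance]);
}

/* The range check from rd_write(), with the pointer validated first. */
static int
sketch_rd_write_check(int instance, long offset)
{
	rd_devstate_t *rsp = sketch_get_soft_state(instance);

	if (rsp == NULL)
		return (ENXIO);      /* no such device instance */
	if (offset >= rsp->ramsize)
		return (EINVAL);
	return (0);
}
```

With this check in place, an uninitialized or missing state pointer produces an error return instead of a data fault.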


Note - Most data fault panics are bad pointer references.

Example: kadb on a Deadlocked Thread

The next problem is that the system does not panic, but the mkfs(1M) command hangs, and cannot be aborted. Though a core dump can be forced--by sending a break and then using sync from the OBP or using 'g 0' from SunMon--in this case kadb(1M) will be used. After logging in remotely and using ps (which indicated that only the mkfs(1M) process was hung, not the entire system) the system is shut down and booted using kadb(1M).

  ok boot kadb -d  
  Boot device: /sbus/esp@0,800000/sd@3,0   File and args: kadb -d  
  kadb:/kernel/unix  
  Size: 673348+182896+46008 bytes  
  /kernel/unix loaded - 0x125000 bytes used  
  kadb[0]::c  
  SunOS Release 5.4 Version Generic [UNIX(R) System V Release 4.0]  
  Copyright (c) 1983-1994, Sun Microsystems, Inc.  
  ...  

After the rest of the kernel has loaded, moddebug is patched to see if loading is the problem. Since the driver got as far as rd_write() before, loading is probably not the problem, but it is checked anyway.

  # ~stopped at 0xfbd01028: ta 0x7d  
  kadb[0]: moddebug/X  
  moddebug:  
  moddebug: 0  
  kadb[0]: moddebug/W 0x80000000  
  moddebug: 0x0 = 0x80000000  
  kadb[0]: :c  

modload(1M) is used to load the driver, to separate module loading from the real access:

  # modload /home/driver/drv/ramdisk  
  load '/usr/kernel/drv/ramdisk' id 61 loaded @ 0xff335000 size 3304  
  installing ramdisk, module id 61.  

It loads fine, so loading is not the problem. The condition is recreated with mkfs(1M):

  # mkfs -F ufs -o nsect=8,ntrack=8,free=5 /devices/pseudo/ramdisk@0:c,raw 1024  
  ramdisk0: misusing 524288 bytes of memory  

It hangs. At this point, kadb(1M) is entered and the stack examined:

  ~stopped at 0xfbd01028: ta 0x7d  
  kadb[0]: $c  
  _end() + bc1eb40  
  debug_enter(0xfbd01000,0xff1a7054,0x0,0x0,0xb,0xff1a7000) + 88  
  zs_high_intr(0xff1a0230) + 19c  
  _level1(0xf0141ee0) + 404  
  idle(0x0,0x0,0x0,0xf0171ee0,0x0,0x1) + 28  

It does not look like the current thread is the problem, so the entire thread list is checked for hung threads:

  kadb[0]: $<threadlist  
           thread_id f0141ee0  
  ?(0xfbd01000,0xff1a7054,0x0,0x0,0xb,0xff1a7000)+ 1e  
  zs_high_intr(0xff1a0230) + 19c  
  _level1(0xf0141ee0) + 404  
  idle(0x0,0x0,0x0,0xf0171ee0,0x0,0x1) + 28  
           thread_id f0165ee0  
  ?(?) + 1e  
  cv_wait(0xf00e24e0,0xf00e24e0,0xff004000,0xb,0x0,0x4000e4)  
  callout_thread(0xff004090,0xf00d7d9a,0xf00e24e0,0xf00ac6c0,0x0,0xff004000) + 2c  
           thread_id f016bee0  
  ...  
           thread_id ff11c600  
  ?(?) + 1e  
  biowait(0xf01886d0,0x0,0x7fe00,0x200,0xf00e085c,0x3241c)  
  physio(0xff196120,0xf01886d0,0xf01888a4,0x3241c,0x0,0xf0188878)+ 338  
  rd_write(0x1180000,0xf0188878,0xff19b680,0xff19b680,0x2,0xff335884) + 8c  
  rdwr(0xff1505c0,0xf0188878,0xf0188918,0x0,0x0,0xff24dd04) + 138  
  rw(0xf0188e90,0xf0188918,0x2,0xf01888a4,0xf0188e90,0x3241c) + 11c  
  syscall(0xf00c1c54) + 4d4  

Of all the threads, only one has a stack trace that references the ramdisk driver. It happens to be the last one. It seems that the process running mkfs(1M) is blocked in biowait(9F), after a call to physio(9F). biowait(9F) takes a pointer to a buf(9S) structure as a parameter, so the next step is to examine that buf(9S) structure:

  kadb[0]: f01886d0$<buf  
  0xf01886d0: flags  
                129  
  0xf01886d4: forw          back          av_forw      av_back  
                ff24dd04     72616d64     69736b3a     302c7261  
  0xf01886e8: count         bufsize       error        edev  
                512          770          0            1180000  
  0xf01886ec: addr          blkno         resid        proc  
                3241c        3ff          0            ff26f000  
  0xf0188714: iodone        vp            pages  
                0            f01888a4     efffff68  

The resid field is 0, which indicates that the transfer is complete. physio(9F) is still blocked, however. The physio(9F) manual page points out that biodone(9F) must be called to unblock biowait(9F). This is the problem: rd_strategy() did not call biodone(9F). Adding a call to biodone(9F) before returning fixes this problem.
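The relationship between the strategy routine and the waiter can be sketched in user-level C. This is not the ramdisk driver's code: struct sketch_buf, sketch_biodone(), and sketch_rd_strategy() are hypothetical, heavily reduced stand-ins for buf(9S), biodone(9F), and the driver's strategy(9E) routine, and the flag value 0x0002 is an assumption that should be checked against <sys/buf.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for buf(9S) and biodone(9F). */
#define SKETCH_B_DONE 0x0002        /* assumed flag value; see <sys/buf.h> */

struct sketch_buf {
	int     b_flags;
	size_t  b_bcount;       /* bytes requested */
	size_t  b_resid;        /* bytes not transferred */
	char   *b_addr;         /* caller's data */
	size_t  b_offset;       /* byte offset into the ramdisk */
};

static char sketch_ram[4096];       /* the "ramdisk" */

/* Marks the request finished; the real biodone(9F) also wakes biowait(9F). */
static void
sketch_biodone(struct sketch_buf *bp)
{
	bp->b_flags |= SKETCH_B_DONE;
}

/* A waiter such as biowait(9F) can return only once the request is done. */
static int
sketch_io_finished(const struct sketch_buf *bp)
{
	return ((bp->b_flags & SKETCH_B_DONE) != 0);
}

/* Write path of a strategy routine: copy, set resid, signal completion. */
static int
sketch_rd_strategy(struct sketch_buf *bp)
{
	if (bp->b_offset > sizeof (sketch_ram) ||
	    bp->b_bcount > sizeof (sketch_ram) - bp->b_offset) {
		/* Out of range: nothing moved. A real driver also sets an error. */
		bp->b_resid = bp->b_bcount;
	} else {
		memcpy(sketch_ram + bp->b_offset, bp->b_addr, bp->b_bcount);
		bp->b_resid = 0;
	}
	sketch_biodone(bp);     /* the call that was missing */
	return (0);
}
```

Without the final call, the data is copied and resid is set to 0, exactly as seen in the buf dump, yet nothing ever marks the request as done, so the waiter never returns.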

Testing

Once a device driver is functional, it should be thoroughly tested before it is distributed. In addition to the testing done to traditional UNIX device drivers, Solaris 2.x drivers require testing of Solaris 2.x features such as dynamic loading and unloading of drivers and multithreading.

Configuration Testing

A driver's ability to handle multiple configurations is very important, and is a part of the test process. Once the driver is working on a simple, or default, configuration, additional configurations should be tested. Depending on the device, this may be accomplished by changing jumpers or DIP switches. If the number of possible configurations is small, all of them should be tried. If the number is large, various classes of possible configurations should be defined, and a sampling of configurations from each class should be tested. The designation of such classes depends on how the different configuration parameters might interact, which in turn depends on the device and on how the driver was written.
For each configuration, the basic functions must be tested, which include loading, opening, reading, writing, closing, and unloading the driver. Any function that depends on the configuration deserves special attention. For example, changing the base memory address of device registers is not likely to affect the behavior of most driver functions; if the driver works well with one address, it is likely to work as well with a different address, provided the configuration code allows it to work at all. On the other hand, a special I/O control call may have different effects depending on the particular device configuration.
Loading the driver with varying configurations assures that the probe(9E) and attach(9E) entry points can find the device at different addresses. For basic functional testing, using regular UNIX commands such as cat(1) or dd(1M) is usually sufficient for character devices. Mounting or booting may be required for block devices.

Functionality Testing

After a driver has been run through configuration testing, all of its functionality should be thoroughly tested. This requires exercising the operation of all of the driver's entry points. In addition to the basic functional tests done in configuration testing, full functionality testing requires testing the rest of the entry points and functions to obtain confidence that the driver can correctly perform all of its functions.
Many drivers will require custom applications to test functionality, but basic drivers for devices such as disks, tapes, or asynchronous boards can be tested using standard system utilities. All entry points should be tested in this process, including mmap(9E), poll(9E) and ioctl(9E), if applicable. The ioctl(9E) tests may be quite different for each driver, and for nonstandard devices a custom testing application will be required.

Error Handling

A driver may perform correctly in an ideal environment, but fail to handle cases where a device encounters an error or an application specifies erroneous operations or sends bad data to the driver. Therefore, an important part of driver testing is the testing of its error handling.
All of a driver's possible error conditions should be exercised, including error conditions for actual hardware malfunctions. Some hardware error conditions may be difficult to induce, but an effort should be made to cause them or to simulate them if possible. It should always be assumed that all of these conditions will be encountered in the field. Cables should be removed or loosened, boards should be removed, and erroneous user application code should be written to test those error paths.

Stress, Performance, and Interoperability Testing

To help ensure that the driver performs well, it should be subjected to vigorous stress testing. Running single threads through a driver will not test any of the locking logic and might not test condition variable waits. Device operations should be performed by multiple processes at once in order to cause several threads to execute the same code simultaneously. The way this should be done depends on the driver; some drivers will require special testing applications, but starting several UNIX commands in the background will be suitable for others. It depends on where the particular driver uses locks and condition variables. Testing a driver on a multiprocessor machine is more likely to expose problems than testing on a single processor machine.
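For drivers that can be exercised with plain opens and writes, a small test program can start several processes against the same device at once. The following is a minimal sketch under that assumption. sketch_worker() and sketch_stress() are hypothetical names, the 512-byte block size is arbitrary, and a real test would choose operations that reach the driver's locks and condition variables.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child body: repeatedly open the device, write a block, and close it. */
static int
sketch_worker(const char *path, int iterations)
{
	char block[512] = { 0 };
	int i;

	for (i = 0; i < iterations; i++) {
		int fd = open(path, O_WRONLY);

		if (fd < 0)
			return (1);
		if (write(fd, block, sizeof (block)) !=
		    (ssize_t)sizeof (block)) {
			(void) close(fd);
			return (1);
		}
		if (close(fd) != 0)
			return (1);
	}
	return (0);
}

/*
 * Start nprocs concurrent workers against path and return the number
 * that failed or could not be started.
 */
static int
sketch_stress(const char *path, int nprocs, int iterations)
{
	int started = 0, failed = 0, i, status;

	for (i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(sketch_worker(path, iterations));
		if (pid > 0)
			started++;
		else
			failed++;
	}
	for (i = 0; i < started; i++) {
		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed++;
	}
	return (failed);
}
```

A return value of 0 means every worker completed all of its iterations. Running the program under time(1) gives a rough performance figure for the same workload.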
Interoperability between drivers must also be tested, particularly because different devices can share interrupt levels. If possible, configure another device at the same interrupt level as the one being tested, and verify that the driver correctly claims its own interrupts and otherwise operates correctly under the stress tests described above. Stress tests should be run on both devices at once. Even if the devices do not share an interrupt level, this test can still be valuable; for example, if serial communication devices start to experience errors while a network driver is being tested, that could indicate that the network driver is causing the rest of the system to encounter interrupt latency problems.
Performance of a driver under these stress tests should be measured using UNIX performance measuring tools. This can be as simple as using the time(1) command along with commands used for stress tests.

DDI/DKI Compliance Testing

To assure compatibility with later releases and reliable support for the current release, every driver should be Solaris 2.4 DDI/DKI compliant. One way to determine if the driver is compliant is by inspection. The driver can be visually inspected to ensure that only kernel routines and data structures specified in sections 9F and 9S of the Solaris 2.4 Reference Manual AnswerBook are used.
In addition, the Solaris 2.4 Driver Developer Kit (DDK) now includes a DDI compliance tool (DDICT) that checks device driver C source code for non-DDI/DKI compliance and issues either error or warning messages when it finds non-compliant code. SunSoft recommends that all drivers be written to pass DDICT. After the DDK has been installed, the DDICT can be found in:
/opt/SUNWddk/driver_dev/bin/ddict

A new manual page describing DDICT is available in:
/opt/SUNWddk/driver_dev/ddict/man/man1/ddict.1

Installation and Packaging Testing

Drivers are delivered to customers in packages. A package can be added and removed from the system using a standard, documented mechanism (see the SunOS 5.4 Application Packaging and Installation Guide). Test that the driver has been correctly packaged to ensure that the end user will be able to add it to and remove it from a system.
In testing, the package should be installed and removed from every type of media on which it will be released, and on several system configurations. Packages must not make unwarranted assumptions about the directory environment of the target system. Certain valid assumptions may be made about where standard kernel files are kept, however. It is a good idea to test the adding and removing of packages on newly-installed machines that have not been modified for a development environment. It is a common packaging error for a package to use a tool or file that exists only in a development environment, or only on the driver writer's own development system. For example, no tools from the Source Compatibility package, SUNWscpu, should be used in driver installation programs.
The driver installation must be tested on a minimal Solaris system without any of the optional packages installed.

Testing Specific Types of Drivers

Since each type of device is different, it is difficult to describe how to test them all specifically. This section provides some information about how to test certain types of standard devices.

Tape Drivers

Tape drivers should be tested by performing several archive and restore operations. The cpio(1) and tar(1) commands may be used for this purpose. The dd(1M) command can be used to write an entire disk partition to tape, which can then be read back and written to another partition of the same size, and the two copies compared. The mt(1) command will exercise most of the I/O controls that are specific to tape drivers (see mtio(7)); all of the options should be attempted. The error handling of tape drivers can be tested by attempting various operations with the tape removed, attempting writes with the write protect on, and removing power during operations. Tape drivers typically implement exclusive-access open(9E) calls, which should be tested by having a second process try to open the device while a first process already has it open.
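The exclusive-access check can be sketched as a small C function. This is only an illustration: sketch_exclusive_open_check() is a hypothetical name, it assumes the driver reports a refused open with EBUSY, and for brevity it performs both opens from one process, whereas a stricter test would issue the second open from a separate process as described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if a second open of path is refused with EBUSY while the
 * first is still open, 0 if the second open succeeds, and -1 if the
 * first open fails or the second fails for some other reason.
 */
static int
sketch_exclusive_open_check(const char *path)
{
	int first, second, result;

	first = open(path, O_RDONLY);
	if (first < 0)
		return (-1);

	second = open(path, O_RDONLY);
	if (second >= 0) {
		(void) close(second);
		result = 0;             /* device is not exclusive */
	} else if (errno == EBUSY) {
		result = 1;             /* exclusive access enforced */
	} else {
		result = -1;            /* unexpected failure */
	}
	(void) close(first);
	return (result);
}
```

For a tape driver that implements exclusive access, the expected result is 1; a result of 0 indicates that the second open was allowed.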

Disk Drivers

Disk drivers should be tested in both the raw and block device modes. For block device tests, a new file system should be created on the device and mounted. Multiple file operations can be performed on the device at this time.

Note - The file system uses a page cache, so reading the same file over and over again will not really be exercising the driver. The page cache can be forced to retrieve data from the device by memory mapping the file (with mmap(2)), and using msync(2) to invalidate the in-memory copies.
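The mapping technique in the note above might look like the following sketch. sketch_reread() is a hypothetical helper, and whether MS_INVALIDATE really forces the next access to go back to the device depends on the system, so the result should be confirmed by watching the driver's I/O activity rather than assumed.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map the file, ask the system to invalidate cached copies, then touch
 * every byte. Returns the sum of the bytes, or -1 on any failure.
 */
static long
sketch_reread(const char *path)
{
	struct stat st;
	unsigned char *p;
	long sum = 0;
	off_t i;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (fstat(fd, &st) != 0 || st.st_size == 0) {
		(void) close(fd);
		return (-1);
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	(void) close(fd);               /* the mapping stays valid */
	if (p == MAP_FAILED)
		return (-1);

	/* Request that in-memory copies be discarded before reading. */
	if (msync(p, (size_t)st.st_size, MS_SYNC | MS_INVALIDATE) != 0) {
		(void) munmap(p, (size_t)st.st_size);
		return (-1);
	}
	for (i = 0; i < st.st_size; i++)
		sum += p[i];

	(void) munmap(p, (size_t)st.st_size);
	return (sum);
}
```

Calling the function repeatedly on a file in the file system under test, and comparing the returned sums, gives a simple read test that does not rely on a single cached copy.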

Another (unmounted) partition of the same size can be copied to the raw device and then commands such as fsck(1M) can be used to verify the correctness of the copy. The new partition can also be mounted and compared to the old one on a file-by-file basis.

Asynchronous Communication Drivers

Asynchronous drivers can be tested at the basic level by setting up a login line to the serial ports. A good first check is whether a user can log in on this line. To sufficiently test an asynchronous driver, however, all of the I/O control functions must be tested, and many interrupts at high speed must occur. A test involving a loopback serial cable and high speed communication will help test the reliability of the driver. Running uucp(1C) over the line also provides some exercise; note, however, that uucp(1C) performs its own error handling, so it is important to verify that the driver is not reporting excessive numbers of errors to the uucp(1C) process.
These types of devices are usually STREAMS-based.
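A loopback test of the kind mentioned above can be reduced to writing a known pattern and checking what comes back. The sketch below assumes that the line has already been opened and configured (speed, raw mode), which is not shown. sketch_loopback_check() is a hypothetical helper; with a loopback cable on one port, the same descriptor would be passed for both wfd and rfd.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_CHUNK 256

/*
 * Write n pattern bytes to wfd in chunks and read each chunk back from
 * rfd. Returns the number of bytes that came back wrong, or -1 on an
 * I/O error.
 */
static long
sketch_loopback_check(int wfd, int rfd, size_t n)
{
	unsigned char out[SKETCH_CHUNK], in[SKETCH_CHUNK];
	long bad = 0;
	size_t done = 0;

	while (done < n) {
		size_t len = n - done < SKETCH_CHUNK ? n - done : SKETCH_CHUNK;
		size_t got = 0, i;

		for (i = 0; i < len; i++)
			out[i] = (unsigned char)((done + i) * 7 + 1);
		if (write(wfd, out, len) != (ssize_t)len)
			return (-1);
		while (got < len) {
			ssize_t r = read(rfd, in + got, len - got);

			if (r <= 0)
				return (-1);
			got += (size_t)r;
		}
		for (i = 0; i < len; i++)
			if (in[i] != out[i])
				bad++;
		done += len;
	}
	return (bad);
}
```

If the driver loses characters, the read in this sketch blocks indefinitely, so a practical test would add a read timeout through the terminal settings or an alarm.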

Network Drivers

Network drivers may be tested using standard network utilities. ftp(1) and rcp(1) are useful because the files can be compared on each end of the network. The driver should be tested under heavy network loading, so various commands should be run by multiple processes. Heavy network loading means:
  • There is a lot of traffic to the test machine.
  • There is heavy traffic among all machines on the network.
Network cables should be unplugged while the tests are executing, and the driver should recover gracefully from the resulting error conditions. Another important test is for the driver to receive multiple packets in rapid succession ("back-to-back" packets). In this case, a relatively fast host on a lightly-loaded network should send multiple packets in quick succession to the machine with the driver being tested. It should be verified that the receiving driver does not drop the second and subsequent packets.
These types of devices are usually STREAMS-based.
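The back-to-back packet test can be approximated from user level with numbered datagrams. The following sketch assumes connected datagram sockets and hosts of the same byte order. sketch_send_burst() would run on the fast sending host and sketch_count_missing() on the test machine, after allowing time for the burst to arrive. Both names are hypothetical, and the sketch does not distinguish packets dropped by the driver from packets dropped elsewhere on the path.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send count small datagrams as fast as possible, each carrying its index. */
static int
sketch_send_burst(int fd, uint32_t count)
{
	uint32_t seq;

	for (seq = 0; seq < count; seq++)
		if (send(fd, &seq, sizeof (seq), 0) != (ssize_t)sizeof (seq))
			return (-1);
	return (0);
}

/*
 * Drain whatever has arrived without blocking, and return how many of
 * the expected sequence numbers 0..count-1 are missing.
 */
static int
sketch_count_missing(int fd, uint32_t count)
{
	uint32_t seq, next = 0;
	int missing = 0;

	while (recv(fd, &seq, sizeof (seq), MSG_DONTWAIT) ==
	    (ssize_t)sizeof (seq)) {
		if (seq >= count || seq < next)
			continue;               /* stray or reordered datagram */
		missing += (int)(seq - next);   /* a gap means dropped packets */
		next = seq + 1;
	}
	return (missing + (int)(count - next));
}
```

A result of 0 after a burst indicates that every datagram, including the second and subsequent ones, reached the receiving application.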