Site/SunNet/Domain Manager Troubleshooting Guide

1 Network Communication Problems

1.1 System Resources

1.1.1 My system is hanging

This problem may be due to an infinite loop created by the SNMP trap forwarding feature. If, within Site/SunNet/Domain Manager, a trap is forwarded from machine to machine, an infinite loop (machine A -> B -> A) may occur, exhausting machine resources and adding to network traffic.
The trap forwarding feature should be used to forward SNMP traps to applications external to Site/SunNet/Domain Manager. Within this product, Cooperative Consoles should be used to exchange information.
For example, if machine A forwards a trap to machine B and both A and B use Cooperative Consoles to exchange events/traps, machine B may receive the same trap twice, depending on the Cooperative Consoles configuration.
If your system is hanging from an infinite loop as described above:
  1. Reboot your system.

  2. Edit the configuration file, snm.conf; either delete the snmtrap.forward line completely or change the snmtrap.forward specification to an external application.
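
Before editing, it can help to confirm whether a forwarding entry is present at all. The following is a minimal sketch, assuming that the entry sits on a line that begins with the keyword snmtrap.forward (this guide does not show the exact syntax of the line). The function name show_trap_forward is illustrative and not part of the product.

```shell
# List every snmtrap.forward line in a configuration file, with line numbers.
# Assumption: the keyword starts the line, optionally after whitespace.
# Exit status is 0 when at least one entry is found, 1 when none is found.
show_trap_forward() {
    conf_file=$1
    grep -n '^[[:space:]]*snmtrap\.forward' "$conf_file"
}
```

It would be called with the path to snm.conf as its only argument, for example show_trap_forward /etc/snm.conf on a Solaris 1.x system.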

1.1.2 CPU usage undesirable

The CPU goes to an undesirable level after running snm -i.
This problem is due to the ISAM lock file, NETISAM.LOCKTABLE.lock, being left in a bad state. If the lock file is not present, it is created by Site/SunNet/Domain Manager and then released when the application exits. If an exit occurs while the ISAM lock is being held (due to a segmentation fault), the next snm -i waits forever to get the lock. This results in the undesirable CPU usage.
The problem does not go away even if you deinstall SNM packages because the lock file is not a part of the package and will not be deleted. To resolve this problem:
  1. Find and kill the snm process. To find its process ID:

  ps -ef | grep snm     (Solaris 2.x)
  ps -ax | grep snm     (SunOS)

     Then kill the process ID that is reported: kill <processid>

  2. Change directories: cd /usr/tmp

  3. Remove the file: rm NETISAM.LOCKTABLE.lock

  4. Restart the Site/SunNet/Domain Manager application: snm -i or snm
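
The cleanup steps above can be sketched as two small shell functions. This is an illustration only: the function names are hypothetical, and the first function assumes that the snm process appears in the ps listing as a field that is exactly snm or a path ending in /snm.

```shell
# Print the PIDs of snm processes found in ps output read from stdin.
# $1 is the column that holds the PID: 2 for "ps -ef" (Solaris 2.x),
# 1 for "ps -ax" (SunOS).
snm_pids() {
    awk -v col="$1" '{
        for (i = 1; i <= NF; i++)
            if ($i == "snm" || $i ~ /\/snm$/) { print $col; next }
    }'
}

# Remove the NetISAM lock file from the given directory (default /usr/tmp)
# and report what was done.
remove_isam_lock() {
    lock_file=${1:-/usr/tmp}/NETISAM.LOCKTABLE.lock
    if [ -f "$lock_file" ]; then
        rm -f "$lock_file" && echo "removed $lock_file"
    else
        echo "no lock file at $lock_file"
    fi
}
```

On Solaris 2.x, ps -ef | snm_pids 2 would list the process IDs to kill; once they are gone, remove_isam_lock clears the lock file before snm is restarted.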

1.1.3 System performance degrades

Everything runs fine until asynchronous traps are received at a fairly regular rate (2 - 4 per second). When the asynchronous traps are received, the window system response becomes severely degraded.
This performance degradation occurs when DNS is not available while traps arrive at that rate. Without DNS, the console and trap daemon end up waiting for a DNS response, and overall response degrades. To address this problem, make sure DNS is available. See your platform-specific documentation for more information.

1.2 Making Contact

1.2.1 Can't start application

If the Console fails to start or crashes, you may see messages that are returned by NetISAM. These messages have the following format:

  netisam[xxx]:<error>:unix errno <n>  

where errno <n> messages can be found in the header file /usr/include/errno.h.
The following is a list of the more common NetISAM message numbers and descriptions of the problem.
Table 0-1

  errno  Description
  2      There is no such file or directory. The database directory specified by the SNMDBDIR environment variable does not exist. Or, the runtime MDB files nc.rec, nc.ind, events.rec, and events.ind (normally in the ./db.<username> directory) do not exist.
  12     There is not enough memory. You have run out of virtual memory and need to increase the amount of swap space.
  13     You do not have the proper read/write access permission to the database directory specified by the SNMDBDIR environment variable or to the database files in the MDB directory (normally in ./db.<username>).
  28     The file system on which the database exists is full.
  30     The database directory that you are trying to write to is a read-only file system.
  70     The database you are using is on an NFS-mounted file system and the machine from which you are mounting may have been rebooted. Reboot your machine if this is the case.
  71     The database is on a file system that is mounted from more than one host.
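
As an illustration of how the table can be applied, the sketch below pulls the errno value out of a message in the format shown earlier and prints a short hint paraphrased from Table 0-1. The function name netisam_hint and the sample message text in the note after the code are made up for the example.

```shell
# Print a one-line hint for the errno found in a NetISAM message.
# The hints paraphrase Table 0-1; other values are referred to errno.h.
netisam_hint() {
    # Keep only the digits that follow "unix errno ".
    n=$(printf '%s\n' "$1" | sed -n 's/.*unix errno \([0-9][0-9]*\).*/\1/p')
    case $n in
        2)  echo "database directory or MDB files do not exist" ;;
        12) echo "out of virtual memory; increase swap space" ;;
        13) echo "no read/write permission on database directory or files" ;;
        28) echo "file system holding the database is full" ;;
        30) echo "database directory is on a read-only file system" ;;
        70) echo "NFS-mounted database; the serving machine may have rebooted" ;;
        71) echo "database file system is mounted from more than one host" ;;
        "") echo "no errno found in message"; return 1 ;;
        *)  echo "errno $n: see /usr/include/errno.h" ;;
    esac
}
```

For instance, netisam_hint 'netisam[123]:sample:unix errno 13' would print the permission hint.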

1.2.2 Can't contact activity or event daemon

You might receive a message to this effect. In this case, check for the existence of /etc/inetd.conf. Make sure that this file contains lines to start na.activity and na.event. For example, in a Solaris 1.x environment, it would look like this:

  na.activity/10  tli rpc/udp  wait   root    /usr/snm/agents/na.activity  na.activity  
  
  <several intervening lines>  
  
  na.event/10     tli rpc/udp  wait   root    /usr/snm/agents/na.event na.event  
  na.event/10     tli rpc/tcp  wait   root    /usr/snm/agents/na.event na.event  

If inetd.conf appears to be correct, reinvoke /usr/etc/inetd (in a Solaris 1.x environment; for a Solaris 2.x machine invoke /usr/sbin/inetd).
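
A quick way to make this check is sketched below. It assumes that each entry begins with the daemon name followed by a slash, as in the example lines above, and that commented-out entries start with #. The function name check_snm_inetd is illustrative.

```shell
# Report whether na.activity and na.event have entries in an
# inetd.conf-style file (default /etc/inetd.conf).
# Returns 0 when both are present, 1 when either is missing.
check_snm_inetd() {
    conf=${1:-/etc/inetd.conf}
    rc=0
    for daemon in na.activity na.event; do
        # Anchoring at the start of the line skips commented-out entries.
        if grep "^[[:space:]]*$daemon/" "$conf" >/dev/null 2>&1; then
            echo "$daemon: present"
        else
            echo "$daemon: MISSING"
            rc=1
        fi
    done
    return $rc
}
```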

1.2.3 Can't communicate with target system

  • The target system does not support SNMP or the SNMP agent for the target system is not running.

    If you are trying to send a request to a target system through the SNMP proxy agent and the proxy is unable to communicate with the target system, the following messages appear in the Console footer and Error Reports log, respectively.


  Cannot start request 'snmp.<group>.<#>'--will retry later. See  
  error log for details.  


  Error: Request snmp.<group>.<#>: API error: Remote procedure call  
  timed out: Cannot send request: Retrying.  

See the section on managing SNMP devices in the Administration Guide for more information about using the SNMP proxy agent.

1.3 Agent Software

1.3.1 Can't access agent software

  • Agent software is not installed on target system.

    The specified agent may not be installed on the target system if the request is stopped and the following message appears in the Error Reports window:


  Error: Request <request_name>: API error: Cannot create RPC  
  client: program=<number>, version=10: RPC. Program not registered  

Run the getagents program to install the agent software and libraries from the manager station. For instructions on running getagents and on installing agents, see the Installation Guide.
If an agent is a proxy agent, the agent software does not need to be resident on the target system (the proxy agent software needs to be installed only on the proxy system). SNM-supplied proxy agents are hostperf, ippath, lpstat, ping, traffic, and SNMP proxy agents.
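
One way to confirm that an agent's RPC program is not registered is to inspect the portmapper listing for the target system. The sketch below assumes the usual rpcinfo -p listing, in which the first column is the program number. The function name is illustrative, and the program number to look for is the one reported in the error message.

```shell
# Succeed if the given RPC program number appears in "rpcinfo -p"
# output read from stdin. Column 1 of that listing is the program number.
rpc_program_registered() {
    awk -v prog="$1" '$1 == prog { found = 1 } END { exit !found }'
}
```

A typical call would be rpcinfo -p <host> | rpc_program_registered <number>, with a nonzero exit status indicating that the program is not registered on that host.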

1.3.2 Proxy agent doesn't work

  • The proxy agent cannot find the schema or group specified in the request.

    If the SNMP proxy agent cannot find the schema file for the target system or the schema file on the proxy system is not the same as the one loaded in the Console, the following message appears:


  Error(Get): Invalid object group: <gro>  

This can happen if you are attempting to use a schema file that is not a "standard" schema file supplied by Sun (the schema file has been supplied by another vendor, or you have created your own schema file). Make sure that the schema file on the SNMP proxy system is located in a directory specified by the na.snmp.schemas keyword in the /etc/snm.conf (Solaris 1.x) or /etc/opt/SUNWconn/snm/snm.conf (Solaris 2.x) file on the proxy system. If you change the keyword value while the SNMP proxy agent is running, you need to kill the proxy agent for the change to take effect. Make sure that the schema file on the proxy system is the same as the schema file on the Console.
  • The SNMP proxy agent encounters an error while performing the request.

    Any error messages that are prefixed by Error(Set): are messages returned by the SNMP proxy agent; see the na.snmp man page for more information about the error.

  • The community name defined in the request does not match the community name associated with the SNMP device.

    This is likely to be the case if either of the following messages appears:


  No response from system: <system>  
  
  This variable not available for set  

An authenticationFailure trap is typically sent by SNMP devices in this situation.
Read and write community names are defined in the Properties window for the element. In the Set Tool window, you can also specify community name values in the Options field for each request. If you do not specify community name values in the Properties window of the element or in the Options field of the request, the SNMP proxy agent uses "public" for both community names.

1.3.3 Can't start Agents or Daemons

Occasionally you may find that an agent does not start when you send a request to it. It could be due to a remote procedure call error, listed in "Error Messages". If you notice the following message in the messages file or the system console log:

  <host> inetd[nnn]: <agent>/rpc/udp server failing (looping), service terminated  

the cause is that the agent dies immediately each time inetd(8c) starts it. inetd(8c) tries several times to start the agent and gives up when the agent keeps failing. In this case, start the agent manually with debugging enabled by entering:

  host# <agents-path>/agents/na.<agent> -d 3  

where <agents-path> is typically /usr/snm for Solaris 1.x installations or /opt/SUNWconn/snm for Solaris 2.x installations.
You should then get some indication of why the agent is failing. It may be unable to find libnetmgt.so, in which case ldconfig(8) may need to be run so that the runtime linker knows about the shared library. The agent may also be failing for some other reason; for example, a resource it needs may be unavailable and it cannot send an error message back to the manager. Once you know what the error means, stop the agent (if it has not stopped itself), correct the error, and restart inetd(8c) rather than sending it a HUP signal. The commands to do this are shown below:
On Solaris 1.x:

  host# ps -ax | grep inetd  
  host# kill <processid>  
  host# /usr/etc/inetd  

On Solaris 2.x:

  host# ps -ef | grep inetd  
  host# kill <processid>  
  host# /usr/sbin/inetd  

If an agent dumps core, the core file is stored in the /usr/tmp directory. If you cannot resolve the problem with the agent, send a copy of the core file in with your problem report.
The above commands can also be used to stop and restart the SNM activity daemon. You should do this if you see the following message:

  Cannot contact na.activity to determine ID  

1.4 Sending and Receiving Requests

1.4.1 Can't get reports back

Expected reports are not returning. To resolve, check the request to make sure that the Defer Reports field of the data request is off (no check mark appears in the accompanying box).

1.4.2 Can't send request

No threshold has been set in the event specification. If you attempt to send an event specification to an agent without specifying the conditions of the event, the following message is displayed.

  No threshold is set, request is not sent.  

Make sure that you click SELECT on the Apply button after specifying any thresholds.

1.4.3 Can't get request back

Note that the states awaiting activation, being activated, awaiting stop, and being stopped are normally temporary. If you notice that a request seems to be in one of these states for a long time without changing, you should make sure that the target element is reachable by using the ping agent to do a Quick Dump request.

1.5 License-related Issues

1.5.1 Can't get license info

The Console comes up and displays a notice, "could not get license information..." and exits.
To resolve, invoke the install_snm_license tool, which is available under /usr/snm/bin (Solaris 1.x) or /opt/SUNWconn/snm/bin (Solaris 2.x). The tool displays a set of phone numbers to call for the license. Once you have the license, use the tool to install it.

1.5.2 Can't run Network Layout or CC_Receiver tools

You are not able to run the Network Layout (NLA) application or the CC_Receiver tools.
To resolve, click "About..." under the File menu. The display contains your license type. If the license is not Domain Manager, you cannot run NLA or CC_Receiver. Call for an upgrade to Domain Manager.