System Administration Guide, Volume II

Monitoring Network Performance

72

This chapter describes how to monitor network performance. It contains the following step-by-step instructions:
  • How to Check the Response of Hosts on the Network
  • How to Send Packets to Hosts on the Network
  • How to Capture Packets From the Network
  • How to Check the Network Status
  • How to Display NFS Server and Client Statistics

Monitoring Network Performance

Table 72-1 describes the commands available for monitoring network performance.

Table 72-1
Command    Use This Command To ...
ping       Look at the response of hosts on the network.
spray      Test the reliability of your packet sizes. It can tell you whether
           packets are being delayed or dropped.
snoop      Capture packets from the network and trace the calls from each
           client to each server.
netstat    Display network status, including state of the interfaces used for
           TCP/IP traffic, the IP routing table, and the per-protocol
           statistics for UDP, TCP, ICMP, and IGMP.
nfsstat    Display a summary of server and client statistics that can be used
           to identify NFS problems.

· How to Check the Response of Hosts on the Network

Check the response of hosts on the network with the ping command.

  $ ping hostname  

If you suspect a physical problem, you can use ping to find the response time of several hosts on the network. If the response from one host is not what you would expect, you can investigate that host. Physical problems could be caused by:
  • Loose cables or connectors
  • Improper grounding
  • Missing termination
  • Signal reflection
For more information about this command, see the ping(1M) man pages.

Examples--Checking the Response of Hosts on the Network

The simplest version of ping sends a single packet to a host on the network. If it receives the correct response, it prints the message host is alive.

  $ ping elvis  
  elvis is alive.  

With the -s option, ping sends one datagram per second to a host. It then prints each response and the time it took for the round trip. For example:

  $ ping -s pluto  
  64 bytes from pluto (123.456.78.90): icmp_seq=0. time=10. ms  
  64 bytes from pluto (123.456.78.90): icmp_seq=5. time=0. ms  
  64 bytes from pluto (123.456.78.90): icmp_seq=6. time=0. ms  
  ^C  
  ----pluto PING Statistics----  
  8 packets transmitted, 8 packets received, 0% packet loss  
  round-trip (ms) min/avg/max = 0/2/10  
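
When you suspect a physical problem, checking several hosts in one pass can be convenient. The following is a minimal sketch, not part of the original procedure. The function name check_hosts is hypothetical, and the sketch assumes that ping sends a single probe and exits with a nonzero status when the host does not answer, as the simple form shown above does.

```shell
# check_hosts HOST [HOST ...]
# Hypothetical helper: run the simple form of ping against each host
# and report whether it answered. Assumes ping exits nonzero on failure.
check_hosts() {
    for host in "$@"; do
        if ping "$host" >/dev/null 2>&1; then
            echo "$host: responding"
        else
            echo "$host: NO RESPONSE"
        fi
    done
}
```

For example, check_hosts elvis pluto prints one status line per host. A host that is reported as not responding is a candidate for closer investigation.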

· How to Send Packets to Hosts on the Network

Test the reliability of your packet sizes with the spray command.

  $ spray [ -c count -d interval -l packet_size ] hostname  

In this command,
-c count          Is the number of packets to send.
-d interval       Is the number of microseconds to pause between sending
                  packets. If you don't use a delay, you may run out of
                  buffers.
-l packet_size    Is the packet size.
hostname          Is the system to send packets to.
For more information about this command, see the spray(1M) man pages.

Example--Sending Packets to Hosts on the Network

The following example sends 100 packets to a host (-c 100) with each packet having a size of 2048 bytes (-l 2048). The packets are sent with a delay time of 20 microseconds between each burst (-d 20).

  $ spray -c 100 -d 20 -l 2048 pluto  
  sending 100 packets of length 2048 to pluto ...  
  no packets dropped by pluto  
  279 packets/sec, 573043 bytes/sec  
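
To see whether drops depend on packet size, you could repeat the test at several sizes. The following sketch wraps the spray options described above in a loop. The function name spray_sizes is hypothetical, and the count of 100 and the delay of 20 microseconds are only illustrative values carried over from the example.

```shell
# spray_sizes HOST SIZE [SIZE ...]
# Hypothetical helper: run spray against HOST once per packet size.
# The count (100) and delay (20 microseconds) are illustrative values.
spray_sizes() {
    host=$1
    shift
    for size in "$@"; do
        echo "== packet size $size =="
        spray -c 100 -d 20 -l "$size" "$host"
    done
}
```

For example, spray_sizes pluto 512 1024 2048 runs three tests in a row so that the reported drop counts can be compared.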

· How to Capture Packets From the Network

To capture packets from the network and trace the calls from each client to each server, use snoop. This command provides accurate time stamps that allow some network performance problems to be isolated quickly. For more information, see snoop(1M).

  # snoop  

Dropped packets could be caused by insufficient buffer space, or an overloaded CPU.

· How to Check the Network Status

Display network status information, such as statistics about the state of network interfaces, routing tables, and various protocols, with the netstat command.

  $ netstat [-i] [-r] [-s]  

In this command,
-i    Displays the state of the TCP/IP interfaces.
-r    Displays the IP routing table.
-s    Displays statistics for the UDP, TCP, ICMP, and IGMP protocols.

For more information, see the netstat(1M) man pages.

Examples--Checking the Network Status

The following example shows output from the netstat -i command, which displays the state of the interfaces used for TCP/IP traffic.

  $ netstat -i  
  Name  Mtu   Net/Dest  Address    Ipkts    Ierrs  Opkts   Oerrs  Collis  Queue  
  lo0   8232  software  localhost  1280     0      1280    0      0       0  
  le0   1500  loopback  venus      1628480  0      347070  16     39354   0  

This display shows how many packets a machine has transmitted and received on each interface. A machine with active network traffic should show both Ipkts and Opkts continually increasing.
Calculate the network collision rate by dividing the number of collisions (Collis) by the number of output packets (Opkts). In the above example, the collision rate for le0 is about 11 percent (39354/347070). A network-wide collision rate greater than 5 to 10 percent can indicate a problem.
Calculate the input packet error rate by dividing the number of input errors by the total number of input packets (Ierrs/Ipkts). The output packet error rate is the number of output errors divided by the total number of output packets (Oerrs/Opkts). If the input error rate is high (over 0.25 percent), the host may be dropping packets.
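
The rate calculations described above can be sketched as a small awk filter. This is an illustration only: the function name iface_rates is hypothetical, and the sketch assumes the ten-column netstat -i layout shown in the example (Ipkts, Ierrs, Opkts, Oerrs, and Collis in columns 5 through 9).

```shell
# iface_rates
# Hypothetical helper: read "netstat -i" output on standard input and
# print, for each interface, the collision rate (Collis/Opkts), the
# input error rate (Ierrs/Ipkts), and the output error rate
# (Oerrs/Opkts) as percentages. The header line and any interface
# with no traffic are skipped to avoid dividing by zero.
iface_rates() {
    awk 'NR > 1 && NF >= 9 && $5 > 0 && $7 > 0 {
        printf "%s collisions=%.2f%% in_errors=%.2f%% out_errors=%.2f%%\n",
            $1, 100 * $9 / $7, 100 * $6 / $5, 100 * $8 / $7
    }'
}
```

A typical invocation would be:

  $ netstat -i | iface_rates  

Compare the collision figure against the 5 to 10 percent guideline and the input error figure against the 0.25 percent guideline.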
The following example shows output from the netstat -s command, which displays the per-protocol statistics for the UDP, TCP, ICMP, and IGMP protocols.

  UDP  
           udpInDatagrams           =  61321     udpInErrors        =     0  
           udpOutDatagrams          =   6783  
  TCP      tcpRtoAlgorithm          =      4     tcpRtoMin          =    50  
           tcpRtoMax                =  60000     tcpMaxConn         =    -1  
           .  
           .  
           .  
  IP       ipForwarding             =      1     ipDefaultTTL       =   255  
           ipInReceives             = 134298     ipInHdrErrors      =     0  
           .  
           .  
           .  
  ICMP     icmpInMsgs               =    116     icmpInErrors       =     0  
           icmpInCksumErrs          =      0     icmpInUnknowns  
  
  IGMP:  
           0 messages received  
           0 messages received with too few bytes  
  
  
           0 membership reports sent  

The following example shows output from the netstat -r command, which displays the IP routing table.

  Routing Table:  
  Destination           Gateway            Flags     Ref       Use    Interface  
  ---------------       ------------       -----     ---       ---    ----------  
  localhost             localhost          UGHD      0           0    lo0  
  earth-bb              sleepy             U         3           1  
  software              pluto              U         3         147    lo0  
  224.0.0.0             pluto              UG        3           0    lo0  
  default               mars               UG        0          18  
  default               earth              UG        0          30  
  default               venus              UG        0          18  
  default               neptune            UG        0          26  
  default               saturn             UG        0           3  

The fields in the netstat -r report are described in Table 72-2.
Table 72-2 netstat -r
Field        Description
Flags        U    The route is up
             G    The route is through a gateway
             H    The route is to a host
             D    The route was dynamically created using a redirect
Ref          Shows the current number of routes sharing the same link layer
Use          Indicates the number of packets sent out
Interface    Lists the network interface used for the route
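
Because the Flags column packs several letters into one field, it can help to expand it. The following sketch decodes a Flags string by using only the four letters described in Table 72-2. The function name describe_flags is hypothetical, and the sketch assumes a shell that supports POSIX parameter expansion, such as ksh.

```shell
# describe_flags FLAGS
# Hypothetical helper: expand a netstat -r Flags string such as "UG"
# into one description per letter. Only the four flags described in
# Table 72-2 are recognized; any other letter is reported as such.
describe_flags() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}          # everything after the first character
        flag=${flags%"$rest"}    # the first character only
        case $flag in
            U) echo "U: route is up" ;;
            G) echo "G: route is through a gateway" ;;
            H) echo "H: route is to a host" ;;
            D) echo "D: route was dynamically created using a redirect" ;;
            *) echo "$flag: not described in this chapter" ;;
        esac
        flags=$rest
    done
}
```

For example, describe_flags UGHD prints one line for each of the four flags.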

· How to Display NFS Server and Client Statistics

The NFS distributed file service uses a remote procedure call (RPC) facility which translates local commands into requests for the remote host. The remote procedure calls are synchronous. That is, the client application is blocked or suspended until the server has completed the call and returned the results. One of the major factors affecting NFS performance is the retransmission rate.
If the file server cannot respond to a client's request, the client retransmits the request a specified number of times before it quits. Each retransmission imposes system overhead, and increases network traffic. Excessive retransmissions can cause network performance problems. If the retransmission rate is high, you could look for:
  • Overloaded servers that take too long to complete requests
  • An Ethernet interface dropping packets
  • Network congestion which slows the packet transmission
Use nfsstat -c to show client statistics, and nfsstat -s to show server statistics. Use nfsstat -m to display network statistics for each file system. For more information, see the nfsstat(1M) man pages.

Examples--Displaying NFS Server and Client Statistics

The following example displays RPC and NFS data for the client, pluto.

  $ nfsstat -c  
  
  Client rpc:  
  calls    badcalls  retrans  badxid  timeout  wait  newcred  timers  
  6888     123       10       51      101      0     0        138  
  
  Client nfs:  
  calls    badcalls  nclget   nclcreate  
  6765     0         6765     0  
  
  null     getattr   setattr  root     lookup    readlink  read  
  0 0%     1364 20%  4 0%     0 0%     1643 24%  928 13%   162 2%  
  
  wrcache  write     create   remove   rename    link      symlink  
  0 0%     14 0%     11 0%    1 0%     0 0%      0 0%      0 0%  
  
  mkdir    rmdir     readdir   fsstat  
  1 0%     0 0%      2535 37%  102 1%  

The output of the nfsstat -c command is described in Table 72-3.
Table 72-3 nfsstat -c
Field       Description
calls       The total number of calls sent.
badcalls    The total number of calls rejected by RPC.
retrans     The total number of retransmissions. For this client, the
            retransmission rate is less than 1 percent (10 retransmissions
            out of 6888 calls). These may be caused by temporary failures.
            Higher rates may indicate a problem.
badxid      The number of times that a duplicate acknowledgment was received
            for a single NFS request.
timeout     The number of calls that timed out.
wait        The number of times a call had to wait because no client handle
            was available.
newcred     The number of times the authentication information had to be
            refreshed.
timers      The number of times the time-out value was greater than or equal
            to the specified time-out value for a call.
readlink    The number of times a read was made to a symbolic link. If this
            number is high (over 10 percent), it could mean that there are
            too many symbolic links.
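
The percentages discussed in Table 72-3 can be checked by hand with a small helper. The following is a minimal sketch. The function name percent is hypothetical, and the sketch assumes an awk that accepts the -v option (nawk on older systems).

```shell
# percent PART TOTAL
# Hypothetical helper: print PART as a percentage of TOTAL with two
# decimal places, or "n/a" when TOTAL is zero.
percent() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "n/a"; exit }
        printf "%.2f\n", 100 * part / total
    }'
}
```

With the figures from the example, percent 10 6888 gives the client retransmission rate (retrans divided by RPC calls), which is well under 1 percent. In the same way, percent 928 6765 gives the readlink share of NFS calls, which is above the 10 percent guideline.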
The following example shows output from the nfsstat -m command.

  pluto$ nfsstat -m  
  /usr/man from pluto:/export/svr4/man  
   Flags:   hard,intr,dynamic read size=8192, write size=8192,  retrans = 5  
   Lookups: srtt=14 (35ms), dev=4 (20ms), cur=3 (60ms)  
   Reads:   srtt=17 (42ms), dev=6 (30ms), cur=5 (100ms)  
   All:     srtt=15 (37ms), dev=7 (35ms), cur=5 (100ms)  

The output of the nfsstat -m command, which is displayed in milliseconds, is described in Table 72-4.
Table 72-4 nfsstat -m
Field    Description
srtt     The smoothed average of the round-trip times
dev      The average deviation
cur      The current "expected" response time
If you suspect that the hardware components of your network are creating problems, you need to look carefully at the cabling and connectors.