SMCC NFS Server Performance and Tuning Guide

Chapter 5: Troubleshooting

Table 5-1 lists the actions to perform when you encounter a tuning problem.
Table 5-1
Command/Tool | Command Output/Result | Action
netstat -i | (Collis + Ierrs + Oerrs) / (Ipkts + Opkts) > 2% | Check the Ethernet hardware.
netstat -i | Collis/Opkts > 10% | Add an Ethernet interface and distribute client load.
netstat -i | Ierrs/Ipkts > 25% | The host may be dropping packets, causing a high input error rate. To compensate for bandwidth-limited network hardware, reduce the packet size: set the read buffer size (rsize) and/or the write buffer size (wsize) to 2048 when using mount or in the /etc/vfstab file. See "To find the number of packets and collisions or errors on each network" in Chapter 3, "Analyzing NFS Performance."
nfsstat -s | readlink > 10% | Replace symbolic links with mount points.
nfsstat -s | writes > 5% | Install a Prestoserve NFS accelerator (SBus card or NVRAM-NVSIMM) for peak performance. See "Prestoserve NFS Accelerator" on page 75 in Chapter 4.
nfsstat -s | There are any badcalls | The network may be overloaded. Identify an overloaded network using network interface statistics.
nfsstat -s | getattr > 40% | Increase the client attribute cache using the actimeo option. Make sure the DNLC and inode caches are large. Use vmstat -s to determine the percent hit rate (cache hits) for the DNLC and, if needed, increase ncsize in the /etc/system file. See "Directory Name Lookup Cache (DNLC)" on page 82 in Chapter 4.
vmstat -s | Hit rate (cache hits) < 90% | Increase ncsize in the /etc/system file.
Ethernet monitor, for example: SunNet Manager(TM), SharpShooter, NetMetrix | Load > 35% | Add an Ethernet interface and distribute client load.
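The netstat -i and nfsstat -s thresholds in Table 5-1 are simple ratios, so they can be checked mechanically once the counters have been read off the command output. The following Python is a minimal sketch of that arithmetic only. The function names and parameters are hypothetical, the counters are assumed to have been copied by hand from the command output, and the first ratio is read as total errors and collisions over total packets.

```python
def netstat_i_findings(ipkts, ierrs, opkts, oerrs, collis):
    """Apply the Table 5-1 netstat -i thresholds to one interface's counters.

    Arguments are the raw counters from one line of netstat -i output.
    Returns a list of suggested actions (empty if no threshold is exceeded).
    """
    findings = []
    total_pkts = ipkts + opkts
    # (Collis + Ierrs + Oerrs) / (Ipkts + Opkts) > 2%
    if total_pkts and (collis + ierrs + oerrs) / total_pkts > 0.02:
        findings.append("Check the Ethernet hardware.")
    # Collis / Opkts > 10%
    if opkts and collis / opkts > 0.10:
        findings.append("Add an Ethernet interface and distribute client load.")
    # Ierrs / Ipkts > 25%
    if ipkts and ierrs / ipkts > 0.25:
        findings.append("Host may be dropping packets; try rsize/wsize of 2048.")
    return findings


def nfsstat_s_findings(readlink_pct, write_pct, getattr_pct, badcalls):
    """Apply the Table 5-1 nfsstat -s thresholds.

    The *_pct arguments are the per-operation percentages that nfsstat -s
    reports; badcalls is the raw badcalls counter.
    """
    findings = []
    if readlink_pct > 10:
        findings.append("Replace symbolic links with mount points.")
    if write_pct > 5:
        findings.append("Consider a Prestoserve NFS accelerator.")
    if getattr_pct > 40:
        findings.append("Increase actimeo; check the DNLC and inode caches.")
    if badcalls > 0:
        findings.append("Network may be overloaded; check interface statistics.")
    return findings


if __name__ == "__main__":
    # Illustrative counters only, not measurements from a real server.
    for action in netstat_i_findings(100000, 50, 80000, 10, 9000):
        print(action)
    for action in nfsstat_s_findings(12, 3, 45, 0):
        print(action)
```

The sketch deliberately returns the action text from the table rather than a pass/fail flag, so the result can be compared against Table 5-1 row by row.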
Table 5-2, Table 5-3, and Table 5-4 show potential client bottlenecks.
Table 5-2
Symptom(s) | Command/Tool | Cause | Solution
NFS server hostname not responding or slow response to commands when using NFS-mounted directories | nfsstat | User's path variable | List directories on local file systems first, critical directories on remote file systems second, and then the rest of the remote file systems.
NFS server hostname not responding or slow response to commands when using NFS-mounted directories | nfsstat | Running executable from an NFS-mounted file system | Copy the application locally (if used often).
NFS server hostname not responding; badxid > 5% of total calls and badxid = timeout | nfsstat -rc | Client times out before server responds | Check for server bottleneck. If the server's response time isn't improved, increase the timeo parameter in the /etc/vfstab file of clients. Try increasing timeo to 25, 50, 100, 200 (tenths of seconds). Wait one day between modifications and check to see if the number of time-outs decreases.
badxid = 0 | nfsstat -rc | Slow network | Increase rsize and wsize in the /etc/vfstab file. Check interconnection devices (bridges, routers, gateways).
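The two nfsstat -rc rows in Table 5-2 amount to a small decision rule on the calls, badxid, and timeout counters, followed by a stepwise increase of timeo. The Python below is a rough sketch of that rule, not a tool from this guide. The function names are hypothetical, the 10% tolerance used to decide that badxid is "equal" to timeout is an assumption, and treating badxid = 0 as significant only when time-outs were actually recorded is also an assumption.

```python
# timeo values suggested in Table 5-2, in tenths of a second.
TIMEO_STEPS = (25, 50, 100, 200)


def classify_nfsstat_rc(calls, badxid, timeout, tolerance=0.10):
    """Classify client RPC counters taken from nfsstat -rc output.

    tolerance is an assumed margin for "badxid = timeout"; the table
    does not say how close the two counters must be.
    """
    if calls <= 0:
        return "no data"
    if timeout == 0:
        # Assumption: with no time-outs there is nothing to diagnose.
        return "no time-outs"
    if badxid == 0:
        return "slow network"
    close = abs(badxid - timeout) <= tolerance * timeout
    if badxid / calls > 0.05 and close:
        return "client times out before server responds"
    return "inconclusive"


def next_timeo(current):
    """Return the next timeo value to try, or None if 200 is already in use."""
    for step in TIMEO_STEPS:
        if step > current:
            return step
    return None


if __name__ == "__main__":
    # Illustrative counters only.
    print(classify_nfsstat_rc(calls=1000, badxid=60, timeout=62))
    print(next_timeo(25))
```

As the table notes, each new timeo value should be left in place for a day before the counters are compared again, so a helper like next_timeo would be applied at most once per day.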
Table 5-3
Symptom(s) | Command/Tool | Cause | Solution
NFS server hostname not responding | vmstat -s or iostat | Cache hit rate is < 90% | Adjust the suggested parameters for DNLC, then run to see if the symptom is gone. If not, reset the parameters for DNLC. Adjust the parameters for the buffer cache, then the inode cache, following the same procedure as for the DNLC.
NFS server hostname not responding | netstat -m or nfsstat | Server not keeping up with request arrival rate | Check the network. If the problem is not the network, add an appropriate Prestoserve NFS accelerator, or upgrade the server.
High I/O wait time or CPU idle time. Slow disk access times or NFS server hostname not responding | iostat -x | I/O load not balanced across disks. The svc_t value is greater than 40 ms | Take a large sample (~2 weeks). Balance the load across disks; add disks as necessary. Add a Prestoserve NFS accelerator for synchronous writes. To reduce disk and network traffic, use tmpfs for /tmp for both server and clients. Measure system cache efficiencies.
Slow response when accessing remote files | netstat -s or snoop | Ethernet interface dropping packets | If retransmissions are indicated, increase buffer size. For information on how to use snoop, see "snoop" on page 96 in Appendix A.
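Two of the checks in Table 5-3 are numeric: a cache hit rate below 90% and a disk svc_t above 40 ms over a large sample. The Python below sketches both under stated assumptions. The names are hypothetical, the svc_t readings are assumed to have been collected from repeated iostat -x runs by some other means, and averaging the readings per disk is an assumption, since the table only says to take a large sample.

```python
from statistics import mean

SVC_T_LIMIT_MS = 40.0   # svc_t threshold from Table 5-3
HIT_RATE_LIMIT = 0.90   # cache hit rate threshold from Table 5-3


def slow_disks(samples):
    """Return the disks whose mean svc_t exceeds the 40 ms threshold.

    samples maps a disk name to a list of svc_t readings in milliseconds.
    Disks with no readings are skipped.
    """
    return sorted(
        disk
        for disk, readings in samples.items()
        if readings and mean(readings) > SVC_T_LIMIT_MS
    )


def cache_hit_rate_low(hits, lookups):
    """True when hits/lookups is below 90%; False when there is no data."""
    return lookups > 0 and hits / lookups < HIT_RATE_LIMIT


if __name__ == "__main__":
    # Illustrative readings only, not measurements from a real server.
    readings = {"sd0": [55.0, 61.2, 48.9], "sd1": [12.0, 9.5], "sd3": []}
    print(slow_disks(readings))
    print(cache_hit_rate_low(hits=850, lookups=1000))
```

A disk that appears in the result is a candidate for the load balancing the table describes; the sketch does not attempt to decide how the load should be moved.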
Table 5-4
Symptom(s) | Command/Tool | Cause | Solution
Poor response time when accessing directories mounted on different subnets or NFS server hostname not responding | netstat -rs | NFS requests being routed | Keep clients on the subnet directly connected to the server.
Poor response time when accessing directories mounted on different subnets or NFS server hostname not responding | nfsstat | Dropped packets | Make protocol queues deeper.
Poor response time when accessing directories mounted on different subnets or NFS server hostname not responding | netstat -s shows incomplete or bad headers, bad data length fields, bad checksums | Network problems | Check network hardware.
Poor response time when accessing directories mounted on different subnets or NFS server hostname not responding; sum of input and output packets for an interface is over 600 per second | netstat -i | Network overloaded | The network segment is very busy. If this is a recurring problem, consider adding another (le) network interface.
Network interface collisions are over 120 per second | netstat -i | Network overloaded | Reduce the number of machines on the network or check the network hardware.
Poor response time when accessing directories mounted on different subnets or NFS server hostname not responding | netstat -i | High packet collision rate (Collis/Opkts > .10) | If packets are corrupted, it may be due to a corrupted MUX box; use the Network General Sniffer product or another protocol analyzer to find the cause. Check for an overloaded network; if there are too many nodes, create another subnet. Check network hardware; the cause could be a bad tap, transceiver, or hub on 10BASE-T. Check cable length and termination.
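The per-second thresholds in Table 5-4 (more than 600 packets per second, more than 120 collisions per second) need a rate, whereas netstat -i reports cumulative counters. One way to get a rate is to take two readings a known interval apart and divide the differences by the interval. The Python below is a minimal sketch of that calculation; the function names and the dictionary keys are hypothetical, and the counters are assumed to have been copied from two netstat -i runs for the same interface.

```python
def interface_rates(first, second, interval_seconds):
    """Compute per-second packet and collision rates from two readings.

    first and second are dicts with the cumulative "ipkts", "opkts", and
    "collis" counters for one interface; second is the later reading.
    Returns (packets_per_second, collisions_per_second).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    packets = (second["ipkts"] - first["ipkts"]) + (second["opkts"] - first["opkts"])
    collisions = second["collis"] - first["collis"]
    return packets / interval_seconds, collisions / interval_seconds


def network_findings(packets_per_second, collisions_per_second):
    """Apply the Table 5-4 per-second thresholds."""
    findings = []
    if packets_per_second > 600:
        findings.append("Segment is very busy; consider another network interface.")
    if collisions_per_second > 120:
        findings.append("Reduce machines on the network or check the hardware.")
    return findings


if __name__ == "__main__":
    # Illustrative counters only, read 60 seconds apart.
    before = {"ipkts": 1000, "opkts": 2000, "collis": 100}
    after = {"ipkts": 25000, "opkts": 22000, "collis": 8500}
    pps, cps = interface_rates(before, after, 60)
    for action in network_findings(pps, cps):
        print(action)
```

A single pair of readings only describes one interval; the table's wording ("if this is a recurring problem") suggests repeating the measurement before acting on it.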