System Administration Guide, Volume II

Overview of System Performance

69

Getting good performance from a computer or network is an important part of system administration. This chapter is an overview of some of the factors that contribute to maintaining and managing the performance of the computer systems in your care.
This is a list of the overview information in this chapter.
System Performance and System Resources
Processes and System Performance
Disk I/O and System Performance
Memory and System Performance
Kernel Parameters and System Performance
About Monitoring Performance

System Performance and System Resources

The performance of a computer system depends upon how the system uses and allocates its resources. It is important to monitor your system's performance on a regular basis so that you know how it behaves under normal conditions. You should have a good idea of what to expect, and be able to recognize a problem when it occurs.
System resources that affect performance include:
  • Central processing unit (CPU) - The CPU processes instructions, fetching instructions from memory and executing them.
  • Input/output (I/O) devices - I/O devices transfer information into and out of the computer. Such a device could be a terminal and keyboard, a disk drive, or a printer.
  • Memory - Physical (or main) memory is the amount of memory (RAM) on the system.
Chapter 71, "Monitoring Performance," describes the tools that display statistics about the activity and the performance of the computer system.

Other Sources of Information

Performance is a broad subject that can't be adequately covered in these chapters. There are several books available that cover various aspects of improving performance and tuning your system or network. Three useful books are:
  • Sun Performance and Tuning: SPARC and Solaris, by Adrian Cockcroft, SunSoft Press/PTR Prentice Hall, ISBN 0-13-149642-3
  • System Performance Tuning, by Mike Loukides, O'Reilly & Associates, Inc.
  • Managing NFS and NIS, by Hal Stern, O'Reilly & Associates, Inc.

Processes and System Performance

Terms related to processes are described in Table 69-1.
Table 69-1
Term                        Description
process                     An instance of a program in execution.
lightweight process (LWP)   A virtual CPU or execution resource. LWPs are scheduled by the kernel to use available CPU resources based on their scheduling class and priority. Each LWP includes a kernel thread, which contains information that has to be in memory all the time, and an LWP structure, which contains information that is swappable.
application thread          A series of instructions with a separate stack that can execute independently in a user's address space. Application threads can be multiplexed on top of LWPs.
A process can consist of multiple LWPs and multiple application threads. The kernel schedules a kernel-thread structure, which is the scheduling entity in the SunOS 5.x environment. Various process structures are described in Table 69-2.
Table 69-2
Structure   Description
proc        Contains information that pertains to the whole process and has to be in main memory all the time.
kthread     Contains information that pertains to one LWP and has to be in main memory all the time.
user        Contains the per-process information that is swappable.
klwp        Contains the per-LWP process information that is swappable.
Figure 69-1 illustrates the relationship of these structures.

Figure 69-1

Most process resources are accessible to all the threads in the process. Almost all process virtual memory is shared. A change in shared data by one thread is available to the other threads in the process.

Process Commands

The ps command enables you to check the status of active processes on a system, as well as display technical information about the processes. This data is useful for such administrative tasks as determining how to set process priorities, and how to kill processes that have hung or become inactive. See Chapter 70, "Managing Processes," for more information about using the ps command and its options.
In addition, process tools are available in /usr/proc/bin that display highly detailed information about the processes listed in /proc, also known as the process file system (PROCFS). Images of active processes are stored here by their process ID number.
The process tools are similar to some options of the ps command, except that the output provided by the tools is more detailed. In general, the process tools:
  • Display more details about processes, such as fstat and fcntl information, working directories, and trees of parent and child processes.
  • Provide control over processes, allowing users to stop or resume them.
The new /usr/proc/bin utilities are summarized in Table 69-3.
Table 69-3
Tools That Control Processes         What the Tools Do
/usr/proc/bin/pstop pid              Stops the process
/usr/proc/bin/prun pid               Restarts the process
/usr/proc/bin/ptime pid              Times the process using microstate accounting
/usr/proc/bin/pwait [-v] pid         Waits for specified processes to terminate

Tools That Display Process Details   What the Tools Display
/usr/proc/bin/pcred pid              Credentials
/usr/proc/bin/pfiles pid             fstat and fcntl information for open files
/usr/proc/bin/pflags pid             /proc tracing flags, pending and held signals, and other status information for each lwp
/usr/proc/bin/pldd pid               Dynamic libraries linked into each process
/usr/proc/bin/pmap pid               Address space map
/usr/proc/bin/psig pid               Signal actions
/usr/proc/bin/pstack pid             Hex+symbolic stack trace for each lwp
/usr/proc/bin/ptree pid              Process trees containing specified pids
/usr/proc/bin/pwdx pid               Current working directory
In these commands, pid is a process identification number. You can obtain this number by using the ps -ef command.
Chapter 70, "Managing Processes," describes how to use the process tool commands to perform selected system administration tasks, such as displaying details about processes, and starting and stopping them. A more detailed description of the process tools can be found in the proc(1) man pages.
If a process becomes trapped in an endless loop, or if it takes too long to execute, you may want to stop (kill) the process. See Chapter 70, "Managing Processes," for more information about stopping processes using the kill command.

Process Priority Levels

A process is allocated CPU time according to its scheduling class and its priority level. By default, the Solaris operating system has four process scheduling classes: real-time, system, timesharing and interactive.
  • Real-time processes have the highest priority. This class includes processes that must respond to external events as they happen. For example, a process that collects data from a sensing device may need to process the data and respond immediately. In most cases, a real-time process requires a dedicated system. No other processes can be serviced while a real-time process has control of the CPU. By default, the range of priorities is 100-159.
  • System processes have the middle priorities. This class is made up of those processes that are automatically run by the kernel, such as the swapper and the paging daemon. By default, the range of priorities is 60-99.
  • Timesharing processes have the lowest priority. This class includes the standard UNIX processes. Normally, all user processes are timesharing processes. They are subject to a scheduling policy that attempts to distribute processing time fairly, giving interactive applications quick response time and maintaining good throughput for computations. By default, the range of priorities is 0-59.
  • Interactive processes were introduced in the SunOS 5.4 environment. Their priorities range from 0 to 59. All processes started under OpenWindows are placed in the interactive class, and those processes with keyboard focus get higher priorities.
The scheduling priority determines the order in which processes will be run.
Real-time processes have fixed priorities. If a real-time process is ready to run, no system process or timesharing process can run.
System processes have fixed priorities that are established by the kernel when they are started. The processes in the system class are controlled by the kernel, and cannot be changed.
Timesharing and interactive processes are controlled by the scheduler, which dynamically assigns their priorities. You can manipulate the priorities of the processes within these classes.
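The default ranges above can be summarized in a small lookup. The following is only an illustrative sketch: the function name class_for_priority is hypothetical, and it simply restates the default priority ranges listed in this section (timesharing and interactive share 0-59, so a priority alone cannot tell them apart).

```shell
# Hypothetical helper: report which default scheduling class range
# contains a given global priority (ranges taken from the list above).
class_for_priority() {
    pri=$1
    if [ "$pri" -ge 100 ] && [ "$pri" -le 159 ]; then
        echo "real-time"
    elif [ "$pri" -ge 60 ] && [ "$pri" -le 99 ]; then
        echo "system"
    elif [ "$pri" -ge 0 ] && [ "$pri" -le 59 ]; then
        echo "timesharing or interactive"
    else
        echo "outside the default ranges"
    fi
}

class_for_priority 110
class_for_priority 45
```

On a real system, the ranges in effect may differ from these defaults, so treat the output as a reading aid rather than a statement about a particular configuration.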

Changing the Scheduling Priority of Processes With priocntl

The scheduling priority of a process is the priority it is assigned by the process scheduler. These priorities are assigned according to the scheduling policies of the scheduler. The dispadmin command lists the default scheduling policies. See "Scheduler Configuration" on page 1461 for information on the dispadmin command.
The priocntl command can be used to assign processes to a priority class and to manage process priorities. See the section called "How to Designate Priority" on page 1401 for instructions on using the priocntl command to manage processes.

Changing the Priority of a Timesharing Process With nice

The nice command is supported only for backward compatibility with previous Solaris releases. The priocntl command provides more flexibility in managing processes.
The priority of a process is determined by the policies of its scheduling class, and by its nice number. Each timesharing process has a global priority which is calculated by adding the user-supplied priority, which can be influenced by the nice or priocntl commands, and the system-calculated priority.
The execution priority number of a process is assigned by the operating system, and is determined by several factors, including its scheduling class, how much CPU time it has used, and (in the case of a timesharing process) its nice number.
Each timesharing process starts with a default nice number, which it inherits from its parent process. The nice number is shown in the NI column of the ps report.
A user can lower the priority of a process by increasing its nice number. However, only the system administrator (or root) can lower a nice number to increase the priority of a process. This restriction prevents users from increasing the priorities of their own processes and thereby monopolizing a greater share of the CPU.
Nice numbers range between 0 and +40, with 0 representing the highest priority. The default value is 20. Two versions of the command are available: the standard version, /usr/bin/nice, and a version that is part of the C shell.
See "How to Change the Priority of a Process" on page 1404 for information about using the nice command.

Process Troubleshooting

Here are some tips on obvious problems you may find:
  • Look for several identical jobs owned by the same user. This may come as a result of running a script that starts a lot of background jobs without waiting for any of the jobs to finish.
  • Look for a process that has accumulated a large amount of CPU time. You'll see this by looking at the TIME field. Possibly, the process is in an endless loop.
  • Look for a process running with a priority that is too high. Type ps -c to see the CLS field, which displays the scheduling class of each process. A process executing as a real-time (RT) process can monopolize the CPU. Also look for a timesharing (TS) process with a low nice number, which gives it a high priority. A user with root privileges may have raised the priority of this process. The system administrator can lower the priority by using the nice command.
  • Look for a runaway process, one that progressively uses more and more CPU time. You can see this happening by looking at the time when the process started (STIME) and by watching the accumulation of CPU time (TIME) for a while.
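As a rough sketch of the scheduling-class check described above, the following filter picks out the processes of one class from ps output that includes a CLS column. The function name list_class and the sample process listing are hypothetical; the only assumption about the ps output is that its first line is a header containing the word CLS.

```shell
# Hypothetical helper: print the lines of a "ps -c"-style listing whose
# CLS column matches the class given as the first argument (for example RT).
list_class() {
    awk -v cls="$1" '
        # Find the position of the CLS column from the header line.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "CLS") col = i; next }
        # Print every process line whose class matches.
        col && $col == cls { print }
    '
}

# Illustrative data only; not taken from a real system.
list_class RT <<'EOF'
   PID  CLS PRI TTY      TIME CMD
     0  SYS  96 ?        0:00 sched
   314   TS  58 pts/1    0:01 csh
   402   RT 100 pts/1    4:17 sensor_d
EOF
```

On a live system the same filter could be fed from the command itself, for example ps -ec | list_class RT.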

Disk I/O and System Performance

The disk is used to store data and instructions used by your computer system. You can examine how efficiently the system is accessing data on the disk by looking at the disk access activity and terminal activity. See "Monitoring Performance" on page 1405 for a discussion of the iostat and sar commands, which report statistics on disk activity. Managing and allocating disk space and dividing your disk into slices are discussed in "Managing Disks" in System Administration Guide, Volume I.
If the CPU spends much of its time waiting for I/O completions, there is a problem with disk slowdown. Some ways to prevent disk slowdowns are:
  • Keep at least 10% of disk space free so that file systems do not fill up. If a disk becomes full, back up and restore the file systems to prevent disk fragmentation. Consider purchasing products that resolve disk fragmentation.
  • Organize the file system to minimize disk activity. If you have two disks, distribute the file system for a more balanced load. Using Sun's Solstice DiskSuite(TM) product provides more efficient disk usage.
  • Add more memory. Additional memory reduces swapping and paging traffic, and allows an expanded buffer pool (reducing the number of user-level reads and writes that need to go out to disk).
  • Add a disk and balance the most active file systems across the disks.

Memory and System Performance

Performance suffers when the programs running on the system require more physical memory than is available. When this happens, the operating system begins paging and swapping, which is costly in both disk and CPU overhead.
Paging involves moving pages that have not been recently referenced to a free list of available memory pages. Most of the kernel resides in main memory and is not pageable.
Swapping occurs if the page daemon cannot keep up with the demand for memory. The swapper will attempt to swap out sleeping or stopped lightweight processes (LWPs). If there are no sleeping or stopped LWPs, the swapper will swap out a runnable process. The swapper will swap LWPs back in based on their priority. It will attempt to swap in processes that are runnable.

Swap Space

Swap areas are really file systems used for swapping. Swap areas should be sized based on the requirements of your applications. Check with your vendor to identify application requirements.
Table 69-4 describes the formula used to size default swap areas by the Solaris 2.x installation program. These default swap sizes are a good place to start if you are not sure how to size your swap areas.
Table 69-4
If Your Physical Memory Size Is ...   Your Default Swap Size Is ...
16-64 Mbytes                          32 Mbytes
64-128 Mbytes                         64 Mbytes
128-512 Mbytes                        128 Mbytes
greater than 512 Mbytes               256 Mbytes
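The table can be read as a simple lookup. The sketch below is illustrative only: the function name default_swap_mb is hypothetical, and because the table's ranges share their end points (64 and 128 each appear twice), it assumes that a boundary value belongs to the lower range, which is consistent with the last row starting at "greater than 512".

```shell
# Hypothetical helper: default swap size in Mbytes for a given amount of
# physical memory in Mbytes, following Table 69-4.
# Assumption: shared end points (64, 128, 512) fall in the lower range.
default_swap_mb() {
    mem=$1
    if [ "$mem" -lt 16 ]; then
        echo "not listed"      # the table starts at 16 Mbytes
    elif [ "$mem" -le 64 ]; then
        echo 32
    elif [ "$mem" -le 128 ]; then
        echo 64
    elif [ "$mem" -le 512 ]; then
        echo 128
    else
        echo 256
    fi
}

default_swap_mb 96
```

These values are only the installation defaults; application requirements should still drive the final sizing.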
See the "Managing File Systems" section of System Administration Guide, Volume I for information about managing swap space.

Buffer Resources

The buffer cache for read and write system calls uses a range of virtual addresses in the kernel address space. A page of data is mapped into the kernel address space and the amount of data requested by the process is then physically copied to the process' address space. The page is then unmapped in the kernel. The physical page will remain in memory until the page is freed up by the page daemon.
This means a few I/O-intensive processes can monopolize or force other processes out of main memory. To prevent monopolization of main memory, balance the running of I/O-intensive processes serially in a script or with the at(1) command. Programmers can use mmap(2) and madvise(3) to ensure that their programs free memory when they are not using it.

Kernel Parameters and System Performance

Many basic parameters (or tables) within the kernel are calculated from the value of the maxusers parameter. Tables are allocated space dynamically. However, you can set maximums for these tables to ensure that applications won't take up large amounts of memory.
By default, maxusers is set to approximately the number of Mbytes of physical memory on the system. However, the system will never set maxusers higher than 1024. The maximum value of maxusers is 2048, which can be set by modifying the /etc/system file.
See Chapter 73, "Tuning Kernel Parameters," and the system(4) man page for details on kernel parameters.
In addition to maxusers, a number of kernel parameters are allocated dynamically based on the amount of physical memory on the system, as shown in Table 69-5 below.
Table 69-5
Kernel Parameter   Description
ufs_ninode         The maximum size of the inode table
ncsize             The size of the directory name lookup cache
max_nprocs         The maximum size of the process table
ndquot             The number of disk quota structures
maxuprc            The maximum number of user processes per user-id
Table 69-6 lists the default settings for kernel parameters affected by the value assigned to maxusers.
Table 69-6
Kernel Table    Variable      Default Setting
Inode           ufs_ninode    max_nprocs + 16 + maxusers + 64
Name cache      ncsize        max_nprocs + 16 + maxusers + 64
Process         max_nprocs    10 + 16 * maxusers
Quota table     ndquot        (maxusers * NMOUNT) / 4 + max_nprocs
User process    maxuprc       max_nprocs - 5
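To see how these defaults scale, the formulas in Table 69-6 can be evaluated for a given maxusers value. The following sketch is an illustration, not a tuning tool: the function name kernel_defaults is hypothetical, it needs a shell with POSIX arithmetic expansion, and because this chapter does not define NMOUNT, the value is passed in as a second argument (40 in the example is an arbitrary placeholder).

```shell
# Hypothetical helper: evaluate the Table 69-6 default formulas.
#   $1 = maxusers
#   $2 = NMOUNT (not defined in this chapter; caller must supply a value)
kernel_defaults() {
    maxusers=$1
    nmount=$2
    max_nprocs=$((10 + 16 * maxusers))
    ufs_ninode=$((max_nprocs + 16 + maxusers + 64))
    ncsize=$((max_nprocs + 16 + maxusers + 64))
    ndquot=$(( (maxusers * nmount) / 4 + max_nprocs ))
    maxuprc=$((max_nprocs - 5))
    echo "max_nprocs=$max_nprocs"
    echo "ufs_ninode=$ufs_ninode"
    echo "ncsize=$ncsize"
    echo "ndquot=$ndquot"
    echo "maxuprc=$maxuprc"
}

# Example: maxusers of 64 with a placeholder NMOUNT of 40.
kernel_defaults 64 40
```

For maxusers of 64, the process table default works out to 10 + 16 * 64 = 1034, and the other values follow from it.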
See Chapter 73, "Tuning Kernel Parameters," for a description of the kernel parameters and how to change the default values.

About Monitoring Performance

While your computer is running, counters in the operating system are incremented to keep track of various system activities. System activities that are tracked are:
  • Central processing unit (CPU) utilization
  • Buffer usage
  • Disk and tape input/output (I/O) activity
  • Terminal device activity
  • System call activity
  • Context switching
  • File access
  • Queue activity
  • Kernel tables
  • Interprocess communication
  • Paging
  • Free memory and swap space
  • Kernel Memory Allocation (KMA)
The following sections describe tools and commands that help you monitor performance.

The sar Command

Use the sar command to:
  • Organize and view data about system activity
  • Access system activity data on a special request basis
  • Generate automatic reports to measure and monitor system performance, and special request reports to pinpoint specific performance problems. "Automatic Collection of System Activity Data" on page 1384 describes these tools.

The vmstat Command

The vmstat command reports virtual memory statistics and shows CPU load, paging, number of context switches, device interrupts, and system calls.
The following example shows the vmstat display of statistics gathered at five-second intervals.

  $ vmstat 5  
   procs     memory              page               disk       faults     cpu  
   r b w  swap  free  re  mf  pi  po  fr  de sr f0 s3 -- --  in  sy  cs us sy id  
   0 0 8 28312   668   0   9   2   0   1   0  0  0  1  0  0  10  61  82  1  2 97  
   0 0 3 31940   248   0  10  20   0  26   0 27  0  4  0  0  53 189 191  6  6 88  
   0 0 3 32080   288   3  19  49   6  26   0 15  0  9  0  0  75 415 277  6 15 79  
   0 0 3 32080   256   0  26  20   6  21   0 12  1  6  0  0 163 110 138  1  3 96  
   0 1 3 32060   256   3  45  52  28  61   0 27  5 12  0  0 195 191 223  7 11 82  
   0 0 3 32056   260   0   1   0   0   0   0  0  0  0  0  0   4  52  84  0  1 99  

The fields in the vmstat report have the following meanings:
procs reports the following states:
  • r - The number of kernel threads in the dispatch queue
  • b - Blocked kernel threads waiting for resources
  • w - Swapped out LWPs waiting for processing resources to finish
memory reports on usage of real and virtual memory:
  • swap - Available swap space
  • free - Size of the free list
page reports on page faults and paging activity, in units per second:
  • re - Pages reclaimed
  • mf - Minor and major faults
  • pi - Kbytes paged in
  • po - Kbytes paged out
  • fr - Kbytes freed
  • de - Anticipated memory needed by recently swapped-in processes
  • sr - Pages scanned by page daemon (not currently used)
If sr does not equal zero, the page daemon has been running.
disk reports the number of disk operations per second. This field can show data on up to four disks.
faults reports the trap/interrupt rates (per second):
  • in - Interrupts per second
  • sy - System calls per second
  • cs - CPU context switch rate
cpu reports on the use of CPU time:
  • us - User time
  • sy - System time
  • id - Idle time
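Because a nonzero sr value indicates that the page daemon has been running, it can be handy to count how many sample intervals show scanning. The following sketch assumes the column layout of the vmstat example above, where sr is the twelfth field; the function name count_scan_intervals is hypothetical, and the layout may differ on other releases or with a different number of disks.

```shell
# Hypothetical helper: count vmstat sample lines whose sr column
# (assumed to be field 12, as in the example above) is greater than zero.
# Header lines are skipped because their first field is not a number.
count_scan_intervals() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 12 && $12 > 0 { n++ } END { print n + 0 }'
}

# The six sample lines from the vmstat example above; prints 4.
count_scan_intervals <<'EOF'
 procs     memory              page               disk       faults     cpu
 r b w  swap  free  re  mf  pi  po  fr  de sr f0 s3 -- --  in  sy  cs us sy id
 0 0 8 28312   668   0   9   2   0   1   0  0  0  1  0  0  10  61  82  1  2 97
 0 0 3 31940   248   0  10  20   0  26   0 27  0  4  0  0  53 189 191  6  6 88
 0 0 3 32080   288   3  19  49   6  26   0 15  0  9  0  0  75 415 277  6 15 79
 0 0 3 32080   256   0  26  20   6  21   0 12  1  6  0  0 163 110 138  1  3 96
 0 1 3 32060   256   3  45  52  28  61   0 27  5 12  0  0 195 191 223  7 11 82
 0 0 3 32056   260   0   1   0   0   0   0  0  0  0  0  0   4  52  84  0  1 99
EOF
```

On a live system the filter could be fed directly, for example vmstat 5 6 | count_scan_intervals.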
The vmstat command can also display statistics on swapping, cache flushing, and interrupts.
System Events - Run vmstat -s to show the total of various system events that have taken place since the system was last booted.
Swapping - Run vmstat -S to show swapping statistics in addition to paging statistics. The additional fields are:
  • si - Average number of LWPs swapped in per second
  • so - Number of whole processes swapped out

Note - The vmstat command truncates the output of both of these fields. Use the sar command to display a more accurate accounting of swap statistics.

Cache Flushing - Run vmstat -c to show cache flushing statistics for a virtual cache. It shows the total number of cache flushes since the last boot. The cache types are:
  • usr - User
  • ctx - Context
  • rgn - Region
  • seg - Segment
  • pag - Page
  • par - Partial-page
Interrupts - Run vmstat -i to show interrupts per device.

  $ vmstat -i  
  interrupt         total     rate  
  ---------------------------------  
  clock         104638405      100  
  esp0            2895003        2  
  fdc0                  0        0  
  ---------------------------------  
  Total         107533408      102  

The iostat Command

The iostat command reports statistics about disk input and output, and produces measures of throughput, utilization, queue lengths, transaction rates, and service time.
The following example shows disk statistics gathered every five seconds.

  $ iostat 5  
        tty          fd0           sd3       cpu  
   tin tout bps tps serv  bps tps serv  us sy wt  id  
     0    1   0   0    0    1   0 5640   0  1  0  98  
     0   10   0   0    0    0   0    0   0  1  0  99  
     0   10   0   0    0    0   0    0   0  1  0  99  
     0   10   0   0    0   27   3  319   0  4  9  88  
     0   10   0   0    0    2   0 5061   0  0  0  99  
     0   10   0   0    0    0   0    0   0  0  0 100  
     0   10   0   0    0    0   0    0   0  0  0 100  
     0   10   0   0    0    0   0    0   0  0  0 100  
     0   10   0   0    0    0   0    0   0  0  0 100  

The first line of output shows the statistics since the last boot. Each subsequent line shows the interval statistics. The default is to show statistics for the terminal (tty), disks (fd and sd), and CPU (cpu).
For each terminal, iostat displays:
  • tin - Number of characters in the terminal input queue
  • tout - Number of characters in the terminal output queue
For each disk, iostat displays the following information:
  • bps - Blocks per second
  • tps - Transactions per second
  • serv - Average service time, in milliseconds
For the CPU, iostat displays the CPU time spent in the following modes:
  • us - In user mode
  • sy - In system mode
  • wt - Waiting for I/O
  • id - Idle
Run iostat -xtc to get extended disk statistics.

  $ iostat -xtc  
  
  disk    r/s w/s    Kr/s Kw/s   wait actv   svc_t     %w %b   tin tout us sy  wt  id  
  sd0     0.2 1.7     1.0  9.7   0.0  0.1     39.8     0  3      0   9   1  6  9   85  
  sd1     0.5 2.5    10.6 21.0   0.0  0.1     26.6     0  5  
  sd2     0.0 0.2     0.1  0.0   0.0  0.0    157.7     0  0  

Each disk has a line of output:
  • r/s - Reads per second
  • w/s - Writes per second
  • Kr/s - Kbytes read per second
  • Kw/s - Kbytes written per second
  • wait - Average number of transactions waiting for service (queue length)
  • actv - Average number of transactions actively being serviced
  • svc_t - Average service time, in milliseconds
  • %w - Percentage of time the queue is not empty
  • %b - Percentage of time the disk is busy
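One way to use the extended statistics is to pick out disks whose average service time looks high. The sketch below assumes the column layout of the iostat -xtc example above, where svc_t is the eighth field; the function name slow_disks is hypothetical, and the 100-millisecond limit in the example is an arbitrary illustration, not a recommended threshold.

```shell
# Hypothetical helper: print the name and svc_t of each disk whose average
# service time (assumed to be field 8) exceeds the limit given in $1.
# Lines whose eighth field is not a number (such as the header) are skipped.
slow_disks() {
    awk -v limit="$1" 'NF >= 10 && $8 ~ /^[0-9.]+$/ && $8 + 0 > limit { print $1, $8 }'
}

# The sample output from the iostat -xtc example above; prints "sd2 157.7".
slow_disks 100 <<'EOF'
disk    r/s w/s    Kr/s Kw/s   wait actv   svc_t     %w %b   tin tout us sy  wt  id
sd0     0.2 1.7     1.0  9.7   0.0  0.1     39.8     0  3      0   9   1  6  9   85
sd1     0.5 2.5    10.6 21.0   0.0  0.1     26.6     0  5
sd2     0.0 0.2     0.1  0.0   0.0  0.0    157.7     0  0
EOF
```

A high svc_t on a nearly idle disk, such as sd2 here with 0 in the %b column, is less significant than the same value on a busy disk, so the two columns are best read together.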

The df Command

The df command shows the amount of free disk space on each mounted disk. The usable disk space reported by df reflects only 90% of full capacity, as the reporting statistics leave a 10% head room above the total available space. This head room normally stays empty for better performance. The percentage of disk space actually reported by df is used space divided by usable space. If the file system is above 90% capacity, transfer files to a disk that is not as full by using cp, or to a tape by using tar or cpio; or remove the files.
Use the df -k command to display file system information in Kbytes. The following information is given:
  • kbytes - Total size of usable space in the file system
  • used - Amount of space used
  • avail - Amount of space available for use
  • capacity - Amount of space used, as a percent of the total capacity
  • mounted on - Mount point

  $  df -k  
  filesystem            kbytes     used     avail     capacity       mounted on  
  /dev/dsk/c0t3d0s0       17269    11099     4450          71%       /  
  /dev/dsk/c0t3d0s6      136045    79818    42627          65%       /usr  
  /proc                       0        0        0           0        /proc  
  swap                    40424        0    40416           0        /tmp  
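The 90% guideline above lends itself to a quick filter over df -k output. The following is a minimal sketch, assuming the six-column layout shown in the example (capacity in the fifth field, mount point in the sixth); the function name full_filesystems is hypothetical, and entries that df wraps onto two lines because of a long device name would not be handled.

```shell
# Hypothetical helper: list mount points whose capacity exceeds a limit
# (default 90, matching the guideline above). Reads df -k output on stdin.
full_filesystems() {
    awk -v limit="${1:-90}" '
        NR > 1 && NF >= 6 {
            pct = $5
            sub(/%/, "", pct)          # strip the percent sign, if any
            if (pct + 0 > limit) print $6, $5
        }
    '
}

# With the sample output above and a limit of 70, this prints "/ 71%".
full_filesystems 70 <<'EOF'
filesystem            kbytes     used     avail     capacity       mounted on
/dev/dsk/c0t3d0s0       17269    11099     4450          71%       /
/dev/dsk/c0t3d0s6      136045    79818    42627          65%       /usr
/proc                       0        0        0           0        /proc
swap                    40424        0    40416           0        /tmp
EOF
```

On a live system, df -k | full_filesystems would apply the default 90% limit.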

The profil Command

profil uses CPU statistics to show the amount of time that a program uses. You can analyze a program and identify the functions that consume a high percentage of CPU time. See the man page for profil(2) for more information.

Automatic Collection of System Activity Data

Three commands are involved in automatic system activity data collection: sadc, sa1, and sa2.
The sadc data collection utility periodically collects data on system activity and saves it in a file in binary format--one file for each 24-hour period. You can set up sadc to run periodically (usually once each hour), and whenever the system boots to multiuser mode. The data files are placed in the directory /usr/adm/sa. Each file is named sadd, where dd is the current date. The format of the command is as follows:
/usr/lib/sa/sadc [t n] [ofile]

The command samples n times with an interval of t seconds (t should be greater than 5 seconds) between samples. It then writes, in binary format, to the file ofile, or to standard output. If t and n are omitted, a special record is written once.

Running sadc When Booting

The sadc command should be run at system boot time in order to record the statistics from when the counters are reset to zero. To make sure that sadc is run at boot time, the /etc/init.d/perf file must contain a command line that writes a record to the daily data file.
The command entry has the following format:
su sys -c "/usr/lib/sa/sadc /usr/adm/sa/sa`date +%d`"
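The daily data file name is built from the day of the month, so each day's records land in a file such as sa01 through sa31. As a small sketch of that naming scheme, the hypothetical function below prints the name of today's file; it assumes the /var/adm/sa directory used elsewhere in this chapter.

```shell
# Hypothetical helper: print the path of today's daily activity file.
# "date +%d" yields the two-digit day of the month (01-31).
sa_daily_file() {
    echo "/var/adm/sa/sa$(date +%d)"
}

sa_daily_file
```

Because the name depends only on the day of the month, a file is reused when the same day number comes around again in a later month.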

Running sadc Periodically With sa1

To generate periodic records, you need to run sadc regularly. The simplest way to do this is to put a line into the /var/spool/cron/crontabs/sys file that calls the shell script sa1. This script invokes sadc and writes to the daily data files, /var/adm/sa/sadd. It has the following format:
/usr/lib/sa/sa1 [t n]

The arguments t and n cause records to be written n times at an interval of t seconds. If these arguments are omitted, the records are written only one time.

Producing Reports With sa2

There is another shell script, sa2, which produces reports rather than binary data files. The sa2 command invokes the sar command and writes the ASCII output to a report file.

Collecting System Activity Data With sar

The sar command can be used either to gather system activity data itself or to report what has been collected in the daily activity files created by sadc.
The sar command has the following formats:
sar [-aAbcdgkmpqruvwy] [-o file] t [n]
sar [-aAbcdgkmpqruvwy] [-s time] [-e time] [-i sec] [-f file]

In the first format, the sar command samples cumulative activity counters in the operating system every t seconds, n times. (t should be 5 seconds or greater; otherwise, the command itself may affect the sample.) You must specify the sampling interval t; otherwise, the command operates according to the second format. The default value of n is 1. If the -o option is specified, samples are saved in file in binary format. The following example takes two samples separated by 10 seconds.

  $ sar -u 10 2  

Other important information about the sar command:
  • With no sampling interval or number of samples specified, sar extracts data from a previously recorded file, either the one specified by the -f option or, by default, the standard daily activity file, /var/adm/sa/sadd, for the most recent day.
  • The -s and -e options define the starting and ending times for the report. Starting and ending times are of the form hh[:mm[:ss]] (where h, m, and s represent hours, minutes, and seconds).
  • The -i option specifies, in seconds, the intervals between record selection. If the -i option is not included, all intervals found in the daily activity file are reported.
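To make the second format concrete, the sketch below assembles a sar command line that reports CPU utilization from one day's activity file between a starting and an ending time. The function name sar_day_cmd is hypothetical, and it only prints the command rather than running it, so the result can be inspected first; the options used (-u, -s, -e, -f) are the ones described above.

```shell
# Hypothetical helper: print (not run) a sar command for a given day of
# the month and time window, using the daily file /var/adm/sa/sadd.
#   $1 = two-digit day of the month, $2 = start time, $3 = end time
sar_day_cmd() {
    day=$1
    start=$2
    end=$3
    printf 'sar -u -s %s -e %s -f /var/adm/sa/sa%s\n' "$start" "$end" "$day"
}

# Prints: sar -u -s 08:00 -e 17:30 -f /var/adm/sa/sa15
sar_day_cmd 15 08:00 17:30
```

The printed line can be pasted at a shell prompt once the file for that day exists.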
Table 69-7 lists the sar options and their actions.
Table 69-7 sar Options
Option    Actions
-a        Checks file access operations
-b        Checks buffer activity
-c        Checks system calls
-d        Checks activity for each block device
-g        Checks page-out and memory freeing
-k        Checks kernel memory allocation
-m        Checks interprocess communication
-p        Checks swap and dispatch activity
-q        Checks queue activity
-r        Checks unused memory
-u        Checks CPU utilization
-v        Checks system table status
-w        Checks swapping and switching volume
-y        Checks terminal activity
-A        Reports overall system performance (same as entering all options)
If no option is used, it is equivalent to calling the command with the -u option.

Monitoring Tools

The Solaris 2.x system software provides several tools to help you keep track of how your system is performing. These include:
  • The sar and sadc utilities, which collect and report on many aspects of system activity. Chapter 71, "Monitoring Performance," describes these utilities and the information that they provide.
  • The ps command, which provides information about the active processes. Chapter 70, "Managing Processes," describes the ps command.
  • The performance meter, which provides a graphical representation of the status of your system and other hosts on the network. Chapter 71, "Monitoring Performance," describes the performance meter.
  • The vmstat and iostat commands, which summarize system activity, providing information about virtual memory activity, disk usage, and CPU activity. Chapter 71, "Monitoring Performance," describes these tools.
  • The swap command, which can be used to display information about available swap space on your system. See the "Managing File Systems" section in System Administration Guide, Volume I for information on using the swap command.
  • The netstat and nfsstat commands, which display information about network performance. Chapter 72, "Monitoring Network Performance," describes these commands.