UFS
bufhwm
- Description
-
Maximum amount of memory
for caching I/O buffers. The buffers are used for writing file system metadata
(superblocks, inodes, indirect blocks, and directories). Buffers are allocated
as needed until the amount to be allocated would exceed bufhwm.
At this point, enough buffers are reclaimed to satisfy the request.
For historical reasons, this parameter does not require the ufs: prefix.
- Data Type
-
Signed integer
- Default
-
2% of physical memory
- Range
-
80 Kbytes to 20% of physical
memory
- Units
-
Kbytes
- Dynamic?
-
No. Value is used to compute
hash bucket sizes and is then stored into a data structure that adjusts the
value in the field as buffers are allocated and deallocated. Attempting to
adjust this value without following the locking protocol on a running system
can lead to incorrect operation.
- Validation
-
If bufhwm
is less than 80 Kbytes or greater than the lesser of 20% of physical memory
or twice the current amount of kernel heap, it is reset to the lesser of 20%
of physical memory or twice the current amount of kernel heap. The following
message appears on the system console and in the /var/adm/messages file.
"binit: bufhwm out of range (value attempted). Using N."
Value attempted refers to the value entered in /etc/system or by using the kadb -d command. N is the value computed by
the system based on available system memory.
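The range check described above can be sketched as follows. This is a minimal illustration in Python, not the kernel's implementation. The function name and its arguments (validate_bufhwm, physmem_kb, kernel_heap_kb) are assumptions introduced for this example, and the message text simply follows the format quoted above.

```python
def validate_bufhwm(requested_kb, physmem_kb, kernel_heap_kb):
    """Return (effective_kb, console_message_or_None) for a requested bufhwm.

    All sizes are in Kbytes. Names are hypothetical, chosen for illustration.
    """
    # Upper bound: the lesser of 20% of physical memory and twice the kernel heap.
    upper_kb = min(physmem_kb * 20 // 100, 2 * kernel_heap_kb)
    if requested_kb < 80 or requested_kb > upper_kb:
        # Out of range: the value is reset to the upper bound and a message is logged.
        msg = f"binit: bufhwm out of range ({requested_kb}). Using {upper_kb}."
        return upper_kb, msg
    return requested_kb, None
```

Under these assumptions, a request below 80 Kbytes is not raised to 80 Kbytes but is reset to the computed upper bound, as the Validation text states.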
- When to Change
-
Since buffers are
only allocated as they are needed, the overhead from the default setting is
the allocation of a number of control structures to handle the maximum possible
number of buffers. These structures consume 52 bytes per potential buffer
on a 32-bit kernel and 104 bytes per potential buffer on a 64-bit
kernel. On a 512 Mbyte 64-bit kernel this consumes 104*10144 bytes,
or about 1 Mbyte. The header allocation assumes buffers are 1 Kbyte in size, although
in most cases, the buffer size is larger.
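The overhead arithmetic above can be reproduced with a short Python sketch. The function name header_overhead_bytes is hypothetical. The per-header sizes and the 10144 Kbyte bufhwm value are the figures quoted in the text, and one header per potential 1-Kbyte buffer is assumed.

```python
# Bytes consumed by one buffer header, keyed by kernel word size (from the text).
BYTES_PER_HEADER = {32: 52, 64: 104}

def header_overhead_bytes(bufhwm_kb, kernel_bits):
    """Control-structure overhead, assuming one header per potential 1-Kbyte buffer."""
    return bufhwm_kb * BYTES_PER_HEADER[kernel_bits]

print(header_overhead_bytes(10144, 64))  # 1054976 bytes, roughly 1 Mbyte
```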
The amount of memory that has not yet been allocated in the buffer pool
can be found by examining the bfreelist structure in the
kernel with a kernel debugger. The field of interest in the structure is bufsize, which is the possible remaining memory in bytes. The following
example examines the structure with the buf macro by using mdb:
# mdb -k
Loading modules: [ unix krtld genunix ip nfs ipc ]
> bfreelist$<buf
bfreelist:
[ elided ]
bfreelist + 0x78: bufsize [ elided ]
75734016
bufhwm on this system, with 6 Gbytes of memory, is
122277. It is not directly possible to determine the number of header structures
used since the actual buffer size requested is usually larger than 1 Kbyte.
However, some space might be profitably reclaimed from control structure allocation
for this system.
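As a rough illustration, the bufsize value shown in the mdb output can be compared with bufhwm to estimate how much of the buffer pool is in use. The sketch below is in Python. The function name is hypothetical, and it assumes that bufsize is the unallocated remainder of bufhwm, as described above.

```python
def buffer_pool_usage(bufhwm_kb, bufsize_bytes):
    """Return (allocated_kb, unallocated_kb) for the buffer pool.

    Assumes bufsize is the memory, in bytes, not yet allocated out of bufhwm.
    """
    unallocated_kb = bufsize_bytes // 1024
    allocated_kb = bufhwm_kb - unallocated_kb
    return allocated_kb, unallocated_kb

# Values from the 6 Gbyte system in the example above.
allocated, unallocated = buffer_pool_usage(122277, 75734016)
print(allocated, unallocated)  # 48318 73959
```

On this reading, well over half of the pool on the 6 Gbyte system is unallocated.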
The same structure on the 512 Mbyte system shows that only 4 Kbytes
of 10144 Kbytes has not been allocated. Examining the biostats kstat
with kstat -n biostats shows that the
system also had a reasonable ratio of buffer_cache_hits to buffer_cache_lookups. This indicates that the default setting
is reasonable for that system.
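The ratio mentioned above can be computed from the two biostats counters. The following Python sketch uses a hypothetical function name and made-up sample counts. It does not parse kstat output, and what counts as a "reasonable" ratio is left to the reader.

```python
def buffer_cache_hit_ratio(buffer_cache_hits, buffer_cache_lookups):
    """Return hits divided by lookups, or None if no lookups have occurred."""
    if buffer_cache_lookups == 0:
        return None
    return buffer_cache_hits / buffer_cache_lookups

# Hypothetical counter values, for illustration only.
ratio = buffer_cache_hit_ratio(9500, 10000)
print(ratio)  # 0.95
```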
- Commitment Level
-
Unstable
ndquot
- Description
-
Number of quota structures
for the UFS file system that should be allocated. Relevant only if quotas
are enabled on one or more UFS file systems. For historical reasons,
this parameter does not require the ufs: prefix.
- Data Type
-
Signed integer
- Default
-
((maxusers
x 40) / 4) + max_nprocs
- Range
-
0 to MAXINT
- Units
-
Quota structures
- Dynamic?
-
No
- Validation
-
None. Excessively large
values hang the system.
- When to Change
-
When the default
number of quota structures is not enough. This situation is indicated by a
message displayed on the console or written in the message log.
- Commitment Level
-
Unstable
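The default formula for ndquot given above can be evaluated with a short Python sketch. The function name and the sample values of maxusers and max_nprocs are hypothetical, and integer division is assumed.

```python
def ndquot_default(maxusers, max_nprocs):
    """Default number of quota structures: ((maxusers x 40) / 4) + max_nprocs."""
    return (maxusers * 40) // 4 + max_nprocs

# Hypothetical system values, for illustration only.
print(ndquot_default(2048, 30000))  # 50480
```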
ufs_ninode
- Description
-
Number of inodes to
be held in memory. Inodes are cached globally (for UFS), not on a per-file
system basis.
The value of ufs_ninode is used to compute two key limits
that affect the handling of inode
caching: a high watermark of ufs_ninode / 2 and a low
watermark of ufs_ninode / 4.
When the system is done with an inode, one of two things can happen:
-
The file referred to by the inode is no longer on the system
so the inode is deleted. After it is deleted, the space goes back into the
inode cache for use by another inode (which is read from disk or created for
a new file).
-
The file still exists but is no longer referenced by a running
process. The inode is then placed on the idle queue. Any referenced pages
are still in memory.
When inodes are idled, the kernel defers the idling process to a later
time. If a file system is a logging file system, the kernel also defers deletion
of inodes. Two kernel threads perform this deferred work. Each thread is responsible for one
of the queues.
When the deferred processing is done, the system drops the inode onto
either a delete or idle queue, each of which has a thread that can run to
process it. When the inode is placed on the queue, the queue occupancy is
checked against the low watermark. If it is in excess of the low watermark,
the thread associated with the queue is awakened. After it is awakened, the
thread runs through the queue and forces any pages associated with the inode
out to disk and frees the inode. The thread stops when it has removed 50%
of the inodes on the queue at the time it was awakened.
A second mechanism is in place if the idle thread is unable to keep
up with the load. When the system needs to find a vnode, it goes through the ufs_vget routine. The first thing vget does is check the length of the idle queue. If the length is
above the high watermark, then it pops two inodes off the idle queue and "idles"
them (flushes pages and frees inodes). It does this before
it gets an inode for its own use.
The system does attempt to optimize by placing inodes with no in-core
pages at the head of the idle list and inodes with pages at the end of the
idle list, but it does no other ordering of the list. Inodes are always removed
from the front of the idle queue.
The only time that inodes are removed from the queues as a whole is
when a sync, unmount, or remount occurs.
For historical reasons, this parameter does not require the ufs: prefix.
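The watermark and idle queue behavior described above can be modeled with a small Python sketch. This is a simplified illustration, not the kernel's code. The class and method names are hypothetical, integer division is assumed for the watermarks, and page flushing and locking are omitted.

```python
from collections import deque

class IdleQueueModel:
    """Toy model of the UFS inode idle queue (names are hypothetical)."""

    def __init__(self, ufs_ninode):
        self.high_watermark = ufs_ninode // 2
        self.low_watermark = ufs_ninode // 4
        self.queue = deque()

    def idle(self, inode, has_pages):
        """Queue an inode; return True if the idle thread would be awakened."""
        if has_pages:
            self.queue.append(inode)      # inodes with pages go to the end
        else:
            self.queue.appendleft(inode)  # inodes without pages go to the head
        return len(self.queue) > self.low_watermark

    def run_idle_thread(self):
        """Free 50% of the inodes that were queued when the thread woke up."""
        count = len(self.queue) // 2
        return [self.queue.popleft() for _ in range(count)]

    def vget(self):
        """Before getting an inode, idle two inodes if above the high watermark."""
        freed = []
        if len(self.queue) > self.high_watermark:
            for _ in range(2):
                freed.append(self.queue.popleft())
        return freed
```

For example, with ufs_ninode set to 16 in this model, the low watermark is 4, so queuing a fifth inode reports that the idle thread would be awakened, and the thread then frees two inodes from the front of the queue.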
- Data Type
-
Signed integer
- Default
-
ncsize
- Range
-
0 to MAXINT
- Units
-
Inodes
- Dynamic?
-
Yes
- Validation
-
If ufs_ninode is less than or equal to zero, the value is set to ncsize.
- When to Change
-
When the default
number of inodes is not enough. If the maxsize reached
field as reported by kstat -n inode_cache is larger than
the maxsize field in the kstat, the value of ufs_ninode might be too small. Excessive inode idling (described previously)
can also be a problem.
This situation can be identified by using kstat -n inode_cache to look at the inode_cache kstat. Thread idles are inodes idled by the background threads, while vget idles are inodes idled by the requesting process before it uses an inode.
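The comparison described above can be expressed as a short Python check. The function name is hypothetical, the sample values are made up, and the dictionary is assumed to hold fields already read from the inode_cache kstat. The sketch does not parse kstat output.

```python
def ufs_ninode_may_be_too_small(inode_cache):
    """Return True if 'maxsize reached' exceeds 'maxsize' in the inode_cache kstat."""
    return inode_cache["maxsize reached"] > inode_cache["maxsize"]

# Hypothetical kstat field values, for illustration only.
sample = {"maxsize": 129797, "maxsize reached": 135000}
print(ufs_ninode_may_be_too_small(sample))  # True
```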
- Commitment Level
-
Unstable
ufs:ufs_WRITES
- Description
-
If ufs_WRITES is non-zero, the number of bytes outstanding for writes on a file
is checked against ufs_HW (described subsequently) to determine whether
the write should be issued or should be deferred until only ufs_LW bytes are outstanding. The total number of bytes outstanding is
tracked on a per-file basis so that if the limit is passed for one file, it
does not affect writes to other files.
- Data Type
-
Signed integer
- Default
-
1 (enabled)
- Range
-
0 (disabled), 1 (enabled)
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When you want UFS
write throttling turned off entirely. If sufficient I/O capacity does not
exist, disabling this parameter can result in long service queues for disks.
- Commitment Level
-
Unstable
ufs:ufs_LW and ufs:ufs_HW
- Description
-
ufs_HW
is the barrier value for the number of bytes outstanding on a single file. If the
number of bytes outstanding is greater than this value and ufs_WRITES is set, then the write is deferred. The write is deferred by putting
the thread issuing the write to sleep on a condition variable.
ufs_LW is the barrier value for the number of bytes outstanding
on a single file below which sleeping threads are awakened.
When a write completes and the number of bytes outstanding is less
than ufs_LW, the condition variable is toggled, which
causes all threads waiting on the variable to awaken and try to issue their
writes.
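The interaction of ufs_WRITES, ufs_HW, and ufs_LW described above can be illustrated with a simplified, single-threaded Python model. The class and method names are hypothetical. A real implementation sleeps on a condition variable, whereas this sketch only reports whether a write would be deferred and whether waiters would be awakened.

```python
UFS_LW = 8 * 1024 * 1024    # default low barrier, in bytes
UFS_HW = 16 * 1024 * 1024   # default high barrier, in bytes

class FileWriteThrottle:
    """Toy per-file write throttle (names are hypothetical)."""

    def __init__(self, ufs_writes=1, ufs_lw=UFS_LW, ufs_hw=UFS_HW):
        self.ufs_writes = ufs_writes
        self.ufs_lw = ufs_lw
        self.ufs_hw = ufs_hw
        self.outstanding = 0  # bytes outstanding for this one file

    def start_write(self, nbytes):
        """Return True if the write is issued, False if it would be deferred."""
        if self.ufs_writes != 0 and self.outstanding > self.ufs_hw:
            return False
        self.outstanding += nbytes
        return True

    def complete_write(self, nbytes):
        """Return True if waiting threads would be awakened."""
        self.outstanding -= nbytes
        return self.outstanding < self.ufs_lw
```

In this model, once more than ufs_HW bytes are outstanding for a file, further writes to that file are deferred until completions bring the outstanding count below ufs_LW. Setting ufs_writes to 0 disables the check entirely.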
- Data Type
-
Signed integer
- Default
-
8 x 1024 x 1024 for ufs_LW and 16 x 1024 x 1024 for ufs_HW
- Range
-
0 to MAXINT
- Units
-
Bytes
- Dynamic?
-
Yes
- Validation
-
None
- Implicit
-
ufs_LW
and ufs_HW have meaning only if ufs_WRITES
is not equal to zero. ufs_HW and ufs_LW
should be changed together to avoid needless churning when processes awake
and find that they either cannot issue a write (when ufs_LW
and ufs_HW are too close) or when they might have waited
longer than necessary (when ufs_LW and ufs_HW are too far apart).
- When to Change
-
Consider changing
these values when file systems consist of striped volumes. The aggregate bandwidth
available can easily exceed the current value of ufs_HW.
Unfortunately, this is not a per-file system setting.
Also consider changing these values when ufs_throttles is a non-trivial number. ufs_throttles can currently be accessed only with a kernel debugger.
- Commitment Level
-
Unstable