Chapter 2 Solaris Kernel Tunable Parameters
This chapter describes most of the Solaris kernel tunable parameters.
Where to Find Tunable Parameter Information
General Kernel and Memory Parameters
This section describes general kernel parameters that are related to
physical memory and stack configuration.
physmem
- Description
-
Modifies the system's configuration of the number of physical
pages of memory after the Solaris OS and firmware are accounted for.
- Data Type
-
Unsigned long
- Default
-
Number of usable pages of physical memory available on the
system, not counting the memory where the core kernel and data are stored
- Range
-
1 to amount of physical memory on system
- Units
-
Pages
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
Whenever you want to test the effect of running the system
with less physical memory. Because this parameter does not take
into account the memory used by the core kernel and data, as well as various
other data structures allocated early in the startup process, the value of physmem should be less than the actual number of pages that represent
the smaller amount of memory.
- Commitment Level
-
Unstable
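Because physmem is expressed in pages, a target memory size must be converted to a page count before it is set. The following Python sketch shows the arithmetic only. The helper name pages_for and the 8-Kbyte page size are assumptions made for illustration; substitute the page size of your system.

```python
# Hypothetical helper (not part of Solaris): convert a target memory
# size in bytes to a page count suitable as a starting point for physmem.
def pages_for(target_bytes, page_size=8192):
    # physmem is expressed in whole pages, so round down.
    return target_bytes // page_size

# Example: roughly 4 Gbytes of memory with an assumed 8-Kbyte page size.
print(pages_for(4 * 1024**3))  # 524288
```

Treat the result as an upper bound. As described in When to Change, the value of physmem should be lower than the page count that represents the smaller amount of memory.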
zfs_arc_min
- Description
-
Determines the minimum size of the ZFS Adaptive Replacement
Cache (ARC). See also zfs_arc_max.
- Data Type
-
Unsigned Integer (64-bit)
- Default
-
1/32nd of physical memory or 64 Mbytes, whichever value is
larger.
- Range
-
64 Mbytes to zfs_arc_max
- Units
-
Bytes
- Dynamic?
-
No
- Validation
-
Yes, the range is validated.
- When to Change
-
When a system's workload demand for memory fluctuates, the
ZFS ARC caches data during periods of weak demand and then shrinks during periods
of strong demand. However, ZFS does not shrink the ARC below the value of zfs_arc_min. The default value of zfs_arc_min is 12% of
memory on large memory systems and so can be a significant amount of memory.
If a workload's highest memory usage requires more than 88% of system memory,
consider tuning this parameter.
- Commitment Level
-
Unstable
- Change History
-
For information, see zfs_arc_min (Solaris 10 Releases).
zfs_arc_max
- Description
-
Determines the maximum size of the ZFS Adaptive Replacement
Cache (ARC). See also zfs_arc_min.
- Data Type
-
Unsigned Integer (64-bit)
- Default
-
-
Three-fourths of memory on systems with less than 4 Gbytes of memory
-
physmem minus 1 Gbyte on systems with more than 4 Gbytes of memory
- Range
-
64 Mbytes to physmem
- Units
-
Bytes
- Dynamic?
-
No
- Validation
-
Yes, the range is validated.
- When to Change
-
If a future memory requirement is significantly large and
well defined, you might consider reducing the value of this parameter to cap
the ARC so that it does not compete with the memory requirement. For example,
if you know that a future workload requires 20% of memory, it makes sense
to cap the ARC such that it does not consume more than the remaining 80% of
memory.
- Commitment Level
-
Unstable
- Change History
-
For information, see zfs_arc_max (Solaris 10 Releases).
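The capping example in When to Change can be worked through as follows. This Python sketch only computes a byte value for zfs_arc_max from a reserved percentage of memory. The function name arc_cap_bytes and the 16-Gbyte system are hypothetical. The printed line uses the usual /etc/system set syntax with the zfs module prefix, and you should confirm it against your release before using it.

```python
MB = 1024**2
GB = 1024**3

# Hypothetical helper: bytes left for the ARC after reserving a
# percentage of physical memory for a known workload.
def arc_cap_bytes(physical_bytes, reserved_percent):
    cap = physical_bytes * (100 - reserved_percent) // 100
    # The documented range for zfs_arc_max starts at 64 Mbytes.
    return max(cap, 64 * MB)

# Example: a workload needs 20% of memory on an assumed 16-Gbyte system.
cap = arc_cap_bytes(16 * GB, 20)
print(f"set zfs:zfs_arc_max={cap}")  # set zfs:zfs_arc_max=13743895347
```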
default_stksize
- Description
-
Specifies the default stack size of all threads. No thread
can be created with a stack size smaller than default_stksize.
If default_stksize is set, it overrides lwp_default_stksize. See also lwp_default_stksize.
- Data Type
-
Integer
- Default
-
-
3 x PAGESIZE on SPARC systems
-
2 x PAGESIZE on x86 systems
-
5 x PAGESIZE on AMD64 systems
- Range
-
Minimum is the default value:
-
3 x PAGESIZE on SPARC systems
-
2 x PAGESIZE on x86 systems
-
5 x PAGESIZE on AMD64 systems
Maximum is 32 times the default value.
- Units
-
Bytes in multiples of the value returned by the getpagesize function. For more information, see getpagesize(3C).
- Dynamic?
-
Yes. Affects threads created after the variable is changed.
- Validation
-
Must be greater than or equal to 8192 and less than or equal
to 262,144 (256 x 1024). Also must be a multiple of the system page size.
If these conditions are not met, the following message is displayed:
Illegal stack size, Using N
The value of N is the default value of default_stksize.
- When to Change
-
When the system panics because it has run out of stack space.
The best solution for this problem is to determine why the system is running
out of space and then make a correction.
Increasing the default stack size means that almost every kernel thread
will have a larger stack, resulting in increased kernel memory consumption
for no good reason. Generally, that space will be unused. The increased consumption
means other resources that are competing for the same pool of memory will
have the amount of space available to them reduced, possibly decreasing the
system's ability to perform work. Among the side effects is a reduction in
the number of threads that the kernel can create. This solution should be
treated as no more than an interim workaround until the root cause is remedied.
- Commitment Level
-
Unstable
lwp_default_stksize
- Description
-
Specifies the default value of the stack size to be used when
a kernel thread is created, and when the calling routine does not provide
an explicit size to be used.
- Data Type
-
Integer
- Default
-
-
8192 for x86 platforms
-
24,576 for SPARC platforms
-
20,480 for AMD64 platforms
- Range
-
Minimum is the default value:
-
3 x PAGESIZE on SPARC systems
-
2 x PAGESIZE on x86 systems
-
5 x PAGESIZE on AMD64 systems
Maximum is 32 times the default value.
- Units
-
Bytes in multiples of the value returned by the getpagesize function. For more information, see getpagesize(3C).
- Dynamic?
-
Yes. Affects threads created after the variable is changed.
- Validation
-
Must be greater than or equal to 8192 and less than or equal
to 262,144 (256 x 1024). Also must be a multiple of the system page size.
If these conditions are not met, the following message is displayed:
Illegal stack size, Using N
The value of N is the default value of lwp_default_stksize.
- When to Change
-
When the system panics because it has run out of stack space.
The best solution for this problem is to determine why the system is running
out of space and then make a correction.
Increasing the default stack size means that almost every kernel thread
will have a larger stack, resulting in increased kernel memory consumption
for no good reason. Generally, that space will be unused. The increased consumption
means other resources that are competing for the same pool of memory will
have the amount of space available to them reduced, possibly decreasing the
system's ability to perform work. Among the side effects is a reduction in
the number of threads that the kernel can create. This solution should be
treated as no more than an interim workaround until the root cause is remedied.
- Commitment Level
-
Unstable
- Change History
-
For information, see lwp_default_stksize (Solaris 9 Releases).
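The Validation rules shared by default_stksize and lwp_default_stksize can be sketched as a simple check. This is an illustration of the documented conditions, not the kernel's code. The function name effective_stksize is hypothetical, and the page size and default are passed in as assumptions.

```python
MIN_STKSIZE = 8192
MAX_STKSIZE = 256 * 1024  # 262,144

# Hypothetical helper: apply the documented validation conditions.
def effective_stksize(requested, page_size, default):
    in_range = MIN_STKSIZE <= requested <= MAX_STKSIZE
    page_multiple = requested % page_size == 0
    if not (in_range and page_multiple):
        # Mirrors the documented message; N is the default value.
        print(f"Illegal stack size, Using {default}")
        return default
    return requested

# Assumed SPARC-like values: 8-Kbyte pages, default of 3 x PAGESIZE.
print(effective_stksize(32768, 8192, 24576))  # 32768
print(effective_stksize(30000, 8192, 24576))  # falls back to 24576
```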
logevent_max_q_sz
- Description
-
Maximum number of system events allowed to be queued and waiting
for delivery to the syseventd daemon. Once the size of
the system event queue reaches this limit, no other system events are allowed
on the queue.
- Data Type
-
Integer
- Default
-
5000
- Range
-
0 to MAXINT
- Units
-
System events
- Dynamic?
-
Yes
- Validation
-
The system event framework checks this value every time a
system event is generated by ddi_log_sysevent and sysevent_post_event.
For more information, see ddi_log_sysevent(9F) and sysevent_post_event(3SYSEVENT).
- When to Change
-
When error log messages indicate that a system event failed
to be logged, generated, or posted.
- Commitment Level
-
Unstable
segkpsize
- Description
-
Specifies the amount of kernel pageable memory available.
This memory is used primarily for kernel thread stacks. Increasing this number
allows either larger stacks for the same number of threads or more threads.
This parameter can only be set on a system running a 64-bit kernel. A system
running a 64-bit kernel uses a default stack size of 24 Kbytes.
- Data Type
-
Unsigned long
- Default
-
64-bit kernels, 2 Gbytes
32-bit kernels, 512 Mbytes
- Range
-
64-bit kernels, 512 Mbytes to 24 Gbytes
- Units
-
8-Kbyte pages
- Dynamic?
-
No
- Validation
-
Value is compared to minimum and maximum sizes (512 Mbytes
and 24 Gbytes for 64-bit systems). If smaller than the minimum or larger than
the maximum, it is reset to 2 Gbytes. A message to that effect is displayed.
The actual size used in creation of the cache is the lesser of the value
specified in segkpsize after the validation checking or
50 percent of physical memory.
- When to Change
-
Required to support large numbers of processes on a system.
The default size is 2 Gbytes, assuming at least 1 Gbyte of physical memory
is present. This default size allows creation of 24-Kbyte stacks for more
than 87,000 kernel threads. The size of a stack in a 64-bit kernel is the
same, whether the process is a 32-bit process or a 64-bit process. If more
than this number is needed, segkpsize can be increased,
assuming sufficient physical memory exists.
- Commitment Level
-
Unstable
- Change History
-
For information, see segkpsize (Solaris 9 12/02 Release).
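The segkpsize validation rules and the figure of more than 87,000 threads can be reproduced with a short calculation. The following Python sketch follows the rules as documented for 64-bit kernels. The function names are hypothetical, and the kernel's real logic might differ in detail.

```python
MB = 1024**2
GB = 1024**3
PAGE = 8192               # segkpsize is expressed in 8-Kbyte pages
SEGKP_MIN = 512 * MB
SEGKP_MAX = 24 * GB
SEGKP_DEFAULT = 2 * GB

# Hypothetical helper: size actually used, per the Validation text.
def effective_segkp_bytes(segkpsize_pages, physical_bytes):
    size = segkpsize_pages * PAGE
    if size < SEGKP_MIN or size > SEGKP_MAX:
        size = SEGKP_DEFAULT          # out of range: reset to 2 Gbytes
    return min(size, physical_bytes // 2)  # no more than 50% of memory

# Hypothetical helper: how many fixed-size stacks fit in the segment.
def max_stacks(segkp_bytes, stack_bytes=24 * 1024):
    return segkp_bytes // stack_bytes

print(max_stacks(SEGKP_DEFAULT))                    # 87381
print(effective_segkp_bytes(262144, 8 * GB) // GB)  # 2
```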
noexec_user_stack
- Description
-
Enables the stack to be marked as nonexecutable, which helps
make buffer-overflow attacks more difficult.
A Solaris system running a 64-bit kernel makes the stacks of all
64-bit applications nonexecutable by default. Setting this parameter is necessary
to make the stacks of 32-bit applications nonexecutable on systems running 64-bit or 32-bit
kernels.
Note –
This parameter exists on all systems running the Solaris 2.6,
7, 8, 9, or 10 releases, but it is only effective on 64-bit SPARC and
AMD64 architectures.
- Data Type
-
Signed integer
- Default
-
0 (disabled)
- Range
-
0 (disabled) or 1 (enabled)
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes. Does not affect currently running processes, only processes
created after the value is set.
- Validation
-
None
- When to Change
-
Should be enabled at all times unless applications are deliberately
placing executable code on the stack without using mprotect to
make the stack executable. For more information, see mprotect(2).
- Commitment Level
-
Unstable
- Change History
-
For information, see noexec_user_stack (Solaris 9 Releases).
fsflush and Related Parameters
This section describes fsflush and related tunables.
fsflush
The system daemon, fsflush, runs periodically to do three main
tasks:
-
On every invocation, fsflush flushes dirty
file system pages over a certain age to disk.
-
On every invocation, fsflush examines a
portion of memory and causes modified pages to be written to their backing
store. Pages are written if they are modified and if they do not meet one
of the following conditions:
The net effect is to flush pages from files that are mapped with mmap with write permission and that have actually been changed.
Pages are flushed to backing store but left attached to the process
using them. This will simplify page reclamation when the system runs low on
memory by avoiding delay for writing the page to backing store before claiming
it, if the page has not been modified since the flush.
-
fsflush writes file system metadata to
disk. This write is done every nth invocation,
where n is computed from various configuration
variables. See tune_t_fsflushr and autoup for details.
The following features are configurable:
-
Frequency of invocation (tune_t_fsflushr)
-
Whether memory scanning is executed (dopageflush)
-
Whether file system data flushing occurs (doiflush)
-
The frequency with which file system data flushing occurs
(autoup)
For most systems, memory scanning and file system metadata synchronizing
are the dominant activities for fsflush. Depending on system
usage, memory scanning can be of little use or consume too much CPU time.
tune_t_fsflushr
- Description
-
Specifies the number of seconds between fsflush invocations
- Data Type
-
Signed integer
- Default
-
1
- Range
-
1 to MAXINT
- Units
-
Seconds
- Dynamic?
-
No
- Validation
-
If the value is less than or equal to zero, the value is reset
to 1 and a warning message is displayed.
This check is done only at boot time.
- When to Change
-
See the autoup parameter.
- Commitment Level
-
Unstable
autoup
- Description
-
Along with tune_t_fsflushr, autoup controls
the amount of memory examined for dirty pages in each invocation and the frequency
of file system synchronizing operations.
The value of autoup is also used to control whether a buffer
is written out from the free list. Buffers marked with the B_DELWRI flag
(which identifies file content pages that have changed) are written out whenever
the buffer has been on the list for longer than autoup seconds.
Increasing the value of autoup keeps the buffers in memory
for a longer time.
- Data Type
-
Signed integer
- Default
-
30
- Range
-
1 to MAXINT
- Units
-
Seconds
- Dynamic?
-
No
- Validation
-
If autoup is less than or equal to zero,
it is reset to 30 and a warning message is displayed. This check is done only
at boot time.
- Implicit
-
autoup should be an integer multiple of tune_t_fsflushr. At a minimum, autoup should
be at least 6 times the value of tune_t_fsflushr. If not,
excessive amounts of memory are scanned each time fsflush is
invoked.
The total system pages multiplied by tune_t_fsflushr should
be greater than or equal to autoup to cause memory to be
checked if dopageflush is non-zero.
- When to Change
-
Here are several potential situations for changing autoup, tune_t_fsflushr, or both:
-
Systems with large amounts of memory – In this case,
increasing autoup reduces the amount of memory scanned
in each invocation of fsflush.
-
Systems with minimal memory demand – Increasing both autoup and tune_t_fsflushr reduces the number
of scans made. autoup should be increased also to maintain
the current ratio of autoup / tune_t_fsflushr.
-
Systems with large numbers of transient files (for example,
mail servers or software build machines) – If large numbers of files
are created and then deleted, fsflush might unnecessarily
write data pages for those files to disk.
- Commitment Level
-
Unstable
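The Implicit constraints on autoup can be checked before values are placed in the /etc/system file. The following Python sketch tests the two stated relationships. The function name check_autoup is hypothetical, and the check is advisory only.

```python
# Hypothetical helper: report violations of the documented
# relationships between autoup and tune_t_fsflushr.
def check_autoup(autoup, tune_t_fsflushr):
    problems = []
    if autoup % tune_t_fsflushr != 0:
        problems.append("autoup is not an integer multiple of tune_t_fsflushr")
    if autoup < 6 * tune_t_fsflushr:
        problems.append("autoup is less than 6 times tune_t_fsflushr")
    return problems

print(check_autoup(30, 1))   # [] (the defaults satisfy both rules)
print(check_autoup(25, 10))  # both rules are violated
```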
dopageflush
- Description
-
Controls whether memory is examined for modified pages during fsflush invocations. In each invocation of fsflush,
the number of physical memory pages in the system is determined. This number
might have changed because of a dynamic reconfiguration operation. Each invocation
scans the following number of pages: total number of pages x tune_t_fsflushr / autoup
- Data Type
-
Signed integer
- Default
-
1 (enabled)
- Range
-
0 (disabled) or 1 (enabled)
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
If the system page scanner rarely runs, which is indicated
by a value of 0 in the sr column of vmstat output.
- Commitment Level
-
Unstable
- Change History
-
For information, see dopageflush (Solaris 10 Releases).
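The scan size given in the Description can be worked out for a sample system. The following Python sketch assumes integer division and a hypothetical system with 1,048,576 pages (8 Gbytes with 8-Kbyte pages). It shows why increasing autoup reduces the memory scanned per invocation.

```python
# Hypothetical helper: pages examined in one fsflush invocation,
# using the formula from the Description (integer division assumed).
def pages_per_invocation(total_pages, tune_t_fsflushr, autoup):
    return total_pages * tune_t_fsflushr // autoup

total = 1_048_576  # assumed: 8 Gbytes of memory with 8-Kbyte pages
print(pages_per_invocation(total, 1, 30))   # 34952 with the defaults
print(pages_per_invocation(total, 1, 240))  # 4369 with a larger autoup
```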
doiflush
- Description
-
Controls whether file system metadata syncs will be executed during fsflush invocations. This synchronization is done every Nth
invocation of fsflush where N = (autoup / tune_t_fsflushr). Because this computation uses integer
division, if tune_t_fsflushr is greater than autoup,
a synchronization is done on every invocation of fsflush because
the code checks to see if its iteration counter is greater than or equal to N. Note that N is computed once on
invocation of fsflush. Later changes to tune_t_fsflushr or autoup have no effect on the frequency of
synchronization operations.
- Data Type
-
Signed integer
- Default
-
1 (enabled)
- Range
-
0 (disabled) or 1 (enabled)
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When files are frequently modified over a period of time and
the load caused by the flushing perturbs system behavior.
Files whose existence, and therefore consistency of state, does not
matter if the system reboots are better kept in a TMPFS file system (for example, /tmp). Starting in the
Solaris 7 release, inode traffic can be reduced by mounting file systems with the noatime option (mount -o noatime).
This option eliminates inode updates when the file is accessed.
For a system engaged in realtime processing, you might want to disable
doiflush and use explicit application file synchronizing to achieve consistency.
- Commitment Level
-
Unstable
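The effect of the integer division described for doiflush can be simulated. The following Python sketch is a simplified model, not the fsflush code. The function name and the exact placement of the counter increment are assumptions. It shows that when tune_t_fsflushr is greater than autoup, N is 0 and every invocation synchronizes.

```python
# Simplified model (assumption): count invocations and synchronize
# when the counter is greater than or equal to N, then reset it.
def sync_invocations(autoup, tune_t_fsflushr, invocations):
    n = autoup // tune_t_fsflushr  # computed once, integer division
    counter = 0
    synced = []
    for i in range(1, invocations + 1):
        counter += 1
        if counter >= n:
            synced.append(i)
            counter = 0
    return synced

print(sync_invocations(30, 5, 12))  # [6, 12]
print(sync_invocations(5, 10, 3))   # [1, 2, 3], because N is 0
```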
Process-Sizing Parameters
Several parameters (or variables) are used to control the number of
processes that are available on the system and the number of processes that
an individual user can create. The foundation parameter is maxusers.
This parameter drives the values assigned to max_nprocs and maxuprc.
maxusers
- Description
-
Originally, maxusers defined the number of
logged in users the system could support. When a kernel was generated, various
tables were sized based on this setting. Current Solaris releases do much
of their sizing based on the amount of memory on the system. Thus, much of the
past use of maxusers has changed. A number of subsystems
are still derived from maxusers:
-
The maximum number of processes on the system
-
The number of quota structures held in the system
-
The size of the directory name look-up cache (DNLC)
- Data Type
-
Signed integer
- Default
-
Lesser of the amount of memory in Mbytes or 2048
- Range
-
1 to 2048, based on physical memory if not set in the /etc/system file
1 to 4096, if set in the /etc/system file
- Units
-
Users
- Dynamic?
-
No. After computation of dependent parameters is done, maxusers is never referenced again.
- Validation
-
None
- When to Change
-
When the default number of user processes derived by the system
is too low. This situation is evident when the following message displays
on the system console:
You might also change this parameter when the default number of processes
is too high, as in these situations:
-
Database servers that have a lot of memory and relatively
few running processes can save system memory when the default value of maxusers is reduced.
-
If file servers have a lot of memory and few running processes,
you might reduce this value. However, you should explicitly set the size of
the DNLC. See ncsize.
-
If compute servers have a lot of memory and few running processes,
you might reduce this value.
- Commitment Level
-
Unstable
reserved_procs
- Description
-
Specifies the number of system process slots to be reserved in
the process table for processes with a UID of root (0). For example, fsflush has a UID of root (0).
- Data Type
-
Signed integer
- Default
-
5
- Range
-
5 to MAXINT
- Units
-
Processes
- Dynamic?
-
No. Not used after the initial parameter computation.
- Validation
-
Starting in the Solaris 8 release, any /etc/system setting
is honored.
- Commitment Level
-
Unstable
- When to Change
-
Consider increasing to 10 + the normal number of UID 0 (root)
processes on the system. This setting provides some cushion should it be necessary
to obtain a root shell when the system is otherwise unable to create user-level
processes.
pidmax
- Description
-
Specifies the value of the largest possible process ID. Valid
for Solaris 8 and later releases.
pidmax sets the value for the maxpid variable.
Once maxpid is set, pidmax is ignored. maxpid is used elsewhere in the kernel to determine the maximum
process ID and for validation checking.
Any attempts to set maxpid by adding an entry to
the /etc/system file have no effect.
- Data Type
-
Signed integer
- Default
-
30,000
- Range
-
266 to 999,999
- Units
-
Processes
- Dynamic?
-
No. Used only at boot time to set the value of maxpid.
- Validation
-
Yes. Value is compared to the value of reserved_procs and
999,999. If less than reserved_procs or greater than 999,999,
the value is set to 999,999.
- Implicit
-
max_nprocs range checking ensures that max_nprocs is always less than or equal to this value.
- When to Change
-
Required to enable support for more than 30,000 processes
on a system.
- Commitment Level
-
Unstable
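The pidmax Validation rule can be expressed as a small function. The following Python sketch follows the rule as written above. The function name effective_maxpid is hypothetical, and reserved_procs is shown with its default value of 5.

```python
PID_CEILING = 999_999

# Hypothetical helper: value that maxpid receives at boot time,
# according to the documented Validation rule.
def effective_maxpid(pidmax, reserved_procs=5):
    if pidmax < reserved_procs or pidmax > PID_CEILING:
        return PID_CEILING
    return pidmax

print(effective_maxpid(30_000))     # 30000
print(effective_maxpid(2_000_000))  # 999999
```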
max_nprocs
- Description
-
Specifies the maximum number of processes that can be created
on a system. Includes system processes and user processes. Any value specified
in /etc/system is used in the computation of maxuprc.
This value is also used in determining the size of several other system data structures.
Other data structures where this parameter plays a role are as follows:
-
Determining the size of the directory name lookup cache (if ncsize is not specified)
-
Allocating disk quota structures for UFS (if ndquot is
not specified)
-
Verifying that the amount of memory used by configured system
V semaphores does not exceed system limits
-
Configuring Hardware Address Translation resources for x86 platforms.
- Data Type
-
Signed integer
- Default
-
10 + (16 x maxusers)
- Range
-
266 to value of maxpid
- Dynamic?
-
No
- Validation
-
Yes. The value is compared to maxpid and
set to maxpid if it is larger. On x86 platforms, an additional
check is made against a platform-specific value. max_nprocs is
set to the smallest value in the triplet (max_nprocs, maxpid, platform value). Both SPARC and x86 platforms use 65,534
as the platform value.
- When to Change
-
Changing this parameter is one of the steps necessary to enable
support for more than 30,000 processes on a system.
- Commitment Level
-
Unstable
- Change History
-
For information, see max_nprocs (Solaris 9 Releases).
maxuprc
- Description
-
Specifies the maximum number of processes that can be created
on a system by any one user.
- Data Type
-
Signed integer
- Default
-
max_nprocs - reserved_procs
- Range
-
1 to max_nprocs - reserved_procs
- Units
-
Processes
- Dynamic?
-
No
- Validation
-
Yes. This value is compared to max_nprocs - reserved_procs and set to the smaller of the two values.
- When to Change
-
When you want to specify a hard limit for the number of processes
a user can create that is less than the default value of however many processes
the system can create. Attempting to exceed this limit generates the following
warning message on the console or in the messages file:
out of per-user processes for uid N
- Commitment Level
-
Unstable
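The default values of the process-sizing parameters form a chain that starts with maxusers. The following Python sketch strings the documented defaults together. The function names are hypothetical. The maxpid value of 30,000 and the platform value of 65,534 are the defaults quoted in this section.

```python
# Hypothetical helpers that restate the documented defaults.
def default_maxusers(memory_mbytes):
    return min(memory_mbytes, 2048)

def effective_max_nprocs(maxusers, maxpid=30_000, platform_value=65_534):
    max_nprocs = 10 + 16 * maxusers
    # Validation: the smallest of the three values is used.
    return min(max_nprocs, maxpid, platform_value)

def default_maxuprc(max_nprocs, reserved_procs=5):
    return max_nprocs - reserved_procs

for mbytes in (1024, 8192):
    nprocs = effective_max_nprocs(default_maxusers(mbytes))
    print(mbytes, nprocs, default_maxuprc(nprocs))
# 1024 16394 16389
# 8192 30000 29995
```

In the second case the computed value of 32,778 is limited by maxpid, which is why pidmax must also be raised to support more than 30,000 processes.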
Paging-Related Parameters
The Solaris OS uses a demand paged virtual memory system. As the system
runs, pages are brought into memory as needed. When memory becomes occupied
above a certain threshold and demand for memory continues, paging begins.
Paging goes through several levels that are controlled by certain parameters.
The general paging algorithm is as follows:
-
A memory deficit is noticed. The page scanner thread runs
and begins to walk through memory. A two-step algorithm is employed:
-
A page is marked as unused.
-
If still unused after a time interval, the page is viewed
as a subject for reclaim.
If the page has been modified, a request is made to the pageout thread
to schedule the page for I/O. Also, the page scanner continues looking at
memory. Pageout causes the page to be written to the page's backing store
and placed on the free list. When the page scanner scans memory, no distinction
is made as to the origin of the page. The page might have come from a data
file, or it might represent a page from an executable's text, data, or stack.
-
As memory pressure on the system increases, the algorithm
becomes more aggressive in the pages it will consider as candidates for reclamation
and in how frequently the paging algorithm runs. (For more information, see fastscan and slowscan.) As available memory falls through the range between lotsfree and minfree, the system linearly increases the amount of memory scanned
in each invocation of the pageout thread from the value specified by slowscan to the value specified by fastscan. The system
uses the desfree parameter to control a number of decisions
about resource usage and behavior.
The system initially constrains itself to use no more than 4 percent
of one CPU for pageout operations. As memory pressure increases,
the amount of CPU time consumed in support of pageout operations
linearly increases until a maximum of 80 percent of one CPU is consumed. The
algorithm looks through some amount of memory between slowscan and fastscan, then stops when one of the following occurs:
-
Enough pages have been found to satisfy the memory shortfall.
-
The planned number of pages have been looked at.
-
Too much time has elapsed.
If a memory shortfall is still present when pageout finishes its scan,
another scan is scheduled for 1/4 second in the future.
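The linear increase in scan rate described above can be illustrated numerically. The following Python sketch interpolates between slowscan and fastscan as free memory falls from lotsfree to minfree, as this section states. The function name, the sample values, and the choice to return 0 when no deficit exists are assumptions, and the kernel's actual calculation might differ.

```python
# Illustrative model (assumption): pages to scan for a given amount
# of free memory, interpolated linearly with integer arithmetic.
def scan_rate(freemem, lotsfree, minfree, slowscan, fastscan):
    if freemem >= lotsfree:
        return 0            # no memory deficit, scanner is idle
    if freemem <= minfree:
        return fastscan     # maximum pressure
    span = lotsfree - minfree
    return slowscan + (lotsfree - freemem) * (fastscan - slowscan) // span

# Hypothetical values, all in pages.
print(scan_rate(15_999, 16_000, 4_000, 100, 8_000))  # 100
print(scan_rate(10_000, 16_000, 4_000, 100, 8_000))  # 4050
print(scan_rate(3_000, 16_000, 4_000, 100, 8_000))   # 8000
```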
The configuration mechanism of the paging subsystem was changed, starting
in the Solaris 9 release. Instead of depending on a set of predefined values
for fastscan, slowscan, and handspreadpages, the system determines the appropriate settings for these parameters
at boot time. Setting any of these parameters in the /etc/system file
can cause the system to use less than optimal values.
 Caution – Remove all tuning of the VM system from the /etc/system file.
Run with the default settings and determine if it is necessary to adjust any
of these parameters. Do not set either cachefree or priority_paging. They have been removed, starting in the Solaris 9 release.
Beginning in the Solaris 7 5/99 release, dynamic reconfiguration (DR)
for CPU and memory is supported. A system in a DR operation that involves
the addition or deletion of memory recalculates values for the relevant parameters,
unless the parameter has been explicitly set in /etc/system.
In that case, the value specified in /etc/system is used,
unless a constraint on the value of the variable has been violated. In this
case, the value is reset.
lotsfree
- Description
-
Serves as the initial trigger for system paging to begin. When
this threshold is crossed, the page scanner wakes up to begin looking for
memory pages to reclaim.
- Data Type
-
Unsigned long
- Default
-
The greater of 1/64th of physical memory or 512 Kbytes
- Range
-
The minimum value is 512 Kbytes or 1/64th of physical memory,
whichever is greater, expressed as pages using the page size returned by getpagesize. For more information, see getpagesize(3C).
The maximum value is the number of physical memory pages. The maximum
value should be no more than 30 percent of physical memory. The system does
not enforce this range, other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
Yes, but dynamic changes are lost if a memory-based DR operation
occurs.
- Validation
-
If lotsfree is greater than the amount
of physical memory, the value is reset to the default.
- Implicit
-
The relationship of lotsfree being greater
than desfree, which is greater than minfree,
should be maintained at all times.
- When to Change
-
When demand for pages is subject to sudden sharp spikes, the
memory algorithm might be unable to keep up with demand. One workaround is
to start reclaiming memory at an earlier time. This solution gives the paging
system some additional margin.
A rule of thumb is to set this parameter to 2 times what the system
needs to allocate in a few seconds. This parameter is workload dependent.
A DBMS server can probably work fine with the default settings. However, you
might need to adjust this parameter for a system doing heavy file system I/O.
For systems with relatively static workloads and large amounts of memory,
lower this value. The minimum acceptable value is 512 Kbytes, expressed as
pages using the page size returned by getpagesize.
- Commitment Level
-
Unstable
desfree
- Description
-
Specifies the preferred amount of memory to be free at all times
on the system.
- Data Type
-
Unsigned integer
- Default
-
lotsfree / 2
- Range
-
The minimum value is 256 Kbytes or 1/128th of physical memory,
whichever is greater, expressed as pages using the page size returned by getpagesize.
The maximum value is the number of physical memory pages. The maximum
value should be no more than 15 percent of physical memory. The system does
not enforce this range other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new physical
memory value.
- Validation
-
If desfree is greater than lotsfree, desfree is set to lotsfree / 2. No message is
displayed.
- Implicit
-
The relationship of lotsfree being greater
than desfree, which is greater than minfree,
should be maintained at all times.
- Side Effects
-
Several side effects can arise from increasing the value of
this parameter. When the new value nears or exceeds the amount of available
memory on the system, the following can occur:
-
Asynchronous I/O requests are not processed, unless available
memory exceeds desfree. Increasing the value of desfree can result in rejection of requests that otherwise would succeed.
-
NFS asynchronous writes are executed as synchronous writes.
-
The swapper is awakened earlier, and the behavior of the swapper
is biased towards more aggressive actions.
-
The system might not prefault as many executable pages into
the system. This side effect results in applications potentially running slower
than they otherwise would.
- When to Change
-
For systems with relatively static workloads and large amounts
of memory, lower this value. The minimum acceptable value is 256 Kbytes, expressed
as pages using the page size returned by getpagesize.
- Commitment Level
-
Unstable
minfree
- Description
-
Specifies the minimum acceptable memory level. When memory drops
below this number, the system biases allocations toward allocations necessary
to successfully complete pageout operations or to swap processes completely
out of memory. Either allocation denies or blocks other allocation requests.
- Data Type
-
Unsigned integer
- Default
-
desfree / 2
- Range
-
The minimum value is 128 Kbytes or 1/256th of physical memory,
whichever is greater, expressed as pages using the page size returned by getpagesize.
The maximum value is the number of physical memory pages. The maximum
value should be no more than 7.5 percent of physical memory. The system does
not enforce this range other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new physical
memory value.
- Validation
-
If minfree is greater than desfree, minfree is set to desfree / 2. No message is
displayed.
- Implicit
-
The relationship of lotsfree being greater
than desfree, which is greater than minfree,
should be maintained at all times.
- When to Change
-
The default value is generally adequate. For systems with
relatively static workloads and large amounts of memory, lower this value.
The minimum acceptable value is 128 Kbytes, expressed as pages using the page
size returned by getpagesize.
- Commitment Level
-
Unstable
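The default values of lotsfree, desfree, and minfree are derived from one another, and the Validation rules restore that relationship when a setting violates it. The following Python sketch restates those rules. The function names and the 8-Kbyte page size are assumptions for illustration.

```python
KB = 1024

# Hypothetical helper: documented defaults, expressed in pages.
def paging_defaults(physical_bytes, page_size=8192):
    physical_pages = physical_bytes // page_size
    lotsfree = max(physical_pages // 64, 512 * KB // page_size)
    desfree = lotsfree // 2
    minfree = desfree // 2
    return lotsfree, desfree, minfree

# Hypothetical helper: documented resets for desfree and minfree.
def validate(lotsfree, desfree, minfree):
    if desfree > lotsfree:
        desfree = lotsfree // 2
    if minfree > desfree:
        minfree = desfree // 2
    return lotsfree, desfree, minfree

print(paging_defaults(8 * 1024**3))    # (16384, 8192, 4096)
print(validate(16384, 20000, 4096))    # (16384, 8192, 4096)
```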
throttlefree
- Description
-
Specifies the memory level at which blocking memory allocation
requests are put to sleep, even if the memory is sufficient to satisfy the
request.
- Data Type
-
Unsigned integer
- Default
-
minfree
- Range
-
The minimum value is 128 Kbytes or 1/256th of physical memory,
whichever is greater, expressed as pages using the page size returned by getpagesize.
The maximum value is the number of physical memory pages. The maximum
value should be no more than 4 percent of physical memory. The system does
not enforce this range other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new physical
memory value.
- Validation
-
If throttlefree is greater than desfree, throttlefree is set to minfree.
No message is displayed.
- Implicit
-
The relationship of lotsfree being greater
than desfree, which is greater than minfree,
should be maintained at all times.
- When to Change
-
The default value is generally adequate. For systems with
relatively static workloads and large amounts of memory, lower this value.
The minimum acceptable value is 128 Kbytes, expressed as pages using the page
size returned by getpagesize. For more information, see getpagesize(3C).
- Commitment Level
-
Unstable
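As a rough illustration of the Range arithmetic for throttlefree, the following Python sketch converts the documented floor (128 Kbytes or 1/256th of physical memory, whichever is greater) into pages. The function name, the sample memory size, and the 8-Kbyte page size are assumptions chosen for illustration. On a live system, the page size is the value returned by getpagesize(3C).

```python
def range_minimum_pages(floor_kbytes, divisor, physmem_pages, page_size):
    """Return the documented range minimum, expressed in pages.

    floor_kbytes  -- fixed floor in Kbytes (128 for throttlefree)
    divisor       -- fraction of physical memory (256 means 1/256th)
    physmem_pages -- physical memory in pages (assumed input)
    page_size     -- bytes per page, as getpagesize would report
    """
    floor_pages = (floor_kbytes * 1024) // page_size
    fraction_pages = physmem_pages // divisor
    return max(floor_pages, fraction_pages)


# Hypothetical system: 8-Kbyte pages and 1 Gbyte of physical memory.
PAGE_SIZE = 8192
PHYSMEM_PAGES = (1024 * 1024 * 1024) // PAGE_SIZE

throttlefree_min = range_minimum_pages(128, 256, PHYSMEM_PAGES, PAGE_SIZE)
print(throttlefree_min)  # 512 pages on this hypothetical system
```

The same helper covers other parameters that use this pattern, for example a 64-Kbyte floor with a divisor of 512.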
pageout_reserve
- Description
-
Specifies the number of pages reserved for the exclusive use of
the pageout or scheduler threads. When available memory is less than this
value, nonblocking allocations are denied for any processes other than pageout
or the scheduler. Pageout needs to have a small pool of memory for its use
so it can allocate the data structures necessary to do the I/O for writing
a page to its backing store. This variable was introduced in the Solaris 2.6
release to ensure that the system would be able to perform a pageout operation
in the face of the most severe memory shortage.
- Data Type
-
Unsigned integer
- Default
-
throttlefree / 2
- Range
-
The minimum value is 64 Kbytes or 1/512th of physical memory,
whichever is greater, expressed as pages using the page size returned by getpagesize(3C).
The maximum value is the number of physical memory pages. In practice, the value
should be no more than 2 percent of physical memory. The system does not enforce
this range, other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new physical
memory value.
- Validation
-
If pageout_reserve is greater than throttlefree / 2, pageout_reserve is set to throttlefree / 2. No message is displayed.
- Implicit
-
The relationship of lotsfree being greater
than desfree, which is greater than minfree,
should be maintained at all times.
- When to Change
-
The default value is generally adequate. For systems with
relatively static workloads and large amounts of memory, lower this value.
The minimum acceptable value is 64 Kbytes, expressed as pages using the page
size returned by getpagesize.
- Commitment Level
-
Unstable
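The Validation entries for minfree, throttlefree, and pageout_reserve can be read together as a chain of adjustments. The following Python sketch models that chain under stated assumptions: the order in which the checks run, the use of integer division, and the function name are illustrative only and are not taken from the kernel source.

```python
def validate_thresholds(desfree, minfree, throttlefree, pageout_reserve):
    """Apply the three documented adjustments in sequence; all values are pages."""
    # minfree greater than desfree: minfree becomes desfree / 2.
    if minfree > desfree:
        minfree = desfree // 2
    # throttlefree greater than desfree: throttlefree becomes minfree.
    if throttlefree > desfree:
        throttlefree = minfree
    # pageout_reserve greater than throttlefree / 2: clamp to throttlefree / 2.
    if pageout_reserve > throttlefree // 2:
        pageout_reserve = throttlefree // 2
    return minfree, throttlefree, pageout_reserve


# Hypothetical out-of-range settings, in pages.
print(validate_thresholds(1000, 1200, 1500, 900))  # (500, 500, 250)
```

Because no message is displayed when a value is adjusted, a quick check of this kind before editing the /etc/system file can show whether a planned setting would be silently changed.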
pages_pp_maximum
- Description
-
Defines the number of pages that must be unlocked. If a request
to lock pages would force available memory below this value, that request
is refused.
- Data Type
-
Unsigned long
- Default
-
The greater of (tune_t_minarmem + 100) and
(4% of memory available at boot time + 4 Mbytes)
- Range
-
Minimum value enforced by the system is tune_t_minarmem +
100. The system does not enforce a maximum value.
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new
physical memory value.
- Validation
-
If the value specified in the /etc/system file
or the calculated default is less than tune_t_minarmem +
100, the value is reset to tune_t_minarmem + 100.
No message is displayed if the value from the /etc/system file
is increased. Validation is done only at boot time and during dynamic reconfiguration
operations that involve adding or deleting memory.
- When to Change
-
When memory-locking requests fail or when attaching to a shared
memory segment with the SHARE_MMU flag fails, yet the amount
of memory available seems to be sufficient.
Excessively large values can cause memory locking requests (mlock, mlockall, and memcntl) to fail unnecessarily.
For more information, see mlock(3C),
mlockall(3C),
and memcntl(2).
- Commitment Level
-
Unstable
- Change History
-
For information, see pages_pp_maximum (Solaris Releases Prior to Solaris 9 Releases).
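The default for pages_pp_maximum can be worked through with a short calculation. This Python sketch reads the Default entry as the greater of (tune_t_minarmem + 100) and (4 percent of boot-time memory plus 4 Mbytes), all in pages. The function name and the sample inputs are assumptions for illustration.

```python
def pages_pp_maximum_default(tune_t_minarmem, boot_mem_pages, page_size):
    """Return the documented default for pages_pp_maximum, in pages."""
    floor = tune_t_minarmem + 100                 # enforced minimum
    four_mbytes = (4 * 1024 * 1024) // page_size  # 4 Mbytes as pages
    scaled = (boot_mem_pages * 4) // 100 + four_mbytes
    return max(floor, scaled)


# Hypothetical system: tune_t_minarmem at its default of 25,
# 131072 pages available at boot, 8-Kbyte pages.
print(pages_pp_maximum_default(25, 131072, 8192))  # 5754
```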
tune_t_minarmem
- Description
-
Defines the minimum amount of available resident (not swappable) memory
that must be maintained to avoid deadlock. Used to reserve a portion of memory
for use by the core of the OS. Pages restricted in this way are not seen when
the OS determines the maximum amount of memory available.
- Data Type
-
Signed integer
- Default
-
25
- Range
-
1 to physical memory
- Units
-
Pages
- Dynamic?
-
No
- Validation
-
None. Large values result in wasted physical memory.
- When to Change
-
The default value is generally adequate. Consider increasing
the default value if the system locks up and debugging information indicates
that no memory was available.
- Commitment Level
-
Unstable
fastscan
- Description
-
Defines the maximum number of pages per second that the system
looks at when memory pressure is highest.
- Data Type
-
Signed integer
- Default
-
The lesser of 64 Mbytes and 1/2 of physical memory.
- Range
-
1 to one-half of physical memory
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
by /etc/system or calculated from the new physical memory
value.
- Validation
-
The maximum value is the lesser of 64 Mbytes and 1/2 of physical
memory.
- When to Change
-
When more aggressive scanning of memory is preferred during
periods of memory shortfall, especially when the system is subject to periods
of intense memory demand or when performing heavy file I/O.
- Commitment Level
-
Unstable
slowscan
- Description
-
Defines the minimum number of pages per second that the system
looks at when attempting to reclaim memory.
- Data Type
-
Signed integer
- Default
-
The smaller of 1/20th of physical memory in pages and 100.
- Range
-
1 to fastscan / 2
- Units
-
Pages
- Dynamic?
-
Yes, unless dynamic reconfiguration operations that add or
delete memory occur. At that point, the value is reset to the value provided
in the /etc/system file or calculated from the new physical
memory value.
- Validation
-
If slowscan is larger than fastscan /
2, slowscan is reset to fastscan / 2.
No message is displayed.
- When to Change
-
When more aggressive scanning of memory is preferred during
periods of memory shortfall, especially when the system is subject to periods
of intense memory demand.
- Commitment Level
-
Unstable
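The defaults for fastscan and slowscan, and the slowscan validation rule, can be sketched as follows. The function names and the sample system are assumptions for illustration, and the kernel's actual rounding might differ.

```python
def fastscan_default(physmem_pages, page_size):
    """Lesser of 64 Mbytes (as pages) and one-half of physical memory."""
    cap = (64 * 1024 * 1024) // page_size
    return min(cap, physmem_pages // 2)


def slowscan_default(physmem_pages):
    """Smaller of 1/20th of physical memory (in pages) and 100."""
    return min(physmem_pages // 20, 100)


def validate_slowscan(slowscan, fastscan):
    """slowscan larger than fastscan / 2 is reset to fastscan / 2."""
    return min(slowscan, fastscan // 2)


# Hypothetical system: 131072 pages of 8 Kbytes each (1 Gbyte).
fast = fastscan_default(131072, 8192)
slow = slowscan_default(131072)
print(fast, slow, validate_slowscan(5000, fast))  # 8192 100 4096
```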
min_percent_cpu
- Description
-
Defines the minimum percentage of CPU that pageout can
consume. This parameter is used as the starting point for determining the
maximum amount of time that can be consumed by the page scanner.
- Data Type
-
Signed integer
- Default
-
4
- Range
-
1 to 80
- Units
-
Percentage
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
Increasing this value on systems with multiple CPUs and lots
of memory, which are subject to intense periods of memory demand, enables
the pager to spend more time attempting to find memory.
- Commitment Level
-
Unstable
handspreadpages
- Description
-
The Solaris OS uses a two-handed clock algorithm to look for pages
that are candidates for reclaiming when memory is low. The first hand of the
clock walks through memory marking pages as unused. The second hand walks
through memory some distance after the first hand, checking to see if the
page is still marked as unused. If so, the page is subject to being reclaimed.
The distance between the first hand and the second hand is handspreadpages.
- Data Type
-
Unsigned long
- Default
-
fastscan
- Range
-
1 to maximum number of physical memory pages on the system
- Units
-
Pages
- Dynamic?
-
Yes. This parameter requires that the kernel reset_hands parameter also be set to a non-zero value. Once the new value of handspreadpages has been recognized, reset_hands is
set to zero.
- Validation
-
The value is set to the lesser of the amount of physical
memory and the handspreadpages value.
- When to Change
-
When you want to increase the amount of time that pages are
potentially resident before being reclaimed. Increasing this value increases
the separation between the hands, and therefore, the amount of time before
a page can be reclaimed.
- Commitment Level
-
Unstable
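The effect of handspreadpages can be illustrated with a deliberately simplified model of the two-handed clock. This is not kernel code. The function name, the per-page "retouch delay" input, and the treatment of the boundary case (a delay equal to the hand spread counts as too late) are all assumptions made for the sketch.

```python
def reclaim_candidates(retouch_delay, handspread):
    """Return indexes of pages that the second hand finds still unused.

    retouch_delay[i] is the number of pages the scanner examines after the
    first hand marks page i before the workload references page i again,
    or None if page i is never referenced again.
    """
    return [
        page
        for page, delay in enumerate(retouch_delay)
        if delay is None or delay >= handspread
    ]


delays = [1, 5, None, 10]
print(reclaim_candidates(delays, 4))  # [1, 2, 3]
print(reclaim_candidates(delays, 8))  # [2, 3]
```

With the wider spread, the page that is referenced again after 5 scan steps survives, which matches the statement that a larger handspreadpages value increases the time before a page can be reclaimed.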
pages_before_pager
- Description
-
Defines part of a system threshold that immediately frees pages
after an I/O completes instead of storing the pages for possible reuse. The
threshold is lotsfree + pages_before_pager. The NFS environment
also uses this threshold to curtail its asynchronous activities as memory
pressure mounts.
- Data Type
-
Signed integer
- Default
-
200
- Range
-
1 to amount of physical memory
- Units
-
Pages
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
You might change this parameter when the majority of I/O is
done for pages that are truly read or written once and never referenced again.
Setting this variable to a larger amount of memory keeps adding pages to the
free list.
You might also change this parameter when the system is subject to bursts
of severe memory pressure. A larger value here helps maintain a larger cushion
against the pressure.
- Commitment Level
-
Unstable
maxpgio
- Description
-
Defines the maximum number of page I/O requests that can be queued
by the paging system. This number is divided by 4 to get the actual maximum
number used by the paging system. This parameter is used to throttle the number
of requests as well as to control process swapping.
- Data Type
-
Signed integer
- Default
-
40
- Range
-
1 to a variable maximum that depends on the system architecture,
but mainly on the I/O subsystem, such as the number of controllers, disks,
and disk swap size
- Units
-
I/Os
- Dynamic?
-
No
- Validation
-
None
- Implicit
-
The maximum number of I/O requests from the pager is limited
by the size of a list of request buffers, which is currently sized at 256.
- When to Change
-
Increase this parameter to page out memory faster. A larger
value might help to recover faster from memory pressure if more than one swap
device is configured or if the swap device is a striped device. Note that
the existing I/O subsystem should be able to handle the additional I/O load.
Also, increased swap I/O could degrade application I/O performance if the
swap partition and application files are on the same disk.
- Commitment Level
-
Unstable
- Change History
-
For information, see maxpgio (Solaris 10 Releases).
Swapping-Related Parameters
Swapping in the Solaris OS is accomplished by the swapfs pseudo file
system. The combination of space on swap devices and physical memory is treated
as the pool of space available to support the system for maintaining backing
store for anonymous memory. The system attempts to allocate space from disk
devices first, and then uses physical memory as backing store. When swapfs
is forced to use system memory for backing store, limits are enforced to ensure
that the system does not deadlock because of excessive consumption by swapfs.
swapfs_reserve
- Description
-
Defines the amount of system memory that is reserved for use by
system (UID = 0) processes.
- Data Type
-
Unsigned long
- Default
-
The smaller of 4 Mbytes and 1/16th of physical memory
- Range
-
The minimum value is 4 Mbytes or 1/16th of physical memory,
whichever is smaller, expressed as pages using the page size returned by getpagesize.
The maximum value is the number of physical memory pages. In practice, the
value should be no more than 10 percent of physical memory. The system does
not enforce this range, other than that described in the Validation section.
- Units
-
Pages
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
Generally not necessary. Only change when recommended by a
software provider, or when system processes are terminating because of an
inability to obtain swap space. A much better solution is to add physical
memory or additional swap devices to the system.
- Commitment Level
-
Unstable
swapfs_minfree
- Description
-
Defines the desired amount of physical memory to be kept free
for the rest of the system. Attempts to reserve memory for use as swap space
by any process that causes the system's perception of available memory to
fall below this value are rejected. Pages reserved in this manner can only
be used for locked-down allocations by the kernel or by user-level processes.
- Data Type
-
Unsigned long
- Default
-
The larger of 2 Mbytes and 1/8th of physical memory
- Range
-
1 to amount of physical memory
- Units
-
Pages
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
When processes are failing because of an inability to obtain
swap space, yet the system has memory available.
- Commitment Level
-
Unstable
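The defaults for swapfs_reserve and swapfs_minfree follow the same "fixed size versus fraction of memory" pattern, one taking the smaller value and the other the larger. A minimal Python sketch, with function names and the sample system chosen for illustration only:

```python
def mbytes_to_pages(mbytes, page_size):
    """Convert Mbytes to pages for the given page size in bytes."""
    return (mbytes * 1024 * 1024) // page_size


def swapfs_reserve_default(physmem_pages, page_size):
    """Smaller of 4 Mbytes and 1/16th of physical memory, in pages."""
    return min(mbytes_to_pages(4, page_size), physmem_pages // 16)


def swapfs_minfree_default(physmem_pages, page_size):
    """Larger of 2 Mbytes and 1/8th of physical memory, in pages."""
    return max(mbytes_to_pages(2, page_size), physmem_pages // 8)


# Hypothetical system: 131072 pages of 8 Kbytes each (1 Gbyte).
print(swapfs_reserve_default(131072, 8192))  # 512
print(swapfs_minfree_default(131072, 8192))  # 16384
```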
Kernel Memory Allocator
The Solaris kernel memory allocator distributes chunks of memory for
use by clients inside the kernel. The allocator creates a number of caches
of varying size for use by its clients. Clients can also request the allocator
to create a cache for use by that client (for example, to allocate structures
of a particular size). Statistics about each cache that the allocator manages
can be seen by using the kstat -c kmem_cache command.
Occasionally, systems might panic because of memory corruption. The
kernel memory allocator supports a debugging interface (a set of flags) that
performs various integrity checks on the buffers. The kernel memory allocator
also collects information on the allocators. The integrity checks provide
the opportunity to detect errors closer to where they actually occurred. The
collected information provides additional data for support people when they
try to ascertain the reason for the panic.
Use of the flags incurs additional overhead and memory usage during
system operations. The flags should only be used when a memory corruption
problem is suspected.
kmem_flags
- Description
-
The Solaris kernel memory allocator has various debugging and
test options that were extensively used during the internal development cycle
of the Solaris OS. Starting in the Solaris 2.5 release, a subset of these
options became available. They are controlled by the kmem_flags variable,
which was set by using a kernel debugger and then rebooting the system. Because
of issues with the timing of the instantiation of the kernel memory allocator
and the parsing of the /etc/system file, it was not possible
to set these flags in the /etc/system file until the
Solaris 8 release.
Five supported flag settings are described here.
| Flag | Setting | Description |
|---|---|---|
| AUDIT | 0x1 | The allocator maintains a log that contains recent history of its activity. The number of items logged depends on whether CONTENTS is also set. The log is a fixed size. When space is exhausted, earlier records are reclaimed. |
| TEST | 0x2 | The allocator writes a pattern into freed memory and checks that the pattern is unchanged when the buffer is next allocated. If some portion of the buffer is changed, then the memory was probably used by a client that had previously allocated and freed the buffer. If an overwrite is identified, the system panics. |
| REDZONE | 0x4 | The allocator provides extra memory at the end of the requested buffer and inserts a special pattern into that memory. When the buffer is freed, the pattern is checked to see if data was written past the end of the buffer. If an overwrite is identified, the kernel panics. |
| CONTENTS | 0x8 | The allocator logs up to 256 bytes of buffer contents when the buffer is freed. This flag requires that AUDIT also be set. The numeric value of these flags can be logically added together and set by the /etc/system file, starting in the Solaris 8 release, or for previous releases, by booting kadb and setting the flags before starting the kernel. |
| LITE | 0x100 | Does minimal integrity checking when a buffer is allocated and freed. When enabled, the allocator checks that the redzone has not been written into, that a freed buffer is not being freed again, and that the buffer being freed is the size that was allocated. This flag is available as of the Solaris 7 3/99 release. Do not combine this flag with any other flags. |
- Data Type
-
Signed integer
- Default
-
0 (disabled)
- Range
-
0 (disabled), 1 to 15, or 256 (0x100)
- Dynamic?
-
Yes. Changes made during runtime only affect new kernel memory
caches. After system initialization, the creation of new caches is rare.
- Validation
-
None
- When to Change
-
When memory corruption is suspected
- Commitment Level
-
Unstable
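Because the kmem_flags settings are bit values, a combined setting is the bitwise OR of the individual flags. The following Python sketch computes a combined value and checks the two constraints stated in the flag descriptions (CONTENTS requires AUDIT, and LITE is not combined with other flags). The class and function names are hypothetical, and the checks are an illustration only, because the system itself performs no validation. The printed line uses the standard /etc/system "set" form.

```python
from enum import IntFlag


class KmemFlag(IntFlag):
    AUDIT = 0x1
    TEST = 0x2
    REDZONE = 0x4
    CONTENTS = 0x8
    LITE = 0x100


def kmem_flags_value(*flags):
    """Combine flags and reject combinations the descriptions advise against."""
    value = KmemFlag(0)
    for flag in flags:
        value |= flag
    if KmemFlag.CONTENTS in value and KmemFlag.AUDIT not in value:
        raise ValueError("CONTENTS requires that AUDIT also be set")
    if KmemFlag.LITE in value and value != KmemFlag.LITE:
        raise ValueError("LITE must not be combined with other flags")
    return int(value)


all_four = kmem_flags_value(
    KmemFlag.AUDIT, KmemFlag.TEST, KmemFlag.REDZONE, KmemFlag.CONTENTS
)
print("set kmem_flags=%#x" % all_four)  # set kmem_flags=0xf
```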
General Driver Parameters
moddebug
- Description
-
When this parameter is enabled, messages about various steps
in the module loading process are displayed.
- Data Type
-
Signed integer
- Default
-
0 (messages off)
- Range
-
Here are the most useful values:
-
0x80000000 – Prints [un] loading... message.
For every module loaded, messages such as the following appear on the console
and in the /var/adm/messages file:
Nov 5 16:12:28 sys genunix: [ID 943528 kern.notice]
load 'sched/TS_DPTBL' id 9 loaded @ 0x10126438/
0x10438dd8 size 132/2064
Nov 5 16:12:28 sys genunix: [ID 131579 kern.notice]
installing TS_DPTBL, module id 9.
|
-
0x40000000 – Prints detailed error messages. For every
module loaded, messages such as the following appear on the console and in
the /var/adm/messages file:
Nov 5 16:16:50 sys krtld: [ID 284770 kern.notice]
kobj_open: can't open /platform/SUNW,Ultra-80/kernel/
sched/TS_DPTBL
Nov 5 16:16:50 sys krtld: [ID 284770 kern.notice]
kobj_open: can't open /platform/sun4u/kernel/sched/
TS_DPTBL
Nov 5 16:16:50 sys krtld: [ID 797908 kern.notice]
kobj_open: '/kernel/sch...
Nov 5 16:16:50 sys krtld: [ID 605504 kern.notice]
descr = 0x2a
Nov 5 16:16:50 sys krtld: [ID 642728 kern.notice]
kobj_read_file: size=34,
Nov 5 16:16:50 sys krtld: [ID 217760 kern.notice]
offset=0
Nov 5 16:16:50 sys krtld: [ID 136382 kern.notice]
kobj_read: req 8192 bytes,
Nov 5 16:16:50 sys krtld: [ID 295989 kern.notice]
got 4224
Nov 5 16:16:50 sys krtld: [ID 426732 kern.notice]
read 1080 bytes
Nov 5 16:16:50 sys krtld: [ID 720464 kern.notice]
copying 34 bytes
Nov 5 16:16:50 sys krtld: [ID 234587 kern.notice]
count = 34
[33 lines elided]
Nov 5 16:16:50 sys genunix: [ID 943528 kern.notice]
load 'sched/TS_DPTBL' id 9 loaded @ 0x10126438/
0x10438dd8 size 132/2064
Nov 5 16:16:50 sys genunix: [ID 131579 kern.notice]
installing TS_DPTBL, module id 9.
Nov 5 16:16:50 sys genunix: [ID 324367 kern.notice]
init 'sched/TS_DPTBL' id 9 loaded @ 0x10126438/
0x10438dd8 size 132/2064
|
-
0x20000000 – Prints even more detailed messages. This value
doesn't print any additional information beyond what the 0x40000000 flag
does during system boot. However, this value does print additional information
about releasing the module when the module is unloaded.
These values can be added together to set the final value.
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When a module is either not loading as expected, or the system
seems to hang while loading modules. Note that when 0x40000000 is
set, system boot is slowed down considerably by the number of messages written
to the console.
- Commitment Level
-
Unstable
ddi_msix_alloc_limit
- Description
-
This parameter, available on x86 systems only, controls the
number of Extended Message Signaled Interrupts (MSI-X) that a device instance
can allocate. Due to an existing system limitation, the default value is 2.
You can increase the number of MSI-X interrupts that a device instance can
allocate by increasing the value of this parameter. This parameter can be
set either by editing the /etc/system file or by setting
it with mdb before the device driver attach occurs.
- Data Type
-
Signed integer
- Default
-
2
- Range
-
1 to 16
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
To increase the number of MSI-X interrupts that a device instance
can allocate. However, if you increase the number of MSI-X interrupts that
a device instance can allocate, adequate interrupts might not be available
to satisfy all allocation requests. If this happens, some devices might stop
functioning or the system might fail to boot. Reduce the value or remove the
parameter in this case.
- Commitment Level
-
Unstable
- Change History
-
For information, see ddi_msix_alloc_limit (Solaris 10 Release and Open Solaris 2009.06 Release).
General I/O Parameters
maxphys
- Description
-
Defines the maximum size of physical I/O requests. If a driver
encounters a request larger than this size, the driver breaks the request
into maxphys sized chunks. File systems can and do impose
their own limit.
- Data Type
-
Signed integer
- Default
-
131,072 (sun4u or sun4v)
or 57,344 (x86). The sd driver uses the value of 1,048,576
if the drive supports wide transfers. The ssd driver uses
1,048,576 by default.
- Range
-
Machine-specific page size to MAXINT
- Units
-
Bytes
- Dynamic?
-
Yes, but many file systems load this value into a per-mount
point data structure when the file system is mounted. A number of drivers
load the value at the time a device is attached to a driver-specific data
structure.
- Validation
-
None
- When to Change
-
When doing I/O to and from raw devices in large chunks. Note
that a DBMS doing OLTP operations issues large numbers of small I/Os. Changing maxphys does not result in any performance improvement in that case.
You might also consider changing this parameter when doing I/O to and
from a UFS file system where large amounts of data (greater than 64 Kbytes)
are being read or written at any one time. The file system should be optimized
to increase contiguity. For example, increase the size of the cylinder groups
and decrease the number of inodes per cylinder group. UFS imposes an internal
limit of 1 Mbyte on the maximum I/O size it transfers.
- Commitment Level
-
Unstable
- Change History
-
For information, see maxphys (Solaris 10 Releases).
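The chunking behavior described for maxphys can be sketched as follows. This is a simplified model of a driver breaking one request into maxphys-sized pieces. The function name and the sample request size are assumptions, and real drivers and file systems can impose their own limits.

```python
def split_request(length, maxphys):
    """Return the chunk sizes, in bytes, for one request of `length` bytes."""
    if maxphys <= 0:
        raise ValueError("maxphys must be positive")
    full_chunks, remainder = divmod(length, maxphys)
    chunks = [maxphys] * full_chunks
    if remainder:
        chunks.append(remainder)
    return chunks


# A hypothetical 300,000-byte raw I/O request.
print(split_request(300000, 131072))   # [131072, 131072, 37856]
print(split_request(300000, 1048576))  # [300000]
```

With the larger value, the request is issued as a single transfer, which is why raising maxphys matters mainly for large raw I/O and not for workloads that issue many small I/Os.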
rlim_fd_max
- Description
-
Specifies the “hard” limit on file descriptors that
a single process might have open. Overriding this limit requires superuser
privilege.
- Data Type
-
Signed integer
- Default
-
65,536
- Range
-
1 to MAXINT
- Units
-
File descriptors
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
When the maximum number of open files for a process is not
enough. Other limitations in system facilities can mean that a larger number
of file descriptors is not as useful as it might be. For example:
-
A 32-bit program using standard I/O is limited to 256 file
descriptors. A 64-bit program using standard I/O can use up to 2 billion descriptors.
Specifically, standard I/O refers to the stdio(3C) functions in libc(3LIB).
-
select is by default limited to 1024 descriptors
per fd_set. For more information, see select(3C). Starting with the Solaris
7 release, 32-bit application code can be recompiled with a larger fd_set size (less than or equal to 65,536). A 64-bit application uses
an fd_set size of 65,536, which cannot be changed.
An alternative to changing this on a system-wide basis is to use the plimit(1) command. If a parent process
has its limits changed by plimit, all children inherit
the increased limit. This alternative is useful for daemons such as inetd.
- Commitment Level
-
Unstable
- Change History
-
For information, see rlim_fd_max (Solaris 8 Release).
rlim_fd_cur
- Description
-
Defines the “soft” limit on file descriptors that
a single process can have open. A process might adjust its file descriptor
limit to any value up to the “hard” limit defined by rlim_fd_max by using the setrlimit() call or by issuing
the limit command in whatever shell it is running. You
do not require superuser privilege to adjust the limit to any value less than
or equal to the hard limit.
- Data Type
-
Signed integer
- Default
-
256
- Range
-
1 to MAXINT
- Units
-
File descriptors
- Dynamic?
-
No
- Validation
-
Compared to rlim_fd_max. If rlim_fd_cur is greater than rlim_fd_max, rlim_fd_cur is
reset to rlim_fd_max.
- When to Change
-
When the default number of open files for a process is not
enough. Increasing this value means only that it might not be necessary for
a program to use setrlimit to increase the maximum number
of file descriptors available to it.
- Commitment Level
-
Unstable
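A process can raise its own soft limit without any change to rlim_fd_cur. The following minimal sketch uses the getrlimit and setrlimit calls from Python's standard resource module, which is available on UNIX-like systems. The helper name is hypothetical, and the requested value of 1024 is only an example.

```python
import resource


def set_soft_fd_limit(wanted):
    """Set the soft file descriptor limit, capped at the hard limit.

    Returns the (soft, hard) pair in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # A finite hard limit caps what an unprivileged process can request.
    if hard != resource.RLIM_INFINITY:
        wanted = min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print(set_soft_fd_limit(1024))
```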
General File System Parameters
ncsize
- Description
-
Defines the number of entries in the directory name look-up cache
(DNLC). This parameter is used by UFS, NFS, and ZFS to cache elements of path names
that have been resolved.
Starting with the Solaris 8 6/00 release, the DNLC also caches negative
look-up information, which means it caches a name not found in the cache.
- Data Type
-
Signed integer
- Default
-
(4 x (v.v_proc + maxusers) + 320) +
((4 x (v.v_proc + maxusers) + 320) / 100)
- Range
-
0 to MAXINT
- Units
-
DNLC entries
- Dynamic?
-
No
- Validation
-
None. Larger values cause the time it takes to unmount a file
system to increase as the cache must be flushed of entries for that file system
during the unmount process.
- When to Change
-
Prior to the Solaris 8 6/00 release, it was difficult to determine
whether the cache was too small. You could make this inference by noting the
number of entries returned by kstat -n ncstats. If the
number seems high, given the system workload and file access pattern, this
might be due to the size of the DNLC.
Starting with the Solaris 8 6/00 release, you can use the kstat
-n dnlcstats command to determine when entries have been removed
from the DNLC because it was too small. The sum of the pick_heuristic and
the pick_last parameters represents otherwise valid entries
that were reclaimed because the cache was too small.
Excessive values of ncsize have an immediate impact
on the system because the system allocates a set of data structures for the
DNLC based on the value of ncsize. A system running a 32-bit
kernel allocates 36-byte structures for ncsize, while a
system running a 64-bit kernel allocates 64-byte structures for ncsize.
The value has a further effect on UFS and NFS, unless ufs_ninode and nfs:nrnode are explicitly set.
- Commitment Level
-
Unstable
- Change History
-
For information, see ncsize (Solaris 10 Release).
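To estimate the memory cost of a given ncsize, the default and the per-entry structure sizes can be combined in a short calculation. This sketch reads the Default entry as a base value plus one percent of that base. That reading of the formula, the function names, and the sample v.v_proc and maxusers values are assumptions for illustration.

```python
def ncsize_default(v_proc, maxusers):
    """Base value plus one percent of the base (assumed reading of the formula)."""
    base = 4 * (v_proc + maxusers) + 320
    return base + base // 100


def dnlc_bytes(ncsize, kernel_bits):
    """Memory for DNLC entries: 36 bytes each (32-bit) or 64 bytes each (64-bit)."""
    per_entry = {32: 36, 64: 64}[kernel_bits]
    return ncsize * per_entry


# Hypothetical system: v.v_proc = 30000, maxusers = 2048.
entries = ncsize_default(30000, 2048)
print(entries)                  # 129797
print(dnlc_bytes(entries, 64))  # 8307008 bytes on a 64-bit kernel
```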
rstchown
- Description
-
Indicates whether the POSIX semantics for the chown system
call are in effect. POSIX semantics are as follows:
-
A process cannot change the owner of a file, unless it is
running with UID 0.
-
A process cannot change the group ownership of a file to a
group in which it is not currently a member, unless it is running as UID 0.
For more information, see chown(2).
- Data Type
-
Signed integer
- Default
-
1, indicating that POSIX semantics are used
- Range
-
0 = POSIX semantics not in force or 1 = POSIX semantics used
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When POSIX semantics are not wanted. Note that turning off
POSIX semantics opens the potential for various security holes. Doing so also
opens the possibility of a user changing ownership of a file to another user
and being unable to retrieve the file without intervention from the user or
the system administrator.
- Commitment Level
-
Obsolete
dnlc_dir_enable
- Description
-
Enables large directory caching
Note –
This parameter has no effect on NFS or ZFS file systems.
- Data Type
-
Unsigned integer
- Default
-
1 (enabled)
- Range
-
0 (disabled) or 1 (enabled)
- Dynamic?
-
Yes, but do not change this tunable dynamically. You can enable
this parameter if it was originally disabled. Or, you can disable this parameter
if it was originally enabled. However, enabling, disabling, and then enabling
this parameter might lead to stale directory caches.
- Validation
-
None
- When to Change
-
Directory caching has no known problems. However, if problems
occur, then set dnlc_dir_enable to 0 to disable caching.
- Commitment Level
-
Unstable
dnlc_dir_min_size
- Description
-
Specifies the minimum number of entries cached for one directory.
Note –
This parameter has no effect on NFS or ZFS file systems.
- Data Type
-
Unsigned integer
- Default
-
40
- Range
-
0 to MAXUINT (no maximum)
- Units
-
Entries
- Dynamic?
-
Yes, this parameter can be changed at any time.
- Validation
-
None
- When to Change
-
If performance problems occur with caching small directories,
then increase dnlc_dir_min_size. Note that individual file
systems might have their own range limits for caching directories. For instance,
UFS limits directories to a minimum of ufs_min_dir_cache bytes
(approximately 1024 entries), assuming 16 bytes per entry.
- Commitment Level
-
Unstable
dnlc_dir_max_size
- Description
-
Specifies the maximum number of entries cached for one directory.
Note –
This parameter has no effect on NFS or ZFS file systems.
- Data Type
-
Unsigned integer
- Default
-
MAXUINT (no maximum)
- Range
-
0 to MAXUINT
- Dynamic?
-
Yes, this parameter can be changed at any time.
- Validation
-
None
- When to Change
-
If performance problems occur with large directories, then
decrease dnlc_dir_max_size.
- Commitment Level
-
Unstable
segmap_percent
- Description
-
Defines the maximum amount of memory that is used for the fast-access
file system cache. This pool of memory is subtracted from the free memory
list.
- Data Type
-
Unsigned integer
- Default
-
12 percent of free memory at system startup time
- Range
-
2 Mbytes to 100 percent of physmem
- Units
-
% of physical memory
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
If heavy file system activity is expected, and sufficient
free memory is available, you should increase the value of this parameter.
- Commitment Level
-
Unstable
UFS Parameters
bufhwm and bufhwm_pct
- Description
-
Defines the maximum amount of memory for caching I/O buffers.
The buffers are used for writing file system metadata (superblocks, inodes,
indirect blocks, and directories). Buffers are allocated as needed until the
amount of memory (in Kbytes) to be allocated exceeds bufhwm.
At this point, metadata is purged from the buffer cache until enough buffers
are reclaimed to satisfy the request.
For historical reasons, bufhwm does not require the ufs: prefix.
- Data Type
-
Signed integer
- Default
-
2 percent of physical memory
- Range
-
80 Kbytes to 20 percent of physical memory, or 2 TB, whichever
is less. Consequently, bufhwm_pct can be between 1 and
20.
- Units
-
bufhwm: Kbytes
bufhwm_pct: percent of physical memory
- Dynamic?
-
No. bufhwm and bufhwm_pct are
only evaluated at system initialization to compute hash bucket sizes. The
limit in bytes calculated from these parameters is then stored in a data structure
that adjusts this value as buffers are allocated and deallocated.
Attempting to adjust this value without following the locking protocol
on a running system can lead to incorrect operation.
Modifying bufhwm or bufhwm_pct at
runtime has no effect.
- Validation
-
If bufhwm is less than its lower limit
of 80 Kbytes or greater than its upper limit (the lesser of 20 percent of
physical memory, 2 TB, or one quarter (1/4) of the maximum amount of kernel
heap), it is reset to the upper limit. The following message appears on the
system console and in the /var/adm/messages file if an
invalid value is attempted:
"binit: bufhwm (value attempted) out of range
(range start..range end). Using N as default."
|
“Value attempted” refers to the value specified in the /etc/system file or
by using a kernel debugger. N is the value computed
by the system based on available system memory.
Likewise, if bufhwm_pct is set to a value that is
outside the allowed range of 1 percent to 20 percent, it is reset to the default
of 2 percent. And, the following message appears on the system console and
in the /var/adm/messages file:
"binit: bufhwm_pct(value attempted) out of range(0..20).
Using 2 as default."
|
If both bufhwm and bufhwm_pct are
set to non-zero values, bufhwm takes precedence.
- When to Change
-
Because buffers are only allocated as they are needed, the
overhead from the default setting is the required allocation of control structures
for the buffer hash headers. These structures consume 52 bytes per potential
buffer on a 32-bit kernel and 96 bytes per potential buffer on a 64-bit kernel.
On a 512-Mbyte system running a 64-bit kernel, the number of hash chains calculates to
10316 / 32 == 322, which is rounded up to the next power of 2, 512. Therefore, the
hash headers consume 512 x 96 bytes, or 48 Kbytes. The hash header allocations
assume that buffers are 32 Kbytes.
The amount of memory that has not been allocated in the buffer pool
can be found by looking at the bfreelist structure in the
kernel with a kernel debugger. The field of interest in the structure is b_bufsize, which is the possible remaining memory in bytes. To examine
it with the buf macro, use the mdb command:
# mdb -k
Loading modules: [ unix krtld genunix ip nfs ipc ]
> bfreelist::print "struct buf" b_bufsize
b_bufsize = 0x225800
The default value for bufhwm on this system, with
6 Gbytes of memory, is 122277. You cannot determine the number of header structures
used because the actual buffer size requested is usually larger than 1 Kbyte.
However, some space might be profitably reclaimed from control structure allocation
for this system.
The same structure on a 512-Mbyte system shows that only 4 Kbytes of
10144 Kbytes has not been allocated. Examining the biostats kstat
with kstat -n biostats shows that the system also had
a reasonable ratio of buffer_cache_hits to buffer_cache_lookups. As such, the default setting is reasonable
for that system.
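As a rough illustration of the ratio mentioned above, the following sketch computes a hit ratio from the two biostats counters. The function name and the sample counter values are hypothetical, and the text does not define a threshold for a "reasonable" ratio.

```python
def buffer_cache_hit_ratio(buffer_cache_hits, buffer_cache_lookups):
    """Return hits / lookups, or None if no lookups have been recorded."""
    if buffer_cache_lookups == 0:
        return None
    return buffer_cache_hits / buffer_cache_lookups


# Hypothetical counter values, for illustration only.
ratio = buffer_cache_hit_ratio(950, 1000)
print(ratio)
```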
- Commitment Level
-
Unstable
- Change History
-
For information, see bufhwm (Solaris 9 Releases).
ndquot
- Description
-
Defines the number of quota structures for the UFS file system
that should be allocated. Relevant only if quotas are enabled on one or more
UFS file systems. For historical reasons, the ufs: prefix
is not needed.
- Data Type
-
Signed integer
- Default
-
((maxusers x 40) / 4) + max_nprocs
- Range
-
0 to MAXINT
- Units
-
Quota structures
- Dynamic?
-
No
- Validation
-
None. Excessively large values hang the system.
- When to Change
-
When the default number of quota structures is not enough.
This situation is indicated by the following message displayed on the console
or written in the message log:
- Commitment Level
-
Unstable
ufs_ninode
- Description
-
Specifies the number of inodes to be held in memory. Inodes are
cached globally for UFS, not on a per-file system basis.
A key parameter in this situation is ufs_ninode.
This parameter is used to compute two key limits that affect the handling
of inode caching. A high watermark of ufs_ninode / 2 and
a low watermark of ufs_ninode / 4 are computed.
When the system is done with an inode, one of two things can happen:
-
The file referred to by the inode is no longer on the system
so the inode is deleted. After it is deleted, the space goes back into the
inode cache for use by another inode (which is read from disk or created for
a new file).
-
The file still exists but is no longer referenced by a running
process. The inode is then placed on the idle queue. Any referenced pages
are still in memory.
When inodes are idled, the kernel defers the idling process to a later
time. If a file system is a logging file system, the kernel also defers deletion
of inodes. Two kernel threads handle this deferred processing. Each thread
is responsible for one of the queues.
When the deferred processing is done, the system drops the inode onto
either a delete queue or an idle queue, each of which has a thread that can
run to process it. When the inode is placed on the queue, the queue occupancy
is checked against the low watermark. If the queue occupancy exceeds the low
watermark, the thread associated with the queue is awakened. After the thread
is awakened, it runs through the queue and forces any pages associated
with the inode out to disk and frees the inode. The thread stops when it has
removed 50 percent of the inodes on the queue at the time it was awakened.
A second mechanism is in place if the idle thread is unable to keep
up with the load. When the system needs to find a vnode, it goes through the ufs_vget routine. The first thing vget does
is check the length of the idle queue. If the length is above the high watermark,
then it takes two inodes off the idle queue and “idles” them (flushes
pages and frees inodes). vget does this before it
gets an inode for its own use.
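The watermark rules described above can be summarized in a small sketch. This is an illustration of the description only, not kernel code: the function names are hypothetical, and the use of integer division is an assumption.

```python
def inode_watermarks(ufs_ninode):
    """Return (low, high) watermarks: ufs_ninode / 4 and ufs_ninode / 2."""
    return ufs_ninode // 4, ufs_ninode // 2


def thread_should_wake(queue_len, ufs_ninode):
    """The queue's thread is awakened when occupancy exceeds the low watermark."""
    low, _ = inode_watermarks(ufs_ninode)
    return queue_len > low


def thread_drain_target(queue_len_at_wakeup):
    """The thread stops after removing 50 percent of the inodes that were
    on the queue when it was awakened."""
    return queue_len_at_wakeup // 2


def vget_inodes_to_idle(idle_queue_len, ufs_ninode):
    """vget idles two inodes itself when the idle queue is above the high
    watermark; otherwise it idles none."""
    _, high = inode_watermarks(ufs_ninode)
    return 2 if idle_queue_len > high else 0


print(inode_watermarks(1000))           # (250, 500)
print(thread_should_wake(251, 1000))    # True
print(vget_inodes_to_idle(501, 1000))   # 2
```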
The system does attempt to optimize by placing inodes with no in-core
pages at the head of the idle list and inodes with pages at the end of the
idle list. However, the system does no other ordering of the list. Inodes
are always removed from the front of the idle queue.
The only time that inodes are removed from the queues as a whole is
when a synchronization, unmount, or remount occurs.
For historical reasons, this parameter does not require the ufs: prefix.
- Data Type
-
Signed integer
- Default
-
ncsize
- Range
-
0 to MAXINT
- Units
-
Inodes
- Dynamic?
-
Yes
- Validation
-
If ufs_ninode is less than or equal to
zero, the value is set to ncsize.
- When to Change
-
When the default number of inodes is not enough. If the maxsize reached field as reported by kstat -n inode_cache is
larger than the maxsize field in the kstat,
the value of ufs_ninode might be too small. Excessive inode
idling can also be a problem.
You can identify excessive inode idling by using kstat -n inode_cache to look at the inode_cache kstat. Thread idles are inodes idled by the background threads, while vget idles are inodes idled by the requesting process before it uses an inode.
- Commitment Level
-
Unstable
ufs_WRITES
- Description
-
If ufs_WRITES is non-zero, the number of bytes
outstanding for writes on a file is checked. See ufs_HW to
determine whether the write should be issued or deferred until only ufs_LW bytes are outstanding. The total number of bytes outstanding is
tracked on a per-file basis so that if the limit is passed for one file, it
won't affect writes to other files.
- Data Type
-
Signed integer
- Default
-
1 (enabled)
- Range
-
0 (disabled) or 1 (enabled)
- Units
-
Toggle (on/off)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When you want UFS write throttling turned off entirely. If
sufficient I/O capacity does not exist, disabling this parameter can result
in long service queues for disks.
- Commitment Level
-
Unstable
ufs_LW and ufs_HW
- Description
-
ufs_HW specifies the barrier value for the number of bytes outstanding on a
single file. If the number of bytes outstanding is greater than this value
and ufs_WRITES is set, then the write is deferred. The
write is deferred by putting the thread issuing the write to sleep on a condition
variable.
ufs_LW is the barrier for the number of bytes outstanding
on a single file below which the condition variable, on which other processes
are sleeping, is toggled. When a write completes and the number of bytes outstanding
is less than ufs_LW, the condition variable is toggled, which
causes all threads waiting on the variable to awaken and try to issue their
writes.
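The interaction between the two barriers can be sketched as a per-file counter. This is a simplified model of the behavior described above, not the UFS implementation: the class and method names are hypothetical, sleeping threads are represented by a count, and the default barrier values are the ones listed for these parameters.

```python
class FileWriteThrottle:
    """Simplified per-file model of UFS write throttling."""

    def __init__(self, ufs_hw=16 * 1024 * 1024, ufs_lw=8 * 1024 * 1024,
                 ufs_writes=1):
        self.ufs_hw = ufs_hw
        self.ufs_lw = ufs_lw
        self.ufs_writes = ufs_writes
        self.outstanding = 0   # bytes outstanding on this file
        self.sleeping = 0      # writers waiting on the condition variable

    def try_issue(self, nbytes):
        """Return True if the write is issued, False if it is deferred."""
        if self.ufs_writes and self.outstanding > self.ufs_hw:
            self.sleeping += 1   # writer sleeps on the condition variable
            return False
        self.outstanding += nbytes
        return True

    def complete(self, nbytes):
        """Record a completed write; return how many sleepers are awakened."""
        self.outstanding -= nbytes
        if self.outstanding < self.ufs_lw and self.sleeping:
            woken, self.sleeping = self.sleeping, 0
            return woken         # all waiters wake and retry their writes
        return 0


throttle = FileWriteThrottle(ufs_hw=100, ufs_lw=50)
print(throttle.try_issue(120))  # True: nothing was outstanding yet
print(throttle.try_issue(10))   # False: 120 bytes outstanding exceeds ufs_HW
print(throttle.complete(80))    # 1: 40 bytes outstanding is below ufs_LW
```

In this model, setting ufs_writes to 0 makes try_issue always succeed, which corresponds to turning throttling off.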
- Data Type
-
Signed integer
- Default
-
8 x 1024 x 1024 for ufs_LW and 16 x 1024
x 1024 for ufs_HW
- Range
-
0 to MAXINT
- Units
-
Bytes
- Dynamic?
-
Yes
- Validation
-
None
- Implicit
-
ufs_LW and ufs_HW have
meaning only if ufs_WRITES is not equal to zero. ufs_HW and ufs_LW should be changed together to avoid
needless churning when processes awaken and find that either they cannot issue
a write (when ufs_LW and ufs_HW are
too close) or they might have waited longer than necessary (when ufs_LW and ufs_HW are too far apart).
- When to Change
-
Consider changing these values when file systems consist of
striped volumes. The aggregate bandwidth available can easily exceed the current
value of ufs_HW. Unfortunately, this parameter is not a
per-file system setting.
You might also consider changing this parameter when ufs_throttles is a non-trivial number. Currently, ufs_throttles can
only be accessed with a kernel debugger.
- Commitment Level
-
Unstable
freebehind
- Description
-
Enables the freebehind algorithm. When this
algorithm is enabled, the system bypasses the file system cache on newly read
blocks when sequential I/O is detected during times of heavy memory use.
- Data Type
-
Boolean
- Default
-
1 (enabled)
- Range
-
0 (disabled) or 1 (enabled)
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
The freebehind algorithm can occur too
easily. If no significant sequential file system activity is expected, disabling freebehind makes sure that all files, no matter how large, will
be candidates for retention in the file system page cache. For more fine-grained
tuning, see smallfile.
- Commitment Level
-
Unstable
smallfile
- Description
-
Determines the size threshold above which files
are candidates for no cache retention under the freebehind algorithm.
Large memory systems contain enough memory to cache thousands of 10-Mbyte
files without making severe memory demands. However, this situation is highly
application dependent.
The goal of the smallfile and freebehind parameters
is to reuse cached information, without causing memory shortfalls by caching
too much.
- Data Type
-
Signed integer
- Default
-
32,768
- Range
-
0 to 2,147,483,647
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
Increase smallfile if an application does
sequential reads on medium-sized files and can most likely benefit from buffering,
and the system is not otherwise under pressure for free memory. Medium-sized
files are 32 Kbytes to 2 Gbytes in size.
- Commitment Level
-
Unstable
TMPFS Parameters
tmpfs:tmpfs_maxkmem
- Description
-
Defines the maximum amount of kernel memory that TMPFS can use
for its data structures (tmpnodes and directory entries).
- Data Type
-
Unsigned long
- Default
-
One page or 4 percent of physical memory, whichever is greater.
- Range
-
Number of bytes in one page (8192 for sun4u or sun4v systems,
4096 for all other systems) to 25 percent of the available kernel memory at
the time TMPFS was first used.
- Units
-
Bytes
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
Increase if the following message is displayed on the console
or written in the messages file:
tmp_memalloc: tmpfs over memory limit
The current amount of memory used by TMPFS for its data structures is
held in the tmp_kmemspace field. This field can be examined
with a kernel debugger.
- Commitment Level
-
Unstable
- Change History
-
For information, see tmpfs:tmpfs_maxkmem (Solaris 10 Releases).
tmpfs:tmpfs_minfree
- Description
-
Defines the minimum amount of swap space that TMPFS leaves for
the rest of the system.
- Data Type
-
Signed long
- Default
-
256
- Range
-
0 to maximum swap space size
- Units
-
Pages
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
To maintain a reasonable amount of swap space on systems
with large amounts of TMPFS usage, you can increase this number. The limit
has been reached when the console or messages file displays the following
message:
fs-name: File system full, swap space limit exceeded
- Commitment Level
-
Unstable
- Change History
-
For information, see tmpfs:tmpfs_minfree (Solaris 8 Releases).
Pseudo Terminals
Pseudo terminals, ptys, are used for two purposes
in Solaris software:
-
Supporting remote logins by using the telnet, rlogin, or rsh commands
-
Providing the interface through which the X Window system
creates command interpreter windows
The default number of pseudo-terminals is sufficient for a desktop workstation.
So, tuning focuses on the number of ptys available for
remote logins.
Previous versions of Solaris required that steps be taken to explicitly
configure the system for the preferred number of ptys.
Starting with the Solaris 8 release, a new mechanism removes the necessity
for tuning in most cases. The default number of ptys is
now based on the amount of memory on the system. This default should be changed
only to restrict or increase the number of users who can log in to the system.
Three related variables are used in the configuration process:
-
pt_cnt – Default maximum number of ptys.
-
pt_pctofmem – Percentage of kernel
memory that can be dedicated to pty support structures.
A value of zero means that no remote users can log in to the system.
-
pt_max_pty – Hard maximum for number
of ptys.
pt_cnt has a default value of zero, which tells the
system to limit logins based on the amount of memory specified in pt_pctofmem, unless pt_max_pty is set. If pt_cnt is
non-zero, ptys are allocated until this limit is reached.
When that threshold is crossed, the system looks at pt_max_pty.
If pt_max_pty has a non-zero value, it is compared to pt_cnt. The pty allocation is allowed if pt_cnt is less than pt_max_pty. If pt_max_pty is
zero, pt_cnt is compared to the number of ptys
supported based on pt_pctofmem. If pt_cnt is
less than this value, the pty allocation is allowed. Note
that the limit based on pt_pctofmem only comes into play
if both pt_cnt and ptms_ptymax have
default values of zero.
To put a hard limit on ptys that is different than
the maximum derived from pt_pctofmem, set pt_cnt and ptms_ptymax in /etc/system to the preferred
number of ptys. The setting of pt_pctofmem is
not relevant in this case.
To dedicate a different percentage of system memory to pty support
and let the operating system manage the explicit limits, do the following:
-
Do not set pt_cnt or ptms_ptymax in /etc/system.
-
Set pt_pctofmem in /etc/system to
the preferred percentage. For example, set pt_pctofmem=10 for
a 10 percent setting.
Note that the memory is not actually allocated until it is used in support
of a pty. Once memory is allocated, it remains allocated.
pt_cnt
- Description
-
The number of available /dev/pts entries
is dynamic up to a limit determined by the amount of physical memory available
on the system. pt_cnt is one of three variables that determine
the minimum number of logins that the system can accommodate. The default
maximum number of /dev/pts devices the system can support
is determined at boot time by computing the number of pty structures
that can fit in a percentage of system memory (see pt_pctofmem).
If pt_cnt is zero, the system allocates up to that maximum.
If pt_cnt is non-zero, the system allocates to the greater
of pt_cnt and the default maximum.
- Data Type
-
Unsigned integer
- Default
-
0
- Range
-
0 to maxpid
- Units
-
Logins/windows
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
When you want to explicitly control the number of users who
can remotely log in to the system.
- Commitment Level
-
Unstable
pt_pctofmem
- Description
-
Specifies the maximum percentage of physical memory that can be
consumed by data structures to support /dev/pts entries.
A system running a 64-bit kernel consumes 176 bytes per /dev/pts entry.
A system running a 32-bit kernel consumes 112 bytes per /dev/pts entry.
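As a rough guide to what a given percentage means, the following sketch estimates how many /dev/pts entries fit in pt_pctofmem percent of physical memory, using the per-entry sizes given above. The function name is hypothetical, and the kernel's actual computation might apply additional rounding or limits, so treat the result as an estimate.

```python
def estimated_max_ptys(physmem_bytes, pt_pctofmem=5, bytes_per_entry=176):
    """Estimate the number of pty structures that fit in the memory budget.

    bytes_per_entry: 176 for a 64-bit kernel, 112 for a 32-bit kernel.
    """
    budget_bytes = physmem_bytes * pt_pctofmem // 100
    return budget_bytes // bytes_per_entry


one_gbyte = 1024 * 1024 * 1024
print(estimated_max_ptys(one_gbyte))                  # default 5 percent
print(estimated_max_ptys(one_gbyte, pt_pctofmem=10))  # 10 percent setting
print(estimated_max_ptys(one_gbyte, pt_pctofmem=0))   # 0: no remote logins
```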
- Data Type
-
Unsigned integer
- Default
-
5
- Range
-
0 to 100
- Units
-
Percentage
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
When you want to either restrict or increase the number of
users who can log in to the system. A value of zero means that no remote users
can log in to the system.
- Commitment Level
-
Unstable
pt_max_pty
- Description
-
Defines the maximum number of ptys the system
offers.
- Data Type
-
Unsigned integer
- Default
-
0 (Uses system-defined maximum)
- Range
-
0 to MAXUINT
- Units
-
Logins/windows
- Dynamic?
-
Yes
- Validation
-
None
- Implicit
-
Should be greater than or equal to pt_cnt.
Value is not checked until the number of ptys allocated
exceeds the value of pt_cnt.
- When to Change
-
When you want to place an absolute ceiling on the number of
logins supported, even if the system could handle more based on its current
configuration values.
- Commitment Level
-
Unstable
STREAMS Parameters
nstrpush
- Description
-
Specifies the number of modules that can be inserted into (pushed
onto) a STREAM.
- Data Type
-
Signed integer
- Default
-
9
- Range
-
9 to 16
- Units
-
Modules
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
At the direction of your software vendor. No messages are
displayed when a STREAM exceeds its permitted push count. A value of EINVAL is returned to the program that attempted the push.
- Commitment Level
-
Unstable
strmsgsz
- Description
-
Specifies the maximum number of bytes that a single system call
can pass to a STREAM to be placed in the data part of a message. Any write exceeding this size is broken into multiple messages. For more
information, see write(2).
- Data Type
-
Signed integer
- Default
-
65,536
- Range
-
0 to 262,144
- Units
-
Bytes
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When putmsg calls return ERANGE.
For more information, see putmsg(2).
- Commitment Level
-
Unstable
strctlsz
- Description
-
Specifies the maximum number of bytes that a single system call
can pass to a STREAM to be placed in the control part of a message.
- Data Type
-
Signed integer
- Default
-
1024
- Range
-
0 to MAXINT
- Units
-
Bytes
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
At the direction of your software vendor. putmsg(2) calls return ERANGE if
they attempt to exceed this limit.
- Commitment Level
-
Unstable
System V Message Queues
System V message queues provide a message-passing interface that enables
the exchange of messages by queues created in the kernel. Interfaces are provided
in the Solaris environment to enqueue and dequeue messages. Messages can have
a type associated with them. Enqueueing places messages at the end of a queue.
Dequeuing removes the first message of a specific type from the queue or the
first message if no type is specified.
For information about System V message queues in the Solaris 10 release,
see System V IPC Configuration.
For detailed information on tuning these system resources, see Chapter 6, Resource
Controls (Overview), in System Administration
Guide: Solaris Containers-Resource Management and Solaris Zones.
For legacy information about the obsolete System V message queues, see Parameters That Are Obsolete or Have Been Removed.
System V Semaphores
System V semaphores provide counting semaphores in the Solaris OS. A semaphore is a counter used to provide access to a shared data
object for multiple processes. In addition to the standard set and release
operations for semaphores, System V semaphores can have values that are incremented
and decremented as needed (for example, to represent the number of resources
available). System V semaphores also provide the ability to do operations
on a group of semaphores simultaneously as well as to have the system undo
the last operation by a process if the process dies.
For information about the changes to semaphore resources in the Solaris
10 release, see System V IPC Configuration.
For detailed information about using the new resource controls in the
Solaris 10 release, see Chapter
6, Resource Controls (Overview), in System Administration Guide: Solaris Containers-Resource Management
and Solaris Zones.
For legacy information about the obsolete System V semaphore parameters,
see Parameters That Are Obsolete or Have Been Removed.
System V Shared Memory
System V shared memory allows the creation of a segment by a process.
Cooperating processes can attach to the memory segment (subject to access
permissions on the segment) and gain access to the data contained in the segment.
This capability is implemented as a loadable module. Entries in the /etc/system file must contain the shmsys: prefix. Starting
with the Solaris 7 release, the keyserv daemon uses System
V shared memory.
A special kind of shared memory known as intimate shared memory (ISM)
is used by DBMS vendors to maximize performance. When a shared memory segment
is made into an ISM segment, the memory for the segment is locked. This feature
enables a faster I/O path to be followed and improves memory usage. A number
of kernel resources describing the segment are then shared between all processes
that attach to the segment in ISM mode.
For information about the changes to shared memory resources in the
Solaris 10 release, see System V IPC Configuration.
For detailed information about using the new resource controls in the
Solaris 10 release, see Chapter
6, Resource Controls (Overview), in System Administration Guide: Solaris Containers-Resource Management
and Solaris Zones.
For legacy information about the obsolete System V shared memory parameters,
see Parameters That Are Obsolete or Have Been Removed.
segspt_minfree
- Description
-
Identifies pages of system memory that cannot be allocated for
ISM shared memory.
- Data Type
-
Unsigned long
- Default
-
5 percent of available system memory when the first ISM segment
is created
- Range
-
0 to 50 percent of physical memory
- Units
-
Pages
- Dynamic?
-
Yes
- Validation
-
None. Values that are too small can cause the system to hang
or performance to severely degrade when memory is consumed with ISM segments.
- When to Change
-
On database servers with large amounts of physical memory
using ISM, the value of this parameter can be decreased. If ISM segments are
not used, this parameter has no effect. A maximum value of 128 Mbytes (0x4000)
is almost certainly sufficient on large memory machines.
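Because this parameter is expressed in pages, the byte equivalent depends on the page size. The following sketch converts between the two, assuming the page sizes stated elsewhere in this chapter (8192 bytes for sun4u or sun4v systems, 4096 bytes for other systems). The function name is hypothetical.

```python
def pages_to_mbytes(pages, page_size_bytes):
    """Convert a page count to Mbytes for a given page size."""
    return pages * page_size_bytes / (1024 * 1024)


print(pages_to_mbytes(0x4000, 8192))  # 128.0 Mbytes with 8-Kbyte pages
print(pages_to_mbytes(0x4000, 4096))  # 64.0 Mbytes with 4-Kbyte pages
```

The 128-Mbyte figure quoted above therefore corresponds to 0x4000 pages only on systems with 8-Kbyte pages.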
- Commitment Level
-
Unstable
Scheduling
rechoose_interval
- Description
-
Specifies the number of clock ticks before a process is deemed
to have lost all affinity for the last CPU it ran on. After this interval
expires, any CPU is considered a candidate for scheduling a thread. This parameter
is relevant only for threads in the timesharing class. Real-time threads are
scheduled on the first available CPU.
- Data Type
-
Signed integer
- Default
-
3
- Range
-
0 to MAXINT
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
When caches are large, or when the system is running a critical
process or a set of processes that seem to suffer from excessive cache misses
not caused by data access patterns.
Consider using the processor set capabilities available as of the Solaris
2.6 release or processor binding before changing this parameter. For more
information, see psrset(1M) or pbind(1M).
- Commitment Level
-
Unstable
Timers
hires_tick
- Description
-
When set, this parameter causes the Solaris OS to use a system
clock rate of 1000 instead of the default value of 100.
- Data Type
-
Signed integer
- Default
-
0
- Range
-
0 (disabled) or 1 (enabled)
- Dynamic?
-
No. Causes new system timing variable to be set at boot time.
Not referenced after boot.
- Validation
-
None
- When to Change
-
When you want timeouts with a resolution of less than 10 milliseconds,
and greater than or equal to 1 millisecond.
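The resolution figures follow directly from the clock rate. A minimal sketch, with a hypothetical function name:

```python
def tick_resolution_ms(clock_rate_hz):
    """Return the length of one clock tick in milliseconds."""
    return 1000.0 / clock_rate_hz


print(tick_resolution_ms(100))   # 10.0 ms with the default clock rate
print(tick_resolution_ms(1000))  # 1.0 ms when hires_tick is set
```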
- Commitment Level
-
Unstable
timer_max
- Description
-
Specifies the number of POSIX™ timers available.
- Data Type
-
Signed integer
- Default
-
32
- Range
-
0 to MAXINT
- Dynamic?
-
No. Increasing the value can cause a system crash.
- Validation
-
None
- When to Change
-
When the default number of timers offered by the system is
inadequate. Applications receive an EAGAIN error when executing timer_create system calls.
- Commitment Level
-
Unstable
sun4u or sun4v Specific Parameters
consistent_coloring
- Description
-
Starting with the Solaris 2.6 release, the ability to use different
page placement policies on the UltraSPARC® (sun4u) platform was introduced. A page placement policy attempts
to allocate physical page addresses to maximize the use of the L2 cache. Whatever
algorithm is chosen as the default algorithm, that algorithm can potentially
provide less optimal results than another algorithm for a particular application
set. This parameter changes the placement algorithm selected for all processes
on the system.
Based on the size of the L2 cache, memory is divided into bins. The
page placement code allocates a page from a bin when a page fault first occurs
on an unmapped page. The page chosen depends on which of the three possible
algorithms is used:
-
Page coloring – Various bits of the virtual address
are used to determine the bin from which the page is selected. This is the
default algorithm in the Solaris 8 release. consistent_coloring is
set to zero to use this algorithm. No per-process history exists for this
algorithm.
-
Virtual addr=physical address – Consecutive pages in
the program select pages from consecutive bins. consistent_coloring is
set to 1 to use this algorithm. No per-process history exists for this algorithm.
-
Bin-hopping – Consecutive pages in the program generally
allocate pages from every other bin, but the algorithm occasionally skips
more bins. consistent_coloring is set to 2 to use this
algorithm. Each process starts at a randomly selected bin, and a per-process
memory of the last bin allocated is kept.
- Dynamic?
-
Yes
- Validation
-
None. Values larger than 2 cause a number of WARNING:
AS_2_BIN: bad consistent coloring value messages to appear on the
console. The system hangs immediately thereafter. A power-cycle is required
to recover.
- When to Change
-
When the primary workload of the system is a set of long-running
high-performance computing (HPC) applications. Changing this value might provide
better performance. File servers, database servers, and systems with a number
of active processes (for example, compile or time sharing servers) do not
benefit from changes.
- Commitment Level
-
Unstable
tsb_alloc_hiwater_factor
- Description
-
Initializes tsb_alloc_hiwater to impose an
upper limit on the amount of physical memory that can be allocated for translation
storage buffers (TSBs) as follows:
tsb_alloc_hiwater = physical memory (bytes) / tsb_alloc_hiwater_factor
When the memory that is allocated to TSBs is equal to the value of tsb_alloc_hiwater, the TSB memory allocation algorithm attempts
to reclaim TSB memory as pages are unmapped.
Exercise caution when using this factor to increase the value of tsb_alloc_hiwater. To prevent system hangs, the resulting high water value must be
considerably lower than the value of swapfs_minfree and segspt_minfree.
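The formula above can be evaluated for a given memory size as follows. The function name is hypothetical, and integer division is an assumption.

```python
def tsb_alloc_hiwater(physmem_bytes, tsb_alloc_hiwater_factor=32):
    """Return the TSB allocation high water mark in bytes."""
    if tsb_alloc_hiwater_factor < 1:
        raise ValueError("tsb_alloc_hiwater_factor must be at least 1")
    return physmem_bytes // tsb_alloc_hiwater_factor


gbyte = 1024 * 1024 * 1024
print(tsb_alloc_hiwater(32 * gbyte) // gbyte)      # 1 (Gbyte) with the default factor
print(tsb_alloc_hiwater(32 * gbyte, 16) // gbyte)  # 2 (Gbytes) with a factor of 16
```

A smaller factor raises the high water mark, which is why the caution above applies when decreasing the factor.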
- Data Type
-
Integer
- Default
-
32
- Range
-
1 to MAXINT
Note that a factor of 1 makes all physical memory available for allocation
to TSBs, which could cause the system to hang. A factor that is too high will
not leave memory available for allocation to TSBs, decreasing system performance.
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
Change the value of this parameter if the system has many
processes that attach to very large shared memory segments. Under most circumstances,
tuning of this variable is not necessary.
- Commitment Level
-
Unstable
default_tsb_size
- Description
-
Selects the size of the initial translation storage buffers (TSBs)
allocated to all processes.
- Data Type
-
Integer
- Default
-
Default is 0 (8 Kbytes), which corresponds to 512 entries
- Range
-
Possible values are:
| Value | Description |
| 0 | 8 Kbytes |
| 1 | 16 Kbytes |
| 3 | 32 Kbytes |
| 4 | 128 Kbytes |
| 5 | 256 Kbytes |
| 6 | 512 Kbytes |
| 7 | 1 Mbyte |
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
Generally, you do not need to change this value. However,
doing so might provide some advantages if the majority of processes on the
system have a larger than average working set, or if resident set size (RSS)
sizing is disabled.
- Commitment Level
-
Unstable
- Change History
-
For information, see default_tsb_size (Solaris 10 Releases).
enable_tsb_rss_sizing
- Description
-
Enables a resident set size (RSS) based TSB sizing heuristic.
- Data Type
-
Boolean
- Default
-
1 (TSBs can be resized)
- Range
-
0 (TSBs remain at default_tsb_size) or
1 (TSBs can be resized)
If set to 0, then tsb_rss_factor is ignored.
- Dynamic?
-
Yes
- Validation
-
Yes
- When to Change
-
Can be set to 0 to prevent growth of the TSBs. Under most
circumstances, this parameter should be left at the default setting.
- Commitment Level
-
Unstable
- Change History
-
For information, see enable_tsb_rss_sizing (Solaris 10 Releases).
tsb_rss_factor
- Description
-
Controls the RSS to TSB span ratio of the RSS sizing heuristic.
This factor divided by 512 yields the percentage of the TSB span which must
be resident in memory before the TSB is considered as a candidate for resizing.
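The relationship between the factor and the resize threshold can be checked with a one-line calculation. The function name is hypothetical.

```python
def tsb_resize_threshold_percent(tsb_rss_factor):
    """Return the percentage of the TSB span that must be resident
    before the TSB is considered for resizing."""
    if not 0 <= tsb_rss_factor <= 512:
        raise ValueError("tsb_rss_factor must be in the range 0 to 512")
    return tsb_rss_factor / 512 * 100


print(tsb_resize_threshold_percent(384))  # 75.0
print(tsb_resize_threshold_percent(256))  # 50.0
```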
- Data Type
-
Integer
- Default
-
384, resulting in a value of 75%. Thus, when the TSB is 3/4
full, its size will be increased. Note that some virtual addresses typically
map to the same slot in the TSB. Therefore, conflicts can occur before the
TSB is at 100% full.
- Range
-
0 to 512
- Dynamic?
-
Yes
- Validation
-
None
- When to Change
-
If the system is experiencing an excessive number of traps
due to TSB misses, for example, due to virtual address conflicts in the TSB,
you might consider decreasing this value toward 0.
For example, changing tsb_rss_factor to 256 (effectively,
50%) instead of 384 (effectively, 75%) might help eliminate virtual address
conflicts in the TSB in some cases, but will use more kernel memory, particularly
on a heavily loaded system.
TSB activity can be monitored with the trapstat -T command.
- Commitment Level
-
Unstable
- Change History
-
For information, see tsb_rss_factor (Solaris 10 Releases).
Locality Group Parameters
This section provides generic memory tunables, which apply to any SPARC
or x86 system that uses a Non-Uniform Memory Architecture (NUMA).
lpg_alloc_prefer
- Description
-
Controls a heuristic for allocation of large memory pages when
the requested page size is not immediately available in the local memory group,
but could be satisfied from a remote memory group.
By default, the Solaris OS allocates a remote large page if local free
memory is fragmented, but remote free memory is not. Setting this parameter
to 1 indicates that additional effort should be spent attempting to allocate
larger memory pages locally, potentially moving smaller pages around to coalesce
larger pages in the local memory group.
- Data Type
-
Boolean
- Default
-
0 (Prefer remote allocation if local free memory is fragmented
and remote free memory is not)
- Range
-
0 (Prefer remote allocation if local free memory is fragmented
and remote free memory is not)
1 (Prefer local allocation whenever possible, even if local free memory
is fragmented and remote free memory is not)
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
This parameter might be set to 1 if long-running programs
on the system tend to allocate memory that is accessed by a single program,
or if memory is accessed by a group of programs that are known to be running
in the same locality group (lgroup). In these circumstances, the extra cost
of page coalesce operations can be amortized over the long run of the programs.
This parameter might be left at the default value (0) if multiple programs
tend to share memory across different locality groups, or if pages tend to
be used for short periods of time. In these circumstances, quick allocation
of the requested size tends to be more important than allocation in a particular
location.
Page locations and sizes might be observed by using the NUMA
observability tools, available at http://hub.opensolaris.org/bin/view/Main/. TLB miss
activity might be observed by using the trapstat -T command.
- Commitment Level
-
Uncommitted
lgrp_mem_default_policy
- Description
-
This variable reflects the default memory allocation policy used
by the Solaris OS. This variable is an integer, and its value should correspond
to one of the policies listed in the sys/lgrp.h file.
- Data Type
-
Integer
- Default
-
1, LGRP_MEM_POLICY_NEXT indicating that
memory allocation defaults to the home lgroup of the thread performing the
memory allocation.
- Range
-
Possible values are:
| Value | Description | Comment |
| 0 | LGRP_MEM_POLICY_DEFAULT | use system default policy |
| 1 | LGRP_MEM_POLICY_NEXT | next to allocating thread's home lgroup |
| 2 | LGRP_MEM_POLICY_RANDOM_PROC | randomly across process |
| 3 | LGRP_MEM_POLICY_RANDOM_PSET | randomly across processor set |
| 4 | LGRP_MEM_POLICY_RANDOM | randomly across all lgroups |
| 5 | LGRP_MEM_POLICY_ROUNDROBIN | round robin across all lgroups |
| 6 | LGRP_MEM_POLICY_NEXT_CPU | near next CPU to touch memory |
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
For applications that are sensitive to memory latencies due
to allocations that occur from remote versus local memory on systems that
use NUMA.
- Commitment Level
-
Uncommitted
lgrp_mem_pset_aware
- Description
-
If a process is running within a user processor set, this variable
determines whether randomly placed memory for the process
is selected from among all the lgroups in the system or only from those lgroups
that are spanned by the processors in the processor set.
For more information about creating processor sets, see psrset(1M).
- Data Type
-
Boolean
- Default
-
0, the Solaris OS selects memory from all the lgroups in
the system
- Range
-
-
0, the Solaris OS selects memory from all the lgroups in the
system (default)
-
1, try selecting memory only from those lgroups that are spanned
by the processors in the processor set. If the first attempt fails, memory
can be allocated in any lgroup.
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
Setting this variable to 1 might lead to more
reproducible performance when processor sets are used to isolate applications
from one another.
- Commitment Level
-
Uncommitted
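The two settings of lgrp_mem_pset_aware can be illustrated with a simplified model, written here as a Python sketch. This is not kernel code: the function name place_random_page, the has_free_memory callback, and the treatment of a "failed" first attempt as "no candidate lgroup has free memory" are all assumptions made for illustration.

```python
import random


def place_random_page(all_lgroups, pset_lgroups, pset_aware,
                      has_free_memory, rng=random):
    """Pick an lgroup for a randomly placed page (illustrative model only)."""
    # With lgrp_mem_pset_aware=1, the first attempt considers only the
    # lgroups spanned by the processors in the processor set.
    if pset_aware and pset_lgroups:
        first_choice = list(pset_lgroups)
    else:
        first_choice = list(all_lgroups)

    usable = [g for g in first_choice if has_free_memory(g)]
    if not usable:
        # If the first attempt fails, memory can come from any lgroup.
        usable = [g for g in all_lgroups if has_free_memory(g)]
    return rng.choice(usable) if usable else None


if __name__ == "__main__":
    # Hypothetical system: four lgroups, processor set spans lgroups 0 and 1,
    # and only lgroups 2 and 3 currently have free memory.
    free = {0: False, 1: False, 2: True, 3: True}
    chosen = place_random_page([0, 1, 2, 3], [0, 1], 1, lambda g: free[g])
    print(chosen)  # falls back to lgroup 2 or 3
```

In this model, a value of 1 keeps a process's randomly placed pages inside its processor set's lgroups whenever possible, which is the source of the more reproducible performance described above.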
Solaris Volume Manager Parameters
md_mirror:md_resync_bufsz
- Description
-
Sets the size of the buffer used for resynchronizing RAID 1 volumes
(mirrors), expressed as a number of 512-byte blocks. Larger values
can increase resynchronization speed.
- Data Type
-
Integer
- Default
-
The default value is 128.
Larger systems could use higher values to increase mirror resynchronization
speed.
- Range
-
128 to 2048
- Units
-
Blocks (512 bytes)
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
Change this parameter if you use Solaris Volume Manager RAID 1 volumes
(mirrors) and you want to increase the speed of mirror resynchronizations. Provided
that you have adequate memory for overall system performance, you can increase
this value without causing other performance problems.
If you need to increase the speed of mirror resynchronizations, increase
the value of this parameter incrementally (using 128-block increments) until
performance is satisfactory. On fairly large or new systems, a value of 2048
seems to be optimal. High values on older systems might hang the system.
- Commitment Level
-
Unstable
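Because md_mirror:md_resync_bufsz is expressed in 512-byte blocks, it can help to see the buffer size in bytes for each step of the incremental tuning described above. The following Python sketch does only that arithmetic. The function names are hypothetical, and the range and step values are taken from this entry.

```python
BLOCK_SIZE = 512      # bytes per block (the unit of md_resync_bufsz)
MIN_BLOCKS = 128      # default value and lower bound of the range
MAX_BLOCKS = 2048     # upper bound of the range
STEP_BLOCKS = 128     # suggested tuning increment


def bufsz_bytes(blocks: int) -> int:
    """Convert a md_resync_bufsz value in blocks to a size in bytes."""
    if not MIN_BLOCKS <= blocks <= MAX_BLOCKS:
        raise ValueError(f"value out of range: {blocks}")
    return blocks * BLOCK_SIZE


def candidate_values() -> list:
    """Values to try in order when tuning in 128-block increments."""
    return list(range(MIN_BLOCKS, MAX_BLOCKS + 1, STEP_BLOCKS))


if __name__ == "__main__":
    for blocks in candidate_values():
        print(f"{blocks:5d} blocks = {bufsz_bytes(blocks) // 1024:5d} Kbytes")
```

The default of 128 blocks corresponds to a 64-Kbyte buffer, and the maximum of 2048 blocks corresponds to a 1-Mbyte buffer.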
md:mirrored_root_flag
- Description
-
Overrides Solaris Volume Manager requirements for replica quorum
and forces Solaris Volume Manager to start if any valid state database replicas
are available.
By default, this parameter is disabled, which requires that a majority of all
replicas be available and synchronized before Solaris Volume Manager
starts.
- Data Type
-
Boolean
- Default
-
0 (disabled)
- Range
-
0 (disabled) or 1 (enabled)
- Dynamic?
-
No
- Validation
-
None
- When to Change
-
Use of this parameter is not supported.
Some Solaris Volume Manager users accept the risk of enabling
this parameter if all three of the following conditions apply:
-
Root (/) or other system-critical
file systems are mirrored
-
Only two disks or controllers are available
-
An unattended reboot of the system is required
If this parameter is enabled, the system might boot with a stale replica
that inaccurately represents the system state (including which mirror sides
are good or in Maintenance state). This situation could result in data corruption
or system corruption.
Change this parameter only if system availability is more important
than data consistency and integrity. Closely monitor the system for any failures.
You can mitigate the risk by keeping the number of failed, Maintenance, or
hot-swapped volumes as low as possible.
For more information about state database replicas, see Chapter 6, State Database (Overview), in Solaris Volume Manager Administration Guide.
- Commitment Level
-
Unstable
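The effect of md:mirrored_root_flag on the replica quorum check can be sketched as a simple decision function. The following Python model is an illustration only, not Solaris Volume Manager code. The function name svm_can_start is hypothetical, and "majority" is assumed to mean strictly more than half of all replicas.

```python
def svm_can_start(total_replicas: int, valid_replicas: int,
                  mirrored_root_flag: int = 0) -> bool:
    """Model of the replica quorum check (illustrative only)."""
    if not 0 <= valid_replicas <= total_replicas:
        raise ValueError("valid_replicas must be between 0 and total_replicas")
    if mirrored_root_flag:
        # Flag enabled: start if any valid replica is available.
        return valid_replicas >= 1
    # Default: a majority of all replicas must be available.
    return 2 * valid_replicas > total_replicas


if __name__ == "__main__":
    # Hypothetical two-disk system with two replicas on each disk.
    # One disk fails, so only 2 of the 4 replicas remain.
    print(svm_can_start(4, 2))                        # False
    print(svm_can_start(4, 2, mirrored_root_flag=1))  # True
```

The two-disk case shows why the flag is tempting for unattended reboots: losing either disk leaves exactly half of the replicas, which is not a majority. It also shows the risk, because the surviving replicas might be stale.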