Performance (Chapter 3)
- This chapter provides performance information that you can use to tune your application to make the best use of Sun hardware graphics accelerators. The first section provides general advice on how to optimize vertex processing performance for a variety of platforms. The subsequent sections provide specific techniques to ensure maximum performance on the Creator3D and Creator graphics accelerators.
General Tips on Vertex Processing
- To achieve the best vertex processing performance on all Sun platforms, follow these guidelines:
-
- Use vertex arrays or display list mode rather than immediate mode whenever rendering data repeatedly.
- Use consistent patterns of data types between glBegin(3gl) and glEnd(3gl). Consistent data types are described in the "Consistent Data Types" section below.
- If you must use immediate mode, try to include as many primitives of the same type as possible between one glBegin and the corresponding glEnd.
- If you use vertex arrays, try to stay in vertex array mode rather than switching between vertex array mode and immediate mode.
- These guidelines are discussed in the sections that follow.
Vertex Arrays
- Vertex array commands provide the best performance for vertex processing of big primitives because they avoid the function call overhead of passing one vertex, color, and normal at a time. Instead of calling an OpenGL command for each vertex, you can pre-specify arrays of vertices, colors, and normals, and use them to define a primitive or set of primitives of the same type with a single command. Interleaved vertex arrays may enable even faster performance, since the application passes the data packed in a single array.
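- As a rough sketch of the interleaved layout mentioned above, the following C helper packs separate color and position arrays into a single buffer. The function name pack_c3f_v3f and its parameters are hypothetical and not part of Solaris OpenGL; the r, g, b, x, y, z ordering is assumed to match the standard OpenGL 1.1 GL_C3F_V3F interleaved format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack separate color and position arrays into one
 * interleaved buffer laid out as r,g,b,x,y,z per vertex.
 * "out" must have room for vertex_count * 6 floats. */
void pack_c3f_v3f(const float *colors, const float *positions,
                  size_t vertex_count, float *out)
{
    assert(colors != NULL && positions != NULL && out != NULL);
    for (size_t i = 0; i < vertex_count; ++i) {
        for (size_t k = 0; k < 3; ++k) {
            out[i * 6 + k]     = colors[i * 3 + k];     /* color first   */
            out[i * 6 + 3 + k] = positions[i * 3 + k];  /* then position */
        }
    }
}
```

- A buffer packed this way could then be handed to glInterleavedArrays(GL_C3F_V3F, 0, out) and drawn with a single glDrawArrays call, assuming a current OpenGL context.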
Consistent Data Types
- For the Solaris OpenGL implementation on all Sun platforms, vertex processing is optimized if the application provides consistent, supported data types within a glBegin/glEnd pair. Data types are consistent when the commands between one vertex call, such as glVertex3fv, and the next vertex call include identical patterns of data types in the identical order. In other words, consistent data is data for which the pattern is the same for each vertex, except when glCallList or glEval* is included. For example, the following set of commands is consistent because the primitive is defined by the repeating set of calls glColor3fv(3gl); glVertex3fv(3gl).
-
-
glBegin(GL_LINES);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();
- As another example, the following set of commands is consistent because each vertex contains the same data: a color, a normal, and a vertex, in repeating order.
-
-
glBegin(GL_LINES);
glColor3f(...);
glNormal3f(...);
glVertex3f(...);
glColor3f(...);
glNormal3f(...);
glVertex3f(...);
glEnd();
-
Note - The *f versions of the calls may be used interchangeably with the *fv versions.
- Inconsistent data types do not follow a repeating, supported pattern. In the first example below, the data is inconsistent because the first vertex has a normal, but the second vertex doesn't. In the second example, the order is reversed in the second set of commands, although both vertices have a color and a normal.
-
-
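/* First example: only the first vertex has a normal. */
glBegin(GL_LINES);
glNormal3fv(...);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();

/* Second example: color and normal are issued in a different order for the second vertex. */
glBegin(GL_LINES);
glColor3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();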
- For general information on the vertex data that can be specified between glBegin(3gl) and glEnd(3gl) calls, see the glBegin(3gl) reference page.
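- The consistency rule described in this section can be sketched as a small check over a recorded call sequence. This is only an illustration: the function name and the one-letter encoding ('C' for a color call, 'N' for a normal call, 'V' for a vertex call) are hypothetical, and the check ignores both the glCallList/glEval* exception and the question of which patterns the library actually supports.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical check: returns 1 if every vertex in the sequence is preceded
 * by exactly the same calls, in the same order, as the first vertex. */
int is_consistent_pattern(const char *calls)
{
    assert(calls != NULL);
    const char *first_vertex = strchr(calls, 'V');
    if (first_vertex == NULL)
        return 0;                        /* no vertex call at all */

    size_t group_len = (size_t)(first_vertex - calls) + 1;
    size_t total = strlen(calls);
    if (total % group_len != 0)
        return 0;                        /* a group is longer or shorter */

    /* Compare every later group against the first one. */
    for (size_t off = group_len; off < total; off += group_len) {
        if (memcmp(calls, calls + off, group_len) != 0)
            return 0;
    }
    return 1;
}
```

- With this encoding, the consistent examples above correspond to "CVCVCV" and "CNVCNV", and the two inconsistent examples correspond to "NCVCV" and "CNVNCV".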
Low Batching
- Solaris OpenGL 1.1 performs best when given big primitives. If small primitives are sent to the library, it tries to batch them together, provided that the primitives are of the same primitive type, use the same consistent data pattern, and no attribute state changes occur between one glEnd call and the next glBegin call.
- For example, the following primitives will be batched together by the library.
-
-
glBegin(GL_TRIANGLES);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glEnd();
glBegin(GL_TRIANGLES);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glEnd();
- The following example shows that the primitives are not batched together because the glColor3fv call outside the glBegin call breaks the batching of the primitives.
-
-
glBegin(GL_LINES);
glVertex3fv(...);
glVertex3fv(...);
glEnd();
glColor3fv(...);
glBegin(GL_LINES);
glVertex3fv(...);
glVertex3fv(...);
glEnd();
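- Because a state change between glBegin/glEnd pairs breaks batching, one possible approach is to sort small primitives by attribute before drawing, so that each attribute is set only once per group. The sketch below counts how many color changes a list of line segments would need before and after sorting. The LineSegment type, the color_id field, and both function names are hypothetical, and the actual batching behavior of the library may depend on factors not modeled here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical line segment tagged with an application-defined color id. */
typedef struct {
    int   color_id;
    float a[3];
    float b[3];
} LineSegment;

static int compare_by_color(const void *lhs, const void *rhs)
{
    int x = ((const LineSegment *)lhs)->color_id;
    int y = ((const LineSegment *)rhs)->color_id;
    return (x > y) - (x < y);
}

/* Count runs of consecutive segments that share a color. Each run could be
 * drawn with one color call followed by one glBegin/glEnd pair. */
size_t count_color_runs(const LineSegment *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    if (n == 0)
        return 0;
    size_t runs = 1;
    for (size_t i = 1; i < n; ++i) {
        if (segs[i].color_id != segs[i - 1].color_id)
            ++runs;
    }
    return runs;
}

/* Reorder segments so that equal colors are adjacent. */
void sort_by_color(LineSegment *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    if (n > 1)
        qsort(segs, n, sizeof *segs, compare_by_color);
}
```

- Sorting is only appropriate when drawing order does not affect the result, for example when depth testing resolves visibility.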
Optimized Data Types
- On any platform that uses the software pipeline for model coordinate rendering, your application will get better performance if it can pass vertex data in patterns for which the software pipeline has optimized code. Optimized data patterns are consistent data patterns which contain none of the following:
-
- glEdgeFlag*()
- glMaterial*()
- glEvalCoord*()
- glCallList() or glCallLists()
- both glColor*() and glIndex*()
- both glTexCoord*() and glIndex*()
Creator3D Graphics and Creator Graphics Performance
- The Ultra Creator and Creator3D Graphics systems accelerate rasterization of lines, points, and triangles as well as most per-fragment operations. Vertex processing and texturing operations are performed on the UltraSPARC CPU. The Solaris OpenGL implementation for the Creator and Creator3D frame buffers uses all features of the Creator graphics subsystem.
- Rasterization and fragment processing are handled in one of three ways:
-
- Creator3D hardware rasterizer - Handles lines, points, and triangles, and does simple fragment processing.
- Optimized software rasterizer - UltraSPARC VIS (Visual Instruction Set) handles many texturing functions and pixel operations.
- Generic software rasterizer - Performs rasterization for all features not handled by the hardware or by the VIS software.
- To find out more about the Creator and Creator3D hardware platforms, refer to the Architecture Technical White paper at http://www.sun.com/desktop/products/Ultra2/.
- The following sections provide specific information on attribute use and pixel operations on these platforms.
Attributes Affecting Creator3D Performance
- Primitive-attribute settings affect performance; therefore, you will get a better level of performance if you can avoid setting the attributes listed below. In some cases, the listed attributes simply increase the amount of processing in the hardware or optimized software data paths. In other cases, setting these attributes forces the use of the software rasterizer, resulting in slow performance.
Attributes That Increase Vertex Processing Overhead
- Attributes that result in more vertex processing overhead include:
-
- Enabling lighting.
- Turning on user-specified clip planes (GL_CLIP_PLANE[i]).
- Enabling color material (GL_COLOR_MATERIAL).
- Enabling non-linear fog (glFog(GL_FOG_MODE, GL_EXP{2})). An exception to this is using RGBA mode on Creator3D Series 2.
- Enabling GL_NORMALIZE.
- Turning on polygon offset. However, polygon offset is optimized for the case when the factor parameter of the glPolygonOffset call is set to 0.0. Users may have to adjust the units parameter accordingly to avoid stitching for this case.
Primitive Types and Vertex Data Patterns That Increase Vertex Processing Overhead
- Types and patterns that result in more vertex processing overhead are:
-
- C4F_V3F: glColor4f(...); glVertex3f(...); ...
- V2F: glVertex2f(...); ...
- C3F_V2F: glColor3f(...); glVertex2f(...); ...
- C4F_V2F: glColor4f(...); glVertex2f(...); ...
- Using glDrawElements in immediate mode.
Attributes That Increase Hardware Rasterization Overhead
- Attributes that result in slower hardware rasterization are:
-
- Enabling line antialiasing (GL_LINE_SMOOTH)
- Enabling point antialiasing (GL_POINT_SMOOTH)
Attributes That Force the Use of the Software Rasterizer
- Setting the following attributes forces the use of the software rasterizer. This is the slowest data path. If your application requires any of the following attributes for performance-critical functionality, you may want to determine whether this performance is acceptable. If not, you can evaluate whether the use of these attributes is advisable.
-
- Rasterization attributes
-
- In Indexed color mode, enabling line anti-aliasing (GL_LINE_SMOOTH) or point anti-aliasing (GL_POINT_SMOOTH)
- Enabling polygon anti-aliasing (GL_POLYGON_SMOOTH)
- Stippled lines (GL_LINE_STIPPLE) where the line stipple scale factor is larger than 15
- Non-antialiased ("jaggy") points with glPointSize(3gl) greater than 1.0
-
Note - The only anti-aliased point size supported by Creator3D and Creator is 1.0. glPointSize is ignored for anti-aliased points. Although the nominal antialiased point size is 1.0, the actual visible size is approximately 1.5.
-
- Fragment Attributes
-
- Blending (GL_BLEND) forces the use of the software rasterizer unless both the source and destination blend functions are in the following set of blend functions supported by the hardware:
-
-
GL_ZERO
GL_ONE
GL_SRC_ALPHA
GL_ONE_MINUS_SRC_ALPHA
-
- Enabling the stencil test (GL_STENCIL_TEST)
On the UltraSPARC platform, a VIS optimized software rasterizer is used for smooth-shaded non-textured stenciled triangles whenever the glStencilOp parameter fail is anything other than GL_INCR or GL_DECR and the depth test does not affect the stencil buffer. (This is the case when depth test is disabled or the glStencilOp parameters zfail and zpass are identical).
- Enabling any type of fog in Indexed color mode
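- The rasterization and fragment attribute conditions listed above can be collected into a single check that an application might run on its own record of the current state. The following is a minimal sketch under stated assumptions: the RenderState structure, its field names, and the function names are hypothetical, the blend factor values are copied from the standard gl.h header, and texturing attributes are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Blend factor values as defined in the standard gl.h header; a real
 * program would include <GL/gl.h> instead of redefining them. */
#ifndef GL_ZERO
#define GL_ZERO                0
#define GL_ONE                 1
#define GL_SRC_ALPHA           0x0302
#define GL_ONE_MINUS_SRC_ALPHA 0x0303
#define GL_DST_ALPHA           0x0304
#endif

/* Hypothetical application-side record of the relevant OpenGL state. */
typedef struct {
    int rgba_mode;            /* 1 = RGBA, 0 = indexed color */
    int line_smooth, point_smooth, polygon_smooth;
    int line_stipple, line_stipple_factor;
    float point_size;
    int blend;
    unsigned int blend_src, blend_dst;
    int stencil_test;
    int fog;
} RenderState;

static int is_hw_blend_factor(unsigned int f)
{
    return f == GL_ZERO || f == GL_ONE ||
           f == GL_SRC_ALPHA || f == GL_ONE_MINUS_SRC_ALPHA;
}

/* Returns 1 if any listed condition would leave the hardware rasterizer. */
int creator3d_needs_software_rasterizer(const RenderState *s)
{
    assert(s != NULL);
    if (!s->rgba_mode && (s->line_smooth || s->point_smooth || s->fog))
        return 1;
    if (s->polygon_smooth)
        return 1;
    if (s->line_stipple && s->line_stipple_factor > 15)
        return 1;
    if (!s->point_smooth && s->point_size > 1.0f)
        return 1;
    if (s->blend && !(is_hw_blend_factor(s->blend_src) &&
                      is_hw_blend_factor(s->blend_dst)))
        return 1;
    if (s->stencil_test)
        return 1;
    return 0;
}
```

- A check of this kind could be called in debug builds before drawing performance-critical geometry, to flag state combinations that are worth reconsidering.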
-
Figure 3-1 shows the data path for hardware rasterization on the Creator3D system. Figure 3-2 illustrates the data path that the application uses when it sets an attribute that forces the use of the software rasterizer.

Figure 3-1
-
- Texturing Attributes
-
Attributes That Vary Optimized Texturing Speed
- The VIS optimized software rasterizer will vary in texturing speed based on the texturing attributes specified. The factors affecting texturing speed are listed below. Note that this is variance within the optimized path, not the difference between the optimized and generic paths.
-
- Projection Type -- The type of projection matrix. Orthographic is faster than perspective.
- Wrap Mode -- Best speed is when all dimensions (GL_TEXTURE_WRAP_x) are set to GL_REPEAT.
- Dimension -- In general, 2D texturing is faster than 3D texturing, since there is one less texture coordinate to deal with. However, this does not mean it is better to use many 2D textures to approximate 3D texturing since the texture load time (see next section) may significantly increase the overhead.
- Minfilter -- The fastest GL_TEXTURE_MIN_FILTER parameter is GL_NEAREST, which is approximately 4x the speed of GL_LINEAR. After that the approximate relative speed in decreasing order is: GL_LINEAR, GL_NEAREST_MIPMAP_NEAREST, GL_NEAREST_MIPMAP_LINEAR, GL_LINEAR_MIPMAP_NEAREST, and GL_LINEAR_MIPMAP_LINEAR.
- Magfilter -- For GL_TEXTURE_MAG_FILTER, the same speed ratio of 4x applies to GL_NEAREST vs. GL_LINEAR. Note, however, that GL_TEXTURE_MAG_FILTER is ignored when GL_TEXTURE_MIN_FILTER is set to GL_NEAREST or GL_LINEAR. This can be overridden with a shell environment variable but this will slow down texturing speed for GL_NEAREST and GL_LINEAR, since they now have to perform level-of-detail calculations to determine when to use GL_TEXTURE_MAG_FILTER. The shell environment variable that forces this slower behavior is:
-
-
setenv SUN_OGL_MAGFILTER "conformant"
-
- Interior Texture Coordinates -- Before a triangle is textured, the texture coordinates at the triangle's vertices are checked to determine whether they are all at least 1/2 texel away from the texture map edges, toward the inside of the texture. Triangles that pass this criterion are rendered faster than triangles whose texture coordinates touch or cross the texture map's edges. Because quads are broken up into two triangles before texturing, this applies to quads as well. It also applies to each primitive in a connected list such as a triangle strip or quad strip.
- Env Mode -- The fastest glTexEnv() GL_TEXTURE_ENV_MODE is GL_REPLACE, followed closely by GL_MODULATE. GL_DECAL is the same speed as GL_REPLACE.
- Color Table -- The use of the extension GL_TEXTURE_COLOR_TABLE_SGI will reduce texturing speed.
Attributes That Vary Texture Load Time
- The time to load the texture image into a texture object or a display list will vary depending on the pixel store and pixel transfer attributes specified when the texture is specified. The following recommendations should be followed where possible to reduce texture load time:
-
- If multiple textures are being used, put the textures in texture objects and use glBindTexture to switch among the textures. This enables the texture load operation to be performed only once.
- If for some reason texture objects cannot be used, then the next best thing is to put the texture into a display list, making sure to fully specify in the display list the scale and bias for glPixelTransfer that are used in the application. The intent is to not have the display list inherit any changes to its initial pixel transfer from the calling environment. This avoids reprocessing the texture image. Avoid calling glPixelMap and glColorTableSGI (with target GL_COLOR_TABLE_SGI) after creating the display list that contains the texture image. Avoid calling glPixelMap and glColorTableSGI (with target GL_COLOR_TABLE_SGI) inside the display list. Doing so will cause the texture image to be reprocessed for every glCallList.
- Avoid setting any of the glPixelTransfer parameters to anything other than their default values.
- For GL_RGBA textures, use the extension GL_ABGR_EXT to specify the texture format and GL_UNSIGNED_BYTE for the texture data type.
- For 3D textures, some combinations of base internal format and incoming texture image format are optimized as given in the table below. Note that these optimized cases are valid only for data type GL_UNSIGNED_BYTE.
-
Table 3-1
| Format | Base Internal Format |
| GL_LUMINANCE_ALPHA | GL_LUMINANCE_ALPHA |
| GL_RED | GL_INTENSITY |
| GL_RED | GL_LUMINANCE |
| GL_ALPHA | GL_ALPHA |
| GL_LUMINANCE | GL_INTENSITY |
| GL_LUMINANCE | GL_LUMINANCE |
| GL_ABGR_EXT | GL_RGBA |
Attributes Affecting Creator Performance
- This section applies when pure software rendering is being used. This happens on the single-buffered Creator platform when glDrawBuffer(3gl) is set to GL_BACK or GL_FRONT_AND_BACK. The data presented here is also valid for the SX, ZX, GX, GX+, TGX, TGX+, and TCX platforms. Note that for non-Ultra machines, VIS rasterization is replaced by an optimized software rasterizer.
Attributes That Increase Vertex Processing Overhead
- Attributes that result in more vertex processing overhead are:
-
- Enabling lighting.
- Turning on user specified clip planes (GL_CLIP_PLANE[i]).
- Enabling color material (GL_COLOR_MATERIAL).
- Enabling non-linear fog (glFog(GL_FOG_MODE, GL_EXP{2})). An exception to this is using RGBA mode on Creator3D Series 2.
- Enabling GL_NORMALIZE.
- Turning on polygon offset. However, polygon offset is optimized when the factor parameter of the glPolygonOffset call is set to 0.0. Users may have to adjust the units parameter accordingly to avoid stitching for this case.
Attributes That Force the Use of the Generic Software Rasterizer
- Setting the following attributes forces the use of the generic software rasterizer. This is the slowest data path. If your application requires any of the following attributes for performance critical functionality, you may want to determine whether this performance is acceptable. If not, you can evaluate whether the use of these attributes is advisable.
-
- Texturing Attributes
-
- All three-dimensional texturing attributes result in the use of the generic software rasterizer.
- Two-dimensional texture mapping (GL_TEXTURE_2D) in the following cases:
a. Texture environment mode glTexEnv GL_TEXTURE_ENV_MODE is set to GL_BLEND.
b. glTexEnv texture base internal format is GL_ALPHA.
c. Texturing of points is handled by the generic software.
d. Fog is enabled.
e. Any use of the SGI Texture Color Table (GL_SGI_texture_color_table) extension.
-
- Fragment Attributes
-
- Enabling any type of fog in Indexed color mode.
- Enabling blending (glBlendFunc(3gl)) except when the source blending factor is GL_SRC_ALPHA and the destination blending factor is GL_ONE_MINUS_SRC_ALPHA. This case is optimized.
- Enabling logical operations.
- Enabling depth test with glEnable(GL_DEPTH_TEST) forces the use of the optimized software rasterizer. While depth test is enabled, setting glDepthFunc(3gl) to any Z comparison other than GL_LESS or GL_LEQUAL forces the use of the generic software rasterizer.
- Enabling alpha test.
- Setting glDrawBuffer(3gl) to GL_BACK or GL_FRONT_AND_BACK, or setting glReadBuffer(3gl) to GL_BACK.
Index Mode
- When pure software rendering is being used, index mode rendering is handled by the generic software rasterizer. This includes any logic operation, blending, fog, stencil, alpha test, and the above-mentioned cases for Z comparison.

Figure 3-2
Pixel Operations
- Under optimal conditions, the commands glDrawPixels(3gl), glReadPixels(3gl), and glCopyPixels(3gl) are optimized on the Creator and Creator3D systems using the VIS instruction set on the UltraSPARC CPU. Bitmap operations using the command glBitmap(3gl) are accelerated in the Creator3D font registers. However, some attribute settings result in the use of the software rasterizer for pixel operations.
-
Figure 3-3 shows the rasterization and fragment processing architecture for glDrawPixels(3gl). The figure shows the optimized and unoptimized paths for pixel rendering. Your application will experience performance degradation for each functional box that it needs. In addition, performance degradation will occur if the data type is not unsigned byte; in this case, the data must be reformatted internally.

Figure 3-3
Conditions That Result in VIS Optimization on Creator3D Systems
- In general, for glDrawPixels, glCopyPixels, and glBitmap, the use of texture mapping or nonlinear fog (except in RGBA mode on Creator3D Series 2) forces the use of the generic software rasterizer, resulting in slow performance. In addition, if the hardware does not support the per-fragment operations that the application has enabled, the generic software rasterizer is used. See the OpenGL documentation or the "OpenGL Machine" diagram for a list of per-fragment operations.
- For the Creator3D system, if the following conditions are true, pixel operations are optimized. If these conditions are not true, the generic software rasterizer is used.
-
-
glDrawPixels Command
-
- Pixel format is GL_RGBA, GL_RGB, GL_ABGR_EXT, GL_RED, GL_GREEN, GL_BLUE, or GL_LUMINANCE.
- Data type is GL_UNSIGNED_BYTE.
- For the format of GL_DEPTH_COMPONENT, the types GL_INT, GL_UNSIGNED_INT, and GL_FLOAT are optimized.
- Texturing is disabled.
- Pixel unpacking is unnecessary.
- Pixel transfer, mapping, and zooming are in the default state.
- Fog mode is linear (Creator3D) or any fog mode (Creator3D Series 2).
- The fragment attributes are not those listed under "Fragment Attributes" in the Creator3D section above.
-
-
glReadPixels Command
-
- Pixel format is GL_RGBA, GL_RGB, GL_ABGR_EXT, GL_RED, GL_GREEN, GL_BLUE, GL_LUMINANCE, or GL_LUMINANCE_ALPHA.
- Data type is GL_UNSIGNED_BYTE.
- For the format of GL_DEPTH_COMPONENT, the types GL_INT, GL_UNSIGNED_INT, and GL_FLOAT are optimized.
- Pixel packing is unnecessary.
- Pixel transfer and mapping are in the default state.
-
-
glCopyPixels Command
-
- Pixel type is GL_COLOR.
- Texturing is disabled.
- Pixel transfer, mapping and zooming are in the default state.
- Fog mode is linear (Creator3D) or any fog mode (Creator3D Series 2).
- The fragment attributes are not those listed under "Fragment Attributes" in the Creator3D section above.
-
-
glBitmap(3gl) Command
-
- Texturing is not enabled.
- Blending is not enabled.
Conditions That Result in VIS Optimization on Creator Systems
- For the Creator and non-Creator SMCC frame buffers, if the following conditions are true, pixel operations are optimized. If these conditions are not true, the generic software rasterizer is used.
-
-
glDrawPixels Command
-
- Pixel format is GL_RGBA, GL_RGB or GL_ABGR_EXT.
- Data type is GL_UNSIGNED_BYTE.
- Texturing is disabled.
- Pixel unpacking is unnecessary.
- If depth test is enabled, the glDepthFunc(3gl) Z comparison is GL_LESS or GL_LEQUAL. Any other Z comparison results in the use of the generic software rasterizer.
-
-
glReadPixels Command
-
- If glReadPixels format is GL_RGBA, GL_RGB or GL_ABGR_EXT, the pixel type GL_UNSIGNED_BYTE is optimized.
- If glReadPixels format is GL_DEPTH_COMPONENT, then these pixel types are optimized: GL_INT, GL_UNSIGNED_INT, or GL_FLOAT.
- Pixel packing is unnecessary.
-
-
glCopyPixels Command
-
- Pixel type is GL_COLOR.
- Texturing is disabled.
- If depth test is enabled, the Z comparison is GL_LESS or GL_LEQUAL. Any other Z comparison results in the use of the generic software rasterizer.
-
-
glBitmap Command
-
-
- If depth test is enabled, the glDepthFunc Z comparison is GL_LESS or GL_LEQUAL. Any other Z comparison results in the use of the generic software rasterizer.
GX Performance
- GX performance is affected by attributes that force the use of the generic software rasterizer:
-
- Texturing Attributes
a. Only triangles are optimized. Texturing of points and lines is handled by the generic software.
b. Texture environment mode glTexEnv(3gl) GL_TEXTURE_ENV_MODE is GL_BLEND.
- Fragment Attributes
a. Stencil operations
b. Logic operations
c. Any blending operation
d. Linear or nonlinear fog
e. Enabling any Z comparison other than GL_LESS or GL_LEQUAL