- CHAPTER 3
Performance
- This chapter provides performance information that you can use to tune your application to make the best use of Sun hardware graphics accelerators. The first section provides general advice on how to optimize vertex processing performance for a variety of platforms. The subsequent sections provide specific techniques to ensure maximum performance on the Creator3D and Creator graphics accelerators.
General Tips on Vertex Processing
- To achieve the best vertex processing performance on all Sun platforms, follow these guidelines:
-
- Use vertex arrays or display list mode rather than immediate mode whenever rendering data repeatedly.
- Use consistent patterns of data types between glBegin(3gl) and glEnd(3gl). Consistent data types are described in "Consistent Data Types" on page 18.
- If you must use immediate mode, try to include as many primitives of the same type as possible between one glBegin and the corresponding glEnd.
- If you use vertex arrays, stay in vertex array mode rather than switching between vertex array and immediate mode.
- These guidelines are discussed in the sections that follow.
Vertex Arrays
- Vertex array commands provide the best performance for vertex processing of big primitives because they avoid the function call overhead of passing one vertex, color, and normal at a time. Instead of calling an OpenGL command for each vertex, you can pre-specify arrays of vertices, colors, and normals, and use them to define a primitive or set of primitives of the same type with a single command. Interleaved vertex arrays may enable even faster performance, since the application passes the data packed in a single array.
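- As a minimal sketch of the interleaved layout mentioned above, the following C helper packs separate color and position arrays into one array with the color first and the position second for each vertex. The helper name pack_c3f_v3f and the choice of a color-plus-position layout are assumptions made for illustration; they are not part of OpenGL for Solaris.

```c
#include <assert.h>
#include <stddef.h>

/* Floats per vertex in this layout: 3 color + 3 position. */
enum { C3F_V3F_STRIDE = 6 };

/*
 * Pack separate color and position arrays into one interleaved array.
 * colors and positions each hold 3 * count floats; out holds 6 * count.
 * pack_c3f_v3f is a hypothetical helper, not an OpenGL call.
 */
void pack_c3f_v3f(const float *colors, const float *positions,
                  size_t count, float *out)
{
    assert(colors != NULL && positions != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        float *dst = out + i * C3F_V3F_STRIDE;
        for (size_t k = 0; k < 3; k++) {
            dst[k]     = colors[3 * i + k];     /* color comes first */
            dst[3 + k] = positions[3 * i + k];  /* then the position */
        }
    }
}
```

- An array packed this way matches the GL_C3F_V3F format accepted by glInterleavedArrays, so it could then be drawn with a single glDrawArrays call.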
MultiDrawArrays
- OpenGL for Solaris contains the extension glMultiDrawArraysSUN(). This function allows multiple strips of primitives to be rendered with one call to OpenGL. Because of reduced function call and setup overhead, this function can provide a significant speedup when an object contains many short strips. For some implementations of this function, there may be additional performance gains if the strips are contiguous in the vertex array. As with the standard glDrawArrays(), using interleaved vertex arrays gives even better performance.
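- The sketch below shows one way an application might prepare per-strip start indices and vertex counts for strips stored back to back in one vertex array, and check that they are contiguous. Both helper names are hypothetical. This chapter does not give the parameter list of glMultiDrawArraysSUN(), so consult its reference page before passing such arrays to it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill first[] and count[] for strips stored back to back in one vertex
 * array. lengths[i] is the vertex count of strip i.
 * Returns the total number of vertices covered.
 * build_strip_ranges is a hypothetical helper, not an OpenGL call.
 */
int build_strip_ranges(const int *lengths, size_t nstrips,
                       int *first, int *count)
{
    int next = 0;
    assert(lengths != NULL && first != NULL && count != NULL);
    for (size_t i = 0; i < nstrips; i++) {
        first[i] = next;
        count[i] = lengths[i];
        next += lengths[i];
    }
    return next;
}

/* Return 1 when every strip starts where the previous one ended. */
int strips_are_contiguous(const int *first, const int *count, size_t nstrips)
{
    assert(first != NULL && count != NULL);
    for (size_t i = 1; i < nstrips; i++) {
        if (first[i] != first[i - 1] + count[i - 1])
            return 0;
    }
    return 1;
}
```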
Consistent Data Types
- For the OpenGL for Solaris implementation on all Sun platforms, vertex processing is optimized if the application provides consistent, supported data types within a glBegin/glEnd pair. Data types are consistent when the commands between one vertex call, such as glVertex3fv, and the next vertex call include identical patterns of data types in the identical order. In other words, consistent data is data for which the pattern is the same for each vertex, except when glCallList or glEval* is included. For example, the following set of commands is consistent because the primitive is defined by the repeating set of calls glColor3fv(3gl); glVertex3fv(3gl).
-
-
glBegin(GL_LINES);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();
- As another example, the following set of commands is consistent since each vertex contains the same data: a color, normal, and vertex in repeating order.
-
-
glBegin(GL_LINES);
glColor3f(...);
glNormal3f(...);
glVertex3f(...);
glColor3f(...);
glNormal3f(...);
glVertex3f(...);
glEnd();
-
Note - The *f versions of the calls may be used interchangeably with the *fv versions.
- Inconsistent data types do not follow a repeating, supported pattern. In the first example below, the data is inconsistent because the first vertex has a normal, but the second vertex doesn't. In the second example, the order is reversed in the second set of commands, although both vertices have a color and a normal.
-
-
glBegin(GL_LINES);
glNormal3fv(...);
glColor3fv(...);
glVertex3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();
-
-
glBegin(GL_LINES);
glColor3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glColor3fv(...);
glVertex3fv(...);
glEnd();
- For general information on the vertex data that can be specified between glBegin(3gl) and glEnd(3gl) calls, see the glBegin(3gl) reference page.
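- The consistency rule described in this section can be sketched as a small check over a sequence of calls. The one-letter encoding ('C' for glColor*, 'N' for glNormal*, 'V' for glVertex*) and the function name are assumptions made for illustration. The sketch does not model the glCallList or glEval* exception, and it does not check whether a pattern is one the library supports.

```c
#include <assert.h>
#include <string.h>

/*
 * Each character stands for one call between glBegin and glEnd:
 * 'C' = glColor*, 'N' = glNormal*, 'V' = glVertex*.
 * Returns 1 when every vertex repeats the pattern of the first vertex,
 * with the same calls in the same order.
 */
int pattern_is_consistent(const char *calls)
{
    const char *first_vertex;
    size_t period, len;

    assert(calls != NULL);
    first_vertex = strchr(calls, 'V');
    if (first_vertex == NULL)
        return 0;                      /* no vertex call at all */

    period = (size_t)(first_vertex - calls) + 1;  /* calls per vertex */
    len = strlen(calls);
    if (len % period != 0)
        return 0;                      /* a partial pattern remains */

    for (size_t i = period; i < len; i++) {
        if (calls[i] != calls[i - period])
            return 0;                  /* call type or order differs */
    }
    return 1;
}
```

- With this encoding, the two consistent examples above are "CVCVCV" and "CNVCNV", and the two inconsistent examples are "NCVCV" and "CNVNCV".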
Low Batching
- OpenGL for Solaris performs best when given big primitives. If small primitives are sent to the library, the library will try to batch these primitives together, provided that the primitives are of the same primitive type, use the same consistent data pattern, and have no attribute state changes outside the glBegin call.
- For example, the following primitives will be batched together by the library.
-
-
glBegin(GL_TRIANGLES);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glEnd();
glBegin(GL_TRIANGLES);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glNormal3fv(...);
glVertex3fv(...);
glEnd();
- The following example shows that the primitives are not batched together because the glColor3fv call outside the glBegin call breaks the batching of the primitives.
-
-
glBegin(GL_LINES);
glVertex3fv(...);
glVertex3fv(...);
glEnd();
-
-
glColor3fv(...);
glBegin(GL_LINES);
glVertex3fv(...);
glVertex3fv(...);
glEnd();
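- The batching conditions above can be modeled with a short counter. This is only a hypothetical model of the stated rule (same primitive type, same data pattern, no state change before glBegin), not the library's actual batching code. The struct and its integer ids are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One glBegin/glEnd pair as seen by a simplified batching model. */
struct prim {
    int type;           /* id of the primitive type, such as GL_LINES    */
    int pattern;        /* id of the consistent vertex data pattern      */
    int state_changed;  /* 1 if attribute state was set before glBegin   */
};

/*
 * Count how many batches a sequence of primitives forms when a new batch
 * starts on any type change, pattern change, or attribute state change.
 */
size_t count_batches(const struct prim *p, size_t n)
{
    size_t batches = 0;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 ||
            p[i].state_changed ||
            p[i].type != p[i - 1].type ||
            p[i].pattern != p[i - 1].pattern)
            batches++;
    }
    return batches;
}
```

- In this model the two GL_TRIANGLES primitives of the first example form one batch, while the color change between the two GL_LINES primitives of the second example yields two.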
Optimized Data Types
- On any platform that uses the software pipeline for model coordinate rendering, your application will get better performance if it can pass vertex data in patterns for which the software pipeline has optimized code. Optimized data patterns are consistent data patterns which contain none of the following:
-
- glEdgeFlag*()
- glMaterial*()
- glEvalCoord*()
- glCallList() or glCallLists()
- both glColor*() and glIndex*()
- both glTexCoord*() and glIndex*()
Creator3D Graphics and Creator Graphics Performance
- The Creator and Creator3D Graphics systems accelerate rasterization of lines, points, and triangles as well as most per-fragment operations. Vertex processing and texturing operations are performed on the UltraSPARC CPU. The OpenGL for Solaris implementation for the Creator and Creator3D frame buffers uses all features of the Creator graphics subsystem.
- Rasterization and fragment processing are handled in one of three ways:
-
- Creator3D hardware rasterizer - Handles lines, points, and triangles, and does simple fragment processing.
- Optimized software rasterizer - UltraSPARC VIS (Visual Instruction Set) handles many texturing functions and pixel operations.
- Generic software rasterizer - Performs rasterization for all features not handled by the hardware or by the VIS software.
- To find out more about the Creator and Creator3D hardware platforms, refer to the Architecture Technical White Paper at http://www.sun.com/desktop/products/Ultra2/.
- The following sections provide specific information on attribute use and pixel operations on these platforms.
Attributes Affecting Creator3D Performance
- Primitive-attribute settings affect performance; therefore, you will get a better level of performance if you can avoid setting the attributes listed below. In some cases, the listed attributes simply increase the amount of processing in the hardware or optimized software data paths. In other cases, setting these attributes forces the use of the software rasterizer, resulting in slow performance.
Attributes That Increase Vertex Processing Overhead
- Attributes that result in more vertex processing overhead include:
-
- Enabling lighting.
- Turning on user specified clip planes (GL_CLIP_PLANE[i]).
- Enabling color material (GL_COLOR_MATERIAL).
- Enabling non-linear fog (glFog(GL_FOG_MODE, GL_EXP{2})). An exception to this is using RGBA mode on Creator3D Series 2.
- Enabling GL_NORMALIZE.
- Turning on polygon offset. However, polygon offset is optimized for the case when the factor parameter of the glPolygonOffset call is set to 0.0. Users may have to adjust the units parameter accordingly to avoid stitching for this case.
Primitive Types and Vertex Data Patterns That Increase Vertex Processing Overhead
- Types and patterns that result in more vertex processing overhead are:
-
- Using a surface primitive type as an argument to glBegin. The surface primitive types are: GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_QUADS, GL_QUAD_STRIP and GL_POLYGON.
-
- Using a vertex data pattern for GL_POINTS, GL_LINES, GL_LINE_STRIP, and GL_LINE_LOOP, other than one of the following repeating patterns. These are the patterns that are maximally accelerated.
V3F: glVertex3f(...); ...
C3F_V3F: glColor3f(...); glVertex3f(...); ...
C4F_V3F: glColor4f(...); glVertex3f(...); ...
V2F: glVertex2f(...); ...
C3F_V2F: glColor3f(...); glVertex2f(...); ...
C4F_V2F: glColor4f(...); glVertex2f(...); ...
-
Note - All vertex data patterns, other than one of the above repeating patterns, take more memory.
-
- Using glDrawElements in immediate mode.
Attributes That Increase Hardware Rasterization Overhead
- Attributes that result in slower hardware rasterization are:
-
- Enabling line antialiasing (GL_LINE_SMOOTH)
- Enabling point antialiasing (GL_POINT_SMOOTH)
Environment Variables Affecting Read Performance
-
-
-
· setenv SUN_OGL_ABGR_READPIX_NOCONFORM
- When this variable is set, the alpha value read back from the frame buffer during glReadPixels with the GL_ABGR_EXT format is undefined. This path is up to 30% faster than the conformant version. For Creator, the alpha value is not stored in the frame buffer anyway. Consequently, if the application does not use the alpha value, this setting is a significantly faster way to read pixels back from the frame buffer.
Attributes That Force the Use of the Software Rasterizer
- Setting the following attributes forces the use of the software rasterizer. This is the slowest data path. If your application requires any of the following attributes for performance critical functionality, you may want to determine whether this performance is acceptable. If not, you can evaluate whether the use of these attributes is advisable.
-
- Rasterization attributes
-
- In Indexed color mode, enabling line anti-aliasing (GL_LINE_SMOOTH) or point anti-aliasing (GL_POINT_SMOOTH)
- Enabling polygon anti-aliasing (GL_POLYGON_SMOOTH)
- Stippled lines (GL_LINE_STIPPLE) where the line stipple scale factor is larger than 15
- Non-antialiased ("jaggy") points with glPointSize(3gl) greater than 1.0
-
Note - The only anti-aliased point size supported by Creator3D and Creator is 1.0. glPointSize is ignored for anti-aliased points. Although the nominal antialiased point size is 1.0, the actual visible size is approximately 1.5.
-
- Fragment Attributes
-
- Blending (GL_BLEND) forces the use of the software rasterizer unless both the source and destination blend functions are in the following set of blend functions supported by the hardware:
-
-
GL_ZERO
GL_ONE
GL_SRC_ALPHA
GL_ONE_MINUS_SRC_ALPHA
-
- Enabling the stencil test (GL_STENCIL_TEST) on Creator3D or Creator3D Series 2. (Enabling the stencil test does not force the use of the software rasterizer on Creator3D Series 3 because it supports hardware stencilling).
On the UltraSPARC platform, a VIS optimized software rasterizer is used for smooth-shaded non-textured stenciled triangles whenever the glStencilOp parameter fail is anything other than GL_INCR or GL_DECR and the depth test does not affect the stencil buffer. (This is the case when depth test is disabled or the glStencilOp parameters zfail and zpass are identical).
- Enabling any type of fog in Indexed color mode
-
FIGURE 3-1 shows the data path for hardware rasterization on the Creator3D system. FIGURE 3-5 on page 34 illustrates the data path that the application uses when it sets an attribute that forces the use of the software rasterizer.

FIGURE 3-1
-
- Texturing Attributes
-
- Color Table--When the GL_TEXTURE_COLOR_TABLE_SGI extension is used, the only glTexEnv texture base internal formats that are accelerated are
-
-
GL_LUMINANCE, GL_LUMINANCE_ALPHA and GL_INTENSITY.
-
- The texture environment mode glTexEnv GL_TEXTURE_ENV_MODE of GL_BLEND is not accelerated when it is used with the GL_TEXTURE_COLOR_TABLE_SGI extension.
-
- Fog--On Creator3D, only linear fog is accelerated. On Creator3D Series 2, all types of RGBA fog are accelerated.
Attributes That Vary Optimized Texturing Speed
- Texturing makes extensive use of VIS on UltraSPARC platforms and allows for large textures. Texturing speed naturally increases with faster CPUs (a 300 MHz UltraSPARC CPU is 1.6 times faster than a 167 MHz CPU). Though texturing fill rates are slower on a host CPU than on dedicated hardware, the system costs are lower.
- The extensions supported for texturing include 3D Texture Mapping, SGI Color Table, and SGI Texture Color Table.
- Stencil and some fragment blending cases are slow. The rest are fast (done by Creator 3D hardware).
- Some texturing attributes are handled by generic code and result in the slowest texturing speed when the GL_TEXTURE_COLOR_TABLE_SGI extension is used with texture environment color blending or base internal formats of GL_ALPHA, GL_RGB, or GL_RGBA.
- Texturing attributes with the most impact on speed are:
-
- Minification filter
- Texture Coordinate Interior/Exterior Classification (per triangle)
- All wrap modes set to GL_REPEAT
- Texture Color Lookup Table
- The VIS optimized software rasterizer will vary in texturing speed based on the texturing attributes specified. The factors affecting texturing speed are listed below. Note that this is variance within the optimized path, not the difference between the optimized and generic paths.
-
- Projection Type--The type of projection matrix. Orthographic is faster than perspective.
- Wrap Mode--Best speed is when all dimensions (GL_TEXTURE_WRAP_x) are set to GL_REPEAT. If all the texture wrap modes are GL_REPEAT, this case is specially optimized. If any of the texture wrap modes are GL_CLAMP, then the standard texture wrap routine is used, but it is slower than the special case.
- Dimension--In general, 2D texturing is faster than 3D texturing, since there is one less texture coordinate to deal with. However, this does not mean it is better to use many 2D textures to approximate 3D texturing since the texture load time (see next section) may significantly increase the overhead.
-
- Minfilter--The fastest GL_TEXTURE_MIN_FILTER parameter is GL_NEAREST, which is approximately 4x the speed of GL_LINEAR. See FIGURE 3-3 on page 31 and FIGURE 3-4 on page 32. The approximate relative speed in decreasing order is:
-
-
GL_NEAREST, GL_NEAREST_MIPMAP_NEAREST, GL_NEAREST_MIPMAP_LINEAR,
GL_LINEAR, GL_LINEAR_MIPMAP_NEAREST, and GL_LINEAR_MIPMAP_LINEAR.
-
- Magfilter--For GL_TEXTURE_MAG_FILTER, the same speed ratio of 4x applies to GL_NEAREST vs. GL_LINEAR. Note, however, that GL_TEXTURE_MAG_FILTER is ignored when GL_TEXTURE_MIN_FILTER is set to GL_NEAREST or GL_LINEAR. This can be overridden with a shell environment variable but this will slow down texturing speed for GL_NEAREST and GL_LINEAR, since they now have to perform level-of-detail calculations to determine when to use GL_TEXTURE_MAG_FILTER. The shell environment variable that forces this slower behavior is:
-
-
· setenv SUN_OGL_MAGFILTER "conformant"
-
- Texture Coordinate Classification--If all texture coordinates of a triangle/quad/polygon are at least 1/2 texel inside the texture map edge, then the primitive is considered interior and is rendered faster than primitives whose texture coordinates touch or cross the texture map's edges. If any vertex touches or crosses the texture map edge, then the primitive is considered exterior. If a primitive is interior, then the texture edge related attributes such as wrap modes and texture border no longer affect the texturing speed.
- Env Mode--The fastest glTexEnv() GL_TEXTURE_ENV_MODE is GL_REPLACE, followed closely by GL_MODULATE. GL_DECAL is the same speed as GL_REPLACE.
- Color Table--The use of the extension GL_TEXTURE_COLOR_TABLE_SGI will reduce texturing speed.
- Texture Color Lookup Table--Using this table causes significant slowdown of texturing speed. Only cases of one or two channel lookups are optimized: GL_LUMINANCE, GL_INTENSITY, and GL_LUMINANCE_ALPHA. Three or four channel lookups (GL_RGB, GL_RGBA) go to a generic code routine that is slower than the special case.
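- As a rough illustration of the interior/exterior classification described in the list above, the sketch below tests whether every texture coordinate of a primitive lies at least half a texel inside the texture map. It assumes normalized coordinates in which the texture map spans 0 to 1 in each direction and a two-dimensional texture. The function name is hypothetical, and the library's own test may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 when every (s, t) pair lies at least half a texel inside the
 * edges of a width x height texture; return 0 otherwise (exterior).
 * st holds 2 * nverts floats in s, t order.
 */
int primitive_is_interior(const float *st, size_t nverts,
                          int width, int height)
{
    float s_margin, t_margin;

    assert(st != NULL && width > 0 && height > 0);
    s_margin = 0.5f / (float)width;   /* half a texel in s */
    t_margin = 0.5f / (float)height;  /* half a texel in t */

    for (size_t i = 0; i < nverts; i++) {
        float s = st[2 * i];
        float t = st[2 * i + 1];
        if (s < s_margin || s > 1.0f - s_margin ||
            t < t_margin || t > 1.0f - t_margin)
            return 0;  /* within half a texel of an edge, or outside */
    }
    return 1;
}
```

- An application that can keep its texture coordinates inside this margin avoids the cost of the wrap mode and border handling for that primitive.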
Attributes That Vary Texture Load Time
- The time to load the texture image into a texture object or a display list will vary depending on the pixel store and pixel transfer attributes specified when the texture is specified.
-
FIGURE 3-2 shows the texture load processing flow.

FIGURE 3-2
- The following recommendations should be followed where possible to reduce texture load time:
-
- Use texture objects where possible.
- If multiple textures are being used, put the textures in texture objects and use glBindTexture to switch among the textures. This ensures that the internal copy of texture is evaluated only once.
- For faster load time of 1D and 2D textures, use GL_ABGR_EXT format of data type GL_UNSIGNED_BYTE and texture internal format of GL_RGBA.
- 3D textures use packed representation to minimize memory usage.
-
- For 3D textures using data type GL_UNSIGNED_BYTE, the following format/base internal format combinations give the best loading performance:
-
TABLE 3-1
| Format | Base Internal Format |
| GL_LUMINANCE_ALPHA | GL_LUMINANCE_ALPHA |
| GL_RED | GL_INTENSITY |
| GL_RED | GL_LUMINANCE |
| GL_ALPHA | GL_ALPHA |
| GL_LUMINANCE | GL_INTENSITY |
| GL_LUMINANCE | GL_LUMINANCE |
| GL_ABGR_EXT | GL_RGBA |
Relative Performance of Attributes
- The following two charts show the relative performance of the attributes. The Y-axis is a ratio of the measured texturing speed against the fastest texturing case speed (which is 2D ortho nearest replace interior). Since all the charts were computed using the same number as the divisor, individual bars can be compared across charts. For example, the relative performance of 2D vs 3D texturing can be seen by comparing the bars between the 2D and 3D charts.
- The meanings of the legend annotations in the charts are:
- ortho--Orthographic Projection
- persp--Perspective Projection
- repeat--All wrap modes set to GL_REPEAT
- clamp--Some wrap modes set to GL_CLAMP
- intr--All texture coordinates are interior
- extr--All texture coordinates are exterior
- ctab--Texture Color Lookup Table extension (GL_TEXTURE_COLOR_TABLE_SGI) is enabled
- nearest--Texture minification filter is GL_NEAREST
- nmn--Texture minification filter is GL_NEAREST_MIPMAP_NEAREST
- nml--Texture minification filter is GL_NEAREST_MIPMAP_LINEAR
- linear--Texture minification filter is GL_LINEAR
- lmn--Texture minification filter is GL_LINEAR_MIPMAP_NEAREST
- lml--Texture minification filter is GL_LINEAR_MIPMAP_LINEAR

FIGURE 3-3

FIGURE 3-4
Attributes Affecting Creator Performance
- This section applies when pure software rendering is being used. This happens on the single-buffered Creator platform when glDrawBuffer(3gl) is set to GL_BACK or GL_FRONT_AND_BACK. The data presented here is also valid for the SX, ZX, GX, GX+, TGX, TGX+, and TCX platforms. Note that for non-Ultra machines, VIS rasterization is replaced by an optimized software rasterizer.
Attributes That Increase Vertex Processing Overhead
- Attributes that result in more vertex processing overhead are:
-
- Enabling lighting.
- Turning on user specified clip planes (GL_CLIP_PLANE[i]).
- Enabling color material (GL_COLOR_MATERIAL).
-
- Enabling non-linear fog (glFog (GL_FOG_MODE, GL_EXP{2})). An exception to this is using RGBA mode on Creator3D Series 2.
- Enabling GL_NORMALIZE.
- Turning on polygon offset. However, polygon offset is optimized when the factor parameter of the glPolygonOffset call is set to 0.0. Users may have to adjust the units parameter accordingly to avoid stitching for this case.
Attributes That Force the Use of the Generic Software Rasterizer
- Setting the following attributes forces the use of the generic software rasterizer. This is the slowest data path. If your application requires any of the following attributes for performance critical functionality, you may want to determine whether this performance is acceptable. If not, you can evaluate whether the use of these attributes is advisable.
-
- Texturing Attributes
-
- All three-dimensional texturing attributes result in the use of the generic software rasterizer.
- Two-dimensional texture mapping (GL_TEXTURE_2D) in the following cases:
i. Texture environment mode glTexEnv GL_TEXTURE_ENV_MODE is set to GL_BLEND.
ii. glTexEnv texture base internal format is GL_ALPHA.
iii. Texturing of points is handled by the generic software.
iv. Fog is enabled.
v. Any use of the SGI Texture Color Table (GL_SGI_texture_color_table) extension.
-
- Fragment Attributes
-
- Enabling any type of fog in Indexed color mode.
- Enabling blending (glBlendFunc(3gl)) except when the source blending factor is GL_SRC_ALPHA and the destination blending factor is GL_ONE_MINUS_SRC_ALPHA. This case is optimized.
- Enabling logical operations.
- Enabling the depth test (glEnable(GL_DEPTH_TEST)) forces the use of the optimized software rasterizer. With the depth test enabled, setting glDepthFunc(3gl) to any Z comparison other than GL_LESS or GL_LEQUAL forces the use of the generic software rasterizer.
- Enabling alpha test.
-
- Setting glDrawBuffer(3gl) to GL_BACK or GL_FRONT_AND_BACK, or setting glReadBuffer(3gl) to GL_BACK.
Index Mode
- When pure software rendering is being used, index mode rendering is handled by the generic software rasterizer. This includes any logic operation, blending, fog, stencil, alpha test, and the above-mentioned cases for Z comparison.

FIGURE 3-5
Pixel Operations
- Under optimal conditions, the commands glDrawPixels(3gl), glReadPixels(3gl), and glCopyPixels(3gl) are optimized on the Creator and Creator3D systems using the VIS instruction set on the UltraSPARC CPU. Bitmap operations using the command glBitmap(3gl) are accelerated in the Creator3D font registers. However, some attribute settings result in the use of the software rasterizer for pixel operations.
-
FIGURE 3-6 shows the rasterization and fragment processing architecture for glDrawPixels(3gl). The figure shows the optimized and unoptimized paths for pixel rendering. Your application will experience performance degradation for each functional box that it needs. In addition, performance degradation will occur if the data type is not unsigned byte; in this case, the data must be reformatted internally.

FIGURE 3-6
Conditions That Result in VIS Optimization on Creator3D Systems
- In general, for DrawPixels, CopyPixels, and Bitmap, the use of texture mapping or nonlinear fog (except in RGBA mode on Creator3D Series 2) will force the use of the generic software rasterizer, resulting in slow performance. In addition, if the hardware does not support the per-fragment operations that the application has enabled, the generic software rasterizer is used. See the OpenGL documentation or the "OpenGL Machine" diagram for a list of per-fragment operations.
- For the Creator3D system, if the following conditions are true, pixel operations are optimized. If these conditions are not true, the generic software rasterizer is used.
glDrawPixels Command
-
- Pixel format is GL_RGBA, GL_RGB, GL_ABGR_EXT, GL_RED, GL_GREEN, GL_BLUE, GL_LUMINANCE, or GL_LUMINANCE_ALPHA.
- Data type is GL_UNSIGNED_BYTE. (For GL_LUMINANCE the data type can also be GL_SHORT).
- For the format of GL_DEPTH_COMPONENT, the types GL_INT, GL_UNSIGNED_INT, and GL_FLOAT are optimized for the case with no pixel transfer.
- Texturing is disabled.
- Pixel unpacking is unnecessary.
- For the formats listed in the first line, the pixel transfer operations for scale/bias, pixel map, SGI color table, convolution, SGI post convolution color table, histogram, and minmax may be enabled.
- Pixel zoom may be done if its zoom factors are other than the default values.
- Pixel transform may be done if its current matrix is other than the identity matrix.
glReadPixels Command
-
- Pixel format is GL_RGBA, GL_RGB, GL_ABGR_EXT, GL_RED, GL_GREEN, GL_BLUE, GL_LUMINANCE, or GL_LUMINANCE_ALPHA.
- Data type is GL_UNSIGNED_BYTE.
- For the format of GL_DEPTH_COMPONENT, the types GL_INT, GL_UNSIGNED_INT, and GL_FLOAT are optimized for the case with no pixel transfer.
- Pixel packing is unnecessary.
- For the formats listed in the first line, the pixel transfer operations for scale/bias, pixel map, SGI color table, convolution, SGI post convolution color table, histogram, and minmax may be enabled.
glCopyPixels Command
-
- Pixel type is GL_COLOR.
- Texturing is disabled.
-
- Pixel zooming is in the default state.
- The pixel transfer operations for scale/bias, pixel map, SGI color table, convolution, SGI post convolution color table, histogram, and minmax may be enabled.
glBitmap(3gl) Command
-
- Texturing is not enabled.
- Blending is not enabled.
Conditions That Result in VIS Optimization on Creator Systems
- For the Creator and non-Creator SMCC frame buffers, if the following conditions are true, pixel operations are optimized. If these conditions are not true, the generic software rasterizer is used.
glDrawPixels Command
-
- For GL_LUMINANCE with data types GL_UNSIGNED_BYTE and GL_SHORT, there are special VIS optimized routines for:
· drawing directly to the framebuffer (or pbuffer).
· performing pixel transfer (that is, scale/bias, pixel map, SGI color table, convolution, SGI post convolution color table, histogram, and minmax) then displaying directly to the framebuffer (or pbuffer).
· performing the pixel transform extension, then drawing directly to the framebuffer (or pbuffer).
· performing pixel transfer followed by the pixel transform extension, then finally drawing directly to the framebuffer (or pbuffer).
- Pixel format is GL_RGBA, GL_RGB or GL_ABGR_EXT.
- Data type is GL_UNSIGNED_BYTE.
- Texturing is disabled.
- Pixel unpacking is unnecessary.
- If the depth test is enabled, the glDepthFunc(3gl) comparison is GL_LESS or GL_LEQUAL. (Enabling any other Z comparison forces the use of the generic software rasterizer.)
glReadPixels Command
-
- For GL_RED with the data type GL_UNSIGNED_BYTE, there is one special VIS optimized routine for extracting the red channel from an ABGR framebuffer or pbuffer.
-
- If glReadPixels format is GL_RGBA, GL_RGB, or GL_ABGR_EXT, and the pixel type is GL_UNSIGNED_BYTE, then glReadPixels is optimized.
- If glReadPixels format is GL_DEPTH_COMPONENT, then these pixel types are optimized: GL_INT, GL_UNSIGNED_INT, GL_FLOAT.
- Pixel packing is unnecessary.
glCopyPixels Command
-
- Pixel type is GL_COLOR.
- Texturing is disabled.
- The Z comparison is GL_LESS or GL_LEQUAL. (Enabling any other Z comparison forces the use of the generic software rasterizer.)
glBitmap Command
-
- Texturing is disabled.
- If the depth test is enabled, the glDepthFunc comparison is GL_LESS or GL_LEQUAL. (Enabling any other Z comparison forces the use of the generic software rasterizer.)
Pixel Transfer Pipeline Imaging Extensions and the Pixel Transform
- The Pixel Transfer Pipeline consists of a small set of image processing functions which operate on most rectangular imagery with OpenGL. These operations are performed whenever Pixel Transfer operations can occur within OpenGL (that is, in glDrawPixels, glReadPixels, glCopyPixels, glTexImage2D, glTexImage3DEXT, and so on).
- This pipeline has been fine-tuned for maximum performance on GL_LUMINANCE formatted data for the data types GL_UNSIGNED_BYTE and GL_SHORT. Other formats have been accelerated as well; however, GL_LUMINANCE gains the most in performance with this implementation of the pipeline.
- This pipeline has been accelerated using the Visual Instruction Set, which is only available on those systems with the UltraSPARC processor. The Pixel Transfer Pipeline with VIS acceleration is not supported on non-UltraSPARC processors; however, the original pixel transfer functionality is still there, minus the new imaging extensions.
Implementation
- The following figure shows the functions and the order of execution (from top to bottom) of these functions in the Pixel Transfer Pipeline:

FIGURE 3-7
- All functions in the pipeline have been accelerated using VIS whenever possible. The new imaging extensions within this pipeline are convolution, post convolution scale/bias, post convolution color table, histogram, minmax, and pixel transform. The last one, pixel transform, is not really part of the pixel transfer pipeline, but is instead considered part of the pixel rasterizer. Also, pixel transform is only executed in the glDrawPixels interface. The functions for scale/bias, pixel map, and SGI color table are part of the previous release, OpenGL 1.1. The difference here is that they are accelerated using VIS when possible in OpenGL 1.1.1.
- Another optimization that is worth noting here is that direct output to the display, via the glDrawPixels interface, or into a pbuffer has been optimized for GL_LUMINANCE format with GL_UNSIGNED_BYTE and GL_SHORT data types. For GL_UNSIGNED_BYTE, while the framebuffer is in TrueColor mode (rgb mode), the luminance pixels are expanded to XBGR format and then written directly to the framebuffer memory using VIS for optimal throughput. For GL_LUMINANCE, GL_SHORT data, the conversion of GL_SHORT data to GL_UNSIGNED_BYTE and then expansion to XBGR for direct display has been optimized for maximum throughput using VIS.
- When the input format is GL_LUMINANCE and the input data type is GL_SHORT, the Pixel Transfer Pipeline processes the data as GL_SHORT data from the beginning to the end of the pipe. This maintains the accuracy and integrity of the data from one stage of the pipeline to the next. Only just before rendering into the frame buffer or pbuffer is the data scaled down and clamped to [0, 255].
- In this pipeline none or all of these processing blocks can be enabled. Any time the Pixel Transfer Pipeline is used, there is only one pass through the pipe, and the order of execution does not change from that represented in the figure above.
How To Use the Pixel Transfer Pipeline and Pixel Transform
- For the most part, OpenGL operates on RGBA colors. Therefore, to be specification compliant, a user who wants to do pixel transfer operations on GL_LUMINANCE data should first expand that data to GL_RGBA format (or GL_ABGR_EXT format) before doing any processing. However, depending on the OpenGL pixel transfer state parameters, it may not be necessary to expand the image data before processing in the pixel transfer pipeline. There are two possible paths. The first expands the data from GL_LUMINANCE to GL_RGBA, processes the image as four-banded data in the Pixel Transfer Pipeline, and then displays it. The second processes the GL_LUMINANCE data as a single-banded image in the Pixel Transfer Pipeline, expands the data at the end of the pipeline, and then displays it. If the result is the same for both paths, it makes sense to use the faster one, which in this case is the second.
- The single-banded path takes about one quarter of the time (or less) to do the desired operation. The Pixel Transfer Pipeline evaluates the various states of the pixel transfer functions and determines whether it needs to do format expansion before, during, or after processing. Expansion, if needed, always occurs just before rendering to the framebuffer or pbuffer.
- The only case where format expansion can occur inside the Pixel Transfer Pipeline is within the "pixel map" block. If you want optimal throughput for GL_LUMINANCE data, do not use pixel map; instead, use SGI color table if you need to use a color table at this stage in the pipeline.
- The following sections explain each stage of the Pixel Transfer Pipeline. The example code provided shows you how to set the state parameters for the given stage so that GL_LUMINANCE data is not expanded until the very end of the pipeline, just before rendering to the frame buffer's window or the pbuffer.
Scale/Bias
- This operation multiplies all pixels by a given scale value, then adds a bias value. Scale and Bias values can be set differently for each color component of a pixel. These values are set as follows:
-
-
glPixelTransferf (GL_RED_SCALE, red_scale_value);
glPixelTransferf (GL_GREEN_SCALE, green_scale_value);
glPixelTransferf (GL_BLUE_SCALE, blue_scale_value);
glPixelTransferf (GL_ALPHA_SCALE, alpha_scale_value);
glPixelTransferf (GL_RED_BIAS, red_bias_value);
glPixelTransferf (GL_GREEN_BIAS, green_bias_value);
glPixelTransferf (GL_BLUE_BIAS, blue_bias_value);
glPixelTransferf (GL_ALPHA_BIAS, alpha_bias_value);
- If any of these values deviates from its default (1.0 for scale and 0.0 for bias), the Scale/Bias block in the Pixel Transfer Pipeline is enabled. If the red, green, blue, or alpha values differ from each other for either scale or bias, and the input format can be expanded to GL_RGBA or GL_ABGR_EXT format, then expansion occurs before processing starts in the pixel transfer pipeline. Expansion is not needed when the red, green, and blue values differ from their defaults but are the same as each other, the alpha scale either matches them or is 1.0, and the alpha bias either matches them or is 0.0. In that case, if you call glDrawPixels with GL_LUMINANCE data, the red component is used for the scale and bias, and the output is a GL_LUMINANCE format image. The following OpenGL calls therefore set up Scale/Bias to process GL_LUMINANCE without format expansion:
-
-
glPixelTransferf (GL_RED_SCALE, scale_value);
glPixelTransferf (GL_GREEN_SCALE, scale_value);
glPixelTransferf (GL_BLUE_SCALE, scale_value);
glPixelTransferf (GL_ALPHA_SCALE, scale_value);
glPixelTransferf (GL_RED_BIAS, bias_value);
glPixelTransferf (GL_GREEN_BIAS, bias_value);
glPixelTransferf (GL_BLUE_BIAS, bias_value);
glPixelTransferf (GL_ALPHA_BIAS, bias_value);
- or
-
-
glPixelTransferf (GL_RED_SCALE, scale_value);
glPixelTransferf (GL_GREEN_SCALE, scale_value);
glPixelTransferf (GL_BLUE_SCALE, scale_value);
glPixelTransferf (GL_ALPHA_SCALE, 1.0);
glPixelTransferf (GL_RED_BIAS, bias_value);
glPixelTransferf (GL_GREEN_BIAS, bias_value);
glPixelTransferf (GL_BLUE_BIAS, bias_value);
glPixelTransferf (GL_ALPHA_BIAS, 0.0);
- To disable scale/bias, just reset the scale/bias values back to their default values as shown below:
-
-
glPixelTransferf (GL_RED_SCALE, 1.0);
glPixelTransferf (GL_GREEN_SCALE, 1.0);
glPixelTransferf (GL_BLUE_SCALE, 1.0);
glPixelTransferf (GL_ALPHA_SCALE, 1.0);
glPixelTransferf (GL_RED_BIAS, 0.0);
glPixelTransferf (GL_GREEN_BIAS, 0.0);
glPixelTransferf (GL_BLUE_BIAS, 0.0);
glPixelTransferf (GL_ALPHA_BIAS, 0.0);
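- The rule for when Scale/Bias can process GL_LUMINANCE data without expansion can be written as a small predicate. This is a sketch of the rule as described in this section, not code taken from the library; the function name is hypothetical, and each array holds the red, green, blue, and alpha values in that order.

```c
#include <stdbool.h>

/*
 * Returns true when the given scale and bias settings allow GL_LUMINANCE
 * data to stay single-banded: red, green and blue must agree, and alpha
 * must either agree with them or keep its default (1.0 scale, 0.0 bias).
 * Exact float comparison is intentional: the values are compared as set.
 */
bool scale_bias_keeps_luminance(const float scale[4], const float bias[4])
{
    bool rgb_scale_same = scale[0] == scale[1] && scale[1] == scale[2];
    bool rgb_bias_same = bias[0] == bias[1] && bias[1] == bias[2];
    bool alpha_scale_ok = scale[3] == scale[0] || scale[3] == 1.0f;
    bool alpha_bias_ok = bias[3] == bias[0] || bias[3] == 0.0f;

    return rgb_scale_same && rgb_bias_same && alpha_scale_ok && alpha_bias_ok;
}
```

- An application could run such a check on its own settings before calling glPixelTransferf, to confirm that it stays on the faster path.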
Pixel Map
- In true color (RGB) mode, if the input image data format is not GL_RGBA or GL_ABGR_EXT, expansion is always forced when pixel map is enabled with glPixelTransferi (GL_MAP_COLOR, GL_TRUE). If the input image format is GL_COLOR_INDEX and the current display mode is RGB, Pixel Map is applied automatically, whether or not it was enabled, to convert from color index to RGBA. For GL_LUMINANCE data this path is not optimal for performance; use the SGI color table instead.
- To learn how to use Pixel Map, consult the "OpenGL Reference Manual" by the OpenGL Architecture Review Board, known as the blue book. Read the sections on glPixelTransfer and glPixelMap.
SGI Color Table
- This extension is very useful for accelerating color lookup for GL_LUMINANCE data. Other formats are accelerated as well; however, GL_LUMINANCE benefits the most. The following code fragment shows how to set up the SGI color table correctly to perform a color lookup for GL_LUMINANCE data:
-
-
int unpack_row_length;
int unpack_skip_pixels;
int unpack_skip_rows;
int unpack_alignment;
int lut_size;
void *lut;
/* Turns on SGI color table. */
glEnable (GL_COLOR_TABLE_SGI);
/* The current pixel storage modes also affect color table */
/* definition at the time the color table is created. We */
/* need to grab the current values, set the row length, */
/* skip pixels and skip rows to the defaults and */
/* set unpack alignment to 1. When finished defining the */
/* color table, restore the original values. */
glGetIntegerv (GL_UNPACK_ROW_LENGTH, (GLint *) &unpack_row_length);
glGetIntegerv (GL_UNPACK_SKIP_PIXELS, (GLint *) &unpack_skip_pixels);
glGetIntegerv (GL_UNPACK_SKIP_ROWS, (GLint *) &unpack_skip_rows);
glGetIntegerv (GL_UNPACK_ALIGNMENT, (GLint *) &unpack_alignment);
glPixelStorei (GL_UNPACK_ROW_LENGTH, 0);
glPixelStorei (GL_UNPACK_SKIP_PIXELS, 0);
glPixelStorei (GL_UNPACK_SKIP_ROWS, 0);
glPixelStorei (GL_UNPACK_ALIGNMENT, 1);
/* Define the color table for GL_LUMINANCE. */
/* If data type is GL_UNSIGNED_BYTE create a lookup table with */
/* 256 entries. Each entry is of type GL_UNSIGNED_BYTE. */
/* Range of values for any entry is [0, 255]. */
/* For a GL_SHORT lookup table, generate a table of 65536 entries */
/* ranging from -32768 to 32767. */
if (data_type == GL_UNSIGNED_BYTE) {
lut_size = 256;
lut = generate_unsigned_byte_lut();
}
else if (data_type == GL_SHORT) {
lut_size = 65536;
lut = generate_short_lut();
}
glColorTableSGI (GL_COLOR_TABLE_SGI,
GL_LUMINANCE, /* Need to specify internal format. */
lut_size,
GL_LUMINANCE, /* Format of lut passed in. */
data_type, /* Data type of lut passed in. */
lut); /* Actual pointer to lut array. */
/* Restore original Pixel Storage values in case something else */
/* needed these values. */
glPixelStorei (GL_UNPACK_ROW_LENGTH, unpack_row_length);
glPixelStorei (GL_UNPACK_SKIP_PIXELS, unpack_skip_pixels);
glPixelStorei (GL_UNPACK_SKIP_ROWS, unpack_skip_rows);
glPixelStorei (GL_UNPACK_ALIGNMENT, unpack_alignment);
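- The fragment above calls generate_unsigned_byte_lut() without defining it. As a minimal sketch of what such a helper might look like, the function below builds a 256-entry table that inverts 8-bit luminance values. The function name and the choice of an inverting table are assumptions made for illustration only.

```c
#include <stdlib.h>

/*
 * Build a 256-entry lookup table of unsigned bytes that maps each
 * luminance value v to 255 - v. Returns NULL if allocation fails.
 * The caller owns the table and releases it with free().
 */
unsigned char *make_inverting_lut_u8(void)
{
    unsigned char *lut = malloc(256);

    if (lut == NULL)
        return NULL;
    for (int i = 0; i < 256; i++)
        lut[i] = (unsigned char)(255 - i);
    return lut;
}
```

- The table can be released with free() once glColorTableSGI has returned, on the assumption that the implementation copies the table data during the call, as it does for other client-memory pixel data.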
Convolution, Post Convolution Scale/Bias and Post Convolution Color Table
- Convolution comes in 3 flavors: 1D convolution (applies to 1D textures only), 2D general convolution, and 2D separable convolution. Special effort has been made to maximize throughput for 2D general and separable convolutions for GL_LUMINANCE format for GL_UNSIGNED_BYTE and GL_SHORT data types via the glDrawPixels interface.
- Convolution allows you to set scale and bias values that are applied to the convolution filter kernel before it is used to convolve the image. This differs from post convolution scale/bias (below): the filter scale and bias are applied to the filter itself before processing, whereas the post convolution bias is added to the final convolution result before clamping for the given data type (GL_UNSIGNED_BYTE or GL_SHORT).
- Convolution and post convolution scale/bias have been combined into one operation. The kernel values for convolution are multiplied by the scale value of the post convolution scale/bias, then after each pixel is convolved the bias is added. Since this is all done in VIS, there is no loss in performance when compared with an ordinary convolve implemented in VIS.
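- The combined operation can be modeled in plain C. The sketch below is an illustration under stated assumptions, not the VIS implementation: the function names are hypothetical, the image is 8-bit luminance, the kernel is applied without flipping, and the bias is expressed directly in 0 to 255 pixel units for simplicity (OpenGL itself specifies scale and bias on normalized color values).

```c
/* Multiply every kernel entry by the post convolution scale. */
void fold_scale_into_kernel(const float *kernel, int count,
                            float post_scale, float *folded)
{
    for (int i = 0; i < count; i++)
        folded[i] = kernel[i] * post_scale;
}

/*
 * Convolve one interior pixel (x, y) of an 8-bit luminance image with a
 * pre-scaled 3x3 kernel, add the post convolution bias, then clamp.
 * The pixel must not lie on the image border.
 */
unsigned char convolve3x3_pixel(const unsigned char *image, int width,
                                int x, int y, const float folded[9],
                                float post_bias)
{
    float sum = post_bias;

    for (int ky = -1; ky <= 1; ky++)
        for (int kx = -1; kx <= 1; kx++)
            sum += folded[(ky + 1) * 3 + (kx + 1)]
                 * (float)image[(y + ky) * width + (x + kx)];

    if (sum <= 0.0f)
        return 0;
    if (sum >= 255.0f)
        return 255;
    return (unsigned char)(sum + 0.5f);
}
```

- Because the scale is folded into the kernel once, up front, the per-pixel loop costs the same as an unscaled convolve plus a single addition for the bias.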
- The OpenGL 1.1.1 implementation of convolution only supports 1x3, 1x5, and 1x7 convolves for 1D convolves, and 3x3, 5x5, and 7x7 for 2D convolves. Also, the source image must be 3 times larger than the size of the convolve kernel to be used.
- OpenGL 1.1.1 convolution also supports the following border modes: GL_REDUCE_EXT, GL_IGNORE_BORDER_HP, GL_CONSTANT_BORDER_HP, GL_WRAP_BORDER_SUN, GL_REPLICATE_BORDER_HP.
- SGI post convolution color table is set up exactly the same way as SGI color table; the only difference is the target value used when defining the table.
- The code fragment below shows how to set up 2D convolution for both the general and separable cases for a 3x3 convolve on GL_LUMINANCE format image data. The setup is the same for either GL_UNSIGNED_BYTE or GL_SHORT data. The fragment also selects the GL_CONSTANT_BORDER_HP border mode, sets GL_CONVOLUTION_FILTER_SCALE_EXT and GL_CONVOLUTION_FILTER_BIAS_EXT, sets up post convolution scale/bias, and finally sets up the SGI post convolution color table.
-
-
int unpack_row_length;
int unpack_skip_pixels;
int unpack_skip_rows;
int unpack_alignment;
int lut_size;
void *lut;
float kernel3x3[9] = { 0.111111111, 0.111111111, 0.111111111,
0.111111111, 0.111111111, 0.111111111,
0.111111111, 0.111111111, 0.111111111};
float sepkernel3[3] = { 0.333333333, 0.333333333, 0.333333333};
float const_color[4] = { 0.5, 0.5, 0.5, 0.5 };
float kernel_scales[4] = { 0.8, 0.8, 0.8, 0.8 };
float kernel_biases[4] = { 0.2, 0.2, 0.2, 0.2 };
float post_conv_scales[4] = { 0.75, 0.75, 0.75, 0.75 };
float post_conv_biases[4] = { 0.25, 0.25, 0.25, 0.25 };
/* The current pixel storage modes affect convolve kernel */
/* destination at the time the kernels are created. */
/* We need to grab the current values, set the row length, */
/* skip pixels and skip rows to the defaults and set unpack */
/* alignment to 1. */
/* When finished defining the color table, restore the */
/* original values. */
glGetIntegerv (GL_UNPACK_ROW_LENGTH, (GLint *) &unpack_row_length);
glGetIntegerv (GL_UNPACK_SKIP_PIXELS, (GLint *) &unpack_skip_pixels);
glGetIntegerv (GL_UNPACK_SKIP_ROWS, (GLint *) &unpack_skip_rows);
glGetIntegerv (GL_UNPACK_ALIGNMENT, (GLint *) &unpack_alignment);
glPixelStorei (GL_UNPACK_ROW_LENGTH, 0);
glPixelStorei (GL_UNPACK_SKIP_PIXELS, 0);
glPixelStorei (GL_UNPACK_SKIP_ROWS, 0);
glPixelStorei (GL_UNPACK_ALIGNMENT, 1);
/* Now, setup convolution with constant color border mode. */
if (convolve_type == GL_CONVOLUTION_2D_EXT) {
glEnable (GL_CONVOLUTION_2D_EXT);
glConvolutionFilter2DEXT (GL_CONVOLUTION_2D_EXT,
GL_LUMINANCE, /* Internal format. */
3, 3, /* Kernel dimensions. */
GL_LUMINANCE, /* Input kernel data format. */
GL_FLOAT, /* Data type for kernel entries. */
(void *) kernel3x3); /* Pointer to kernel. */
glConvolutionParameteriEXT(GL_CONVOLUTION_2D_EXT,
GL_CONVOLUTION_BORDER_MODE_EXT,
GL_CONSTANT_BORDER_HP);
glConvolutionParameterfvEXT(GL_CONVOLUTION_2D_EXT,
GL_CONVOLUTION_BORDER_COLOR_HP,
const_color);
glConvolutionParameterfvEXT(GL_CONVOLUTION_2D_EXT,
GL_CONVOLUTION_FILTER_SCALE_EXT,
kernel_scales);
glConvolutionParameterfvEXT(GL_CONVOLUTION_2D_EXT,
GL_CONVOLUTION_FILTER_BIAS_EXT,
kernel_biases);
}
else if (convolve_type == GL_SEPARABLE_2D_EXT) {
glEnable (GL_SEPARABLE_2D_EXT);
glSeparableFilter2DEXT (GL_SEPARABLE_2D_EXT,
GL_LUMINANCE,
3, 3,
GL_LUMINANCE,
GL_FLOAT,
sepkernel3, /* Horizontal kernel values. */
sepkernel3); /* Vertical kernel values. */
glConvolutionParameteriEXT(GL_SEPARABLE_2D_EXT,
GL_CONVOLUTION_BORDER_MODE_EXT,
GL_CONSTANT_BORDER_HP);
glConvolutionParameterfvEXT(GL_SEPARABLE_2D_EXT,
GL_CONVOLUTION_BORDER_COLOR_HP,
const_color);
glConvolutionParameterfvEXT(GL_SEPARABLE_2D_EXT,
GL_CONVOLUTION_FILTER_SCALE_EXT,
kernel_scales);
glConvolutionParameterfvEXT(GL_SEPARABLE_2D_EXT,
GL_CONVOLUTION_FILTER_BIAS_EXT,
kernel_biases);
}
glPixelTransferf(GL_POST_CONVOLUTION_RED_SCALE_EXT,
post_conv_scales[0]);
glPixelTransferf(GL_POST_CONVOLUTION_GREEN_SCALE_EXT,
post_conv_scales[1]);
glPixelTransferf(GL_POST_CONVOLUTION_BLUE_SCALE_EXT,
post_conv_scales[2]);
glPixelTransferf(GL_POST_CONVOLUTION_ALPHA_SCALE_EXT,
post_conv_scales[3]);
glPixelTransferf(GL_POST_CONVOLUTION_RED_BIAS_EXT,
post_conv_biases[0]);
glPixelTransferf(GL_POST_CONVOLUTION_GREEN_BIAS_EXT,
post_conv_biases[1]);
glPixelTransferf(GL_POST_CONVOLUTION_BLUE_BIAS_EXT,
post_conv_biases[2]);
glPixelTransferf(GL_POST_CONVOLUTION_ALPHA_BIAS_EXT,
post_conv_biases[3]);
/* Turns on SGI post convolution color table. */
glEnable (GL_POST_CONVOLUTION_COLOR_TABLE_SGI);
/* Define the color table for GL_LUMINANCE. */
/* If data type is GL_UNSIGNED_BYTE create a lookup table with */
/* 256 entries. Each entry is of type GL_UNSIGNED_BYTE. */
/* Range of values for any entry is [0, 255]. */
/* For a GL_SHORT lookup table, generate a table of 65536 entries */
/* ranging from -32768 to 32767.*/
if (data_type == GL_UNSIGNED_BYTE) {
lut_size = 256;
lut = generate_unsigned_byte_lut();
}
else if (data_type == GL_SHORT) {
lut_size = 65536;
lut = generate_short_lut();
}
glColorTableSGI (GL_POST_CONVOLUTION_COLOR_TABLE_SGI,
GL_LUMINANCE, /* Need to specify internal format. */
lut_size,
GL_LUMINANCE, /* Format of lut passed in. */
data_type, /* Data type of lut passed in. */
lut); /* Actual pointer to lut array. */
/* Restore original Pixel Storage values in case something else */
/* needed these values. */
glPixelStorei (GL_UNPACK_ROW_LENGTH, unpack_row_length);
glPixelStorei (GL_UNPACK_SKIP_PIXELS, unpack_skip_pixels);
glPixelStorei (GL_UNPACK_SKIP_ROWS, unpack_skip_rows);
glPixelStorei (GL_UNPACK_ALIGNMENT, unpack_alignment);
Histogram and Minmax
- The Histogram and Minmax operations come at the end of the Pixel Transfer Pipeline. Each can have its own "sink" value. If sink is enabled (GL_TRUE), processing of the image data stops at that operation: the data does not continue down the pipeline and no output is generated. If the histogram's sink value is true, minmax is not executed. (See the man pages for more information about the sink behavior of these operations.)
- The code below gives an example of getting a histogram for GL_LUMINANCE data of both the GL_UNSIGNED_BYTE and GL_SHORT types. Notice that the requested histogram width for GL_SHORT is 32768 instead of 65536. The reason is that GL_SHORT data is effectively clamped to the range [0, 32767]: any negative GL_SHORT values contribute to the very first histogram bin, the counter for 0. Specifying a larger width is pointless since only every other histogram bin would have a value in it. Histogram widths, in general, may be any power of 2 in the range [0, 65536]. If you want to display the computed histogram, however, you can specify a smaller width for the GL_SHORT data type, such as 256, 512, or 1024. This saves you the time of writing code to combine bins yourself, because requesting a smaller histogram width adds histogram bins together. For example, for GL_SHORT with a requested width of 256, each returned bin in the histogram holds the sum of 128 bins: all values in the range [0, 127] fall in bin 0, all values in the range [128, 255] fall in bin 1, and so on.
- Minmax uses the histogram to compute its values. It gets the minmax values using the histogram for the full width of the positive values for GL_UNSIGNED_BYTE and GL_SHORT. Therefore, if the histogram is taken of GL_UNSIGNED_BYTE, the possible range of minmax values is [0, 255]. For GL_SHORT, the possible range of minmax values is [0, 32767].
-
-
int minmax[2];
int histogram[32768];
unsigned char *uc_buff;
short *s_buff;
glEnable(GL_HISTOGRAM_EXT);
glEnable(GL_MINMAX_EXT);
/* Allocate enough space for 64 x 64 GL_LUMINANCE images. */
uc_buff = (unsigned char *) malloc (4096*sizeof(unsigned char));
s_buff = (short *) malloc (4096*sizeof(short));
/* First, do it for GL_UNSIGNED_BYTE with GL_LUMINANCE format. */
glHistogramEXT(GL_HISTOGRAM_EXT, 256, GL_LUMINANCE, GL_FALSE);
glMinmaxEXT(GL_MINMAX_EXT, GL_LUMINANCE, GL_FALSE);
glDrawPixels(64, 64, GL_LUMINANCE, GL_UNSIGNED_BYTE, uc_buff);
/* Since the call to glHistogramEXT defined a width of 256, */
/* 256 entries of the histogram array will be filled in. */
/* The remaining entries in the array are untouched. */
glGetHistogramEXT(GL_HISTOGRAM_EXT, GL_TRUE, GL_LUMINANCE, GL_INT,
histogram);
glGetMinmaxEXT(GL_MINMAX_EXT, GL_TRUE, GL_LUMINANCE, GL_INT,
minmax);
/* Do something with the histogram and minmax. */
/* Now, do GL_SHORT data. */
glHistogramEXT(GL_HISTOGRAM_EXT, 32768, GL_LUMINANCE, GL_FALSE);
glMinmaxEXT(GL_MINMAX_EXT, GL_LUMINANCE, GL_FALSE);
glDrawPixels(64, 64, GL_LUMINANCE, GL_SHORT, s_buff);
/* Since the call to glHistogramEXT defined a width of 32768, */
/* 32768 entries of the histogram array will be filled in. */
glGetHistogramEXT(GL_HISTOGRAM_EXT, GL_TRUE, GL_LUMINANCE, GL_INT,
histogram);
glGetMinmaxEXT(GL_MINMAX_EXT, GL_TRUE, GL_LUMINANCE, GL_INT,
minmax);
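- The mapping from a GL_SHORT value to a histogram bin, as described in this section, can be sketched as follows. The function name is hypothetical, and the arithmetic is a model of the documented behavior rather than the library's own code. The width is assumed to be a power of 2 between 1 and 32768.

```c
/*
 * Return the histogram bin that a GL_SHORT luminance value falls into
 * for a histogram of the given width. Negative values are clamped to 0,
 * so they are counted in the first bin. Each bin covers 32768 / width
 * consecutive values.
 */
int short_histogram_bin(short value, int width)
{
    int clamped = value < 0 ? 0 : value;
    int values_per_bin = 32768 / width;

    return clamped / values_per_bin;
}
```

- With a width of 256, for example, the values 0 through 127 all land in bin 0 and the value 128 starts bin 1, matching the bin sizes given in the text.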
Pixel Transform
- Pixel Transform, while shown at the end of the Pixel Transfer Pipeline, is not part of it. Pixel Transform is in the Pixel Rasterizer, and it only works through the glDrawPixels interface.
- Pixel Transform has been especially optimized for applying affine transformation warping to an input image on its way to the frame buffer or pbuffer. It has been specially tuned for handling GL_LUMINANCE format and the GL_UNSIGNED_BYTE and GL_SHORT data types. For GL_SHORT, the data is scaled and clamped to [0, 255] and then warped into the frame buffer or pbuffer. On the way to the frame buffer, the data is also expanded from GL_LUMINANCE data to XBGR format, which is the native format of the frame buffer while in RGB mode.
- Pixel Transform has its own matrix mode, with its own matrix stack that is 32 deep:
-
-
glMatrixMode(GL_PIXEL_TRANSFORM_2D_EXT);
- Pixel Transform is always enabled; however, if its current matrix is the identity matrix, then the pixel transform is not performed. Only when the current matrix is not the identity matrix will pixel transform be performed.
- You can use all of the existing API calls available for matrix operations in OpenGL. These will operate on the current matrix of the GL_PIXEL_TRANSFORM_2D_EXT matrix mode (that is, glLoadMatrix, glTranslate, glRotate, glScale, glLoadIdentity, glPushMatrix, glPopMatrix, glMultMatrix, and so on). When using these matrix operators on the current matrix, after the operation is performed, only the affine components are kept. Entries in the matrix which apply to the z and w components are left like they were initialized with the identity matrix.
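- One way to picture "only the affine components are kept" is the sketch below. It assumes OpenGL's usual column-major 4x4 layout and assumes that the kept entries are the 2x2 linear part and the x and y translation; the exact set of entries the implementation preserves is not spelled out here, so treat this as an interpretation. The function name is hypothetical.

```c
/*
 * Copy the 2D affine part of a column-major 4x4 matrix into out and
 * reset every other entry to its identity value. Kept entries:
 * indices 0, 1, 4, 5 (2x2 linear part) and 12, 13 (x, y translation).
 */
void keep_affine_2d(const double in[16], double out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (i % 5 == 0) ? 1.0 : 0.0;  /* start from identity */

    out[0] = in[0];
    out[1] = in[1];
    out[4] = in[4];
    out[5] = in[5];
    out[12] = in[12];
    out[13] = in[13];
}
```

- Under this reading, a z translation or a perspective term supplied through glLoadMatrix or glMultMatrix would have no effect on the pixel transform.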
- The pixel transform extension operates as if the current raster position is the origin of the coordinate system. To simplify, set the current raster position to the lower left corner of the display window, then work out your operations. If you want to translate the image, you can use glTranslate or move the current raster position. The difference is that glTranslate is integrated into the total transformation for pixel transform, while moving the raster position translates the image regardless of the contents of the current pixel transform matrix.
- glPixelZoom also affects the pixel transform current matrix, but only if the current matrix mode is set to GL_PIXEL_TRANSFORM_2D_EXT. In that case, calling glPixelZoom replaces the contents of the current matrix as shown below:
-
-
+-- --+
| x_zoom 0 0 0 |
| 0 y_zoom 0 0 |
| 0 0 1 0 |
| 0 0 0 1 |
+-- --+
- If the current matrix mode is not GL_PIXEL_TRANSFORM_2D_EXT, then the current matrix of GL_PIXEL_TRANSFORM_2D_EXT is not replaced. However, pixel zoom will still be set.
- If the current matrix of GL_PIXEL_TRANSFORM_2D_EXT has been set to something different than identity, and glPixelZoom has been set, then the pixel transform will override the glPixelZoom operation.
- If you want to do any image warping, use the pixel transform extension rather than the glPixelZoom interface, and use glScale to set up a zoom matrix. If you apply multiple matrix operations to the pixel transform's current matrix, do not call glPixelZoom in the middle or at the end of the sequence, since it resets the matrix (as shown above) and removes the effect of any previous operations. Instead, use glScale.
- Pixel Transform supports 4 types of resampling for minification and 3 types for magnification. GL_NEAREST, GL_LINEAR, and GL_CUBIC_EXT are shared by minification and magnification. GL_AVERAGE_EXT is only supported for minification.
- The code fragment below demonstrates how to prepare a pixel transform matrix to do an arbitrary rotation of "angle" degrees about the center of the input image, with the image placed in the center of the frame buffer display window. It assumes the image is GL_LUMINANCE data of type GL_UNSIGNED_BYTE. It also sets the resampling method to GL_LINEAR for minification and GL_CUBIC_EXT for magnification, and sets GL_PIXEL_CUBIC_WEIGHT_EXT to the value -0.5.
-
-
double rotation_angle;
int window_width, window_height;
int image_width, image_height;
unsigned char *image_data;
/* Grab needed values for placing image in center. */
window_width = get_window_width();
window_height = get_window_height();
image_width = get_image_width();
image_height = get_image_height();
image_data = get_image_data();
rotation_angle = get_rotation_angle_between_0_and_360_degrees();
/* Prepare current pixel transform matrix. */
glMatrixMode(GL_PIXEL_TRANSFORM_2D_EXT);
glLoadIdentity();
glTranslated(window_width/2.0, window_height/2.0, 0.0);
glRotated(rotation_angle, 0.0, 0.0, 1.0);
glTranslated (-image_width/2.0, -image_height/2.0, 0.0);
/* Set up resampling methods. */
glPixelTransformParameteriEXT(GL_PIXEL_TRANSFORM_2D_EXT,
GL_PIXEL_MIN_FILTER_EXT,
GL_LINEAR);
glPixelTransformParameteriEXT(GL_PIXEL_TRANSFORM_2D_EXT,
GL_PIXEL_MAG_FILTER_EXT,
GL_CUBIC_EXT);
glPixelTransformParameterfEXT(GL_PIXEL_TRANSFORM_2D_EXT,
GL_PIXEL_CUBIC_WEIGHT_EXT,
-0.5);
/* Finally, render the image to the screen. */
glDrawPixels (image_width, image_height, GL_LUMINANCE,
GL_UNSIGNED_BYTE,
image_data);
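- The translate, rotate, translate sequence in the fragment above can be checked with a small 2D model. The sketch below is illustrative only: the type and function names are hypothetical, each operation post-multiplies the current matrix in the same way as glTranslated and glRotated, and the rotation takes the cosine and sine of the angle directly so that the sketch does not depend on the math library.

```c
/* 2D affine map: x' = a*x + c*y + tx, y' = b*x + d*y + ty. */
typedef struct { double a, b, c, d, tx, ty; } Affine2D;

/* Return m * n, so that n is applied to a point first. */
static Affine2D affine_multiply(Affine2D m, Affine2D n)
{
    Affine2D r;
    r.a = m.a * n.a + m.c * n.b;
    r.b = m.b * n.a + m.d * n.b;
    r.c = m.a * n.c + m.c * n.d;
    r.d = m.b * n.c + m.d * n.d;
    r.tx = m.a * n.tx + m.c * n.ty + m.tx;
    r.ty = m.b * n.tx + m.d * n.ty + m.ty;
    return r;
}

/* Post-multiply m by a translation. */
Affine2D affine_translate(Affine2D m, double x, double y)
{
    Affine2D t = {1.0, 0.0, 0.0, 1.0, x, y};
    return affine_multiply(m, t);
}

/* Post-multiply m by a counterclockwise rotation given as cos and sin. */
Affine2D affine_rotate(Affine2D m, double cos_a, double sin_a)
{
    Affine2D r = {cos_a, sin_a, -sin_a, cos_a, 0.0, 0.0};
    return affine_multiply(m, r);
}

/* Same order as the fragment: window center, rotation, image center. */
Affine2D center_rotation(double window_w, double window_h,
                         double image_w, double image_h,
                         double cos_a, double sin_a)
{
    Affine2D m = {1.0, 0.0, 0.0, 1.0, 0.0, 0.0};
    m = affine_translate(m, window_w / 2.0, window_h / 2.0);
    m = affine_rotate(m, cos_a, sin_a);
    m = affine_translate(m, -image_w / 2.0, -image_h / 2.0);
    return m;
}

/* Apply m to the point (x, y). */
void affine_apply(Affine2D m, double x, double y, double *ox, double *oy)
{
    *ox = m.a * x + m.c * y + m.tx;
    *oy = m.b * x + m.d * y + m.ty;
}
```

- Whatever the angle, the center of the image maps to the center of the window, which is the behavior the fragment relies on.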
GX Performance
- GX performance is affected by attributes that force the use of the generic software rasterizer:
-
- Texturing Attributes
a. Only triangles are optimized. Texturing of points and lines is handled by the generic software.
b. Texture environment mode glTexEnv(3gl) GL_TEXTURE_ENV_MODE is GL_BLEND.
-
- Fragment Attributes
a. Stencil operations
b. Logic operations
c. Any blending operation
d. Linear or nonlinear fog
e. Enabling any Z comparison other than GL_LESS or GL_LEQUAL
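- An application that tracks its own rendering state could test for these conditions before drawing. The sketch below is one reading of the list above, not an API provided by the library: the types, field names, and function name are all hypothetical, and the state is assumed to be mirrored by the application itself.

```c
#include <stdbool.h>

typedef enum { PRIM_POINT, PRIM_LINE, PRIM_TRIANGLE } Primitive;
typedef enum { DEPTH_LESS, DEPTH_LEQUAL, DEPTH_OTHER } DepthFunc;

/* Application-side mirror of the attributes listed above. */
typedef struct {
    Primitive primitive;
    bool texturing;      /* texturing enabled                    */
    bool tex_env_blend;  /* GL_TEXTURE_ENV_MODE is GL_BLEND      */
    bool stencil;        /* stencil operations enabled           */
    bool logic_op;       /* logic operations enabled             */
    bool blending;       /* any blending operation enabled       */
    bool fog;            /* linear or nonlinear fog enabled      */
    bool depth_test;     /* Z comparison enabled                 */
    DepthFunc depth_func;
} RenderState;

/* True when the state would fall back to the generic software rasterizer. */
bool gx_needs_software_rasterizer(const RenderState *s)
{
    if (s->texturing && (s->primitive != PRIM_TRIANGLE || s->tex_env_blend))
        return true;
    if (s->stencil || s->logic_op || s->blending || s->fog)
        return true;
    if (s->depth_test && s->depth_func == DEPTH_OTHER)
        return true;
    return false;
}
```

- Such a check is useful mainly as a debugging aid, for example to log a warning when a frame is about to leave the optimized path.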
|
|