Patrick Cozzi Analytical Graphics, Inc.
Overview
- Z-Buffer Review
- Hardware: Early-Z
- Software: Front-to-Back Sorting
- Hardware: Double-Speed Z-Only
- Software: Early-Z Pass
- Software: Deferred Shading
- Hardware: Fast Clear
- Hardware: Z-Cull
- Future: Programmable Culling Unit
Z-Buffer Review Fragment vs Pixel Alternatives: Painter’s, Ray Casting, etc
Z-Buffer History “Brute-force approach” “Ridiculously expensive” Sutherland, Sproull, and Schumacker, “A Characterization of Ten Hidden-Surface Algorithms”, 1974
Z-Buffer Quiz 10 triangles cover a pixel. Rendering these in random order with a Z-buffer, what is the average number of times the pixel’s z-value is written?
Z-Buffer Quiz
- 1st triangle writes depth
- 2nd triangle has 1/2 chance of writing depth
- 3rd triangle has 1/3 chance of writing depth
- Expected writes: 1 + 1/2 + 1/3 + … + 1/10 = 2.9289…
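The quiz answer is the 10th harmonic number: the k-th triangle drawn writes depth only if it is the nearest of the first k, which happens with probability 1/k. The sketch below checks this two ways, exactly with fractions and with a small Monte Carlo run. The function names, the less-than depth test, and the use of uniform random depths are assumptions made for illustration.

```python
from fractions import Fraction
import random


def expected_depth_writes(n):
    """Exact expected number of z-writes for n triangles in random order."""
    # The k-th triangle is the nearest so far with probability 1/k.
    return sum(Fraction(1, k) for k in range(1, n + 1))


def simulate_depth_writes(n, trials, seed=0):
    """Average z-writes per pixel over many random draw orders."""
    rng = random.Random(seed)
    total_writes = 0
    for _ in range(trials):
        nearest = float("inf")          # cleared depth value
        for _ in range(n):
            z = rng.random()            # depth of the next triangle at this pixel
            if z < nearest:             # less-than depth test (assumed)
                nearest = z
                total_writes += 1
    return total_writes / trials


exact = expected_depth_writes(10)
print(exact, float(exact))              # the exact fraction and its value, about 2.929
print(simulate_depth_writes(10, 20000)) # should land close to the exact value
```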
Z-Buffer Quiz
Z-Test in the Pipeline
Early-Z
Early-Z
Front-to-Back Sorting
- Old hardware still benefits from fewer z-buffer writes
- CPU overhead: needs efficient sorting
- Conflicts with state sorting
Double Speed Z-Only GeForce FX and later render at double speed when writing only depth or stencil. Enabled when - Color writes are disabled
- Fragment shader does not discard or write depth
- Alpha-test is disabled
Early-Z Pass Software technique to utilize Early-Z and Double Speed Z-Only Two passes - Render depth only. “Lay down depth” – Double Speed Z-Only
- Render with full shaders – Early-Z (and Z-Cull)
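To make the benefit of the two passes concrete, here is a minimal CPU-side sketch that counts how many fragments reach the full shader with and without a depth-only first pass. It is a toy model, not GPU code: fragments are hypothetical `(pixel, z)` pairs, the depth test is less-than in the single-pass case, and the second pass of the two-pass case uses a less-or-equal test with depth writes off, which is one common way to set it up.

```python
def shade_count_single_pass(fragments):
    """Fragments shaded when depth test and shading happen in one pass."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:
        if z < depth.get(pixel, float("inf")):   # passes the depth test so far
            depth[pixel] = z
            shaded += 1                          # full shader runs, may be overwritten later
    return shaded


def shade_count_with_depth_prepass(fragments):
    """Fragments shaded when depth is laid down first."""
    # Pass 1: depth only, no shading.
    depth = {}
    for pixel, z in fragments:
        if z < depth.get(pixel, float("inf")):
            depth[pixel] = z
    # Pass 2: full shaders; only the nearest fragment per pixel passes.
    shaded = 0
    for pixel, z in fragments:
        if z <= depth[pixel]:
            shaded += 1
    return shaded


# Four pixels, each covered by five layers drawn in the same arbitrary order.
fragments = [((x, 0), z) for x in range(4) for z in [5, 3, 9, 1, 7]]
print(shade_count_single_pass(fragments))         # 12: three shaded fragments per pixel
print(shade_count_with_depth_prepass(fragments))  # 4: one shaded fragment per pixel
```

The cost the model leaves out is that the geometry is submitted and transformed twice, which is the trade-off this technique makes.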
Deferred Shading Similar to Early-Z Pass - 1st Pass: Visibility tests
- 2nd Pass: Shading
Different from Early-Z Pass - Geometry is only transformed once
Deferred Shading 1st Pass - Render geometry into G-Buffers:
Deferred Shading 2nd Pass - Shading == post processing effects
- Render full screen quads that read from G-Buffers
- Objects are no longer needed
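The two passes can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the G-buffer layout (depth, normal, albedo), the fragment tuple format, and the single directional Lambert light are all hypothetical choices, not something the slides specify.

```python
def geometry_pass(fragments, width, height):
    """1st pass: keep only the nearest fragment's attributes per pixel."""
    gbuffer = [[None] * width for _ in range(height)]
    for x, y, z, normal, albedo in fragments:
        current = gbuffer[y][x]
        if current is None or z < current["depth"]:   # visibility test
            gbuffer[y][x] = {"depth": z, "normal": normal, "albedo": albedo}
    return gbuffer


def shading_pass(gbuffer, light_dir):
    """2nd pass: shade once per pixel, reading only from the G-buffer."""
    image = []
    for row in gbuffer:
        out_row = []
        for texel in row:
            if texel is None:                          # nothing was drawn here
                out_row.append((0.0, 0.0, 0.0))
                continue
            n_dot_l = max(0.0, sum(n * l for n, l in zip(texel["normal"], light_dir)))
            out_row.append(tuple(c * n_dot_l for c in texel["albedo"]))
        image.append(out_row)
    return image


# Two fragments cover pixel (0, 0); the nearer green one wins.
fragments = [
    (0, 0, 0.8, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)),
    (0, 0, 0.3, (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)),
]
gbuffer = geometry_pass(fragments, width=2, height=1)
image = shading_pass(gbuffer, light_dir=(0.0, 0.0, 1.0))
print(image[0][0])   # (0.0, 1.0, 0.0)
```

Note that `shading_pass` never sees the original fragments: once the G-buffer is filled, lighting cost depends on the number of pixels rather than on scene depth complexity.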
Deferred Shading
Deferred Shading Eliminates shading fragments that fail Z-Test Increases video memory requirement How does it affect bandwidth?
Buffer Compression Reduce depth buffer bandwidth Generally does not reduce memory usage of actual depth buffer Same architecture applies to other buffers, e.g. color and stencil
Buffer Compression Tile Table: Status for nxn tile of depths, e.g. n=8 - [state, zmin, zmax]
- state is either compressed, uncompressed, or cleared
Buffer Compression
Buffer Compression Depth Buffer Write - Rasterizer modifies copy of uncompressed tile
- Tile is lossless compressed (if possible) and sent to actual depth buffer
- Update Tile Table
- zmin and zmax
- state: compressed or uncompressed
Buffer Compression Depth Buffer Read - Tile Status
- Uncompressed: Send tile
- Compressed: Decompress and send tile
- Cleared: See Fast Clear
Fast Clear Don’t touch depth buffer. glClear sets state of each tile to cleared. When the rasterizer reads a cleared tile - A tile filled with GL_DEPTH_CLEAR_VALUE is sent
- Depth buffer is not accessed
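The tile-table write path, read path, and fast clear described above can be sketched together in a few lines. This is a software model under loud assumptions: `zlib` stands in for the hardware's lossless compressor, "if possible" is modeled as "fits in half the raw size", `CLEAR_DEPTH` stands in for GL_DEPTH_CLEAR_VALUE, and the class and method names are hypothetical.

```python
import struct
import zlib

TILE_SIZE = 8                               # n = 8
DEPTHS_PER_TILE = TILE_SIZE * TILE_SIZE
CLEAR_DEPTH = 1.0                           # stand-in for GL_DEPTH_CLEAR_VALUE


class TiledDepthBuffer:
    def __init__(self, tile_count):
        # Tile table: one [state, zmin, zmax] entry per tile.
        self.table = [["cleared", CLEAR_DEPTH, CLEAR_DEPTH] for _ in range(tile_count)]
        self.memory = [None] * tile_count   # the "actual depth buffer"
        self.memory_reads = 0               # counts accesses to depth memory

    def clear(self):
        # Fast clear: only the tile table changes, depth memory is untouched.
        for entry in self.table:
            entry[:] = ["cleared", CLEAR_DEPTH, CLEAR_DEPTH]

    def write_tile(self, index, depths):
        raw = struct.pack(f"{DEPTHS_PER_TILE}f", *depths)
        packed = zlib.compress(raw)
        if len(packed) <= len(raw) // 2:    # assumed target ratio of 2:1
            self.memory[index], state = packed, "compressed"
        else:
            self.memory[index], state = raw, "uncompressed"
        self.table[index] = [state, min(depths), max(depths)]

    def read_tile(self, index):
        state = self.table[index][0]
        if state == "cleared":              # synthesize the tile, no memory access
            return [CLEAR_DEPTH] * DEPTHS_PER_TILE
        self.memory_reads += 1
        data = self.memory[index]
        if state == "compressed":
            data = zlib.decompress(data)
        return list(struct.unpack(f"{DEPTHS_PER_TILE}f", data))


buf = TiledDepthBuffer(tile_count=4)
buf.write_tile(0, [0.5] * DEPTHS_PER_TILE)  # a flat tile compresses well
print(buf.table[0])                         # ['compressed', 0.5, 0.5]
buf.clear()
print(buf.read_tile(0)[:2], buf.memory_reads)  # [1.0, 1.0] 0
```

The model also shows why compression saves bandwidth rather than memory: every tile still owns a full-size slot in `memory`, since any tile may become uncompressed later.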
Fast Clear Use glClear Clear stencil together with depth
Z-Cull Cull blocks of fragments before shading Coarse-grained as opposed to Early-Z
Z-Cull Zmax-Culling - Rasterizer fetches zmax for each tile it processes
- Compute ztrianglemin for a triangle
- Culled if ztrianglemin > zmax
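The zmax test above amounts to one comparison per tile. In the sketch below, ztrianglemin is taken conservatively as the minimum of the triangle's three vertex depths, which is an assumption (the slides do not say how it is computed), and the depth test is assumed to be less-than, so smaller z is nearer. The function name is hypothetical.

```python
def zmax_cull(tile_zmax, triangle_vertex_depths):
    """Return True if the triangle can be skipped for this whole tile."""
    # Conservative bound: no point on the triangle is nearer than its nearest vertex.
    z_triangle_min = min(triangle_vertex_depths)
    # If even the nearest point is behind the farthest stored depth in the tile,
    # every fragment in the tile would fail the depth test.
    return z_triangle_min > tile_zmax


print(zmax_cull(0.4, (0.5, 0.6, 0.7)))   # True: entirely behind the tile's contents
print(zmax_cull(0.4, (0.3, 0.6, 0.7)))   # False: may be visible, run per-fragment tests
```

A `False` result does not mean the triangle is visible, only that the coarse test could not rule it out; the fine-grained per-fragment test still decides.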
Z-Cull
Z-Cull Automatically enabled on GeForce (6?) cards unless - glClear isn’t used
- Fragment shader writes depth (or discards?)
- Direction of depth test is changed
ATI recommends avoiding = and != depth compares, and stencil fail and stencil depth fail operations. Z-Cull is less efficient when depth varies a lot within a few pixels.
Programmable Culling Unit Cull before fragment shader even if the shader writes depth or discards Run part of shader over an entire tile to determine lower bound z value
Summary What was once “ridiculously expensive” is now the primary visible surface algorithm for rasterization
Resources www.realtimerendering.com
Resources developer.nvidia.com/object/gpu_programming_guide.html
Resources http://www.graphicshardware.org/previous/www_2000/presentations/ATIHot3D.pdf
Resources http://ati.amd.com/developer/dx9/ATI-DX9_Optimization.pdf
Resources developer.nvidia.com/object/gpu_gems_home.html
Resources developer.nvidia.com/object/gpu-gems-3.html