<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html> <head> <!-- saved from url=(0060)http://graphics.stanford.edu/courses/cs448a-01-fall/as1.html --> <meta http-equiv="Content-Type" content="text/html; charset=unicode"> <meta content="Microsoft FrontPage 5.0" name="GENERATOR"> <title>Assignment 2: Benchmarking Graphics Hardware</title> </head> <body style="color: rgb(255, 255, 255); background-color: rgb(0, 0, 0);" alink="#000099" link="#000099" vlink="#990099"> <h1 style="text-align: center; color: rgb(255, 255, 0);">CS 600.460: Interactive Graphics and Games</h1> <h1 style="text-align: center; color: rgb(255, 255, 0);">Assignment 2: Benchmarking Graphics Hardware</h1> <p><b><span style="color: rgb(255, 0, 0);">Due: February 21</span><br> </b></p> <p>The purpose of this assignment is to probe the graphics subsystem to learn as much as you can about its performance, design, and flaws.&nbsp; Your benchmarking software should be sufficiently flexible to automatically report the characteristics of different graphics cards (i.e., you should be able to install and run it on a new machine with a bare minimum of effort).&nbsp; Of course, if you are using vendor-specific extensions, this may not always be possible.</p> <p>Many thanks to Ian Buck for writing the skeleton code for GfxBench.&nbsp; You may, of course, feel free to discard the GfxBench skeleton and write your own.&nbsp; This assignment is stolen from David Luebke's 2004 Real Time Rendering course.&nbsp; He stole it from Greg Humphreys's 2002 course Big Data in Computer Graphics, and Greg in turn stole it from Pat Hanrahan and Kurt Akeley's course in Real Time Graphics Architectures.</p> <p>This assignment is fairly open-ended, and somewhat loosely specified in places.&nbsp; This is deliberate.&nbsp; The techniques necessary to get graphics hardware to run fast are sometimes obscure, and certainly not well documented.&nbsp; You will have to do some digging and experimentation to figure out how to make 
things run fast.&nbsp; You may want to explore the web for hints about how to do this, particularly for Part II.&nbsp; You may find that you're losing a factor of two somewhere just because some magic OpenGL state setting is wrong. Put some effort into making things as fast as possible (this knowledge will obviously help you later in the semester!). </p> <p>Feel free to share hints and URLs about making hardware run fast with other students (in fact, you are encouraged to do this on the <a href="http://groups.yahoo.com/group/jhucs-igg05/">class forum</a>), but keep your actual results and code to yourselves.&nbsp; For example, telling your classmates, "I realized that if I disabled the depth test, it made a huge difference," is encouraged.&nbsp; Telling your friends, "The texture download bandwidth of my GeForce 4 seems to be one petabyte per fortnight," is doing their work for them.&nbsp; When in doubt, ask me.</p> <h2>Part I: Explore the crossover point between geometry and rasterization</h2> <p>Modern graphics hardware can be abstracted into two components that run in parallel: geometry and rasterization. The performance of the graphics system for any given scene is determined by the slower of these two components. For small triangles, the rasterization work per triangle is small, so the system is limited by the rate at which vertices can be processed. For large triangles, or ones with complex shading, the fragment operations may dominate the rendering pipeline. </p> <p>Provided for this assignment is a sample benchmarking application called GfxBench (see below). It measures the fill rate and triangle rate using OpenGL. Modify the GfxBench application to examine the crossover point between geometry-limited rendering and rasterization-limited rendering. Graph the triangle rate as a function of triangle size for regular smooth-shaded triangles. What is the crossover point? 
</p> <p>Next, modify the program to determine the triangle rate, fill rate, and crossover point for the following: textured triangles, lit triangles, and textured, lit triangles. Graph and explain your results in a <strong>one</strong>-page (not including graphs or code) write-up. Compare your results for the different triangle types and explain why any differences may exist. Please include your source code in your write-up. Be sure to discuss any interesting details you might find. </p> <h2>Part II: Moving pixel data</h2> <p>How fast can you move pixel data across the AGP bus?&nbsp; Measure each of the following:</p> <p><b>DrawPixels performance<br> </b>Measure the performance of pixel blitting.&nbsp; What is the impact of the data alignment and/or stride?&nbsp; What is the impact of the source data type (RGB, RGBA, ABGR, etc.)?&nbsp; Does the size of the image matter?&nbsp; What about blending modes?</p> <p><b>ReadPixels performance<br> </b>How fast can you read color pixels from the framebuffer to main memory?&nbsp; What is the impact of the data type (BGRA, BGR, RGB, RGBA, etc.)?&nbsp; Compare the peak performance of color readback to the peak performance of depth buffer readback.&nbsp; Are you surprised by the results?</p> <p><b>Texture download bandwidth<br> </b>How fast can you move new texture data from host memory to texture memory?&nbsp; Try both <font face="Courier New">glTexImage()</font> and <font face="Courier New">glTexSubImage()</font>.&nbsp; Again, what is the impact of the data type?&nbsp; Do these results surprise you?</p> <p>Create a one-page summary of your findings, including any generalizations you can make about format/alignment/stride dependencies in the graphics pipeline.&nbsp; What group of settings did you use to get the very highest performance from each of these metrics?</p> <h2>Part III: Detailed examination of specific parts of the graphics pipeline</h2> <strong>Choose <i>one</i> of the following aspects of graphics hardware to explore. 
Present your results in a third one-page (not including graphs or code)&nbsp;write-up. Please include any source code used to generate your results.</strong> <p><b>Rasterization</b><br> What is the effect of triangle shape on rasterization performance? Is there a difference between long, thin vertical triangles and long, thin horizontal triangles? Modify the GfxBench application to test a variety of triangle shapes. Also graph fill rate as a function of triangle size.&nbsp; Present your results and discuss what this tells you about how the rasterizer works. </p> <p><b>Texturing</b> <br> Explore the texture cache behavior. Modern graphics hardware maintains an on-chip texture cache as well as using on-board video memory for local texture storage. Modify the GfxBench program to determine the size of the on-chip texture cache and on-board texture memory usage.&nbsp; How does the performance change as a function of texture angle?&nbsp; What (if anything) can you determine about the cache's replacement policy?&nbsp; If you can figure out what it is, does it make sense?&nbsp; Graph your results and explain the texture cache behavior and how you were able to measure it. </p> <p><b>Flexibility/Programmability<br> </b>Explore the performance impact of the programmable features of the newer graphics cards, especially fragment processing.&nbsp; Are certain features slower than others?&nbsp; Why is this? Investigate the overall performance impact of multitexturing, dependent textures, texture locality (think bump-mapped environment mapping with varying levels of bumpiness), etc.&nbsp; See just how slow you can make the card run!&nbsp; Discuss the tradeoff between functionality and performance.</p> <p><b>Vertex Engine</b> <br> Modern graphics hardware includes a vertex cache for triangles that share vertices. This cache holds already-transformed vertices so they need not be transformed again, which can improve geometry rates. Modify the GfxBench application to examine the vertex cache size and performance. 
Graph and explain your results. Discuss briefly the benefits and tradeoffs of a vertex cache. </p> <p><b>Graphics Interface</b> <br> Modify the GfxBench application to explore the front end of the graphics pipeline. Modern graphics hardware allows vertex data to be placed in AGP memory or on-board video memory for increased geometry performance. Use the NVIDIA "VertexArrayRange" or VAR extension to examine the performance of using AGP and video memory for vertex data compared to regular malloc'ed memory. Is one better than the other?&nbsp; Alternatively, try the newer "Vertex Buffer Object" or VBO extension (supported on both ATI and NVIDIA; the vendors are encouraging developers to use VBO instead of earlier extensions like VAR).&nbsp; How close can you get to the advertised performance of your graphics card?</p> <p><b>Other<br> </b>If there is some other aspect of graphics hardware performance that really intrigues you and you think you can probe it automatically, send me e-mail.</p> <h2>Logistics</h2> <p style="color: rgb(255, 0, 0);"><strong>This assignment is due Monday, Feb 21, at the start of class (but see the late policy on the course web page).&nbsp;</strong></p> <p>You are allowed to work in groups of two.&nbsp; At least one person should be familiar with OpenGL programming.&nbsp; If you have not programmed in OpenGL, pair up with someone in the class who has.&nbsp;</p> <p><strong>Submitting Your Results:&nbsp;&nbsp;</strong>Write up a web page that&nbsp;includes a complete report of your results, including any&nbsp;source code you wrote and graphs of your data, and e-mail the URL to the instructor. I want to be able to save your web page using "save web page complete", so it should be a single HTML page with embedded images (i.e., 
no links to additional HTML pages you have generated).<br> </p> <p><strong>Sample Code:&nbsp; </strong>The source code and compiled executable for the GfxBench application can be found here: </p> <p><a href="Tools/gfxbench.zip">GfxBench.zip (Windows)</a></p> <p><a href="Tools/gfxbench.tar.gz">GfxBench.tar.gz (Unix)</a></p> <p>This has been tested under Windows using Microsoft Visual Studio 6.0, and under Linux using GNU Make.&nbsp; I make no claims (nor does Ian) that GfxBench is itself optimal (i.e., it may be possible to achieve a slightly higher triangle rate than GfxBench does, even just using immediate mode calls).&nbsp; If you wish to improve GfxBench or start from scratch, feel free.&nbsp; Document what you did differently in your write-up.</p> <p><strong>Grading: </strong>A high grade will be awarded if you demonstrate a good understanding of how graphics hardware <i>could</i> work. Coming up with the correct value for a particular performance metric is less important than how you analyze your results.&nbsp; You are not expected to know all of the details regarding the system you benchmark.&nbsp; Rather, your grade will be&nbsp;determined&nbsp;by the tests you&nbsp;design and your analysis of the results.&nbsp; Both members of a group will be assigned the same grade.</p> </body> </html>