Show-off, reference material & tools.
Nice. Do you have a video? I'm confused by your timings. When you say 12 FPS for 1080p, do you mean the final image quality is achieved 12 times per second, or the window is progressively updated at 12 FPS? Do you have some ray/sec stats? How many rays or paths per pixel in the 1080p frame? Do you have any full-res screen captures?
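To make the rays/sec question concrete, here is a rough back-of-the-envelope sketch. It assumes every displayed frame is fully rendered (not progressively refined); `rays_per_second`, `samples_per_pixel` and `rays_per_sample` are names made up for illustration, not anything from the renderer being discussed.

```python
def rays_per_second(width, height, fps, samples_per_pixel=1, rays_per_sample=1):
    """Rays traced per second, assuming each displayed frame is rendered in full.

    samples_per_pixel and rays_per_sample are illustrative parameters:
    rays_per_sample would cover bounces and shadow rays per path.
    """
    return width * height * samples_per_pixel * rays_per_sample * fps


# 1080p at 12 FPS with a single primary ray per pixel
primary = rays_per_second(1920, 1080, 12)
print(f"{primary / 1e6:.1f} Mrays/s")  # 24.9 Mrays/s
```

If the window is instead updated progressively at 12 FPS, the same figure only says how many rays are added per second, not how quickly the final image quality is reached.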
UnRAVeL wrote: I still can't figure out why the performance on a 7970 is only marginally better compared to a 680, I expected the 7970 to be much faster. Does this mean my kernel is compute bound? Can anyone shed some light on this?
There are profiling tools available for both AMD and NVIDIA that can help you. The AMD profiler has a counter that tells you the ALU utilization, which is a good indicator of whether your application is compute bound or not.
However, in my experience, ray tracers on GPUs are far from being compute bound and are instead mostly limited by register and cache size, cache and memory bandwidth, and so on. The 680 is said to be more constrained in cache and memory size and bandwidth, so the 7970 is usually faster.
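If you want a quick sanity check before reaching for the profiler, a roofline-style estimate is one option. This is only a sketch: the function names are made up, and the peak and per-ray numbers below are placeholders, not measured or official figures for either card. It also ignores register pressure, occupancy and cache behaviour, which are exactly the effects described above, so treat the result as a first guess.

```python
def ridge_point(peak_gflops, peak_gbps):
    """Arithmetic intensity (FLOP per byte) above which a kernel is compute bound."""
    return peak_gflops / peak_gbps


def likely_bound(flops_per_ray, bytes_per_ray, peak_gflops, peak_gbps):
    """Classify a kernel as 'compute' or 'memory' bound from its arithmetic intensity."""
    intensity = flops_per_ray / bytes_per_ray
    if intensity >= ridge_point(peak_gflops, peak_gbps):
        return "compute"
    return "memory"


# Placeholder numbers: a GPU with 3000 GFLOP/s and 200 GB/s peak,
# and a kernel doing 2000 FLOP while moving 1000 bytes per ray.
print(ridge_point(3000, 200))               # 15.0
print(likely_bound(2000, 1000, 3000, 200))  # memory
```

A real kernel's FLOP and byte counts per ray would have to come from profiler counters; the ALU utilization counter mentioned above is the more direct measurement.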