
Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracing

Posted: Thu Nov 21, 2013 1:32 pm
by voxelium
Stackless Multi-BVH Traversal for CPU, MIC and GPU Ray Tracing
Attila T. Áfra and László Szirmay-Kalos
Computer Graphics Forum (2013)
http://voxelium.wordpress.com/2013/11/2 ... y-tracing/

Personal copy: http://cg.iit.bme.hu/~afra/publications ... mbvhsl.pdf
Definitive version: http://dx.doi.org/10.1111/cgf.12259 (currently free)

Abstract:
Stackless traversal algorithms for ray tracing acceleration structures require significantly less storage per ray than ordinary stack-based ones. This advantage is important for massively parallel rendering methods, where there are many rays in flight. On SIMD architectures, a commonly used acceleration structure is the multi bounding volume hierarchy (MBVH), which has multiple bounding boxes per node for improved parallelism. It scales to branching factors higher than two, for which, however, only stack-based traversal methods have been proposed so far.
In this paper, we introduce a novel stackless traversal algorithm for MBVHs with up to 4-way branching. Our approach replaces the stack with a small bitmask, supports dynamic ordered traversal, and has a low computation overhead. We also present efficient implementation techniques for recent CPU, MIC (Intel Xeon Phi), and GPU (NVIDIA Kepler) architectures.
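To make the "bitmask instead of a stack" idea concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes each node stores a parent pointer, packs one 4-bit "children still to visit" mask per tree level into a single integer, and visits children in index order rather than in the dynamic order the abstract mentions. The names `Node`, `traverse_stackless` and the `hit` predicate (a stand-in for a ray/box test) are hypothetical.

```python
class Node:
    """Hypothetical tree node with up to 4 children and a parent pointer."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def child_mask(node, hit):
    """Bit i is set if child i exists and passes the (assumed) hit test."""
    mask = 0
    for i, child in enumerate(node.children):
        if hit(child):
            mask |= 1 << i
    return mask


def traverse_stackless(root, hit=lambda n: True):
    """Return the names of reachable leaves, using no explicit stack.

    `trail` holds one 4-bit mask per ancestor level (assumption: this
    packing is for illustration only, not the paper's encoding).
    """
    if not root.children:
        return [root.name] if hit(root) else []
    order = []
    node = root
    trail = 0
    pending = child_mask(node, hit)
    while True:
        if pending:
            # Take the lowest pending child and remember the rest.
            i = (pending & -pending).bit_length() - 1
            pending &= pending - 1
            trail = (trail << 4) | pending
            node = node.children[i]
            if not node.children:
                order.append(node.name)
            pending = child_mask(node, hit)  # 0 for a leaf
        else:
            if node is root:
                break
            # Go back up via the parent pointer and restore its mask.
            node = node.parent
            pending = trail & 0xF
            trail >>= 4
    return order


tree = Node("root", [
    Node("a", [Node("a0"), Node("a1"), Node("a2"), Node("a3")]),
    Node("b"),
    Node("c", [Node("c0"), Node("c1")]),
])
all_leaves = traverse_stackless(tree)
without_a = traverse_stackless(tree, hit=lambda n: n.name != "a")
```

The per-ray state here is just the current node and one integer. With a fixed-width integer instead of Python's unbounded `int`, the 4 bits per level would cap the supported tree depth.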

Edit: added links to personal copy and definitive version.

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Fri Nov 22, 2013 10:26 am
by toxie
This is definitely one of the better papers on traversal from the last few years!
A good read, with some interesting ideas (maybe not super groundbreaking, but still).

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Fri Nov 22, 2013 3:51 pm
by voxelium
Thanks, toxie! :)

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Fri Nov 22, 2013 10:55 pm
by Dade
It is a shame AMD has dropped the VLIW architecture; QBVH (aka MBVH) was very effective on that kind of GPU too. Now, I guess, this kind of research is mostly useful for CPUs and Xeon Phi.

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Fri Nov 29, 2013 4:23 pm
by toxie
Still, the small bitmask/stack should come in handy for some applications..

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Fri Nov 29, 2013 5:07 pm
by ingenious
I wonder why no author's preprint of the paper is publicly available. It shouldn't be a problem to publish such a version.

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Mon Jan 20, 2014 8:29 pm
by cessen
Just wanted to note that I implemented the algorithms from this paper in Psychopath, and it's been a huge benefit. Specifically, I was previously limited to a 2-arity BVH due to Psychopath's use of ray reordering. Using the algorithm in this paper for 4-arity BVH traversal, I was able to improve BVH performance by over 50%. And even using the 2-arity algorithm significantly simplified my code and provided a nice bump in performance compared to the algorithm I was previously using.

Thanks for the paper!

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Tue Jan 21, 2014 10:03 am
by mpeterson
Yes, but keep in mind that this approach is at least an order of magnitude behind the state of the art in performance. mp.

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Tue Jan 21, 2014 12:15 pm
by voxelium
Whoa, an order of magnitude behind in performance? Compared to what exactly?

Re: Stackless MBVH Traversal for CPU, MIC and GPU Ray Tracin

Posted: Tue Jan 21, 2014 3:11 pm
by jbikker
On CPU, the fastest practical approach is straight MBVH traversal. For first-bounce diffuse rays, Tsakok's MBVH/RS is optimal. I presented a paper with an approach that outperforms both (by ~20%), but it's a complex algorithm:
http://arauna2.ompf2.com/files/cgf_article.pdf