Hi there,
I've started work on a small 2d physics engine using OpenCL, where the bulk of work is in continuous collision detection between moving line segments. I have implemented this by testing the vertices of line segments for intersection with the line segments of potentially colliding objects. The technique i use is based on representing the points as 3d line segments (movement in time), and the line segments as 3d quadratics (movement in time). The problem is then to find the point of intersection between a ray and a quadratic in the interval [0..1]. To simplify this i split the quadratic into two triangles and test the ray with both triangles.
Since bruteforce doesnt scale, I implemented a broadphase to cull the potential colliding objects (much like a Kd tree, actually im using an uniform grid). This helped a lot, but still after profiling the ray/triangle intersection takes up a significant amount of time.
I googled for some references on how to speed up the raytriangle test and there are plenty of techniques to be found:
MollerTrumbore
Plucker
Sven woop
Projection tests
But im not sure what the most efficient is for a gpu architecture. There are some papers comparing technqiues, but most of them seem to be outdated and are available when you have a subscription to the journal only.
Since this problem is so essential for raytracing, and you guys have done a lot of research & work on this problem, i was wondering if someone can point me to the current state of art for implementing this as efficiently as possible on a (current) gpu architecture.
Any help is greatly appreciated!
Martijn
OpenCL  raytriangle intersection

 Posts: 167
 Joined: Mon Nov 28, 2011 7:28 pm
Re: OpenCL  raytriangle intersection
I found "RayTriangle Intersection Algorithm for Modern CPU Architectures" to provide a really fast, and fairly robust ray triangle intersection. I found that MT is more robust and requires less storage, but Shevtsov's intersection test was faster for me (especially on GPU). Warning: I did this evaluation more than 5 years ago
Re: OpenCL  raytriangle intersection
There are several factors in this problem
1) the accuracy : do you need very accurate test or not ?
2) the memory usage : is it important for you ?
3) the speed : is it so important
In this problem you have to find the exact match for you.
By example you can use the ward method, it seems to me very general one.
You can use the sven woop for fast one.
And the pluecker version for accuracy.
So, if you just start maybe start woop version... And if needed you can switch to another method.
Hope it helps
1) the accuracy : do you need very accurate test or not ?
2) the memory usage : is it important for you ?
3) the speed : is it so important
In this problem you have to find the exact match for you.
By example you can use the ward method, it seems to me very general one.
You can use the sven woop for fast one.
And the pluecker version for accuracy.
So, if you just start maybe start woop version... And if needed you can switch to another method.
Hope it helps
Re: OpenCL  raytriangle intersection
@spectral, you're right spectral about matching the method with the specific goals, for now it remains speed, but the nice thing about this approach is that I can always swap the method and gain accuracy for example.
@graphicsman: I did some more digging and found a more recent analysis for current state gpu's:
http://gggj.ujaen.es/docs/grapp14_jimenez.pdf
Funny thing is that plain MollerTrumbore ray triangle checks seem to perform best on current state gpu's.
Thanks again!
Martijn
@graphicsman: I did some more digging and found a more recent analysis for current state gpu's:
http://gggj.ujaen.es/docs/grapp14_jimenez.pdf
Funny thing is that plain MollerTrumbore ray triangle checks seem to perform best on current state gpu's.
Thanks again!
Martijn
Re: OpenCL  raytriangle intersection
I've also found that the MollerTrumbore test is the fastest on GPUs. The Woop test (or perhaps a slight variation) is the fastest for me on CPUs, but because of the additional storagerelated cost, it is slightly slower than the MollerTrumbore test on GPUs in my experiments. Plucker is definitely slower albeit the most robust one. Projection test is probably not good anymore today, though it could have been the fastest without SIMD. I however think that you should see very minor differences in performance once you improved your acceleration data structure. I might be very wrong, but my two cents.
Re: OpenCL  raytriangle intersection
I use the MöllerTrumbore algorithm too. There are faster algorithms for computing if there is an intersection but most of the time you will want to compute the intersection point. There are other algorithms which cache values and require less mathematical operations to compute an intersection but they use more memory, when GPUs have limited amount of DRAM and no virtual memory, plus you get more cache trashing with those algortihms.
So MöllerTrumbore it is.
So MöllerTrumbore it is.