Memless RT
neat AVX optimizations done to the memless RT stuff and more:
http://voxelium.wordpress.com/2012/04/2 ... tructures/
-
- Posts: 167
- Joined: Mon Nov 28, 2011 7:28 pm
Re: Memless RT
Neat. What's cool is that this paradigm is pretty new... maybe within a few years it can be made competitive with more traditional approaches. Thanks for sharing.
-
- Posts: 48
- Joined: Fri Dec 02, 2011 12:21 pm
Re: Memless RT
I really like that approach (always like to be 100% dynamic on the fly). Hope they'll soon find a proper solution to scale with multiple cores. that's quite limiting right now. then again, core count seems to stagnate currently. still, they need to be fed 
once that's found, it'll be great.

Re: Memless RT
Don't forget also that this doesn't pay off if you need to trace many rays through the scene. Also, it requires tracing rays in batches, which can pose some restrictions on the system, like integrator interruption and resuming, etc.
Re: Memless RT
Is it just me or is this (and REYES) somehow moving towards stochastic rasterization, except in world space as opposed to camera space? Where REYES is sorting primitives into screen-space buckets before doing 2D point-in-triangle tests for each pixel, this, if I understand it correctly, is sorting primitives into 3D bounding boxes before doing 3D ray/triangle tests. Together with micropolygon ray tracing, I can see this all moving towards a generalized method in the middle.
Anyhow, I wonder if this would allow for some nice on-demand tessellations.
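To make the "sort primitives into 3D boxes, then test" idea concrete, here is a minimal sketch of a divide-and-conquer batch tracer. It is only one reading of the approach, not the code from the paper or the blog: all names (`traceBatch`, `trace`, `makeRay`), the midpoint split along the longest axis, the leaf threshold of 4, and the depth limit of 16 are assumptions made for illustration. It also ignores front-to-back child ordering and any SIMD, and uses a conservative triangle/box test.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

using Vec3 = std::array<float, 3>;
struct Ray { Vec3 o, d; float t; int hit; };  // t: closest hit so far, hit: triangle id or -1
struct Tri { Vec3 a, b, c; };
struct Box { Vec3 lo, hi; };

Ray makeRay(Vec3 o, Vec3 d) { return Ray{o, d, std::numeric_limits<float>::infinity(), -1}; }

static Vec3 sub(const Vec3& u, const Vec3& v) { return {u[0] - v[0], u[1] - v[1], u[2] - v[2]}; }
static float dot(const Vec3& u, const Vec3& v) { return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]; }
static Vec3 cross(const Vec3& u, const Vec3& v) {
    return {u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]};
}

// Slab test, clipped to [0, r.t]: rays that already have a closer hit are culled.
static bool rayHitsBox(const Ray& r, const Box& b) {
    float t0 = 0.0f, t1 = r.t;
    for (int k = 0; k < 3; ++k) {
        float inv = 1.0f / r.d[k];
        float tn = (b.lo[k] - r.o[k]) * inv, tf = (b.hi[k] - r.o[k]) * inv;
        if (tn > tf) std::swap(tn, tf);
        t0 = std::max(t0, tn);
        t1 = std::min(t1, tf);
    }
    return t0 <= t1;
}

// Conservative overlap test: uses the triangle's bounding box, not its exact shape.
static bool triTouchesBox(const Tri& t, const Box& b) {
    for (int k = 0; k < 3; ++k) {
        float lo = std::min({t.a[k], t.b[k], t.c[k]});
        float hi = std::max({t.a[k], t.b[k], t.c[k]});
        if (hi < b.lo[k] || lo > b.hi[k]) return false;
    }
    return true;
}

// Moeller-Trumbore ray/triangle test; keeps the closest hit in the ray.
static void intersect(Ray& r, const Tri& t, int id) {
    Vec3 e1 = sub(t.b, t.a), e2 = sub(t.c, t.a), p = cross(r.d, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < 1e-12f) return;
    Vec3 s = sub(r.o, t.a);
    float u = dot(s, p) / det;
    if (u < 0.0f || u > 1.0f) return;
    Vec3 q = cross(s, e1);
    float v = dot(r.d, q) / det;
    if (v < 0.0f || u + v > 1.0f) return;
    float dist = dot(e2, q) / det;
    if (dist > 0.0f && dist < r.t) { r.t = dist; r.hit = id; }
}

// Recursively narrows both index ranges to the current box; nothing is stored.
static void trace(std::vector<Ray>& rays, const std::vector<Tri>& tris,
                  std::vector<int>& rayIds, int r0, int r1,
                  std::vector<int>& triIds, int t0, int t1, const Box& box, int depth) {
    if (r0 == r1 || t0 == t1) return;
    if (r1 - r0 <= 4 || t1 - t0 <= 4 || depth == 0) {  // small enough: brute force
        for (int i = r0; i < r1; ++i)
            for (int j = t0; j < t1; ++j)
                intersect(rays[rayIds[i]], tris[triIds[j]], triIds[j]);
        return;
    }
    int axis = 0;
    for (int k = 1; k < 3; ++k)
        if (box.hi[k] - box.lo[k] > box.hi[axis] - box.lo[axis]) axis = k;
    float mid = 0.5f * (box.lo[axis] + box.hi[axis]);
    for (int side = 0; side < 2; ++side) {
        Box child = box;
        (side == 0 ? child.hi : child.lo)[axis] = mid;
        // In-place partition: indices that reach the child box move to the front.
        int tm = int(std::partition(triIds.begin() + t0, triIds.begin() + t1,
                     [&](int i) { return triTouchesBox(tris[i], child); }) - triIds.begin());
        int rm = int(std::partition(rayIds.begin() + r0, rayIds.begin() + r1,
                     [&](int i) { return rayHitsBox(rays[i], child); }) - rayIds.begin());
        trace(rays, tris, rayIds, r0, rm, triIds, t0, tm, child, depth - 1);
    }
}

// sceneBox must enclose every triangle.
void traceBatch(std::vector<Ray>& rays, const std::vector<Tri>& tris, const Box& sceneBox) {
    assert(sceneBox.lo[0] <= sceneBox.hi[0]);
    std::vector<int> rayIds(rays.size()), triIds(tris.size());
    std::iota(rayIds.begin(), rayIds.end(), 0);
    std::iota(triIds.begin(), triIds.end(), 0);
    trace(rays, tris, rayIds, 0, int(rayIds.size()), triIds, 0, int(triIds.size()), sceneBox, 16);
}
```

The only persistent memory is the two index arrays, which is where the "memless" name comes from: the hierarchy exists only implicitly in the recursion and is gone once the batch returns.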
Re: Memless RT
unfortunately the original thread about the memless RT idea vanished together with the old ompf, but on-demand tessellation is perfectly possible, IMHO even automatic LOD for incoherent rays, if one is willing to make some compromises..
(and in case somebody missed the older poster/publication on the topic: http://ainc.de/Research/MemlessRT.pdf)
-
- Posts: 3
- Joined: Thu Apr 26, 2012 2:54 pm
Re: Memless RT
davepermen wrote: I really like that approach (always like to be 100% dynamic on the fly). Hope they'll soon find a proper solution to scale with multiple cores. that's quite limiting right now. then again, core count seems to stagnate currently. still, they need to be fed. once that's found, it'll be great.
Well, I am not sure that there is a real scaling problem. You may want to have a look at the original paper (http://dl.acm.org/citation.cfm?id=2019636) to get more details on the algorithm, the results, and the multicore scaling (~3.5x with 4 threads).
This TOG paper will be presented at SIGGRAPH 2012, so stop by if you are around and want to know more.
The EG paper has some interest, but by swapping rays instead of just swapping indices, the authors end up with poor scaling due to some bandwidth constraints.
"Don't forget also that this doesn't pay off if you need to trace many rays through the scene. Also, it requires tracing rays in batches, which can pose some restrictions on the system, like integrator interruption and resuming, etc."
Yep, large batches are needed so it depends on your circumstances. I am not sure why "this doesn't pay off if you need to trace many rays through the scene" though.

(!advertisement!)

http://www.highperformancegraphics.org/ ... stract.pdf
"maybe within a few years it can be made competitive with more traditional approaches."
It depends what we mean by competitive. I am pretty sure that obtaining several million purely random rays per second (not ambient occlusion rays, for instance) is quite competitive, and do not forget that a prior construction step is not needed. Maybe I am wrong, but I would expect that in many cases you can get 75% to 95% of the performance of a state-of-the-art ray tracer with such an approach (again, I refer you to the TOG paper results).
Finally, what Toxie said previously is right: things like tessellation are fun and possibly easier with such a paradigm.
Re: Memless RT
"Don't forget also that this doesn't pay off if you need to trace many rays through the scene. Also, it requires tracing rays in batches, which can pose some restrictions on the system, like integrator interruption and resuming, etc."
DTRendering wrote: Yep, large batches are needed so it depends on your circumstances. I am not sure why "this doesn't pay off if you need to trace many rays through the scene" though.
Well, because every batch intersection operation effectively builds a new acceleration structure, which is then thrown away. For multi-bounce global illumination you need to perform many such iterations. Actually, you could cache the built structure and maybe even refine it over iterations. That's an interesting direction to investigate. But with caching the parallelization issue should become worse.
-
- Posts: 3
- Joined: Thu Apr 26, 2012 2:54 pm
Re: Memless RT
"because every batch intersection operation effectively builds a new acceleration structure"
I would say that it actually builds parts of a new acceleration structure, with subtle differences as well (I'll emphasize that in the SIGGRAPH talk). This is quite important as the building percentage is not so high. So it may require several batches (2 or 4 or 10 or 100?) before precomputing a spatial subdivision data structure is a real advantage. I would estimate it between 4 and 8 batches, but it really depends on the circumstances.
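One rough way to reason about that break-even point, assuming a deliberately simple cost model that is not taken from either paper: let B be the one-time cost of building a conventional structure, t_pre the cost of tracing one batch through it, and t_dc the cost of tracing one batch with the divide-and-conquer approach (all three symbols are hypothetical, measured in the same time unit). Precomputing pays off after n batches when

```latex
\[
B + n\,t_{\mathrm{pre}} < n\,t_{\mathrm{dc}}
\quad\Longleftrightarrow\quad
n > \frac{B}{t_{\mathrm{dc}} - t_{\mathrm{pre}}},
\qquad t_{\mathrm{dc}} > t_{\mathrm{pre}}.
\]
```

Under this model, an estimate of 4 to 8 batches corresponds to a build cost of roughly 4 to 8 times the per-batch overhead of not having the structure. With only one batch, the prebuilt structure wins only if its build cost is smaller than that overhead.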
The algorithm itself can also solve problems in other areas. For instance, if you are using ray-tracing to do collision detections only, then only one batch is needed.
The EG short paper compares their results with precomputed data structures, but does not discuss the construction times of those data structures either. It could be interesting if the authors could post informal results here.
Finally, you do not throw away your results. At the end of your batch processing, you end up with a list of shuffled triangles, and clearly there is some sort of coherence in the order of appearance.

Last edited by DTRendering on Fri Apr 27, 2012 2:12 pm, edited 1 time in total.
Re: Memless RT
DTRendering wrote: Well, I am not sure that there is a real scaling problem. You may want to have a look at the original paper (http://dl.acm.org/citation.cfm?id=2019636) to get more details on the algorithm, the results, and the multicore scaling (~3.5x with 4 threads).
Unfortunately, there is definitely a scaling problem for incoherent rays. The scaling strongly depends on the scene and the CPU architecture. For example, I've achieved almost the same scaling as you did for the Conference scene on a very similar CPU (Bloomfield), but it can be worse with Sandy Bridge or other scenes. That's why I did tests using both CPUs.
DTRendering wrote: The EG paper has some interest, but by swapping rays instead of just swapping indices, the authors end up with poor scaling due to some bandwidth constraints.
Swapping indices works well for coherent rays, but performs consistently worse for incoherent rays because of the poor cache utilization.
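For readers following the rays-versus-indices argument, the two filtering variants can be sketched as below. This is an illustration only, not code from either paper; the `Ray` layout and the names `partitionIndices` and `partitionRays` are assumptions. The first variant moves 4-byte indices and leaves the rays where they are, so every later access goes through an indirection that jumps around memory once the rays become incoherent. The second moves the full 32-byte rays, which costs more bandwidth per swap but leaves the active rays contiguous.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 32-byte ray; `id` lets results be matched back to the
// original ray after the array itself has been reordered.
struct Ray { float o[3], d[3], tmax; int id; };

// Variant 1: reorder only the indices. Rays stay in place and are read
// through ids[i], which is cheap to swap but gives scattered reads.
template <class Pred>
int partitionIndices(const std::vector<Ray>& rays, std::vector<int>& ids,
                     int first, int last, Pred active) {
    assert(0 <= first && first <= last && last <= int(ids.size()));
    auto mid = std::partition(ids.begin() + first, ids.begin() + last,
                              [&](int i) { return active(rays[i]); });
    return int(mid - ids.begin());
}

// Variant 2: reorder the rays themselves. Each swap moves whole structs,
// but the active rays end up packed in [first, returned index).
template <class Pred>
int partitionRays(std::vector<Ray>& rays, int first, int last, Pred active) {
    assert(0 <= first && first <= last && last <= int(rays.size()));
    auto mid = std::partition(rays.begin() + first, rays.begin() + last, active);
    return int(mid - rays.begin());
}
```

Both return the end of the active range, so the recursive traversal can use either one interchangeably; which one scales better is exactly the point under dispute above and will depend on ray coherence, scene, and CPU.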