Hi all,

some forum members have shown interest in our paper on irregular grids. Since there are some concerns that we can quickly address, and to keep things organized, I am starting a new thread here.

The paper is available at http://kalojanov.com/data/irregular_grid.pdf . On the same page you can find additional materials like presentation slides and a CUDA implementation. All of this is also available here https://graphics.cg.uni-saarland.de/index.php?id=939

Both Arsene and myself will be happy to answer questions or help anyone who is interested in experimenting in the same or similar direction.

## Irregular grids

### Re: Irregular grids

Hey,

That was a really interesting publication! You have, what seems to me, to be excellent rendering performance. Still it would be interesting to profile your build times to see where are the bottlenecks in the algorithm for each scene. You can probably easily get this data off something like nvprof. San Miguel in particular is an interesting scene to study since it has a lot of dissimilar sized triangles. Sponza, Conference, Crown, and Fairy Forest, should have similar characteristics. If enough time is spent on the initial uniform grid construction, on scenes with dissimilar sized triangles, a different construction algorithm with better workload distribution among threads might be warranted. Looking at Figure 5 in your paper the grid construction time seems to be ~40% of total construction time in your algorithm. In our experience this can be made up to 9x faster with a more scalable algorithm.

We also made the CUDA source code from our paper available, in the MIT license, on Github here in case you're interested: https://github.com/vcosta/cuda-grid . In case you want to understand the algorithm just look at Figures 2,3,4 in our paper: https://www.academia.edu/13928983/Effic ... hitectures . It's totally parallel and uses no atomics whatsoever.

That was a really interesting publication! You have, what seems to me, to be excellent rendering performance. Still it would be interesting to profile your build times to see where are the bottlenecks in the algorithm for each scene. You can probably easily get this data off something like nvprof. San Miguel in particular is an interesting scene to study since it has a lot of dissimilar sized triangles. Sponza, Conference, Crown, and Fairy Forest, should have similar characteristics. If enough time is spent on the initial uniform grid construction, on scenes with dissimilar sized triangles, a different construction algorithm with better workload distribution among threads might be warranted. Looking at Figure 5 in your paper the grid construction time seems to be ~40% of total construction time in your algorithm. In our experience this can be made up to 9x faster with a more scalable algorithm.

We also made the CUDA source code from our paper available, in the MIT license, on Github here in case you're interested: https://github.com/vcosta/cuda-grid . In case you want to understand the algorithm just look at Figures 2,3,4 in our paper: https://www.academia.edu/13928983/Effic ... hitectures . It's totally parallel and uses no atomics whatsoever.

### Re: Irregular grids

We did profile our build times with nvvp, which is why we are confident that any speed-up to the initial creation of the triangle-cell references will not result in noticeable improvement of the build times. The top-level grid we create is much too sparse for that. You can try the code and see for yourself though.

### Re: Irregular grids

Following up the discussion from the "glitch pictures" thread:

As with the Nemoto & Omachi and the Macro-regions paper, the resulting acceleration structures are indeed very similar, but not the same. You certainly have a point criticizing how the paper is positioned, but our view on this is that the novelty is mainly in the combination of construction and traversal algorithms (and the resulting performance), not so much in the the geometric representation of the spatial index structures.

ziu wrote:But it is essential that you guys reference the EXCELL (Tamminen et al.) data structure papers in future publications. It is the exact same acceleration structure even if the construction method might not be the same. ...

As with the Nemoto & Omachi and the Macro-regions paper, the resulting acceleration structures are indeed very similar, but not the same. You certainly have a point criticizing how the paper is positioned, but our view on this is that the novelty is mainly in the combination of construction and traversal algorithms (and the resulting performance), not so much in the the geometric representation of the spatial index structures.

### Re: Irregular grids

javor wrote:Following up the discussion from the "glitch pictures" thread:

As with the Nemoto & Omachi and the Macro-regions paper, the resulting acceleration structures are indeed very similar, but not the same. You certainly have a point criticizing how the paper is positioned, but our view on this is that the novelty is mainly in the combination of construction and traversal algorithms (and the resulting performance), not so much in the the geometric representation of the spatial index structures.

Then why do you say this in Section 1, Paragraph 2, of your paper?

In this paper, we demonstrate that efficient ray traversal on modern GPUs is also possible with a new type of acceleration structure, which combines characteristics of hierarchical grids and bounding volume hierarchies.

To me it looks like you guys clearly thought the acceleration structure itself was novel. The acceleration structure is EXCELL so I think that paper should be referenced or, at minimum, acknowledged in some fashion. The traversal algorithm you guys use IMHO is also similar to prior EXCELL algorithms. What is novel in your work IMHO is the construction method plus the fact it's optimized for a GPU architecture. I quote the "Adaptive cell division for ray tracing" (1991), B. Pradhan, A. Mukhopadhyay paper I referenced before in the other thread because I think it makes this more explicit:

2. PROPOSED DATA STRUCTURE

We assume that the entire object space is uniformly divided into cells by a three-dimensional "virtual" grid of predefined resolution. We call this grid virtual because the cells are not occupied by the objects in object space but by integer entries which guide the search through object space. The virtual grid is represented by a three-dimensional integer array with each cell represented by an element of the array. The space subdivision is done in the following manner: At the beginning the entire object space is enclosed within a box called the root-box on which we superimpose the virtual grid. ... Fig. 1 illustrates this subdivision in two dimensions ... In the case of quadtree division (two-dimensional version of oct-tree division) this box will be divided into four uniform quadrants ... Now notice that each box contains an integral number of cells along both the x- and y-axes. Thus, every cell is contained in exactly one box. We mentioned earlier that the virtual grid is an integer array. This is because the array element corresponding to a cell is filled with identification number of the box which contains the cell. ... The data structures for cell division described in the previous section are shown in Fig. 3. It shows the box-array, an array of pointers to box structures created during subdivision. A box is entered into this array when it can no longer be subdivided. A box is uniquely identified by an index of the array. ... Each box consists of cells which contain the index of this box structure in the box-array.

theirs:

yours:

3. TRAVERSING THE DATA STRUCTURE

Traversing the data structure by a ray is reduced to traversing the virtual grid in object space. When a ray is fired, the first point in which it intersects the root-box is calculated. From this point onward the number of cell units traversed by the ray in both the x-and y-directions are maintained. Since cells are of uniform size, this can be obtained by dividing the actual distance traversed in x- and y-directions by respective cell dimensions in those directions. These cell units need not be whole numbers, since a ray may happen to cross a fraction of a cell. But the integer portions of the cell units yield indices of the cell the ray is currently traversing. Using these to index into the virtual grid we get the number of the box the ray is traversing. ... To find the next box, first we have to obtain the point at which the ray leaves the present box. Next, we have to calculate a point slightly away from the previous point along the direction of the ray. Glassner [2] employs a method in which the ray is intersected with all the six planes of the current box, and the nearest plane or planes are found by determining the smallest t values. This method consumes nine multiplication and nine addition operations, assuming that division and subtraction are equivalent to multiplication and addition, respectively. Lee [5] uses a target plane method in which 5 multiplications and 6 additions take place on average. But, in the worst case, this is slower than the Glassner's method. We employ another different method which is faster than the above two methods. We start by observing that any ray can potentially intersect, at most, three of the six planes of the current box, at the point where it leaves the box. We classify these six planes as lower x- plane, higher x-plane, lower y-plane, higher y-plane, lower z-plane, and higher z-plane. After determining the ray source u and the ray vector v, we find which three of these planes will possibly be intersected by the ray with smallest t by examining its ray vector. For example, if v has a negative x component then the lower x-plane, but not the higher x-plane, is to be tested for intersection. This method consumes five multiplications and six additions in the worst case, which is equal to the average case performance of Lee's method [5].

It looks to me like this describes the method you guys use at:

https://github.com/madmann91/hagrid/blo ... raverse.cu

This is also described in the paper "A new algorithm of space tracing using a CSG model" (1987), Kadi Bouatouch, Mohamed Ouali Madani, Thierry Priol, Bruno Arnaldi.

Still think it's a waste of time to read the prior work? Shit, now you made me do *your* work. But this was fun because I found a lot of other papers about something else that I'm currently working on as a result so it's good.

AFAIK your construction algorithm *is* novel compared to prior published EXCELL work, plus it uses SAH, and it is implemented on a GPU, I have no doubt that you guys ended up reinventing a lot of this independently (it wouldn't be the first time it happened in the field; the grid voxel traversal algorithm was even independently conceived by at least 3 teams in the 1980s at roughly the same time frame in different areas of the globe!). But that's no excuse not to quote the prior art in the future now that you know that it exists. At least that's my POV. Still, it doesn't detract from your major achievement, like I said before.

### Re: Irregular grids

Yes, the EXCELL and the B. Pradhan paper (I only read the first) explore in essence the same idea as we do - a non-uniform space subdivision into AABBs. If we would start the initial subdivision from a single cell instead of a grid, we will end up with the same structure after merging. Because of the "expansion" step we perform afterwards, we end up with cells that overlap each other. This combination of characteristics, together with using SAH and the ability to merge cells across top-level cells of the initial two-level subdivision is unique (to my knowledge).

Whether or not the differences to previous methods (like the macro-regions by Devillers) justify the claim about the novelty of the structure in the way we formulated it, is another question. Seeing the reaction(s) I would have re-phrased the sentence.

Whether or not the differences to previous methods (like the macro-regions by Devillers) justify the claim about the novelty of the structure in the way we formulated it, is another question. Seeing the reaction(s) I would have re-phrased the sentence.

### Re: Irregular grids

I am the first author of the paper.

I have a few remarks to make. First of all, I think there is undoubtedly novelty in the data structure we present. In particular, I see two major differences with previous work and the papers you mention:

Even if you completely disregard the novel construction algorithm, the optimized data layout, and all of the experiments, there is enough novelty in those two points alone to claim the data structure is new. That may not be earthshakingly new, but it is undeniably new. Research rarely starts from scratch, and ideas are always around, even if they are developed independently.

Concerning your related work suggestions, as Javor said, we have to make a choice as to what to mention in the references. The intended audience is people working using ray tracing today. Most of the approaches you mention, while doing something somewhat similar, are more than 20 years old, and re-implementing them for comparison purposes would not be trivial because they were not designed to work on GPUs. We already did that effort for Macro-Regions, and the results speak for themselves: Performance is just not here (on a simple scene, like Sponza, Macro-Regions are around 4 times slower). There are plenty of other approaches that do something somewhat similar and we simply do not have the time/space to put each and every one of them in the paper. The choice of references we made might not be perfect, but we believe it positions our paper as a modern, GPU ray tracing article.

I have a few remarks to make. First of all, I think there is undoubtedly novelty in the data structure we present. In particular, I see two major differences with previous work and the papers you mention:

- We are using a hierarchical grid as an index ("brick map"). Previous approaches all use a uniform grid for that purpose. There however is a cost in using a hierarchical grid: Our virtual grid must have a resolution of the form K*(2^d), and doing a lookup in the virtual map is no longer one memory access, but a sequence of dependent memory accesses. The advantage is that our hierarchical grid allows us to easily reach grid resolutions of over 50000^3, whereas all of the previous approaches, if implemented on a modern GPU, would not be able to get beyond around 2048^3. A large margin of the performance gains we have come from this (see the section that compares against Macro-Regions in the paper).
- We allow cells to overlap each other. The work that is the closest in that respect is the Macro-Regions paper, where only empty cells can overlap. This, in turn, gives more freedom to choose a cell shape than any of the previous approaches. As a consequence, the expansion phase can provide 20% performance improvements over the already merged grid.

Even if you completely disregard the novel construction algorithm, the optimized data layout, and all of the experiments, there is enough novelty in those two points alone to claim the data structure is new. That may not be earthshakingly new, but it is undeniably new. Research rarely starts from scratch, and ideas are always around, even if they are developed independently.

Concerning your related work suggestions, as Javor said, we have to make a choice as to what to mention in the references. The intended audience is people working using ray tracing today. Most of the approaches you mention, while doing something somewhat similar, are more than 20 years old, and re-implementing them for comparison purposes would not be trivial because they were not designed to work on GPUs. We already did that effort for Macro-Regions, and the results speak for themselves: Performance is just not here (on a simple scene, like Sponza, Macro-Regions are around 4 times slower). There are plenty of other approaches that do something somewhat similar and we simply do not have the time/space to put each and every one of them in the paper. The choice of references we made might not be perfect, but we believe it positions our paper as a modern, GPU ray tracing article.

### Who is online

Users browsing this forum: No registered users and 1 guest