### Performance issues with Wald & Havran's O(N log N) kd-tree construction

Posted:

**Thu Feb 07, 2019 8:55 pm**

I've been attempting to implement Wald and Havran's O(N log N) kd-tree construction for a photon map implementation, and it consistently has vastly worse construction performance; I'm not sure how much of that is due to my splitting semantics.


If I implement the naive O(N log² N) algorithm using a median split, as suggested by Jensen, then the split logic (assuming a list sorted along the split axis) is (pseudocode):


```
split_axis = {longest axis}
List<Photon> the_photons = {list of photons sorted along split_axis}
split_index = the_photons.len / 2 // median
split_plane = the_photons[split_index].position[split_axis]
left = the_photons[0..split_index)
right = the_photons[split_index..the_photons.len)
```
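For concreteness, here's a minimal C++ sketch of that naive build. The `Photon`/`Node` types, the `leaf_size` cutoff, and the `longest_axis` helper are illustrative choices of my own, not anything from the paper. It re-sorts at every node, which is where the extra log factor comes from:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative types; real ones would be packed tighter.
struct Photon { float position[3]; };

struct Node {
    int   axis  = -1;           // split axis, -1 marks a leaf
    float plane = 0.0f;         // split plane position
    Node* left  = nullptr;      // ownership left to the caller for brevity
    Node* right = nullptr;
    std::vector<Photon> leaf;   // photons, only filled in leaves
};

// Pick the axis with the largest spatial extent.
int longest_axis(const std::vector<Photon>& photons) {
    float lo[3] = {  1e30f,  1e30f,  1e30f };
    float hi[3] = { -1e30f, -1e30f, -1e30f };
    for (const Photon& p : photons)
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], p.position[a]);
            hi[a] = std::max(hi[a], p.position[a]);
        }
    int best = 0;
    for (int a = 1; a < 3; ++a)
        if (hi[a] - lo[a] > hi[best] - lo[best]) best = a;
    return best;
}

// Naive build: the O(N log N) sort at every node times O(log N) depth
// gives the O(N log^2 N) bound.
Node* build(std::vector<Photon> photons, std::size_t leaf_size = 8) {
    Node* node = new Node;
    if (photons.size() <= leaf_size) {
        node->leaf = std::move(photons);
        return node;
    }
    int axis = longest_axis(photons);
    std::sort(photons.begin(), photons.end(),
              [axis](const Photon& a, const Photon& b) {
                  return a.position[axis] < b.position[axis];
              });
    std::size_t split_index = photons.size() / 2; // median split
    node->axis  = axis;
    node->plane = photons[split_index].position[axis];
    node->left  = build(std::vector<Photon>(photons.begin(),
                                            photons.begin() + split_index),
                        leaf_size);
    node->right = build(std::vector<Photon>(photons.begin() + split_index,
                                            photons.end()),
                        leaf_size);
    return node;
}
```

(A `std::nth_element` instead of the full sort would drop the per-node cost to O(N), but the sketch matches the sorted-list pseudocode above.)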

This fairly obviously gives a guaranteed worst-case complexity of O(N log² N), and measured performance seems to grow at approximately that rate.

When I try to use the W&H approach, the constant factors seem to be vastly higher, and the tree ends up being much worse. I believe the issue is how I'm splitting.

Here's what I'm doing (pseudocode again):


```
initial setup:
  events = []
  for photon in photons {
    for axis in {x, y, z} {
      events.push({axis: axis, photon: &photon}) // actual data structure is saner, but functionally equivalent
    }
  }
  sorted_events = sort events by position

splitting:
  events = sorted list of events for this node
  split_axis = {longest axis}
  left_count = events.len / 3 / 2 // 3 events per photon, so this is half the photon count
  // find the actual split point
  split_event = {}
  axis_count = 0 // actual variable name is also terrible :D
  for event in events {
    if (event.axis != split_axis)
      continue
    axis_count++
    if (axis_count != left_count)
      continue
    split_event = event
    break
  }
  ...
```
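For comparison, the event setup and split search above might look like this in C++ (the `Event` layout and all names here are hypothetical, just mirroring the pseudocode):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative types; the real structures would be packed tighter.
struct Photon { float position[3]; };

struct Event {
    int   axis;      // which axis this event lies on
    int   photon;    // index into the photon array
    float position;  // photon position along that axis
};

// Initial setup: three events per photon (one per axis), sorted once
// up front by position.
std::vector<Event> build_events(const std::vector<Photon>& photons) {
    std::vector<Event> events;
    events.reserve(photons.size() * 3);
    for (int i = 0; i < (int)photons.size(); ++i)
        for (int axis = 0; axis < 3; ++axis)
            events.push_back({axis, i, photons[i].position[axis]});
    std::sort(events.begin(), events.end(),
              [](const Event& a, const Event& b) {
                  return a.position < b.position;
              });
    return events;
}

// Split search: scan for the left_count-th event on the chosen axis.
Event find_split_event(const std::vector<Event>& events, int split_axis) {
    std::size_t left_count = events.size() / 3 / 2; // 3 events per photon
    std::size_t axis_count = 0;
    for (const Event& e : events) {
        if (e.axis != split_axis) continue;
        if (++axis_count == left_count) return e;
    }
    return events.front(); // not reached for non-degenerate input
}
```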

At this point I have to actually split around `split_event`, but I can't work out how to do so efficiently.

The reason my current code goes wrong is that I split on `event.position[split_axis] < split_event.position[split_axis]`.

This trivially ends up breaking the invariant that the left and right children each contain half of the photons along the split axis.

The problem I can't work out is how to do the split, given that events on either side of the split point may belong to photons that fall in the other half (a photon's events on the other two axes can land anywhere in the position-sorted list). What we logically want is something like:


```
...
left_events = set{}
axis_count = 0
for event in events {
  if (event.axis != split_axis)
    continue
  left_events.add(event.photon)
  axis_count++
  if (axis_count != left_count)
    continue
  split_event = event
  break
}
```

Then our split logic easily becomes a matter of testing `left_events.contains(event.photon)`. But that introduces a hash table or similar, which both uses a large amount of memory and is expensive to build.

The alternative method I can think of is to essentially store a left/right flag in the photon structure itself and update it as appropriate during the split search. Doing that seems kind of ugly, but it seems like it would be the computationally fastest way to do it?
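For what it's worth, storing the side per photon is essentially what Wald and Havran do for triangles: classify each primitive with a flag, then partition each event list in one stable pass that preserves the sort order, so nothing ever has to be re-sorted and no hash table is needed, at O(N) per level. A sketch under those assumptions (the `Event`/`Side` types are illustrative, with the flags in a side array rather than in the photon struct):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative types; same shape as the event list described above.
struct Event {
    int   axis;      // which axis this event lies on
    int   photon;    // index into the photon array
    float position;  // photon position along that axis
};

enum class Side { Left, Right };

// Classify photons by walking the split axis's events in sorted order
// (the first left_count of them go left), then partition the whole
// event list in one stable pass. Both halves stay sorted by position.
std::pair<std::vector<Event>, std::vector<Event>>
split_events(const std::vector<Event>& events, int split_axis,
             std::size_t left_count, std::vector<Side>& side) {
    std::size_t axis_count = 0;
    for (const Event& e : events) {
        if (e.axis != split_axis) continue;
        side[e.photon] = (axis_count < left_count) ? Side::Left : Side::Right;
        ++axis_count;
    }
    std::vector<Event> left, right;
    for (const Event& e : events)
        (side[e.photon] == Side::Left ? left : right).push_back(e);
    return {left, right};
}
```

The stable second pass is the crucial bit: because both output lists keep the original position order, the children can be split again without ever sorting after the initial setup sort.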