OK, did some quick tests:
- Applying the method to direct light sampling yields the results presented in the paper.
- Applying the method to the first diffuse bounce yields no perceivable improvement in quality.
In general, the number of dimensions is a problem. I tried 6 dimensions (direct light sampling on the first diffuse surface, then the first diffuse bounce, and finally direct light sampling on the second diffuse surface), but this already seems to reduce the quality of the penumbras compared to using just 4 dimensions. This suggests that using only 2 dimensions could yield the best quality, since additional dimensions would then not degrade the distribution of the first two. It could also be that slightly more converged tiles yield better results: I already had high-quality 128x128 / d=10 tiles, and produced the 32x32 / d=6 tiles in just a few minutes (I didn't expect to need them).
So that's pretty much what everyone expected.
That being said, the method clearly improves image quality for the first couple of samples, is straightforward to implement, and should have only a tiny impact on performance.
EDIT: slightly more converged tile. Method applied to 6 dimensions (NEE-1, diff bounce, NEE-2). Obvious tiling pattern due to small tiles; this disappears for larger tiles.
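For anyone who wants to try this, here is a rough sketch of how I'd expect the per-pixel lookup to work. It assumes the tile stores one d-dimensional offset per texel and that the offset is added to the sample modulo 1 (a Cranley-Patterson style shift); that mechanism is my assumption, not something spelled out above. All names (`make_tile`, `tile_offset`, `shifted_sample`, `dithered_dims`) are made up for illustration, and the tile here is plain white noise standing in for a real optimized tile.

```python
import random

def make_tile(size, dims, seed=0):
    """Placeholder tile: white noise standing in for an optimized tile (assumption)."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(dims)]
             for _ in range(size)]
            for _ in range(size)]

def tile_offset(tile, x, y):
    """Look up the offset vector for pixel (x, y); the tile repeats across the image."""
    size = len(tile)
    return tile[y % size][x % size]

def shifted_sample(sample, offset, dithered_dims):
    """Shift the first `dithered_dims` dimensions by the tile offset, modulo 1.

    Dimensions beyond `dithered_dims` are passed through unchanged.
    """
    out = []
    for i, u in enumerate(sample):
        if i < dithered_dims:
            out.append((u + offset[i]) % 1.0)
        else:
            out.append(u)
    return out

# Example: 32x32 tile with 6 dimensions, but only the first 2 are dithered.
tile = make_tile(32, 6)
offset = tile_offset(tile, x=40, y=5)       # same texel as pixel (8, 5)
sample = [0.9, 0.5, 0.25, 0.75, 0.1, 0.6]   # one 6D sample from any sequence
print(shifted_sample(sample, offset, dithered_dims=2))
```

The `% size` wrap in `tile_offset` is why a 32x32 tile repeats every 32 pixels, which would show up as a visible tiling pattern. Restricting `dithered_dims` to 2 corresponds to applying the tile only to the first direct light sample.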