routines for manycore cpus. these implementations outperform the latest embree kernels
by 4x for coherent ray transport and by more than 2x for incoherent transport.
the underlying acceleration structure is an mbvh4 built with spatial splits. this kind of
space subdivision typically gives the best traversal performance for 99% of all scenes,
but a parallel, vectorized implementation is not easy to build. some years ago we
already developed a highly parallel morton grid builder that today is as fast as any gpu
grid builder when running on an up-to-date dual-socket system (https://www.youtube.com/watch?v=qtXs4APw1uI).
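
for readers who have not seen morton ordering before, here is a minimal sketch of the 30-bit 3d morton code that grid and lbvh builders commonly sort cells or primitives by. it is the standard bit-interleaving trick, not our actual builder code; the names expand_bits_10 and morton3d are made up for this example, and the 10 bits per axis (a 1024^3 grid) are an assumption.

```cpp
#include <cassert>
#include <cstdint>

// spread the lower 10 bits of v so that two zero bits follow each original bit
inline std::uint32_t expand_bits_10(std::uint32_t v) {
    v &= 0x000003ffu;                    // keep 10 bits only
    v = (v | (v << 16)) & 0x030000ffu;
    v = (v | (v << 8))  & 0x0300f00fu;
    v = (v | (v << 4))  & 0x030c30c3u;
    v = (v | (v << 2))  & 0x09249249u;
    return v;
}

// interleave three 10-bit cell coordinates into one 30-bit key:
// x occupies bits 0,3,6,... y bits 1,4,7,... z bits 2,5,8,...
inline std::uint32_t morton3d(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    assert(x < 1024u && y < 1024u && z < 1024u);  // assumed grid resolution
    return expand_bits_10(x)
         | (expand_bits_10(y) << 1)
         | (expand_bits_10(z) << 2);
}
```

sorting by such a key puts cells that are close in space close together in memory, and the key computation is independent per element, which is why this step parallelizes so easily.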
in production environments low-quality builders are of little use, so
a fast and scalable high-quality bvh builder is needed. below we present some early results from our
latest developments in this direction. on complex scenes we outperform the embree builders by up to 8x!
the paper and more information can be found here:
http://rapt.technology/posts/part-ii-pa ... struction/
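
as a rough illustration of what a 4-wide bvh (mbvh4) node mentioned above can look like, here is a minimal sketch. it is not the layout used in our kernels; the struct Mbvh4Node, the helpers and the kEmpty marker are hypothetical names chosen for this example. the four child boxes are stored in soa form so that one ray can be tested against all four with simd; the loop below is scalar for readability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::int32_t kEmpty = -1;  // hypothetical marker for an unused child slot

// four child boxes in structure-of-arrays form, 128 bytes = two cache lines
struct alignas(64) Mbvh4Node {
    float min_x[4], min_y[4], min_z[4];
    float max_x[4], max_y[4], max_z[4];
    std::int32_t child[4];  // index of child node or first primitive
    std::int32_t count[4];  // 0 for inner nodes, primitive count for leaves
};
static_assert(sizeof(Mbvh4Node) == 128, "node should span two 64-byte cache lines");

struct Ray {
    float org[3];
    float inv_dir[3];  // 1 / direction, precomputed once per ray
    float t_min, t_max;
};

inline Mbvh4Node make_empty_node() {
    Mbvh4Node n{};
    for (int i = 0; i < 4; ++i) n.child[i] = kEmpty;
    return n;
}

inline void set_child(Mbvh4Node& n, int slot, float x0, float y0, float z0,
                      float x1, float y1, float z1, std::int32_t index) {
    assert(slot >= 0 && slot < 4);
    n.min_x[slot] = x0; n.min_y[slot] = y0; n.min_z[slot] = z0;
    n.max_x[slot] = x1; n.max_y[slot] = y1; n.max_z[slot] = z1;
    n.child[slot] = index;
}

// slab test of one ray against the four child boxes;
// returns a bitmask with bit i set if child i is hit
inline unsigned intersect4(const Mbvh4Node& n, const Ray& r) {
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i) {
        if (n.child[i] == kEmpty) continue;
        const float t0x = (n.min_x[i] - r.org[0]) * r.inv_dir[0];
        const float t1x = (n.max_x[i] - r.org[0]) * r.inv_dir[0];
        const float t0y = (n.min_y[i] - r.org[1]) * r.inv_dir[1];
        const float t1y = (n.max_y[i] - r.org[1]) * r.inv_dir[1];
        const float t0z = (n.min_z[i] - r.org[2]) * r.inv_dir[2];
        const float t1z = (n.max_z[i] - r.org[2]) * r.inv_dir[2];
        const float t_near = std::max(std::max(std::min(t0x, t1x), std::min(t0y, t1y)),
                                      std::max(std::min(t0z, t1z), r.t_min));
        const float t_far  = std::min(std::min(std::max(t0x, t1x), std::max(t0y, t1y)),
                                      std::min(std::max(t0z, t1z), r.t_max));
        if (t_near <= t_far) mask |= 1u << i;
    }
    return mask;
}
```

a real kernel would replace the loop with sse/avx intrinsics and order the hit children by distance; spatial splits only change how the builder fills these boxes, not the node layout itself.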
