
Getting over 2.5B rays/s on a CPU is exciting, but I agree we have to see the incoherent numbers.

Statistics: Posted by atlas — Wed Sep 28, 2016 1:10 am


Statistics: Posted by atlas — Tue Sep 27, 2016 10:32 pm


Hardware?

Statistics: Posted by rtpt — Mon Sep 26, 2016 8:29 am


KNL seems to be the first accelerator from Intel with some real power under the hood (KNF and KNC were simple non-starters). So I did an optimized implementation of CLPT for AVX-512 and was surprised by the outcome.

CLPT is by far the fastest RT kernel for CPUs today, but it has never been compared to GPUs, so I went looking for some numbers. Not much to find! I therefore used the medium numbers (viewpoint 2) from AMD FireRays 2 on a FirePro W9100, and measured the test scenes on CUDA using the implementation from NVIDIA (http://www.nvidia.com/object/nvidia_research_pub_011.html), optimized for the NV Titan. To make it short: using coherent ray traversal, KNL can render most of the scenes I have at a stable rate below 1 ms into a 1024x1024 framebuffer.

Rem.: the CUDA and KNL numbers are average values computed over a sequence of several thousand frames (scene fly-through); AMD FireRays is single shot.

Statistics: Posted by mpeterson — Fri Sep 23, 2016 1:05 pm


r is the Fresnel reflectance of the light exiting the top layer; bottomColor is the color reflected by the bottom layer. prtselect is a function that, for each individual component, selects the first argument if the third argument is true, or the second if false.

Code:

```cpp
inline Color dualLayerRefl(const Color &r, const Color &bottomColor)
{
    const auto t = White - r;
    const auto bt = bottomColor * t;
    return bt + pow2(bt) * (
        r + r * bottomColor * (prtselect(t.reciprocal(), White, t > 0) - White)
    );
}
```

Then you use it like this (F is the Fresnel reflectance of the light entering the top layer):

```cpp
colorReflected = dualLayerRefl(r, bottomColor) * (White - F);
```

*EDIT* I adjusted my function, which didn't handle bottomColor values higher than 1.

Statistics: Posted by patlefort — Wed Sep 14, 2016 6:02 am


It's a GPU renderer using OpenCL.

Statistics: Posted by atlas — Mon Sep 12, 2016 3:58 pm


I'm curious: what do you mean by a bi-layer, multi-scattering microfacet model?

Is it GPU or CPU?

Statistics: Posted by patlefort — Mon Sep 12, 2016 8:57 am


You can keep up with my renderer here: http://www.twitter.com/rove3d

Statistics: Posted by atlas — Sat Sep 10, 2016 7:00 pm


@atlas:

The method in the paper is especially nice because it is SIMD/SoA vector friendly. With conventional methods that are robust, you would check if the auxiliary vector you chose (e.g. (0,1,0)) is by accident just the vector about which you construct your basis, then find the min_element of that vec3, add 1 to that and renormalize. Finding min_element is however not SIMD friendly because you have to unpack the SoA vec3 for that.

Just as a warning, I wanted to mention that the robust methods unfortunately don't guarantee that the basis has a certain handedness, which is necessary for many tangent-space operations. Up until now I had always used Hughes and Möller's method and never came across this problem (e.g. I used it for AO rays, where the handedness doesn't matter). I'll need to find a way to work around this.

Not robust:

Naive:

Hughes and Möller:
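For reference, a minimal sketch of the naive and the Hughes–Möller constructions being compared (the `Vec3` type and helper names here are my own, not taken from any library):

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3 cross(const Vec3 &a, const Vec3 &b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static double dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(const Vec3 &v) {
    const double len = std::sqrt(dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

// Naive: cross with a fixed auxiliary axis.
// Degenerates (NaNs) when n is (anti)parallel to that axis.
void naiveBasis(const Vec3 &n, Vec3 &b1, Vec3 &b2) {
    b1 = normalize(cross(n, Vec3{ 0.0, 1.0, 0.0 })); // breaks for n == (0, ±1, 0)
    b2 = cross(n, b1);
}

// Hughes & Möller (1999): branch on the smaller components so the
// intermediate vector is never zero. Because b2 = cross(n, b1), the
// frame (b1, b2, n) is always right-handed, unlike the branchless
// SIMD-friendly method from the paper.
void hughesMoellerBasis(const Vec3 &n, Vec3 &b1, Vec3 &b2) {
    if (std::fabs(n.x) > std::fabs(n.z))
        b1 = normalize(Vec3{ -n.y, n.x, 0.0 });
    else
        b1 = normalize(Vec3{ 0.0, -n.z, n.y });
    b2 = cross(n, b1);
}
```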

Statistics: Posted by szellmann — Wed Aug 31, 2016 10:54 am


https://www.semanticscholar.org/paper/B ... b4299c/pdf

Keep us updated.

Statistics: Posted by atlas — Fri Aug 19, 2016 5:08 pm


I can understand why the ideal specular BRDF/BTDF should be divided by the cosine.

BRDF has reciprocity, so it has the same value if it takes swapped directions:

f_r(w_i, w_o) = f_r(w_o, w_i)

It is easy to confirm the ideal specular BRDF obey this relation.

The BTDF doesn't have reciprocity, but it satisfies a generalized relation (e.g. eq. 5.14 in Veach's thesis):

f_t(w_i, w_o) / eta_o^2 = f_t(w_o, w_i) / eta_i^2

My question: the ideal specular BTDF appears NOT to obey the above relation.

f_t(w_i, w_o) = (eta_o / eta_i)^2 * (1 - F(theta_i)) * delta(w_o - T(w_i, n)) / |cos(theta_i)|

f_t(w_o, w_i) = (eta_i / eta_o)^2 * (1 - F(theta_o)) * delta(w_i - T(w_o, n)) / |cos(theta_o)|

In this case, it appears that f_t(w_i, w_o) / eta_o^2 != f_t(w_o, w_i) / eta_i^2

Is there some magic in the Dirac delta distribution that makes the relation hold?

Thanks

Statistics: Posted by shocker_0x15 — Fri Aug 12, 2016 7:54 am


Presenting every bounce yields biased results for most frames; I'm not sure this would look good. Of course a cap on the path length (especially at a low value like 8) also introduces bias, but I suspect it is far less noticeable.

Due to Russian roulette, the performance impact of longer paths is minimal, by the way; the only problem is that the buffers get very large.

Since the (compacted) buffers are mostly empty for the deeper bounces, it may be possible to allocate for depth = 8 and bounce to depth = 64, with some kind of a safety cap in case the RNG decides that every path should reach 64 for a particular frame. Theoretically, this situation has a non-zero probability; practically this should never happen of course.
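As a back-of-the-envelope check of that intuition, one can simulate how many paths stay alive per bounce under Russian roulette and see how quickly the compacted buffers empty out past the first few bounces (the survival probability of 0.7, the path count, and the RNG are made-up ingredients for this sketch, not part of any real wavefront tracer):

```cpp
#include <cstdint>
#include <vector>

// Returns the number of paths still alive at each bounce depth, up to
// maxDepth, when every path independently survives Russian roulette
// with probability `survive`.
std::vector<int> alivePerBounce(int numPaths, double survive, int maxDepth, uint64_t seed) {
    std::vector<int> alive;
    int n = numPaths;
    uint64_t s = seed;
    auto rng = [&]() { // xorshift64, mapped to a uniform double in [0, 1)
        s ^= s << 13; s ^= s >> 7; s ^= s << 17;
        return (s >> 11) * (1.0 / 9007199254740992.0);
    };
    for (int d = 0; d < maxDepth && n > 0; ++d) {
        alive.push_back(n);
        int next = 0;
        for (int i = 0; i < n; ++i)
            if (rng() < survive) ++next; // survivor gets compacted into the next buffer
        n = next;
    }
    return alive;
}
```

With these numbers the expected occupancy at depth d is numPaths * 0.7^d, so by depth 8 only a few percent of the paths are left, which is why sizing the buffers for a much shallower depth than the hard cap is plausible.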

Also note that when using CUDA/OpenGL interop the pixel data never leaves the device. Overhead of presenting results is very small that way.

Statistics: Posted by jbikker — Mon Jul 25, 2016 8:03 am
