
Statistics: Posted by T.C. Chang — Wed Jan 11, 2017 3:57 pm


https://groups.google.com/forum/#!forum/pbrt

Statistics: Posted by papaboo — Wed Jan 11, 2017 6:39 am


This procedure implicitly includes a discrete selection between the two terms:

select case 1 with probability P_s

select case 2 with probability 1 - P_s

where P_s is the probability of performing the evaluation on the surface.

case 1:

immediately evaluate the attenuated radiance leaving a surface.

estimate: T * L_s / P_s

case 2:

the contribution from the second term becomes:

(int T * L_a ds) / (1 - P_s),

but this still contains an integral, so we need to sample a distance within the medium interval using a PDF p' (which only produces distances inside that interval).

estimate: T * L_a / ((1 - P_s) * p') = T * L_a / p,

where p is the original PDF (which can produce distances even beyond the medium boundary).

Therefore, in the second case, the first term is accounted for by the implicit division by (1 - P_s).

You can also find a good explanation of this on page 891 of Physically Based Rendering, 3rd edition.

Statistics: Posted by shocker_0x15 — Mon Jan 09, 2017 5:15 pm


Statistics: Posted by T.C. Chang — Thu Jan 05, 2017 5:58 pm


I haven't done anything special beyond applying some tiny compile fixes for this architecture, so there are no optimizations yet. I'm just porting the SIMD math lib to ARM NEON. Excited to see what performance difference SoA packets make for coherent workloads on the tiny CPU.

Statistics: Posted by szellmann — Fri Dec 30, 2016 1:05 am


case 1: distance longer than the medium-boundary

L_i = T * L_s

case 2: distance shorter than the medium-boundary

L_i = int T * L_a ds

Why is there no 'T * L_s' term in case 2? Is there no energy leaving the surface?

Statistics: Posted by XMAMan — Thu Dec 15, 2016 6:01 pm


As you know, the volume light transport equation consists of two terms: one is the attenuated light L_s leaving a surface, and the other is "the integral of" the light gain L_a due to in-scattering and emission.

L_i = T * L_s + int T * L_a ds

* L_a is volumetric emission and in-scattering

* The unit of L_s is [W/(m^2 sr)] but that of L_a is [W/(m^3 sr)]

When the distance sampling produces a distance longer than the medium boundary, this corresponds to evaluating the former term (there is no integral); otherwise you evaluate the latter term (inside the integral).

Monte Carlo estimates for these terms are:

T * L_s / Probability

or

T * L_a / PDF

, and the units of these match: the probability is dimensionless, while the distance PDF has units of [1/m], which cancels the extra [1/m] in L_a, so both estimates are in [W/(m^2 sr)].

Therefore, mixing the probability and the PDF is valid.

Statistics: Posted by shocker_0x15 — Thu Dec 15, 2016 1:55 am


I have a question about the PDF from the distance-sampling function in participating media.

In the SmallUPBP example there is a function that samples the ray distance:

Code:

virtual float SampleRay(
    const Ray   &/*aRay*/,
    const float aDistToBoundary,
    const float aRandom,
    float       *oPdf,
    const uint  aRaySamplingFlags = 0,
    float       *oRevPdf = NULL) const
{
    float s = -std::log(aRandom) / mMinPositiveAttenuationCoefComp();
    if (s < aDistToBoundary) // sample is before the boundary intersection
    {
        float att = EvalAttenuationInOneDim(mMinPositiveAttenuationCoefComp(), s);
        if (oPdf) *oPdf = mMinPositiveAttenuationCoefComp() * att;
        ...
    }
    else // sample is behind the boundary intersection
    {
        float att = EvalAttenuationInOneDim(mMinPositiveAttenuationCoefComp(), aDistToBoundary);
        if (oPdf) *oPdf = att;
        ...
    }
}

float EvalAttenuationInOneDim(
    const float aAttenuationCoefComp,
    const float aDistanceAlongRay) const
{
    return std::exp(-aAttenuationCoefComp * aDistanceAlongRay);
}

My question is: if the sampled distance is shorter than the distance to the medium boundary (the sample is inside the medium), then it returns a PDF. If the sampled distance is longer than the max distance, then it returns a probability.

A PDF is not the same as a probability, so why can he mix these two things?

If I use his explanation from http://www.smallupbp.com/thesis.pdf, page 33:

If the sampled distance d is less than or equal to d_max, the method returns d along with the desired pdf p̄(d) = σ_{t,m} T^0_{r,m}(d). If it is longer, d_max is returned with the probability Pr{d > d_max} = 1 − P(d_max) = T^0_{r,m}(d_max).

Or another text on this topic:

http://www.cs.cornell.edu/courses/cs6630/2012sp/notes/09volpath.pdf, page 3:

The probability that s > s_max is 1 − P(s_max), which is exp(−σ_t s_max): exactly the attenuation that needs to be applied to the radiance from behind the medium.

He also uses this formula, but I don't understand why he can mix PDF and probability.

In a 'normal' path tracer I get the path PDF by multiplying all the sampling PDFs (direction sampling / surface-area sampling). This path PDF is needed for the MC estimator. But what happens if one of these sampling PDFs is not a PDF but a probability?

Can someone explain to me why he can mix these two things when he wants to Monte Carlo integrate?

Statistics: Posted by XMAMan — Wed Dec 14, 2016 6:51 pm


szellmann wrote:

I remember sth. along the lines of the CPU linear filter being 5x slower than nearest filtering. In contrast to that, On the GPU, with HW accelerated filtering, I found there was virtually no difference in performance between nearest and linear filtering.

How did you lay out your textures in memory? It probably wouldn't make much of a difference if you're performing nearest-neighbor sampling incoherently. But if you're performing linear filtering, then I would expect tiling and/or swizzling textures would make the memory fetches more coherent for a single texture sample (at the cost of more computation).

Statistics: Posted by friedlinguini — Mon Dec 12, 2016 2:58 pm


In my personal experience the __global char* approach is very fast even if it's not hardware accelerated.

I'm curious: what difference in latency do you actually observe when comparing nearest-neighbor vs. linear filtering?

My rt lib has a dedicated CPU API where you can call texture access "intrinsics" like tex2D, etc., which emulate CUDA behavior. So I did the comparison once, both with CUDA and on the CPU. I compared 3D texture access (so 1x mem access for nearest, 8x mem accesses for linear!). I remember sth. along the lines of the CPU linear filter being 5x slower than nearest filtering. In contrast, on the GPU, with HW-accelerated filtering, I found there was virtually no difference in performance between nearest and linear filtering. So I'm curious how emulating textures in GPU DDR memory, which is basically what you propose, compares to that.

Cheers, Stefan

Statistics: Posted by szellmann — Mon Dec 12, 2016 8:17 am


Statistics: Posted by shocker_0x15 — Mon Dec 12, 2016 2:16 am


I think the idea is that the __global char* textureData is just one huge array of all texels for all textures.

The idea with __global char* textureData is the following. You would usually have something like a TextureHeader struct "__global TextureHeader* textureHeaders" which holds some information about the texture like: width, height, offset, textureType, filteringType, etc. Then, some material or shader stores the index to the correct TextureHeader.

At runtime in your kernel you would get something like:

TextureHeader texHeader = textureHeaders[shader.texid]; // texid is the index of the correct texture header, which provides all the data we need.

TextureType texType = texHeader.type; // Let's say it was an RGBA float, i.e. float4

float2 texDimensions = make_float2(texHeader.width, texHeader.height);

__global float4* myTexture = (__global float4*)&textureData[texHeader.offset]; // Here's the float4*

With myTexture you can do anything you like. This texture loading code goes wherever you want it to go, e.g. functions with switches for filterType, textureType, etc. For a (bi/uni)directional path tracer it is beneficial to sort the materials / hit points to get optimal performance, i.e. utilization and cache coherence with texture lookups. In my personal experience the __global char* approach is very fast even if it's not hardware accelerated.

Cheers

Statistics: Posted by ultimatemau — Sun Dec 11, 2016 11:58 am
