Questions about Metropolis sampling.

Practical and theoretical implementation discussion.
friedlinguini
Posts: 89
Joined: Thu Apr 11, 2013 5:15 pm

Re: Questions about Metropolis sampling.

Post by friedlinguini » Thu Oct 02, 2014 6:53 pm

Dietger wrote:On another note, whether MLT really produces higher-quality samples is debatable. Sure, the sampling density is proportional to the target function, perfect importance sampling and all that, but it comes at a price. When MLT gets stuck in a peak of the target function for too long, the correlation artifacts can be massive, destroying any benefit gained from 'perfect' importance sampling. One major issue with the oh-so-robust MLT algorithm is that it tends to suffer from aliasing artifacts. Basically, it sucks at resolving small features. Imagine a lamp hanging from a pixel-thin wire, or a faraway handrail. In PT you don't need that many samples per pixel before you get a nicely anti-aliased wire. In MLT, however, your small-step mutations had better be really small in image space, or else this wire will look aliased for a very long time.
I think that's a little unfair. All rendering methods have weaknesses, which is why global illumination is not yet a solved problem. MLT's strength is in finding and reusing difficult-to-find samples, not simply producing "higher quality" samples.

For the thin-wire case it's simple enough to take image resolution into account when choosing a lower bound on mutation size. Orthogonally, it's also a common practice to sample "easy" paths (e.g., L[D]S*E paths) with ordinary path tracing and to leave the rest for MLT. In such cases, the illumination tends to be more even and less prominent, reducing the need for stratification and also generally decreasing the probability of "stuck" samples.
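To make the resolution-aware bound concrete, here is a minimal sketch of a Kelemen-style small-step perturbation in primary sample space, with the mutation radii tied to the pixel size so a pixel-thin feature still gets sampled densely enough. All of the numbers (image width, radius ratios) are invented for the illustration:

```python
import math, random

random.seed(1)

def small_step(u, r1, r2):
    # Kelemen-style perturbation: radius log-uniform in (r1, r2],
    # applied with a random sign and wrapped back into [0, 1)
    r = r2 * math.exp(-math.log(r2 / r1) * random.random())
    u = u + r if random.random() < 0.5 else u - r
    return u % 1.0

# Tie the mutation radii for the image-plane coordinates to the pixel size
# (hypothetical numbers):
image_width = 1024
r2 = 1.0 / image_width   # upper bound: about one pixel
r1 = r2 / 64.0           # lower bound: a small fraction of a pixel
u = small_step(0.5, r1, r2)
```

The same perturbation with much larger radii would be fine for the other primary-space coordinates; it is only the image-plane pair that needs the pixel-scale bound.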

Hm. I wonder if there's some nutty MIS way to automatically combine "regular" path tracing and MLT...

Dietger
Posts: 50
Joined: Tue Nov 29, 2011 10:33 am

Re: Questions about Metropolis sampling.

Post by Dietger » Thu Oct 02, 2014 8:09 pm

friedlinguini wrote:Hm. I wonder if there's some nutty MIS way to automatically combine "regular" path tracing and MLT...
Well, in a way that is what Kelemen is doing by mixing large step mutations with small step mutations using MIS. Basically it mixes MLT and PT using MIS. MLT is best at sampling high-contribution paths; PT is (sometimes) better at sampling some of the less contributing paths. Mixing the two using MIS combines their strengths in terms of their distributions. However, it does not account for MLT's correlation issue. Vanilla MIS does not account for correlation, and I don't think it easily can. But if you can come up with a way to compute the correlation of samples for different sampling techniques and somehow relate it to their relative rates of convergence, you may be able to come up with some fancy mixing method (and win a sci-tech award in the process).

shiqiu1105
Posts: 138
Joined: Sun May 27, 2012 4:42 pm

Re: Questions about Metropolis sampling.

Post by shiqiu1105 » Thu Oct 02, 2014 8:53 pm

Reposting in case nobody noticed this in the last post of the previous page..

Also, does anyone know how the updated metropolis sampling, which requires no warm-up, works?

http://www.hungrycat.hu/MetropolisSampler.html

shiqiu1105
Posts: 138
Joined: Sun May 27, 2012 4:42 pm

Re: Questions about Metropolis sampling.

Post by shiqiu1105 » Thu Oct 02, 2014 9:30 pm

Just realized that the MLT implementation in pbrt is actually primary sample space..
I always thought it was path space.
Will definitely read it!

friedlinguini
Posts: 89
Joined: Thu Apr 11, 2013 5:15 pm

Re: Questions about Metropolis sampling.

Post by friedlinguini » Thu Oct 02, 2014 11:06 pm

Dietger wrote:
friedlinguini wrote:Hm. I wonder if there's some nutty MIS way to automatically combine "regular" path tracing and MLT...
Well, in a way that is what Kelemen is doing by mixing large step mutations with small step mutations using MIS. Basically it mixes MLT and PT using MIS. MLT is best at sampling high-contribution paths; PT is (sometimes) better at sampling some of the less contributing paths. Mixing the two using MIS combines their strengths in terms of their distributions. However, it does not account for MLT's correlation issue. Vanilla MIS does not account for correlation, and I don't think it easily can. But if you can come up with a way to compute the correlation of samples for different sampling techniques and somehow relate it to their relative rates of convergence, you may be able to come up with some fancy mixing method (and win a sci-tech award in the process).
That's not quite what I had in mind. My understanding of the large step mutation is that it's simply there to make sure that every state is reachable from every other state. It just seemed to me as I was writing my post that my "algorithm A is good for path X, while algorithm B is good for path Y" comment sounded a lot like the motivation behind MIS. Does a more "difficult" path have a higher probability under MLT, while an "easy but dark" path has a higher probability under standard path tracing? Is it possible to run both algorithms over the full path space and somehow weight the results in a way that has lower error than a hard partitioning?
shiqiu1105 wrote:Also, does anyone know how the updated metropolis sampling, which requires no warm-up, works?
I don't think there is supporting literature. Csaba Kelemen posted that code on his site and also on the original OMPF board, and then seemed to vanish from the rendering world. See also, "Delta Sampling."

shiqiu1105
Posts: 138
Joined: Sun May 27, 2012 4:42 pm

Re: Questions about Metropolis sampling.

Post by shiqiu1105 » Thu Oct 02, 2014 11:13 pm

friedlinguini wrote:
I don't think there is supporting literature. Csaba Kelemen posted that code on his site and also on the original OMPF board, and then seemed to vanish from the rendering world. See also, "Delta Sampling."
Yeah I noticed delta sampling on his website too! Seems really cool.. what a shame..

Dietger
Posts: 50
Joined: Tue Nov 29, 2011 10:33 am

Re: Questions about Metropolis sampling.

Post by Dietger » Fri Oct 03, 2014 8:46 am

friedlinguini wrote:Does a more "difficult" path have a higher probability under MLT, while an "easy but dark" path has a higher probability under standard path tracing? Is it possible to run both algorithms over the full path space and somehow weight the results in a way that has lower error than a hard partitioning?
Yes, the PT algorithm samples with a different distribution, so it has to be better at sampling some part of path space (samples have to go somewhere). In particular, MLT samples proportionally to intensity, so darker areas of the image/path space often get fewer samples than in PT.

Large step mutations are indeed a way to explore the whole path space, but, as Kelemen describes in his paper, you can also view the set of all large step mutation samples as the result of a separately run PT algorithm and mix the two using MIS. This works quite well in practice. There is of course some correlation between PT and MLT in this case, but as the PT samples come for free (you have to do the large steps anyway), it is worth the effort. More generally, you can mix different Metropolis and non-Metropolis samplers using replica exchange light transport, although unfortunately that paper doesn't actually go beyond what Kelemen already did and only exchanges between primary space MLT and vanilla PT.
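As a heavily simplified illustration of that mixing, here is a toy 1D version of Kelemen-style primary-sample-space Metropolis where the large-step candidates double as uniform "PT" samples and both techniques are combined with MIS-style weights. The target I(u) = u, the large-step probability, and the assumption that the normalization b is known exactly are all made up for this sketch, so it only shows the shape of the idea, not the full algorithm:

```python
import math, random

random.seed(7)

def I(u):                        # toy scalar "path contribution"; its integral over [0,1] is 0.5
    return u

b = 0.5                          # normalization constant, assumed known exactly here
p_large = 0.3                    # probability of a large step (invented)
M = 200_000

def mutate(u):
    if random.random() < p_large:            # large step: independent uniform sample
        return random.random(), True
    # small step: log-uniform radius, random sign, wrapped to [0, 1)
    r = 0.05 * math.exp(-math.log(0.05 / 0.001) * random.random())
    u = (u + r) % 1.0 if random.random() < 0.5 else (u - r) % 1.0
    return u, False

x = 0.5                          # arbitrary start state; no warm-up in this sketch
est = 0.0
for _ in range(M):
    y, large = mutate(x)
    a = min(1.0, I(y) / I(x))    # Metropolis acceptance probability
    # MIS-style weights in the spirit of Kelemen's paper: a sample at u is
    # produced by the chain with density I(u)/b and by large steps with
    # density p_large, so each contribution is divided by the sum of the two.
    est += (a + (1.0 if large else 0.0)) * I(y) / (I(y) / b + p_large)
    est += (1.0 - a) * I(x) / (I(x) / b + p_large)
    if random.random() < a:
        x = y
print(est / M)                   # ≈ 0.5 = ∫ I(u) du
```

Note that both the tentative and current states are accumulated at every step, weighted by a and (1 − a), which is the usual expected-value optimization and not specific to the MIS part.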
Last edited by Dietger on Fri Oct 03, 2014 1:38 pm, edited 1 time in total.

Dietger
Posts: 50
Joined: Tue Nov 29, 2011 10:33 am

Re: Questions about Metropolis sampling.

Post by Dietger » Fri Oct 03, 2014 1:26 pm

shiqiu1105 wrote:Also, does anyone know how the updated metropolis sampling, which requires no warm-up, works?

http://www.hungrycat.hu/MetropolisSampler.html
I just looked at the code. For the most part it is just an implementation of Kelemen-style Metropolis with MIS, but a few things stood out:

1. Instead of estimating the normalization constant upfront, the method uses a progressive estimate based on large step mutations. This introduces some startup bias because the error in the normalization constant will be correlated with the current state of the Markov chain. Sticking to a precomputed estimate of the normalization constant would eliminate this source of bias, but any error in that estimate would be persistent, so using a progressive estimate makes the algorithm consistent. I have seen/used this trick before.

2. The Markov chain is initialized with a uniform sample. No resampling or weighting à la Veach to eliminate startup bias. No explicit warm-up phase, as advertised.

3. The most interesting part: instead of dividing by the total sample count, the final answer is computed by dividing the integral estimate by the sum of sample weights. Basically, the method computes two integrals in parallel using the same Metropolis samples, one for the target function and one for a rect function over the domain of the integral, and divides the two to obtain the final estimate. I can't remember having seen this before, but my take is that it corrects for some of the startup bias. If the distribution of samples contributing to each individual pixel is correct, but some pixels get more/fewer samples than others due to startup bias, this method corrects for that. However, I don't think it fixes startup bias in the distribution of samples contributing to the same pixel. Still a neat trick, though. I guess it's better than nothing.
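Point 3 can be illustrated on a toy 1D integral: the chain samples proportionally to a scalar target I, and the same importance-weighted samples estimate both the integral of F and the integral of the constant 1, so dividing the two cancels the unknown normalization. The functions, step size, and sample count below are invented for the example:

```python
import random

random.seed(3)

def I(u):            # scalar target the chain samples in proportion to (made up)
    return u + 0.1

def F(u):            # quantity whose integral over [0,1] we want; true value is 1/3
    return u * u

x = 0.5
num = den = 0.0
for _ in range(200_000):
    y = (x + random.gauss(0.0, 0.2)) % 1.0   # symmetric proposal, wrapped to [0,1)
    if random.random() < min(1.0, I(y) / I(x)):
        x = y
    w = 1.0 / I(x)   # importance weight for a sample drawn proportionally to I
    num += w * F(x)  # running estimate related to ∫ F
    den += w         # same samples against the constant 1 ("rect") function
print(num / den)     # ≈ 1/3, with no normalization constant required
```

In a renderer F would be the pixel contribution and I its scalar luminance; here they are unrelated toy functions just to show that the ratio converges without ever knowing ∫ I.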

My final conclusion: this method eliminates some startup bias, but I don't think it is free from startup bias, and it could probably still benefit from a warm-up phase. But maybe in practice one can get away without one; I haven't tried yet.
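For what it's worth, the progressive normalization idea from point 1 is easy to play with in 1D: the uniform large-step candidates give a free running estimate of the normalization, which turns the unnormalized histogram of chain states into an image estimate. The target function, "pixel" layout, and step counts below are invented for the sketch:

```python
import random

random.seed(11)

def I(u):                        # made-up target; ∫ I over [0,1] = 1.0
    return u + 0.5

x = 0.5
hist = [0.0, 0.0]                # a two-"pixel" image: [0, 0.5) and [0.5, 1)
b_sum, b_n = 0.0, 0
steps = 200_000
for _ in range(steps):
    if random.random() < 0.5:    # large step: independent uniform candidate
        y = random.random()
        b_sum += I(y)            # candidates double as a free estimate of ∫ I
        b_n += 1
    else:                        # small step
        y = (x + random.uniform(-0.05, 0.05)) % 1.0
    if random.random() < min(1.0, I(y) / I(x)):
        x = y
    hist[0 if x < 0.5 else 1] += 1.0
b = b_sum / b_n                  # progressive normalization estimate (→ 1.0)
image = [b * h / steps for h in hist]
print(image)                     # ≈ [0.375, 0.625]: the integral of I over each pixel
```

Because the uniform candidates are independent of the chain state here, this toy doesn't exhibit the correlation-induced startup bias described above; in the real sampler the number and timing of large steps interact with the chain, which is where that bias comes from.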

ingenious
Posts: 282
Joined: Mon Nov 28, 2011 11:11 pm
Location: London, UK
Contact:

Re: Questions about Metropolis sampling.

Post by ingenious » Fri Oct 03, 2014 4:33 pm

There are two possible approaches to obtaining a consistent image estimate with MLT. Both have been discussed here before; look at the older threads or my posts. Here's the gist:

1. In the classic Metropolis algorithm, without Veach's weighting trick, you would run a single Markov chain and accumulate the pixel estimates without any weighting. In addition, on the side you would accumulate an estimate of the normalization, i.e. the integral of your (unknown) target distribution over the entire image plane. Only when you want to display an image would you actually multiply all pixels by that normalization estimate. Assuming that some of the candidates in the chain are sampled from a known distribution, as is done for the large mutations (e.g. simple path tracing), you can use them to improve the accuracy of the normalization as you extend the chain over time. This approach ensures that the more samples you take from the chain, the more accurate your (image × normalization) estimate becomes. So this is a biased but consistent estimator.

2. With Veach's weighting, the pixel estimates that you write into the image during chain sampling become unbiased. But here's the caveat! In order to converge to the correct image in the limit, you need to average the results of infinitely many independent chains, each of the same finite length. This detail is buried in equation (11.7) in Veach's thesis, where N is the chain length. If you only take a single, infinitely long chain, the image you get will indeed be an unbiased estimate, but it will equal the correct solution only with a certain probability. That is, you get an unbiased but inconsistent estimator, which is not what you're looking for! You want the correct image with probability 1 in the limit as you take many samples (i.e. paths), so you must average over infinitely many finite-length chains to obtain an estimator that is both unbiased and consistent. Again, credit for this insight goes to Toshiya Hachisuka, who pointed it out here a while ago. This, in my opinion, is the crucial detail in understanding the start-up bias elimination. Unfortunately, Veach didn't explain it in simple terms and only left equation (11.7) for the smartest of us to parse :) It's also worth pointing out again that the initial resampling step Veach describes for choosing the initial chain state is only an optimization and therefore entirely optional. In fact, I'm not sure it's even that important when you know you'll be averaging over a large number of chains.
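The many-finite-chains point is easy to demonstrate in 1D: each chain starts from a uniformly sampled state carrying the Veach-style weight f(x0)/p(x0), runs for a fixed number of steps with no warm-up at all, and the results of many independent chains are averaged. The target, "pixel" interval, and chain parameters are invented for the sketch:

```python
import random

random.seed(5)

def f(u):                  # made-up unnormalized target; ∫ f over [0, 0.5) = 0.175
    return u + 0.1

K, N = 4000, 50            # many independent chains, each of the SAME finite length
total = 0.0
for _ in range(K):
    x = random.random()    # initial state sampled from a known pdf p(u) = 1
    w = f(x) / 1.0         # Veach-style start-up weight f(x0)/p(x0)
    acc = 0.0
    for _ in range(N):
        y = (x + random.uniform(-0.1, 0.1)) % 1.0
        if random.random() < min(1.0, f(y) / f(x)):
            x = y
        acc += w if x < 0.5 else 0.0   # the "pixel" is the interval [0, 0.5)
    total += acc / N
print(total / K)           # ≈ 0.175 = ∫ f over [0, 0.5), despite no warm-up
```

The weighting makes each chain's expectation exact at every step, because f itself is the (unnormalized) stationary density; the averaging over K chains is what drives the variance down, which is exactly the unbiased-but-needs-many-chains behavior described above.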

Disclaimer: I have never implemented any sort of MLT in my life :oops: So I may be missing some important practical details.

MohamedSakr
Posts: 83
Joined: Thu Apr 24, 2014 2:27 am

Re: Questions about Metropolis sampling.

Post by MohamedSakr » Fri Oct 03, 2014 5:42 pm

Sorry for jumping into the thread, but I wonder how ERPT works?

What I know so far: how PT works, BDPT, VCM, PPM, PM + final gather, MLT (BDPT), MLT + MIS (MMLT).

How is ERPT different? From the name it seems to apply only to PT, but can it be applied to BDPT- or VCM-like algorithms?
Does it converge faster than MLT?

And what are its typical limitations? "Stuck on peaks, caustics failure, glossy reflection artifacts, etc..."

feel free to clarify as much as possible :D
