
Showing posts with label Stupid rendering tricks. Show all posts

24 August, 2019

Misunderstanding Multilayering (Diffuse-Specular Energy Conservation)

- Introduction to the problem

In our last episode, "misunderstanding multiscattering", we saw how to create a multiscattering BRDF mostly by intuition. We used the concept of directional albedo (a.k.a. directional-hemispherical reflectance) to normalize a specular BRDF and from there we derived a very simple closed-form approximation for GGX energy conservation.

We also showed that the directional albedo is the response of our BRDF in a white light furnace at different normal to view angles, and we typically have that information available in tabular form as it's a key component of the "split sum" approximation to the integrals needed for image-based lighting on contemporary BRDFs. Great!

This time, we'll show how the very same data, ideas, and intuitions, can be used to build multilayered BRDFs, and in particular to couple matte-specular models (diffuse lobes to specular lobes). 
Like the previous article, this is not anything particularly novel, this isn't Siggraph worthy material, just some notes on the subject.

Let's start, as always, with the problem, with a very quick recap of things everyone reading this will probably already know (my flight has been delayed a couple of hours, so yes, we have time for recaps).

Under the framework of geometrical optics we look for light interactions with materials at surfaces where the index of refraction changes. We assume that light travels in a medium of uniform IOR (e.g. air), and then eventually hits a surface with a different IOR and either gets reflected out of the surface, or refracted into it.
In fact in real-time rendering we only consider the air->material interface, we don't handle nested objects (e.g. air->glass->water->glass->air) even when we render transparent materials or multilayered materials. Which is wrong, but it's just one of the many ways we are still totally wrong (and yet might or might not be right by not caring). Even for offline path-tracers, this is not entirely trivial, depending on how you do light transport.

We then consider, for a small but not infinitesimal patch of our surface, the statistics of how probable these reflections and refractions are as a function of the light's incident angle, the outgoing angle we want to measure, and the surface patch normal. These statistics create a BRDF "lobe".

But BRDFs don't typically have a single lobe: we usually have "metals" and "dielectrics", and in the latter case we have a specular and a diffuse lobe. How come? Well, because we don't really consider a single interaction, that was sort-of a lie. We do create a lobe with the light that is scattered from the surface interaction, but we also have to consider the light that gets refracted into the surface. In the case of metals, the energy that goes into the material somehow disappears (becomes heat or maybe fairy dust who knows - physics) and we effectively have a single lobe.
For dielectrics, though, the refracted light keeps hitting molecules inside what's essentially a participating medium, and eventually, taking random paths, it comes back out from the surface and we model that as a diffuse lobe.

Diffuse lobes are reasonable when the "bouncing around" is such that it effectively randomizes the direction at which the light comes out, but still happens in such a small space that we consider the light coming out effectively at the same point it came in. 
If that's not true, and the scattering distances are bigger, we have what we call "subsurface scattering" effects. If the direction is not randomized "enough", then instead we have transparent materials, and somewhere in the middle, we have what we usually address as participating media.

Any time we consider multi-layered materials we have to model the way the light that is refracted by a layer reaches the next. At the very least, we have to know how much energy is "handled" by a given layer/lobe and how much is "passed down" to the subsequent lobes: if we gave the same input light to all the layers and summed the interactions up, we could end up with more energy coming out of a material than went in!
In theory we should understand much more than that, namely how the light travels in a given layer and reaches another one (its statistical distribution over directions and space), but for diffuse lobes found in dielectrics we can imagine that it doesn't matter much (anyways the light is going to be randomized...), so right now we just want to know how much energy survives the topmost specular lobe of a dielectric (gets refracted as opposed to reflected).

This might seem a lot of handwaving. And it is! This is the point of this post -and- the previous one!

We know that we have certain problems in our math, but should we care? How much should we care and why? Does it matter for the goal of generating photorealistic images easily? Remember, this is our goal, not fixing physics. Then again if we wanted to look at the physics there are a ton of assumptions that we are making anyways.

Left to right: Specular only, Specular+Diffuse, Specular+Diffuse*(1-Fresnel)
It turns out that this problem matters a lot, much more in fact than the energy loss in non-multiscattering specular lobes. The multiscattering issue was small, happened only at high roughnesses and it was entirely fixable by artists tweaking specular albedo (typically controlled by Fresnel f0 parameter) a bit.

Tweaking albedo is fine because it still keeps the materials decoupled from lighting, the main issue we wanted to fix by embracing physically-based rendering models, and it's doable because the tweaks needed are small and entirely in the range of realistic albedos (which don't ever go to f0 = 1 for metals, and of course even less so for dielectrics).

Not having any energy conservation between specular and diffuse instead creates overly bright materials that can look fine in a certain scene but will glow unnaturally under different conditions, and it's quite hard for artists to make sure that the lobes are tuned in a way that doesn't produce more energy than they should.
In fact, most real-time rendering today works without multiscattering BRDFs, but (hopefully) it always considers some way of balancing specular and diffuse.

- Solutions and pretty images

Now we know the problem -and- we know it matters, so we are justified in spending some time to investigate further. How? Well, as we did with the previous post, we bring back our friend the white furnace test. Pretty much any time we want to check for energy conservation, we take our materials and put them in a furnace!


As with the previous time though, just putting a material in a furnace doesn't tell us that much. Yes, it's already apparent how the middle image looks unnaturally bright, but that's not a great test. What we want is to set up our scene so that we expect the material to reflect back exactly all the light that comes in, no matter which path the light takes in the material and across the microfacets, and then see if we get more light than we expect or less.
We have to think: what is it that is absorbing light in our materials? The specular reflects light out or refracts it in, it doesn't absorb, so all the energy loss is when we go "inside" the material, in the dielectrics case, in the diffuse layer. It stands to reason then that if we set our diffuse albedo to one, regardless of the specular parameters, we should be perfectly white in a white furnace. Let's see:

Comparing no energy conservation to (1-Fresnel) with a fully white diffuse albedo.
Bingo! We see now that just adding diffuse with no attempt at energy conservation does indeed result in energy being created. And surprisingly it looks like the simple idea of using (1-F) as a multiplication factor to normalize diffuse is indeed doing a very good job, in fact, it would look alright if it wasn't for the grazing angles... Why is that?

(1-Fresnel) at varying roughness (from 0.1 to 0.9)
Same as above, but using a non-multiscattering Specular instead of the simple one presented in the last post.
Well, if we think about it, it's fairly obvious. For the terms we commonly use, at normal incidence (N=V=L) the shadowing term has no effect and the NDF takes a constant value regardless of the roughness parameter, Fresnel dominates. At grazing angles, the shadowing term starts to matter, and that is where we see our simple normalization breaking down.

What can we try to fix this? If you read the previous post, it should be obvious by now. We have something that should be white in a furnace, it isn't, let's make it! We know how much energy the specular lobe will scatter back in the furnace, for any roughness, Fresnel f0, and viewing angle, this again is the split-sum table we use for environment map lighting. We know that a Lambert diffuse is correctly normalized, so it will scatter back all incoming energy if the albedo is set to one. So how much do we have to scale the lobe for the sum of the specular plus diffuse to be one, with an arbitrary specular and a unitary albedo diffuse? Obviously just by 1-E, where E is what we get from the split-sum table!
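As a minimal sketch of the (1-E) idea, assume the split-sum table has already been sampled into a `scale` and a `bias` value for the current roughness and ndotv. The function names and the sample numbers below are made up for illustration and do not come from any particular engine:

```python
def directional_albedo(f0, scale, bias):
    """Specular response in a white furnace: E = f0 * scale + bias."""
    return f0 * scale + bias


def diffuse_weight(f0, scale, bias):
    """Fraction of the energy left over for the diffuse lobe: 1 - E."""
    return 1.0 - directional_albedo(f0, scale, bias)


# Hypothetical table lookup for some roughness / ndotv pair.
scale, bias = 0.80, 0.05
f0 = 0.04

specular_in_furnace = directional_albedo(f0, scale, bias)
# A Lambert lobe with albedo 1 returns all the energy it is handed.
diffuse_in_furnace = 1.0 * diffuse_weight(f0, scale, bias)
furnace_total = specular_in_furnace + diffuse_in_furnace
```

With a unit diffuse albedo the two lobes sum to one by construction, which is exactly what the furnace test checks.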

(1-E) energy conservation test, notice the more correct grazing angles.
Note: some artifacts remain due to approximations in the Directional Albedo table I used.
Furnace fixed! And note, this works regardless of whether or not we're using the multiscattering specular BRDF, as a non-multiscattering one will just behave as if it's refracting more light to the diffuse layer, which in our case will still bounce it all back out. And, if we're using the simple renormalization for multiscattering presented in the previous post, we don't even need to compute a new table for directional albedo, as it's easy to derive how to analytically modify the output of the single scattering table to accommodate the multiscattering correction (I'll leave this as an exercise for the reader, it's trivial and might motivate you to actually look at the definition of directional albedo...)

As it was for the multiscattering normalization idea, this is wrong and we can see immediately that it's wrong from the equation, even if the error won't show in the furnace, because we don't respect reciprocity (we consider only ndotv and not ndotl). It's even intuitive, as we don't seem to consider at all that the light going from the specular layer to the diffuse, scattering inside the material and eventually coming out, has to cross again the specular interface as it comes out!

Before we delve further into this, I want though to stop for a second and check out what we are doing (again, same as last time). Let's plot the 1-E multiplicative factor we're applying to diffuse and see what happens:
(1-E) normalization compared to (1-Fresnel) at varying roughness,
for a non-multiscattering specular.

Same as above, but using simple multiscattering specular. 
This is interesting! Albeit we can couple diffuse with and without a multiscattering specular, the plots for the multiscattering case are much simpler, so much so that it's easy to derive, by hand, a function that would approximate them. Fun!

Approximation: mix( (vec3(1,1,1) - fresnel(f0,ndotv) ), vec3(1) - f0, roughness )
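For reference, here is a scalar (per-channel) sketch of that approximation in Python. It assumes that `fresnel` is the usual Schlick approximation, which the formula above does not state explicitly, and `mix` is written out to mirror the GLSL built-in:

```python
def schlick_fresnel(f0, n_dot_v):
    """Schlick's Fresnel approximation (assumed; any Fresnel term could be swapped in)."""
    return f0 + (1.0 - f0) * (1.0 - n_dot_v) ** 5


def mix(a, b, t):
    """Linear interpolation, same convention as GLSL mix()."""
    return a * (1.0 - t) + b * t


def approx_diffuse_weight(f0, n_dot_v, roughness):
    """Blend from (1 - Fresnel) on smooth surfaces to (1 - f0) on rough ones."""
    return mix(1.0 - schlick_fresnel(f0, n_dot_v), 1.0 - f0, roughness)
```

At roughness 0 this is exactly (1-Fresnel); at roughness 1 the view dependence disappears and the factor is just 1 - f0.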
And now, at last, let's have a look at a more correct solution that respects reciprocity and see if that matters or not for real-time rendering. This is quite hard to know intuitively, so we'll have just to implement something and check.

Luckily this is nothing new: in 2001 Csaba Kelemen and László Szirmay-Kalos published "A Microfacet Based Coupled Specular-Matte BRDF Model with Importance Sampling", and we can pretty much copy and paste their equation, which unsurprisingly ends up looking very close to the way we adjust for reciprocity in specular multiscattering (in fact Kulla and Conty cite the aforementioned KSK paper in theirs).
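As a hedged sketch of the shape of that coupling: the diffuse term is scaled by (1-E) for both the view and the light direction and renormalized by one minus the cosine-weighted average of E. The `toy_albedo` function below is only a stand-in for the real directional albedo table, and all the names are illustrative rather than taken from the paper:

```python
import math


def toy_albedo(mu, f0=0.04):
    """Stand-in for the directional albedo table E(mu); here just Schlick Fresnel."""
    return f0 + (1.0 - f0) * (1.0 - mu) ** 5


def average_albedo(E, steps=2048):
    """E_avg = 2 * integral over [0, 1] of E(mu) * mu dmu, by the midpoint rule."""
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) / steps
        total += E(mu) * mu
    return 2.0 * total / steps


def coupled_diffuse(albedo, E, e_avg, n_dot_v, n_dot_l):
    """Diffuse BRDF value; swapping n_dot_v and n_dot_l gives the same result."""
    return albedo * (1.0 - E(n_dot_v)) * (1.0 - E(n_dot_l)) / (math.pi * (1.0 - e_avg))


def diffuse_furnace(albedo, E, e_avg, n_dot_v, steps=2048):
    """Integrate diffuse BRDF times cosine over the hemisphere of light directions."""
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) / steps
        total += coupled_diffuse(albedo, E, e_avg, n_dot_v, mu) * mu
    return 2.0 * math.pi * total / steps
```

In a furnace the diffuse lobe integrates to albedo * (1 - E(ndotv)), so with a unit albedo it exactly fills the gap left by the specular lobe, while staying symmetric in the two directions.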

Without further ado:

Left to right: Simple EC, Fresnel EC, KSK EC.
Simple Multiscattering GGX, Roughness 0.1.
Same as above but for 0.3 roughness GGX.

And some more images. The difference between (1-F) and (1-E) can be seen only at low roughness. The differences among the table-based (1-E), the approximation shown above, and the full KSK method are even less pronounced.

1-Fresnel; Roughness 0.1 to 0.9, GGX with simple multiscattering.
Approximated 1-E.
Table based 1-E.
KSK.

12 August, 2019

Misunderstanding Multiscattering

Today we will build a state-of-the-art multiscattering GGX BRDF for physically-based rendering. 

Already done you say? Yes, you're right, by people who understand maths and physics, that's cheating. We instead will try to do the same trying to learn as little as possible...

If you want to do this the right way instead, these are some excellent articles:
Ok, ready? 

Let's start from your vanilla contemporary GGX with all the correlated-Smith fixings. The following sequence of images is generated with Disney's BRDF explorer environment lighting mode. They show a metallic GGX with varying roughness, from 0.3 to 0.9, using the common parametrization alpha = roughness squared.
Note: please always indicate the parametrization you are using in your articles and presentations. There are so many variants out there. Here I use one of the most common ones, even if, for what it's worth, probably the best idea is alpha = roughness^4, which is more perceptually linear and also manages to be a great approximation for the one-over-square-root Blinn-Phong Gloss to GGX mapping many engines prefer.


We can notice how at high roughness the material looks noticeably darker. This is annoying, and smart people have worked hard to find and fix this issue. 

Let's compare the image above with Kulla/Conty's Multiscattering GGX BRDF.


Fixed! 

Now, we could try to understand how this solution works, the math is already there. And you probably should, it's both interesting and useful to understand other related problems. But, let's try not to...

First of all, we should ask why is this a problem? How do we know it is a problem? And also, if it is a problem, how big of a problem is it?

Let's start at the top. Why do we like "PBR"? It's not because we like physics, at least, I don't... And if we did like physics, this would be a small error to focus on, considering that we are surely committing worse sins in our end-to-end image pipeline...

Computer graphics is not predictive rendering. We don't try to simulate physics to make accurate simulations, that's not a goal (and the people who care about physics are an order of magnitude ahead of us). We make pretty pictures.

At a given point, we noticed that using some physics to make pretty pictures allowed for easier workflows, decoupling materials from lighting, reducing the number of hacks and parameters, allowing to use libraries of real materials and so on. 
We thought artists would be happier and be able to work more efficiently, while still being able to create a lot of different scenes. It took a while to move over all our workflows, but it was the right call.

So, when we look for right and wrong, we have to start with art and artists. If there is a need we can identify there, then we can look at physics to see if the answers are there. In this case, we can say that yes, indeed we have a (small) problem. 
It would be nice if our material parameters were orthogonal, and having things go darker as we change roughness is not ideal.

Let's move to step two then. Can physics help? Is this darkening physically correct, or not? How do we test? Enter the furnace! Let's put our "GGX coated" metallic object in a uniformly lit environment and see what happens:


Light hits our surface, hits our microfacets. The microfacets reflect back some light, and refract some other, according to Fresnel. If we're assuming a metal, people told us that the refracted light, which goes "inside" the surface, is quickly absorbed and never comes out (becomes heat).

It's reasonable that even in a furnace our metallic object has a color, as some of the energy will be absorbed. But what if we take our microfacets and make them always reflect all the light, no absorption? 
This means we need to set our f0 to 1 (remember, Fresnel is what controls how much light the microfacets scatter), let's try that and see what happens:


The object is still not white. Something's wrong! Ok, the inquisitive reader might still say - how do you know it's wrong. Perhaps certain directions scatter more light and others less, so we still see some shading even if all the light eventually comes out... 
If we think of BRDFs and lobes it's not easy to get the correct intuition. Let's instead think of what a light path, starting from the camera, looks like. It might hit some of the microfacets, one or many times, bounce around, then eventually escape and connect to the furnace environment, which will always be emitting a given constant energy.
How much of that energy reaches the camera? All of it! Because regardless of which and how many microfacets we hit, all the energy is reflected and none is absorbed.

Now we have an intuition of why the image above should be white, it isn't, so we have a problem in our math...

If you studied BRDFs you know that in microfacet models we have a masking-shadowing visibility function that models which microfacets are occluded by others. What we don't typically model though is the fact that these occluders are themselves microfacets, so the light should bounce around and eventually get out, not just be discarded.
This is what the multiscattering models model and fix, and indeed if we did put in a furnace Kulla and Conty's GGX that we have shown before, it would generate a boring and correct white image for a fully reflective material, regardless of its roughness.

Kulla's model is not trivial though, and in many cases is likely not worth using to fix such a relatively minor problem. So. Can we be more ignorant? What if we knew how much light our BRDF gives out in a furnace, for a given roughness and viewing angle (fixing then again f0 = 1)? Could we just take that value and normalize the BRDF with it? 

Spoiler alert, we can. And not only we can but it's also trivial to do, because we already have this "furnace" value in most modern engines, in the look-up tables used for the popular split-sum image based lighting approximation. 


The split-sum table boils down the "BRDF in a furnace" (also known as directional albedo or directional-hemispherical reflectance) to a scale and bias (add) factor to be applied to the Fresnel f0 value.
In our case, we want to normalize considering f0=1 so all we do is to scale our BRDF lobe by one over bias(roughness,ndotv) + scale(roughness,ndotv). This is the result:
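A minimal sketch of this normalization, assuming `scale` and `bias` are the two values fetched from the split-sum table for the current roughness and ndotv (the numbers used here are placeholders, not real table entries):

```python
def furnace_response(f0, scale, bias):
    """What the single-scattering lobe returns in a white furnace."""
    return f0 * scale + bias


def normalization(scale, bias):
    """Factor that makes the f0 = 1 lobe return exactly 1 in the furnace."""
    return 1.0 / (scale + bias)


# Hypothetical lookup for a rough surface, where a fair amount of energy is lost.
scale, bias = 0.60, 0.15
normalized = furnace_response(1.0, scale, bias) * normalization(scale, bias)
```

Note that the same factor is applied whatever f0 is, so for coloured metals the lobe gets brighter without getting more saturated.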


Indeed, we're getting some energy back at high roughness, and if we tested this in a furnace it would come up white, correctly, at f0 = 1. But it's also different from Kulla's, in particular, color is not quite as saturated in the rough materials. How come? Well again, if you studied this problem already (cheater!) you know the answer.

Proper multiscattering adds saturation because as light hits more and more microfacets before escaping the surface, we pick up more color (raising a color to a power results in a more saturated color). But so how can this be physically wrong, but still correct in our test? 

Well, it's simple, really. By not simulating this extra saturation we are still energy conserving, but we changed the meaning of our BRDF parameters. The f0 "meaning" in our "ignorant" multiscattering BRDF is not the same as Kulla's, it results in a different albedo, but the BRDF itself is still energy conserving, it's just a different parametrization. 
Most importantly, I'd argue it's a better parametrization! Then again, remember our objectives. We don't do physics for physics' sake, we do it to help our production.

We wanted to make our parameters more orthogonal so that artists don't need to artificially "brighten" our BRDF at high roughness. If we went with the "more correct" solution (about that, this recent post by Narkowicz is a great read) we would add a different dependency: now roughness, instead of darkening our materials, makes them more saturated, which would kind of defeat the purpose.
You can imagine some scenarios where this might be desirable, but I'd say it's almost always wrong for our use-cases.

If we wanted to simulate the added saturation, there are a few easy ways. Again, going for the most "ignorant" (simple) we can just scale the BRDF by 1+f0*(1/(bias(roughness,ndotv) + scale(roughness,ndotv)) - 1), resulting in the following:
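Written out per colour channel, that scale factor could look like the sketch below. The `scale` and `bias` values stand for split-sum table lookups and the f0 triple is an invented "yellowish metal", both purely for illustration:

```python
def saturating_scale(f0, scale, bias):
    """1 + f0 * (1 / (bias + scale) - 1), evaluated for one colour channel."""
    return 1.0 + f0 * (1.0 / (bias + scale) - 1.0)


# Hypothetical table lookup and a made-up coloured f0.
scale, bias = 0.60, 0.15
f0_rgb = (1.0, 0.8, 0.3)

boost_rgb = tuple(saturating_scale(c, scale, bias) for c in f0_rgb)
```

Channels with a higher f0 get a larger boost, which is what pushes rough materials towards a more saturated colour; at f0 = 1 the factor reduces to the plain 1/(bias + scale) normalization and at f0 = 0 it does nothing.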


Now we are close to Kulla's solution. This is still not entirely correct. If you wonder why Kulla's approximation is better, given that the difference might not show in images, it's because ours doesn't respect reciprocity.
This might be important for Sony Imageworks, as certain offline path-tracing light transport algorithms do require it, but it's fairly irrelevant for us.

Ok, so now that we have found an approximation we're happy with we can (and should) go a step further and see what we are really doing. Yes, we have a formula, but it depends on some look-up tables. 
This isn't a big issue as we need these tables around for image-based lighting anyways, but it would be an extra texture fetch and in general it's always important to double-check our math, so let's visualize the 1/(bias(roughness,ndotv) + scale(roughness,ndotv)) function we're using:


It looks remarkably simple! Indeed, it's so simple it has a trivial approximation, you don't even need any sophisticated tool to find it so I'll just show it: 1 + 2*alpha*alpha * ndotv. This fits very well:

Approximation compared to the correct normalization factor (gray surface)

You can see that there is a bit of error in the furnace test. We could improve by doing a proper polynomial fit - yes, it turns out "2" and "1" in the formula above are not exactly the best constants. 
Actually defining "best" would be a problem all on its own, because doing a simple mean-square minimization on the normalization function doesn't really make a lot of sense (we should care about end visuals, perceptive measures, which angles matter more and so on), and I think we already spent too much time for such a small fix.
Moreover, the actual rendered images are really hard to distinguish from the table-based solution.

What is interesting is to see what we could achieve if we wanted to be even simpler, dropping the dependency on ndotv. It turns out in this case 1 + alpha*alpha does the trick decently as well.
This means that we're just applying a multiplicative factor to our BRDF to brighten it at high roughness, which just makes a lot of sense.
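Both fits are short enough to write down directly. The sketch below assumes the alpha = roughness squared parametrization used for the images in this post; the function names are mine:

```python
def approx_normalization(roughness, n_dot_v):
    """View-dependent fit: 1 + 2 * alpha^2 * ndotv."""
    alpha = roughness * roughness
    return 1.0 + 2.0 * alpha * alpha * n_dot_v


def approx_normalization_simple(roughness):
    """View-independent fit: 1 + alpha^2."""
    alpha = roughness * roughness
    return 1.0 + alpha * alpha
```

The two agree at ndotv = 0.5; the simpler one then adds energy towards grazing angles and loses some when looking straight at the surface, compared to the view-dependent fit.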

This of course adds more error and it starts to show as the BRDF shape changes, we get more energy at grazing angles on rough surfaces, but it might just be good enough depending on your needs:

Under-exposed to highlight that with the simpler approximation sometimes we lose light, sometimes we add.

06 May, 2017

Shadow mystery

Can shadows cause the texture UVs to shift?

This was a bug assigned to one of our engineers. Puzzling. Instead of being really useful, I started to investigate with some offline rendering.


Look at the shadows of the disappearing pillar

Wow! Mental Ray is broken :)

So apparently yes, you can easily create this optical illusion. It's pretty easy to understand why, especially on a constant albedo the texture of the surface comes entirely from shading. If the light moves, so will the highlights traverse the surface, and that creates an illusion similar to a texture shift, of just a few pixels.

The effect is much stronger when the ambient light creates a highlight opposite to the main light, so when the main light is shadowed the ambient highlight dominates and the shift becomes apparent.

On a render where the ambient and highlight are on the same side, the effect is much less pronounced.



The nail in the coffin, though, was when I managed to reproduce the same effect in real life. So it's definitely an optical illusion that can happen, but it's probably made worse by realtime rendering unshadowed ambient/GI on normalmaps and by the fact that shadows abruptly cancel the sun/sky light, instead of gradually blocking only some rays and rolling out the highlight direction in the penumbra region.

01 October, 2016

More (silly) tone-mapping ideas

As a follow-up to my previous post I'll show here two more tone-mapping tricks. They are both "silly" in a way, they were done quickly and mostly while waiting for more important things to build, so they are far from "production quality".

They both are somewhat inspired by photography (but make no effort at all to simulate photographic effects, rather, they take inspiration from observing how certain things work), and I believe they could all be part (including the ideas of the previous post) of a single tone-mapping operator. 
All dynamic range compression algorithms have trade-offs and artifacts, so I believe it's not unwise to layer various methods each doing just a bit of compression, to achieve an overall larger reduction.

"Adaptive ND-Filter"

The idea of doing this test came from a discussion with my Italian friend Manuele Bonanno, as we were chatting about Bart's TM post (which I linked in the previous article).

In landscape photography it is not uncommon to use graduated neutral density filters. An ND filter reduces all wavelengths of light equally, darkening the image without shifting colors, and can be used for particular effects like very long exposures or to be able to use wide lens apertures in daylight.
A graduated ND filter does the same but using a gradient, and it's used typically to darken the sky (top half) relative to the other elements of the image.


A GND filter - from Tiffen Filters marketing material

The interesting thing to notice is that even if in photography these filters are made with a fixed gradient, in practice the effect is smooth enough that it can make a big difference on the final image without being noticeable. 
Our vision system is not good at detecting very low-frequency gradients in images, so after a given spatial frequency we can "cheat" liberally without being able to see artifacts.

How to apply this in computer graphics is rather obvious: let's just blur everything a lot and use that as a base for a localized tone-mapping operator (in the test below, I just scale the exposure by the blurred version of the image a bit, then apply global TM as usual).
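A rough, pure-Python sketch of that pipeline on a small luminance image follows. The box blur stands in for a very wide Gaussian, and the `key` and `strength` parameters are assumptions of mine, not values from the actual test:

```python
def box_blur(img, radius):
    """Wide box blur with clamped borders (stand-in for a Gaussian / image pyramid)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def reinhard(x):
    """Simple global tone-mapping curve."""
    return x / (1.0 + x)


def adaptive_nd(luminance, radius, key=0.18, strength=0.5):
    """Scale exposure by the blurred image 'a bit', then apply the global operator."""
    blurred = box_blur(luminance, radius)
    result = []
    for row, blurred_row in zip(luminance, blurred):
        result.append([reinhard(l * (key / max(b, 1e-6)) ** strength)
                       for l, b in zip(row, blurred_row)])
    return result
```

The exposure scale is computed from luminance only, so applying the same per-pixel factor to all three colour channels keeps the filter neutral.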

Doing local tone-mapping with a gaussian blur average usually yields pretty strong haloing artifacts (and a lot of photographic HDR toning software actually is plagued by such halos), and my previous article tried to show an alternative that is both stupidly cheap and that can't result in haloing (as it's not using neighboring operators at all).


Without ND
With ND (save the images and A/B them by alternating the two)

The "insight" here is that you can also prevent haloing if you use gaussian blur, but you blur enough not to be at frequencies where haloing shows. And, of course, if you have a good quality bloom/veil effect, you can re-use the image pyramids from that almost directly. Just remember to apply a -neutral- (grayscale) ND!


The gaussian filter approximated with image pyramids

Used tastefully and in a temporally stable way I think it can work, and indeed my colleague Michal confirmed that he knows of titles doing something similar.
I did a quick test using COD:AW and the effect indeed works really well, actually even better indoors than outdoors often, reducing the amount of "auto-exposure" needed (which we don't push very far typically anyways).

Film Grain

What? Film grain is not about dynamic range, right? It's usually an artistic effect, at best it can be seen as a dithering method that can help avoiding Mach band artifacts. Right?


A detail from a film photo by Trent Parke

Well, not really. If you look at an actual photo you'll notice that grains mostly show in midtones, or rather, that pure whites and pure blacks either saturate grains or have no grains exposed, and thus appear as solid regions.


Thresholded version

This means that if we look at a thresholded version of a good film scan we can see a spatial dithering in the threshold. 
We have some black areas that have black grains but are not fully saturated with black, and then we have "blacker than black" areas where the same black grains appear spatially clustered to create an entirely saturated region (and the same happens in the highlights as well). 

This suggests that we can use dithering patterns not only to get more precision in the midtones, but also to extend a bit the range of highlights and shadows. If a dither pattern is symmetric and equally distributed around zero (as it should be), sometimes it will add luminosity and sometimes it will subtract it. 
So if an area is white, but not whiter-than-white, the times the pattern subtracts it will create darker pixels. In areas that are constantly whiter than the intensity of the dither, even after subtraction, we will still end up with a full white, and we will see saturated areas. 
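To make the argument concrete, here is a tiny numerical sketch. It uses an evenly spaced, zero-centred dither pattern instead of random grain so the result is deterministic; the amplitude and the sample values are arbitrary:

```python
def dithered_mean(value, amplitude, samples=1000):
    """Average displayed value after adding a symmetric dither and clipping to [0, 1]."""
    total = 0.0
    for i in range(samples):
        # Offsets evenly spread over [-amplitude, +amplitude], centred on zero.
        offset = ((i + 0.5) / samples * 2.0 - 1.0) * amplitude
        total += min(max(value + offset, 0.0), 1.0)
    return total / samples


slightly_over = dithered_mean(1.05, 0.1)   # white, but not whiter-than-white
far_over = dithered_mean(1.2, 0.1)         # stays fully saturated
no_dither = dithered_mean(1.05, 0.0)       # plain clipping
```

Without dither both 1.05 and 1.2 clip to the same white; with it, the region at 1.05 shows some darker pixels while the region at 1.2 stays solid, so the two can still be told apart.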



The image above is a quick test of the theory. On the left, it uses just the simple, good old Reinhard tone-mapping (with black and white levels used to get contrast). The image on the right uses the exact same tone-mapping but also adds some RGB film grain.
If you look at the shadows and the highlights you can notice that there is a bit more detail visible in the film-grain version (e.g. count the number of visible beams in the glass ceiling).

This is just a quick test, I tweaked the film grain actually to be less intense in the midtones (around 0.5) than near the extremes (0 and 1) to be able to push it a bit further without being too intense over the entire image. It can certainly be done in much better ways, but as far as silly experiments go, I'm rather happy. 
If you really wanted to be cheating, you could even just add grain in the shadows, which would definitely be a hack as you're adding energy but hey... it's not that we don't routinely do that (e.g. bloom/veil usually doesn't redistribute energy, it just adds it...)

P.s. real film grain is actually not easy to emulate (and I'm not trying to, here). This is the only attempt I know of.

06 February, 2016

Low-resolution effects with depth-aware upsampling

I have to confess, till recently I was never fond of doing half or quarter res effects via a bilateral upsampling step. It's a very popular technique, but all the times I tried it I found it causing serious edge artifacts... 
On Fight Night Champion I ended up shipping AO and deferred shadows without any depth aware upsampling (just separating the ring and fighters from the background, and using a bias towards over-shadowing); Space Marines ended up shipping with a bilateral upsampling on AO (but no bilateral blurring or noise) but it still had artifacts. In the end it sort-of worked, via some hacks that were good enough to ship, but that I never really understood.

For Call of Duty Black Ops 3 we needed to compute some effects (volumetric lighting) at quarter-res or less to respect the performance budgets we had. Depth-aware upsampling was definitely a necessity, so I needed to investigate it a bit more.
A quite extreme example of "god rays" in COD:BO3
I found a solution that is very simple, that I understand quite well, and that works well in practice. I'm sure it's something many other games are doing and many other people discovered (due to its simplicity), but I'm not aware of it being presented publicly, so here it is, my notes on how not to suck at bilateral upsampling:

1) Bilateral weighting doesn't make a lot of sense for upsampling.

The most commonly used bilateral upsampling scheme works by using the same four texels that would be involved in bilinear filtering, but changing their weights by multiplying them by a function of the depth difference between the true surface (high res z-buffer) and their depths (low-res z-buffer).

This method makes little sense, really, because you can have the extreme case where the bilinear weights select only one sample, but that sample is not similar to the surface depth you need at all! Samples that are not detected to be part of the full-res surface should simply be ignored, regardless of how "strongly" bilinear wants to access them...

A better option is to simply -choose- between bilinear filtering or nearest depth point sampling, based on if the low-res samples are part of the high-res surface or not. This can be done in a variety of ways, for example:

- lerp(bilinear_weights, depth_weights, f(depth_discontinuity)) * four_samples
- lerp(bilinear_sample, best_depth_sample, f(depth_discontinuity))
- bilinear_fetch(lerp(bilinear_texcoords, best_depth_texcoords, f(depth_discontinuity)))

Where the weighting function f() is quite "sharp" or even just a step function. The latter scheme is similar to nVidia's "nearest depth sampling" and is the fastest alternative, but in Black Ops 3 I ended up sharply going from bilateral to "depth only" weights if too large a discontinuity is detected in the four bilinear texels.
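To make the "choose, don't blend" idea concrete, here is a minimal CPU-side sketch of the second scheme with f() as a hard step. The function name, the argument layout and the `threshold` parameter are all made up for illustration; this is not the Black Ops 3 shader, just one way the logic could look for a single full-res pixel.

```python
def upsample_pixel(lo_values, lo_depths, bilinear_weights, hi_depth, threshold):
    """Upsample one full-res pixel from its four low-res neighbours.

    lo_values, lo_depths and bilinear_weights are 4-element sequences
    (hypothetical layout); hi_depth is the full-res depth of the pixel.
    """
    # Depth distance of each low-res texel from the full-res surface.
    diffs = [abs(d - hi_depth) for d in lo_depths]
    if max(diffs) <= threshold:
        # No discontinuity detected: plain bilinear filtering.
        return sum(w * v for w, v in zip(bilinear_weights, lo_values))
    # Discontinuity: ignore the bilinear weights entirely and point-sample
    # the texel whose depth is closest to the full-res depth.
    best = min(range(4), key=lambda i: diffs[i])
    return lo_values[best]
```

In a real shader the threshold would probably need to scale with depth (or work on linearized depth), which this sketch doesn't attempt.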

2) Choose the low-res samples to maximise the chances of finding a representative.

It's widely known that a depth buffer can't be downsampled by averaging values: that would result in depths that do not exist in the original buffer and that are not representative of any surface, but "floating" in between surfaces at edge discontinuities. So either min or max filtering is used, commonly preferring nearest-to-camera samples, with the reasoning that closest surfaces are more important, and thus should be sampled more (McGuire tested various strategies in the context of SSAO, see Table 1 here).

But if we think in terms of the reconstruction filter and its failure cases, it's clear that preferring a single set of depths doesn't make a lot of sense. We want to maximize the chance of finding, among the texels we consider for upsampling, some that represent well the surfaces in the full resolution scene. Effectively, in the downsampling step we're selecting the points at which we want to compute the low-res effect; clearly we want to do that so we distribute samples evenly across surfaces.

A good way of doing this is to choose, for each sample in the downsampled z-buffer, a surface that is different from the ones of its neighbors. There are many ways this could be done, but the simplest is to just alternate min and max downsampling in a checkerboard pattern, making sure that for each 2x2 quad, if we are in a region that has multiple surfaces, at least two of them will be represented in the low-res buffer.
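The checkerboard min/max downsampling can be sketched in a few lines. This is a plain-Python illustration on a list-of-rows depth buffer with even dimensions (both assumptions of mine); a real implementation would be a pixel or compute shader, and which parity gets min versus max is arbitrary.

```python
def downsample_depth_checkerboard(depth):
    """Halve a 2D depth buffer (list of rows, even width and height)."""
    out = []
    for y in range(0, len(depth), 2):
        row = []
        for x in range(0, len(depth[0]), 2):
            quad = [depth[y][x], depth[y][x + 1],
                    depth[y + 1][x], depth[y + 1][x + 1]]
            # Alternate min and max per low-res texel in a checkerboard,
            # so neighbouring low-res texels favour different surfaces.
            use_min = ((x // 2) + (y // 2)) % 2 == 0
            row.append(min(quad) if use_min else max(quad))
        out.append(row)
    return out
```

Note that every output depth is an actual depth from the source buffer, never an average.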

In theory it's possible to push even more surfaces in a quad, for example we could record the second smallest or second biggest, or the median or any other scheme (even a quasi-random choice) to select a depth (we shouldn't use averages though, as these will generate samples that belong to no surface), but in practice this didn't seem to work great with my upsampling, I guess because it reduces spatial resolution in favour of depth resolution, but your mileage may vary depending on the effect, the upsampling filter and the downsampling ratio.

Some residual issues can be seen sometimes (upper right),
when there is no good point sample in the 2x2 neighborhood.

Further notes.

The nearest-depth upsampling with a min/max checkerboard pattern downsampling worked well enough for Black Ops 3 that no further research was done, but there are still things that could be clearly improved:

- Clustering for depth selection.
A compute shader could do actual depth clustering to try to understand how many surfaces there are in an area, and choose what depths to store and the tradeoffs between depth resolution and screenspace resolution.

- Gradients.
Depth discontinuity in the upsampling step is a very simplistic metric, more information can be used to understand if samples belong to the same surface, like normals, g-buffer attributes and so on.

- Wider filters.
Using a 2x2 quad of samples for the upsampling filter is convenient as it allows us to naturally fall back to bilinear if we think the samples are representative of the high-res surface, but there is no reason to limit the search to such a neighborhood. Wider filters could be used, both for higher-order filtering and to have better chances of finding representative samples.

- Better filtering of the representative depth samples.
There is no reason to revert to point-sampling in the presence of discontinuities (or purely depth-weighted sampling); it's still possible to reject samples that are not representative of the surface while weighting the useful ones with a filter that depends on the subtexel position.
Special cases could be considered for horizontal and vertical edges, where we could do 1d linear interpolation on the axis of the surface. Bart Wronski has something along these lines here (and the idea of baking a UV offset to be reused by different effects also allows in general to use more complex logic, and amortize it among effects).

- "Separable" bilateral filters.
Often when depth-aware upsampling is employed we also use depth-aware (bilateral) filters, typically blurs. These are often done in separate horizontal/vertical passes, even if technically such filters are not separable at all. 
This is particularly a problem with depth-aware filters because the second pass will use values that are not anymore relative to the depths in the low-res depth buffer, but result from a combination of samples from the first pass, done at different depths.

The filter can still look right if we can always correctly reject samples not belonging to the surface at center texel of a filter, because anyway the filtered value is from the surface of the center texel, so doing the second pass using a rejection logic that uses attributes (depth...) at the center of the filtered value sort-of works (it's still a depth of the right surface). 
In practice though that's not always the case, especially if the rejection is done with depth distances only, and it causes visible bleeds in the direction of the second filter pass. A better alternative in these cases (if the surface sample rejection can't be fixed...) is to do separate passes not in a horizontal/vertical fashion but in a staggered grid (e.g. first considering an NxN box filter pass, then doing a second pass by sampling every N pixels in horizontal and vertical directions).

26 April, 2014

Smoothen your functions

Do you have an "if", "step" or such? Replace with a saturate(multiply-add(x)).
Do you have a mad-saturate? Replace with a smoothstep.
Do you have a smoothstep? Replace with smootherstep...

Ok, kidding, but sort-of. I actually do often end up using ramps (saturate/mad... the fairy dust of shading, I love sprinkling mads in shader code) instead of steps. I remember years ago turning a pretty much by-the-book Crysis 1-style SSAO into a much better SSAO by just "feathering" the hard in/out tests (which is kinda what line sampling SSAO does btw).

If you think about it, it's a bit of "code smell". What shading functions should be discontinuous? True, most lighting has a max or saturate right? But why? Really we're considering infinitesimal lights, for physically realistic lights we would have an area of emission, and that area would be fractionally shadowed by a surface, so even there, the shadowing function wouldn't just be a step of the dot product. This might not be evident on diffuse, but already when you're trying to use half-angle based specular, care has to be taken when handling the transition to the "nightside".

And of course even when reasonable, any "step" function (well -any- function!) in a shader should be anti-aliased... And of course everybody knows what the convolution of a step with a box (pixel footprint) is... Texturing and Modeling, a Procedural Approach is the canonical text for this, but it's funny, googling around one of the first hits is this documentation page of Renderman on antialiasing, whose slides are horribly aliased. The OpenGL Orange Book also has examples, and I really want to mention this IQ's article on ray differentials even if it doesn't do analytic convolution...
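For the record, the convolution of a step with a box is just a clamped linear ramp as wide as the box. A minimal sketch, written in Python rather than shader code so it can be run anywhere; `width` stands in for the pixel footprint (in a shader you'd typically estimate it from screen-space derivatives), and the function names are mine:

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return max(0.0, min(1.0, x))

def filtered_step(edge, x, width):
    """step(edge, x) averaged over a box of the given width centred on x."""
    if width <= 0.0:
        # Degenerate footprint: fall back to the hard step.
        return 1.0 if x >= edge else 0.0
    # Fraction of the interval [x - width/2, x + width/2] that lies above edge.
    return saturate((x - edge) / width + 0.5)
```

Exactly on the edge half of the footprint is covered, so the result is 0.5; one half-width away on either side it reaches 0 or 1.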

Many times the continuity of derivatives is not that important (visible), that's why we can use saturated ramps (discontinuous in the first derivative) or saturated smoothsteps (discontinuous in the second), with the big exception of manipulating inputs to specular shading. In that case, even second-derivative discontinuities can very clearly show, thus the need of the famous "smootherstep".
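As a reference for the three levels of continuity mentioned above, here is a small sketch of the usual ramp, smoothstep and smootherstep. These are the standard textbook polynomials; the Python function names are just for illustration (in a shader, saturate and smoothstep are intrinsics).

```python
def saturate(x):
    """Clamp to [0, 1]."""
    return max(0.0, min(1.0, x))

def linear_ramp(e0, e1, x):
    """Saturated ramp: continuous, but the first derivative jumps at e0 and e1."""
    return saturate((x - e0) / (e1 - e0))

def smoothstep(e0, e1, x):
    """Cubic 3t^2 - 2t^3: zero first derivative at both ends."""
    t = linear_ramp(e0, e1, x)
    return t * t * (3.0 - 2.0 * t)

def smootherstep(e0, e1, x):
    """Quintic 6t^5 - 15t^4 + 10t^3: zero first and second derivatives at both ends."""
    t = linear_ramp(e0, e1, x)
    return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)
```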

Anyhow. I usually have a bunch of functions around to help with ramps, triangle ramps, smoothsteps and so on, most of them are trivial and can be derived on paper in a second or so. Lately I had to use a few I didn't know before, so I'll be writing them down here.

Yes, all this introduction was useless. :)

- Smooth Min/Max

log(pow(pow(exp(x),s) + pow(exp(y),s),1/s))

This will result in a smooth "min" between x and y for negative values of s (which controls the smoothness of the transition), "max" for positive values.

For s=-1 this results in the "smoothest" min:

log(exp(x+y)/(exp(x)+exp(y)))

If you know that x,y are always positive a simpler formulation can be employed, as we don't need to go through the exponential mapping:

pow(pow(x,s) + pow(y,s),1/s)


Note also that if you need a soft minimum of more than two values, your expressions simplify, e.g. pow(pow(pow(pow(x,s) + pow(y,s),1/s),s) + pow(z,s),1/s) = pow(pow(x,s) + pow(y,s) + pow(z,s) ... ,1/s).

Note also the link between softmax and norm-infinity.
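A runnable sketch of the two formulations, in Python for convenience. The function names are mine, and the first one is algebraically the same expression as above, just rearranged (log-sum-exp style) so the exponentials can't overflow for large inputs; s must be non-zero.

```python
import math

def smooth_extremum(x, y, s):
    """log(pow(pow(exp(x),s) + pow(exp(y),s), 1/s)): soft min for s < 0, soft max for s > 0."""
    # Factor out the dominant term so every exp() argument is <= 0.
    m = min(x, y) if s < 0 else max(x, y)
    return m + math.log(math.exp(s * (x - m)) + math.exp(s * (y - m))) / s

def smooth_extremum_positive(x, y, s):
    """pow(pow(x,s) + pow(y,s), 1/s): valid only for strictly positive x and y."""
    return (x ** s + y ** s) ** (1.0 / s)
```

The larger the magnitude of s, the closer the result gets to the hard min or max; s=-1 reproduces the "smoothest" min given above.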

- A few notes on smoothsteps

Deriving smoothstep and smootherstep is trivial, just create a polynomial of the right degree (cubic or quintic) and impose f(0)=0, f(1)=1 and f'(0)=0, f'(1)=0 (and the same for f'' in case of smootherstep), solve and voila'.
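Spelled out, the derivation goes like this (the coefficient names a, b, c, d are just the generic polynomial coefficients):

```latex
% Cubic smoothstep
f(x) = a x^3 + b x^2 + c x + d, \qquad f'(x) = 3 a x^2 + 2 b x + c
\\
f(0) = 0 \Rightarrow d = 0, \qquad f'(0) = 0 \Rightarrow c = 0
\\
f(1) = 1 \Rightarrow a + b = 1, \qquad f'(1) = 0 \Rightarrow 3a + 2b = 0
\\
\Rightarrow a = -2,\; b = 3, \qquad f(x) = 3x^2 - 2x^3

% Quintic smootherstep: g(0) = g'(0) = g''(0) = 0 kills the three lowest-order terms
g(x) = a x^5 + b x^4 + c x^3
\\
g(1) = 1 \Rightarrow a + b + c = 1
\\
g'(1) = 0 \Rightarrow 5a + 4b + 3c = 0
\\
g''(1) = 0 \Rightarrow 20a + 12b + 6c = 0
\\
\Rightarrow a = 6,\; b = -15,\; c = 10, \qquad g(x) = 6x^5 - 15x^4 + 10x^3
```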


Once you do that, it's equally trivial to start toying around and derive polynomials with other properties. E.g. imposing derivatives only at one extreme:


You can have a "smoothstep" with non-zero derivatives at the extremes:


Or a quartic that shifts the midpoint:


It would seem that the more "properties" you need to have the higher degree polynomial you need to craft. Until you remember that you can do everything piecewise...
Which is basically making small, specialized splines. For example, a quadratic smoothstep can look like this:
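One possible piecewise version, as a sketch: two parabolas glued at the midpoint, with matching value and slope there. This is a common construction and an assumption on my side of what such a function can look like, not necessarily the exact curve plotted in the original figure.

```python
def quadratic_smoothstep(x):
    """Piecewise-quadratic smoothstep on [0, 1] made of two mirrored parabolas."""
    t = max(0.0, min(1.0, x))
    if t < 0.5:
        # First half: parabola with zero slope at t = 0.
        return 2.0 * t * t
    # Second half: mirrored parabola with zero slope at t = 1.
    u = 1.0 - t
    return 1.0 - 2.0 * u * u
```

Both halves have slope 2 at t = 0.5, so the first derivative is continuous; the second derivative jumps from +4 to -4 there.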


This is helpful also because there are certain tradeoffs based on applications, especially as having continuous derivatives doesn't automatically mean that it will be nice looking...
You can make functions that impose more and more derivatives (and do you know that smoothsteps can be chained? smoothstep(smoothstep(x))...) but that doesn't mean the derivatives will "behave", as they can vary wildly in the domain and result in visible "wobbling" in shading.


Another thing that you might not have noticed is how close smoothstep is to a (shifted) cosine. I didn't before a coworker of mine, the all-knowing Paul Edelstein, mentioned it. Probably not too useful, but you never know, in certain situations it might be applicable and cheaper.
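A quick way to check how close the two are is to sample both and look at the largest difference. This sketch assumes the shifted cosine in question is 0.5 - 0.5*cos(pi*x), which is the natural candidate; the function names are hypothetical.

```python
import math

def smoothstep01(x):
    """Cubic smoothstep on [0, 1]."""
    t = max(0.0, min(1.0, x))
    return t * t * (3.0 - 2.0 * t)

def cosine_step01(x):
    """Half a cosine period, shifted and scaled to go from 0 to 1 on [0, 1]."""
    t = max(0.0, min(1.0, x))
    return 0.5 - 0.5 * math.cos(math.pi * t)

def max_abs_difference(samples=1001):
    """Largest absolute difference between the two curves on a regular grid."""
    return max(abs(smoothstep01(i / (samples - 1)) - cosine_step01(i / (samples - 1)))
               for i in range(samples))
```

The difference stays on the order of 0.01 over the whole interval, and the two curves agree at 0, 0.5 and 1.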


- Sigmoid functions

Another class of functions that are widely useful are sigmoids, "s shaped functions"

Smooth Sigmoid: x/pow((pow(abs(x),s)+1),1/s)
Logistic: 1/(1+exp(-x))

Sigmoids are similar to smoothsteps, but usually reach zero derivatives at infinity instead of at the 0,1 endpoints.


They make nice "replacements" for "step" as they approach nicely their limits as they go to infinity:


But also for saturated ramps, especially the smooth sigmoid as it has f'(0)=1 as we have shown before.
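The two sigmoids above, as a runnable sketch (Python, names mine). The smooth sigmoid assumes s > 0; the logistic is written in its two-branch form only so that very negative inputs don't overflow the exponential.

```python
import math

def smooth_sigmoid(x, s):
    """x/pow(pow(abs(x),s)+1, 1/s): goes from -1 to 1, with slope 1 at x = 0 (s > 0)."""
    return x / (abs(x) ** s + 1.0) ** (1.0 / s)

def logistic(x):
    """1/(1+exp(-x)): goes from 0 to 1, with value 0.5 at x = 0."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # same function, rearranged to avoid exp() overflow
    return e / (1.0 + e)
```

Higher values of s make the smooth sigmoid hug the saturated ramp clamp(x, -1, 1) more tightly.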


Another sigmoid is the Gompertz function, which has nice and clear parameters:

asymptote*exp(-displacement*exp(-rate*x))

Beware though, it's not symmetric around its midpoint:
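A small sketch to see the asymmetry in numbers (Python, parameter names taken from the formula above). The inflection point sits at x = log(displacement)/rate, where the curve is at asymptote/e, about 37% of the asymptote rather than 50%.

```python
import math

def gompertz(x, asymptote, displacement, rate):
    """asymptote*exp(-displacement*exp(-rate*x))"""
    return asymptote * math.exp(-displacement * math.exp(-rate * x))

def gompertz_inflection(displacement, rate):
    """x position of the steepest point of the curve (needs displacement > 0, rate > 0)."""
    return math.log(displacement) / rate
```

Evaluating the curve at equal distances on either side of the inflection point gives values that are not mirror images of each other, unlike the logistic.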


There are a ton more, but I'd say not as generic. If you look at the various tonemapping curves, most of them are sigmoids, but most of them are in exponential space and not symmetric.
In fact at a given point I made tonemapping curves out of sigmoids, piecewise sigmoids or other weird things glued together :)



- Bias and Gain (thanks to Steve Worley for reminding me of these)

Bias pow(x,-log2(a))
Gain if x < 0.5 then 0.5*bias(2*x, a) else 1-0.5*bias(2-2*x, a) 

Schlick's Bias x/((1/a-2)*(1-x)+1)
Schlick's Gain if x < 0.5 then SBias(2*x,a)*0.5 else 1-0.5*SBias(2-2*x,a)

Bias is just a power (-log2(a) only maps 0...1 to the power), and Gain maps one power next to a mirrored copy around the midpoint, the easiest way you can construct a piecewise sigmoid (without imposing conditions on the derivatives and so on).

Schlick's versions were published in Graphics Gems IV, and are not only an optimization of the original Bias/Gain formulas (credited to Perlin's Hypertexture paper), but are symmetric over the diagonal, which is a nifty property (it also means that for parameter a the inverse curve is given by the same formula with 1-a)
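The four formulas above transcribed into a runnable sketch (Python, function names mine), assuming x in [0,1] and the parameter a strictly between 0 and 1:

```python
import math

def bias(x, a):
    """pow(x, -log2(a)): maps 0.5 to a."""
    return x ** (-math.log2(a))

def gain(x, a):
    """Bias next to a mirrored copy of itself; always passes through (0.5, 0.5)."""
    if x < 0.5:
        return 0.5 * bias(2.0 * x, a)
    return 1.0 - 0.5 * bias(2.0 - 2.0 * x, a)

def schlick_bias(x, a):
    """Schlick's rational approximation of bias: also maps 0.5 to a."""
    return x / ((1.0 / a - 2.0) * (1.0 - x) + 1.0)

def schlick_gain(x, a):
    """Schlick's gain, built from schlick_bias exactly like gain is built from bias."""
    if x < 0.5:
        return 0.5 * schlick_bias(2.0 * x, a)
    return 1.0 - 0.5 * schlick_bias(2.0 - 2.0 * x, a)
```

The inverse property is easy to check: schlick_bias(schlick_bias(x, a), 1 - a) gives back x.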



- Smooth Abs


Obviously if you have any "smoothstep" you can shift it around zero to create a "smoothsign" and multiply by the original value to get a smoothed absolute. The rational polynomial sigmoid works quite well for that:


SmoothAbsZero d*x*x/sqrt(1+d*d*x*x)

If you don't need to reach zero at x=0 then you can simply add an epsilon to the square root of the square of your input, yielding this


SmoothAbs sqrt(x*x+e)
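Both variants as a runnable sketch (Python, names mine). Here d controls how sharp the corner at zero is for the first version, and e is the epsilon of the second; both are assumed positive.

```python
import math

def smooth_abs_zero(x, d):
    """d*x*x/sqrt(1+d*d*x*x): smooth |x| that still reaches 0 at x = 0."""
    return d * x * x / math.sqrt(1.0 + d * d * x * x)

def smooth_abs(x, e):
    """sqrt(x*x+e): smooth |x| that bottoms out at sqrt(e) instead of 0."""
    return math.sqrt(x * x + e)
```

Away from zero both converge to abs(x); near zero the first behaves like d*x*x, while the second flattens out at sqrt(e).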


And that's all I have for now. If you've encountered other nifty functions for modelling and tinkering with procedurals and so on, let me know in the comments!
I'm always looking for nifty functions that can be useful for sculpting mathematical shapes :)

- Bonus example: Soft conditional assignment


Some links: