24 August, 2019

Misunderstanding Multilayering (Diffuse-Specular Energy Conservation)

- Introduction to the problem

In our last episode, "misunderstanding multiscattering", we saw how to create a multiscattering BRDF mostly by intuition. We used the concept of directional albedo (a.k.a. directional-hemispherical reflectance) to normalize a specular BRDF and from there we derived a very simple closed-form approximation for GGX energy conservation.

We also showed that the directional albedo is the response of our BRDF in a white light furnace at different normal to view angles, and we typically have that information available in tabular form as it's a key component of the "split sum" approximation to the integrals needed for image-based lighting on contemporary BRDFs. Great!

This time, we'll show how the very same data, ideas, and intuitions, can be used to build multilayered BRDFs, and in particular to couple matte-specular models (diffuse lobes to specular lobes). 
Like the previous article, this is not anything particularly novel, this isn't Siggraph worthy material, just some notes on the subject.

Let's start, as always, with the problem, with a very quick recap of things everyone reading this will probably already know (my flight has been delayed a couple of hours, so yes, we have time for recaps).

Under the framework of geometrical optics we look for light interactions with materials at surfaces where the index of refraction changes. We assume that light travels in a medium of uniform IOR (e.g. air), and then eventually hits a surface with a different IOR and either gets reflected out of the surface, or refracted into it.
In fact in real-time rendering we only consider the air->material interface, we don't handle nested objects (e.g. air->glass->water->glass->air) even when we render transparent materials or multilayered materials. Which is wrong, but it's just one of the many ways we are still totally wrong (and yet might or might not be right by not caring). Even for offline path-tracers, this is not entirely trivial, depending on how you do light transport.

We then consider, for a small but not infinitesimal patch of our surface, the statistics of how probable these reflections and refractions are as a function of the light incident angle, a given outgoing angle we want to measure, and the surface patch normal, and these statistics create a BRDF "lobe".

But BRDFs don't typically have a single lobe: we usually have "metals" and "dielectrics" and in the latter case we have a specular and a diffuse lobe. How come? Well, because we don't really consider a single interaction, that was sort-of a lie. We do create a lobe with the light that is scattered from the surface interaction, but we also have to consider the light that gets refracted into the surface. In the case of metals, the energy that goes into the material somehow disappears (becomes heat or maybe fairy dust who knows - physics) and we effectively have a single lobe. 
For dielectrics, though, the refracted light keeps hitting molecules inside what's essentially a participating medium, and eventually, taking random paths, it comes back out from the surface and we model that as a diffuse lobe. 

Diffuse lobes are reasonable when the "bouncing around" is such that it effectively randomizes the direction at which the light comes out, but still happens in such a small space that we consider the light coming out effectively at the same point it came in. 
If that's not true, and the scattering distances are bigger, we have what we call "subsurface scattering" effects. If the direction is not randomized "enough", then instead we have transparent materials, and somewhere in the middle, we have what we usually address as participating media.

Any time that we consider multi-layered materials we have to model the way the light that is refracted by a layer reaches the next. At the very least, we have to know how much energy is "handled" by a given layer/lobe and how much is "passed down" to the subsequent lobes, as if we gave the same input light to all the layers we would sum the interactions up and potentially end up with more energy coming out of a material than came in!
In theory we should understand much more than that, namely how the light travels in a given layer and reaches another one (its statistical distribution over directions and space), but for diffuse lobes found in dielectrics we can imagine that it doesn't matter much (anyways the light is going to be randomized...), so right now we just want to know how much energy survives the topmost specular lobe of a dielectric (gets refracted as opposed to reflected).

This might seem a lot of handwaving. And it is! This is the point of this post -and- the previous one!

We know that we have certain problems in our math, but should we care? How much should we care and why? Does it matter for the goal of generating photorealistic images easily? Remember, this is our goal, not fixing physics. Then again if we wanted to look at the physics there are a ton of assumptions that we are making anyways.

Left to right: Specular only, Specular+Diffuse, Specular+Diffuse*(1-Fresnel)
It turns out that this problem matters a lot, much more in fact than the energy loss in non-multiscattering specular lobes. The multiscattering issue was small, happened only at high roughnesses and it was entirely fixable by artists tweaking specular albedo (typically controlled by Fresnel f0 parameter) a bit.

Tweaking albedo is fine because it still keeps the materials decoupled from lighting, the main issue we wanted to fix by embracing physically-based rendering models, and it's doable because the tweaks needed are small and entirely in the range of realistic albedos (which don't ever go to f0 = 1 for metals, and of course even less so for dielectrics).

Not having any energy conservation between specular and diffuse instead creates overly bright materials that can look fine in a certain scene but will glow unnaturally under different conditions, and it's quite hard for artists to make sure that the lobes are tuned in a way that doesn't produce more energy than they should.
In fact, most real-time rendering today works without multiscattering BRDFs, but (hopefully) it always considers some way of balancing specular and diffuse.

- Solutions and pretty images

Now we know the problem -and- we know it matters, so we are justified in spending some time to investigate further. How? Well, as we did with the previous post, we bring back our friend the white furnace test. Pretty much any time we want to check for energy conservation, we take our materials and put them in a furnace!


As with the previous time though, just putting a material in a furnace doesn't tell us that much. Yes, it's already apparent how the middle image looks unnaturally bright, but that's not a great test. What we want is to set up our scene so we expect the material to reflect back exactly all the light that comes in, no matter which path the light takes in the material and across the microfacets, and then see if we get more light than we expect or less.
We have to think, what is it that is absorbing light in our materials? The specular reflects light out or refracts it in, it doesn't absorb, so all the energy loss is when we go "inside" the material, in the dielectrics case, in the diffuse layer. It stands to reason then that if we set our diffuse albedo to one, regardless of the specular parameters, we should be perfectly white in a white furnace. Let's see:

Comparing no energy conservation to (1-Fresnel) with a fully white diffuse albedo.
Bingo! We see now that just adding diffuse with no attempt at energy conservation does indeed result in energy being created. And surprisingly it looks like the simple idea of using (1-F) as a multiplication factor to normalize diffuse is indeed doing a very good job, in fact, it would look alright if it wasn't for the grazing angles... Why is that?

(1-Fresnel) at varying roughness (from 0.1 to 0.9)
Same as above, but using a non-multiscattering Specular instead of the simple one presented in the last post.
Well, if we think about it, it's fairly obvious. For the terms we commonly use, at normal incidence (N=V=L) the shadowing term has no effect and the NDF takes a constant value regardless of the roughness parameter, Fresnel dominates. At grazing angles, the shadowing term starts to matter, and that is where we see our simple normalization breaking down.

What can we try to fix this? If you read the previous post, it should be obvious by now. We have something that should be white in a furnace, it isn't, let's make it! We know how much energy the specular lobe will scatter back in the furnace, for any roughness, Fresnel f0, and viewing angle, this again is the split-sum table we use for environment map lighting. We know that a Lambert diffuse is correctly normalized, so it will scatter back all incoming energy if the albedo is set to one. So how much do we have to scale the lobe for the sum of the specular plus diffuse to be one, with an arbitrary specular and a unitary albedo diffuse? Obviously just by 1-E, where E is what we get from the split-sum table!

(1-E) energy conservation test, notice the more correct grazing angles.
Note: some artifacts remain due to approximations in the Directional Albedo table I used.
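To make the (1-E) idea concrete, here is a minimal Python sketch (a shader would look much the same). It assumes the split-sum table has already been sampled for the current roughness and ndotv, giving a `scale` and a `bias` term; the function names and the numeric table values below are made up for illustration and are not taken from any real table.

```python
def specular_albedo(f0, scale, bias):
    """Directional albedo E of the specular lobe, rebuilt from the
    split-sum terms: E = f0 * scale + bias."""
    return f0 * scale + bias


def diffuse_weight(f0, scale, bias):
    """Energy left over for the diffuse lobe: 1 - E."""
    return 1.0 - specular_albedo(f0, scale, bias)


def furnace_response(diffuse_albedo, f0, scale, bias):
    """White-furnace response of specular plus (1 - E)-scaled Lambert diffuse.
    A normalized Lambert lobe returns exactly its albedo in a furnace."""
    e = specular_albedo(f0, scale, bias)
    return e + diffuse_albedo * (1.0 - e)


# Hypothetical table sample (scale, bias) for some roughness / ndotv.
scale, bias = 0.9, 0.05
print(furnace_response(1.0, 0.04, scale, bias))  # white diffuse albedo
print(furnace_response(0.5, 0.04, scale, bias))  # darker diffuse albedo
```

With a diffuse albedo of one the sum is one whatever the specular parameters are, which is exactly the furnace condition we asked for; any energy loss then comes only from the diffuse albedo.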
Furnace fixed! And note, this works regardless of whether we're using the multiscattering specular BRDF or not, as a non-multiscattering one will just behave as if it's refracting more light to the diffuse layer, which in our case will still bounce it all back out. And, if we're using the simple renormalization for multiscattering presented in the previous post, we don't even need to compute a new table for directional albedo, as it's easy to derive how to analytically modify the output of the single scattering table to accommodate for the multiscattering correction (I'll leave this as an exercise for the reader, it's trivial and might motivate you to actually look at the definition of directional albedo...)

As it was for the multiscattering normalization idea, this is wrong and we can see immediately that it's wrong from the equation, even if the error won't show in the furnace, because we don't respect reciprocity (we consider only ndotv and not ndotl). It's even intuitive, as we don't seem to consider at all that the light going from the specular layer to the diffuse, scattering inside the material and eventually coming out, has to cross again the specular interface as it comes out!

Before we delve further into this, though, I want to stop for a second and check out what we are doing (again, same as last time). Let's plot the 1-E multiplicative factor we're applying to diffuse and see what happens:
(1-E) normalization compared to (1-Fresnel) at varying roughness, for a non-multiscattering specular.

Same as above, but using simple multiscattering specular. 
This is interesting! Although we can couple diffuse with and without a multiscattering specular, the plots for the multiscattering case are much simpler, so much so that it's easy to derive, by hand, a function that would approximate them. Fun!

Approximation: mix( (vec3(1,1,1) - fresnel(f0,ndotv) ), vec3(1) - f0, roughness )
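As a quick sanity check of the approximation in the caption above, here is a scalar Python transcription. The caption doesn't say which `fresnel` function is meant, so this sketch assumes Schlick's approximation; `mix` mirrors the GLSL built-in.

```python
def schlick_fresnel(f0, ndotv):
    """Schlick's Fresnel approximation (an assumption of this sketch)."""
    return f0 + (1.0 - f0) * (1.0 - ndotv) ** 5


def mix(a, b, t):
    """Linear interpolation with the same argument order as GLSL mix()."""
    return a * (1.0 - t) + b * t


def diffuse_weight_approx(f0, ndotv, roughness):
    """Blend from (1 - Fresnel) at low roughness to (1 - f0) at high roughness."""
    return mix(1.0 - schlick_fresnel(f0, ndotv), 1.0 - f0, roughness)


for roughness in (0.1, 0.5, 0.9):
    print(roughness, diffuse_weight_approx(0.04, 0.2, roughness))
```

At ndotv = 1 Schlick's Fresnel equals f0, so both ends of the blend agree and roughness has no effect; the two only diverge towards grazing angles, which is where the (1-Fresnel) factor was breaking down.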
And now, at last, let's have a look at a more correct solution that respects reciprocity and see if that matters or not for real-time rendering. This is quite hard to know intuitively, so we'll have just to implement something and check.

Luckily this is nothing new, in 2001 Csaba Kelemen and László Szirmay-Kalos published "A Microfacet Based Coupled Specular-Matte BRDF Model with Importance Sampling", and we can pretty much copy and paste their equation, which unsurprisingly ends up looking very close to the way we adjust for reciprocity in specular multiscattering (in fact Kulla and Conty cite the aforementioned KSK paper in theirs).

Without further ado:

Left to right: Simple EC, Fresnel EC, KSK EC.
Simple Multiscattering GGX, Roughness 0.1
Same as above but for 0.3 roughness GGX.
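For reference, the KSK coupling is usually written as diffuse = albedo * (1 - E(ndotl)) * (1 - E(ndotv)) / (pi * (1 - E_avg)), where E_avg is the cosine-weighted average of the directional albedo E over the hemisphere. The Python sketch below checks that form numerically. Note that `toy_specular_albedo` is a made-up stand-in for a real table lookup (it is not a GGX albedo), and all function names are illustrative.

```python
import math


def toy_specular_albedo(mu):
    """Stand-in for a table lookup E(mu), with mu = cos(theta). NOT real GGX data."""
    return 0.04 + 0.96 * (1.0 - mu) ** 5


def average_albedo(albedo_fn, steps=1024):
    """E_avg = 2 * integral over [0, 1] of E(mu) * mu dmu (midpoint rule)."""
    d = 1.0 / steps
    return sum(2.0 * albedo_fn((i + 0.5) * d) * (i + 0.5) * d * d
               for i in range(steps))


def ksk_diffuse(rho, ndotl, ndotv, albedo_fn, e_avg):
    """Coupled diffuse BRDF value; symmetric in ndotl and ndotv (reciprocal)."""
    return (rho * (1.0 - albedo_fn(ndotl)) * (1.0 - albedo_fn(ndotv))
            / (math.pi * (1.0 - e_avg)))


def furnace_diffuse(rho, ndotv, albedo_fn, e_avg, steps=1024):
    """Integrate the diffuse lobe times cos(theta) over the hemisphere."""
    d = 1.0 / steps
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) * d
        total += ksk_diffuse(rho, mu, ndotv, albedo_fn, e_avg) * mu * 2.0 * math.pi * d
    return total


e_avg = average_albedo(toy_specular_albedo)
ndotv = 0.7
# Specular albedo plus white coupled diffuse should add up to one.
print(toy_specular_albedo(ndotv) + furnace_diffuse(1.0, ndotv, toy_specular_albedo, e_avg))
```

In the furnace the coupled diffuse integrates to 1 - E(ndotv), the same value the simple (1-E) scaling gives, so the two only differ in how that energy is distributed over light directions.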

12 August, 2019

Misunderstanding Multiscattering

Today we will build a state-of-the-art multiscattering GGX BRDF for physically-based rendering. 

Already done you say? Yes, you're right, by people who understand maths and physics, that's cheating. We instead will try to do the same trying to learn as little as possible...

If you want to do this the right way instead, these are some excellent articles:
Ok, ready? 

Let's start from your vanilla contemporary GGX with all the correlated-Smith fixings. The following sequence of images is generated with Disney's BRDF explorer environment lighting mode. They show a metallic GGX with varying roughness, from 0.3 to 0.9, using the common parametrization alpha = roughness squared.
Note: please always indicate the parametrization you are using in your articles and presentations. There are so many variants out there. Here I use one of the most common ones, even if, for what it's worth, probably the best idea is alpha = roughness^4, which is more perceptually linear and also manages to be a great approximation for the one-over-square-root Blinn-Phong Gloss to GGX mapping many engines prefer.


We can notice how at high roughness the material looks noticeably darker. This is annoying, and smart people have worked hard to find and fix this issue. 

Let's compare the image above with Kulla/Conty's Multiscattering GGX BRDF.


Fixed! 

Now, we could try to understand how this solution works, the math is already there. And you probably should, it's both interesting and useful to understand other related problems. But, let's try not to...

First of all, we should ask why is this a problem? How do we know it is a problem? And also, if it is a problem, how big of a problem it is?

Let's start at the top. Why do we like "PBR"? It's not because we like physics, at least, I don't... And if we did like physics, this would be a small error to focus on, considering that we are surely committing worse sins in our end-to-end image pipeline...

Computer graphics is not predictive rendering. We don't try to simulate physics to make accurate simulations, that's not a goal (and the people who care about physics are an order of magnitude ahead of us). We make pretty pictures.

At a given point, we noticed that using some physics to make pretty pictures allowed for easier workflows, decoupling materials from lighting, reducing the number of hacks and parameters, allowing to use libraries of real materials and so on. 
We thought artists would be happier and be able to work more efficiently, while still being able to create a lot of different scenes. It took a while to move over all our workflows, but it was the right call.

So, when we look for right and wrong, we have to start with art and artists. If there is a need we can identify there, then we can look at physics to see if the answers are there. In this case, we can say that yes, indeed we have a (small) problem. 
It would be nice if our material parameters were orthogonal, and having things go darker as we change roughness is not ideal.

Let's move to step two then. Can physics help? Is this darkening physically correct, or not? How do we test? Enter the furnace! Let's put our "GGX coated" metallic object in a uniformly lit environment and see what happens:


Light hits our surface, hits our microfacets. The microfacets reflect back some light, and refract some other, according to Fresnel. If we're assuming a metal, people told us that the refracted light, which goes "inside" the surface, is quickly absorbed and never comes out (becomes heat). 

It's reasonable that even in a furnace our metallic object has a color, as some of the energy will be absorbed. But what if we take our microfacets and make them always reflect all the light, no absorption? 
This means we need to set our f0 to 1 (remember, Fresnel is what controls how much light the microfacets scatter), let's try that and see what happens:


The object is still not white. Something's wrong! Ok, the inquisitive reader might still say - how do you know it's wrong. Perhaps certain directions scatter more light and others less, so we still see some shading even if all the light eventually comes out... 
If we think of BRDFs and lobes it's not easy to get the correct intuition. Let's instead think of what a light path, starting from the camera, looks like. It might hit some of the microfacets, one or many times, bounce around then eventually escape and connect to the furnace environment, which will always be emitting a given constant energy.
How much of that energy reaches the camera? All of it! Because regardless of which and how many microfacets we hit, all the energy is reflected and none is absorbed.

Now we have an intuition of why the image above should be white, it isn't, so we have a problem in our math...

If you studied BRDFs you know that in microfacet models we have a masking-shadowing visibility function that models which microfacets are occluded by others. What we don't typically model though is the fact that these occluders are themselves microfacets, so the light should bounce around and eventually get out, not just be discarded. 
This is what the multiscattering models model and fix, and indeed if we did put in a furnace Kulla and Conty's GGX that we have shown before, it would generate a boring and correct white image for a fully reflective material, regardless of its roughness.

Kulla's model is not trivial though, and in many cases is likely not worth using to fix such a relatively minor problem. So. Can we be more ignorant? What if we knew how much light our BRDF gives out in a furnace, for a given roughness and viewing angle (fixing then again f0 = 1)? Could we just take that value and normalize the BRDF with it? 

Spoiler alert, we can. And not only we can but it's also trivial to do, because we already have this "furnace" value in most modern engines, in the look-up tables used for the popular split-sum image based lighting approximation. 


The split-sum table boils down the "BRDF in a furnace" (also known as directional albedo or directional-hemispherical reflectance) to a scale and bias (add) factor to be applied to the Fresnel f0 value.
In our case, we want to normalize considering f0=1 so all we do is to scale our BRDF lobe by one over bias(roughness,ndotv) + scale(roughness,ndotv). This is the result:
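In code, the normalization just described might look like the Python sketch below. It assumes `scale` and `bias` have already been fetched from the split-sum table for the current roughness and ndotv; the function names and the sample values are made up for illustration.

```python
def furnace_albedo(f0, scale, bias):
    """Directional albedo of the single-scattering lobe from the split-sum terms."""
    return f0 * scale + bias


def normalization_factor(scale, bias):
    """Scale that makes the lobe white in a furnace at f0 = 1."""
    return 1.0 / (bias + scale)


def normalized_furnace_albedo(f0, scale, bias):
    """Furnace response after multiplying the BRDF by the normalization factor."""
    return furnace_albedo(f0, scale, bias) * normalization_factor(scale, bias)


# Hypothetical table sample for a rough surface: the lobe loses 30% of the energy.
scale, bias = 0.5, 0.2
print(furnace_albedo(1.0, scale, bias))             # below one: energy is lost
print(normalized_furnace_albedo(1.0, scale, bias))  # back to one after normalizing
```

The factor depends only on roughness and ndotv, not on f0, so every color channel is brightened by the same amount.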


Indeed, we're getting some energy back at high roughness, and if we tested this in a furnace it would come up white, correctly, at f0 = 1. But it's also different from Kulla's, in particular, color is not quite as saturated in the rough materials. How come? Well again, if you studied this problem already (cheater!) you know the answer.

Proper multiscattering adds saturation because as light hits more and more microfacets before escaping the surface, we pick up more color (raising a color to a power results in a more saturated color). But so how can this be physically wrong, but still correct in our test? 

Well, it's simple, really. By not simulating this extra saturation we are still energy conserving, but we changed the meaning of our BRDF parameters. The f0 "meaning" in our "ignorant" multiscattering BRDF is not the same as Kulla's, it results in a different albedo, but the BRDF itself is still energy conserving, it's just a different parametrization. 
Most importantly, I'd argue it's a better parametrization! Then again, remember our objectives. We don't do physics for physics' sake, we do it to help our production.

We wanted to make our parameters more orthogonal so that artists don't need to artificially "brighten" our BRDF at high roughness. If we went with the "more correct" solution (about that, this recent post by Narkowicz is a great read) we would add a different dependency: now roughness, instead of darkening our materials, makes them more saturated, which would kind of defeat the purpose. 
You can imagine some scenarios where this might be desirable, but I'd say it's almost always wrong for our use-cases.

If we wanted to simulate the added saturation, there are a few easy ways. Again, going for the most "ignorant" (simple) we can just scale the BRDF by 1+f0*(1/(bias(roughness,ndotv) + scale(roughness,ndotv)) - 1), resulting in the following:
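A small Python sketch of that scale factor, applied per color channel. As before, `scale` and `bias` are assumed to come from the split-sum table; the table values and the f0 color below are made up for illustration.

```python
def saturating_factor(f0, scale, bias):
    """1 + f0 * (1 / (bias + scale) - 1): no boost at f0 = 0,
    the full furnace normalization at f0 = 1."""
    return 1.0 + f0 * (1.0 / (bias + scale) - 1.0)


# Hypothetical table sample for a rough surface and a warm-colored f0.
scale, bias = 0.5, 0.2
f0_rgb = (0.9, 0.6, 0.3)
print([saturating_factor(c, scale, bias) for c in f0_rgb])
```

Channels with a higher f0 get a larger boost, which is where the extra saturation comes from, and at f0 = 1 the factor is again 1/(bias + scale), so the furnace result stays white.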


Now we are close to Kulla's solution. This is still not entirely correct. If you wonder why Kulla's approximation is better, as it might not show in images, it's because ours doesn't respect reciprocity.
This might be important for Sony Imageworks, as certain offline path-tracing light transport algorithms do require it, but it's fairly irrelevant for us.

Ok, so now that we have found an approximation we're happy with we can (and should) go a step further and see what we are really doing. Yes, we have a formula, but it depends on some look-up tables. 
This isn't a big issue as we need these tables around for image-based lighting anyways, but it would be an extra texture fetch and in general it's always important to double-check our math, so let's visualize the 1/(bias(roughness,ndotv) + scale(roughness,ndotv)) function we're using:


It looks remarkably simple! Indeed, it's so simple it has a trivial approximation, you don't even need any sophisticated tool to find it so I'll just show it: 1 + 2*alpha*alpha * ndotv. This fits very well:

Approximation compared to the correct normalization factor (gray surface)
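Written out as a Python sketch, using the alpha = roughness squared parametrization stated at the start of the post (the function name is made up for illustration):

```python
def normalization_fit(roughness, ndotv):
    """Closed-form stand-in for 1 / (bias + scale): 1 + 2 * alpha^2 * ndotv."""
    alpha = roughness * roughness  # parametrization used in this post
    return 1.0 + 2.0 * alpha * alpha * ndotv


for roughness in (0.3, 0.6, 0.9):
    print(roughness, normalization_fit(roughness, 1.0), normalization_fit(roughness, 0.1))
```

Since alpha squared is roughness to the fourth power, the boost is negligible for smooth surfaces and only ramps up at the rough end, matching where the darkening was visible.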

You can see that there is a bit of error in the furnace test. We could improve by doing a proper polynomial fit - yes, it turns out "2" and "1" in the formula above are not exactly the best constants. 
Actually defining "best" would be a problem all on its own, because doing a simple mean-square minimization on the normalization function doesn't really make a lot of sense (we should care about end visuals, perceptive measures, which angles matter more and so on), and I think we already spent too much time for such a small fix.
Moreover, the actual rendered images are really hard to distinguish from the table-based solution.

What is interesting is to see what we could achieve if we wanted to be even simpler, dropping the dependency on ndotv. It turns out in this case 1 + alpha*alpha does the trick decently as well.
This means that we're just applying a multiplicative factor to our BRDF to brighten it at high roughness, which just makes a lot of sense.

This of course adds more error and it starts to show as the BRDF shape changes, we get more energy at grazing angles on rough surfaces, but it might just be good enough depending on your needs:

Under-exposed to highlight that with the simpler approximation sometimes we lose light, sometimes we add.
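The view-independent variant as a Python sketch, next to the ndotv-dependent fit for comparison (alpha = roughness squared as elsewhere in the post; function names are made up). One way to read the simpler formula, which is an observation and not necessarily how the constant was chosen, is that it equals the ndotv-dependent fit evaluated at ndotv = 0.5.

```python
def normalization_fit(roughness, ndotv):
    """View-dependent fit: 1 + 2 * alpha^2 * ndotv."""
    alpha = roughness * roughness
    return 1.0 + 2.0 * alpha * alpha * ndotv


def normalization_fit_simple(roughness):
    """View-independent fit: 1 + alpha^2."""
    alpha = roughness * roughness
    return 1.0 + alpha * alpha


roughness = 0.8
for ndotv in (0.1, 0.5, 1.0):
    # Positive difference: the simple fit adds light; negative: it loses light.
    print(ndotv, normalization_fit_simple(roughness) - normalization_fit(roughness, ndotv))
```

Relative to the ndotv-dependent fit, the simple one over-brightens towards grazing angles and under-brightens facing the camera, consistent with sometimes losing and sometimes adding light.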

18 July, 2019

What makes "10x" engineers. A complete hypothesis.

This happened on Twitter recently:


It's a long thread and I censored the author's name because it doesn't matter and I don't want to add to the hate that already naturally happens on any social media these days. 
To be honest, I'm still not entirely sure it's not just a parody, but I think it isn't. Regardless, it seemed a good excuse for a blog post.

Let's for a moment forget about how good or bad the term is. I don't particularly love discussions around words. I've never used "10x" in my career and I don't find it particularly great (also because it's linked to a certain start-up culture that I don't particularly enjoy), but I don't really want to open that can of worms.

What's wrong with a description of a "10x" engineer like the one above? 

I hope it's obvious but I'll spell it out. It tries to describe an engineer not in terms of the work they do, the actual output and results, but in some kind of mystical terms.
Like you can infer people's skills by looking at how they dress, what they eat, what time they come to the office and so on.

It doesn't help that the attributes this person uses as signals for a "10x engineer" are all kinda on the introverted side of the spectrum, to be charitable, as if introversion per se were a test of some kind of smartness. It's close to the old Hollywood trope of the computer guy being a weird, overweight, sexually frustrated white guy.

Ok, now that we closed the chapter on the critique of random people's tweets, let's get to something more interesting... 

Do "10x" engineers even exist? What are they? And what makes them?

Let's start from that last one. 

What's a 10x engineer? Typically we think of "10x" engineers as people that are much more productive than the average when working with code. Code "wizards", "ninjas", "rockstars" or whatever else cringe-worthy moniker teenagers use.

In my experience, 10x engineers do exist. Even controlling for seniority, knowledge, and skills, productivity is not uniform among people. I hope this is not controversial, one can know something, be even experienced in doing a given thing with a good track record, and yet not be as effective at doing it as others.

Should you hire only "10x" people? Definitely! Sort-of. In a way... We all look for excellent people, of course, and being able to distinguish good from great among a given seniority level is certainly important.
That said, there are lots of ways to be excellent beyond coding. In a decently-sized team other aspects might even be more important, mentoring, coordination, project management and so on. As things grow code usually tends towards being an implementation detail, so to speak, secondary to product and people concerns.

Even if we just look at coding, there are lots of kinds of engineers, people who are great at handling huge, foreign code-bases, people who are great at fixing things, people who are great at creating new things, people who are great architects and so on...

Lots of things in which you can be "10x" - but still, the concept of productivity being separate from skill generally holds.

These multipliers are also among the hardest to assess in interviews because, again, we're saying they don't simply correlate with what a person has on the CV or their ability to answer technical questions. Correctly characterizing where this productivity comes from is thus of utmost importance.

Why are some people more effective than others, given the same skillset? Where does that effectiveness come from? Is it their choice of editors? Their typing speed? Some sort of flow-related supernatural focus? I think not.

First Hypothesis: Output = Skill * Effort * Allocation

Skill is knowledge and experience, what we usually mostly correlate with seniority levels. As an analogy, I would say this is the value of the chips you have, at a gambling table.

Effort, given the same workday, is mostly focus. It's a time management skill, the ability to execute your tasks in good-sized chunks. In the gambling analogy, this would be the number of chips you have.

Focus is partially environment-related, but what we don't say often enough is that a lot of it is a skill. A lot of it also relates to how much a person likes doing a given thing, how fit they are for the job at hand. Effective managers who try to hire and allocate people to do the things they are passionate about can thus help to get the most from the effort multiplier.

The last aspect, what I called allocation. This is how you spend the chips you have. Second hypothesis: correct allocation is what "makes" a 10x engineer.

In other words, we all have a number of chips to spend each day on our tasks. And controlling for seniority levels more or less effectively captures this number, there isn't that much variance in that.
The part that has a lot of variance is the allocation. Not in what to do, as in prioritizing this bugfix to that feature (even if such skills are also fundamental and have very high variance we capture them well in job categories, think technical director versus principal engineer for example) but how we do things.

Do I use a scripting language, should I implement things in C, or maybe I should learn that fancy new language everyone's talking about? Do I rely on a library or write from scratch? Do I need to understand the overall architecture of this software? Do I need to understand the specifics of the functions I'm calling? When is it appropriate to be sloppy? Should I jump into prototyping or do I need to learn about the state of the art first? Should I go deep or wide?

We always have a limited amount of resources, ability to keep things in our brain, of doing work. And software design has a lot of different dimensions, abstraction versus specificity, generalization versus integration, high-level versus low-level concerns and so on. 
The ability to navigate this design space and select the right tools for the job, both in terms of concrete artifacts (code, libraries, languages, IDEs) and of abstract methodologies, makes a huge difference. One thing is to know about things, the other is to be able to critically evaluate tradeoffs and allocate (your) resources well.

Third Hypothesis. The reason why "10x" skills look mystical is that we don't have a solid theory for allocation choices.

When we have a design space that lacks a solid theoretical framework to navigate it, all success looks random, unteachable, and mystical. This is why "10x" engineers are both rare and sometimes described in horrible ways like it was done in the twitter thread above.

Too much of programming is still an art, eventually some people "get it" after lots of exercise, but we don't really know how to replicate that success.

We accept that, out of the huge talent pool of software engineers, some will somehow find a way to be productive at some things, and some will be less successful.

---

Update: I was made aware of this, which is a much more concise way of conveying the same message:


It's great that I'm not alone in this...

02 June, 2019

The Value of Pixels (presentation slides)


Presented at the bay area game tech meetup, hosted at Roblox offices.
If you want to be notified of future meetups, join here.




15 May, 2019

Seeing the whole Physically-Based picture.

Subtitle: Building our rendering on solidly shaky grounds.

Physically-Based Rendering has won. There is no question about it, after an initial period of reluctance, even artists have been converted and I don't think you can find many rendering systems nowadays, either offline or real-time, that haven't embraced PBR. And PBR proved itself to even be able to adapt to multiple art styles, outside strict adherence to photorealism.

But, really, how much physics is there in our PBR renderers? Let's have a look.

- Optics and Photometry.


Starting from the top, we have to define our physical framework. Physics are models, made to "fit" reality in order to make predictions. Different models are appropriate for different contexts and problems. For rendering, we work with a framework called "geometrical optics".

In G.O. light is composed of multiple frequencies which are assumed to be independent. Light travels in straight lines (in homogeneous media). It changes direction at changes of media (changes of IOR), where it can be absorbed, reflected or refracted. It travels instantaneously and it follows the path of least time (Fermat's principle). 

Is this a good framework? It's already making a lot of assumptions, and we know it cannot model all light behavior even when it comes to things that are easily visible: diffraction, interference, fluorescence, phosphorescence. But we say that these phenomena are not that common in everyday materials, and we might be right.

That's not all though, even before we start rendering our first triangle, we make more assumptions. First, we define a color space, usually a trichromatic one, because of the visual system metamerism. Fine, but we know that's not correct for rendering. We know spectral rendering sometimes has even dramatically different results, but we trust our artists can tune lighting, materials, and post-processing in the right way (even if the two things shouldn't be related) to generate nice images even if we restrict ourselves to RGB. Or at least, we hope.

- Scattering


Next, we have to define what happens when the light "hits" something (an IOR discontinuity). Well, who knows, light is really hard! Some electrons... resonate? Get polarized? Please let it not be something to do with quantum stuff... Anyhow, eventually they scatter some energy back... waves? particles? There is some interference at around the atomic level. Who knows. Luckily, we have another framework that comes to the rescue: microfacet theory.

Surfaces are made of microfacets, like a microscopic landscape: light rays hit, bounce around and eventually come out. If we integrate the behavior of said microfacets over a small area, we can compute a scattering probability (BRDF) from the distribution of the microfacets themselves and a lot of math and voilà, rendering happens.

Over a small area? How small, by the way? Well, Naty Hoffman and Eric Heitz say around the order of magnitude of the projected area of a pixel. I say, around the order of magnitude of a light wavelength, and then the projected area thing is antialiasing applied "after". So probably it's the pixel thing that's right.

What are these microfacets made of? Ideal reflectors obeying only the Fresnel law for how much light is reflected and how much refracted. The refracted part gets into the material (for dielectrics, that somehow allow this behavior), scatters some more and eventually comes out. If it comes out still "near enough" we call that "diffuse" reflection.
Otherwise, we call that subsurface scattering. But how does the light scatter inside the material? It hits particles. Microflakes? But microfacet based diffuse models (e.g. Oren-Nayar) simply swap the facets from ideal reflectors to ideal diffusers (Lambert)...
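For reference, this is roughly what "obeying only the Fresnel law" means for a single smooth dielectric facet. It's a minimal sketch, assuming unpolarized light and real-valued IORs (so no metals); the function names are mine, and Schlick's approximation is included only because it's what most real-time code actually evaluates.

```python
import math

def fresnel_dielectric(cos_i, n1, n2):
    """Fraction of unpolarized light reflected at a smooth n1 -> n2 interface.

    cos_i is the cosine of the angle between the incoming ray and the normal.
    """
    cos_i = min(1.0, max(0.0, cos_i))
    # Snell's law: n1 * sin_i = n2 * sin_t
    sin_t2 = (n1 / n2) ** 2 * (1.0 - cos_i * cos_i)
    if sin_t2 >= 1.0:
        return 1.0  # total internal reflection
    cos_t = math.sqrt(1.0 - sin_t2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n1 * cos_t - n2 * cos_i) / (n1 * cos_t + n2 * cos_i)
    return 0.5 * (r_s * r_s + r_p * r_p)

def fresnel_schlick(cos_i, f0):
    """Schlick's approximation, parameterized by the reflectance at normal incidence."""
    return f0 + (1.0 - f0) * (1.0 - cos_i) ** 5

# Air -> a typical dielectric with IOR 1.5: about 4% reflected head-on.
print(fresnel_dielectric(1.0, 1.0, 1.5))
# Whatever is not reflected is refracted into the material.
print(1.0 - fresnel_dielectric(1.0, 1.0, 1.5))
```

Everything the facet does not reflect goes into the material, and that refracted part is what the "diffuse" and subsurface terms are supposed to account for.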

Regardless. We know all these things! We have blog posts, Siggraph talks, and books. Physics... And this still is well in that "geometrical optics" framework. Rays of light hit things. So much so that we can create raytracers to brute-force these microscopic interactions and create our own BRDFs!

But is it still reasonable to use geometrical optics for these interactions? They seem to be quite... small. Maybe diffraction now matters? It turns out it does, and it's a well-known thing (if you read the papers from the sixties... Beckmann-Spizzichino), but we sweep it under the rug.

And well, we can't really derive the equations from the microfacets, that integral is itself hard, so the BRDFs that we use introduce, typically, a bunch of other assumptions. For example, they might disregard the fact that light can bounce around multiple times before "coming out".
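How much does ignoring those extra bounces cost? One way to get a feeling for it is a white furnace test: make every facet a perfect mirror (Fresnel = 1), light the surface uniformly from all directions, and see how much energy comes back. Below is a minimal sketch of that, assuming a single-scattering GGX lobe with the separable Smith masking-shadowing term and a brute-force midpoint quadrature; the function names and sample counts are my own choices, not anything canonical.

```python
import math

def ggx_d(n_dot_h, alpha):
    """GGX / Trowbridge-Reitz normal distribution."""
    a2 = alpha * alpha
    d = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * d * d)

def smith_g1(n_dot_x, alpha):
    """Smith masking term for GGX, for one direction."""
    a2 = alpha * alpha
    return 2.0 * n_dot_x / (n_dot_x + math.sqrt(a2 + (1.0 - a2) * n_dot_x * n_dot_x))

def directional_albedo(cos_v, alpha, n_theta=128, n_phi=256):
    """Energy reflected towards the view in a white furnace, with Fresnel = 1.

    The normal is +z and the view vector lies in the xz-plane.
    """
    sin_v = math.sqrt(max(0.0, 1.0 - cos_v * cos_v))
    vx, vz = sin_v, cos_v
    d_theta = (0.5 * math.pi) / n_theta
    d_phi = (2.0 * math.pi) / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        cos_l, sin_l = math.cos(theta), math.sin(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            lx, ly, lz = sin_l * math.cos(phi), sin_l * math.sin(phi), cos_l
            # Half vector between view and light; only its z component is needed.
            hx, hy, hz = vx + lx, ly, vz + lz
            n_dot_h = hz / math.sqrt(hx * hx + hy * hy + hz * hz)
            brdf = (ggx_d(n_dot_h, alpha)
                    * smith_g1(cos_v, alpha) * smith_g1(cos_l, alpha)
                    / (4.0 * cos_v * cos_l))
            # Integrate brdf * cos(theta) over the hemisphere (solid angle = sin dtheta dphi).
            total += brdf * cos_l * sin_l * d_theta * d_phi
    return total

for alpha in (0.3, 1.0):
    print(alpha, directional_albedo(1.0, alpha))
```

A lossless surface would return 1. For alpha = 1 at normal incidence this particular model integrates in closed form to 1 - ln 2, about 0.31, so roughly two thirds of the energy simply vanishes; smoother surfaces lose less. That missing energy is (mostly) the light that should have bounced again.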

But who cares, nice pictures can be generated with this theory, and that's what matters. Moreover, someone did try to fit the resulting equations to real-world materials, right? The MERL database? I wonder how much error there is in that. Or how well it samples real-world materials. Or how perceptual the error metric used in estimating the error is... Better to not think too much.

- Fiat Lux!


Are we done now? Far from it! In practice, we cannot just use the BRDF and brute-force light rays, not for real-time rendering, we're not Arnold. We need to compute a few more integrals!

We need to integrate over the light source, and over the surface area that is "seen" by the pixel we're considering (the pixel footprint). And that is incredibly hard, so hard we don't even try before having introduced a bunch more assumptions and approximations.

First of all, when we talk about pixel footprint, we really mean that we consider some statistics of the surface normals. We don't consider the fact that, for example, the "view rays" themselves change (and the light ones too), or that the surface normals don't really exist as an entity separate from actual surface geometry (which would cause shadowing and all other fun things). We assume these effects to be small.

Then, when we talk about light, we mostly mean simple geometric shapes that emit light from their surface. For example, a sphere. At each point, the light is emitted equally in all directions, and most often, it's also emitted with the same intensity over the surface.

And even then it's not enough to compute everything in closed-form. In fact, the more complex the light is, typically, the more approximated the BRDF will become. And then we'll fit these approximated BRDFs to the "real" one, and sum everything up. And sprinkle some of that pixel footprint thing on top somehow as well, but really that's done once on the "real" BRDF, even if we never actually use that!

So we have an approximation for very small lights, and maybe a good one for spheres, one for lines and capsules with some more handwaving, even more for polygonal lights, especially if textured and lastly one for far, "environment" light... We have approximations for "diffuse" and for "specular", for each of these. And maybe for static versus dynamic objects? A lot of math and different approximations.

We compare them and make sure that more-or-less the material looks the same under different kinds of light, and call it a day... The most ambitious of us might even export a scene to a path-tracer and compute some sort of ground-truth, to visually make sure things are at least reasonable...

- We're done, right?


So... we get our final image. In all its Physically Based, 60fps, HDR glory! Spectacular.

Year after year people come up with better equations, tighter approximations, and we make shinier pixels as a result.

Is that all? Of course not! We are just getting started! 

In practice, materials are not just one surface... They can have layers! And they are never optically uniform! They sparkle! They are anisotropic, they have scratches. Really, look around, look at most things. Most things are sparkly and anisotropic, due to the way they are fabricated.

And nothing is a surface, really. It's mostly volume and particles. Even... air! So we need fog and volumetric models. But that's not just about the light that scatters in the air back to our virtual cameras, we should also consider how this scattering affects lighting at surfaces. Our rays of light are not that straight anymore! Participating volumes make our light sources more "diffuse". Bigger. All of them, also things like environment lighting! And... that should affect shadows too right?

And now that we think about shadows... all these complexities and unknowns are still only for what we call "direct" lighting! What about global illumination? What about the million other hacks and assumptions that we rely upon to render each of our frames?

- Conclusions

So. How much physics is there in a frame, really? And more importantly, what's the point of all this? Should we be ashamed of not knowing physics that well? Should we do physics more? Less?



I don't know. I personally do not know physics well and I'm not too ashamed. A lot of what we've been doing is reasonable, really. We went with GGX because its "tail" helps. All the lighting improvements served our products. All the assumptions, individually, looked reasonable.

But there is value, I think, in looking at our math and our approximations holistically, now that we are getting so good at photorealism.

Perhaps there is not too much value for example in going off the deep end of complexity when we think of BRDFs, if we can't then integrate them with complex lighting, or, in order to do so, we have to approximate them again.

Similarly, the features we focus on should be evaluated. Is it more important to have non-uniform emission in our light sources, or a different "tail" in GGX/T-R? Anisotropic surfaces or sparkles? Spectral sampling? Thin-film? Non-lambertian diffuse? Of which kind? Accurate energy conservation or multi-bounce in microfacets?

Is it better to use the best possible approximation for a given integral, even if we end up with many different ones, or should we just use a bunch of spherical gaussians, or LTCs and such, but keep the same representation everywhere? And in general, is most of our error still in the materials, or in the lights? This is very hard to tell from just looking at artist-made pictures, because artists will compensate any way they can!

But even more importantly - How much can we keep relying on simplifying assumptions in order to make our math work?

I suspect that to answer these questions we'll need more data. Acquire from the real world. Brute force solutions. Then look at the data and understand what matters, what matters perceptually, what errors we are committing end-to-end, and what we should approximate better and how...

And we should not assume that because somewhere we have a bit of physics, we are doing things correctly. We are, after all, a field that forgot for decades basic things like color spaces and gamma.