
01 January, 2020

The grown-up programmer's (meta)manifesto


Another year is almost over, and years should bring wisdom. Most people make lists of things they want to do in the new year; here, let's write a manifesto for programmers instead.

If you've been following me for a bit here, or in an even less coherent form over on Twitter, these principles won't come as a surprise. But it's cold outside, I'm taking a few days (mostly) away from work in my hometown, and it's the perfect time to organize some thoughts on this blog.

Meta-Rule 1: The only dogma is that we shall have none.

I love the mathematical exactness of Computer Science: we sit at the foundations of mathematics (formal systems) and reasoning (logic), with fascinating implications for philosophy as well. But that theoretical foundation has nothing to do with the actual job of programming computers.

In practice, ours is a huge industry made of companies, teams, people, products and problems so heterogeneous that I doubt anyone can claim to understand it globally.

It's rare to find truly "wrong" ideas, or algorithms and technologies that have been superseded by newer systems that better them on all counts. Most of our work deals in tradeoffs. Computers are binary; programming is not.

Ideas that are appropriate for a stable team of senior programmers are often not appropriate for a team that sees a rapid turnover or that wants to grow a significant amount of juniors. 
Methodologies that work for innovative, research-driven projects are probably not the same as those that should be employed for less creative industries that, on the other hand, need to reliably evolve large, stable code-bases.
And of course, different industries have completely different cultures, requirements, expectations, constraints.

You can use goto(s).

Corollary: Understand your self-defenses.

Cosmic horror. When overwhelmed with complexity, our minds tend to raise shields. Most often these come in two varieties, depending on whether you're in your fascination phase or in your grumpy one.

Hype: In the former case, we tend to mindlessly follow others; ideas spread through hype and are embraced even when they are not appropriate in a given context. Eventually we get burned by these, most of us learn our lessons, and the technology goes from hype to well-understood tool.

Hate: This is the latter case, where on the other hand we harden around what we already know, the trusty tools and ideas we can rely upon, and see new ideas in an overly negative way.
Most of the time this actually makes us effective, because most new ideas have high failure rates (research and innovation are defined by the high possibility of failure). Always predicting that something new will fail is a good bet. But this efficiency and resilience to variance comes with the risk of being blind to the next big thing.
When the next big thing becomes obvious to all, it's usually too late to embrace it (the innovator's dilemma).

Note how both behaviors have virtuous traits. Hype helps new ideas to spread, and it's probably even necessary, as we are biased towards not wanting to move from our comfort zone; if we relied on perfect rationality for good new ideas to take their place in our collective tool belt, we would probably fail. And hate, as I wrote already, does indeed work most of the time, and makes people more effective at the job they do.

We have to understand, at a meta-level, that collectively these waves of hype and hate are there for a reason. Human behavior exists for a reason, and it is mostly there to serve us.
But if we understand this, then we can also be grounded and present to ourselves and understand when these behaviors do not serve us well. We should try to learn everything and idolize nothing. Know our history without becoming bound to it. Be curious without fully embracing something just because everyone else seems to be doing the same...

Corollary: Be skeptical of theories without context.

Only math is absolute. Any other science deals with primitives that we haven't invented (a.k.a. reality), thus, we can only hope to create models (theories) that are useful (predictive) in specific contexts (under given assumptions). Absolute statements are almost always propaganda, over-simplifications. We tend to like them because they are, well, simpler, and simplicity is attractive in an overly complex world. But unfortunately, they also tend to be wrong.

If a theory doesn't identify the conditions under which it applies, and the conditions under which it does not, then chances are it's not a good one. This applies even more broadly, to innovations, papers, algorithms, ideas in general.

Meta-Rule 2: Our minds are (usually) the scarcest resource; we should optimize for them.

"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it? — Brian Kernighan, The Elements of Programming Style

Again, it's unfortunate that we tend to take all these mantras as absolutes. We like to quote them; smart, famous people can't be wrong, right?

"Premature optimization is the root of all evil" amIrite?

Stop taking shortcuts to enlightenment. We need to actually think. To me, the famous quotes above are not really a call to never write clever code - we definitely do need to be clever, often, and often early on.
But the perils both Knuth and Kernighan (among many others) were trying to communicate are the ones that come with our tendency as passionate engineers to think too much about the machine.

And yes, I say this as someone who has few skills, but one of those few is a decent knowledge of modern hardware and how to exploit it. See, I didn't say that we should not think about the machine; we should. But the peril is to become persuaded that squeezing cycles, or otherwise making the machine happy, is the pinnacle of programming. The key part of what I wrote is "too much".

People (and products) are harder than hardware. They probably always have been, but it's especially true today, with a stable industry and the commoditization of programming, where most software has hundreds of people behind it and it's unlikely for new clever algorithms and implementations to be the key to the success of a product.

Corollary: Be as simple as possible.

Or, do not overcomplicate. Do not overengineer. As simple as possible might very well mean not simple at all, but we should fight needless complexity as our worst enemy.

Complexity is also very tricky to define, and that's why someone's overengineering might be someone else's beautifully minimal code. Say, for example, that I need to do some numerical optimization in a C++ program. I can probably write my own small optimizer in, say, a thousand lines of code, from scratch. Maybe Nelder-Mead or Differential Evolution.

Or I could link Eigen to give me basic vector math and reduce my code to a hundred lines. Or, I could link an external optimization library and pick a ready-made algorithm with a single function call. Heck, I could even do a call into a Mathematica kernel that can do some fancy auto-selection of a good algorithm to use.

What is "simpler"? If you are a C++ programmer, you'll probably say one of the first two options. What if I said I'm doing this in Python? In that case, I'm pretty sure we would choose the latter ones! Why? Is it just because the library/module support in Python is better? I doubt it.

Complexity is contextual. It's not about the lines of code we write or the lines of code we depend on. It's about a window of concepts we care about. Our brain is limited and looks smarter than it is mostly because it aggressively focuses on certain things while ignoring others. We're perpetually blind. 

I think that this translates to coding as well. Like it or not, we can work only within a limited window. C/C++ programmers tend to be system-level programmers, and these are tools used when we need to focus our window on performance and memory. In this context, oftentimes writing relatively small amounts of code from scratch, with no dependencies, is easier than depending on lots of unknown "magical" code, because our attention is on a given level of execution, and exactly what code we execute usually matters.

On the other hand, if I'm working in Python or Mathematica, I actually like to think I am blessed by not having to look at the lower levels of code execution. That is not where the window is, typically. Scripting is great for sketching and exploring ideas; the minutiae of execution don't matter in this "prototyping" mode, they would actually be distractions. We don't really want to write Python code thinking about how the interpreter is going to execute it.

Usually, every tool is good in a given context (tradeoffs); the context dictates what matters and what we can and should ignore, and when we talk about complexity we should think specifically of how conceptually simple the components we care about are.

This also yields another corollary. If we chose the right tool, a good default is to be idiomatic. It's usually wrong to try to "bend" a tool that is made for a given thing to go way out of its comfort zone. It usually ends up in not-very-well-hidden complexity (because we need to interface two different worlds) and also in a loss of familiarity (others can't tell how things work; it's not obvious from the local context).
While we shouldn't fear going outside the familiar, we shouldn't do it if it's not necessary.

Corollary: Constraints are useful.

If it's true that our brain is limited to a given window, explicitly controlling said window by imposing constraints must be a useful tool. 

This shouldn't be shocking: programming is creative work, and creativity does thrive under constraints. The paradox of choice that Barry Schwartz describes applies very much to programming as well. Writer's block.

I don't want to write much about this here, because I intend to write much more and dedicate a future post to constraints in programming. I think that, beyond taking control and explicitly designing the window at which we operate for a given program, constraints can be useful when imposed artificially as well.

Over-constraining can help us focus on what matters for a given task, especially for people (like me) who tend to be "completionists" in programming - not happy if we don't know we did the "best" possible implementation. Here, adding constraints is a useful hack as we can re-define "best" to narrow the scope of a task.

01 December, 2019

Is true hacking dead? What we lost.

I don't know how consciously, but now that I've moved to San Mateo I've found myself listening to many audiobooks about the history of computing, videogames and Silicon Valley, from the Jobs biography to the "classic" Hackers by Steven Levy, from "Console Wars" to "Bad Blood".
All of these I've been enjoying, even if some need to be taken with more of a grain of salt than others, and from most I've gained one or two interesting perspectives.

Hackers, in particular, struck some chords that are dear to me. Besides the history and the various personalities, some of whom I didn't know of, one thing resonated: the hands-on, pragmatic, a-political nature of early hacking.

And no, before we keep going, I don't mean that we should not be political in our actions today. We are social animals and we should care about society and politics; in fact, it would seem to me that the only reason, at least if one is to take the book at its word, why early hacking was a-political is that hackers were fairly despicable a-social people.

But, it is interesting, because one could make the case that nowadays we live in a world where ideologies trump pragmatic realities, and perhaps we should understand why and take a step back.

What did hackers want? Access to computing. Computers were fascinating, mesmerizing and scarce. It wasn't a matter of software licenses; nobody cared about pieces of paper (or even locked doors), we wanted to be able to touch and tinker with the machine.

And everything was made to be tinker-friendly in a golden age of computer hackerism, where kids like me could put sprites on a home television set by reading the C64 manual and playing with BASIC.
Nobody cared that the machine was not open source, or that the BASIC interpreter was licensed from Microsoft.

It was truly a huge movement, if we think about it for a second; even its tools were all about immediacy, graphics as a means of direct feedback, live-coding.
We had one-megahertz CPUs (in my day) working with size-optimized (not speed-optimized!) interpreters.

Even at the ideological level, the goal was for everyone to have access, with systems like Lisp and even more clearly Smalltalk which were designed explicitly with the idea that the user was a tinkerer, always able to stop the world, inspect the inner workings, make some changes, and keep going.
We almost didn't have graphics, but it was in a way the golden age of graphics because people were mesmerized by the possibilities, especially excited about having immediate feedback loops, direct manipulation, fast iteration.


Sketchpad (Sutherland), which is mentioned in Alan Kay's
"The Early History of Smalltalk"

We lost all of this, basically all. We live in a time where it's impossible not to interface with a computer; computing is cheap and immensely powerful, yet it's nearly impossible to understand and contribute to it.

It is particularly interesting how we used to have the holy grail of live-coding on computers that shouldn't have been able to afford it, while today even the newest, fanciest languages focus primarily on being able to gobble up millions of lines of code in various modules while making iteration increasingly inefficient.

Not having direct access, the ability to stop the machine, list the code, modify and resume, was almost unthinkable. Not having an easily accessible programming language on your machine was unthinkable.
Today, what was once a given sounds in most contexts like science fiction. QBasic is in many ways still an environment that can teach people many lessons...

And again, what I find especially remarkable is that we had so much abstraction and immediacy on machines that shouldn't have been able to afford it. The '80s were a sort of golden era for interpreters and VMs.

We went the IBM way, and we probably didn't realize it. All that we do today is built for structured teams of thousands of engineers. We prioritize big-batch development over individual productivity.
That's probably why we still have textual source (great for git and merging) over more expressive formats, or even the old idea of serializing the entire state of a VM (again, Lisp, Smalltalk), which sacrifices merging entirely to make hotpatching (dynamic software updates) trivial.


The sad and inspiring story of TempleOS,
a.k.a. what the Raspberry Pi should have been.

Now, to a degree this is entirely reasonable: when something becomes commoditized it's just another thing to be used, it loses its appeal.
We buy cars and go to mechanics, right? We don't know how to peek inside the engines anymore...

But what is striking to me is how that ideology is completely lost as well, replaced with one that prioritizes theoretical freedoms over actual ones. 

We replaced the Commodore 64, which was entirely closed and proprietary yet hackable, with a Linux-based monstrosity like the Raspberry Pi, which is mostly open source from what I understand (on the software side of things), yet might as well just be booting Windows and the vast majority of its uses would remain identical.
It's a cheap and fun toy for programmers, sure, but it mostly (entirely?) fails at making computation more accessible, which was its original goal.

In general, it feels like hacking today is dogmatic instead of pragmatic. Surely if everything were open-source... or distributed... or blockchain-based, immutable and lock-free with a pinch of functional programming... written this or this other way, then we would have a better, enlightened society.

And it's not a joke, it's not entirely a fringe phenomenon: there are vast arrays of engineers who are honestly invested in trying to change the world, but genuinely think that the solutions are to be found in the technical infrastructure of things. (by the way - wanna see something weird?)

Perhaps we didn't truly graduate from our a-social tendencies, perhaps we're true to form in thinking that the machine and technology are more interesting than people, and groups, and culture...

Whatever the causes, we have software and hardware systems that strive to be entirely open, yet time and again it is the closed ones that are more accessible in practice that really drive social revolutions.
Linux didn't change the desktop, nor the way software is made. 

Look at my industry. Videogames. What made games tinkerable? What liberated individual creativity, art, even the ability to make a living?

Steam, the Apple App Store, Microsoft XBLIG, YouTube, Twitch, Spotify, Patreon... Unity, Pico-8, Dreams, passing through Minecraft and Roblox and the game modding community... Not the blockchain, not Linux or torrents and so on.

Even the Demoscene, one of the last bastions of true hackerism, is completely uninterested in the ideology of software licenses and contracts.


Joseph White - Pico-8

And ironically, probably by utter coincidence, but ironically indeed, all the new power brokers of this era, the Facebooks and Amazons, the Googles and Twitters and so on, fully embrace open-source stacks, hundreds of millions of lines of code powering the AIs and the networks of today.
The new IBMs know very well that lines of code are for the most part worthless, but people and communities aren't, so it's a no-brainer to open-source more if in exchange one gets more people involved in a project, and more engineers hired...

In the end, probably licenses don't mean much. And perhaps technology doesn't either. How we design our human-computer (and human-to-human) interfaces does. And if we don't start thinking about people, and keep thinking that some lines of code or a contract can change the world, we'll be stuck not understanding why we keep failing.

See also: this inspiring keynote by Andy van Dam, "Reflections on Unfinished Revolutions in Personal Computing", and the work of Bret Victor.

29 August, 2019

Engineering Career Guide [LEAKED]

Q: How do I progress in my career as a (rendering) engineer?
A: Start from here. <== DOWNLOAD LINK

I hope this helps. It's not comprehensive, and I did remove some bits that are specific to us (they shouldn't matter anyways); it's meant to be a starting point for discussions.

I could have written a separate blog post but in the end it would have been a rehash of the same ideas so I just decided to spill some beans...


Not the download link! Just a Roblox monster.

24 August, 2019

Misunderstanding Multilayering (Diffuse-Specular Energy Conservation)

- Introduction to the problem

In our last episode, "misunderstanding multiscattering", we saw how to create a multiscattering BRDF mostly by intuition. We used the concept of directional albedo (a.k.a. directional-hemispherical reflectance) to normalize a specular BRDF and from there we derived a very simple closed-form approximation for GGX energy conservation.

We also showed that the directional albedo is the response of our BRDF in a white-light furnace at different normal-to-view angles, and we typically have that information available in tabular form, as it's a key component of the "split sum" approximation to the integrals needed for image-based lighting on contemporary BRDFs. Great!

This time, we'll show how the very same data, ideas, and intuitions can be used to build multilayered BRDFs, and in particular to couple matte-specular models (diffuse lobes to specular lobes).
Like the previous article, this is not anything particularly novel, this isn't Siggraph-worthy material, just some notes on the subject.

Let's start, as always, with the problem, with a very quick recap of things everyone reading this will probably already know (my flight has been delayed a couple of hours, so yes, we have time for recaps).

Under the framework of geometrical optics we look for light interactions with materials at surfaces where the index of refraction changes. We assume that light travels in a medium of uniform IOR (e.g. air), and then eventually hits a surface with a different IOR and either gets reflected out of the surface, or refracted into it.
In fact, in real-time rendering we only consider the air->material interface; we don't handle nested objects (e.g. air->glass->water->glass->air) even when we render transparent or multilayered materials. Which is wrong, but it's just one of the many ways we are still totally wrong (and yet might or might not be right by not caring). Even for offline path-tracers this is not entirely trivial, depending on how you do light transport.

We then consider, for a small but not infinitesimal patch of our surface, the statistics of how probable these reflections and refractions are as a function of the incident light angle, a given outgoing angle we want to measure, and the surface patch normal; these statistics create a BRDF "lobe".

BRDFs, though, don't typically have a single lobe: we usually have "metals" and "dielectrics", and in the latter case we have a specular and a diffuse lobe. How come? Well, because we don't really consider a single interaction, that was sort of a lie. We do create a lobe with the light that is scattered from the surface interaction, but we also have to consider the light that gets refracted into the surface. In the case of metals, the energy that goes into the material somehow disappears (becomes heat or maybe fairy dust, who knows - physics) and we effectively have a single lobe.
For dielectrics, though, the refracted light keeps hitting molecules inside what's essentially a participating medium, and eventually, taking random paths, it comes back out from the surface; we model that as a diffuse lobe.

Diffuse lobes are reasonable when the "bouncing around" is such that it effectively randomizes the direction at which the light comes out, but still happens in such a small space that we consider the light coming out effectively at the same point it came in. 
If that's not true, and the scattering distances are bigger, we have what we call "subsurface scattering" effects. If the direction is not randomized "enough", then instead we have transparent materials, and somewhere in the middle, we have what we usually address as participating media.

Any time we consider multi-layered materials we have to model the way the light that is refracted by a layer reaches the next. At the very least, we have to know how much energy is "handled" by a given layer/lobe and how much is "passed down" to the subsequent lobes; if we gave the same input light to all the layers, we would sum the interactions up and potentially end up with more energy coming out of a material than came in!
In theory we should understand much more than that, namely how the light travels in a given layer and reaches another one (its statistical distribution over directions and space), but for diffuse lobes found in dielectrics we can imagine that it doesn't matter much (anyways the light is going to be randomized...), so right now we just want to know how much energy survives the topmost specular lobe of a dielectric (gets refracted as opposed to reflected).

This might seem a lot of handwaving. And it is! This is the point of this post -and- the previous one!

We know that we have certain problems in our math, but should we care? How much should we care and why? Does it matter for the goal of generating photorealistic images easily? Remember, this is our goal, not fixing physics. Then again if we wanted to look at the physics there are a ton of assumptions that we are making anyways.

Left to right: Specular only, Specular+Diffuse, Specular+Diffuse*(1-Fresnel)
It turns out that this problem matters a lot, much more in fact than the energy loss in non-multiscattering specular lobes. The multiscattering issue was small, happened only at high roughnesses, and was entirely fixable by artists tweaking the specular albedo (typically controlled by the Fresnel f0 parameter) a bit.

Tweaking albedo is fine because it still keeps materials decoupled from lighting, the main thing we wanted to achieve by embracing physically-based rendering models, and it's doable because the tweaks needed are small and entirely in the range of realistic albedos (which never go to f0 = 1 for metals, and of course even less so for dielectrics).

Not having any energy conservation between specular and diffuse instead creates overly bright materials that can look fine in a certain scene but will glow unnaturally under different conditions, and it's quite hard for artists to make sure that the lobes are tuned so that they don't produce more energy than they should.
In fact, most real-time rendering today works without multiscattering BRDFs, but (hopefully) it always considers some way of balancing specular and diffuse.

- Solutions and pretty images

Now we know the problem -and- we know it matters, so we are justified in spending some time to investigate further. How? Well, as we did with the previous post, we bring back our friend the white furnace test. Pretty much any time we want to check for energy conservation, we take our materials and put them in a furnace!


As for the previous time though, just putting a material in a furnace doesn't tell us that much. Yes, it's already apparent how the middle image looks unnaturally bright, but that's not a great test. What we want is to set up our scene so that we expect the material to reflect back exactly all the light that comes in, no matter which path the light takes in the material and across the microfacets, and then see if we get more light than we expect, or less.
We have to think: what is it that absorbs light in our materials? The specular reflects light out or refracts it in, it doesn't absorb, so all the energy loss happens when we go "inside" the material, which for dielectrics means the diffuse layer. It stands to reason then that if we set our diffuse albedo to one, regardless of the specular parameters, we should be perfectly white in a white furnace. Let's see:

Comparing no energy conservation to (1-Fresnel) with a fully white diffuse albedo.
Bingo! We see now that just adding diffuse with no attempt at energy conservation does indeed result in energy being created. And surprisingly, it looks like the simple idea of using (1-F) as a multiplication factor to normalize diffuse is indeed doing a very good job; in fact, it would look alright if it weren't for the grazing angles... Why is that?

(1-Fresnel) at varying roughness (from 0.1 to 0.9)
Same as above, but using a non-multiscattering Specular instead of the simple one presented in the last post.
Well, if we think about it, it's fairly obvious. For the terms we commonly use, at the incident angle (N=V=L) the shadowing term has no effect and the NDF takes a constant value regardless of the roughness parameter, so Fresnel dominates. At grazing angles the shadowing term starts to matter, and that's where we see our simple normalization breaking down.

What can we try to fix this? If you read the previous post, it should be obvious by now. We have something that should be white in a furnace, and it isn't, so let's make it white! We know how much energy the specular lobe will scatter back in the furnace, for any roughness, Fresnel f0, and viewing angle; this, again, is the split-sum table we use for environment map lighting. We know that a Lambert diffuse is correctly normalized, so it will scatter back all incoming energy if the albedo is set to one. So how much do we have to scale the lobe for the sum of the specular plus diffuse to be one, with an arbitrary specular and a unit-albedo diffuse? Obviously just by 1-E, where E is what we get from the split-sum table!
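In code, a minimal sketch of this coupling might look like the following (GLSL-style; the split-sum LUT layout, its name, and the variable names are my assumptions, not anything prescribed by a specific engine):

// splitSumLUT: sampler2D holding the precomputed split-sum (environment BRDF) table, assumed bound elsewhere.
// As commonly stored: x = scale applied to f0, y = bias.
vec2 ab = textureLod(splitSumLUT, vec2(ndotv, roughness), 0.0).xy;
// E = directional albedo of the specular lobe for this f0, roughness and view angle.
vec3 E = f0 * ab.x + ab.y;
// Lambert diffuse, scaled by (1-E) so specular + diffuse stays at or below one in the furnace.
vec3 diffuse = diffuseAlbedo * (vec3(1.0) - E) / 3.14159265;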

(1-E) energy conservation test, notice the more correct grazing angles.
Note: some artifacts remain due to approximations in the Directional Albedo table I used.
Furnace fixed! And note, this works regardless of whether we're using the multiscattering specular BRDF or not, as a non-multiscattering one will just behave as if it's refracting more light to the diffuse layer, which in our case will still bounce it all back out. And, if we're using the simple renormalization for multiscattering presented in the previous post, we don't even need to compute a new table for directional albedo, as it's easy to derive how to analytically modify the output of the single-scattering table to accommodate the multiscattering correction (I'll leave this as an exercise for the reader, it's trivial and might motivate you to actually look at the definition of directional albedo...)

As it was for the multiscattering normalization idea, this is wrong, and we can see immediately that it's wrong from the equation, even if the error won't show in the furnace, because we don't respect reciprocity (we consider only ndotv and not ndotl). It's even intuitive: we don't seem to consider at all that the light going from the specular layer to the diffuse, scattering inside the material and eventually coming out, has to cross the specular interface again on its way out!

Before we delve further into this, though, I want to stop for a second and check what we are doing (again, same as last time). Let's plot the 1-E multiplicative factor we're applying to diffuse and see what happens:
(1-E) normalization compared to (1-Fresnel) at varying roughness,
for a non-multiscattering specular.

Same as above, but using simple multiscattering specular. 
This is interesting! While we can couple diffuse with and without a multiscattering specular, the plots for the multiscattering case are much simpler, so much so that it's easy to derive, by hand, a function that approximates them. Fun!

Approximation: mix( (vec3(1,1,1) - fresnel(f0,ndotv) ), vec3(1) - f0, roughness )
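Expanded into a small helper (just the line above restated, with a Schlick Fresnel assumed; the function and variable names are mine):

// Approximate diffuse energy-conservation factor to pair with the simple multiscattering specular.
vec3 diffuseEnergyFactor(vec3 f0, float ndotv, float roughness)
{
    vec3 fresnel = f0 + (1.0 - f0) * pow(1.0 - ndotv, 5.0); // Schlick approximation
    return mix(vec3(1.0) - fresnel, vec3(1.0) - f0, roughness);
}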
And now, at last, let's have a look at a more correct solution that respects reciprocity, and see whether that matters or not for real-time rendering. This is quite hard to know intuitively, so we'll just have to implement something and check.

Luckily this is nothing new: in 2001 Kelemen and Szirmay-Kalos published "A Microfacet Based Coupled Specular-Matte BRDF Model with Importance Sampling", and we can pretty much copy and paste their equation, which unsurprisingly ends up looking very close to the way we adjust for reciprocity in specular multiscattering (in fact, Kulla and Conty cite the aforementioned KSK paper in theirs).
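For reference, here is a sketch of how this kind of reciprocal coupling is commonly written (in the spirit of KSK as revisited by Kulla and Conty, not a verbatim transcription of either paper; E is the directional albedo of the specular lobe at the given roughness and f0, E_avg its cosine-weighted hemispherical average, and all the names are mine):

// Reciprocal diffuse coupling: symmetric in ndotv and ndotl, unlike the simple (1 - E(ndotv)) scaling.
vec3 coupledDiffuse(vec3 diffuseAlbedo, float E_ndotv, float E_ndotl, float E_avg)
{
    return diffuseAlbedo * (1.0 - E_ndotv) * (1.0 - E_ndotl) / (3.14159265 * (1.0 - E_avg));
}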

Without further ado:

Left to right: Simple EC, Fresnel EC, KSK EC.
Simple Multiscattering GGX, Roughness 0.1.
Same as above but for 0.3 roughness GGX.

And some more images. The difference between (1-F) and (1-E) can be seen only at low roughness. The difference is even less pronounced between the table-based (1-E), the approximation shown above, and the full KSK method.

1-Fresnel; Roughness 0.1 to 0.9, GGX with simple multiscattering.
Approximated 1-E.
Table based 1-E.
KSK.

12 August, 2019

Misunderstanding Multiscattering

Today we will build a state-of-the-art multiscattering GGX BRDF for physically-based rendering. 

Already done, you say? Yes, you're right, by people who understand maths and physics; that's cheating. We instead will try to do the same while learning as little as possible...

If you want to do this the right way instead, these are some excellent articles:
Ok, ready? 

Let's start from your vanilla contemporary GGX with all the correlated-Smith fixings. The following sequence of images is generated with Disney's BRDF Explorer environment lighting mode. They show a metallic GGX with varying roughness, from 0.3 to 0.9, using the common parametrization alpha = roughness squared.
Note: please always indicate the parametrization you are using in your articles and presentations. There are so many variants out there. Here I use one of the most common ones, even if, for what it's worth, probably the best idea is alpha = roughness^4, which is more perceptually linear and also manages to be a great approximation of the one-over-square-root Blinn-Phong gloss to GGX mapping many engines prefer.
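For reference, this is roughly the single-scattering specular I mean by "vanilla GGX with the correlated-Smith fixings" (a GLSL-style sketch with the alpha = roughness*roughness mapping; function and variable names are mine):

// GGX / Trowbridge-Reitz normal distribution function.
float D_GGX(float ndoth, float alpha)
{
    float a2 = alpha * alpha;
    float d = ndoth * ndoth * (a2 - 1.0) + 1.0;
    return a2 / (3.14159265 * d * d);
}

// Height-correlated Smith visibility term, with the 1/(4 ndotl ndotv) already folded in.
float V_SmithGGXCorrelated(float ndotv, float ndotl, float alpha)
{
    float a2 = alpha * alpha;
    float smithV = ndotl * sqrt(ndotv * ndotv * (1.0 - a2) + a2);
    float smithL = ndotv * sqrt(ndotl * ndotl * (1.0 - a2) + a2);
    return 0.5 / (smithV + smithL);
}

// Schlick Fresnel.
vec3 F_Schlick(vec3 f0, float vdoth)
{
    return f0 + (1.0 - f0) * pow(1.0 - vdoth, 5.0);
}

// Single-scattering specular: D * V * F.
vec3 specularGGX(vec3 f0, float ndoth, float ndotv, float ndotl, float vdoth, float roughness)
{
    float alpha = roughness * roughness;
    return D_GGX(ndoth, alpha) * V_SmithGGXCorrelated(ndotv, ndotl, alpha) * F_Schlick(f0, vdoth);
}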


We can notice how at high roughness the material looks noticeably darker. This is annoying, and smart people have worked hard to find and fix this issue. 

Let's compare the image above with Kulla/Conty's Multiscattering GGX BRDF.


Fixed! 

Now, we could try to understand how this solution works, the math is already there. And you probably should, it's both interesting and useful for understanding other related problems. But, let's try not to...

First of all, we should ask: why is this a problem? How do we know it is a problem? And if it is a problem, how big of a problem is it?

Let's start at the top. Why do we like "PBR"? It's not because we like physics, at least, I don't... And if we did like physics, this would be a small error to focus on, considering that we are surely committing worse sins in our end-to-end image pipeline...

Computer graphics is not predictive rendering. We don't try to simulate physics to make accurate simulations, that's not a goal (and the people who care about physics are an order of magnitude ahead of us). We make pretty pictures.

At a certain point, we noticed that using some physics to make pretty pictures allowed for easier workflows, decoupling materials from lighting, reducing the number of hacks and parameters, allowing the use of libraries of real materials, and so on.
We thought artists would be happier and be able to work more efficiently, while still being able to create a lot of different scenes. It took a while to move over all our workflows, but it was the right call.

So, when we look for right and wrong, we have to start with art and artists. If there is a need we can identify there, then we can look at physics to see if the answers are there. In this case, we can say that yes, indeed we have a (small) problem. 
It would be nice if our material parameters were orthogonal, and having things go darker as we change roughness is not ideal.

Let's move to step two then. Can physics help? Is this darkening physically correct, or not? How do we test? Enter the furnace! Let's put our "GGX coated" metallic object in a uniformly lit environment and see what happens:


Light hits our surface, hits our microfacets. The microfacets reflect back some light, and refract some, according to Fresnel. If we're assuming a metal, we're told that the refracted light, the part that goes "inside" the surface, is quickly absorbed and never comes out (it becomes heat).

It's reasonable that even in a furnace our metallic object has a color, as some of the energy will be absorbed. But what if we take our microfacets and make them always reflect all the light, no absorption? 
This means we need to set our f0 to 1 (remember, Fresnel is what controls how much light the microfacets scatter); let's try that and see what happens:


The object is still not white. Something's wrong! Ok, the inquisitive reader might still say: how do you know it's wrong? Perhaps certain directions scatter more light and others less, so we still see some shading even if all the light eventually comes out...
If we think of BRDFs and lobes it's not easy to get the correct intuition. Let's instead think of what a light path, starting from the camera, looks like. It might hit some of the microfacets, one or many times, bounce around, then eventually escape and connect to the furnace environment, which is always emitting a given constant energy.
How much of that energy reaches the camera? All of it! Because regardless of which and how many microfacets we hit, all the energy is reflected and none is absorbed.

Now we have an intuition of why the image above should be white, it isn't, so we have a problem in our math...

If you have studied BRDFs you know that in microfacet models we have a masking-shadowing visibility function that models which microfacets are occluded by others. What we don't typically model, though, is the fact that these occluders are themselves microfacets, so the light should bounce around and eventually get out, not just be discarded.
This is what the multiscattering models fix, and indeed if we put in a furnace Kulla and Conty's GGX that we have shown before, it would generate a boring and correct white image for a fully reflective material, regardless of its roughness.

Kulla's model is not trivial though, and in many cases is likely not worth using to fix such a relatively minor problem. So. Can we be more ignorant? What if we knew how much light our BRDF gives out in a furnace, for a given roughness and viewing angle (fixing, again, f0 = 1)? Could we just take that value and normalize the BRDF with it?

Spoiler alert: we can. And not only can we, but it's also trivial to do, because we already have this "furnace" value in most modern engines, in the look-up tables used for the popular split-sum image-based lighting approximation.


The split-sum table boils down the "BRDF in a furnace" (also known as directional albedo or directional-hemispherical reflectance) to a scale and bias (add) factor to be applied to the Fresnel f0 value.
In our case, we want to normalize considering f0 = 1, so all we do is scale our BRDF lobe by one over bias(roughness,ndotv) + scale(roughness,ndotv), as in the sketch below.
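In code, a minimal sketch (assuming the usual split-sum LUT with the f0 scale in x and the bias in y; the texture name and lookup convention are my assumptions):

// splitSumLUT: sampler2D with the precomputed split-sum (environment BRDF) table, assumed bound elsewhere.
vec2 ab = textureLod(splitSumLUT, vec2(ndotv, roughness), 0.0).xy;
// Directional albedo of the single-scattering lobe at f0 = 1 is simply scale + bias.
float E = ab.x + ab.y;
// Brightens rough materials back up so the furnace test comes out white at f0 = 1.
vec3 specCompensated = specularSingleScatter * (1.0 / E);

This is the result: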


Indeed, we're getting some energy back at high roughness, and if we tested this in a furnace it would come up white, correctly, at f0 = 1. But it's also different from Kulla's; in particular, the color is not quite as saturated in the rough materials. How come? Well again, if you have studied this problem already (cheater!) you know the answer.

Proper multiscattering adds saturation because as light hits more and more microfacets before escaping the surface, we pick up more color (raising a color to a power results in a more saturated color). So how can this be physically wrong, yet still correct in our test?

Well, it's simple, really. By not simulating this extra saturation we are still energy conserving, but we changed the meaning of our BRDF parameters. The f0 "meaning" in our "ignorant" multiscattering BRDF is not the same as in Kulla's, it results in a different albedo, but the BRDF itself is still energy conserving; it's just a different parametrization.
Most importantly, I'd argue it's a better parametrization! Then again, remember our objectives. We don't do physics for physics' sake, we do it to help our production.

We wanted to make our parameters more orthogonal so that artists don't need to artificially "brighten" our BRDF at high roughness. If we went with the "more correct" solution (about that, this recent post by Narkowicz is a great read) we would add a different dependency: now roughness, instead of darkening our materials, makes them more saturated, which would kind of defeat the purpose.
You can imagine some scenarios where this might be desirable, but I'd say it's almost always wrong for our use-cases.

If we wanted to simulate the added saturation, there are a few easy ways. Again, going for the most "ignorant" (simple) one, we can just scale the BRDF by 1 + f0*(1/(bias(roughness,ndotv) + scale(roughness,ndotv)) - 1), as sketched below.
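As a sketch, building on the previous snippet (again, the names are mine):

// 1 + f0 * (1/E - 1): no change when E = 1, full compensation when f0 = 1,
// and progressively less added energy (hence more apparent saturation) for darker f0.
vec3 msScale = vec3(1.0) + f0 * (1.0 / E - 1.0);
vec3 specCompensated = specularSingleScatter * msScale;

This results in the following: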


Now we are close to Kulla's solution. This is still not entirely correct; if you wonder why Kulla's approximation is better even if it might not show in images, it's because ours doesn't respect reciprocity.
This might be important for Sony Imageworks, as certain offline path-tracing light transport algorithms do require it, but it's fairly irrelevant for us.

Ok, so now that we have found an approximation we're happy with, we can (and should) go a step further and see what we are really doing. Yes, we have a formula, but it depends on some look-up tables.
This isn't a big issue, as we need these tables around for image-based lighting anyways, but it would be an extra texture fetch, and in general it's always important to double-check our math, so let's visualize the 1/(bias(roughness,ndotv) + scale(roughness,ndotv)) function we're using:


It looks remarkably simple! Indeed, it's so simple it has a trivial approximation; you don't even need any sophisticated tool to find it, so I'll just show it: 1 + 2*alpha*alpha*ndotv. This fits very well:

Approximation compared to the correct normalization factor (gray surface)

You can see that there is a bit of error in the furnace test. We could improve by doing a proper polynomial fit - yes, it turns out "2" and "1" in the formula above are not exactly the best constants.
Actually, defining "best" would be a problem all on its own, because doing a simple mean-square minimization on the normalization function doesn't really make a lot of sense (we should care about final visuals, perceptual measures, which angles matter more and so on), and I think we have already spent too much time on such a small fix.
Moreover, the actual rendered images are really hard to distinguish from the table-based solution.

What is interesting is to see what we could achieve if we wanted to be even simpler, dropping the dependency on ndotv. It turns out that in this case 1 + alpha*alpha does the trick decently as well.
This means that we're just applying a multiplicative factor to our BRDF to brighten it at high roughness, which just makes a lot of sense.
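Both fits, written out as tiny helpers (the constants are the ones quoted above, not re-fitted; the names are mine):

// Multiplicative energy compensation for a single-scattering GGX lobe (alpha = roughness^2).
float msCompensation(float alpha, float ndotv)
{
    return 1.0 + 2.0 * alpha * alpha * ndotv; // approximates 1 / (scale + bias) from the split-sum table
}

float msCompensationNoV(float alpha)
{
    return 1.0 + alpha * alpha; // even simpler: drops the ndotv dependency entirely
}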

This of course adds more error, and it starts to show as the BRDF shape changes: we get more energy at grazing angles on rough surfaces. But it might just be good enough depending on your needs:

Under-exposed to highlight that with the simpler approximation sometimes we lose light, sometimes we add.