Search this blog

06 February, 2016

Low-resolution effects with depth-aware upsampling

I have to confess, till recently I was never fond of doing half or quarter res effects via a bilateral upsampling step. It's a very popular technique, but all the times I tried it I found it causing serious edge artifacts... 
On Fight Night Champion I ended up shipping AO and deferred shadows without any depth aware upsampling (just separating the ring and fighters from the background, and using a bias towards over-shadowing); Space Marines ended up shipping with a bilateral upsampling on AO (but no bilateral blurring or noise) but it still had artifacts. In the end it sort-of worked, via some hacks that were good enough to ship, but that I never really understood.

For Call of Duty Black Ops 3 we needed to compute some effects (volumetric lighting) at quarter-res or less, to respect the performance budgets we had, so depth-aware upsampling was definitely a necessity, so I needed to investigate a bit more into it.
A quite extreme example of "god rays" in COD:BO3
I found a solution that is very simple, that I understand quite well, and that works well in practice. I'm sure it's something many other games are doing and many other people discovered (due to its simplicity), but I'm not aware of it being presented publicly, so here it is, my notes on how not to suck at bilateral upsampling:

1) Bilateral weighting doesn't make a lot of sense for upsampling.

The most commonly used bilateral upsampling scheme works by using the same four texels that would be involved in bilinear filtering, but changing their weights by multiplying them by a function of the depth difference between the true surface (high res z-buffer) and their depths (low-res z-buffer).

This method makes little sense, really, because you can have the extreme case where the bilinear weights select only one sample, but that sample is not similar to the surface depth you need at all! Samples that are not detected to be part of the full-res surface should simply be ignored, regardless of how "strongly" biliear wants to access them...

A better option is to simply -choose- between bilinear filtering or nearest depth point sampling, based on if the low-res samples are part of the high-res surface or not. This can be done in a variety of ways, for example:

- lerp(bilinear_weights, depth_weights, f(depth_discontinuity)) * four_samples
- lerp(biliear_sample, best_depth_sample, f(depth_discontinuity))
- bilinear_fetch(lerp(bilinear_texcoords, best_depth_texcoords, f(depth_discontinuity)))

Where the weighting function f() is quite "sharp" or even just a step function. The latter scheme is similar to nVidia's "nearest depth sampling", it's the fastest alternative but in Black Ops 3 I ended up sharply going from bilateral to "depth only" weights if a too big discontinuity is detected in the four bilinear texels.

2) Choose the low-res samples to maximise the chances of finding a representative.

It's widely known that a depth buffer can't be downsampled averaging values, that would result in depths that do not exist in the original buffer, and that are not representative of any surface, but "floating" in between surfaces at edge discontinuities. So either min or max filtering is used, commonly preferring nearest-to-camera samples, with the reasoning that closest surfaces are more important, and thus should be sampled more (McGuire tested various strategies in the context of SSAO, see Table 1 here).

But if we think in terms of the reconstruction filter and its failure cases, it's clear that preferring a single set of depths doesn't make a lot of sense. We want to maximize the chance of finding, among the texels we consider for upsamping, some that represent well the surfaces in the full resolution scene. Effectively in the downsampling step we're selecting on points we want to compute the low-res effect, clearly we want to do that so we distribute samples evenly across surfaces.

A good way of doing this is to chose per each sample in the downsampled z-buffer, a surface that is different from the ones of its neighbors. There are many ways this could be done, but the simplest is to just alternate min and max downsampling in a checkerboard patter, making sure that for each 2x2 quad, if we are in a region that has multiple surfaces, at least two of them will be represented in the low-res buffer. 

In theory it's possible to push even more surfaces in a quad, for example we could record the second smallest or second biggest, or the median or any other scheme (even a quasi-random choice) to select a depth (we shouldn't use averages though, as these will generate samples that belong to no surface), but in practice this didn't seem to work great with my upsampling, I guess because it reduces spatial resolution in favour of depth resolution, but your mileage may vary depending on the effect, the upsampling filter and the downsampling ratio.

Some residual issues can be seen sometimes (upper right),
when there is no good point sample in the 2x2 neighborhood.

Further notes.

The nearest-depth upsampling with a min/max checkerboard pattern downsampling worked well enough for Black Ops 3 that no further research was done, but there are still things that could be clearly improved:

- Clustering for depth selection.
A compute shader could do actual depth clustering to try to understand how many surfaces there are in an area, and chose what depths to store and the tradeoffs between depth resolution and screenspace resolution.

- Gradients.
Depth discontinuity in the upsampling step is a very simplistic metric, more information can be used to understand if samples belong to the same surface, like normals, g-buffer attributes and so on.

- Wider filters.
Using a 2x2 quad of samples for the upsampling filter is convenient as it allows to naturally fall back to bilinear if we think the samples are representative of the high-res surface, but there is no reason to limit the search to such neighborhood, wider filters could be used, both for higher-order filtering and to have better chances of finding representative samples.

- Better filtering of the representative depth samples.
There is no reason to revert to point-sampling in presence of discontinuities (or purely depth-weighted sampling), it's still possible to reject samples that are not representative of the surface while weighting the useful ones with a filter that depends on the subtexel  position.
Special cases could be considered for horizontal and vertical edges, where we could do 1d linear interpolation on the axis of the surface. Bart Wronski has something along these lines here (and the idea of baking an UV offset to be reused by different effects also allows in general to use more complex logic, and amortize it among effects).

- "Separable" bilateral filters.
Often when depth-aware upsampling is employed we also use depth-aware (bilateral) filters, typically blurs. These are often done in separate horizontal/vertical passes, even if technically such filters are not separable at all. 
This is particularly a problem with depth-aware filters because the second pass will use values that are not anymore relative to the depths in the low-res depth buffer, but result from a combination of samples from the first pass, done at different depths.

The filter can still look right if we can always correctly reject samples not belonging to the surface at center texel of a filter, because anyway the filtered value is from the surface of the center texel, so doing the second pass using a rejection logic that uses attributes (depth...) at the center of the filtered value sort-of works (it's still a depth of the right surface). 
In practice though that's not always the case, especially if the rejection is done with depth distances only, and it causes visible bleeds in the direction of the second filter pass. A better alternative in these cases (if the surface sample rejection can't be fixed...) is to do separate passes not in an horizontal/vertical fashion but in a staggered grid (e.g. first considering a NxN box filter pass then doing a second pass by sampling every N pixels in horizontal and vertical directions).

24 January, 2016

Color grading and excuses

I started jotting down some notes for this post a month ago maybe, after watching bridge of spies on a plane to New York. An ok movie if you ask me, with very noticeable, heavy-handed color chocies and for some reasons a heavy barrel distortion in certain scenes. 

Heavy barrel distortion, from the Bridge of Spies trailer. Anamorphic lenses?
I'm quite curious to understand the reasoning behind said distortion, what it's meant to convey, but this is not going to be a post criticizing the overuse of grading, I think that's already something many people are beginning to notice and hopefully avoid. Also I'm not even entirely sure it's really a "problem", it might be even just fatigue

For decades we didn't have the technology to reproduce colors accurately, so realistic color depiction was the goal to achieve. With digital technology perfect colors are "easy", so we started experimenting with ways to do more, to tweak them and push them to express certain atmospheres/emotions/intentions, but nowadays we get certain schemes that are repeated over and over so mechanically it becomes stale (at least in my opinion). We'll need something different, break the rules, find another evolutionary step to keep pushing the envelope.

Next-NEXT gen? Kinemacolor
What's more interesting to me is of course the perspective of videogame rendering. 

We've been shaping our grading pretty much after the movie pipelines, we like the word "filmic", we strive to reproduce the characteristics and defects of real cameras, lenses, film stocks and so on. 
A surprising large number of games, of the most different genres, all run practically identical post-effect pipelines (at least in the broad sense, good implementations are still rare). You'll have your bloom, a "filmic" tone mapping, your color-cube grading, depth of field and motion blur, and maybe vignette and color aberration. Oh, and lens flares, of course... THE question is: why? Why we do what we do? 

Dying light shows one of the heavier-handed CA in games
One argument that I hear sometimes is that we adopt these devices because they are good, they have so much history and research behind them that we can't ignore. I'm not... too happy with this line of reasoning. 
Sure, I don't doubt that the characteristic curves of modern film emulsions were painstakingly engineered, but still we should't just copy and paste, right? We should know the reasoning that led to these choices, the assumptions made, check if these apply to us. 
And I can't believe that these chemical processes fully achieved even the ideal goals their engineers had, real-world cameras have to operate under constraints we don't have.
In fact digital cameras are already quite different than film, and yet if you look at the work of great contemporary photographers, not everybody is rushing to apply film simulation on top of them...

Furthermore, did photography try to emulate paintings? Cross-pollination is -great-, but every media has its own language, its own technical discoveries. We're really the only ones trying so hard to be emulators; Even if you look at CGI animated movies, they seldom employ many effects borrowed from real-world cameras, it's mostly videogames that are obsessed with such techniques.

Notice how little "in your face" post-fx are in a typical Pixar movie...
A better reason someone gave me was the following: games are hard enough, artists are comfortable with a given set of tools, the audience is used to a given visual language, so by not reinventing it we get good enough results, good productivity and scenes that are "readable" from the user perspective.

There is some truth behind this, and lots of honesty, it's a reasoning can lead to good results if followed carefully. But it turns out that in a lot of cases, in our industry, we don't even apply this line of thinking. And the truth is that more often than not we just copy "ourselves", we copy what someone else did in the industry without too much regard with the details, ending up in a bastard pipeline that doesn't really resemble film or cameras.

When was the last time you saw a movie and you noticed chromatic aberrations? Heavy handed flares and "bloom" (ok, other than in the infamous J.J.Abrams  Star Trek, but hey, he apologized...)? Is the motion blur noticeable? Even film grain is hardly so "in your face", in fact I bet after watching a movie, most of the times, you can't discern if it was shot on film or digitally.
Lots of the defects we simulate are not considered pleasing or artistic, they are aberrations that camera manufacturers try to get rid of, and they became quite versed at it! Hexagonal-shaped bokeh? Maybe on very cheap lenses...


http://www.cs.ubc.ca/labs/imager/tr/2012/PolynomialOptics/
On the other hand lots of other features that -do- matter are completely ignored. Lots of a lens "character" comes from its point spread function, a lens can have a lower contrast but high resolution or the opposite, field curvature can be interesting, out of focus areas don't have a fixed, uniform shape across the image plane (in general all lens aberrations change across it) and so on. We often even leave the choice of antialiasing filters to the user...

Even on the grading side we are sloppy. Are we really sure that our artists would love to work with a movie grading workflow? And how are movies graded anyways? With a constant, uniform color correction applied over the entire image? Or with the same correction applied per environment? Of course not! The grading is done shot by shot, second by second. It's done with masks and rotoscoping, gradients, non-global filters...

A colorist applying a mask
Lots of these tools are not even hard to replicate, if we wanted to; We could for example use stencils to replicate masks, to grade differently skin from sky from other parts of the scene. 
Other things are harder because we don't have shots (well, other than in cinematic sequences), but we could understand how a colorist would work, what an artist could want to express, and try to invent tools that allow a better range of adjustment. Working in worldspace or clipspace maybe, or looking at material attributes, at lighting, and so on.

Ironically people (including myself sometimes) are sometimes instinctively "against" more creative techniques that would be simple in games on the grounds that they are too "unnatural", too different from what we think it's justified by the real camera argument, that we pass on opportunities to recreate certain effects that would be quite normal in movies instead, just because they are not exactly in the same workflow.

Katana, a look development tool.

Scene color control vs post-effect grading.

I think the endgame though is to find our own ways. Why do we grade and push so much on post effects to begin with? I believe the main reason is because it's so easy, it empowers artists with global control over a scene, and allows to do large changes with minimal effort. 

If that's the case though, could we think of different ways to make the life of our artists easier? Why can't we allow the same workflows, the same speed, to operations on source assets? With the added benefit of not needlessly breaking physical laws, thus achieving control in a a more believable way....


Neutral image in the middle. On the right: warm/cold via grading, on the left a similar effect done editing lights. 
Unlike in movies and photography for us it's trivial to change the colors of all the lights (or even of all the materials). We can manipulate subsets of these, hierarchically, by semantic, locally in specific areas, by painting over the world, interpolating between different variants and so on...
Why did we push everything to the last stage of our graphics pipeline? I believe if in photography or movies there was the possibility of changing the world so cheaply, if they had the opportunities we do have, they would exploit them immediately.

Gregory Crewdson

Many of these changes are "easy" as they won't impact the runtime code, just smarter ways to organize properties. Many pipelines are even pushing towards parametric material libraries and composting for texture authoring, which would make even bulk material edits possible without breaking physical models.

We need to think and experiment more. 



P.S. 
A possible concern when thinking of light manipulation is that as the results are more realistic, it might be less appropriate for dynamic changes in game (e.g. transitions between areas). Where grading changes are not perceived as changes in the scene, lighting changes might be, thus potentially creating a more jarring effect.

It might seem I'm very critical of our industry, but there are reasons why we are "behind" other medias, I think. Our surface area is huge, engineers and artists have to care about developing their own tools -while using them-, making sure everything works for the player, make sure everything fits in a console... We're great at these things, there's no surprise then that we don't have the same amount of time to spend thinking about game photography. Our core skills are different, the game comes first.

15 November, 2015

Lecture slides for University of Victoria


An introduction to what is like to be a rendering engineer or researcher in a AAA gamedev production, and why you might like it. Written for a guest lecture to computer graphics undergrads at University of Victoria.

As most people don't love when I put things on scribd, for now this is hosted from my dropbox.

15 August, 2015

Siggraph 2015 course

As some will have noticed, Michal Iwanicki and I were speakers (well I was much more of a listener, actually) in the physically based shading course this year (thanks again to both Stephens for organizing and invitingus) presenting a talk on approximate models for rendering, while our Activision colleague and rendering lead of Sledgehammer showed some of his studio's work on real world measurements used in Advanced Warfare.

Before and after :)
If you weren't at Siggraph this year or you missed the course, fear not, we'll publish the course notes soon (I need to do some proof reading and adding bibliography) and the course notes are "the real deal", as in twenty minutes we couldn't do much more than a teaser trailer on stage.

I wanted to write though about the reasons that motivated me to present in the course that material, give some background. Creating approximations might be time consuming sometimes, but it's often not that tricky, per-se I don't think it's the most exciting topic to talk about. 
But it is important, and it is important because we are still too often too wrong. Too many times we use models that we don't completely understand, that are exact under assumptions we didn't investigate and for which we don't know what perceptual error they cause.

You can nowadays point your finger at any random real time rendering technique, really look at it closely, compare with ground truth, and you're more likely than not to find fundamental flaws and simple improvements through approximation.

This is a very painful process, but necessary. PBR is like VR, it's an all or nothing technique. You can't just use GGX and call it a day. Your art has to be (perceptually) precise, your shadows, your GI, your post effect, there is a point where everything "snaps" and things just look real, but it's very easy to be just a bit off and ruin the illusion. 
Worse, errors propagate non-locally as artists try to compensate for our mistakes in the rendering pipeline by skewing the assets to try to reach as best as they can a local minimum.

Moreover, we are also... not helped I fear by the fact that some of these errors are only ours, we commit them in application, but the theory in many cases is clear. We often got research from the seventies and the eighties that we should just read more carefully. 
For decades in computer graphics we rendered images in gamma space, but there isn't anything for a researcher to publish about linear spaces, and even today we largely ignore what colorspaces really are and what we should use, for example.

We don't challenge the assumptions we work with.

A second issue I think is sometimes it's just neater to work with assumptions that it is to work on approximations. And it is preferable to derive our math exactly via algebraic simplifications, the problem is that when we simplify by imposing an assumption, its effects should be measured precisely.

If we consider constant illumination, and no bounces, we can define ambient occlusion, and it might be an interesting tool. But in reality it doesn't exist, so when is it a reasonable approximation? Then things don't exactly work great, so we tweak the concept and create ambient obscurance, which is better, but to a degree even more arbitrary. Of course this is just an example, but note: we always knew that AO is something odd and arbitrary, it's not a secret, but even in this simple case we don't really know how wrong it is, when it's more wrong, and what could be done to make it measurably better.

You might say that even just finding the errors we are making today and what is needed to bridge the gap, make that final step that separates nice images from actual photorealism, is a non-trivial open problem (*).
It's actually much easier to implement a many exciting new rendering features in an engine than to make sure that we got even a very simple and basic renderer is (again perceptually) right. And on the other hand if your goal is photorealism it's surely better to have a very constrained renderer in very constrained environments that is more accurate than a much fancier one used with less care.

I was particularly happy at this Siggraph to see that more and more we are aware of the importance of acquired data and ground truth simulations, the importance of being "correct", and there are many researchers working to tackle these problems that might seem even to a degree less sexy than others, but are really important.

In particular right after our presentations Brent Burley showed, yet again, a perfect mix of empirical observations, data modelling and analytic approximations in his new version of Disney's BRDF, and Luca Fascione did a better job I could ever do explaining the importance of knowing your domain, knowing your errors, and the continuum of PBR evolution in the industry.

P.S. If you want to start your dive into Siggraph content right, start with Alex "Statix" Evans amazingly presentation in the Advances course: cutting edge technology presented through a journey of different prototypes and ideas. 
Incredibly inspiring, I think also because the technical details were sometimes blurred just enough to let your own imagination run wild (read: I'm not smart enough to understand all the stuff... -sadface-). 
Really happy also to see many teams share insights "early" this year, before their games ship, we really are a great community.

P.S. after the course notes you might get a better sense of why I wrote some of my past posts like:

(*) I loved the open problems course, I think we need it each year and we need to think more at what we really need to solve. This is can be a great communication channel between the industry, the hardware manufactures and the academia. Chose wisely...

04 July, 2015

The following provides no answers, just doubts.

Technical debt, software rot, programming practices, sharing and reuse etcetera. Many words have been written on software engineering, just today I was reading a blog post which triggered this one.

Software does tend to become impervious to change and harder to understand as it ages and increases in complexity, that much is universally agreed on, and in general it's understood that malleability is one key measure of code and practices that improve or retain it are often sensible.

But when does code stop to be an asset and starts being a liability? For example when should we invest on a rewrite? 
Most people seem to be divided in camps on these topics, at least in my experience I’ve often seen arguments and entire teams even run on one conviction or another, either aggressively throwing away code to maintain quality or never allowing rewrites to capitalize on investments made in debugging, optimization and so on.

Smarter people might tell you that different decisions are adeguate for different situations. Not all code needs to be malleable, as we stratify certain layers become more complex but also require less change, new layers are the ones that we actively iterate upon and need more speed. 
Certainly this position has lots of merit, and it can be extended to the full production stack I’d say, including the tools we use, the operating systems we use and such things.

Such position just makes our evaluation more nuanced and reasonable, but it doesn’t really answer many questions. What is the acceptable level of stiffness in a given codebase? Is there a measure, who do we ask? It might be tempting just to look at the rate of change, where do we usually put more effort, but most of these things are exposed to a number of biases.

For example I usually tend to use certain systems and avoid others based on what makes my life easier when solving a problem. That doesn’t mean that I use the best systems for a given problem, that I wouldn’t like to try different solutions and that these wouldn’t be better for the end product. 
Simply though, as I know they would take more effort I might think they are not worth pursuing. An observer, looking at this workflow would infer that the systems I don’t use don’t need much flexibility, but on the contrary I might not be using them exactly because they are too inflexible.

In time, with experience, I’ve started to believe that all these questions are hard for a reason, they fundamentally involve people. 
As an engineer, or rather a scientist, one grows with the ideal of simple formula to explain complex phenomena, but people behaviour still seems to elude such simplifications.

Like cheap management books (are there any other?) you might get certain simple list of rules that do make a lot of sense, but are really just arbitrary rules that happened to work for someone (in the best case, very specific tools, worst just crap that seems reasonable enough but has no basis), they gain momentum until people realize they don’t really work that well and someone else comes up with a different, but equally arbitrary set of new rules and best practices.
Never they are backed by real, scientific data.

In reality your people matters more than any rule, the practices of a given successful team don’t transfer to other teams, often I’ve seen different teams making even similar products successfully, using radically different methodologies, and viceversa teams using the same methodologies in the same company managing to achieve radically different results.

Catering to a given team culture is fundamental, what works for a relatively small team of seniors won’t apply to a team for example with much higher turnover of junior engineers. 
Failure often comes from people who grew in given environments with given methodologies adapted to the culture of a certain team, and as that was successful once try to apply the same to other contexts where they are not appropriate.

In many ways it’s interesting, working with people encourages real immersion into an environment and reasoning, observing and experimenting what specific problems and specific solutions one can find, rather than trying to apply a rulebook. 
In some others I still believe it’s impossibile to shut that nagging feeling that we should be more scientific, that if medicine manages to work with best practices based on statistics so can any other field. I've never seen so far big attempts at making software development a science, deployed in a production environment. 

Maybe I'm wrong and there is an universal best way of working, for everyone. Maybe certain things that are considered universal today, really aren't. It wouldn't be surprising as these kinds of paradigm seem to happen in the history of other scientific fields.

Interestingly we often fill questionaries to gather subjective opinions about many things, from meeting to overall job satisfaction, but never (in my experience) on code we write or the way we make it, time spent where, bugs found where and so on...
I find amusing to observe how code and computer science is used to create marvels of technological progress, incredible products and tools that improve people’s lives, and that are scientifically designed to do so, yet often the way these are made is quite arbitrary, messy and unproductive.
And that also means that more often than not we use and appreciate certain tools we use to make our products but we can’t dare to think how they really work internally, or how they were made, because if we knew or focused on that, we would be quite horrified.

P.S.
Software science does exist, in many forms, and is almost as old as software development itself, we do have publications, studies, metrics and even certain tools. But still, in production, software development seems more art than science.