10 July, 2016

SIGGRAPH 2015: Notes for Approximate Models For Physically Based Rendering

This is a (hopefully temporary) hosting location for the course notes Michal Iwanicki and I drafted for our presentation at the Physically Based Shading Course last year.

I'm publishing them here because they were mentioned a lot in our on-stage presentation; in fact, we meant the talk mostly as a "teaser" for the notes, but we still haven't been able to bring them out of the "draft" stage, despite the effort of everyone involved (us and the course organizers, to whom our gratitude goes for the hard work of making such a great event happen).

It also doesn't help that in an effort to show an overall methodology, we decided to collate more than a year of various research efforts (which happened independently, for different purposes) into this big document. I still have to work more on my summarization skills.

06 July, 2016

How to spot potentially risky Kickstarters. Mighty No9 & PGS Lab

This is really off-topic for the blog, but I've had so many discussions about different gaming related Kickstarters that I feel the need to write a small guide. Even if this is probably the wrong place with the wrong audience...

Let's be clear, this is NOT going to be about how to make a successful Kickstarter campaign; actually, I'm going to use two examples (one of a past KS, and one of a campaign that is still open as I write) that are VERY successful. It's NOT even going to be about how to spot scams, and I can't say that either example is one.

But I want to show how to evaluate risks, and when it's best to use a good dose of skepticism, because it seems that there are a lot of people who get caught in the "hype" for given products and end up regretting their choices.

The two examples I'm mostly going to use are the Mighty No.9 and PGS Lab campaigns.
I could have picked others, but these came to mind. It's not a specific critique of these two though, and I know there are lots of people enjoying Mighty No.9, and I wish the best to PGS Labs; I hope they'll start by addressing the points below and proving my doubts unfounded.

The Team

This is absolutely the most important aspect, and it's clear why. On Kickstarter you are asked to give money to strangers, to believe in them, their skills and their product. 
Would you, in real life, give away a substantial amount of money to people, for an investment, without knowing anything about them? I doubt it.

So when you see a project this successful...


...first thought must be: these guys must be AMAZING, right?


I kid you not, that's the ONLY information on the PGS Lab team. They have a website, but there is ZERO information on them there as well.


From their (over-filtered and out-of-sync) promo video, we learn one name of a guy...


"We have brought together incredible Japanese engineers and wonderful industrial designers". A straight quote from the video, the only other mention of the team. No names, no past projects, no CVs. But they are "wonderful", "incredible" and "Japanese", right?

This might be the team. Might be buddies of the guy in the middle...
For me, this is already a non-starter. But it seems mine is not a popular point of view...

The team?

So what about Mighty No.9 then? Certainly, Inafune has enough of a CV... And he even had a real team, right? He even did the bare minimum and put the key people on the Kickstarter page...



Or did he? Not so quickly...


This is the first thing I noticed in the original campaign. Inafune has a development team (Comcept), but it seems that for this game he intended to outsource the work.

Unfortunately, this is not an unusual practice; it seems that certain big names in the industry are using their celebrity to easily raise money for projects they then outsource to third-party developers.



Igarashi, with Bloodstained, did even "worse". Not only is the game itself outsourced, but so is the campaign, including the rewards and merchandise. In fact, if you look at the KS page, you'll notice some quite clashing art styles...


...I suspect this was due to the fact that different outsourcers worked on different parts of the campaign (concept art vs rewards/tiers).

Let's be clear: per se this is not a terrible thing. Both Igarashi and Inafune used Inti Creates as the outsourcing partner, a studio with plenty of experience with 2D scrollers, which means the end product might turn out great (in fact, the E3 demo of Bloodstained looks at least competent, if not exceptional)... But it shows, to me, a certain lack of commitment.

People think that these "celebrity" designers are putting their careers on the line against the "evil" publishers that are not funding their daring titles (facepalm), while they are just running a marketing campaign.

This became extremely evident for Inafune in particular, as he rushed to launch a (luckily disastrous... apparently you can't fool people twice) second campaign in the middle of Mighty No.9's production, revealing his hand and how little commitment he had to the title.

The demo: demonstrating skills and commitment

Now, once you have the team down, you want to evaluate their skills. Past projects surely help, but what helps even more is showing a demo, a work-in-progress version of the product.

It's hard enough to deliver a new product even when you are perfectly competent. I've worked on games made by experienced professionals that just didn't end up making it, and I've backed Kickstarters that failed to deliver even though they were just "sequels" to products a given company was already selling... So you really shouldn't settle for anything less than concrete proof.

How do our Kickstarters fare in terms of demos?


PGS Labs show a prototype. GREAT! But wait...


Oh. So, the prototype is nothing more than existing hardware, disassembled and reassembled in a marginally different shape. In fact, you can see the PCBs of the controller they used: a joypad for tablets, which they just opened, desoldering some buttons and moving them into a 3D-printed shell.

Well, this would be great if we were talking about modding, but it proves exactly NOTHING about their ability to actually -make- the hardware (my guess - but it's just a guess - is that in the best scenario they are raising money to look for a Chinese ODM that already has similar products in its catalog, and they won't really do any engineering).

Of course, when it comes to the marketing campaigns of "celebrity designers", all you get is whatever is cheapest to make; they know they'll get millions anyway, so they just get some outsourcers to paint some concept art...


It's really depressing to me how, by just creating a video with their faces, certain people can raise enormous amounts of money. And I know that there are lots of success stories, from acclaimed developers as well, but if you look at them, the pattern is clear: success comes from real teams of people deeply involved with the products, and with actual, proven, up-to-date skills in the craft.

Meanwhile, so far I'd say the projects of older, lone "celebrities" have -all- resulted in games that are -at best- ok. Have we ever seen a masterpiece come out of any of these? Dino Dini? Lord British?

Personally, as a rule of thumb, I'd rather give money to a "real" indie developer, who in lots of cases really can't just go to a publisher, or even self-fund by borrowing from a bank, and who often makes MUCH, MUCH better games through real passion, sacrifice, and eating lots of instant noodles, I assume...

The "gaming press"

What irks me a lot is that these campaigns are very successful because they feed on the laziness of news sites where hype spreads due to the underpaid human copy and paste bots who just repeat the same stuff over and over again. It's really a depressing job.

And even good websites, websites where I often go for game critique and intelligent insights, seem to be woefully unequipped to discuss anything about production, money, how the industry works. 

I'm not sure if it's because gaming journalists are less knowledgeable about production (but I really doubt it) or if it's because they prefer to keep a low profile (but... these topics do bring "clicks", right?).

Anyhow. I hope at least this can help a tiny bit :)

02 July, 2016

Unity 101 - From zero to shaders in no time

Disclaimer:

I'm actually no Unity expert; I only started looking into it more seriously for a course I taught. But I have to say, right now it looks like it could be one of the best solutions for a prototyping testbed.

This post is not meant as a rendering (engineering) tutorial, it's written for people who know rendering, and want to play with Unity, e.g. to prototype effects and techniques.

Introduction:

I really liked Nvidia's FXComposer for testing out ideas, and I still do, but unfortunately that product has been deprecated for years. 

Since then I started playing with MJP's framework by adding functionality that I needed (and later he added himself), and there are a couple of other really good frameworks out there by skilled programmers, but among the full-fledged engines, Unity seems to be the best choice right now for quick prototyping.

The main reason I like Unity is its simplicity. I can't even begin to find my way around Unreal or CryEngine, and I don't really care about spending time learning them. Unity, on the other hand, is simple enough that you can just open it and start poking around, which is really its strength. People often obsess too much over details of technology. Optimization and refinement are relatively easy; it's the experimental phase which we need to do quickly!

Unity basics:

There are really only three main concepts you need to know:

1) A project is made of assets (textures, meshes...). You just drag and drop files into the project window, and they get copied to a folder with a bit of metadata describing how to process them. All assets are hot-reloaded. Scripts (C# or JavaScript code) are assets as well!

2) Unity employs a scene-graph system (you can also directly emit draws, but for now we'll ignore that). You can drag meshes into the scene hierarchy and they will appear in the game and editor views, and you can create lights, cameras and various primitives.



The difference between the two is that the game view is seen through a game camera, while the editor camera can roam freely. When you are in the game view you can change object properties (if you're paused), but these changes don't persist (they aren't serialized in the scene), while changes made in the editor are persistent.



3) Unity uses a component system for everything. A C# script just defines a class (with the same name as the script file) which inherits from "MonoBehaviour" and can implement certain callbacks.
All the public class members are automatically exposed in the component UI as editable properties (and C# attributes can be used to customize that) and serialized/deserialized with the scene.



A component can be attached to any scene object (a camera, a mesh, a light...) and can access/modify its properties, and perform actions at given times (before rendering, after rendering, before update, on scene load, on component enable, when drawing debug objects and so on and so forth...).



Components can freely change pretty much anything in the scene, as there are ways of finding objects by name, type and so on, and they can also create new objects. The performance characteristics of some operations are sometimes... surprising, and in real games you might need to cache/pool certain things, but for prototyping it's irrelevant.
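To make this concrete, here is a minimal sketch of what such a component can look like (the class, field names and behaviour are made up for illustration; the callbacks and the GameObject/Transform/Debug APIs are standard Unity):

using UnityEngine;

// Minimal, hypothetical component: public fields show up in the inspector
// (and are serialized with the scene), callbacks run at well-defined times.
public class Spinner : MonoBehaviour
{
    public float degreesPerSecond = 90.0f;     // editable in the component UI
    public string targetName = "Main Camera";  // also exposed and serialized

    void OnEnable()
    {
        // Called when the component is enabled (e.g. after scene load).
        Debug.Log("Spinner enabled on " + gameObject.name);
    }

    void Update()
    {
        // Called once per frame, before rendering: modify the object we're attached to.
        transform.Rotate(0.0f, degreesPerSecond * Time.deltaTime, 0.0f);

        // Components can freely find and access other scene objects
        // (in a real game you'd cache this lookup; for prototyping it doesn't matter).
        GameObject target = GameObject.Find(targetName);
        if (target != null)
            Debug.DrawLine(transform.position, target.transform.position, Color.green);
    }
}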

Shaders & Rendering:

From the rendering side things are similarly simple. Perhaps the most complex aspect for someone unfamiliar with it is the shader system.

Like most engines, Unity has a shader system that allows for automatic generation of shader permutations (e.g. the forward renderer needs a permutation per light type and shadow), and it also needs to handle different platforms (it can cross-compile HLSL to GLSL).
It achieves that with a small DSL for shader description, "ShaderLab", with the actual shader code embedded into it.
Unity also has other ways of making shaders without touching HLSL, and a "surface shader" system that allows you to avoid writing VS and PS, but these are not really that interesting for a rendering engineer, so I won't cover them :)

ShaderLab has functionality to set render state and declare shader parameters, with the latter automatically reflected in the Material UI, when a material binds to a given shader. I won't go into a detailed description of this system, because once you see a ShaderLab shader things should be pretty obvious, but I'll provide some examples at the end.

For geometry materials, the procedure is quite simple: you'll need a ShaderLab shader (.shader) asset and a material asset that is bound to it, and then you can just assign the material to a mesh (drag and drop) and everything should work.
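To give an idea, this is roughly what a minimal ShaderLab asset can look like (a hedged sketch of a basic unlit, textured shader, nothing production-worthy; the property names are arbitrary):

Shader "Custom/UnlitTint"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
        _Tint ("Tint", Color) = (1,1,1,1)
    }
    SubShader
    {
        Pass
        {
            CGPROGRAM
            // The HLSL shader code is embedded in the ShaderLab description.
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _MainTex;
            float4 _MainTex_ST;   // tiling/offset, filled in by Unity
            fixed4 _Tint;

            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv  : TEXCOORD0;
            };

            v2f vert(appdata_base v)
            {
                v2f o;
                o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
                o.uv  = TRANSFORM_TEX(v.texcoord, _MainTex);
                return o;
            }

            fixed4 frag(v2f i) : SV_Target
            {
                return tex2D(_MainTex, i.uv) * _Tint;
            }
            ENDCG
        }
    }
}

Note how the properties declared in the Properties block automatically show up in the Material UI once a material is bound to this shader.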

Unity supports three rendering systems (as of now): VertexLit (which is really a forward renderer without multipass, with up to eight lights per object), Forward (multipass, one light at a time) and Deferred (deferred shading; there is also a legacy system that does "deferred lighting").
The shader has to declare for which system it's coded and the way Unity passes the lighting information to the shader changes based on that.

For post-effect materials you'll need both a shader and a component. The component will be a C# script that gets attached to the camera and triggers rendering in the OnRenderImage callback. In the script one can create a material programmatically, binding it to the shader and setting its parameters, so there's no need to have a separate material asset.
The rendering API exposed by Unity is really minimal, but it's super easy to create rendertargets and draw fullscreen quads. Unity automatically chains post-effects if there are multiple components overriding OnRenderImage, and the callback provides a source and a destination rendertarget, so the chain is completely transparent to the scripts.
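As a hedged sketch (the class name and the "_Intensity" parameter are made up, but OnRenderImage and Graphics.Blit are the actual Unity entry points), a post-effect component can be as small as this:

using UnityEngine;

// Hypothetical minimal post-effect: attach to a camera, assign a shader in the
// inspector, and Unity calls OnRenderImage with the source/destination targets.
[ExecuteInEditMode]
public class SimplePostEffect : MonoBehaviour
{
    public Shader effectShader;                        // assigned in the inspector
    [Range(0.0f, 1.0f)] public float intensity = 1.0f;

    private Material effectMaterial;

    void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        if (effectShader == null)
        {
            Graphics.Blit(source, destination); // pass-through if not set up
            return;
        }
        if (effectMaterial == null)
            effectMaterial = new Material(effectShader); // material created programmatically

        effectMaterial.SetFloat("_Intensity", intensity); // hypothetical shader parameter
        // Draws a fullscreen pass with the material; source is bound as _MainTex.
        Graphics.Blit(source, destination, effectMaterial);
    }
}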

For more advanced effects, there is support for drawing and creating meshes (including their vertex attributes), drawing immediate geometry (lines and so on, usually for debugging) and even doing "procedural draws" (draws with no mesh data attached, where vertices are assumed to be pulled from a buffer) and dispatching compute shaders.

It's also possible to access the g-buffer when using the deferred renderer and to sample shadowmaps manually, but there is no provision for changing the way either is created, and no real access to any of the underlying graphics API (unless you write C++ plugins).

Last but not least, on PC Unity integrates with RenderDoc and Visual Studio for easy debugging, which is really a nice perk.

All this is best explained with code, so, if you care to try, here is a bunch of fairly well commented (albeit very basic / mostly wrong in terms of rendering techniques) sample shaders I hastily made to learn Unity myself before I started teaching the course.

18 June, 2016

Learning to teach

For the past couple of months I've been teaching a rendering course once a week at a private art school here in Vancouver. Course preparation took quite a bit of my time (which in turn left this blog dry), but it's quite a nice experience and a small way to contribute to our rendering community; if you can, I wouldn't discourage you from getting involved with teaching.

This is not the first time I've held a class; I actually worked for a small professional school in Italy teaching the very basics of computers while I was going to university. But it's the first time I've taught rendering in a systematic way, outside of sporadic presentations at work and conferences.

So, still lots to learn...

What to teach?

The first obstacle actually is to decide what to teach. I had a curriculum for this course as it was taught in previous years, it went through the motions of building an engine in OpenGL, and from what I can tell was fairly extensive, albeit a bit oldschool.
I decided pretty early on to scrap it as I didn't think I could provide enough value that way.

I have to premise that this course takes place over 7x3=21 hours, and it's part of what I imagine is a very intense, packed year of courses in game programming. 
It's one of the last courses that the students take, so there is an expectation of a certain level of proficiency with programming and math (I don't cover linear algebra, for example). So you really have to pick your battles and fight for the students' attention.

It might be my own bias (as I started programming extremely young and went on towards more theoretical studies in my higher education), but I still think that APIs and frameworks can (and probably should) be self-learned. These are not complicated topics, and while you might pick some bad advice if you mindlessly accept everything the web throws at you, I didn't think going through such things would be the best use of (rather limited) time.

Ok, so I started with a clean slate. What now? This is truly challenging. You can easily talk to people at work, or at a conference, when you have a quite precise idea of what can be assumed to be "common knowledge". But here? Not so much.
So my first decision was to keep it agile... I had from the get go a rough conceptual plan of how to structure the seven lessons and how they could relate to each other, and I discussed it with the head of the faculty, but I did everything one lesson at a time, with a good dose of improvisation.

This is my strategy. I would think and jot down notes (pretty much as I do when I have some thoughts that might one day go on the blog...) on where I wanted to go with a lesson, starting with the overall objective (what I wanted the students to achieve).
Then, the day before, I would scramble and start laying down some slides, trying to keep them minimal (which is against my instincts), just making sure I had enough supporting material, diagrams and illustrations to cover what I wanted.
I post the lesson materials online after the class, pruning out things I didn't end up covering, writing a summary of the main points of the lesson and giving links to further resources.

The way I tried to adapt is to slightly over-prepare. Not as in to be very prepared and rehearsed (I don't rehearse), but to have a good amount of options for the class, as it's easier to cut corners in class than to have to scramble to find materials that you didn't expect to need. I never walk in the class thinking that I -have- to go through everything, and I don't have the lesson after the one I'm teaching prepared at all, so I can always adjust.

How to assess?

Improvising (to an extent) the lessons gives you the freedom to adapt, but you can't adapt if you don't have the pulse of how things are going. And this is probably the hardest thing, ever. It's especially hard when you are very versed in a field, because you totally lose sight of what it means not to understand certain concepts.

The more basic the course the harder the task. If you ever teach an introductory class to programming, and have your students problem-solve, you'll see what I mean. It's actually hard to articulate why certain things are solved in a certain way, when for your brain it's just obviously what you need to do...

I'm not great at this yet, actually. I awaited with anticipation the students' solutions to my first assignment, because I had no solid idea of whether it was going to be trivial or daunting, and whether things were really coming through (they ended up doing great). Assignments are for teachers! I really don't care about assigning grades, I care about getting feedback!

That said, a few things I can share.

First. Generic "checkpoints" are worse than useless. Asking to an audience how things are going, whenever things are understood, or to ask questions freely, is at best a waste of time, and at worst a way to get a false sense of confidence. Questions have to be real. 

Problem solving in class is great. I tried to mix problem solving with in-class assignments, and the reason I think you need both is that if you only do questions you risk always engaging the same people (and pointed questions are a bit more aggressive), while assignments let you walk through the desks and see how people are doing individually.

Moreover, as this was a very intensive course, I could not really expect a lot of at-home study from the students (my class ends at 21:30!), and I think working in class goes well with such programs: you need to have the time to come up with solutions on your own, and to practice, so if that can't happen at home, at least some degree of it was done during the class itself.
Arguably, all the real learning happens not because one memorizes a lesson as a teacher taught it, but by practicing and thinking hard enough that the underlying concepts start to become apparent.

Finally, just asking for guesses and speculations before revealing the next bit of information also helps keep people engaged and, in general, awake!

The course.

A few people after learning that I was trying this asked for the course program. I can't share the course materials themselves, both due to contractual restrictions and because they are not polished enough to be comprehensible without me explaining them, but I will go through what ended up being my ideas for the lessons (keeping in mind that if I am to do this again, I will probably change a lot).

Lesson 1: Introduction. My background, how I did things and why that doesn't matter. The course objectives: giving a basic understanding of the field to facilitate further study, and practical, hands-on tinkering with GPUs and shaders. Introduction to the "rendering problem". Rendering in a videogame company, what it means to be a rendering engineer, what problems we face (spoiler alert: tech is the easier part).

Lesson 2: What's a GPU? Starting from a CPU (serial execution) and going towards a GPU (latency, hiding it with data-parallel work...). Going from the rendering equation to rasterization: visibility and shading. Some hands-on time with executing kernels on the GPU: ShaderToy.

Lesson 3: Going from data-parallel processing to 3d scenes. What goes on when we draw an object? Vertex shaders, pixel shaders. Tinkering in Unity.

Lesson 4: "Advanced" shading. Texturing, projections, UV mapping. Some fun examples (reflection mapping with pre-integrated cubemaps in Unity). "Advanced" lighting and back to the rendering equation. Fundamentals of PBR.

Lesson 5: Going past a single draw. Engine and (generic) rendering API concepts. API: resource creation (memory), binding of state (registers), draws -> command buffer. Engine: visibility and sorting. Rendering algorithms: shadow mapping, post-effects, forward and deferred rendering.

Lesson 6: DirectX 11 API in-class "live coding". I downloaded SharpDX (a DX9-10-11-12 C# wrapper) and we went from drawing a single 2D triangle (the "hello world" of rendering) to drawing a 3D quad.

Lesson 7. I haven't actually taught this yet, but I settled now with the idea of going through an actual contemporary game, showing art and rendering choices, workflows, and more advanced rendering algorithms.

Closing remarks.

Teaching gives us a unique perspective we don't get writing articles or speaking at conferences. We can see (if we pay attention) what people understand from our words, and we can see if we are boring or engaging. I still have a lot to improve. Narrowing down topics is especially hard! Creating something truly beautiful and to the point.

To a degree, I think trying to squeeze too much into a lesson (or, as I do in this blog, an article) responds to the same urges that make programmers over-generalize: being concerned about covering all the possible bases even if it comes at a detriment to the overall quality. I'm aggressively against these practices in code, but still not great at exercising the same restraint in lectures.

So, I'd better stop here!

12 March, 2016

Beyond photographic realism

Service note: if you're using motion blur, please decouple the rotational, translational and moving-object amounts. And test the effect on a variety of screen sizes (fields of view). In general, motion blur shouldn't be something you notice, something that registers consciously.
I've been recently playing Firewatch (really good!), which is annoyingly blurred when panning the view on my projector, and that's not an uncommon "mistake". The Order suffers from it as well, at least to my eyes/my setup. When in doubt, provide a slider?

Ok, with that out of the way, I wanted to try to expand on the ideas of artistic control and potential of our medium I presented last time, beyond strict physical simulation with some color-grading slapped on top. But first, allow me another small digression...

This winter I was back in my hometown, visiting Naples with my girlfriend and her parents. It was their first time in Europe, so of course we went to explore some of the history and art of the surroundings (a task for which a lifetime won't be enough, but still). 

One of the landmarks we went to visit is the Capodimonte art museum, which hosts a quite vast collection of western paintings, from the middle ages up to the 18th century. 

Giotto. Nativity scene. The beginnings of the use of perspective (see also Cimabue).
I've toured this museum a few times with my father in the past, and we always follow a path which illustrates the progress of western art: from sacred, symbolic illustrations, where everything is placed according to a hierarchy of importance, to the beginnings of the study of the natural world, of perspective, of sceneries, all the way to commissioned portraits, mythological figures and representations of common objects and people.

Renaissance painting by Raphael. Fully developed perspective.
What is incredibly interesting to me in this journey is to note how long it takes to develop techniques that nowadays we take for granted (the use of perspective, of lighting...), and how a single artist can influence generations of painters after them.

Caravaggio. The calling of Saint Matthew.
When is the next wave coming, for realistic realtime rendering? When are we going to discover methods beyond the current strict adhesion to somewhat misunderstood, bastardized ideas borrowed from photography and cinematography?

Well, first of all, we ought to discuss why this question even matters, why there should be an expectation; couldn't photography be all there is in terms of realistic depiction? Maybe that's the best that can be done, and artistic expression should be limited to the kind of scene setups that are possible in such a medium.

In my view, there are two very important reasons to consider a language of realistic depiction that transcends physical simulation and physical acquisition devices:

1) Perception - Physical simulation is not enough to create perceived realism when we are constrained to sending very limited stimuli (typically, LDR monitor output, without stereopsis or tracking). Studying physiology is important, but does not help much by itself. A simple replica of what our visual system would do (or be able to perceive) when exposed to a real-world input is not necessarily perceived as real when such visual artefacts happen on a screen instead of as part of the visual system. We are able to detect the difference quite easily; we need to trick the brain instead!

A tone-mapped image that aims to reproduce the detail perceived
by the human visual system does not look very realistic.
2) Psychology - Studying perception in conjunction with the limits of our output media could solve the issue of realism, but why does perceptual realism matter in the first place? I'd say it's such a prominent style in games because it's a powerful tool for engagement, for immersion. The actual goal is emotional; games are art. We could trick the brain with more powerful tools than what we can achieve by limiting ourselves to strict (perceptual) realism.

In other words, the impression of seeing a realistic scene is in your brain; reproducing only the physics of light transport on a monitor is not enough to make your brain fire in the same way as when it's looking at the real world...

So is this all about art, then? Why is this on a rendering technology blog? The truth is, artists are often scientists "in disguise": they discover powerful tools to exploit the human brain, but don't codify them in the language of science... Art and science are not disjointed; we should understand art, and serve it.

I've been very lucky to attend a lecture by professor
Margaret Livingstone recently, it's a must-see.

Classical artists understood the brain, if not in a scientific way, then in an intuitive one. Painters don't reproduce a scene by measuring it; they paint what it -feels- like to see a given scene.

Perceptual tricks are used in all artistic expressions, not only in painting but also in architecture and sculpture. Michelangelo's David has its proportions altered to look pleasing from its intended viewing angle, from below, enlarging the head and the right hand. And there is an interesting theory according to which Mona Lisa's "enigmatic" smile is a product of different spatial frequencies and eye motions (remember that the retina can detect high-frequency details only in a small area near the center...).

Analyzing art through the lens of scientific enquiry can reveal some of these tricks. This is interesting to researchers, as the tricks can tell something about how our brain and visual system work, but it should be even more interesting for us: if we can codify these artistic shortcuts in the language of science, we can turn talent into tools.

Edward Hopper understood light. And tone mapping!
(painting has a much more limited dynamic range than LCD monitors)
Cinematography has its own tricks. Photography, set design - all arts develop their own specific languages. And real-time rendering has much more potential, because we control the entire pipeline, from the scene to the physics simulation to the image transfer, and we can alter all of these dynamically, in reaction to the player's inputs.
Moreover, we are a unique blend of science and art; we have talents with very different backgrounds working together on the same product. We should be able to create an incredibly rich language!
The most wonderful thing about working in lighting is the people that you encounter. Scientists and artists; engineers and designers; architects and psychologists; optometrists and ergonomists; are all concerned about how people interact with light. It is a topic that is virtually without boundaries, and it has brought me into contact with an extraordinary variety of people from whom I have gathered so much that I know that I cannot properly acknowledge all of them. - From "Lighting by Design", Christopher Cuttle.
We have full control over the worlds we display, and yet so far we author and control content with tools that simulate what can be done with real cameras, in real sets. And we have full control over the -physics- of the simulations we do, and yet we are very wary of allowing tweaks that break physical laws. Why?


James Turrell's installations play with real-world lights and spaces
I think that a lot of it is a reaction, a very understandable and very reasonable reaction, against the chaos that we had before physically-based rendering techniques. 

We are just now trying to figure out everything we need to know about PBR, and trying to educate our artists in this methodology, and that has without doubt been the single most important visual revolution of the last few years in realtime (and even offline) graphics.
PBR gives us certainty, gives us ways to understand visual problems in quantitative terms, and gives us a language with fewer degrees of freedom for artists, with parameters that are clearly separated, orthogonal (lights, materials, surfaces...).

This is great: when everything has a given degree of realism by construction, artists don't have to spend time just trying to achieve realism through textures and models; they can actually focus on the higher-level goal of deciding -what- to show, of actual design.

But now, as we learn more of the craft of physically based models, it's time to learn again how and why to programmatically break them! We have to understand that breaking physics is not a problem per se; the problem is breaking physics for no good reason.
For example, let's consider adjusting the "intensity" of global illumination, which is not uncommon in rendering engines; it's often a control that artists ask for. The problem is not one of math, or correctness, but of intent.


Lighting (softness/bounces) can be a hint of scale
Why are we breaking energy conservation? Is it because we made some mistakes in the math, and artists are correcting for them? Is it because artists created worlds with incorrect parameters for what they are trying to achieve? Or is it because we consciously want to communicate something with that choice, for example distorting the sense of scale of the scene? The last is a visual language choice; the former are just errors which should be corrected by finding the root cause, instead of adding a needless degree of freedom to our system.

Nothing should be off the table if we understand why we want certain hacks; on the contrary, we should start with physical simulations and find what we can play with, and why.
Non-linear geometric distortions? Funky perspective projections (iirc there were racing games in the far past that did play with that)? Color shifts? Bending light rays?

And even more possibilities open up with better displays, with stereopsis, HDR, VR... What if we sometimes killed, or even inverted, the stereo projection for some objects? Or changed their color, or shading, across eyes?


3D printed dress by Iris Van Herpen
All other artistic disciplines, from fashion to architecture, rush at new technologies, new tools, trying to understand how they can be employed, hacked, bent, for new effects. 

We are still in our infancy, and it's understandable: realistic realtime rendering is still young (and even -gameplay- in games, which arguably is much more studied and refined as an art than visuals, is). But it's time, I'd say, to start being more aware of our limits, and to start experimenting more.

06 February, 2016

Low-resolution effects with depth-aware upsampling

I have to confess, till recently I was never fond of doing half- or quarter-res effects via a bilateral upsampling step. It's a very popular technique, but every time I tried it I found it caused serious edge artifacts...
On Fight Night Champion I ended up shipping AO and deferred shadows without any depth-aware upsampling (just separating the ring and fighters from the background, and using a bias towards over-shadowing); Space Marine ended up shipping with bilateral upsampling on AO (but no bilateral blurring or noise), but it still had artifacts. In the end it sort of worked, via some hacks that were good enough to ship, but that I never really understood.

For Call of Duty: Black Ops 3 we needed to compute some effects (volumetric lighting) at quarter-res or less to respect the performance budgets we had, so depth-aware upsampling was definitely a necessity, and I needed to investigate it a bit more.
A quite extreme example of "god rays" in COD:BO3
I found a solution that is very simple, that I understand quite well, and that works well in practice. I'm sure it's something many other games are doing and many other people have discovered (due to its simplicity), but I'm not aware of it being presented publicly, so here they are, my notes on how not to suck at bilateral upsampling:

1) Bilateral weighting doesn't make a lot of sense for upsampling.

The most commonly used bilateral upsampling scheme works by using the same four texels that would be involved in bilinear filtering, but changing their weights by multiplying them by a function of the depth difference between the true surface (high res z-buffer) and their depths (low-res z-buffer).

This method makes little sense, really, because you can have the extreme case where the bilinear weights select only one sample, but that sample is not similar to the surface depth you need at all! Samples that are not detected to be part of the full-res surface should simply be ignored, regardless of how "strongly" bilinear wants to access them...

A better option is to simply -choose- between bilinear filtering or nearest-depth point sampling, based on whether the low-res samples are part of the high-res surface or not. This can be done in a variety of ways, for example:

- lerp(bilinear_weights, depth_weights, f(depth_discontinuity)) * four_samples
- lerp(bilinear_sample, best_depth_sample, f(depth_discontinuity))
- bilinear_fetch(lerp(bilinear_texcoords, best_depth_texcoords, f(depth_discontinuity)))

Where the weighting function f() is quite "sharp", or even just a step function. The latter scheme is similar to nVidia's "nearest depth sampling"; it's the fastest alternative, but in Black Ops 3 I ended up sharply going from bilinear to "depth only" weights if too big a discontinuity is detected in the four bilinear texels.
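For illustration, here is a hedged HLSL sketch of that sharp variant (this is not the shipped code; texture, sampler and threshold names are made up, and it assumes the low-res depth was produced by the downsampling discussed in the next point):

// Sketch: quarter/half-res upsampling that falls back from plain bilinear to
// nearest-depth point sampling when the 2x2 low-res footprint crosses an edge.
float4 DepthAwareUpsample(
    float2 uv, float fullResDepth,
    Texture2D<float>  lowResDepth,   // min/max downsampled depth
    Texture2D<float4> lowResEffect,  // low-res effect buffer (e.g. volumetrics)
    SamplerState pointClamp, SamplerState linearClamp,
    float2 lowResTexelSize,          // 1.0 / low-res resolution
    float  depthRejectThreshold)     // tuned per effect / depth encoding
{
    // The four low-res texels that bilinear filtering would blend.
    static const float2 offsets[4] = { float2(0, 0), float2(1, 0), float2(0, 1), float2(1, 1) };
    float2 baseUV = uv - 0.5 * lowResTexelSize;

    float  bestDiff = 1e10;
    float2 bestUV   = uv;
    float  maxDiff  = 0.0;

    [unroll]
    for (int i = 0; i < 4; ++i)
    {
        float2 sampleUV = baseUV + offsets[i] * lowResTexelSize;
        float  depth    = lowResDepth.SampleLevel(pointClamp, sampleUV, 0);
        float  diff     = abs(depth - fullResDepth);
        maxDiff = max(maxDiff, diff);
        if (diff < bestDiff) { bestDiff = diff; bestUV = sampleUV; }
    }

    // No discontinuity in the footprint: plain bilinear is fine (and smooth).
    if (maxDiff < depthRejectThreshold)
        return lowResEffect.SampleLevel(linearClamp, uv, 0);

    // Discontinuity: point-sample the texel whose depth best matches the full-res surface.
    return lowResEffect.SampleLevel(pointClamp, bestUV, 0);
}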

2) Choose the low-res samples to maximise the chances of finding a representative.

It's widely known that a depth buffer can't be downsampled by averaging values: that would result in depths that do not exist in the original buffer, and that are not representative of any surface, but "floating" in between surfaces at edge discontinuities. So either min or max filtering is used, commonly preferring nearest-to-camera samples, with the reasoning that closer surfaces are more important and thus should be sampled more (McGuire tested various strategies in the context of SSAO, see Table 1 here).

But if we think in terms of the reconstruction filter and its failure cases, it's clear that preferring a single set of depths doesn't make a lot of sense. We want to maximize the chance of finding, among the texels we consider for upsampling, some that represent well the surfaces in the full-resolution scene. Effectively, in the downsampling step we're selecting the points at which we want to compute the low-res effect; clearly, we want to do that in a way that distributes samples evenly across surfaces.

A good way of doing this is to choose, for each sample in the downsampled z-buffer, a surface that is different from the ones of its neighbors. There are many ways this could be done, but the simplest is to just alternate min and max downsampling in a checkerboard pattern, making sure that for each 2x2 quad, if we are in a region that has multiple surfaces, at least two of them will be represented in the low-res buffer.
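A hedged sketch of that downsampling pass (assuming a 2x reduction and made-up names; the selection can of course be folded into whatever depth downsample you already have):

// Sketch: per low-res texel, take the min or max of the 2x2 full-res footprint,
// alternating in a checkerboard so neighboring low-res texels prefer different surfaces.
float DownsampleDepthCheckerboard(uint2 lowResPixel, Texture2D<float> fullResDepth)
{
    uint2 base = lowResPixel * 2;
    float d00 = fullResDepth.Load(int3(base + uint2(0, 0), 0));
    float d10 = fullResDepth.Load(int3(base + uint2(1, 0), 0));
    float d01 = fullResDepth.Load(int3(base + uint2(0, 1), 0));
    float d11 = fullResDepth.Load(int3(base + uint2(1, 1), 0));

    float dMin = min(min(d00, d10), min(d01, d11));
    float dMax = max(max(d00, d10), max(d01, d11));

    // Checkerboard selection: within any 2x2 quad of low-res texels,
    // two will hold min depths and two will hold max depths.
    bool useMin = ((lowResPixel.x + lowResPixel.y) & 1) == 0;
    return useMin ? dMin : dMax;
}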

In theory it's possible to push even more surfaces into a quad: for example, we could record the second smallest or second biggest depth, or the median, or use any other scheme (even a quasi-random choice) to select a depth (we shouldn't use averages though, as these will generate samples that belong to no surface). In practice this didn't seem to work great with my upsampling, I guess because it reduces spatial resolution in favour of depth resolution, but your mileage may vary depending on the effect, the upsampling filter and the downsampling ratio.

Some residual issues can be seen sometimes (upper right),
when there is no good point sample in the 2x2 neighborhood.

Further notes.

The nearest-depth upsampling with a min/max checkerboard pattern downsampling worked well enough for Black Ops 3 that no further research was done, but there are still things that could be clearly improved:

- Clustering for depth selection.
A compute shader could do actual depth clustering to try to understand how many surfaces there are in an area, and choose which depths to store and the tradeoff between depth resolution and screen-space resolution.

- Gradients.
Depth discontinuity in the upsampling step is a very simplistic metric; more information can be used to understand if samples belong to the same surface, like normals, g-buffer attributes and so on.

- Wider filters.
Using a 2x2 quad of samples for the upsampling filter is convenient, as it allows us to naturally fall back to bilinear if we think the samples are representative of the high-res surface, but there is no reason to limit the search to such a neighborhood; wider filters could be used, both for higher-order filtering and to have better chances of finding representative samples.

- Better filtering of the representative depth samples.
There is no reason to revert to point sampling (or purely depth-weighted sampling) in the presence of discontinuities; it's still possible to reject samples that are not representative of the surface while weighting the useful ones with a filter that depends on the subtexel position.
Special cases could be considered for horizontal and vertical edges, where we could do 1D linear interpolation along the axis of the surface. Bart Wronski has something along these lines here (and the idea of baking a UV offset to be reused by different effects also allows, in general, the use of more complex logic, amortized among effects).

- "Separable" bilateral filters.
Often when depth-aware upsampling is employed we also use depth-aware (bilateral) filters, typically blurs. These are often done in separate horizontal/vertical passes, even if technically such filters are not separable at all.
This is particularly a problem with depth-aware filters because the second pass will use values that are no longer relative to the depths in the low-res depth buffer, but result from a combination of samples from the first pass, done at different depths.

The filter can still look right if we can always correctly reject samples not belonging to the surface of the center texel of the filter, because then the filtered value is anyway from the surface of the center texel; so doing the second pass with a rejection logic that uses the attributes (depth...) at the center of the filtered value sort of works (it's still a depth of the right surface).
In practice though, that's not always the case, especially if the rejection is done with depth distances only, and it causes visible bleeding in the direction of the second filter pass. A better alternative in these cases (if the surface sample rejection can't be fixed...) is to do the separate passes not in a horizontal/vertical fashion but on a staggered grid (e.g. first doing an NxN box filter pass, then a second pass sampling every N pixels in the horizontal and vertical directions).