Preface
For the past decade or two, improvements in videogame real-time rendering fidelity have always moved in lockstep with console releases, as these provide a large install base of relatively powerful (and power-efficient) GPUs that anchor rendering technology and asset authoring.
Thus, now that the next next-generation is knocking at the door, we should start seeing more and more interesting, applied R&D coming out, after the relative slump of the last year or two. Exciting times!
Of course, though, I'm not writing this because we're eager in general about tinkering with new next-gen toys, but because in the last two days we've been going through lots of excitement as we've seen a few great demos, starting a week ago with Sebastien Hillaire's new atmospheric sky simulation, then the Unreal 5 announcement showcasing Nanite's geometry and Lumen's GI, and finally NVidia releasing a new demo of their vision for RTX raytracing and cloud (as in datacenters) rendering.
Of all these, it's undeniable that Nanite is the one that got the most attention, at least from rendering professionals. Why is that? I don't think this is just because of visual appeal, albeit Brian is undeniably in the exclusive pantheon of most handsome rendering engineers (in my heart, only challenged by Drobot), we always expect beauty from Unreal demos.
I think the real difference here between this and many other incredibly smart developments, is that rendering development has been incredibly unbalanced. While we know that we don't know anything about almost anything, when we look at the big picture, it's also undeniable that we spent a lot more effort in certain areas than others.
Shading, since we started this idea of PBR, has seen a ton of attention, ostensibly way past the point where innovation in that realm actually translates to benefits for artists and visual quality for end-users (remember, PBR started as an idea to reduce the amount of weird, ad-hoc controls and shaders we were asking artists to tune!). I would even venture that we know exact solutions for certain specific problems in the shading equation, while never having proved that these were the most significant source of error when looking at the end-to-end pipeline...
Light transport is arguably a much bigger problem, both algorithmically and in its influence on the end quality, and it's far from being solved... But we were not slacking. Most innovation in real-time rendering is in light transport: shadows, various ideas on occlusion and indirect illumination, area lighting and environment lighting, and so on and so forth. Important, yes. Definitely not "solved" yet. But an area where we applied the brute force of hundreds of genius minds for the last decade, arriving now at solutions that are undeniably quite amazing.
Art production on the other hand... We started with objects made of triangles and uv-mapped textures, and we largely are at objects made of triangles and uv-mapped textures. There has been some effort here and there to break free, from Lionhead's "mega meshes" to Carmack's "megatextures", but nothing that ultimately took hold in the industry.
Sure, on the art side we did start working in material layers (Substance et al.), and with acquired data (photogrammetry, BRDF scanning), but that has been happening mostly at a tooling level, with relatively "low tech" ideas (relatively!).
Yet, every time that we see something new (hello Dreams!) in the asset authoring domain, we get excited. We know that this matters, we agree. But we were perhaps still riding on the growth of big games being able to pay for sheer art grunt - even if we should know that saving on the human capital does not mean scaling down our production (necessarily), it means that people can focus their creativity on things that matter more.
Arguably it "had" to come from a third party engine. Games sell to users. Middleware sells to developers. Developer productivity is an end-product feature for an engine. Almost none of the successful gaming middleware was so due to being the most efficient implementation of a given concept, but because they understood (or outright created) the needs of their audience. Engineering excellence is not measured in instructions per cycle but in the ability to deliver amazing value to your customers. Sometimes this implies optimizing for IPC, sometimes it doesn't.
It's painfully obvious, I know, but I feel like it's worth reiterating when many professionals in our line of business used to deride the inefficiencies of Unity, notoriously, and even of course of Unreal itself which was never (and I think could never be) the absolute champion of rendering efficiency. Now both, of course, are hubs of excellence in rendering research.
So. Is this going to "be it"? Way too early to say. But any new piece of R&D in this area is amazing to see.
How does this thing work?
Who knows!
Who cares!
Ok, seriously. I know that we stand on the shoulders of the giants, and this is how progress is made, you don't need to tell me. And we're all eager to lean on Brian's and Graham's and all the rest of Epic's gang, (handsome) shoulders. But let's allow ourselves to be silly and wrong and imagine how this could be made. Puzzles are fun, and having everybody explore a single direction in R&D is a very dangerous thing.
So, let's start.
I think pixel-peeping the UE5 video is not particularly interesting. For what it's worth, I am not entirely sure that such a demo / vertical slice would not just run on a powerful enough GPU with semi-conventional techniques and just eating the memory footprint and killing the rasterizer. Desktop GPUs are so powerful that you can be very dumb and use a fraction of their power, still creating great results (and that's part of why consoles can "bridge the gap" or rather, most PC GPUs end up being not utilized as efficiently, in relative terms).
So, let's start from the basic principles. I'd imagine that this technology is made with the explicit intent of being able to ingest any content (photogrammetry, movie assets, etc), that's to say, arbitrary triangle meshes, of whatever complexity and topology. I also would imagine that it would be ok to have some error when rendering them, as long as it's kept around the pixel/sub-pixel level.
A few things seem to be a hard problem in crafting such technology:
1) How to compress the data, so that we can hopefully still fit in the constraints of physical media (bluray) and we can push it through our vertex pipelines.
2) How to LOD.
2b) How to antialias, as we are effectively side-stepping normalmaps under our assumptions (baking detail from highpoly to intermediate cages).
3) How to render pixel-sized geometric details efficiently.
3b) How to do culling.
Of these problems, I would say the hardest one to talk about is compression. This is both a very well researched topic and something that really interacts with the specifics of the whole system in ways that, at least for me, make it impossible to use as a starting point.
In fact, I'd say that the best approach for this is to start from rendering and work our way up.
Why can't we just use the hardware as is?
Now, why is it hard to render pixel-sized or subpixel triangles?
Well, it isn't of course, a GPU just does it. But for the longest time GPUs have been designed around an almost universal assumption that there is a given pixel-to-triangle ratio (around an average of ten or so pixels per triangle), that is, that triangles do work-expansion. And this is really one of these cardinal ideas around which GPUs are balanced: we "expect" a given ratio between vertices and triangles, triangles to non-culled triangles, and finally triangles to pixels.
Many people, including myself, have been suggesting (to various vendors) the idea that such ratios should change (much like others did, for example, ALU to TEX), but so far, unsuccessfully, and it's not hard to imagine why. We all know that gradients, and thus mipmaps et al., require 2x2 quads, so small triangles generate lots of partial quads, wasting efficiency. Any GPU rendering pixel-sized triangles would literally run at 1/4th of the compute efficiency.
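To put a number on that 1/4th, here is a minimal sketch that counts how many 2x2 quads a triangle touches versus how many pixels it actually covers. Everything in it is an assumption for illustration: the function names `edge` and `quad_efficiency` are made up, coverage is tested at pixel centers with plain edge functions, and there is no top-left fill rule as real hardware would have.

```python
def edge(ax, ay, bx, by, px, py):
    """Twice the signed area of the triangle (a, b, p)."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def quad_efficiency(tri, width, height):
    """Rasterize one 2D triangle at pixel centers and measure 2x2 quad usage.

    Returns (covered_pixels, touched_quads, efficiency), where efficiency is
    covered_pixels / (4 * touched_quads): the fraction of shading lanes that
    do useful work if every touched quad is shaded in full.
    """
    (ax, ay), (bx, by), (cx, cy) = tri
    area = edge(ax, ay, bx, by, cx, cy)
    if area == 0:
        return 0, 0, 0.0  # degenerate triangle, nothing to shade
    sign = 1.0 if area > 0 else -1.0  # accept both windings

    covered = 0
    quads = set()
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = sign * edge(bx, by, cx, cy, px, py)
            w1 = sign * edge(cx, cy, ax, ay, px, py)
            w2 = sign * edge(ax, ay, bx, by, px, py)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                covered += 1
                quads.add((x // 2, y // 2))  # the 2x2 quad this pixel lives in

    if not quads:
        return 0, 0, 0.0
    return covered, len(quads), covered / (4.0 * len(quads))
```

A triangle that covers a single pixel center, such as `((0.2, 0.2), (0.9, 0.3), (0.4, 0.9))`, still occupies a whole quad, so three of its four lanes are helpers. A large triangle such as `((0, 0), (8, 0), (0, 8))` on a 4x4 target fills every quad it touches.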
Well, what if we don't need the gradients you ask? And I asked! But hardware people can be lazy. Once you know that you need quads then you can build your texture units to compute addressing in groups of 2x2, all the way down the memory lane. And you can build your work setup to create waves that in the very worst case will have wave_size/4 different triangles, and so on and so forth going "up" the pipeline.
Killing quads seems easy enough, making all the units up and downstream work with 4x more work isn't. In fact one could fundamentally argue that it would never be a good idea, because the raster is there (among other things) to -create- pixel work.
If you have to have one-to-one raster to compute thread ratio, then one can argue that you wouldn't have a fixed function raster but rather some more instructions that do raster-like things for the compute pipeline... Perhaps near the texture units? I wonder how hardware raytracing works... :)
Ok, so the hardware rasterizer is out of the window. What next? Here is where we start Dreaming...
Look, ma, no (hardware) raster!
Especially after Dreams, I think we all want non-orthodox rendering methods to be the answer. Volumes? SDF? Raymarching? Points? Perhaps we could even create crude shells around the geometry, raster them with the hardware, and leverage hardware raytracing (which ought to be "wider" and leveraging compute more than the fixed-function raster) for visibility inside the shell? I love it!
But as I start thinking of this a bit more, I don't think I would be able to make it work.
Volumes and SDFs are immensely powerful, but for this specific use-case, they would not guarantee that the original geometry is always preserved accurately, which I think is an important constraint. In general, I am always scared of leveraging data structures that require lots of iterations per pixel, but I'm probably wrong.
I really would like people to think more in that direction, but I failed. And the few "debug draw" frames from the Unreal presentation clearly show "arbitrary" triangles.
It is true though that, at least without pixel peeping every single frame, I could not notice any LOD switch / topology change, which with just triangle data sounds somewhat tricky even if triangles are that small.
The thing that "killed" the shell idea for me really though is that I don't think the hardware raster will help, because you'd need to output the depth of the fine geometry from a pixel shader, slowing down, if not killing outright, early depth rejection, which would be the only reason to leverage the hardware path anyways.
And we already know we can do all the culling work in software! We have games already shipped that move object culling, draw culling, triangle cluster, and even individual triangle culling to compute, pre-raster. We know we can already do better than the conventional hardware path when we are dealing with complex geometry, and of course, all the CPU-side culling techniques still apply as well.
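As a rough illustration of the per-triangle end of such a culling pipeline, here is a sketch of two tests that are commonly run before rasterization: backface culling, and rejecting triangles so small that their bounding box contains no pixel center. This is not taken from any shipped engine. The name `cull_triangle`, the counter-clockwise front-face convention, and pixel centers at integer + 0.5 are all assumptions of this sketch.

```python
import math


def pixel_centers_in_range(lo, hi):
    """Number of pixel centers (integer + 0.5) inside the closed interval [lo, hi]."""
    return max(0, math.floor(hi - 0.5) - math.ceil(lo - 0.5) + 1)


def cull_triangle(tri):
    """Classify a screen-space triangle as "backface", "small" or "visible".

    tri is ((x, y), (x, y), (x, y)) in pixel coordinates. Front faces are
    assumed to have positive signed area (an assumption of this sketch).
    """
    (ax, ay), (bx, by), (cx, cy) = tri

    # Twice the signed area; zero (degenerate) is rejected with the backfaces.
    area2 = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    if area2 <= 0:
        return "backface"

    # Conservative small-triangle test: if the bounding box misses every
    # pixel center along either axis, the triangle cannot cover any sample.
    nx = pixel_centers_in_range(min(ax, bx, cx), max(ax, bx, cx))
    ny = pixel_centers_in_range(min(ay, by, cy), max(ay, by, cy))
    if nx == 0 or ny == 0:
        return "small"

    return "visible"
```

The bounding-box test only ever rejects triangles that certainly produce no pixels, so it is safe to run ahead of whatever does the actual rasterization. With dense meshes, a large share of triangles can fall into the "small" bucket.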
What's next? Another "obvious" source of inspiration is REYES, and Karis himself referenced geometry images, which, if seen through the right distortion lens, could remind one of micropolygon grids.
Also, there are in the literature REYES implementations on the GPU, and other experiments with software rasterization, including ideas of "quad-merging" and so on.
So, what does REYES do? Well, the idea there is to create regular grids of vertices, with the squares of the grid roughly pixel-sized. This allows us to then compute all shading on the grid in world space, which literally means shading and geometry detail are one-to-one (sort-of, actual implementations are more complex), while not needing all the topology information of arbitrary meshes (the connectivity is implicit, it's a grid).
It also means we compute shading pre-visibility (works even with path tracing! see Weta's Manuka), and the grid itself will provide a way to compute gradients and it serves as the unit of SIMD work.
Essentially, REYES maps the concepts that we know and love from HW raster (SIMD, gradients) to a different space (world/object instead of screen).
The question is now, do we need all that? I don't think we'd want to do shading this way, in fact, I doubt we want to do shading -at all- in our software solution, as it would be a million times harder than just resolving visibility.
And in the end we still have to rasterize the grid's triangles; the only thing that this provides in our case would be a compression format, essentially, mini-geometry images, and we don't care about that right now.
Karis, on DigitalFoundry, already spelled it out clearly, I think: "The vast majority of triangles are software rasterized using hyper-optimized compute shaders specifically designed for the advantages we can exploit [...]".
Would it be hard to imagine that we can extend Graham's ideas of GPU culling pipelines to distribute chunks of triangles to a software raster? Say that we removed the hard cases, we don't need to do shading, we don't even want to write gbuffers, just visibility (and some triangle IDs).
Chunks of triangles are small enough that one can send them to specialized shaders even bucketed for size or the need for clipping etc.
In the "simplest" case of unclipped, pixel-ish sized triangles, it's not hard to beat the hardware raster with a "simple" shader that computes one triangle per thread.
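To make the "one triangle per thread" idea concrete, here is a CPU-side sketch in Python where each iteration of the outer loop stands in for one GPU thread. It is only a guess at the general shape, not Epic's implementation. The names `rasterize` and `CLEAR` are made up, and packing quantized depth and triangle ID into one 64-bit word, so that a single min resolves visibility, is an assumption of this sketch (on a GPU the `min` would be a 64-bit atomic min).

```python
import math

CLEAR = 0xFFFFFFFFFFFFFFFF  # "nothing drawn": farthest depth, all-ones ID


def edge(a, b, px, py):
    """Twice the signed area of the 2D triangle (a, b, p)."""
    return (b[0] - a[0]) * (py - a[1]) - (b[1] - a[1]) * (px - a[0])


def rasterize(triangles, width, height):
    """Depth-and-ID-only software raster.

    triangles: list of ((x, y, z), (x, y, z), (x, y, z)), with x, y in pixel
    coordinates and z in [0, 1]; smaller z is closer. Triangle IDs must fit
    in 32 bits. Returns a height x width visibility buffer of packed words.
    """
    vis = [[CLEAR] * width for _ in range(height)]
    for tri_id, (a, b, c) in enumerate(triangles):  # one iteration == one thread
        area = edge(a, b, c[0], c[1])
        if area <= 0:
            continue  # backfacing or degenerate

        # Bounding box clamped to the render target; tiny for pixel-sized triangles.
        x0 = max(0, math.floor(min(a[0], b[0], c[0])))
        x1 = min(width - 1, math.floor(max(a[0], b[0], c[0])))
        y0 = max(0, math.floor(min(a[1], b[1], c[1])))
        y1 = min(height - 1, math.floor(max(a[1], b[1], c[1])))

        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                px, py = x + 0.5, y + 0.5
                w0 = edge(b, c, px, py)
                w1 = edge(c, a, px, py)
                w2 = edge(a, b, px, py)
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue  # pixel center is outside the triangle

                # Interpolate depth with the (unnormalized) barycentrics.
                z = (w0 * a[2] + w1 * b[2] + w2 * c[2]) / area
                depth_bits = int(min(max(z, 0.0), 1.0) * 0xFFFFFFFF)

                # Depth in the high bits, ID in the low bits: min keeps the nearest.
                packed = (depth_bits << 32) | tri_id
                vis[y][x] = min(vis[y][x], packed)
    return vis
```

Reading a pixel back is `vis[y][x] & 0xFFFFFFFF` for the triangle ID and `vis[y][x] >> 32` for the quantized depth. No shading and no gbuffer are involved, which is what keeps the inner loop this small.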
In fact, this is certainly true for depth-only rendering, and I know that different people experimented with the idea for shadowmaps at the very least, not sure if it ever shipped in an era where these are often cached or the wasted raster work can be overlapped with other compute via async, but it's definitely possible.
So, moving culling to compute and doing software rasterization in compute is definitely possible, and for small triangles, ought to be fast. One could even dream that some GPUs might have instructions to help visibility computation nowadays... But to be fair, I don't think that's the way here, with what we have right now.
Now come the really tricky things though. The software raster is easy to imagine, but how to do shading is not. I'm rather sure you do not want to write gbuffers or worse, do shading directly from the SW raster. You need to write a "visibility buffer".
As far as I know, here you have two options, you either write draw-IDs, UVs, gradients, and tangent-spaces, or you just write triangle/draw-IDs and barycentrics, then go back to the triangle data to compute UVs.
The first approach limits you to only one set of UVs (realistically) - that might not be a problem in many cases or if say, we had virtual texturing as well. It requires more bandwidth, but you do not need to access the geometric data "twice".
The second decouples visibility from attribute fetching, and thus the vertex attributes can be arbitrary, but in general is more complex and I'd imagine you really don't want to have "fat" attributes when the geometry is this dense anyways, you really have to be careful with your inputs, so I guess it's overall less attractive.
Note that they both impose some limit on the number of draws you can do before you run out of IDs, but that's not a huge issue as one could find ways to "flush" the visibility buffer by resolving the shading (kind of what a mobile GPU does).
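For the second option (IDs plus barycentrics), the resolve step is conceptually tiny. Here is a sketch of it, assuming a plain indexed triangle list and barycentrics that are already perspective-correct. The function name `resolve_uv` and the buffer layout are hypothetical, chosen only for illustration.

```python
def resolve_uv(tri_id, bary, indices, uvs):
    """Recover the UV at a pixel from a visibility-buffer sample.

    tri_id:  triangle ID read from the visibility buffer
    bary:    (b0, b1, b2) barycentric weights, assumed to sum to 1
    indices: flat index buffer, three vertex indices per triangle
    uvs:     per-vertex (u, v) tuples
    """
    # Go back to the geometry: fetch the three vertices of this triangle.
    i0, i1, i2 = indices[3 * tri_id: 3 * tri_id + 3]
    b0, b1, b2 = bary

    # Attribute interpolation is just a weighted sum of the vertex values.
    u = b0 * uvs[i0][0] + b1 * uvs[i1][0] + b2 * uvs[i2][0]
    v = b0 * uvs[i0][1] + b1 * uvs[i1][1] + b2 * uvs[i2][1]
    return (u, v)
```

The same weighted sum works for any other vertex attribute, which is why this option leaves the attributes arbitrary. The price is the second trip to the index and vertex data for every shaded pixel.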
With deferred rendering, we already moved roughly half of the shading out of the raster and into compute tiles. It was never a stretch of the imagination to think that we could move pieces out, and keep just visibility. I guess it's the natural next step to try to entirely get rid of the hardware raster (at least, when dealing with pixel-sized triangles)...
Compute-based culling, plus visibility buffers, plus compute-based raster. I guess what I'm saying is that the future is to implement mobile GPUs in software, huh?
Seriously though, this has been in my mind for a while, and I know too little about GPU hardware to have a real opinion.
I wonder if the fact that we seem to be moving more and more towards concepts that exist in mobile GPUs is proof that their design is smart, or proof that such systems should not fundamentally be part of the hardware, because they can be efficiently implemented in software (and perhaps GPUs could help by providing some specialized instructions and data paths).
Also, would this imply not leveraging two of the new cool GPU features we have now? Mesh shaders, and raytracing? Who knows.
All of this, if it's even in the ballpark of something that can be realistically imagined, is still incredibly hard to actually make work (I mean, even just on paper or in a demo, much less in a production system, complete with streaming and so on).
On mobile hardware, there is a lot of (secret, magical) compression going on behind the scenes on the geometrical data, and this is without pushing pixel-sized triangles...
Of course, all this also implies throwing away quad-based gradients, but I guess that's not a huge stretch if the rest of the shading is mostly about fetching a single texture from a UV, directly. And with raytracing (and honestly even some things in conventional deferred) we threw away quads already.
Or maybe one can take a page from REYES and bake everything into vertex data. One can even imagine that with so much geometrical detail, prefiltering becomes a lost war anyways and one really has to double-down on temporal antialiasing.
Compression and LODs are the hard part in this scheme. Is it possible to extract chunks of regular topology, REYES-style, from arbitrary meshes? Sort of triangle-strips on steroids. Perhaps. Or maybe that doesn't matter as much, as long as you can stream really fast a lot of geometry.
If one managed to compress really well, one could argue that LODs are actually not that hard, especially if it's "just" about changing topology and reusing the vertex data, which for such dense meshes, and with triangles always being almost pixel-sized, it should be possible...
I guess that would be a decent use-case for cloud streaming, where a game could have terabytes of read-only data because they would not need to be delivered to the user's devices, and could be shared among many instances on the same hardware (bandwidth notwithstanding...).
I hope none of (the rather boring) things I wrote makes sense anyways, and they really likely do not, as it's hard to even know what would really be hard, without starting to implement. And I like to dream of what the next Dreams will look like... Maybe we don't need triangles at all! Render it on!
15 May, 2020
15 May, 2019
Seeing the whole Physically-Based picture.
Subtitle: Building our rendering on solidly shaky grounds.
Physically-Based Rendering has won. There is no question about it, after an initial period of reluctance, even artists have been converted and I don't think you can find many rendering systems nowadays, either offline or real-time, that haven't embraced PBR. And PBR proved itself to even be able to adapt to multiple art styles, outside strict adherence to photorealism.
But, really, how much physics is there in our PBR renderers? Let's have a look.
- Optics and Photometry.
Starting from the top, we have to define our physical framework. Physics are models, made to "fit" reality in order to make predictions. Different models are appropriate for different contexts and problems. For rendering, we work with a framework called "geometrical optics".
In G.O. light is composed of multiple frequencies which are assumed to be independent. Light travels in straight lines (in homogeneous media). It changes direction at changes of media (changes of IOR), where it can be absorbed, reflected or refracted. It travels instantaneously and it follows the path of least time (Fermat's principle).
Is this a good framework? It's already making a lot of assumptions, and we know it cannot model all light behavior even when it comes to things that are easily visible: diffraction, interference, fluorescence, phosphorescence. But we say that these phenomena are not that common in everyday materials, and we might be right.
That's not all though, even before we start rendering our first triangle, we make more assumptions. First, we define a color space, usually a trichromatic one, because of the visual system's metamerism. Fine, but we know that's not correct for rendering. We know spectral rendering can give different, sometimes even dramatically different, results, but we trust our artists can tune lighting, materials, and post-processing in the right way (even if the two things shouldn't be related) to generate nice images even if we restrict ourselves to RGB. Or at least, we hope.
- Scattering
Next, we have to define what happens when the light "hits" something (an IOR discontinuity). Well, who knows, light is really hard! Some electrons... resonate? Get polarized? Please let it not be something to do with quantum stuff... Anyhow, eventually they scatter some energy back... waves? particles? There is some interference at around the atomic level. Who knows, luckily, we have another framework that comes to rescue: microfacet theory.
Surfaces are made of microfacets, like a microscopic landscape, light rays hit, bounce around and eventually come out. If we integrate the behavior of said microfacets over a small area, we can compute a scattering probability (BRDF) from the distribution of the microfacets themselves and a lot of math and voila', rendering happens.
Over a small area? How small, by the way? Well, Naty Hoffman and Eric Heitz say around the order of magnitude of the projected area of a pixel. I say, around the order of magnitude of a light wavelength, and then the projected area thing is antialiasing applied "after". So probably it's the pixel thing that's right.
What are these microfacets made of? Ideal reflectors obeying only the Fresnel law for how much light is reflected and how much refracted. The refracted part gets into the material (for dielectrics, that somehow allow this behavior), scatters some more and eventually comes out. If it comes out still "near enough" we call that "diffuse" reflection.
Otherwise, we call that subsurface scattering. But how does the light scatter inside the material? It hits particles. Microflakes? But microfacet based diffuse models (e.g. Oren-Nayar) simply swap the facets from ideal reflectors to ideal diffusers (Lambert)...
Regardless. We know all these things! We have blog posts, Siggraph talks, and books. Physics... And this still is well in that "geometrical optics" framework. Rays of light hit things. So much so that we can create raytracers to brute-force these microscopic interactions and create our own BRDFs!
But is it still reasonable to use geometrical optics for these interactions? They seem to be quite... small. Maybe diffraction now matters? It turns out, it does, and it's a well-known thing (if you read the papers from the sixties... Beckmann-Spizzichino), but we sweep it under the rug.
And well, we can't really derive the equations from the microfacets, that integral is itself hard, so the BRDFs that we use introduce, typically, a bunch of other assumptions. For example, they might disregard the fact that light can bounce around multiple times before "coming out".
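For reference, this is roughly what one of these single-bounce microfacet specular BRDFs looks like once all the math is done. It is a sketch of one common combination, not the only one: the GGX distribution (which comes up again in the conclusions), the separable Smith masking-shadowing term, and Schlick's Fresnel approximation are choices made for this example, and all function names are made up. Vectors are unit-length 3-tuples, `alpha` is the GGX roughness parameter, and `f0` is the reflectance at normal incidence.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def ggx_d(n_dot_h, alpha):
    """GGX (Trowbridge-Reitz) normal distribution function."""
    a2 = alpha * alpha
    d = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * d * d)


def smith_g1(n_dot_x, alpha):
    """Smith masking term for GGX, for one direction (light or view)."""
    a2 = alpha * alpha
    return 2.0 * n_dot_x / (n_dot_x + math.sqrt(a2 + (1.0 - a2) * n_dot_x * n_dot_x))


def schlick_f(f0, v_dot_h):
    """Schlick's approximation of Fresnel reflectance."""
    return f0 + (1.0 - f0) * (1.0 - v_dot_h) ** 5


def specular_brdf(n, l, v, alpha, f0):
    """Single-scattering microfacet specular BRDF: D * G * F / (4 (n.l) (n.v))."""
    n_dot_l = dot(n, l)
    n_dot_v = dot(n, v)
    if n_dot_l <= 0.0 or n_dot_v <= 0.0:
        return 0.0  # light or viewer below the surface
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    d = ggx_d(dot(n, h), alpha)
    g = smith_g1(n_dot_l, alpha) * smith_g1(n_dot_v, alpha)  # separable form
    f = schlick_f(f0, dot(v, h))
    return d * g * f / (4.0 * n_dot_l * n_dot_v)
```

Note how many of the assumptions from the text are baked in: one bounce only, no diffraction, microfacets that are perfect Fresnel mirrors, and an approximate Fresnel on top of that.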
But who cares, nice pictures can be generated with this theory, and that's what matters. Moreover, someone did try to fit the resulting equations to real-world materials, right? The MERL database? I wonder how much error there is in that. Or how much it samples "well" real-world materials. Or how perceptual is the error metric used in estimating the error... Better to not think too much.
- Fiat Lux!
Are we done now? Far from it! In practice, we cannot just use the BRDF and brute-force light rays, not for real-time rendering, we're not Arnold. We need to compute a few more integrals!
We need to integrate over the light source, and over the surface area that is "seen" by the pixel we're considering (pixel footprint). And that is incredibly hard, so hard we don't even try before having introduced a bunch more assumptions and approximations.
First of all, when we talk about pixel footprint, we really mean that we consider some statistics of the surface normals. We don't consider the fact that, for example, the "view rays" themselves change (and the light ones too), or that the surface normals don't really exist as an entity separate from actual surface geometry (which would cause shadowing and all other fun things). We assume these effects to be small.
Then, when we talk about light, we mostly mean simple geometric shapes that emit light from their surface. For example, a sphere. At each point, the light is emitted equally in all directions, and most often, it's also emitted with the same intensity over the surface.
And even then it's not enough to compute everything in closed-form. In fact, the more complex the light is, typically, the more approximated the BRDF will become. And then we'll fit these approximated BRDFs to the "real" one, and sum everything up. And sprinkle some of that pixel footprint thing on top somehow as well, but really that's done once on the "real" BRDF, even if we never actually use that!
So we have an approximation for very small lights, and maybe a good one for spheres, one for lines and capsules with some more handwaving, even more for polygonal lights, especially if textured and lastly one for far, "environment" light... We have approximations for "diffuse" and for "specular", for each of these. And maybe for static versus dynamic objects? A lot of math and different approximations.
We compare them and make sure that more-or-less the material looks the same under different kinds of light, and call it a day... The most ambitious of us might even export a scene to a path-tracer and compute some sort of ground-truth, to visually make sure things are at least reasonable...
- We're done, right?
So... we get our final image. In all its Physically Based, 60fps, HDR glory! Spectacular.
Year after year people come up with better equations, tighter approximations, and we make shinier pixels as a result.
Is that all? Of course not! We are just getting started!
In practice, materials are not just one surface... They can have layers! And they are never optically uniform! They sparkle! They are anisotropic, they have scratches. Really, look around, look at most things. Most things are sparkly and anisotropic, due to the way they are fabricated.
And nothing is a surface, really. It's mostly volume and particles. Even... air! So we need fog and volumetric models. But that's not just about the light that scatters in the air back to our virtual cameras, we should also consider how this scattering affects lighting at surfaces. Our rays of light are not that straight anymore! Participating volumes make our light sources more "diffuse". Bigger. All of them, also things like environment lighting! And... that should affect shadows too right?
And now that we think about shadows... all this complexity and unknowns are still only for what we call "direct" lighting! What about global illumination? What about the million other hacks and assumptions that we rely upon to render each of our frames?
- Conclusions
So. How much physics is there in a frame, really? And more importantly, what's the point of all this? Should we be ashamed of not knowing physics that well? Should we do physics more? Less?
I don't know. I personally do not know physics well and I'm not too ashamed. A lot of what we've been doing is reasonable, really. We went with GGX because its "tail" helps. All the lighting improvements served our products. All the assumptions, individually, looked reasonable.
But, there is a value I think in looking at our math and our approximations holistically, now that we are getting so good at photorealism.
Perhaps there is not too much value for example in going off the deep end of complexity when we think of BRDFs, if we can't then integrate them with complex lighting, or, in order to do so, we have to approximate them again.
Similarly, the features we focus on should be evaluated. Is it more important to have non-uniform emission in our light sources, or a different "tail" in GGX/T-R? Anisotropic surfaces or sparkles? Spectral sampling? Thin-film? Non-lambertian diffuse? Of which kind? Accurate energy conservation or multi-bounce in microfacets?
Is it better to use the best possible approximation for a given integral, even if we end up with many different ones, or should we just use a bunch of spherical gaussians, or LTCs and such, but keep the same representation everywhere? And in general, is most of our error still in the materials, or in the lights? This is very hard to tell from just looking at artist-made pictures, because artists will compensate any way they can!
But even more importantly - How much can we keep relying on simplifying assumptions in order to make our math work?
I suspect to answer these questions we'll need more data. Acquire from the real world. Brute force solutions. Then look at the data and understand what matters, what matters perceptually, what errors are we committing end-to-end, and what we should approximate better and how...
And we should not assume that because somewhere we have a bit of physics, we are doing things correctly. We are, after all, a field that forgot for decades basic things like color spaces and gamma.
30 March, 2019
An unbiased look at real-time raytracing
Evaluating technology without hype or hate...
That would have been the title of my blog post if I published the version I had prepared after the DXR/RTX technology finally became public last year, at GDC 2018.
But alas I didn't. It remained in my drafts folder. Siggraph came and went. Now another GDC, and I finally decided to can that and rewrite it.
Why? Not because I thought the topic wasn't interesting. Hype is easy to give in to. Fear of missing out, excitement about new toys to play with, tech for tech's sake... Hate is equally devious. Fear of change. Comfort zones, familiarity.
These are all very interesting things to think about. And you should. But can I claim I am an expert on this? I don't know, I am not a venture capitalist, and I could say I've been right a number of times, but I doubt that reaches the threshold of statistical significance.
Moreover, being old and grumpy and betting against new technologies is often an easy win. Innovation is hard!
And really, it doesn't matter much. This technology is already in the hardware, and it will stay for the future. It is backed by large companies, and more will come on board for sure. And yes, it could go the way of geometry shaders and other things that tried to work "against" the established GPU architectures, but even for these, we did spend some time to understand how they could help us...
So, let's just assume we want to do some R&D in this RTRT thing and let's ask a different question. What should we be looking for?
The do and do not list of RTRT research.
DO NOT - Think that RTRT will make things simpler, or that (technical) simplicity is an objective. In real-time rendering, the pain comes from within. There's nothing that will stop people spending one month to save 0.1ms in any renderer.
Until power is out of the equation, we will always build complex systems to achieve the ultimate quality vs performance tradeoffs. When people say that shadow maps are hard for example, they mostly mean that fast shadow maps are hard. Nobody prevents us from rendering huge, high precision maps with high-quality filtering. Even rendering from multiple light samples and doing proper area lights. We don't do it, because of performance optimization.
And that's true for all complexity in a real-time renderer. When we add raytracing to the mix we only sign for more pain, hybrid algorithms, code paths, caching schemes and so on. And that's ok. Programmer's pain doesn't really matter much in the logistics of the production of today's games.
Figure: How many rendering techniques can you see? How much pain was spent to save fractions of ms on each?
DO - Think about ray/memory/shading coherency and the GPU implications of raytracing. In other words, optimization. Right now, on high-end hardware, we can probably throw in a few relatively naive raytracing effects and they will work, because these GPUs are much more powerful than the consoles that constrain the scene and rendering complexity of most AAA games. They can render these scenes at obscene framerates and resolutions. So it might not seem a huge price to pay to drop back to 1080p and 60 Hz in order to have nicer effects. But this doesn't mean it's an efficient use of GPU power, and that won't stand long term.
Performance/quality considerations are a great culler of rendering techniques. We need to think about efficient raytracing.
DO NOT - Focus on the "wrong" things. Specular reflections don't matter much. Perceptually they don't! Specular highlights, in general, are a strong indicator of shape and material in objects, but we are not good at spotting errors in the lighting environment that generates them. That's why cubemaps work so well. In fact, even for shiny floors and walls (planar mirrors) with objects near or in contact with them, we are fooled most of the time by relatively simple cheats. We see errors in screen-space reflections only because sometimes they fail catastrophically, and we're talking there about techniques that take fractions of milliseconds to compute. And reflections with raytracing are both too simple and too complex. Too simple, because they are an easy case of raytracing, as rays tend to be very coherent. And too complex, because they require evaluating surface shading, which is hard to do in most engines outside screen-space, and is slow, as triggering different shaders with real-time raytracing is really not hardware friendly.
Figure: Intel's raytraced Wolfenstein demo (http://www.wolfrt.de/), circa 2010.
DO - Think about occlusion on the other hand. It's much more interesting, can be more hardware friendly, definitely is more engine friendly and most importantly it's likely to have a bigger visual impact. Correct shadows from area lights, but also correctly occluding indirect lighting, both specular and diffuse.
DO NOT - Think that denoising will save the day. In the near future, for real-time rendering, it most likely will not. In fact in general denoising (even simple blurring that we sometimes already employ) can lift noise from high frequencies to lower ones, which under animation makes for worse artifacts.
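To make the point about blurring concrete, here is a minimal sketch, not taken from any particular denoiser. It box-blurs one-dimensional white noise and measures how correlated neighbouring samples become. The helper names (`box_blur`, `lag1_autocorrelation`) and the choice of a box filter are assumptions made purely for illustration.

```python
import random

def box_blur(signal, radius):
    """Box-filter a 1D signal, clamping the window at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n - 1, i + radius)
        window = signal[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out

def variance(signal):
    mean = sum(signal) / len(signal)
    return sum((x - mean) ** 2 for x in signal) / len(signal)

def lag1_autocorrelation(signal):
    """Correlation between neighbouring samples: close to 0 for white noise,
    close to 1 when the remaining error varies slowly (low frequency)."""
    mean = sum(signal) / len(signal)
    num = sum((a - mean) * (b - mean) for a, b in zip(signal, signal[1:]))
    den = sum((x - mean) ** 2 for x in signal)
    return num / den

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(4096)]  # stand-in for per-pixel noise
blurred = box_blur(noise, radius=4)

print("variance:", variance(noise), "->", variance(blurred))
print("lag-1 correlation:", lag1_autocorrelation(noise), "->", lag1_autocorrelation(blurred))
```

The blur does lower the variance, but what is left is strongly correlated between neighbours: fine grain turns into blotches. If the noise pattern changes every frame, those blotches crawl, which is one way to read the "worse artifacts under animation" remark.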
DO - Invest in caching and temporal accumulation ideas. Beyond screen-space. These will likely be more effective, and useful for a wide variety of effects. Also, do think about finer-grained solutions to launch work / update caches / update on demand. For this, real-time raytracing might help indirectly because, in order to be performant, it needs the ability to launch shader work from other shaders. That ability, if implemented in hardware and exposed to programmers, could be useful well beyond raytracing, and it's one of the most interesting things to think about when we think of hardware raytracing.
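As a rough illustration of temporal accumulation, the sketch below blends each new noisy sample into a running estimate with an exponential moving average. The function names and the `alpha` parameter are hypothetical. The "cache entry" here is a single number, where a real engine would store one such value per texel, probe or world-space cell, and would also need a way to reject stale history.

```python
import random

def accumulate(history, sample, alpha):
    """Blend a new noisy sample into the running estimate.
    alpha is the weight given to the new sample (smaller = longer memory)."""
    return history + alpha * (sample - history)

def simulate(true_value, noise_amplitude, alpha, frames, seed=0):
    """Feed `frames` noisy samples of a constant signal through the accumulator
    and return the estimate after each frame."""
    rng = random.Random(seed)
    estimate = None
    trace = []
    for _ in range(frames):
        sample = true_value + rng.uniform(-noise_amplitude, noise_amplitude)
        # The first frame has no history, so it simply takes the sample.
        estimate = sample if estimate is None else accumulate(estimate, sample, alpha)
        trace.append(estimate)
    return trace

trace = simulate(true_value=0.5, noise_amplitude=0.5, alpha=0.1, frames=200)
print("first frame:", trace[0], "last frame:", trace[-1])
```

A smaller `alpha` gives a smoother result but reacts more slowly when the underlying value actually changes. That trade-off is why finer-grained, on-demand cache updates are interesting.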
DO NOT - Make the wrong comparisons! RTX on / RTX off tells a lie, because what we can't see with "RTX off" is what the game could look like if we allocated all the power that RTX needs to pushing conventional techniques or even simply more assets. There are a lot of techniques we don't use today because we don't think they are on the right side of the quality/performance equation. We could use them, but we prefer to push more assets instead.
If you want to be persuasive about raytracing, proper comparisons should be made. And proper comparisons should also take into account that rasterization without shading (visibility only) leaves compute units available for other work to be done in parallel.
RTX hardware isn't free either! It costs chip area, even if you don't use it, but there's nothing we can do about that...
DO NOT - Assume that scene complexity is fixed. This is a corollary of the previous point: for overall visual impact we should always ask, at the very least, whether simply pushing more stuff is better than pushing a particular idea for "shinier" stuff, because scene complexity is far from having "peaked".
Figure: Offline rendering might (might!) be essentially complexity-agnostic today. Real-time, not quite. (frame from Avengers Infinity War)
DO - Think about cases where raytracing could outperform rasterization at its own game. This is hard, because raytracing likely will always have a quite high cost, both because of the memory traffic that is required to traverse the spatial subdivision structures, and because it uses the compute units, while the rasterizer is a small piece of hardware that can operate in parallel. But, that said, raytracing could win in a couple of ways.
First, because it's much more fine-grained. For example, refreshing very small areas in a shadow map could perhaps be faster with a raytracer. Another way to say this is that there are certain cases where the number of pixels we need visibility for is much smaller than the number of primitives and vertices we'd need to traverse in a rasterizer.
The second thing to think about is how raytraced visibility goes wide, using the compute units and thus, the entire GPU. The rasterizer, on the other hand, can often be the bottleneck. And even if in many cases we can overlap other work to keep the GPU busy, that is not true in all cases!
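The first point above, few pixels versus many primitives, can be sketched with a deliberately crude cost model. Everything here is an assumption made for illustration: the cost constants are invented, the rasterizer is modeled as touching every submitted triangle, and a ray is modeled as walking about log2(N) levels of a balanced hierarchy. None of these are measurements of real hardware.

```python
import math

def raster_cost(num_triangles, num_pixels, cost_per_triangle=1.0, cost_per_pixel=0.1):
    """Toy model: every submitted triangle is processed, plus the covered pixels."""
    return num_triangles * cost_per_triangle + num_pixels * cost_per_pixel

def raytrace_cost(num_triangles, num_pixels, cost_per_node=4.0):
    """Toy model: one ray per pixel, each visiting roughly log2(N) hierarchy levels."""
    depth = math.log2(max(num_triangles, 2))
    return num_pixels * cost_per_node * depth

def cheaper_method(num_triangles, num_pixels):
    """Return which of the two toy models is cheaper for this workload."""
    if raytrace_cost(num_triangles, num_pixels) < raster_cost(num_triangles, num_pixels):
        return "raytrace"
    return "raster"

# Refreshing a small 64x64 region against a million triangles...
print(cheaper_method(1_000_000, 64 * 64))
# ...versus rendering a full 2048x2048 map of the same scene.
print(cheaper_method(1_000_000, 2048 * 2048))
```

The exact crossover depends entirely on the made-up constants. The only thing the sketch is meant to show is the shape of the argument: the rasterizer's cost has a term that does not shrink with the number of pixels requested, while the raytracer's cost scales with it.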
DO - Think about engineering costs if you want the technology to be used now. It's true that programmer's pain doesn't matter. But at the moment RTX covers a tiny slice of the market. Programmers could find their pain in completing more important tasks... Corollary: think about fallback techniques. If we're moving an effect to RTX, how will we render it on GPUs that don't support it? Will it look very different? Will it make authoring more painful? That is something we generally can't afford.
In general, be brutally honest about costs and feasibility of solutions. This is a good rule in general, but it is especially true for an emerging technology. You don't want to burn developers with techniques that look good on paper, but fail to ship.
DO - Establish collaborations. Real-time raytracing is probably not going to sell more copies of a game. Nor is it going to save costs or make authoring more effective, at least if we're talking about uses in the runtime (an exception could be uses in the artist tools themselves, e.g. to aid lightmap baking and/or previewing). It currently targets only a small audience, and you'll gain nothing by jumping on this too early.
So, you probably should not pull your smartest R&D engineers from whatever they're doing to jump on this unless you have some very persuasive outside incentives... If not, you likely won't have many people to do raytracing related things.
Thus, you should probably see if you can leverage collaborations with external research groups...
11 March, 2019
Rendering doesn’t matter anymore?
Apologies. I wanted to resist the clickbait title, but I couldn’t find anything much better...
And no, I’m not renouncing my ways as a rendering engineer, I’m not going to work on build systems or anything like that. Nor do I believe that real-time rendering has “peaked” or that our pace and progress in image quality has seen slowdowns. There is still a ton of work to do, and the difference between good and bad graphics can be dramatic...
But what I want to talk about a bit more (I mentioned this in my previous post) is what matters, and how do we decide that. ROI, perhaps an ugly term, but it gets the job done.
From product.
I’ve spent most of my now thirteen-year-old professional career in videogames working on production teams. A.k.a. making games. And I’ve helped make lots of games: I actually average a game per year, even when I was in production, which is quite unusual I guess.
Now, when you are in production, things are relatively simple. Ok, no, they are anything but. What I mean is that it is straightforward... Ok, maybe still not the best description.
You start with some sort of rough plan. Hopefully, the creative people have ideas, they present them to you, and you start making a sketch. What are the risks, what to experiment with first, which tasks are better known.
Unless you are bootstrapping an engine from scratch or doing major tech changes, mostly you’ll be asked for a ton of features, things people want. An unreasonable amount of them. Ludicrous.
So you go on and prioritize, estimate, shuffle things until you have some plan that makes sense. It won’t, but we know that, we start working and as things change, we re-adjust that plan, kicking features off the list and moving things up the priority list...
So you get a gigantic amount of work to do, you get on the ride and off you go, fighting fires as they happen, course-adjusting and bracing yourself for the landing. For the most part. There are some other skills involved here and there, but mostly it’s about steering this huge ship that has both a ton of momentum and the worst controls ever.
Naturally, there isn’t much time to think about philosophical questions and other bullshit like that. In fact, plenty of times the truth is that you start losing control over the priorities, even.
That neat idea of reshuffling your list becomes more like a rough sort, and you don’t even necessarily have time or energy to understand why people who are asking for things need these things...
Figure: Production, on a good day.
If you go around and look at big enough productions, one pattern you will notice is that people start working without knowing the “why” of things. Which leads, needless to say, to quite sub-optimal solutions. But the production beast is an organic one, it’s unclean, it’s made of people and opinions and blood and sweat. Engineering is the art of handling all that and still shipping a great game, and it looks nothing like any idealized version of beauty some programmers might hold dear.
To technology.
Then you move to some cushy job in some central technology department, right? And now you have a problem. You have time, at least, sometimes.
You might want to work on things that help, or have a chance to help, more than a single product. If you do R&D, you will be doing things that have more risks and unknowns. In general, you aren’t so strongly tied to that list of features people are shuffling around day after day. Even when you are doing the only reasonable thing, which is to be attached to a product, you are not that close, you can’t be as you’re not part of the core team.
This is an opportunity because you can have some time and freedom, but also a huge risk because, in the end, the product is all that matters. Being singularly focused on production is not necessarily the best strategy for great products, because that monster swallows and consumes everything, focused on getting “more”, but straying away too much is the road to masturbatory efforts that can be irrelevant at best, dangerous most often.
So, you start thinking of ROI. What should I do? What’s best? You probably have things from multiple teams that could be done, and you have other things that you can persuade teams they should want...
In my case, being a rendering person, the question boils down to, what matters in rendering? How do I estimate how much a thing weighs? When you move from “vfx artists want this particle trail thing and you have to do it tomorrow” to look at things with an iota of horizon, how do you decide?
Rendering doesn’t matter...
...like it used to. Once upon a time, rendering made the games. Even more than that, entire genres. Doom, of course, is the obvious example, but there are many. The CD-ROM FMV game era. The hardware sprite and scrolling background fuelled platformers, shooters, and so on.
Figure: Chances are your engine won't create the next big videogame genre.
Then that ended, we arrived at a point where we had enough computing hardware that videogame genres are not defined by technology anymore. Perhaps this will change with VR/AR but for now let’s ignore them (they’re not hard to ignore either, these days).
But we still had a period where technology could be product defining. Call of Duty running at 60fps on ps3 and 360, for example, was quite unique, and that technical characteristic was instrumental to the product. Today doing a 60fps title is the norm, to ship at 30 is almost a gutsy move...
Rendering is thus restricted in the narrower field of aesthetics. It’s just... graphics. Sad if you think of that, right?
Well of course not! We have an ace up our sleeves, see. It’s true that technology is not genre-defining anymore, but AAA productions are insanely graphics-intensive. We love our computer graphics, and the number of people dedicated to their care and feeding is enormous. Everything is good again in the universe, rendering engineering reigns supreme.
So this is the first order of attack of the ROI problem. There are lots of things that are measurable in people and hours and dollars. These, pretty much, will automatically win over anything else. Let’s put them in the bucket of “really important stuff”.
By the way, when I say “measurable”, I don’t mean you can measure them or that you will. You most definitely will not! What I mean is that you could think of them and have a strong feeling they relate to said measurable quantities...
Chasing shiny things.
So I said you can bucket things. Things that are required to ship the game first. Things that help people second. Third, you get all your shiny things, which are, incidentally today what you could call graphics R&D. A good part of the stuff I do!
Should we stop doing that? No, of course I will never admit to that, c'mon.
But more seriously, it obviously can’t be that simple. There will never be an end to things that “help people”: even in the best possible scenario you can still make progress, nothing is ever perfect. So obviously you will reach a point where some rendering effect trumps a tiny pipeline improvement, at least that is a given!
Moreover, it is not as though computer-graphics techniques, even purely visual ones, do not help content production. We could point at the obvious trend of physically-based rendering, and how that helped (after a lot of growing pains everyone had to go through) to curb the explosion of hacks and ad-hoc controls that we previously needed to create assets.
But even smaller things can give artists more freedom. Antialiasing, for example, might mean that geometry and other sources of discontinuity can be used more leniently, without turning the frame into an undecipherable mess.
Not only are there diminishing returns for productivity improvements, as for anything else, but the split point between features and productivity is often tricky. We definitely do not wait till everything is perfect before pushing more features out, the production monster wants to be fed.
And we shall never, ever discount the gigantic effects of familiarity, the other big scary monster. It is not worth sacrificing everything to it, but we should respect it. To use a technique well, to master it takes a long time. Changing things, even if entirely for the better, with no drawbacks whatsoever, still implies that we need to pay the (often huge) costs of loss of familiarity.
So? How do you decide? How do you measure? Then again. You do not.
I hope he won’t mind me saying, this is one of the paths to enlightenment forced on me by Christer, my former boss. How to put this. He has his tricks, not quite koans... So I learned that when he wasn’t persuaded about the opportunity of something, he would go and ask me to put things in more systematic ways, to try to narrow down that ever elusive “ROI”.
Then one time I think we were even arguing about how he could decide that a given initiative he was supporting would, in the end, be beneficial or the better course compared to another alternative. And he slipped and said that we don’t necessarily have to quantify this ROI thing! Of course, we both immediately caught that, even if we were over the phone he could almost sense my smile, but being the clever man he is, he managed to still be right despite the apparent contradiction...
The lesson is that we want to keep in mind that ROI thing. Not that we need to necessarily optimize for it and spend too much time chasing it. But we definitely need to keep it in mind, be always scared of the risk of doing irrelevant, or worse, damaging things. Keep ourselves accountable.
It’s the question, not the answer.
You could be excused for thinking that I put the question mark in the title, even if it isn’t in the form of a question, because of my poor English. But no, it was a clever thing you see, I actually went back halfway into writing this, and thought about it, and finally changed the punctuation. Only after deciding I would also write this, and feel so meta-clever. And again and ok, let’s stop this recursive loop...
And if I was really good at this, I could have jumped directly to the point and spared you all the blabbing, but I have time on my hands these days so. You’re welcome.
In the end, it is true that certain games should even chase diminishing returns because that’s what you do when you’re up enough. And it’s totally true that you can’t really quantify ROI anyway, so often times you should just do what you want. If someone really thinks something is important, and it’s not offensively bad, there should be space for that. In other words, because we know we are bad at ROI, we should realize that to chase it we should not chase it all the time (surprisingly, this is even a concept in optimization algorithms, by the way).
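One reading of that parenthetical (my assumption, the text does not name the concept) is the exploration versus exploitation trade-off. A minimal sketch is an epsilon-greedy chooser: most of the time it picks the option with the best estimated payoff, but a fraction `epsilon` of the time it tries something at random. The payoffs and names below are hypothetical.

```python
import random

def epsilon_greedy(true_payoffs, epsilon, steps, seed=0):
    """Repeatedly choose among options whose payoffs are unknown to the chooser.
    Returns how many times each option was picked."""
    rng = random.Random(seed)
    n = len(true_payoffs)
    counts = [0] * n
    estimates = [0.0] * n  # the chooser's current guess of each payoff
    for _ in range(steps):
        if rng.random() < epsilon:
            choice = rng.randrange(n)                           # explore: ignore the estimates
        else:
            choice = max(range(n), key=lambda i: estimates[i])  # exploit: chase the best guess
        reward = true_payoffs[choice]  # deterministic payoff keeps the sketch simple
        counts[choice] += 1
        # Running mean of the rewards seen so far for this option.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return counts

payoffs = [0.2, 0.5, 0.9]  # hypothetical "ROI" of three initiatives
print("always chase the estimate:", epsilon_greedy(payoffs, epsilon=0.0, steps=1000))
print("sometimes do what you want:", epsilon_greedy(payoffs, epsilon=0.1, steps=1000))
```

With `epsilon=0.0` the chooser locks onto the first option that pays anything at all and never learns the others are better. A little unguided exploration is what lets it find the best one.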
But! The questions are interesting.
How important are shiny things? Is there a point where state-of-the-art techniques become so complex that they are unfriendly to content creators or to the programmers integrating/iterating on them, so much so that they will be used sub-optimally? And simpler solutions would actually have been better instead?
Think for example of something perfectly physically accurate, that can produce perfect images, but that behaves poorly when the inputs are not exact. This is not even such a wild scenario, you can see plenty of PBR games that would have been most likely better off without copy-and-pasting the GGX formulas, just because they now go nuclear with specular and aliasing...
Figure: Bloodborne might not be the pinnacle of RTR, but it is imaginative...
Even more interesting. Is there a point where the attention to graphical perfection actually produces worse graphics? Could it be, for example, that the efforts required to create worlds that are perfect, truly great quality-wise, get in the way of creating worlds that have the variety, the artistry, the iteration and look that in the end are most often correlated to what people think of as great graphics?
Again. In the end, we should remember that we serve the product. Not photorealism, per se, but the product. We do believe that photorealism is a great tool to create games, and I won’t question that. But still we have to remember that photorealism is not the goal, technology per se is useless. It’s the product, that we work for.
And if I had to guess, I'd say in most products today both end-user image quality and in most cases, performance, are bottlenecked by asset production, not the lack of whatever latest cool rendering trick. In particular by:
- The sheer ability of authoring assets. Quantity / Variety.
- The ability of iterating on assets. Quality.
- The complexity of technical issues linked to art assets. Which in practice yields sub-optimal decisions. Performance & Quality.
- And the very fixed granularity of assets and their editing tools, the overall inability of performing large, sweeping art changes. The more an environment is "dressed" (authored) the more it hardens and resists change. Art direction. (and perhaps this also causes an over-reliance on some of the few tools that can do said sweeping changes, namely, post-effects)
N.B. All these are rendering problems! Implementation, research, even hardware innovation. Despite the title, the argument here is not that rendering research in videogames is a waste of time, or beyond diminishing returns. Au contraire! It's more vital than ever, in our times of enormous asset pressure. But we have to think hard about what is useful to the end product.
To make a stupid example. A very smart system to automatically generate rendering meshes from artist data (LODs, materials, instances etc) is probably orders of magnitude more important than say, a post-effect...