Tuesday, April 8, 2014

How to make a rendering engine

Today I was chatting on Twitter about an engine and some smart guys posted a few links that I want to record here for posterity. Or really to have a page I can point to every time someone uses a scene graph.

Remember kids, adding pointer indirections in your rendering loops makes kittens sad. More seriously, if in DirectX 11 and lower your rendering code performance is not bound by the GPU driver then probably your code sucks. On a related note, you should find that multithreaded command buffers in DX11 make your code slower, not faster (they can be used to improve the engine's parallelism, but they are currently only slower for the driver to use, and your bottleneck should be the driver).

Links below are all about the idea of ditching state machines for rendering, encoding all state for each draw and having fixed strings of bits as the encoding. I won't describe the concept here, just check out these references:

Some notes / FAQ answers. Because every time I write something about these systems people start asking the same things... I think because so many books still talk about "immediate" versus "retained" 3D graphics APIs and the "retained" is usually some kind of scenegraph... Also scenegraphs are soooo OOP and books love OOP.
  • Bits in the keys are usually either indices in arrays of grouped state (e.g. camera/viewport/rendertarget state, texture set, etc...) or direct pointers to the underlying 3d API data structures
    • So we are following pointers anyways, aren't we? Yes of course, but the magic is in the sort, it not only will help minimize state changes but also guarantees that all accesses in the arrays are (as-)linear(-as possible)!
    • Of course if you for example sort strictly over depth (not in depth chunks), then you have to accept jumping between materials at each draw, and the accesses over these might very well be random.
      • If that's the case try to avoid indirections for these and store the relevant data bits directly in the draw structure.
      • Another solution for this example case is to sort the material data in a way that is roughly depth coherent, i.e. all materials in a room are stored near each other. In theory you could also dynamically sort and back-patch the pointers to the material data in the game code, but we're getting too complex now...
    • The same can't be guaranteed for resource pointers (GL, DX...): even if the pointers are linearly ordered they might be far away in memory, that's unavoidable. On consoles you have control over where resources are allocated even for GPU stuff so you can pack them together, but even more importantly you can directly store the pointers that the GPU needs w/o intermediate CPU data structures.
  • You don't need to have a single array of keys and sort it!
    • Use buckets, i.e. some bits of the key index which bucket to use. Bucketing per rendertarget/pass is wise
    • "Buckets", a.k.a. separate lists. In other words don't be shy to have a list per subsystem, nobody says there should be one solution for all the draws in your engine.
      • This is usually a good idea also because draw-emitting jobs can and should be sequenced by pass, e.g. in a deferred renderer maybe we want first a rough depth-prepass, then g-buffer, then shadows... These can be pulled in pass-order from the visibility system
      • Doing the emission per pass means we can kick the GPU as soon as the first pass is done. Actually if we don't care for perfect sorting, and we really care about kicking draws as soon as possible, we can even divide each pass in segments and kick draws as soon as the first segment is done.
      • I shouldn't say it but just in case... These systems allow you to generate draws in parallel, obviously, and also to sort in parallel and to generate GPU commands in parallel, quite easily. Just keep lists per thread, sort per thread, then merge them all (the only sync point), then split into chunks and create GPU command lists per thread (if you have an API where these are fast...)
  • You don't need to use the same encoding for all keys!
    • Some bits can decide what the other bits mean. Typically per rendertarget/pass you need to do very different things, e.g. a shadowmap render pass doesn't need to care about materials but might want to use some more bits as a z-key for depth sorting
    • Similarly, you can and should have specialized decoding loops
  • Not all the bits in the key need to be used for sorting
    • Bits of state that directly map to the GPU and don't incur overhead when set should not be part of the sorting, they will just slow it down.
  • Culling: make the keys be part of the visibility system
    • When a bounding primitive is finally deemed to be visible, it should add all the keys related to drawing its contents
    • At that point you want also to patch in the bits related to the projected depth, for depth sorting
  • Hierarchical transforms
    • Many scenegraphs are used as a transformation hierarchy. It's silly, on most engines a tiny fraction of objects need that, mostly the animation/skinning system for its bones. Bones do express a graph, but it's not enough of a reason to base your -entire- rendering system on it.
  • Group state that is (almost) always set together in the same bits
    • E.G. instead of having separate bits (referring to separate state structures) for viewport, rendertarget, viewworldprojection constant data and so on, merge all that in a single state structure.
  • Won't I need other rendering "commands" in my list? Clears? Buffer copies? Async CPU jobs waits? Compute shaders? Async compute shaders...
    • All of these can be part of "on bind" properties of certain parts of the state. E.G. when the bits pointing to the rendertarget/pass change we look up in that state structure to see if the newly set rendertarget has to be cleared
    • In practice as you should "bucket" your keys into different arrays processed by different decode loops, these decode loops will know what to do (e.g. the shadowmap decode will make sure the CPU skinning jobs are finished before trying to draw and so on)
  • Are there other ways?
    • Yes but this is a very good starting point...
    • Depends on the game. A system like this is good when you don't know what draws you'll have, typically because they come from a visibility system which can't spit them out in the right order and/or because of parallel processing.
    • Games/systems where you can easily generate GPU commands in the right order and you exactly know which state changes are needed, obviously can sidestep all this architecture. E.G. Fifa, being a soccer game, doesn't need to do much visibility and knows exactly how each player is made in terms of materials, thus the code can be written to exactly process things in the right order... Something like this would be reasonable for Frostbite, but you won't use Frostbite for Fifa...
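
To make the notes above a bit more concrete, here's a minimal C++ sketch of the idea: pack per-draw state into a fixed-width key, sort, and let a decode loop rebind state only when the relevant bits change. The 64-bit layout, the field widths and every name (MakeKey, SortAndDecode, DecodeStats) are assumptions made up for illustration, and the loop only counts binds instead of calling a real graphics API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical key layout, most significant bits first:
//   pass/bucket (4) | material index (16) | depth (24) | mesh index (20)
constexpr int kMeshBits = 20, kDepthBits = 24, kMaterialBits = 16, kPassBits = 4;
constexpr int kDepthShift = kMeshBits;
constexpr int kMaterialShift = kDepthShift + kDepthBits;
constexpr int kPassShift = kMaterialShift + kMaterialBits;
static_assert(kPassShift + kPassBits == 64, "fields must fill the key exactly");

inline std::uint64_t MakeKey(std::uint32_t pass, std::uint32_t material,
                             std::uint32_t depth, std::uint32_t mesh) {
    // Each field is an index into some array of grouped state, so it must fit.
    assert(pass < (1u << kPassBits));
    assert(material < (1u << kMaterialBits));
    assert(depth < (1u << kDepthBits));
    assert(mesh < (1u << kMeshBits));
    return (std::uint64_t(pass) << kPassShift) |
           (std::uint64_t(material) << kMaterialShift) |
           (std::uint64_t(depth) << kDepthShift) |
           std::uint64_t(mesh);
}

inline std::uint32_t PassOf(std::uint64_t k) {
    return std::uint32_t(k >> kPassShift);
}
inline std::uint32_t MaterialOf(std::uint64_t k) {
    return std::uint32_t((k >> kMaterialShift) & ((1u << kMaterialBits) - 1));
}
inline std::uint32_t MeshOf(std::uint64_t k) {
    return std::uint32_t(k & ((1u << kMeshBits) - 1));
}

struct DecodeStats {
    int passBinds = 0;
    int materialBinds = 0;
    int draws = 0;
};

// Sort the keys, then walk them linearly. State is (re)bound only when the
// bits that refer to it differ from the previous draw.
inline DecodeStats SortAndDecode(std::vector<std::uint64_t>& keys) {
    std::sort(keys.begin(), keys.end());
    DecodeStats stats;
    std::uint32_t lastPass = ~0u, lastMaterial = ~0u;
    for (std::uint64_t k : keys) {
        if (PassOf(k) != lastPass) {
            lastPass = PassOf(k);
            lastMaterial = ~0u;     // force a material rebind in the new pass
            ++stats.passBinds;      // "on bind" work (clears etc.) would go here
        }
        if (MaterialOf(k) != lastMaterial) {
            lastMaterial = MaterialOf(k);
            ++stats.materialBinds;  // shaders/textures would be set here
        }
        ++stats.draws;              // the draw for MeshOf(k) would be issued here
    }
    return stats;
}
```

With five draws spread over two passes and three pass/material combinations, the decode loop above does two pass binds and three material binds regardless of the order in which the keys were emitted. A real system would likely use different layouts per bucket and a radix sort, as discussed in the notes.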

Wednesday, February 26, 2014

Valve VR. I want to believe.

Premise: Spoiler alert, I guess. Albeit it might not matter, it's worth noting that if you're going to experience either Valve's or Rift's Crystal Cove demo anytime soon (GDC perhaps) you might want to consider approaching it without any preconceived notion that this or other articles might give. Also, lengthy as usual.

Valve fanboy, I am not. In fact I might say I still hold a grudge against it, I want Valve to make games, not stickers to slap on PCs in a marketing stunt with the hope of moving some units of a platform in general decline. I understand that Steam makes money, but on a personal level I can't really care...

So this morning, after waking up way too early for my habits, I arrive at Valve's building and can't help muttering "so cheap" as I look for the street number and make sure that I'm in the right place. Of course it's silly of me as money is not the reason why they didn't bother to put a logo on the facade nor the name on the directory, but anyhow I digress... The "I" in the titular "I want to believe" here refers not to my faith, but to how my brain reacted to the system.

Now, it's hard to put an experience into words, especially something as novel as this, as there aren't easy analogies to make or frames of reference in most people's experiences to anchor to. I'll try my best to explain how it works, or how I think it works.

Visuals are a tricky beast. Rendering engineers are often deeply embedded in very technical concerns, but at the end of the day, what really matters is psychology. What happens, when stars align and things go right (because, mind you, we're really far from making this a science and the little science there is most of the time is still very far from mingling with entertainment) is that we create a visual experience that somehow tickles the right neurons in our brain to "evoke" a given atmosphere, sensations, feelings that we learned in real life and are "replayed" by these stimuli.
Mostly for me that happened with environments, I guess because we're really discriminating when it comes to characters: Call of Duty, Red Dead Redemption, Kentucky Route Zero are all great examples.

When we try to achieve this via photorealism we hope that by simulating what's perceptually important, what we can notice in a scene, the light, the materials, the shapes and so on, we reach a point where our brain accepts the image we generated as a reproduction of reality, like a photo. And we hope that our artists can use this tool that gets us close to reality to more easily be able to fool our brains into producing emotions, because we're nearer to the paths that normally fire when we experience things in the real world.

Inside Virtual Reality
A good VR experience completely sidesteps all this. Abrash has it right: when things align in VR you get presence, and it's an infinitely more powerful tool. Presence is the sensation that you are there; that virtual wall is ten meters from you. And it is really unquestionable. Realism doesn't matter anymore.
The VR prototype suffered from all kinds of defects: it's clearly "low resolution", a lot of demos didn't have lighting, most had only lightmaps or diffuse, no specular, most had no textures, I could see banding and even some aliasing, I could spot errors in the lightmaps, even quite clearly the ghosts from the OLEDs and so on and on. A nightmare for a rendering engineer; on a 2D screen you would have called them the worst visuals ever.
Yet you were fooled into thinking you were there, not through realism but through the immersion that is possible with the low-latency, full head tracking (rotation AND position) stereo sauce Valve has implemented. I suspect there are a million things to get just right, we discussed how OLEDs provided enough dynamic range that you didn't question the scene much for example and they have a catalog of things that are crucial and things you can ignore.

The most succinct way to describe this is that in all media before this, at best you had the impression of looking at a greatly detailed, technologically advanced reproduction of reality (think of the best immersive stereo projection movie you've ever seen for example). Here even when you're looking at really basic renderings you think you are in some sort of weird, wrong alternate world, similar, if you wish, to certain installations, rooms that play with light and shapes to create some very unusual experiences.

The demo environments Valve created (or actually I should say, I witnessed) were quite tame. Clearly they didn't want to push it to avoid certain people reacting negatively to the experience. Most of the time the scene was static, not in the sense of devoid of animation but as in a fixed room you could move in but that wasn't moving relative to you. Things never got scary and I didn't interact in any way with the simulation (even if I, numerous times, instinctively reached out with my hands towards objects and avoided objects getting too near me), yet there were some intense moments.

I guess a few people now described a scene where you're in the simplest room possible, no shading, no lighting, yet you start on a small ledge and you can't avoid feeling vertigo and have to actively force yourself to step into the "void". You know it's not real, at an intellectual level. You have all the possible visual hints saying it's not real, yet your brain tells you otherwise.
Another, switching for the first time to a scene with some animated robots, at the moment of the switch I had a second of high alertness, as your primitive brain steps in and rushes to prepare for a suspicious activity.
The weirdest sensation was at the end, flying (very slowly, as apparently motion quite easily creates discomfort) through CDAK. There the visuals were distorted enough (see the youtube video linked) that I didn't feel as much presence (so there are some extreme cases where visuals can break it), yet when some of the weird blue objects passed through me I had, again for a split second, a sensation that I could only later rationalize as reminiscent somehow of being in a sea, I guess because that's the only thing I have in my experience of going through something like that.

When does presence break? Visuals can be pushed quite far before they break presence. Mind you, rendering will be a huge issue there and good rendering I am sure does make a difference to remove the idea that you are immersed in something quite odd, but again, simple visuals don't break immersion. I would also have loved to see a scene with and without various rendering effects (visual hints) but alas, no such luck.
Even the low resolution and blur and ghosts are to me defects of my vision, like seeing through glasses or through a dirty motorbike helmet, not a problem of the "reality" of the scene. Impossible behaviours do. In one of the scenes for example there were some industrial machines at work behind a glass wall. Poking my head through the wall into the room breaks it. 
It's not like suddenly losing stereo, as in one of the cross-eyed stereograms where you focus on the page and lose the effect. The closest analogy I can think of is Indiana Jones' leap of faith, and you might have experienced something similar with some visual tricks in theme parks: you realize it's an illusion.

There are a myriad of ways of doing something that breaks the presence, many things that are totally acceptable on a normal display are intolerable in VR. You might know that you can't really use normal-maps or any other non-stereoscopic hints of detail for example, but you are also much more aware of errors in animation, physics, not to mention characters (which weren't demoed at all).
And it's good if the consequence of an error is only bringing you back to the idea that you're in a VR helmet; the bad cases are when certain hints are very strong but certain others are completely wrong or missing, like falling without acceleration, wind and so on, as these can cause discomfort.

In conclusion, it was better than I expected, as I expected all the visual issues to have a bigger impact. I think it can be used to create incredible, amazing experiences, that will feel like nothing ever felt before. And it obviously has a lot of applications outside entertainment as well.
I think and hope all this research will also be useful for traditional image synthesis, as for the first time we really have to systematically study perception and how our brain works, and not just be lucky with it. Also certain technological advances, for example in low-latency rendering systems, will directly apply to traditional games as well.

I also think that it will be still for a long while a very niche product, or if it will succeed it will be due to a killer app that doesn't look in any shape or form like a traditional game. For certain technological issues we can clearly see a roadmap (weight, tracking, resolution, lag and so on), but for certain others we don't have any idea yet: mostly controls, but also how to deal with all the situations where our brain is accustomed to having more sensory hints than just what the eyes tell.
Even tiny things, like the fact that with position tracking we can pass right through everything, are quite an issue to solve. Fast movement is hard as well, it exacerbates the technical issues (lag, refresh rates and so on) to a degree that even "cockpit" games are hard (not to mention the lack of acceleration), even worse if you have to move your body in any athletic way as it's easy to get the VR system out of the optimal alignment it needs for crisp vision.

I don't think games can be "ported", an FPS in VR will be much more of a gimmick than e.g. an FPS with virtual joysticks on an iPhone. We will need radically new stuff, low movement (for now at least, later on maybe some cockpit games can work well enough for the masses), novel ways of interaction (gaze for example can work decently, wands do work great... kinect-like stuff is very laggy and thus limited to only gesture recognition, not direct manipulation right now), new experiences...

It will probably be for a few early adopters, but I'm quite persuaded to be among them, just to be able to create weird environments that feel real.

P.S. I saw the number "3" multiple times in Valve offices. You certainly know what that means...

Monday, February 10, 2014

Design Pattern: The Push Updater

Just kidding.

Let me tell you a story instead. Near the end of Space Marine we were, as it often happens at these stages, scrambling to find the extra few frames per second needed for shipping (I maintain that a project should be shippable from start to end, or close, but I digress).

Now, as I did such a good job (...) at optimizing the GPU side of things it came to a point where the CPU was our bottleneck, and on rendering we mostly were bound by the number of material changes we could do per frame (quite a common scenario). Turns out that we couldn't afford our texture indirections, that is, traversing for each material the pointers that led to our texture classes, which in turn contained pointers to the actual hardware textures (or well, graphics API textures). Bad.

Most of the thrashing happened in distant objects, so one of the solutions we tried was to collapse the number of materials needed in LODs. We thought of a couple of ways in which artists could specify replacement materials for given ones in an object, collapsing the number of unique materials needed to draw an LOD (thus making the process easier than manual editing of the LODs). Unfortunately it would have required some changes in the editor which were too late to do. We tried something along the lines of baking textures to vertex colors in the distance (a very wise thing actually) but again time was too short for that. Multithreading the draw thread was too risky as well (and on consoles we were already saturating the memory BW so we wouldn't get any better by doing that).

In the end we managed to find a decent balance by sorting stuff smartly, asking for a lot of elbow grease from the artists (sorry) and doing lots of other rearrangements of the data structures (splitting by access, removing some handles and so on). We ended up shipping a solid 30fps and were quite happy.

A decent trick is to cache the frequently-accessed parts of an indirection, together with a global counter that signals when any of the objects of that kind changed and a local copy of the counter. If you saw no changes (local copy of the counter equals global) you can avoid following the indirection and use the local cache instead... This can get more complex by having a small number of counters instead of just one, hashing the objects into buckets somehow, or keeping a full record of the changes that happened and have a way for an object to see if its indirection was in the list of changed things... We did some smart stuff, but that was still a sore point. Indirections. These bastards.
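
For the simplest variant of that trick (one global counter, one local copy per user), a minimal C++ sketch could look like the following. All the names here (Texture, MaterialSlot, g_textureVersion, SetApiHandle) are hypothetical and this is not the code we shipped; the refetches field exists only to show when the indirection is actually followed.

```cpp
#include <cassert>
#include <cstdint>

// Stands in for the "texture class" that wraps the graphics API texture.
struct Texture {
    std::uint32_t apiHandle = 0;
};

// One global counter, bumped every time any texture changes.
static std::uint32_t g_textureVersion = 1;

inline void SetApiHandle(Texture& t, std::uint32_t handle) {
    t.apiHandle = handle;
    ++g_textureVersion;  // signals "something of this kind changed"
}

struct MaterialSlot {
    const Texture* texture = nullptr;  // the indirection we'd like to skip
    std::uint32_t cachedHandle = 0;    // cached copy of the hot data
    std::uint32_t seenVersion = 0;     // local copy of the counter (0 = never synced)
    int refetches = 0;                 // demonstration only

    std::uint32_t Handle() {
        assert(texture != nullptr);
        if (seenVersion != g_textureVersion) {
            // Something changed somewhere: follow the pointer and refresh.
            cachedHandle = texture->apiHandle;
            seenVersion = g_textureVersion;
            ++refetches;
        }
        return cachedHandle;  // common case: no pointer chase at all
    }
};
```

Calling Handle() repeatedly with no texture changes in between follows the pointer once; after any SetApiHandle call every slot refetches on its next access, which is the cost of having a single coarse counter instead of several hashed ones.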

So, wrapping up the project we went to the drawing board and started tinkering and writing down plans for the next revision of the engine. We estimated that without these indirections we could push 40% more objects each frame. But why do you have these to begin with? Well, the bottom line is that you often want to update some data that is shared among objects. In case of the textures the indirection served our reference counting system which was used to load and unload them, hot-swapping during development and streaming in-game.

Here comes the "push pattern" to the rescue. The idea is simple, instead of going through an indirection to fetch the updated data, create an object (you can call it UpdateManager and create it with a Factory and maybe template it with some policies, if that's what turns you on) that will store the locations of all the copies of a piece of data (sort of like a database version of a garbage collector), so every time you need to make a copy or destroy a copy you register this fact. Now if create/destroy/updates are infrequent compared to accesses, having copies all around instead of indirections will significantly speed up the runtime, while you can still do global updates via the manager by poking the new data in all the registered locations.

A nifty thing is that the manager could even sort the updates to be propagated by memory location, thus pushing many updates at once with potentially fewer misses. This is basically what we do in some subsystems in an implicit way. Think about culling for example, if you have some bounding volumes which contain an array of pointers to objects, and as these bounding volumes are found visible you append the pointers to a visible object list, you're "pushing" an (implicit) message that said objects were found visible...
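
As a rough illustration of the push idea, here is a small C++ sketch. It assumes the copied data is a plain 32-bit handle and that resources are identified by an integer id; UpdateManager, Register, Unregister and Push are names invented for this example, and whoever owns a copy is responsible for unregistering it before the copy moves or dies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

using ResourceId = std::uint32_t;  // hypothetical resource identifier
using Handle = std::uint32_t;      // the piece of data that gets copied around

class UpdateManager {
public:
    // Call when a copy of the data is created at `location`.
    void Register(ResourceId id, Handle* location) {
        assert(location != nullptr);
        copies_[id].push_back(location);
    }

    // Call before a copy is destroyed or moved.
    void Unregister(ResourceId id, Handle* location) {
        auto it = copies_.find(id);
        if (it == copies_.end()) return;
        auto& v = it->second;
        v.erase(std::remove(v.begin(), v.end(), location), v.end());
    }

    // Poke the new value into every registered copy; returns how many
    // locations were written.
    std::size_t Push(ResourceId id, Handle value) {
        auto it = copies_.find(id);
        if (it == copies_.end()) return 0;
        auto& v = it->second;
        // Write in address order, which may reduce cache misses.
        std::sort(v.begin(), v.end(), std::less<Handle*>());
        for (Handle* p : v) *p = value;
        return v.size();
    }

private:
    std::unordered_map<ResourceId, std::vector<Handle*>> copies_;
};
```

The rendering loop then reads its own local copy with no pointer chase, and the rare events (streaming, hot-swapping) pay for the bookkeeping. This only makes sense when updates are infrequent compared to accesses, as said above.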

Thursday, February 6, 2014

OT: The Sony RX1 (versus Fuji X100s)

I recently bought a used (new is way too expensive) Sony RX1, so it's time for another photographic off-topic. I'll keep it short, follow the links if you want to know more.

Throughout this I'll share some meaningless images of my first week with the RX1
- Mhmmm, GOOD

First and foremost, the lens. Or if you prefer, lens/sensor combination, because these days lenses mean nothing if not seen on a specific sensor. Case in point, all modern mirrorless cameras from the Sony NEX to Micro 4/3 to the recent Sony A7 fullframe in theory can mount a lot of extremely expensive glass from Leica and Zeiss, in practice nothing but the Leica M seems to perform well with these, especially if you go 35mm or wider.
Yes you can correct some of the defects (and really the Leica M does apply some correction in-camera, that's also why it works better) but honestly, if you're attaching a thousands-of-dollars lens to a body, you don't want anything less than perfection.

So, how does the Sonnar do? One word: wonderful. It's The reason to buy this camera. Easily the best 35mm I've personally used, film or digital, and among the best lenses I've ever seen. 
It's a "people" lens, optimized near and wide open, and it doesn't care about avoiding distortion (and it shouldn't, these things are easy to correct digitally and lens optimization should focus on other issues nowadays). From f2 to f4 you basically gain only less vignetting. Some people say it's somewhat worse at infinity, I didn't pixel-peep that but I can believe it, still it's an excellent performer, with a dreamy, smooth quality.

Loves people
Second, the handling and build quality. It's very, very good. Still not perfect because you know, no way a modern camera ships with all the controls done right, so you still need two presses for manual focus assist/zoom and you don't have a direct ISO wheel nor an absolute shutter speed wheel (i.e. with markings), but the aperture ring is excellent (fly-by-wire but it feels mechanical, third-stop clicks) and the manual focus one is very good as well. Also, plenty, plenty of customizable buttons. Good rear LCD quality. Well done.

Last but not least, the sensor. Nothing really to say, it's a 24 megapixels full-frame and it delivers exactly what you expect. Very low noise, lots of details, it's quite normal for such a sensor but in a "compact" camera it's not something you see often (in fact, this is the smallest full-frame camera you can buy, period).

No light? Not a problem
- Eeeew, BAD

It's a toy. A wonderful, mesmerizing toy, but not a "professional" camera. At best a second body to have around just for its lens/sensor combo. That doesn't mean you can't take amazing photos with it in any possible condition, it's just not done in a way where I would depend on it if it was critical, if a missed shot meant failure. Why?

I want a "Monochrom" version of this camera!
First, it lacks a viewfinder! Oh, but you can buy an external one... Let's see, you have two bad options there.
The first is to go with an EVF, which not only is an EVF (duh) but also attaches to the flash hot-shoe, leaving you without the option of attaching an external flash.
The second is to buy an OVF, but even Sony's own is just a pure optical VF, with no indicators, not even focus confirmation. Which means it's utterly useless, because there are only two scenarios, really. Either you're out on a sunny day, and then you can/have to use a fast shutter and the three contact points given by a viewfinder are not important (certainly not enough to lose all indicators!) or you're out at night, and then you would want to grip the camera better, but focus is critical, you're probably wide-open and can't even trust the autofocus... So yes, no indicators means no photo.

All in all the EVF is still an important accessory to buy, even with its defects, it's your best bet, and it helps if you want to be able to review images in bright light. I really hope Sony will make an OVF with some electronics in it... but I know it won't happen, maybe when they do an RX2...

Second, it's slow. Especially the autofocus. Not the slowest ever but among the slowest; the good thing is that even if it's slow it seems to truly not lie. If it's green, it's in focus. Autofocus systems that lie, saying it's in focus while it's not, are worse than anything else. Still, the slowness is problematic, especially in low light: it requires areas with lots of contrast, more than you are used to, and if you're photographing a moving subject it can be frustrating. Also, very bright regions seem to be able to confuse it (I guess because the out of focus highlights still register as contrast, but it's only a guess).
Luckily it's quite easy to toggle AF/MF by assigning that to one of the buttons which makes the camera much more usable: just AF then lock and do MF adjustments as needed.


The way I see it, there are only two alternatives in its class, or maybe even just one if we wanted to be totally fair.

The X100s is prime competition. A bit unfair because the X100s will fit in a coat pocket while the RX1 doesn't, so it's in a different size range (useful ranges are: fits in pants, fits in coat, everything else) but still it holds its own and in practice many people will wonder what to do between the X100s and the RX1.

Having used both of them I can say, the X100s hybrid viewfinder is the reason to buy that camera, the RX1 lens is the reason to buy this. Interestingly if you look on dpreview (my first stop for camera reviews) you could think, browsing the studio shots, that there's not much difference between the X100s lens/sensor and the RX1, or anyways that it's really hard to tell while pixel-peeping. Well, it's wrong, in real life the Sony camera is on an entirely different planet.

The RX1 also feels better built and has a less erratic AF, but the X100s AF is quite a bit faster... The RX1 has probably better peaking (due to a better LCD really) but the X100s has the excellent virtual split image aid... All in all? I prefer the RX1, because even if the X100s is almost in every way better to shoot with (and mostly because of the incredible viewfinder, really almost everything else is worse, handling-wise) when you get the image the Sony image quality shines quite a bit ahead of the Fuji.

The second, more fair competition is Sony's own A7 and A7r... but I didn't try these so I can't say much. For now, they have no lenses really, just a zoom that seems quite mediocre and a 35mm f2.8 which seems good but not as excellent as the RX1 f2, and really these differences are "worth" (or maybe I should say cost...) thousands. Yes you can buy adaptors, which are worthless, as they make the camera/lens combination bigger (and you don't want that in a camera that's supposed to be compact!) and often perform not-so-great.

Also the A7 focal plane shutter won't flash-sync as fast as the RX1 and won't be as quiet and vibration-less. In fact the A7r shutter seems to be suffering severe vibration issues and it's quite on the loud side (even without the mirror!) while the A7 seems better in that regard.
For now the A7/A7r are toys with no real advantage over the RX1, but if (and it's a big if, as that was one of the really sore points of the Nex line) Sony makes some truly great compact lenses for them, then the situation will change significantly.

P.S. It's actually good that the X100, X100s, RX1 and A7/r are "flawed gems". They come used for much cheaper than new even if they are new-ish cameras, as some people can't deal with the flaws and some others will enjoy the discount... Compare this with well-established DSLRs, I can't think of any reason for example to change my 5d-II, in fact even if it's a much older camera, and new went for around the same price, today you can find the RX1 selling for around the same price as the 5d!

Saturday, January 18, 2014

In the next-generation everything will be data (maybe)

I've just finished sketching a slide deck on data... stuff. And I remembered I had a half-finished post on my blog tangentially related to that, so I guess it's time to finish it. Oh. Oh. I didn't remember it was rambling so much. Brace yourself...

Rant on technology and games.
Computers, computing is truly revolutionary. Every technological advance has been incredible, enabling us to deal with problems, to work, to express ourselves in ways we could have never imagined. It's fascinating, and it's one of the things that drew me to computer science to begin with.

Why am I writing this? Games are high-tech, we know that, is this really the audience for such a talk? Well. Truth is, really, we aren't that much. Now I know, the grass is always greener and everything, but really in the past decade or so technology surprised me yet again and turned things on their head. Let's face it, the web won. Languages come and go, code is edited live, methodologies evolve, psychology, biometrics, a lot of cool happens there, innovation. It's a thriving science. Well, web and trading (but let's not talk of evil stuff here for now) and maybe some other fields, you get the point.

Now, I think I even know why: algorithms make money in these fields. Shaving milliseconds can mark the success or death of a service. I am, supposedly, in one of the most technical subfields of videogame programming: rendering. Yet it's truly hard to say whether an innovation I might produce does make more money on a shipped title. It's even debatable what kind of dent in sales better visuals as a whole do make. We're quite far removed, maybe a necessary condition, at best, but almost always not sufficient.

Now, actually I don't want to put our field down that much. We're still cool. Technology still matters and I'm not going to quit my job anytime soon and I enjoy the technological part of it as well as the other parts. But, there's space to learn, and I think it's time to start looking at things with a different perspective...

An odd computing trick that rendering engineers don't want you to know.
Sometimes, working on games, engineers compete on resources. Rendering takes most, and the odd thing is we can still complain about how much animation, UI, AI, and audio take. All your CPU are belong to us

To a degree we are right, see for example what happens when a new console comes out. Rendering takes it all (even struggling), gameplay usually fits, happy to have more memory sitting around unused. We are good at using the hardware, the more hardware, the more rendering will do. And then everybody complains that rendering was already "good enough" and that games don't change and animation is the issue and so on.

Rendering in other words, scales. SIMD? More threads? GPUs? We eat them all... Why? Well, because we know about data! We're all about data. 

Don't tell people around, but really, at its best rendering is a few simple kernels that go through data wrapped hopefully in an interface that doesn't upset artists too much. We take a scene of several thousands of objects and we find the visible ones from a few different points of view. Then we sort them and send everything to the GPU.
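To make the "few simple kernels" idea concrete, here is a minimal sketch of the cull-then-sort step. Everything in it is an assumption made for illustration: the `Drawable` record, the bounding-sphere test, the plane convention (normals pointing into the frustum) and the sort key (material first, then depth) are not taken from any particular engine.

```python
from dataclasses import dataclass

@dataclass
class Drawable:
    name: str
    center: tuple      # bounding-sphere center (x, y, z), hypothetical layout
    radius: float
    material_id: int

def sphere_visible(center, radius, planes):
    # Each plane is (nx, ny, nz, d) with the normal pointing inside the frustum.
    # A sphere is rejected only when it lies fully behind one of the planes.
    for nx, ny, nz, d in planes:
        distance = nx * center[0] + ny * center[1] + nz * center[2] + d
        if distance < -radius:
            return False
    return True

def build_draw_list(objects, planes):
    # Kernel 1: filter the scene down to the visible objects.
    visible = [o for o in objects if sphere_visible(o.center, o.radius, planes)]
    # Kernel 2: sort so that draws sharing a material end up adjacent,
    # with depth along z as the tie-breaker (an assumed key layout).
    visible.sort(key=lambda o: (o.material_id, o.center[2]))
    return visible
```

The resulting list is what would then be walked linearly to submit draws; the point is only that both steps are plain passes over arrays of data.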

Often the most complex of all this is loading and managing the data and everything that happens around the per-frame code. The GPU? Oh, there things get even more about the data! It goes through millions of triangles, transforms them to place them on screen and then yet again finds the visible ones. These generate pixels that are even more data, for which we need to determine a color. Or roughly something like that.

The amount of data we filter through our few code "kernels" is staggering, so we devote a lot of care to them.

Arguably many "unsuccessful" visuals are due to trying to do more than it's worth doing or it's possible to do well. Caring too much for the number of features instead of specializing on a few very well executed data paths. You could even say that Carmack has been very good at this kind of specialization and that made his technology have quite the successful legacy it has.

Complexity and cost.
Ok all fine, but why should we (and by we I'm imagining "non-rendering" engineers) care? Yes, "gameplay" code is more "logic" than "data", that's maybe the nature of it and there's nothing wrong with it. Also wasn't code a compressed form of data anyhow?

True, but does it have to be this way? Let's start talking about why it maybe shouldn't. Complexity. The less code, the better. And we're really at a point where everybody is scared about complexity, our current answer is tools, as in, doing the same thing, with a different interface.

Visual programming? Now we're about data right? Because it's not code in a text editor, it's something else... Sprinkle some XML scripting language and you're data-oriented.
So animation becomes state machines and blend trees. AI becomes scripts, behaviour trees and boxes you connect together. Shaders and materials? More boxes!

An odd middle ground, really we didn't fundamentally change the way things are computed, just wrapped them changing the syntax a bit, not the semantics. Sometimes you can win something from a better syntax, most of these visual tools don't, as now we have to maintain a larger codebase (a runtime, a custom script interpreter, some graphical interfaces over them...) that expresses at best the same capabilities as pure code.
We gain a bit when we have to iterate over the same kind of logic (because C++ is hard, slow, and so on) but we lose when we have to add completely new functionalities (that require modifications to the "native" runtime and to be propagated through tools).

This is not the kind of "data-driven" computation I'll be talking about and it is an enormous source of complexity.

Data that drives.
Data comes in two main flavours, sort of orthogonal to each other: acquisition and simulation. Acquired data is often too expensive to store, and needs to be compressed in some way. Simulated (generated) data is often expensive to compute, and we can offset that with storage (precomputation).
Things get even more interesting when you chain both i.e. you precompute simulated data and then learn/compress models out of it, or you use acquired data to instruct simulated models, and so on.

Let's take animation. We have data, lots of it, motion capture is the de-facto standard for videogame animation. Yet, all we do is clean it up, manually keyframe a bit, then manually chop, split, devise a logic, connect pieces together, build huge graphs dictating when a given clip can transition into another, how two clips can blend together and so on. For hundreds of such clips, states and so forth.
Acquisition gets manually ground into the runtime, and simulation is mostly relegated to minor aesthetic details. Cloth, hair, ragdolls. When you're lucky collisions and reactions to them.

Can we use the original data more? Filter, learn models. If we know what a character should do, then can we search for the most "fitting" data we have automatically, an animation that has a pose that conserves what matters (position, momentum) and goes where we want to go... Yes, it turns out, we can. 
Now, this is just an example, and I can't even begin to scratch the surface of the actual techniques, so I won't. If you do animation and this is new to you, start from Popovic ("Continuous Character Control with Low-Dimensional Embeddings" is to date the most advanced of his "motion learned from data" approaches, even if kNN-based solutions or synthesis of motion trees might be most practical today) and explore from there.
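As a toy illustration of the "search for the most fitting data" idea (the kNN flavour mentioned above, with k = 1), here is a minimal sketch. The feature layout (position and velocity components per frame) and the weights are assumptions for the example only; real systems use far richer pose features and acceleration structures.

```python
def pose_distance(a, b, weights):
    # Weighted squared distance between two feature vectors of equal length.
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def best_matching_frame(database, query, weights):
    # Brute-force nearest neighbour: index of the stored frame whose
    # features are closest to what we want the character to do.
    return min(range(len(database)),
               key=lambda i: pose_distance(database[i], query, weights))

# Hypothetical database: each entry is (pos_x, pos_y, vel_x, vel_y).
frames = [
    (0.0, 0.0, 1.0, 0.0),
    (1.0, 0.0, 0.0, 1.0),
    (2.0, 2.0, 1.0, 1.0),
]
desired = (1.1, 0.1, 0.0, 0.9)
match = best_matching_frame(frames, desired, weights=(1.0, 1.0, 2.0, 2.0))
```

The weights are where "what matters" gets encoded: here velocity mismatches are penalized twice as much as position mismatches, which is an arbitrary choice for the sketch.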

All of this is not even completely unexplored, AAA titles are shipping with methods that replace hardcoding with data and simulation. An example is the learning-based method employed for the animation of crowds in Hitman: Absolution.
I had the pleasure of working for many years with the sports group at EA, which surely knows animation and AI very well, shipping what was at the time I think one of the very few AAA titles with a completely learning-based AI, Fight Night Round 4.
The work of Simon Clavet (responsible for the animation of Assassin's Creed 3) is another great example, this time towards the simulation end of the spectrum.

What I'd really wish is to see if we can actually use all the computing power we have to make better games, via a technological revolution. We're going to really enter a "next generation" of gaming if we learn more on what we can do with data. In the end it's computer science, actually all there is to it. Which is both thrilling and scary, it means we have to be better at it, and how much there is to learn.
  • Data acquisition:  filtering, signal processing, but also understanding what matters which means metrics.
    • Animation works with a lot of acquisition. Gameplay acquires data too, telemetry but also some studios experiment with biometrics and other forms of user testing. Rendering is just barely starting with data (e.g. HDR images, probes, BRDF measurements).
    • Measures and errors. Still have lots to understand about Perception and Psychology (what matters! artists right now are our main guidance, which is not bad, listen to them). Often we don't really know what errors we have in the data, quantitatively.
    • Simulation, Visualization, Exploration.
  • Representation, which is huge, everything really is compression, quite literally as code is compressed data, we know, but the field is huge. Learning really is compression too.
  • Runtime, parallel algorithms and GPUs.
    • This is what rendering does well today, even if mostly on artist-made data.
    • Gather (Reduce) / Scatter / Transform (Map)
    • For state machines (Animation, AI) a good framework is to think about search and classification. What is the best behaviour in my database for this situation? Given a state, can I create a classification function that maps to outcomes? And so on.
In the end it's all a play of shifting complexity from authoring to number crunching. We'll see.

Tuesday, December 17, 2013

Mental note: shadowmap-space filters

A thought I often had (and chances are many people did and will) about shadows is that some processing in shadowmap space could help for a variety of effects. This goes from blurring ideas (variance shadow maps and variants) to the idea of augmenting shadowmaps (e.g. with distance-to-nearest-occluder information).

I've always discarded these ideas though (in the back of my mind) because my experience with current-gen told me that often (cascaded) shadowmaps are bandwidth-bound. To a degree that even some caching schemes (rendering every other frame, or tiling a huge virtual shadowmap) fail because the cost of re-blitting the cache in the shadowmap can exceed the cost of re-rendering.
So you really don't want to do per-texel processing on them, and it's better instead to work in screenspace, either by splatting shadows in a deferred buffer and blurring, or by doing expensive PCF only in penumbra areas and so on (i.e. with min/max shadowmap mipmaps to compute trivial-in shadow and trivial-out shadow cases and branch).
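A minimal sketch of the min/max idea mentioned above: classify a receiver against the minimum and maximum depth stored in a shadowmap tile, and only pay for PCF when the result is ambiguous. The tile addressing, the depth convention (smaller values are closer to the light) and the function names are assumptions for illustration; on a GPU the min/max would come from a precomputed mip chain rather than being recomputed per query.

```python
def tile_min_max(shadow_map, x0, y0, size):
    # Stand-in for one texel of a min/max mip: the depth range over a square tile.
    texels = [shadow_map[y][x]
              for y in range(y0, y0 + size)
              for x in range(x0, x0 + size)]
    return min(texels), max(texels)

def classify_receiver(receiver_depth, tile_min, tile_max, bias=0.0):
    d = receiver_depth - bias
    if d <= tile_min:
        return "lit"        # nothing in the tile is closer to the light
    if d > tile_max:
        return "shadowed"   # every texel in the tile occludes the receiver
    return "penumbra"       # mixed result, needs proper filtering

def shadow_term(shadow_map, x0, y0, size, receiver_depth, bias=0.0):
    lo, hi = tile_min_max(shadow_map, x0, y0, size)
    kind = classify_receiver(receiver_depth, lo, hi, bias)
    if kind == "lit":
        return 1.0
    if kind == "shadowed":
        return 0.0
    # Expensive path: percentage-closer filtering over the whole tile.
    d = receiver_depth - bias
    lit = sum(1 for y in range(y0, y0 + size)
                for x in range(x0, x0 + size)
                if d <= shadow_map[y][x])
    return lit / (size * size)
```

The early-outs are what make the branch worthwhile: most pixels are trivially in or out of shadow, so only penumbra regions run the filter loop.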

It seems though that lately caching schemes are becoming practical (probably they are already for some games on current-gen, by no means can my experience on the matter in Space Marines be representative of all graphic loads).
In these cases it seems logical to evaluate the possibility of moving more and more processing in shadowmap space. 

Then again, a reminder that a great metaheuristic for graphics is to try to reframe the same problem in a different space (screen, light, UV, local, world... pixel/vertex/object...)

Just a thought.

Friday, December 13, 2013

Never Again in Graphics: Unforgivable graphic curses.

Well known, zero cost things that still are ignored too often.

Do them. On -any- platform, even mobile.

  • Lack of self-occlusion. Pre-compute aperture cones on every mesh and bend the normalmap normals, change specular occlusion maps and roughness to fit the aperture cone. The only case where this doesn't apply is for animated models (i.e. characters), but even there baking in "t-pose" isn't silly (makes total sense for faces for example), maybe with some hand-authored adjustments.
  • Non-premultiplied alpha.
  • Wrong Alpha-key mipmaps computed via box (or regular image) filters.
  • Specular aliasing (i.e. not using Toksvig or similar methods).
  • Analytic, constant emission point/spot lights.
  • Halos around DOF filters. Weight your samples! Maybe only on low-end mobile, if you just do a blur and blend, it might be understandable that you can't access the depth buffer to compute the weights during the blur...
  • Cartoon-shading-like SSAO edges. Weight your samples! Even if for some reason you have to do SSAO over the final image (baaaad), at least color it, use some non-linear blending! Ah, and skew that f*cking SSAO "up", most light comes from sky or ceiling, skewing the filter upwards (shadows downwards) is more realistic than having them around objects. AND don't multiply it on top of the final shading! If you have to do so (because you don't have a full depth prepass) at least do some better blending than straight multiply!
  • 2D Water ripples on meshes. This is the poster child of all the effects that can be done, but not quite right. Either you can do something -well enough- or -do not do it-. Tone it down! Find alternatives. Look at reference footage!
  • Color channel clamping (after lighting), i.e. lack of tonemapping. Basic Reinhard is cheap, even in shaders on "current-gen" (if you're forced to output to an 8-bit buffer... and don't care that alpha won't blend "right").
  • Simple depth-based fog. At least have a ground! And change the fog based on sun dot view. Even if it's constant per frame, computed on the CPU.
If you can think of more that should go in the list, use the comments section!
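Three of the items in the list above fit in a line of math each. The following is a minimal sketch, written per channel on plain floats rather than as shader code: basic Reinhard (c / (1 + c)), the "over" operator for premultiplied alpha, and the Toksvig factor that lowers the specular power when the mip-averaged normal is shorter than unit length. Function names and the tuple layout (r, g, b, a) are assumptions for the example.

```python
def reinhard(c):
    # Basic Reinhard tonemapping of one linear color channel: maps [0, inf) to [0, 1).
    return c / (1.0 + c)

def over_premultiplied(src, dst):
    # "Over" blending with premultiplied alpha: src color is already
    # multiplied by src alpha, so no extra multiply on the source term.
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    k = 1.0 - sa
    return (sr + dr * k, sg + dg * k, sb + db * k, sa + da * k)

def toksvig_power(avg_normal_length, spec_power):
    # Toksvig: ft = |Na| / (|Na| + s * (1 - |Na|)), adjusted power = ft * s.
    # |Na| is the length of the filtered (unnormalized) normal, in (0, 1].
    ft = avg_normal_length / (avg_normal_length + spec_power * (1.0 - avg_normal_length))
    return ft * spec_power
```

A quick sanity check on the behaviour: a unit-length normal leaves the specular power untouched, while a shortened normal (bumpy area seen from far away) widens the highlight instead of letting it alias.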

Thursday, December 12, 2013

Shit people say: graphics have "peaked"

If you think that rendering has peaked, it's probably good. Probably it means you're not too old and haven't lived through the history of 3d graphics, where at every step people thought that it couldn't get better. Or you're too old and don't remember anymore...

Really, if I think of myself on my 486sx playing Tie Fighter back then. Shit couldn't get any better. And I remember Rebel Assault, the first game I bought when I had my first CD-ROM reader. And so on and on (and no, I didn't play only Star Wars games, but at the time LucasArts was among the companies that made all the must-buy titles... until the 360 I've always been a "computer" gamer, nowadays I play only on consoles).

But but but, these new consoles launched and people aren't that "wowed" right? That surely means something. We peaked, it happened.

I mean, surely it is not that when the 360 and later PS3 came out games weren't looking incredibly much better than what we had on PS2, right? (if you don't follow the links, you won't get the sarcasm...). And certainly, certainly the PS2 launch titles (the console was touted as more powerful than an SGI... remember?) blew late PS1 titles right out of the water. I mean, it wasn't just more resolution.

Maybe it's lack of imagination. As I wrote, I was the same, many times as a player I failed to imagine how it could get better. To a degree I think it's because videogame graphics, like all forms of art, "speak" to the people of their time, first and foremost. Even if some art might be "timeless" that doesn't imply that its meaning remains constant over time, it's really a cultural, aesthetic matter which evolves over time.
Now I take a different route, which I encourage you to try. Just go out, walk. See the world, the days, the nights. Maybe pick up a camera... How does it feel? To me, working to improve rendering, it's amazing. Amazing. I could spend hours walking around and looking in awe and envy at the world we can't yet quite capture in games.
Now think if you could -play- reality, tell stories in it. Wouldn't it be a quite powerful device? Won't it be the foundation for a great game?

Stephen Shore, one of the masters of American color photography

Let me be explicit though, I'm not saying that realism is the only way, in the end we want to evoke emotions, and that can be done in a variety of ways, I'm well aware. Sometimes it's better to illustrate and let the brain fill in the blanks, emotions are tricky. Take that incredible masterpiece that is Kentucky Route Zero which manages to use flat-shaded vector graphics and still feel more real than many "photorealistic" games. 
It's truly a game that every rendering engineer (and every other person too) should play, to be reminded of what are the goals we are working for: pushing the right buttons in the brain and trick it to remember or replay emotions it experienced in the real world. 
Other examples you might be more accustomed to are Call of Duty (most of them) and Red Dead Redemption, two games that are (even if it's very questionable actually) not as technically accomplished as some of the competition but manage to evoke an atmosphere that most other titles don't even come close to.

At the end of the day, photorealism is just a "shortcut", as if we have something that spits realistic images for every angle and every lighting, it's easier to focus on the art, the same way that it's cheaper to film a movie rather than hand paint every frame. It's a set of constraints, a way of reducing the parameters space from the extreme of painting literally every pixel every frame to more and more procedural models where we "automate" a lot of the visual output and allow creativity to operate on the variables left free to tuning (i.e. lighting, cinematography and so on). 
It is -a- set of constraints, not the -only- one. It's just a matter of familiarity, as we're trying to fool our brains into firing the right combinations of neurons, it makes some sense to start with something that is recognizable as real, as our lives and experiences are drawn from the real world. But different arguments could be made (i.e. that abstraction helps this process of recollection), this would be the topic of a different discussion. If your artists are more comfortable working in different frameworks there is a case to be made for alternatives, but when even Pixar agrees that physics are a good infrastructure for productive creativity then you have a quite strong "proof" that it's indeed a good starting point.

Diminishing returns... It's nonsense. As I said, every day I come back home from the office, and every day (or so) I'm amazed at the world (I'm in Vancouver, it's pretty here) and how far we still have to go to simulate all this... No, it's not going to be VR the next step (Oculus is amazing, truly, even if I'm still skeptical about a thing you have to wear and for which we have no good controls), there is still a lot to do on a 2d screen. Both in rendering algorithms and in pure processing power. Yes we need more polygons please. Yes we need more resolution. And then more power on top of that to be able to simulate physics, and free our artists from the shackles of needing to eyeball parameters and hand-painted maps and so on...

And I don't even buy the fact that rendering is "ahead" and other things "lag" behind. How do you even make the comparison?
AI is "behind" because people in games are not as smart as humans? Well, quite unfair to the field, I mean, trying to make something look like a photo, versus something behave like a human, seems to be a bit easier to me.
Maybe you could say that animation is behind because well, things look much worse in motion than they do when they are static. But, not only part of that is a rendering problem, but it just says exactly that, things in motion are "harder" than static things, it doesn't mean that "motion" lags behind as a field...
Maybe you can say we implemented more novel techniques in rendering than we did in other fields, animation didn't change that much over the years, rendering changed more. I'm not entirely sure it's true, and I'm not entirely sure it means that much anyways, but yes, maybe we had more investment or some games did, to be more precise.

Anyhow, we still suck. We are just now beginning to understand the basics of what colours are, of what materials are, how light works. Measure, capture, model. We're so ignorant still. Not to mention on the technical side. Pathetic. We don't even know what to do with most of the hardware yet (compute shaders? for what?).

There could be an argument that spending more money on rendering is not worth it - because spending them on something else now gets us more bang for the buck, which is a variation of the "rendering is ahead" reasoning that doesn't hinge on actually measuring what is ahead of what. I could consider that, but really the reason for it is just that it's harder to disprove. But on the other hand, it's also completely random! 
Did we measure this? That would be actually fascinating! Can we devise an experiment where we can turn a "rendering" knob and an "animation" or "gameplay" knob and see what people are most sensitive to? I doubt it, seriously, but it would be awesome.
Maybe we could do some market research and come up with metrics that say that people buy more games if they have better animation over rendering, but... I think rendering actually markets better (that's why companies name and promote their rendering engines, but not their animation ones).

Lastly, you could say, it's better to spend money somewhere else just because it seems that rendering is expensive and maybe the same money just pays so much more innovation somewhere else. Maybe. This still needs ways of measuring things that can't be measured, but really the thing is some people are scared that asset costs will still go up and up. Not really "rendering" costs, but "art" costs. Well -rendering- actually is the way to -lower- art costs. 
No rendering technique is good if it doesn't serve art better, and unfortunately even there we still suck... We are mostly making art the same way we always did, triangles, UVs, manually splitting objects, creating LODs, grouping objects and so on. It's really sad, and really another reason to be optimistic about how much still we have to do in the future.

Now, I don't want to sound like I'm saying, I'm a rendering guy, my field is more relevant and all the money should go to it. Not at all. And actually I'm passionate about a lot of things, animation for example is fascinating as well... and who knows, maybe down the line I'll do stuff that is completely different from what I'm doing today... I'm just annoyed that people say things that are not really based in facts (and as we're at it, let's also dispel the myth that hardware progress is slowing down...).


Tuesday, December 10, 2013

Never again: point lights

Distant, point, spotlight, am I right? Or maybe you can merge point and spot into an uberlight. No.
Have you ever actually seen a point-light in the real world? It's very rare, isn't it? Even bare-bulbs don't exactly project uniformly in the hemisphere...
If you're working with a baked-GI solution that might not affect you much, in the end you can start with a point, construct a light fixture around it and have GI take care of that. But even in the baked world you'll have analytic lights most often. In deferred, it's even worse. How many games show "blobs" of light due to points being placed around? Too many!
With directional and spots we can circumvent the issue somehow by adding "cookies", 2d projected textures. With points we could use cube textures, but in practice I've seen too many games not doing it (authoring also could be simpler than hand-painting cubes...)
During Fight Night (boxing game) one little feature we had was light from camera flashes, which was interesting as you could clearly see (for a fraction of a second) the pattern they made on the canvas (journalists are all around the ring) and there it was the first time I noticed how much point lights suck.
The solution was easy, really, I created a mix of a point and distant light, which gave a nice directional gradient to the flash without a cone shape of spots. You could think of the light as being a point and the "directional" part being a function that made the emission non constant on the hemisphere. 

It's a multiply-add. Do it. Now!

Minimum-effort "directional" point
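A minimal sketch of what such a "directional" point light could look like. The post only says it is a multiply-add that makes the emission non-constant over the hemisphere, so the exact form here is an assumption: the cosine between a light axis and the direction to the shaded point is scaled, biased and clamped to [0, 1], then used to modulate the usual point-light intensity. The `scale` and `bias` parameters and their defaults are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def directional_point_emission(light_pos, light_axis, point, scale=0.5, bias=0.5):
    # Direction from the light towards the shaded point.
    to_point = normalize(tuple(p - l for p, l in zip(point, light_pos)))
    cos_angle = dot(normalize(light_axis), to_point)
    # The multiply-add, clamped: 1 along the axis, fading towards the back.
    return min(1.0, max(0.0, cos_angle * scale + bias))
```

With the default `scale` and `bias` the light is at full strength along its axis, half strength sideways and off directly behind, which gives a gradient without the hard cone of a spotlight. Other values trade between a plain point light (`scale` = 0, `bias` = 1) and something closer to a spot.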

Another little trick that I employed (which is quite different) is to "mix" point and directional in terms of the incoming light normal on the shaded point (biasing point normals towards a direction), at the time an attempt to create lights that were "area" somehow, softer than pure points. But that was really a hack...
Nowadays you might have heard of IES lights (see this and this for example), which are light emission profiles often measured by light manufacturers (which can be converted to cubemaps, by the way). 
I would argue -against- them. I mean sure, if you're going towards a cubemap based solution sure, have that as an option, but IES are really meaningful if you have to go out in the real world and buy a light to match a rendering you did of an architectural piece, if you are modeling fantasy worlds there is no reason to make your artists go through a catalog of light fixtures just to find something that looks interesting. What is the right IES for a pointlight inside a barrel set on fire?

A more complicated function

A good authoring tool would be imho just a freehand curve, that gets baked into a simple 1d texture (in realtime, please, let your artists experiment interactively), mapped with light direction dot (light position-shaded point).
If you want to be adventurous, you can take a tangent vector for the light and add a second dot product and lookup. And add the ability of coloring the light as well, a lot of lights have non-constant colors as well, go around and have a look (i.e. direct light vs light reflected out of the fixture or passing through semi-transparent material...).
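A minimal sketch of the freehand-curve idea: the artist's curve is approximated here by piecewise-linear control points (an assumption, any curve editor would do), baked into a small 1d table, and looked up with the cosine between the light direction and the normalized vector between light and shaded point. Names, the table resolution and the linear filtering are choices made for the example.

```python
def eval_curve(points, x):
    # points: list of (cos_angle, intensity), sorted by cos_angle from -1 to 1.
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + (y1 - y0) * t
    return points[-1][1]

def bake_profile(points, resolution=64):
    # Bake the curve into a 1d "texture" covering cos_angle in [-1, 1].
    return [eval_curve(points, -1.0 + 2.0 * i / (resolution - 1))
            for i in range(resolution)]

def sample_profile(table, cos_angle):
    # Map [-1, 1] to texel space and filter linearly between neighbours.
    c = min(1.0, max(-1.0, cos_angle))
    u = (c * 0.5 + 0.5) * (len(table) - 1)
    i0 = int(u)
    i1 = min(i0 + 1, len(table) - 1)
    t = u - i0
    return table[i0] * (1.0 - t) + table[i1] * t
```

Baking is cheap enough to redo every time the curve changes, which is what makes interactive editing feasible. The tangent-vector variant described above would simply add a second table and a second dot product.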

1d lookups are actually -better- than cubemap cookies, because if you look at real-world examples many fixtures generate very sharp discontinuities in the light output, which are harder (require much more resolution) to capture in a cubemap...
Exercise left for the reader: bake the light profile approximating a GI solution, automatically adapting it to the environment the light was "dropped" in...

Sunday, December 8, 2013

Enhance this!

Don't you hate when people have strong critiques towards a thing, but it happens that it's just that they don't know enough about it? Well, I don't, because then I think of how many times in my youth (and let's say only then) I did the same...

Regardless, today I happen to have a bit of time and I saw yet another post laughing at how stupid the "image enhance" trick used in movies and TV series is, and so you get this nerdrage against nerdrage...

Now think a second about this. Who do you think is right? C.S.I., which is a huge TV series using arguably some of the best writers and consultants, or the random dude on the net? Do you think they don't know how realistic any of the techniques they use is? Do you think they don't actually and very carefully tread between real science and fiction to deliver a mix that is comprehensible and entertains their audience, telling a story while keeping it grounded in actual techniques used in the field? Don't you think -they- know better, and the result was very consciously constructed?

The same goes of course for anything, really, especially when something is successful, makes a lot of money, has a lot of money behind, you should always bias yourself towards being humble and assuming the professionals making said thing -know better-.

Now, back to the "image enhance" trick. It turns out it is real science. It's called "super-resolution" and it's a deep field with a lot (really, a lot!) of research and techniques behind it.
It's actually common nowadays as well, chances are that if your TV has some sort of SD2HD conversion, well that is super-resolution in action (and even more surprising are all the techniques that can reconstruct depth from a single image, which also ship in many TVs, the kind of models they came up with for that are crazy!).

The scenarios presented in movies are actually -quite- realistic even if the details are fictionalized. True, the interface to these programs won't look like that, maybe they won't be real-time and surely they won't be able to "zoom" in "hundreds" of times, but they surely can help and surely are used. 

That is to me a reasonable compromise between fiction and reality, as certainly you can and will use computers to get a legible nameplate for a video that is too low-resolution for the naked eye, or match an otherwise unreadable face against a database of suspects and so on, probably not in quite as glamorous and simple way as the movies show, but fundamentally the idea is sound (and I'm quite sure, used in the real world).
It is a non-realistic representation of a very realistic scenario, which is the best that good fiction should try to achieve, going further is silly. Or are you going to argue that a movie is crap because at night for example you can't really see as clearly as they show, or because they don't let a DNA test take weeks and an investigation several years?

When it comes to videos we can use techniques known as "multiple image" super-resolution, registering (aligning) multiple images (frames in this case, i.e. optical flow), and merging the results, which do work quite well. Also, most fictionalized super-resolution enhances focus on faces or nameplates, which are both much easier to super-resolve because we can "hint" the algorithm with a statistical model (a-priori) which helps tremendously to guide the "hallucination".
And even if hallucinating detail might not hold up in court (the stronger the a-priori model, the more it will generate plausible results but by no means always reliable), it might very well be used as a hint to direct the investigations (I've never seen a case where it was used in courts, always to try to identify a potential suspect or a nameplate, both cases where having a strong probability, even if it's far from certainty, is realistic).
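To give a feeling for why multiple frames help, here is a deliberately tiny sketch of the "register, then merge" idea in one dimension. It is a toy under strong assumptions: the sub-pixel offsets of each frame are already known (in practice they are estimated, e.g. via optical flow), they fall exactly on the high-resolution grid, and the camera point-samples without blur or noise. Real multi-image super-resolution has to deal with all of those.

```python
def shift_and_add(frames, factor):
    # frames: list of (samples, offset) pairs, where `samples` is a low-res
    # signal and `offset` is its known shift in high-res texels.
    n = factor * len(frames[0][0])
    acc = [0.0] * n
    count = [0] * n
    for samples, offset in frames:
        for i, value in enumerate(samples):
            j = i * factor + offset      # where this sample lands on the fine grid
            if 0 <= j < n:
                acc[j] += value
                count[j] += 1
    # Average where several frames agree; leave holes as None.
    return [a / c if c else None for a, c in zip(acc, count)]

# A high-res "scene" seen through two half-resolution frames, one texel apart.
scene = [1, 2, 3, 4, 5, 6, 7, 8]
frame_a = (scene[0::2], 0)
frame_b = (scene[1::2], 1)
merged = shift_and_add([frame_a, frame_b], factor=2)
```

Each frame alone only carries half the samples, but because they are shifted relative to each other the merge recovers the full-resolution signal. With a single frame the missing texels stay as holes, which is where the statistical priors mentioned above come in.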

So, bottom line is, if you think these guys are "stuuuuupid", well then you might want to think twice. Here are some random-ish links (starting points... google scholar for references and so on if you're interested... I couldn't even find many of my favorite ones right now) to the science of super-resolution:
It would take many pages only to survey the general ideas in the field. Don't limit your imagination... Computer science is more amazing than you might think... We reconstruct environments from multiple cameras, or even sweeping video... can capture light in flight, we can read somebody's heartbeat from video, fucking use lasers to see around corners and yes, even take some hints about an environment from corneas...

And by the way, don't bitch about Gravity, try to enjoy the narrative instead. You might live a happier life :)