
24 January, 2009

Too young

No, I've not been laid off (at least, not yet). Quite the contrary, our game is reaching alpha, a state in which, in theory, we should work only on bug fixing, optimization and tuning. In practice most games reach that state only when they ship; some are not so lucky, and features keep being added via patches (not considering expansions, which are separate products).
But it was fun to notice that I started receiving more job offers in this period, from companies interested in getting good engineers, freshly cut... Oh well.

I also had a couple more interesting things to post, but just today my MacBook died. Ok, I know that sounds like "aliens ate my demomaker", but yeah, it's real and I'm really pissed off by that.

An overused analogy
I've always been fascinated by how much good coding looks like good painting. Some start with a preliminary sketch, some do not; I don't think that's so important. The iterative process is. You first start by drawing the general light volumes and setting the main colours, the mood of the picture. Then you go down and start to nail down the shapes, focusing on the main parts first: bodies, faces. Eventually you'll change some early decisions, because they don't look quite right as the picture progresses. Then you go into the details, more and more, until you're happy, trying not to overdo it.

Painting has a process, and you'll find the same in most crafts, from sculpture to music. The medium does not matter much; it's just like programming languages are to us. Some ideas are better expressed in one medium, some in another, but in the end it doesn't matter too much...
If we do have to look at the medium, then sculpting materials make for some quite nice analogies: some are harder to shape but allow for finer detail (think of marble), some others are fast and can be used to make casts for more solid works, some are obsolete, with fewer and fewer masters able to work with them.
C is marble: Michelangelo can create the most incredible works out of it, but it requires an incredible mind, a lot of muscle, lots of preliminary drawings and many years. We, normal people, should focus on something that's easier to change, but that can still be hardened enough to suit our needs... Anyway...

So coding and art both deal with processes, while still allowing for a lot of freedom and creativity. Why don't we see as much structure in our rendering work, then? Are we too young? Even the word "creativity" scares us, but should it?
Creativity is such a taboo word in programming because it evokes things you can't control, but it's quite the opposite: you can, artists do.
I really think that a weekend spent learning to paint, or watching how artists paint, would be really helpful for learning about a good workflow. Artists are not messy at all; they have a very organized, evolved, refined path to creativity.

In my experience, the rendering work of a game starts, if you're lucky, by gathering references, compiling a feature list, and making mockups of what the end result should look like. From there, engineers make estimates, a plan is made, something gets cut early, and then the project starts.

Agile or not often does not make a fundamental difference in the way rendering evolves. It should, and it changes the way your daily work progresses, but often it does not change our perspective at a more fundamental level. Features are engineered, changes happen, people get busier and busier until everything is done. Now take screenshots of the game as it goes through all those steps. For sure you'll notice (well, hopefully) that it's improving over time. But can you notice a pattern? A process? In my experience, most of the time, no, you can't. Is rendering too young to have one?

Things get done in a technical order, based on their risk. If it's new and it's hard, it will be done first. But in this way, how can you know what you are doing? How do you know that what you did is correct, that it fits into the picture quite right... It's like drawing some detailed eyes on a blank canvas, without knowing where they will be, on which face, with which light, or mood...

Short story
Recently I was involved in some very interesting discussions on normals. Without going into details that do not matter, nor that I am allowed to write about, I can say that we have bad normals. Everyone does, really! Simply put, no matter how you transform your meshes, if the transform is not rigid and you're using the same transform on vertices and on their normals, you'll be wrong. Smooth normals are a big pain in the ass anyway: they're totally fake, computed by averaging face normals that depend on many vertices, often in a non-trivial way (as usual, the mean is weighted according to some heuristics).
I never realized that before.
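To make that concrete, a minimal HLSL sketch (constant and input names are hypothetical): even the textbook fix of transforming normals by the inverse transpose only repairs part of the problem.

```hlsl
// Hypothetical vertex shader fragment. WorldIT is assumed to be the
// inverse-transpose of the world matrix, uploaded as a shader constant.
// Using the plain world matrix here is the classic error: under a
// non-rigid (e.g. non-uniformly scaled) transform it skews the normals.
float3 worldNormal = normalize(mul(IN.Normal, (float3x3)WorldIT));
```

Even this only fixes the face normals: smooth vertex normals are weighted averages of face normals, and averaging does not commute with a non-rigid transform, so they stay approximate regardless.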

Actually, I was thinking that our fault was somewhere else in the math we were using, but I was proven wrong, so well, that's it. And as all the solutions seem too expensive or too risky to try, I suspect we'll live with it.

What's more interesting is that some time later another defect was found in another area. As it was discovered by lighting artists, they tried their best to fix it by changing our many lighting variables, but they failed. Eventually someone thought it could be due to that normal problem, and I was pulled into the discussion. After a while, it was found that no, it's not the lighting nor the normals, it's just the animation that's wrong: the shape itself becomes wrong. In a subtle way, but enough to cause a big loss of realism.

Where do I want to go with that? Our problem is that we don't build things on a stable, solid base. Not from an engineering perspective (in that case, the solid base should be provided by correct maths and by understood approximations of physical phenomena), nor from a rendering one. We build things on hacks, find visual defects and add more hacks to fix them, in a random order. It's impossible to predict what's affecting what; shaders have hundreds of parameters, nothing is correct, so you can't tell when you're making an error.

We should learn, again, from art. Start with shapes, and basic lights (i.e. no textures, only Gouraud shading, shadows and alpha maps). Make sure that the shapes do work, that they work even under animation, and provide the means to always check that.

Then focus on the main shaders, on the materials they should model. Find out what lights we need; photography helps: key, fill, background... And when something does not look right, do not fix it, but first find out why it does not.

If it's too saturated, the solution is not to add a desaturation constant, but first to check why. Are we taking a color to an exponent? What sense does that operation make? Why do we do that? Is the specular too bright? Why? Is our material normalized, or does it reflect more light than it receives?
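As an example of the kind of check that last question suggests, here is a hedged HLSL sketch of a Blinn-Phong specular with an approximate normalization term; the (n+8)/(8π) factor is a commonly used approximation, not the exact energy-conserving constant.

```hlsl
// Sketch: approximately normalized Blinn-Phong lobe. Without the
// normalization factor, raising the exponent "n" darkens the highlight
// and tempts us to compensate with yet another brightness constant;
// with it, the total reflected energy stays roughly bounded as n varies.
static const float PI = 3.14159265f;
float  nDotH = saturate(dot(N, normalize(L + V)));
float  norm  = (n + 8.0f) / (8.0f * PI); // common approximation
float3 spec  = specColor * pow(nDotH, n) * norm * saturate(dot(N, L));
```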

Addendum - update:
In the non-realtime rendering world, one of the advantages of unbiased estimators of the rendering equation is being able to compute the error, and the convergence, of the image. In our world, there is no notion of error, because nothing is correct, nor consistent. Many times there aren't even lights or materials, just pieces of code that emit colors.

In the end we have lots of parameters and techniques, and an image that does not feel right. If our artists are very skilled, by looking at the reference pictures they might identify the visual defect that makes the image not look right. The specular is wrong. The hue changes in the skin are incorrect. Even if we are able to find those problems (not trivial), finding the cause is nearly impossible. Wrong normals? Is the subsurface scattering badly tuned, or is it the diffuse map? Or the specular of the rimlights? The only option is to add another parameter or another hack that makes the situation less bad in that one case, but that will probably add other visual defects somewhere else, and for sure it adds complexity...

It's the same thing that happens with bad, rotten code; the only difference is that this time it's our math and our processes that are bad, not the resulting code, so we're less trained to recognize it...

06 December, 2008

Which future? (plus, a shader optimization tip)

Update: poll closed. Results:

Cpu and Gpu convergence - 36
One to one vertex-pixel-texel - 7
Realtime raytracing - 14
More of the same - 9
We don't need any of that - 14

We have a clear winner. To my surprise, the realtime raytracing hype did not have a big influence on this, nor did Carmack with his id Tech 5 and "megatextures" (clipmaps) or Jon Olick with his voxel raycaster... IBM's Cell CPU and Intel's Larrabee seem to lead the way... We'll see.
---
Which future? Well, in the long term, I don't know. I don't know anything about it at all. In general our world is way too chaotic to be predictable. Luckily.


I'm happy to leave all the guesswork to hardware companies (this article by AnandTech is very interesting).

But we can see the near future clearly: there are a lot of technologies and ideas, trends, promising research... Of course we don't know which ones will be really successful, but we can see some options, and even make some predictions, having enough experience. Educated guesses.

But what is even more interesting to me is not knowing which ones will succeed; that's not too exciting, I'll know anyway, it's only a matter of time. I'd like to know which ones should succeed, which ones we are really waiting for. That's why I'm going to add a poll to this blog, and ask you.

I'm not writing a lot, the game I'm working on is taking a lot of my time now, so this blog is not as followed as it used to be, and running a poll right now might not be the brightest idea ever. But I've never claimed to be bright...

Before enumerating the "possible futures" I've chosen for the poll, a quick shader tip, so I don't feel too guilty about asking without giving anything back. Ready? Here it comes:

Shader tip: struggling to optimize your shader? Just shuffle your code around! Most shader compilers can't find the optimal register allocation. That's why, for complicated shaders, sometimes even removing code leads to worse performance. You should always try to pack your data together, give the compiler all the possible hints, not use more than you need (i.e. don't push a vector3 of grayscale data), and follow all the usual optimization best practices. But after you've done everything right, also try just shuffling the order of computations. It makes so much difference that you'd better structure your code so you have many separate blocks with clear dependencies between them...
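A hypothetical illustration of that last point (the helper function names are made up): if each block declares its inputs explicitly, shuffling the order of the blocks to probe the compiler's register allocation becomes a trivial, safe edit.

```hlsl
// Each block reads only what it takes as a parameter, so the three
// lines below can be freely reordered when hunting for a layout that
// lowers register pressure; only the final combine imposes an order.
float3 diffuse  = DiffuseTerm(N, L, albedo);        // needs: N, L, albedo
float3 specular = SpecularTerm(N, L, V, gloss);     // needs: N, L, V, gloss
float  shadow   = ShadowTerm(shadowUV);             // needs: shadowUV only
float3 result   = (diffuse + specular) * shadow;    // the only dependency
```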

Ok, let's see the nominations now:

Realtime raytracing.
Raytracing is cool, so realtime raytracing is cool squared! It looks good in a Ph.D. thesis, it's an easy way to use multithreading and an interesting way to use SIMD and CUDA. You can implement it in a few lines of code, it looks good in any programming language, it lets you render all kinds of primitives, and it's capable of incredibly realistic imagery.
It's so good that some people have started to think that they need it, even if they don't really know why... There's a lot of hype around it, and a lot of ignorance (i.e. the logarithmic-complexity myth: raytracing is not faster than rasterization, not even in the asymptotic limit; there are some acceleration structures, applicable to both, that can make visibility queries faster under some conditions, even if building such structures is usually complicated).

CPU and GPU convergence, no fixed pipeline. Larrabee, the DirectX 11 compute shaders, or even the PlayStation 3 Cell processor... GPUs are CPU wannabes and vice versa. Stream programming, massive multithreading, functional programming and side effects. Are we going towards longer pipelines, programmable but fixed in their role, or towards a new, unified computing platform that can be programmed to do graphics, among other things? Software rendering again? Exciting, but I'm also scared by its complexity; I can hardly imagine people optimizing their trifillers again.

One to one vertex-pixel ratio (to texel).
Infinite detail. Jon Olick (id Software, Siggraph 2008) and Jules Urbach (ATI Cinema 2.0 and the OTOY engine) generated a lot of hype when they used raytracing to achieve that goal. But the same can be done right now with realtime tessellation, which is also the way DirectX 11 took. REYES could be interesting too; for sure the current quad-based rasterization pipelines are a bottleneck at high geometrical densities.
Another problem is how to generate and load the data for all that detail. Various options are available (acquisition from real-world data and streaming, procedural generation, compression), each with its own set of problems and tradeoffs.

Just more of the same. More processing power, but nothing revolutionary. We don't need such a revolution; there are still a lot of things we can't do due to lack of processing power and the rising number of pixels to be pushed (HD at 60Hz is not easy at all). Also, we still have to solve the uncanny valley problem, and this should shift our attention, even as rendering engineers, outside the graphics realm (see this neat presentation from NVision08).

And the last one, which I had to add so I'd have something to vote for myself:

Don't care, we don't need more hardware.
The next revolution won't be about algorithms or hardware, but about tools and methodologies. We are just now beginning to discover what colors are (linear color rendering, gamma correction etc.), we still don't know what normals are (filtering, normalmapping), and we're far from having a good understanding of BRDFs, of their parameters, of physically based rendering. We make too many errors, and add hacks and complexity in order to correct them, visually. Complexity is going out of control; we have too much of everything and too-slow iteration times. There's much to improve before we really enter a "next" generation.


26 October, 2008

off topic 2

Today I was walking on West Broadway to buy some camera goodies (this is a view from the Cambie bridge; my apartment is in the tallest building on the right)... It's winter, but over the weekend it was sunny and nice. Turns out that I'm happy. Not that it's strange, I'm usually happy: even if our minds are ruled by the Y combinator and our lives are meaningless, we're human after all, and my meaningless life is quite good. It's just that we don't often stop to think about that, go for a walk, sing along, like a fool in a swimming pool.


20 October, 2008

Just blur

Note: this is an addendum to the previous post. Even if it should be self-contained, I felt that the post was already too long to add this, and that the topic was too important to be written as an appendix...

How big is a point? Infinitesimal? Well, for sure you can pack two of them as close as you want, up to your floating point precision... But where does dimension come into play in our CG model?

Let's take a simplified version of the scenario of my last post:
  • We want to simulate a rough, planar mirror.
  • We render-to-texture a mirrored scene, as usual.
  • We take a normalmap for the roughness.
  • We fetch texels from our mirrored scene texture using a screen-space UV but...
  • ...we distort that UV by an amount proportional to the tangent-space projection of the normalmap.
Simple... and easily too noisy: the surface roughness was too high-frequency in my scenario, as is easy when your mirror is nearly perpendicular to the image plane... So we blur... In my post I suggested using a mipmap chain, for various reasons, but anyway we blur, and everything looks better.

But let's look at that "blurring" operation a little more closely... What are we doing? We blur, or pre-filter, because it's cheaper than supersampling and post-filtering... So, is it anti-aliasing? Yes, but not really... What we are doing is integrating a BRDF: the blur we apply is similar (even if way more incorrect) to the convolution we do on a cubemap (or equivalent) encoding the lights surrounding an object, to get a lookup table for diffuse or specular illumination.

It's the same operation! In my previous post I said that I considered the surface a perfect mirror, with a Dirac delta, perfectly specular BRDF. Now the reflected texture exactly represents the light sources in our scene, or some of them: the first-bounce indirect ones (all the objects in the scene that directly reflect light from energy-emitting surfaces). If we convolve it with a perfectly specular BRDF we get the same image again, indexed with the normals of the surface we're computing the shading on. But if we blur it, that's a way of convolving the same scene with a more diffuse, non-perfectly-specular BRDF!

In my implementation, for various reasons, I used only mipmaps, which are not really a blur... The nice thing, for higher quality, would be to use a real blur with a BRDF-shaped kernel that sits on the reflection plane (so it will end up as an ellipsoid when projected into image space)...
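For reference, the mipmap version of that idea boils down to something like this sketch (assuming a `roughness` value in [0,1] and a known mip count for the reflection map; the texture and constant names follow the code in the post below):

```hlsl
// Pick a blur level from the surface roughness: mip 0 is the sharp,
// perfectly specular reflection, higher mips stand in for wider
// (glossier) BRDF lobes. A real BRDF-shaped, plane-aligned kernel
// would be more correct, as argued above.
float  lod     = roughness * REFLECTIONMAP_MIPMAP_LEVELS;
float3 blurred = tex2Dlod(REFLECTIONMAP, float4(reflUV, 0, lod)).rgb;
```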

In that context, we need all those complications because we don't know another way of convolving the first-bounce indirect lighting with our surfaces: we don't have a closed-form solution of the rendering equation with that BRDF, which means we can't express that shading as a simple lighting model (as, for example, a Phong BRDF with a point light source is).

What does that show? It shows us a dimension that we have in our computer graphics framework, implied by the statistical model of the BRDF. We take our real-world, physical surfaces, which are rough and imperfect if we look at them closely enough (but not too closely, otherwise the whole geometrical optics theory does not apply), we choose a minimum dimension, see how the roughness is oriented over it, take a mean over that dimension, and capture that in a BRDF.
Note how this low-pass filtering, or blurring, over the world's dimensions is very common; in fact, it is the base of every mathematical model in physics (and remember that calculus does not like discontinuities...). Models always have an implied dimension; if you look at your phenomena at a scale where that dimension becomes relevant, the model "breaks".
The problem is that in our case we push that dimension to be quite large, not because we want to avoid entering the quantum physics region, but because we don't want to deal with the problem of explicitly integrating high frequencies. So we assume our surfaces to be flat, and capture the high frequencies only as an illumination issue, in the BRDF. That way, the dimension we choose depends on the distance we're looking at a surface from, and it can easily be in the order of millimeters.

We always blur... We prefer pre-blurring to post-blurring, as the latter is way more expensive, but in the end what we want is to reduce all our frequencies (geometry, illumination etc.) to the sampling one.

What does that imply, in practice? How is that relevant to our day to day work?

What if our surfaces have details of that dimension? Well, things generally don't go well... That's why our artists at Milestone, when we were doing the road shading, found it impossible to create a roughness normalmap for the tracks: it looked bad...
We ended up using normalmaps only for big cracks and for the wet track effect, as I explained.
It also means that even for the wet track, it's wise to use the normalmap only for the reflection, and not for the local lighting model... The water surface and the underlying asphalt layer have a much better chance of looking good using geometric normals, maybe modulating the specular highlights with a noisy specular map, i.e. using the length of the Z component (the axis aligned with the geometric normal) of the (tangent-space) roughness normalmap...
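That last idea can be sketched roughly as follows (this is a guess at the intent, with hypothetical helper names): the tangent-space Z of the roughness normalmap measures how "flat" the micro-surface is, and can cheaply modulate a specular computed with the stable geometric normal.

```hlsl
// normalMapNormalTGS: tangent-space roughness normal, with z aligned
// to the geometric normal; z == 1 means perfectly flat, lower = rougher.
float flatness = saturate(normalMapNormalTGS.z);

// SpecularTerm is any local specular model (hypothetical helper).
// The specular is evaluated with the geometric normal, then attenuated
// by the roughness, instead of perturbing the normal itself (noisy).
float3 asphaltSpec = SpecularTerm(geomNormal, L, V) * flatness;
```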

Note: if I remember correctly, in my early tests I was using the normalmap only for the water reflection, not using a separate specular for the water (so that layer was made only of the indirect reflection), and using a specular without the normalmap for the asphalt layer. All those are details, but the interesting thing that I hope I showed here is why this kind of combination worked better than others for shading that surface...

18 October, 2008

Impossible is approximately possible

I'm forcing myself to drive this blog more towards practical matters and less towards anti-C++ rants and how cool other languages are (hint, hint)... The problem with that is my to-do list, which nowadays, after work, is full of non-programming tasks... But anyway, let's move on with today's topic: simulating realistic reflections on wet asphalt.

MotoGP'08, wet track, (C) Capcom

I was really happy to see the preview of MotoGP'08. It's in some way the sequel to the last game I did in Italy, SuperBike'07; it's based on the same base technology that my fellow colleagues in Milestone's R&D group and I developed. It was a huge work: five people working on five platforms, writing everything almost from scratch (while the game itself was still based on the solid grounds of our oldgen ones, the 3d engine and the tools started from zero).

One of the effects I took care of was the wet road shading. I don't know about the technology of the actual shipped games; I can guess it's an improved version of my original work. That's not really important for this post: what I want to describe is the creative process of approximating a physical effect...

Everything starts from the requirements. Unfortunately, at that time we didn't have any formal process for that; we were not "agile", we were just putting in our best effort without much strategy. So all I got was a bunch of reference pictures, the games in our library to look at for other implementations of the same idea, and a lot of talking with our art director. Searching the web, I found a couple of papers, one a little bit old but geared specifically towards driving simulations.

The basics of the shader were easy:
  • A wet road is a two layer material, the dry asphalt with a layer of water on top. We will simply alpha blend (lerp) between the two.
  • We want to have a variable water level, or wetness, on the track surface.
  • The water layer is mostly about specular reflection.
  • As we don't have ponds on race tracks, we can ignore the bending of light caused by refraction (so we consider the IOR of the water to be the same as the air's).
  • Water will reflect the scene lights using a Blinn BRDF.
  • Water will have the same normals as the underlying asphalt if the water layer is thin, but it will "fill" asphalt discontinuities if it's thick enough. That's easy if the asphalt has a normalmap: we simply interpolate that with the original geometry normal, proportionally to the water level.
  • We need the reflection of the scene objects into the water.
I ended up using the "skids" texturemap (and UV layout) to encode the wetness amount in one of its channels (skids are monochrome, they require only one channel). Actually, our shader system was based on a "shader generator" tool where artists could flip on and off various layers and options in 3dsMax and generate a shader out of it, so the wetness map could be linked to any channel of any texture we were using...
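The bullet points above translate into very little shader code; a hedged sketch (input and helper names are hypothetical, with `wetness` being the value read from the skids channel):

```hlsl
// Water normal: follows the asphalt normalmap when the layer is thin,
// flattens towards the geometric normal as the water level rises.
float3 waterN = normalize(lerp(asphaltNormal, geomNormal, waterLevel));

// Water layer: mostly specular (Blinn) plus the scene reflection;
// refraction bending is ignored (IOR assumed equal to air, as above).
float3 waterCol = BlinnSpecular(waterN, L, V) + sceneReflection;

// Two-layer material: simple lerp (alpha blend) between dry and wet.
float3 result = lerp(dryAsphaltCol, waterCol, wetness);
```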

Everything here seems straightforward and can be done with various levels of sophistication. For example, an idea that we discarded, as it was complicated to handle by the gameplay, was to have the bikes dynamically interact with the water, drying the areas they passed over.

The problem comes when you try to implement the last point, the water reflections. Reflections from planar mirrors are very easy: you only have to render the scene transformed by the mirror's plane in a separate pass and you're done. A race track itself is not flat, but this is not a huge problem; it's almost impossible to notice the error if you handle the bikes correctly (mirroring them with a "local" plane located just under them; if you use the same plane for all of them, some reflections will appear to be detached from the contact point between the tires and the ground).

Easy, you can code that in no time, and it will look like a marble plane... The problem is that the asphalt, even when wet, still has a pretty rough surface, and thus it won't behave like a perfect mirror; it will be more like a broken one. Art direction asked for realistic reflections, so... for sure not like that.

Let's stop thinking about the hacks and think about what happens in the real world... Let's follow a ray of light that went from a light to an object, then to the asphalt and then to the eye/camera... backwards (under the framework of geometrical optics, which is what we use for computer graphics, you can always go backwards; for more details see the famous Ph.D. thesis by Eric Veach)!

So we start at the camera and go towards the track point we're considering; from there, the ray went towards a point on a bike. In which direction? In general we can't know: any possible direction could make the connection if it does not have a BRDF value of zero (otherwise that connection will have no effect on the shading of the track, and thus we won't be able to see it). After bouncing in that direction, the ray travels for an unknown distance, reaches the bike, and from there it goes towards a light, whose location we know.

Now, simulating all this is impossible: we have two things that we don't know, the reflection direction and the distance the light ray travelled between the track and the bike, and those can be computed only using raytracing...
Let's try now to fill the holes using some approximations that we can easily compute on a GPU.

First of all we need the direction. That's easy: if we consider our reflections to be perfectly specular, the BRDF will be a Dirac impulse; it will have only one direction for which it's non-zero, and that is the view ray (camera to track) reflected around the (track) normal.
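In HLSL that direction is a one-liner; a sketch, with `viewDir` assumed to go from the camera to the track point:

```hlsl
// The only direction with a non-zero BRDF for a perfect mirror:
// the view ray reflected around the surface normal.
float3 reflDir = reflect(viewDir, trackNormal);
```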

The second thing we don't know is the distance travelled; we can't compute that, it would require raytracing. In general, reflections would require that; why are the planar mirror ones an exception? Because in that case the reflection rays are coherent: visibility can be computed for each point on the mirror using a projection matrix, and that's exactly what rasterization is able to do!
If we can render planar mirrors, we can also compute the distance of each reflected object from the reflection plane. In fact, it's really easy! So we do have a measure of the distance: not the one that we want, the distance our reflected ray travels according to the rough asphalt normals, but the one it travels according to a smooth, marble-like surface. It's still something!

How to go from smooth and flat to rough? Well, the reflected vectors are not so distant: if we have the reflected point on a smooth mirror, we can reasonably think that the point the rough mirror will hit is more or less around the point the smooth mirror reflected. So the idea is simple: we take the perfect reflection we have in the render-to-texture image, and instead of reading the "right" pixel, we read a pixel around it, offset in a direction that is the same as the difference vector between the smooth reflection vector and the rough one. But that difference is the same as the one between the geometric normal and the normalmap one! Everything is going smoothly... We only need to know how far to go in that direction, but that's not a huge problem either: we can approximate it with the distance between the point we would have hit with a perfectly smooth mirror and the mirror itself. That distance is straightforward to compute when rendering the perfect reflection texture, or in a second pass, by resolving the z-buffer of the reflection render.

Let's code this:

// Store a copy of the POSITION register in another register (POSITION is not
// readable in the pixel shader on older shader models)
float2 perfectReflUV = (IN.CopyPos.xy / IN.CopyPos.w)*float2(0.5f,-0.5f) + 0.5f;

// Fetch from the screenspace reflection map, the approximation of the track to
// reflected object distance... It has to be normalized between zero and one.
float reflectionDistance = tex2D(REFLECTIONMAP, perfectReflUV).a;

// Compute a distortion approximation by scaling the (tangent-space)
// normalmap normal by a constant factor
float2 distortionApprox = normalMapNormalTGS.xy * DISTORTIONFACTOR;

// Fetch the final reflected object color...
float2 reflUV = perfectReflUV + distortionApprox * reflectionDistance;
float3 reflection = tex2D(REFLECTIONMAP, reflUV).rgb;

That actually works, but it's very noisy, especially when animated. Why? Because the frequency of our UV distortion can be very high, as it depends on the track normalmap, and the track is nearly parallel to the view direction, so its texture mapping frequencies are easily very high (that's why anisotropic filtering is a must for racing games).

How do we fight high frequencies? Well, with supersampling! But that's expensive... Other ideas? Who said prefiltering? We could blur our distorted image... well, that's quite like blurring the reflection image... which is quite possible by generating some mipmaps for it! We know how much we are distorting the reads from that image, so we can choose our mipmap level based on that...
Ok, we're ready for the final version of our code now... I've also implemented another slight improvement: I read the distance from a pre-distorted UV... That will cause some reflections of the near objects to leak into the far ones (i.e. the sky), but the previous version had the opposite problem, which was in my opinion more noticeable... Enjoy!

// Store a copy of the POSITION register in another register (POSITION is not
// readable in the pixel shader on older shader models)
float2 perfectReflUV = (IN.CopyPos.xy / IN.CopyPos.w)*float2(0.5f,-0.5f) + 0.5f;

// Compute a distortion approximation by scaling the (tangent-space)
// normalmap normal by a constant factor... 0.5f is an estimate of the "right"
// reflectionDistance that we don't know (we should raymarch to find it...)
float2 distortionApprox = normalMapNormalTGS.xy * DISTORTIONFACTOR * 0.5f;

// Fetch from the screenspace reflection map, the approximation of the track to
// reflected object distance... It has to be normalized between zero and one.
float reflectionDistance = tex2D(REFLECTIONMAP, perfectReflUV + distortionApprox).a;
distortionApprox = normalMapNormalTGS.xy * DISTORTIONFACTOR * reflectionDistance;
// we could continue iterating to find an intersection, but we don't...

// Fetch the final reflected object color:

float2 reflUV = perfectReflUV + distortionApprox;
float4 reflUV_LOD = float4(reflUV, 0, REFLECTIONMAP_MIPMAP_LEVELS * reflectionDistance);
float3 reflection = tex2Dlod(REFLECTIONMAP, reflUV_LOD).rgb;

Last but not least, you'll notice that I haven't talked much about programmer-artist iteration, even if I'm kind of an "evangelist" of that. Why? It's simple: if you're asked to reproduce reality, then you know what you want, and if you do that by approximating the real thing, you know which errors you're making; there's hardly much to iterate on. Of course, the final validation has to be given by the art direction; of course, they can say it looks like crap and that they prefer a hack over your nicely crafted, physically inspired code... But that did not happen, and in any case, a physically based effect usually requires way fewer parameters, and thus less tuning and iteration, than a hack-based one...

Update: continues here...
Update: some slight changes to the "final code"
Update: I didn't provide many details about my use of texture mipmaps as an approximation of various blur levels... That's of course wrong, and it may be very wrong if you have small emitting objects (i.e. headlights or traffic lights) in your reflection map. In that case you might want to cheat and render those objects with a halo (particles...) around them, to "blur" more without extra rendering cost, or do the right thing: use a 3d texture map instead of mipmap levels, blur each z-slice with a different kernel width, and maybe consider some form of HDR color encoding...