
25 February, 2009

Game multithreading laws

Personal note: My eeepc still has no keyboard (now it's totally dead), so I'm writing from my girlfriend's laptop... I should learn not to use netbooks in the tub.
I'll cut it short this time. Explicit multithreading is too hard. Actually I think it's the hardest thing in computer science.
Parallel programming articles are always a fascinating read (I strongly advise, strongly, reading Joe Duffy's blog, and of course Herb Sutter's), but the truth is, when it comes to real work, you want to minimize your exposure to it.
And yet, doing games, you want to push your CPU towards its peak GFLOPS rating. How? These are my personal laws (not that there's anything really revolutionary here, so don't be surprised if you're already following them):
  • Data parallelism is your only God. It will feed your long, starving pipelines, hide your latencies and vectorize your computation.
  • Embrace stream computing, love Map/Reduce, study ParallelFX, OpenMP and CUDA, and finally implement and use a ParallelFor primitive with a thread pool.
  • That shall be the only primitive you routinely use for multithreading.
  • Avoid explicit threads.
  • Avoid explicit locks/synchronization primitives.
  • Avoid all forms of data sharing.
  • Enforce the non-sharing rule. Use smart pointers and reference-counted base classes. Assert that shared smart pointers are always acquired/released on the same thread.
  • You don't need locks in your libraries, because you don't want to share. Re-entrancy is the key; if you can't achieve that, just say so, don't lock. Locks do NOT compose.
  • You don't need exotic parallel data structures or synchronization primitives, because you're not sharing.
  • The only time you have to think about synchronization shall be when communicating with the GPU.
  • You should not have more explicit threads than the fingers of one hand.
  • Any explicit thread will only depend on one other (i.e. ai->game->rendering).
  • At runtime there shall be only one synchronization point, when communication happens by passing an updated buffer of data to the thread that needs it. That should be done with a queue. Lock contention is practically absent.
  • Communication is always mono-directional: one thread always writes data for the dependent thread, which always reads it (i.e. each thread owns only one side of the communication queue).
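As a concrete sketch of the ParallelFor primitive named above (my own illustrative code, not from any particular engine; a production version would reuse a persistent thread pool instead of spawning threads on every call):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// ParallelFor sketch: split [begin, end) into one contiguous chunk per
// hardware thread and run the body over each chunk in parallel.
void ParallelFor(size_t begin, size_t end,
                 const std::function<void(size_t)>& body)
{
    const size_t count = end - begin;
    if (count == 0) return;
    const size_t workers =
        std::max<size_t>(1, std::thread::hardware_concurrency());
    const size_t chunk = (count + workers - 1) / workers;

    std::vector<std::thread> threads;
    for (size_t w = 0; w < workers; ++w) {
        const size_t lo = begin + w * chunk;
        const size_t hi = std::min(end, lo + chunk);
        if (lo >= hi) break;
        threads.emplace_back([lo, hi, &body] {
            for (size_t i = lo; i < hi; ++i) body(i);
        });
    }
    for (auto& t : threads) t.join();
}
```

Note that the body only touches data indexed by its own i, so there is no sharing and no locking anywhere.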

Follow those rules, and you'll be happy while watching the guy who implemented that super cool lockless, actor-based futures system working every weekend...
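The mono-directional communication queue can be as simple as a single-producer/single-consumer ring buffer. An illustrative sketch (names are mine), where each thread only ever writes its own index, so the atomic loads/stores are the single synchronization point:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// SPSC ring buffer sketch: the producer thread only advances write_,
// the consumer thread only advances read_. Holds at most N-1 items.
template <typename T, size_t N>
class SpscQueue {
public:
    bool Push(const T& value) {  // called only by the producer thread
        const size_t w = write_.load(std::memory_order_relaxed);
        const size_t next = (w + 1) % N;
        if (next == read_.load(std::memory_order_acquire))
            return false;  // full
        buffer_[w] = value;
        write_.store(next, std::memory_order_release);
        return true;
    }
    bool Pop(T& out) {  // called only by the consumer thread
        const size_t r = read_.load(std::memory_order_relaxed);
        if (r == write_.load(std::memory_order_acquire))
            return false;  // empty
        out = buffer_[r];
        read_.store((r + 1) % N, std::memory_order_release);
        return true;
    }
private:
    std::array<T, N> buffer_{};
    std::atomic<size_t> read_{0};
    std::atomic<size_t> write_{0};
};
```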

I will be writing more on this, focusing on how to make the rendering engine parallel, but it will take a while, because my plan is to describe and then publish the code of an old (but not too old) engine I wrote for a series of articles that were supposed to appear in a magazine, but never did...

03 February, 2009

Offline

Finally I've found some time to read a few publications from recent and not-so-recent conferences...

CUDA this, GPU that; it seems that most of the effort is spent finding ways of adapting old algorithms to the GPU, even in fields where the GPU computation model (at least, of this generation of GPUs, who knows about the future) does not apply very well.

Dunno, maybe since I left university and started working in the gaming industry I've become too pragmatic... Still, there are things worth reading. Approximating Dynamic Global Illumination in Image Space was really expected: SSAO ported to diffuse global illumination. Point-based approximate color bleeding by Pixar is more exciting, a realtime technique that gets implemented/mutated into the most high-end, and thus stable, offline rendering engine.

If you planned to impress your friends with some realtime fluids computed on the GPU,
Real-Time Fluid Simulation using Discrete Sine/Cosine Transforms is a realtime, frequency space approach (like the uber famous Stable Fluids, by Jos Stam), with boundary conditions.

If you use spherical harmonics, and your game has day/night cycles,
Efficient Spherical Harmonics Lighting with the Preetham Skylight Model might be nice, even if you'd probably have to update the skydome map anyway, and you should still have plenty of time to do that the slow way, over a number of frames...

I've already encountered the work of
Ladislav Kavan (dual quaternion skinning) and even exchanged a couple of emails with him while I was researching quaternion skinning for my crowd renderer. Nice guy, and very interesting research. Animation is a field that I don't really know in depth, but it's showing some good progress for sure, and it's still an area where a lot of improvement is possible, even right now, in practical applications. Physics-based is the future!

Ok, so, here is where I wanted to talk about how I was surprised to find this after I knew this, and how I was happy to see that people are actually investing in single-ray SIMD raytracing structures, instead of fast but useless ray-packet ones. Then I planned to crosslink two interesting posts from the Level of Detail blog, to show the work of Tobias Ritschel, and to conclude with Progressive Photon Mapping, explaining a little the global illumination problem
(the title of the post was "offline" as in offline rendering), the magnificent work of Eric Veach versus the practical intuition of Jensen, how he looks to me more like a coder than a researcher, and the simplicity and beauty of a couple of his ideas (even if, in general, I don't like photon mapping). As a postscript, I wanted to remark how bad papers are when they don't provide any insight into the downsides of the presented algorithm. In PPM there are a few that appear to be pretty obvious: the memory impact of keeping all that information on the hit points, which scales with the number of pixels in the image; dealing with aliasing; and how it compares to path tracing under less extreme conditions. But my EEEPC keyboard is failing; it started with the CTRL key a long time ago, and I replaced it with the right Windows key... then a couple of function keys, and now the return and the arrows trigger the delete button... So I have to stop. Anyway, it was a really boring post; I'll probably edit or delete it later, on a decent computer...

02 February, 2009

DirectX texel offset reminder

DirectX evolved from a very bad library into a decently usable one. But under the hood, some bad habits still exist. Do you do post-processing effects? Or dynamically generate textures? Remember to align your vertices (subtracting half a pixel from them)!
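A minimal sketch of the fix (illustrative names, assuming pre-transformed screen-space vertices for a fullscreen quad):

```cpp
#include <cassert>

// Illustrative D3D9-era sketch: with pre-transformed (screen-space)
// vertices, texel centers and pixel centers are misaligned by half a
// pixel, so a fullscreen quad must be shifted by -0.5 in x and y.
struct ScreenVertex { float x, y, u, v; };

void AlignFullscreenQuad(ScreenVertex* verts, int count)
{
    for (int i = 0; i < count; ++i) {
        verts[i].x -= 0.5f;  // the half-pixel offset, DirectX 9 only
        verts[i].y -= 0.5f;
    }
}
```

(Equivalently, you can shift the UVs by half a texel instead; the point is that one of the two must move.)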

This article is very helpful. Especially if you're trying to implement this...

Xbox 360, having a really cool DirectX 9.5-ish API, provides a means of correcting that with a renderstate, and DirectX 10 solved it too, so this only applies to DirectX 9.

24 January, 2009

Too young

No, I've not been (at least, yet) laid off. Quite the contrary, our game is reaching alpha, a state where in theory we should work only on bugfixing, optimization and tuning. In practice, this status is reached for most games only when they ship; some are not so lucky, and features are still added (not considering expansions, which would be separate products) with patches.
But it was fun to notice that I started receiving more job offers in this period, from companies interested in getting good engineers, freshly cut... Oh well.

I also had a couple of more interesting things to post, but just today my macbook died. Ok, I know that sounds like "aliens ate my demomaker" but yeah, it's real and I'm really pissed off by that.

An overused analogy
I've always been fascinated by how much good coding looks like good painting. Some start with a preliminary sketch, some do not; I don't think that's so important. The iterative process is. You first start by drawing general light volumes and setting the main colours, the mood of the picture. Then you go down and start to nail down shapes, focusing on the main parts first: bodies, faces. Eventually you'll change some early decisions, because they don't look quite right as the picture progresses. Then you go into the details, more and more, until you're happy, trying not to overdo it.

Painting has a process, and you'll find the same in most crafts, from sculpture to music. The medium does not matter much; it's just like programming languages are to us. Some ideas are better expressed in one medium, some in another, but in the end it doesn't matter too much...
If we do have to look at the medium, then sculpting materials make for some quite nice analogies: some are harder to shape but allow for finer detail (think of marble), some are fast and can be used to make casts for more solid works, some are obsolete, with fewer and fewer masters able to work with them.
C is marble: Michelangelo can create the most incredible works out of it, but it requires an incredible mind, a lot of muscle, lots of preliminary drawings and many years. We, normal people, should focus on something that's easier to change, but that can still be hardened enough to suit our needs... Anyway...

So coding and art both deal with processes, while still allowing for a lot of freedom and creativity. Why don't we see as much structure in our rendering work, then? Are we too young? Even the word "creativity" scares us, but should it?
Creativity is such a taboo word in programming because it evokes things you can't control, but it's quite the opposite: you can, artists do.
I really think that a weekend spent learning to paint, or watching how artists paint, would be really helpful for learning about a good workflow. Artists are not messy at all; they have a very organized, evolved, refined approach to creativity.

In my experience, the rendering work of a game starts, if you're lucky, by gathering references, compiling a feature list, and making mockups of what the end result should look like. From there, engineers make estimates, a plan is made, something gets cut early, and then the project starts.

Agile or not often doesn't make a fundamental difference in the way rendering evolves. It should, and it changes the way your daily work progresses, but often it does not change our perspective at a more fundamental level. Features are engineered, changes happen, people get more and more busy until everything is done. Now take screenshots of the game as it goes through all those steps. For sure you'll notice (well, hopefully) that it's improving over time. But can you notice a pattern? A process? In my experience, most of the time, no, you can't. Is rendering too young to have one?

Things get done in a technical order, based on their risk. If it's new and it's hard, it will be done first. But that way, how can you know what you are doing? How do you know that what you did is correct, that it fits into the picture quite right? It's like drawing some detailed eyes on a blank canvas, without knowing where they will be, on which face, with which light, or mood...

Short story
Recently I was involved in some very interesting discussions on normals. Without going into details that do not matter, nor am I allowed to write about, I can say that we have bad normals. Everyone does, really! Simply put: if a transform is not rigid, and you're using the same transform on vertices and on their normals, you'll be wrong. Smooth normals are a big pain in the ass; they're totally fake, computed by averaging face normals, which depend on many vertices, often in a non-trivial way (as usual, the average is weighted using some heuristics).
I never realized that before.

Actually I was thinking that our fault was somewhere else in the math we were using, but I was proven wrong, so well, that's it. And as all the solutions seem to be too expensive or too risky to try, I suspect we'll live with it.
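For reference, the textbook fix for the non-rigid case is to transform normals by the inverse transpose of the model matrix rather than the matrix itself. A small illustrative sketch (my names, not from our codebase):

```cpp
#include <cassert>
#include <cmath>

struct Mat3 { float m[3][3]; };
struct Vec3 { float x, y, z; };

Vec3 Mul(const Mat3& a, const Vec3& v) {
    return { a.m[0][0]*v.x + a.m[0][1]*v.y + a.m[0][2]*v.z,
             a.m[1][0]*v.x + a.m[1][1]*v.y + a.m[1][2]*v.z,
             a.m[2][0]*v.x + a.m[2][1]*v.y + a.m[2][2]*v.z };
}

// The cofactor matrix equals det(a) times the inverse transpose of a,
// so dividing it by the determinant gives the correct normal matrix.
Mat3 InverseTranspose(const Mat3& a) {
    const float (*m)[3] = a.m;
    Mat3 c;
    c.m[0][0] =   m[1][1]*m[2][2] - m[1][2]*m[2][1];
    c.m[0][1] = -(m[1][0]*m[2][2] - m[1][2]*m[2][0]);
    c.m[0][2] =   m[1][0]*m[2][1] - m[1][1]*m[2][0];
    c.m[1][0] = -(m[0][1]*m[2][2] - m[0][2]*m[2][1]);
    c.m[1][1] =   m[0][0]*m[2][2] - m[0][2]*m[2][0];
    c.m[1][2] = -(m[0][0]*m[2][1] - m[0][1]*m[2][0]);
    c.m[2][0] =   m[0][1]*m[1][2] - m[0][2]*m[1][1];
    c.m[2][1] = -(m[0][0]*m[1][2] - m[0][2]*m[1][0]);
    c.m[2][2] =   m[0][0]*m[1][1] - m[0][1]*m[1][0];
    const float det =
        m[0][0]*c.m[0][0] + m[0][1]*c.m[0][1] + m[0][2]*c.m[0][2];
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            c.m[i][j] /= det;
    return c;
}
```

With a non-uniform scale, transforming the normal with the plain matrix leaves it no longer perpendicular to the surface; the inverse transpose keeps it so.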

What's more interesting is that some time later, another defect was found in another area. As it was discovered by the lighting artists, they tried their best to fix it by changing our many lighting variables, but they failed. Eventually someone thought it could be due to that normal problem, and I was pulled into the discussion. After a while, it was found that no, it's neither the lighting nor the normals; it's just the animation that's wrong, the shape itself comes out wrong. In a subtle way, but enough to cause a big loss of realism.

Where do I want to go with that? Our problem is that we don't build things on a stable, solid base. Not from an engineering perspective (in that case, the solid base should be provided by correct maths, and understood approximations of physical phenomena), nor from a rendering one. We build things on hacks, find visual defects, and add more hacks to fix them, in a random order. It's impossible to predict what's affecting what; shaders have hundreds of parameters; nothing is correct, so you can't tell when you're making an error.

We should learn, again, from art. Start with shapes and basic lights (i.e. no textures, only Gouraud shading, shadows and alpha maps). Make sure that the shapes work, that they work even under animation, and provide the means to always check that.

Then focus on the main shaders, on the materials they should model. Find out what lights we need (photography helps: key, fill, background...). And when something does not look right, do not fix it; find out first why it doesn't.

If it's too saturated, the solution is not to add a desaturation constant, but first to check why. Are we raising a color to an exponent? What sense does that operation make? Why do we do that? Is the specular too bright? Why? Is our material normalized, or does it reflect more light than it receives?
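As an example of the kind of check I mean: a Phong-style cos^n specular lobe reflects more energy than it receives unless you multiply it by the (n+2)/(2π) normalization factor. You can verify the normalization numerically (an illustrative sketch, integrating the lobe over the hemisphere around its axis):

```cpp
#include <cassert>
#include <cmath>

// Numerically integrate the normalized lobe (n+2)/(2*pi) * cos^n(theta)
// times cos(theta) over the hemisphere; the result should be exactly 1,
// i.e. the material reflects exactly the light it receives.
double SpecularHemisphereIntegral(double n, int steps = 2000)
{
    const double pi = 3.14159265358979323846;
    const double dtheta = (pi / 2.0) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double theta = (i + 0.5) * dtheta;  // midpoint rule
        const double lobe = (n + 2.0) / (2.0 * pi)
                          * std::pow(std::cos(theta), n);
        // solid angle element: sin(theta) dtheta dphi
        sum += lobe * std::cos(theta) * std::sin(theta) * dtheta;
    }
    return 2.0 * pi * sum;  // the phi integral contributes 2*pi
}
```

Drop the (n+2)/(2π) factor and the integral no longer stays at one as the exponent changes; that's exactly the kind of unbounded knob that ends up "fixed" with a desaturation constant.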

Addendum - update:
In the non-realtime rendering world, one of the advantages of unbiased estimators of the rendering equation is the ability to compute the error, and the convergence, of the image. In our world, there is no notion of error, because nothing is correct, nor consistent. Many times there aren't even lights or materials, just pieces of code that emit colors.

In the end we have lots of parameters and techniques, and an image that does not feel right. If our artists are very skilled, by looking at the reference pictures they might identify the visual defect that makes the image not look right. The specular is wrong. Hue changes in the skin are incorrect. But even if we manage to find those problems (not trivial), finding the cause is nearly impossible. Wrong normals? Badly tuned subsurface scattering, or the diffuse map? Or the specular of the rim lights? The only option is to add another parameter or another hack that makes the situation less bad in that particular case, but probably adds other visual defects somewhere else, and for sure adds complexity...

It's the same thing that happens with bad, rotten code; the only difference is that this time it's our math and our processes that are bad, not the resulting code, so we're less trained to recognize it...

06 December, 2008

Which future? (plus, a shader optimization tip)

Update: poll closed. Results:

Cpu and Gpu convergence - 36
One to one vertex-pixel-texel - 7
Realtime raytracing - 14
More of the same - 9
We don't need any of that - 14

We have a clear winner. To my surprise, the realtime raytracing hype did not have a big influence on this, nor did Carmack with his id Tech 5 and "megatextures" (clipmaps), or Jon Olick with his voxel raycaster... IBM's Cell CPU and Intel's Larrabee seem to lead the way... We'll see.
---
Which future? Well, in the long term, I don't know. I don't know anything about it at all. In general our world is way too chaotic to be predictable. Luckily.


I'm happy to leave all the guesswork to hardware companies (this article by AnandTech is very interesting).

But we can clearly see the near future: there are a lot of technologies and ideas, trends, promising research... Of course we don't know which ones will be really successful, but we can see some options, and even make some predictions, having enough experience. Educated guesses.

But what is even more interesting to me is not knowing which ones will succeed; that's not too exciting, I'll know anyway, it's just a matter of time. I'd like to know which ones should succeed, which ones we are really waiting for. That's why I'm going to add a poll to this blog, and ask you.

I'm not writing a lot; the game I'm working on is taking a lot of my time now, so this blog is not as followed as it used to be, and running a poll right now might not be the brightest idea ever. But I've never claimed to be bright...

Before enumerating the "possible futures" I've chosen for the poll, a quick shader tip, so I don't feel too guilty about asking without giving anything back. Ready? Here it comes:

Shader tip: struggling to optimize your shader? Just shuffle your code around! Most shader compilers can't find the optimal register allocation. That's why, for complicated shaders, sometimes even removing code leads to worse performance. You should always try to pack your data together, give the compiler all the possible hints, not use more than you need (i.e. don't push a vector3 of grayscale data), and follow all those optimization best practices. But after you've done everything right, also try just shuffling the order of computations. It makes so much difference that you'd better structure your code into many separate blocks with clear dependencies between them...

Ok, let's see the nominations now:

Realtime raytracing.
Raytracing is cool. So realtime raytracing is cool squared! It looks good in a Ph.D. thesis, it's an easy way to use multithreading and an interesting way to use SIMD and CUDA. You can implement it in a few lines of code, it looks good in any programming language, it lets you render all kinds of primitives, and it's capable of incredibly realistic imagery.
It's so good that some people have started to think that they need it, even if they don't really know why... There's a lot of hype around it, and a lot of ignorance (i.e. the logarithmic-complexity myth: raytracing is not faster than rasterization, not even in the asymptotic limit; there are some data structures, applicable to both, that can make visibility queries faster under some conditions, even if building such structures is usually complicated).

CPU and GPU convergence, no fixed pipeline. Larrabee, the DirectX 11 compute shaders, or even the Playstation 3 Cell processor... GPUs are CPU wannabes and vice versa. Stream programming, massive multithreading, functional programming and side effects. Are we going towards longer pipelines, programmable but fixed in their roles, or towards a new, unified computing platform that can be programmed to do graphics, among other things? Software rendering again? Exciting, but I'm also scared by its complexity; I can hardly imagine people optimizing their trifillers again.

One-to-one vertex-pixel ratio (to texel).
Infinite detail. Jon Olick (id Software, Siggraph 2008) and Jules Urbach
(ATI Cinema 2.0 and the OTOY engine) generated a lot of hype as they used raytracing to achieve that goal. But the same can be done right now with realtime tessellation; this is also the route DirectX 11 took. REYES could be interesting too; for sure the current quad-based rasterization pipelines are a bottleneck at high geometrical densities.
Another problem is how to generate and load the data for all that detail. Various options are available: acquisition from real-world data and streaming, procedural generation, compression, each with its own set of problems and tradeoffs.

Just more of the same. More processing power, but nothing revolutionary. We don't need such a revolution; there are still a lot of things we can't do due to lack of processing power and the rising number of pixels to push (HD at 60 Hz is not easy at all). Also, we still have to solve the uncanny valley problem, and this should shift our attention, even as rendering engineers, outside the graphics realm (see this neat presentation from NVision 08).

And the last one, that I had to add to have something to vote by myself:

Don't care, we don't need more hardware.
The next revolution won't be about algorithms or hardware, but about tools and methodologies. We are only now beginning to discover what colors are (linear color rendering, gamma correction etc.), we still don't know what normals are (filtering, normal mapping), and we're far from having a good understanding of BRDFs, their parameters, and physically based rendering. We make too many errors, and add hacks and complexity to correct them, visually. Complexity is going out of control; we have too much of everything, and too-slow iteration times. There's much to improve before we really enter a "next" generation.