04 May, 2009
Quick journey into physics
I'm quite good at maths but I've always sucked at physics. I can simulate simple particle systems, and I have a decent knowledge of fluid dynamics, even if not enough to truly understand it. Electromagnetism is magic to me, and I didn't know how to simulate constrained rigid bodies. As the game I've almost finished relies heavily on physics, I decided to go and find some tutorials on rigid body simulation for games. The following is a list of what I've found useful, in the order they should be read:
- http://chrishecker.com/How_to_Simulate_a_Ponytail
- http://chrishecker.com/images/e/e8/Gdmag200003-ponytail-1.pdf
- http://chrishecker.com/images/a/a5/Gdmag200004-ponytail-2.pdf
- http://chrishecker.com/Five_Physics_Simulators_for_Articulated_Bodies
- http://www.cs.cmu.edu/afs/cs/user/baraff/www/papers/sig96.pdf
- http://www.slimy.com/~steuard/teaching/tutorials/Lagrange.html
- http://www.cs.cmu.edu/afs/cs/user/baraff/www/pbm/pbm.html
- http://www.gphysics.com/downloads
- http://www.bulletphysics.com/Bullet/phpBB3/
- http://i31www.ira.uka.de/docs/DynamicSimulation.pdf (http://www.impulse-based.de/)
- http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.6574
- more to come... suggestions appreciated (especially good tutorials on solvers for Lagrange multiplier constraints, PGS and sequential impulses, and simple tutorials on the reduced-coordinates approach; see the sketch below for the sequential-impulse flavor)
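Not a tutorial, but for a taste of what the "sequential impulses" approach in these links boils down to, here is a minimal sketch for a single distance constraint between two particles. The structure, the names, and the Baumgarte factor are all mine, purely for illustration:

```cpp
// Hedged sketch: one distance constraint solved with sequential impulses.
#include <cmath>

struct Particle { float x, y, vx, vy, invMass; };

// One Gauss-Seidel/PGS step for the constraint |pA - pB| = restLength:
// project the relative velocity onto the constraint axis and apply an
// equal-and-opposite impulse that cancels it (plus a Baumgarte bias term
// to correct positional drift).
void SolveDistanceConstraint(Particle& a, Particle& b, float restLength, float dt)
{
    float dx = b.x - a.x, dy = b.y - a.y;
    float len = std::sqrt(dx * dx + dy * dy);
    if (len < 1e-6f) return;
    float nx = dx / len, ny = dy / len;              // constraint axis

    float relVel = (b.vx - a.vx) * nx + (b.vy - a.vy) * ny;
    float bias = 0.2f / dt * (len - restLength);     // Baumgarte stabilization
    float invMassSum = a.invMass + b.invMass;
    if (invMassSum == 0.0f) return;

    float lambda = -(relVel + bias) / invMassSum;    // impulse magnitude
    a.vx -= lambda * nx * a.invMass;  a.vy -= lambda * ny * a.invMass;
    b.vx += lambda * nx * b.invMass;  b.vy += lambda * ny * b.invMass;
}
// A full solver loops this over all constraints a few times per frame,
// then integrates positions (semi-implicit Euler).
```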
28 April, 2009
Does it scale? Game threading laws part two
Someone is selling you a solution to a problem. A general problem: not something you want to solve just now, but now, tomorrow, and the day after.
In that scenario, we should always ask this question first: does it scale? Note that it's not a question we are generally too concerned with.
Usually we want to know something else: we want to know if it is fast. We care about scalability only marginally; we know it's a good property, but only if it doesn't have any overhead, if it doesn't make what we are doing right now slower (or more complicated, to look at it in a more general way).
But let's talk about the situations where we do care. Tools, for example: scalability of tools is fundamental... But tools are not sexy, so let's talk about threads...
I recently blogged about that, in a nutshell suggesting that the "right" solution is the parallel-for on a thread pool, or in general to look at data parallelism and stream computing. And I called those "laws" to emphasize that you shouldn't be too concerned with fancier tools. Why?
Because every time I see those fancy tools, as we're talking about visions of the future, frameworks to enable our computation to scale... I ask that question! Now beware: it's not that the people who work on parallel data structures, lock-free or wait-free algorithms, immutability, software transactional memory, actors, futures, uniqueness types... it's not that they aren't concerned with scalability! They are, a lot; parallel computing is all about that.
But here comes the tricky part... scalability is limited by whichever bottleneck hits you first. So you have to identify a bottleneck... in the future! That can be incredibly challenging; your best bet is to simulate your future workloads... but simulating them on future hardware that does not exist yet is complicated! Scalability is a big problem to tame.
Regarding threading, though, we have a lot of evidence that the main problem will be (well really, already is) memory... And we have plenty of examples... GPUs are a great picture of what the future can be... REYES rendering was invented in the eighties and already embraced parallelism via coherency...
Now sometimes, as a rendering engineer, you hear stuff like "raytracing is easy to parallelize because each ray is independent of the others". So, does that scale? It would seem so: we have independent rays, so we don't need locks! Independent rays... true, but you have to be very worried when you hear that. Independent? Do you mean there's no coherency? And what about memory? I don't care too much about locks, I care about bandwidth and latency!
Of course that's again something well known; in fact, coherent raytracing is not a new idea at all, and most raytracers now work on ray packets (which hopefully will be replaced by something better, allowing coherent traversal with independent rays, i.e. n-ary trees), but that's not the point...
So back to threading... those "laws" might seem restrictive, but they are not, when you factor in scalability, and when you realize that, at least in our context, it's all about memory. Really, everything else does not matter much, so it boils down to one thing: data parallelism.
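To be concrete, this is the kind of primitive I mean; a minimal sketch of a parallel-for (spawning threads per call for brevity, where a real engine would keep a persistent pool; all the names are mine):

```cpp
// Hedged sketch of the parallel-for idea: split an index range into chunks
// and hand each chunk to a worker thread.
#include <thread>
#include <vector>
#include <functional>
#include <algorithm>

void ParallelFor(size_t count, size_t numThreads,
                 const std::function<void(size_t)>& body)
{
    std::vector<std::thread> workers;
    size_t chunk = (count + numThreads - 1) / numThreads;
    for (size_t t = 0; t < numThreads; ++t)
    {
        size_t begin = t * chunk;
        size_t end = std::min(begin + chunk, count);
        if (begin >= end) break;
        workers.emplace_back([=, &body] {
            for (size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& w : workers) w.join();
}

// Usage: update all particles in parallel, no locks, no shared writes.
// ParallelFor(particles.size(), 4, [&](size_t i) { particles[i].Update(dt); });
```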
Everything else is not only complicated, it does not work at all! Now, I don't mean that STM does not work; of course you can map data parallelism onto it, or implement actors in a data-parallel way (that is a very smart idea if you're parallelizing your game... if your actors are simple and you have many more instances than types, then you don't even need fibers or other kinds of lightweight threading for them, no generators or coroutines, just a parallel-for over a list of message queues)...
But the underlying idea, and the one your framework should embrace, is still modeled with data-parallel threading...
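To make that actor remark concrete, a minimal sketch of the data-parallel phrasing, with purely illustrative types: deliver messages in one phase, then update every actor independently, so the whole update is just a (potentially parallel) loop.

```cpp
// Hedged sketch: actors as plain data, updated with a parallel-for.
#include <vector>
#include <string>

struct Message { int sender; std::string payload; };

struct Actor
{
    std::vector<Message> inbox;     // written only between update phases
    void Update()                   // reads its own inbox, writes its own state
    {
        for (const Message& m : inbox) { /* react to m */ (void)m; }
        inbox.clear();
    }
};

void UpdateActors(std::vector<Actor>& actors)
{
    // No fibers, no coroutines: each actor only touches its own data,
    // so this could be the ParallelFor sketched earlier, e.g.:
    // ParallelFor(actors.size(), 4, [&](size_t i) { actors[i].Update(); });
    for (Actor& a : actors) a.Update();  // serial version shown for clarity
}
```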
20 April, 2009
The economy is not that bad...
...if we still have money to waste on crappy tech: one, and two. My prediction: in a couple of months they'll both be pretty much dead.
Now some more interesting links, a lil bit of old and less-old school hacking, enjoy.
http://aggregate.org/MAGIC/
http://graphics.stanford.edu/~seander/bithacks.html
http://www.inwap.com/pdp10/hbaker/hakmem/hakmem.html
http://home.hejl.com/HD/
26 March, 2009
Garbage collection, again
Recently I discovered this forum, Molly Rocket. A friend of mine told me to have a look: people were trashing one of my articles! Kinda cool, I thought, but unfortunately it turned out to be mostly misunderstandings due to my bad writing.
Well, that's not the point. The point is that in that forum there are plenty of smart people helping less experienced ones. I stumbled on a post about garbage collection and wrote the usual stuff about it and its merits, compared to explicit allocation and reference counting. And... of course, I got told that those are just the usual arguments, not enough to persuade the big guys.
Ok so, let's write again... about garbage collection!
So let's picture the usual scenario. It's C++, and so you start writing code to manage memory (at this point, you're dealing with memory, so hopefully you already have a coding standard, maybe some tools to enforce it, and you know that by default, C++ is wrong).
You probably start writing your custom allocators to help debugging, the usual stuff. You wrap the default allocator with some padding before and after to detect stomps, add tracing functionality to detect leaks and fragmentation, handle alignment, and so on.
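Something like this, as a minimal sketch; the layout and guard values are made up, but the idea is the standard one:

```cpp
// Hedged sketch of a stomp-detecting debug allocator: pad each allocation
// with sentinel "guard" words and check them on free.
#include <cstdlib>
#include <cstdint>
#include <cassert>
#include <cstring>

static const uint32_t kGuard = 0xDEADBEEF;

void* DebugAlloc(size_t size)
{
    // Layout: [guard][size][user data ...][guard]
    uint8_t* raw = (uint8_t*)std::malloc(size + 2 * sizeof(uint32_t) + sizeof(size_t));
    std::memcpy(raw, &kGuard, sizeof(kGuard));
    std::memcpy(raw + sizeof(uint32_t), &size, sizeof(size));
    uint8_t* user = raw + sizeof(uint32_t) + sizeof(size_t);
    std::memcpy(user + size, &kGuard, sizeof(kGuard));
    return user;
}

void DebugFree(void* p)
{
    uint8_t* user = (uint8_t*)p;
    uint8_t* raw = user - sizeof(uint32_t) - sizeof(size_t);
    size_t size; std::memcpy(&size, raw + sizeof(uint32_t), sizeof(size));
    uint32_t head, tail;
    std::memcpy(&head, raw, sizeof(head));
    std::memcpy(&tail, user + size, sizeof(tail));
    assert(head == kGuard && tail == kGuard && "memory stomp detected");
    std::free(raw);
}
```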
Explicit allocations do not compose, they don't really work with OO, so you implement reference counting, maybe deriving your classes from a reference-counted base and adding smart pointer classes (and hoping that you don't have to interface it with another, similar system in a third-party library).
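A minimal sketch of such an intrusive scheme, deliberately leaving out thread safety, arrays, and weak references:

```cpp
// Hedged sketch: intrusive reference counting plus a minimal smart pointer.
class RefCounted
{
    mutable int m_refs = 0;
public:
    void AddRef() const { ++m_refs; }
    void Release() const { if (--m_refs == 0) delete this; }
protected:
    virtual ~RefCounted() {}
};

template <typename T>
class RefPtr
{
    T* m_p = nullptr;
public:
    RefPtr() {}
    RefPtr(T* p) : m_p(p) { if (m_p) m_p->AddRef(); }
    RefPtr(const RefPtr& o) : m_p(o.m_p) { if (m_p) m_p->AddRef(); }
    // Copy-and-swap: the temporary releases the old pointer on destruction.
    RefPtr& operator=(RefPtr o) { T* t = m_p; m_p = o.m_p; o.m_p = t; return *this; }
    ~RefPtr() { if (m_p) m_p->Release(); }
    T* operator->() const { return m_p; }
    T& operator*() const { return *m_p; }
};
```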
OK, you're set, you start writing your game. And fragmentation is a problem! OK, no fear, everyone has faced that, you know what to do. You start to make separate memory pools; luckily you already knew about that and had tagged all your allocations with a category: rendering, animation, AI, physics, particles. It was so useful for enforcing memory budgets during the project!
So now it's only a matter of redirecting some of these categories to different pools, possibly with different allocation strategies.
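A sketch of what the tagging can look like; the categories and the budget check are illustrative, and the dispatch to real per-category heaps is left out:

```cpp
// Hedged sketch of category-tagged allocation with per-system budgets.
#include <cstdlib>
#include <cassert>

enum class MemCategory { Rendering, Animation, AI, Physics, Particles, Count };

struct CategoryStats { size_t used = 0; size_t budget = 0; };
static CategoryStats g_stats[(size_t)MemCategory::Count];

void* AllocTagged(size_t size, MemCategory cat)
{
    CategoryStats& s = g_stats[(size_t)cat];
    s.used += size;
    assert(s.budget == 0 || s.used <= s.budget);  // enforce per-system budgets
    // A real system would dispatch to a per-category pool here;
    // falling back to malloc keeps the sketch self-contained.
    return std::malloc(size);
}
```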
Off we go... and it works! That's how everyone is solving the problem...
But! It's painful!
And this is the best-case scenario, where you have total control, you made all the right choices, and you don't have to link external libraries that use other strategies.
You have to size all your pools for the worst-case scenario. And then streaming comes into the equation. Streaming rocks, right?
You need finer and finer control over allocations, splitting heaps, creating per-class pools.
You realize that what really counts is object lifetime. The most useful thing is to classify allocations as per-frame (usually a linear allocator that automatically frees everything at the end of the frame, double-buffered if the memory has to be consumed by the GPU...), short-lived (i.e. temporary objects), medium-lived (i.e. level resources), and permanent.
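The per-frame case is the easiest to sketch: a minimal bump allocator, double-buffered so the GPU can still consume last frame's memory. Sizes and alignment here are made-up defaults:

```cpp
// Hedged sketch of a per-frame linear (bump) allocator, double-buffered.
#include <cstdint>
#include <cassert>

class FrameAllocator
{
    static const size_t kSize = 1 << 20;     // 1 MB per frame, made-up budget
    uint8_t m_buffers[2][kSize];              // allocate on the heap in practice
    size_t m_offset = 0;
    int m_current = 0;
public:
    void* Alloc(size_t size, size_t align = 16)
    {
        size_t p = (m_offset + align - 1) & ~(align - 1);  // align the cursor
        assert(p + size <= kSize && "frame budget exceeded");
        m_offset = p + size;
        return m_buffers[m_current] + p;
    }
    void EndFrame()                           // frees everything at once
    {
        m_current ^= 1;                       // flip: the GPU may still read the old buffer
        m_offset = 0;
    }
};
```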
You realize that if you go on and on, splitting allocations so that every class has its own pool, sized for the worst case, you're wasting a lot of memory... and in the end you don't need to manage allocations anymore. You can simply use a circular pool and overwrite old instances with new ones: if the pool is correctly sized, living instances won't ever get overwritten!
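And the circular pool is barely any code at all; a sketch, assuming instances really do die before their slot comes around again:

```cpp
// Hedged sketch: a fixed array of slots reused round-robin, no free() ever.
template <typename T, size_t N>
class CircularPool
{
    T m_slots[N];
    size_t m_next = 0;
public:
    T* Spawn()
    {
        T* slot = &m_slots[m_next];
        m_next = (m_next + 1) % N;
        *slot = T();                          // overwrite the oldest instance
        return slot;
    }
};
// Typical use: particles or other short-lived effects, where N is chosen
// so that a slot's previous occupant has always expired before reuse.
```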
Something's wrong here. And what's with this idea of object lifetime anyway? Is there a better solution? A more correct, generic answer? Something that should be used as a better default? Well, well...