Search this blog


Sunday, June 29, 2008

Fermat's principle

Searching for topics
I've noticed lately that four major topics have found their way through my blog posts. That wasn't something that I did plan, it just happened, like it happens to a person of going through different periods in his music listening habits. Those are:
  • C++ is premature optimization
  • Shader programming is physical modelling (and artists are fitting algorithms)
  • Iteration time is the single most important coding metric
  • Huge, data-driven frameworks, are bad
We explore things moving our interests more or less randomly, until we find something that requires a spot, a local mininum in the space of things, were we spend some time and then eventually evade, starting another random walk searching for another minima.

And while that it's a good, smart, simple way to search (see metropolis random walks, that lead naturally to simulated annealing, that in turn is a very simple implementation of the tabu search ideas... a nice application of metropolis-hastings montecarlo methods is this one...), it's probably not a good way to write articles, as the result is non uniform, and I've found that the information that I think it's important is scattered among different posts.

As I'm not happy with the way I explained some concepts, I tend to increase the redundancy of the posts, they become longer, some ideas are repeated more times in slightly different perspectives, hoping that the real point I wanted to make eventually is perceived.
That eventually helps myself to have a clearer view of those ideas, I'm not pretending to write this blog as a collection of articles for others to read, I write on the things that I'm interested as writing helps me first and foremost, and then if someone else finds that interesting, it's just an added bonus.

Be water my friend
One of the things that I still don't feel to have clearly espressed is my view of data-driven designs.
Let's look at the last two items of my recurring-topics list: "Iteration time is the single most important coding metric", "Huge, parametric frameworks, are bad". The problem here is that most of the times, huge, parametric frameworks are made exactly to cut iteration times. You have probably seen them. Big code bases, with complex GUI tools that let you create your game AI / Rendering / Animation / Shading / Sounds / whatever by connecting components in a graph or tree, mhm usually involving quite a bit of XML, usually with some finite state machine and/or some badly written, minimal scripting language too (because no matter what, connecting components turns out not to be enough)

How can they be bad, if they are fulfilling my most important coding metric? There is a contraddiction, isn't it?

Yes, and no. The key point lies in observing how those frameworks are made. They usually don't grow up from generalizations made on an existing codebase. They are not providing common services that your code will use. They are driving the way you code instead. They fix a data format, and force your code to be built around it. To fix a data format, you have to make assumptions on what you will need. Assumptions of the future. Those, always, fail, so sooner or later, someone will need to do something that is not easily represented by the model you imposed.
And it's at that point that things go insane. Coders do their work, no matter what, using something close to the Fermat's principle (the basic principle our-rendering-engineers interpretation of light is built on). They try to solve problems following paths of minimal design change (pain minimization as Yegge would call it). And not because they are lazy (or because the deadlines are too tight, or not only anyways), but most of the times because we prefer (questionabily) uniformity to optimal solutions (that's also why a given programming language usually leads to a given programming style...).
So things evolve in the shape that the system imposes them, requirements change, we change solutions to still be fitting in that shape, until our solutions are so different from the initial design that the code looks like a twisted, bloated, slow pile of crap. At that point, a new framework is built, in the best case. In the worst one, more engineers are employed to manage all that crap, usually producing more crap (because they can't do any better, is the design that is rotten, not only the code!).

A common way to inject flexibilty in a rotten overdesigned framework is employing callbacks (i.e. prerendercallback, postrendercallback, preupdate, postendframe etc, etc etc...) to let users add their own code in the inner workings of the system itself. That creates monsters that have subshapes, built on and hanging from a main shape, something that even Spore is not able to manage.

What is the bottom line? That the more a library is general, the more shapeless it has to be. It should provide common services not shape future development. That's why for example when I talk about data-driven and fast iteration, most of the times I also talk about the virtues of reflection and serialization. Those are common services that can be abstracted and that should find a place in our framework, as it's a very safe assumption to say that our solutions will always have parameters to be managed...
Rigid shapes should be given as late as possible to our code.

Simple example
I still see that many rendering engines built on scenegraphs, and worse, using the same graph for rendering and for coordinate frame updates (hierarchical animations). Why? Probably because a lot of books show such ways of building a 3d engine, or because it maps so easily to a (bloated) OOP design that could be easily be an excercise in a C++ textbook.
Hierarchical animations are not so common anyway, they should not be a first-class item in our framework, that is an unreasonable assumption. They should be one of the possible coordinate frame updating subsystems, they should live in the code space that is as near as possible to user rendering code, not rooted in the structure of the system. Heck, who says that we need a single coordinate frame per object anyway? Instacing in such designs is made with custom objects in the graph that hide their instancing coordinates, making everything a big black box, that becomes even worse if you made the assumption that each object has to have a bounding volume. Then instancing objects will have a single volume encompassing all the instances, but that's suboptimal for the culler so you have to customize it to handle that case, or write a second culler in the instaced object, or split the instances in groups... And you can easily start to see how easily things are starting to go wrong...


Nick said...

I agree with you on all points.

Fermat's principle is so cool in physics though, I hate to see it being used to explain programmer laziness and dogma :)

You've got multiple things here, all worth their own posts IMO. It's probably possible to write a book about all the compromises, inefficiencies, architectural monsters, and hacks that result from using an Inventor style scenegraph!

DEADC0DE said...

Yeah, probably extracting the eigenvectors from all my posts, we could get a few really well written single-topic articles, but that's not my objective right now, so my posts will always be a little messy.
Hope they could still be enjoyable.

Hugo Gomes said...

I like OpenSceneGraph :P

Gregory said...

What would be your approach to scenegraphs ?

DEADC0DE said...

My approach to scenegraph is not to use them. It's a simple approach!