03 April, 2011

Hacking bools idea

Little idea. After reading this post about bools, chars and bitfields, I started thinking of an ugly C++ hack.
What if we defined our own custom bool (chances are that you already have one around in your engine; if not, it's probably not a good idea to add one) that always behaves like a bool (char sized or whatever), but strictly stores only 0 or 1?

With such a thing we could make sure, when returning a native bool out of it, that what we store is still either 0 or 1, and thus have a means to identify some memory stomps in the many classes that have boolean members, for free.
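A minimal sketch of how such a type might look. The names CheckedBool and IsValid are made up for illustration, and the check uses a plain assert where a real engine would presumably hook its own assertion macro:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical char-sized bool that validates its storage every time it is read.
class CheckedBool {
public:
    CheckedBool(bool v = false) : bits_(v ? 1u : 0u) {}
    CheckedBool& operator=(bool v) { bits_ = v ? 1u : 0u; return *this; }

    // True if the stored byte is still a legal value (0 or 1).
    bool IsValid() const { return bits_ <= 1u; }

    // Converting back to a native bool is the checkpoint: anything other
    // than 0 or 1 means somebody wrote over this member.
    operator bool() const {
        assert(IsValid() && "CheckedBool is neither 0 nor 1: memory stomp?");
        return bits_ != 0u;
    }

private:
    std::uint8_t bits_;
};

static_assert(sizeof(CheckedBool) == 1, "CheckedBool should stay char sized");
```

The check only fires when the member is read, and a stomp that happens to write 0 or 1 goes unnoticed, so this catches some stomps, not all of them.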

Of course this "idea" could be extended, via templates, to have ranged integers and so on, but that would start being really ugly...
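For the curious, the ranged-integer extension could look roughly like this. RangedInt is a hypothetical name and, again, the plain assert is just a stand-in for whatever the engine uses:

```cpp
#include <cassert>

// Hypothetical integer that must stay inside [Min, Max]; checked on write and on read.
template <int Min, int Max>
class RangedInt {
    static_assert(Min <= Max, "empty range");

public:
    RangedInt(int v = Min) : value_(v) { assert(InRange(v)); }

    static bool InRange(int v) { return v >= Min && v <= Max; }
    bool IsValid() const { return InRange(value_); }

    // Reading the value back re-validates it, like the bool version would.
    operator int() const {
        assert(IsValid() && "RangedInt out of range: memory stomp?");
        return value_;
    }

private:
    int value_;
};
```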

2011 Current and Future programming languages for Videogames

There are so many interesting languages that are gaining popularity these days, I thought it could be interesting to write about them and how they apply to videogames. I plan to do this every year, and probably to create a poll as well.

If I missed your favorite language, comment on this article, so at least I can include that in the poll!

Now, before we start, we have to define what "videogames" we are talking about. Game programming has always been an interdisciplinary art: AI, graphics, systems, geometry, tools, physics, database, telemetry, networking, parallel computing and more.
This is even more true now that everything has turned into a gaming platform: browsers, mobile devices, websites and so on. So truly, if we talk about videogames at large, there are very few programming languages that are interesting but not relevant to our business.

So I'll be narrowing this down to languages that could or should be considered by an AAA game studio, working on consoles (Xbox 360 and PS3) as its primary focus, while maybe still keeping an eye on PC and Wii. Why? Well mostly because that's the world I know best. Let's go.

Scripting Languages

Where and why we need them: Code as Data, trivial to hotswap and live-edit, easier for non-programmers. Usually found in loading/initialization systems, AI and gameplay conditions or to define the order of operation of complex sub-systems (e.g. what gets rendered in a frame when). Scripting languages are usually interpreted (some come with optional JITs) so porting to a new platform is usually not a big deal. In many fields scripting is used to glue together different libraries, so you want a language with a lot of bindings (Perl, Python). For games though, we are more interested in embedding the language and extending it, so small, easy to modify interpreters are a good thing.

Current champion and its strengths:
Lua (a collection of nice presentations about it can be found on the Havok website). It's the de-facto standard. Born as a data definition language it can very well replace XML (bleargh) as an easier-to-parse, more powerful way of initializing systems.
It's also blessed with one of the fastest interpreters out there (even if it's not so cool on the simpler in-order cores that power current consoles and low-power devices) and with a decent incremental garbage collector. Easy to integrate, easy to customize, and most people are already familiar with its syntax (not only because it's so popular, but also because it's very similar to other popular scripting languages inspired by ECMAScript). Havok sells an optimized VM (Havok Script, previously called Kore).

Why we should seek an alternative: 
Lua is nice but it was not born for videogames, and sometimes it shows. It's fast but not fast enough for many tasks, especially on consoles (even if some projects, like Lua-LLVM, lua2c and MetaLua, could make the situation better). It has a decent garbage collector but still it generates too much garbage (it allocates for pretty much everything; it's possible to use it in a way that minimizes the dynamic allocations but then you'll end up throwing away much of the language) and the incremental collector pauses can still be painful. It's extensible but you can only easily hook up C functions to it (and the calling mechanism is not too fast) while you need to patch its internals if you want to add a new type. Types can be defined in Lua itself, but that's seldom useful to games. There is a very cool JIT for it, but it runs on very few platforms.
Personally I'd trade many language features (OOP, Coroutines, Lambdas, Metamethods, probably even script-defined structures or types) for more performance on consoles and easier extensibility with custom types, native function calls (invoking function pointers directly from the script without the need of wrappers) etc...

Present and future alternatives: 
Io. It's a very nice language, in some ways similar to Lua (it has a small VM, an incremental collector, coroutines, it's easy to embed...) but with a different (more minimal) syntax. It can do some cool things like binding C++ types to the language, it supports Actors and Futures, it has Exceptions and native Vector (SIMD) support.

Stackless Python. Python, with coroutines (fibers, microthreads, call them as you wish) and task serialization. It's Python. Many people love Python, it's a well known language (also used in many tools and commercial applications, either via CPython or IronPython) and it's not one of the slowest scripting languages around (let's say, it's faster than Ruby, but slower than Lua). It's a bigger language, more complex to embed and extend. But if you really need to run many scripted tasks (see Grim Fandango and Eve Online presentations for examples of games using coroutines), it might be a good idea.

JavaScript. ActionScript, another ECMAScript implementation, is already very commonly found in games due to the reliance on Flash for menus. HTML5 looks like a possible contender, and modern browsers are always pushing to have faster and faster JavaScript JIT engines. Some JITs are only available for IA-32 platforms (like Google V8) but some others (like Mozilla TraceMonkey/JaegerMonkey) support PPC as well. The language itself is not really neat and it's full of pitfalls (Lua is much cleaner) but it's usable (also, there are languages that compile to JavaScript and that "clean it up", like CoffeeScript and haXe). VMs tend to be big and not easy to understand or extend.

Scheme. Scheme is one of the two major Lisp dialects (the other one being Common Lisp). It's very small and "clean". Easy to parse, not too hard to write an interpreter for, and easy to extend. It's Lisp, so some people will love it, some will totally hate it. There are quite a few interpreters (Chicken, Bigloo, Gambit) that also come with a Scheme-to-C compiler and some (like YScheme) that have short-pause GC for realtime applications, and that's quite lovely when you have to support many different platforms. Guile is a Scheme interpreter explicitly written to be embedded.

TCL. Probably a bit "underpowered" compared to the other languages, it was born to be embeddable and extensible, almost to the point that it can be seen more as a platform for writing DSLs than as a language. Similar to Forth, but without the annoying RPN syntax. Not very fast, but very easy.

Gaming-specific scripting languages. There are quite a few languages that were made specifically for videogames, most of them in reaction to Lua, most of them similar to Lua (even if they mostly go for a more C-like syntax). None of them is as popular as Lua, and I'd say, none of them emerges as a clear winner over Lua in terms of features, extensibility or speed. But many are worth considering: Squirrel, AngelScript, GameMonkey, ChaiScript, MiniD.

Other honorable mentions. Pawn is probably the closest to what I'd like to have, super small (it has no types, variables are a 32bit "cell" that can hold an integer or can be cast to a float, a character or a boolean) and probably the fastest of the bunch, but it seems to be discontinued as its last release is from 2009. Falcon is pretty cool too, but it seems to be geared more towards being a "ruby" than a "lua" (that's to say, a complete, powerful multi-paradigm language to join libraries instead of an extension, embedded language) even if they claim to be fast and to be easy to embed. Last, I didn't investigate it much, but knowing Wouter's experience in crafting scripting languages, I wouldn't be surprised if CubeScript were a hidden gem.

Roll your own. Yes, really, I'm serious. It's not that hard, especially using a parser-generator tool. ANTLR is one of the best and easiest (see this video tutorial). Interpreters are "easy" and LLVM can be used to write a native compiler that targets consoles (Wii, PS3, 360 can all be targeted with it) for optimized builds.

High-level languages

Where and why we need them: By "high level" here we mean languages that are "higher than C". We seek features like type-safety, serialization, reflection, annotations and introspection, runtime code generation, dynamic loading, native threads, better tools integration (refactoring, automated coverage and testing), object lifetime management and so on and on. Usually, they are based on a VM and a JIT, and approach (in theory could even surpass, due to runtime optimizations) the speed of native-compiled systems.
Such languages can be used for most of the game code, even without loss of speed as anyways most games end up implementing such features in their own, often slow, cumbersome, error-prone ways (i.e. reference counting for lifetime management, macros and templates for reflection and so on).

Current champion and its strengths:
By far, nowadays it is the CLI (plus CLS) and C#. For a while there was some fascination with Java, and some PC titles shipped using it, but today C# wins hands down. It's the de-facto standard for tools, and it's getting into the shipped titles as well. Unity is one of the most used game engines out there, and it's based on C#. Microsoft XNA puts C# on the 360. Some games on consoles already shipped using C# in their runtime.
Mono is a strong opensource implementation, it has a JIT, an ahead-of-time compiler and an interpreter. It can even use LLVM as a backend (LLVM targets . Also, CLI supports many many other languages (notably, F#), including dynamic ones via the DLR, so it can serve for scripting as well (see IronPython, IronRuby, IronScheme, Boo and so on...)

Why we should seek an alternative: 
We shouldn't. Maybe one day we will, but right now it would be lovely to try to have more and more game code written in a higher level language; instead, we still rely mostly on the systems language plus some limited scripting. CLI is our best bet so far.

Present and future alternatives:
JVM languages. The Java Virtual Machine was cool, and nowadays still hosts many interesting languages (notably, Clojure and Scala). But the truth is, most of them are also available for the CLI, the JVM does not have any technological advantage over the former (actually it can be a bit harder to compile) and we don't have an equivalent of the excellent Mono project for it. Surely there are some awesome JIT compilers, LLVM support via the VMKit Project, even code hotswapping frameworks, but it seems that it's losing momentum quickly for gaming and it's being pushed more and more into the server realm.

Erlang is another language based on a VM that is having quite some buzz. It was meant for servers and it shows, but more and more games are looking into coroutines and actors for parallelism (Mono added coroutines mostly for games), and Erlang was made specifically for that. It's a functional language that revolves around actors for concurrency. It supports code hot swapping natively and that alone is enough to put it in this list.

Mono is a CLI/CLS implementation, but it's worth noting that it added enough extensions to be considered a platform on its own. Some are compiler extensions that do not change the language (e.g. programs using Mono.Simd will run on any CLI, even if they probably won't use SIMD instructions to speed up the computations), but some are not, notably Continuations support (and for example Unity heavily relies on that) and the uber-cool assembly injection support.

OCaml. ML-family languages in a perfect world would rule Algol-like ones. ML is neat, strongly statically typed but with powerful type inferencing, functional but impure and strict. All with a syntax that really makes sense, not so minimal as the Lisp family is and way easier (and way less research-oriented) than Haskell. OCaml is one of the leading ML dialects, probably the most used one (together with the already mentioned F#). It has a good optimizing compiler that works on a number of platforms. Realistically? It won't happen yet.
ML is a great language to know and use but it's too far removed from Algol (see LangPop and Tiobe for which programming languages are well known...) to be successful in my opinion (maybe F# could change that...). Also, there are many nice dialects but no big standards with multiple compiler implementations, which is what you really need to be "safe" when choosing a systems language.

Haskell. It's perhaps surprising that this one made it into the list. At least, it surprises me; I was "forced" to add it due to the demands that I had here and on the survey. I guess a lot of its "popularity" among video game programmers is due to this paper by Tim Sweeney, where he uses Haskell to demonstrate how types can help our job.
Haskell was born as a research tool, a "definitive" functional language capable of supporting many different models of computation. It's a purely functional, strongly typed language, so you can "reason" about it formally. It's lazy by default. Now while being lazy, functional and strongly typed is undoubtedly nice, I don't think being pure or lazy by default is.
Surely, in theory having no side-effects makes automatic parallelization possible, and Parallel Haskell does that neatly, but in practice the results are not great. And it's true that monads are not that hard (this tutorial is neat) and there are plenty of nice resources on the language, but I still don't think purely functional data structures are something that most people will easily understand or be able to write (in fact most basic data structures that we take for granted in the mutable programming world are still considered great research topics when turned into persistent versions).
Reasoning about the space and time performance of a Haskell program is also very hard, and while I do think that a good language should decouple the logical representation of data from its physical layout, I also think we still need to be able to strictly control both. In some domains, more declarative languages are surely welcome.
I do think that Haskell is a language that needs to be learned, and it's great to experiment with. But I personally don't see it as something we will use in the foreseeable future.

System programming

Where and why we need them: High performance, core code. Languages in this tier need to be able to directly manipulate memory and need to support all of the platform's functionality, even better if they support inline assembly. Statically compiled, strict languages. Used for computational kernels, core data structures and direct interface with the hardware.

Current champion and its strengths:  
C/C++, by far. It's the only language that you will find on all the platforms. Its compilers are usually the fastest, and they get augmented with all the features needed to fully use the target hardware (e.g. SIMD extensions and so on).
Native platform libraries interface with them and only them out of the box. In other words, right now for any practical purpose, that means that ANY other language in this category has to have an easy, fast interface with C/C++ or compile to (generate) C/C++.
All the programmers in your company know at least some C and C++.


Why we should seek an alternative: 
C is too low-level for the needs of huge projects as modern games are, it tends to become cumbersome if used for large parts of your source.
C++ is made of evil. Probably not many in your company really know all of C, surely no one fully understands C++.
Also, both C and C++ are showing their age in some respects. Important concepts like SIMD operations are supported only through compiler extensions. Other language features made sense ten years ago but not so much now (for example, short-circuiting boolean operations generate branches that often cost more than what they save).
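To illustrate the short-circuit point, here is a small sketch (both function names are made up). Whether the first form really compiles to a branch depends on the compiler and the target, so take it as an illustration and not as a measurement:

```cpp
#include <cassert>

// Short-circuit form: the second comparison is only evaluated if the first
// one passes, which a compiler may implement with a conditional branch.
inline bool InRangeShortCircuit(float x, float lo, float hi) {
    return x >= lo && x <= hi;
}

// Eager form: both comparisons are always evaluated and combined with a
// bitwise AND, so there is no control-flow dependency between them.
inline bool InRangeEager(float x, float lo, float hi) {
    return ((x >= lo) & (x <= hi)) != 0;
}
```

The two are only interchangeable when the second operand has no side effects and is safe to evaluate unconditionally, which is exactly the guarantee that short-circuiting was designed to give.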

Present and future alternatives: 
D. D is C++ done right. There is a proprietary compiler from Digital Mars and frontends for GCC and LLVM. It's an interesting, well made language, supported even by Alexandrescu, a hardcore C++ guru.
It looks like C++ but it's much simpler (it drops all C compatibility, the preprocessor, multiple inheritance, forward declarations and include files, non-virtual member functions and so on) and much more powerful (sane memory management, sane templates, first-class functions and closures, immutable structures, modules, threads, contracts, dynamically compiled code). It still misses some needed features (like runtime reflection, even if I hear it's not hard to add as it supports compile-time reflection), but most of it is there.
Its major obstacle to success is the fact that it is similar to C++. So similar that it is questionable whether its superior design is worth the trouble of migrating to a language that still does not guarantee support on each platform out of the box.

Objective-C. I don't have enough experience to write about this, but I had a few comments by programmers advocating it, so I'll have to include it here. It's already used in games as it's the language of choice for Apple devices, but I don't know if it could be suitable for AAA games.

Go. Go is a very recent language created by Griesemer, Pike and Thompson at Google. That's enough to be worth considering. Currently there are a few compilers, notably one in the GCC stack. It's much simpler than C++, it only supports interfaces but it does not need to declare inheritance, it's garbage collected, it does not permit pointer arithmetic and it supports threads and tasks (via "goroutines"). One of its goals was to make software development faster, and dependency management easier. That's exactly what we need. Go is still too young, but it surely needs to be followed.

Rust is an experimental language by Mozilla Labs. It's still in development and its syntax is not finalized yet but it looks incredibly promising. It currently has a compiler that uses LLVM as its backend. It's memory safe, immutable by default, concurrent (with coroutine and actor support), it has first class functions and a neat way of expressing and enforcing invariants at runtime and compile-time. It also supports "localized rule-breaking" meaning that safety rules can be broken "if explicit about where and how".

C. If we manage to port more and more code to a language in the previous category (higher level) then we might not need any neat feature in our systems programming language, just sheer speed and compatibility with the hardware.
C would be a great choice, and it often manages to implement features that are important for performance before C++. It's also way easier to parse and reason about than C++ so it's better for tools, it's easier for compilers to work with it (good luck finding a 100% compliant C++ one) and so on.
Many people in the industry are looking back at C or going for a more C-like programming style, I even heard talks of projects to extend C with novel object models (via code generation) to avoid C++.

OpenCL. Yes, it's a language for GPUs. That's to say very powerful processors made of many low-power, in-order, shared-cache parallel processing units. In other words, the future. We need a language for data-parallel computations, and OpenCL seems to be a reasonable choice. I would personally prefer something even more restrictive and stream-oriented, with well-defined inputs and outputs (easy to check for correctness!) and means of connection and buffering between the kernels, but OpenCL is the standard and it's supported by every GPU vendor. Moreover, we already have several CPU implementations, like FOXC that spits out C code from OpenCL, Intel OpenCL SDK for x86 CPUs, IBM OpenCL for Cell. There are also similar initiatives targeting NVidia's proprietary Cuda language, like GPUOcelot. Of course LLVM plays a major role again, even AMD's GPU compiler seems to be based on it, and there are both backends and frontends in the works for it.

Your own C/C++. To a degree, everyone is already using their own version of C++ incompatible with everyone else's. That's because to use C++ in production you have to both restrict and extend it. That's usually done with coding guidelines and reviews plus custom libraries with a sprinkle of preprocessor macros (and setting your compilers to the maximum warning level plus enabling warnings as errors). In other words, horribly.
It is possible, with tools, to do better. Parsers and rules can be used to restrict the language in a well-enforced way. There are a few of such tools and they are mostly commercial (cppcheck is a notable exception, parsing and understanding C++ is a nightmare) like PC-Lint, Coverity, Lattix (for dependency analysis and design rules enforcement), PVS-Studio and so on.
Parsers can also be used to enhance the language, by gathering and exposing information, either to be directly used (i.e. reflection) or to be fed into code-generation tools (to generate bindings or for serialization etc). It's a very interesting possibility, and we have a number of different ways to extract information.
Compilers can help: it's possible to parse their symbol and debugging files, and some can even directly output useful data (most notably, GCC and Clang with libclang; even Visual C++ considered this but I don't see it happening yet).
Doxygen also has a good parser that can extract quite some interesting information, SWIG can parse headers and there are also a few parsers dedicated to language extension and analysis, like PyCParser, CTool and CIL for C, Transformers, OpenC++ and VivaCore (that is used by the PVS-Studio linter), ELSA, Harmonia (that tries to build an incremental tool for refactoring and program transformation), and commercial ones like Understand.
Last but not least, code generation: after parsing either vanilla C++ or an annotated version with custom extensions, for most practical uses we'll need to emit some code.
This is, for example, the approach followed by Qt's MOC compiler to implement their object model. Some tools support full source-to-source translation (parsing and emitting code), like the DMS toolkit, Stratego/XT or the ROSE compiler.
But code generation is not only useful for source-to-source transforms and language enhancement, it's also vital to bridge data and DSLs to C++ and in general to create interfaces between different systems. That's the approach used by object models like COM, or data formats like Google protobufs and many other domains. Many frameworks approach this as a text generation problem (like Cog and Cheetah) similar to what JSP does for web pages, but there are also generators capable of working directly from a syntax tree or something in the middle between the two, like cppgenmodel and rGen.

Conclusions

Ok so, let's say I have a few months and I have to write a new game engine. What would be my best bet? Well I'd have to toss a coin between CLI+C and Lua+My own C++.

The CLI is a very good platform, and it's surprisingly well supported in gaming. For current-gen it would be a no brainer, Mono already supports all that you need and even rolling your own CLI compiler with LLVM is not an impossible task at all (and you can generate symbols to be able to use existing debuggers/profilers and so on). Microsoft will surely continue to push it even on future platforms, and other vendors and many studios are interested in the technology as well. The only problem I can see is that you won't get a 100% guarantee that you can support any future platform day-zero, as we are not seeing vendors shipping their own CLI for general game development yet.

The other option is to invest in a good, proprietary data/object/module/service/task model for C++, banishing its own OO system, restricting object usage mostly to services to be contained in dynamic linkable, hotswappable modules and to data structures. A generic data model with reflection and serialization support could be used for communication between modules, RPC, persistence and data parallel computations. All this wrapped in some DSLs to code-generate all the fluff. It won't be any easier (personally I think it would be harder, for me) than writing a CLI compiler with LLVM from scratch, and it will be much messier, but 100% future proof.

Addendum

AI-specific languages. I believe in a world where no game will need to implement a state machine. There are quite a few "AI" languages, from the obvious Prolog family to constraint programming languages to more general frameworks like Soar. I'm not an AI expert, and I don't believe that we'll ever see one of these in a game, but they are great inspirations for DSLs, often embedded into other languages (internal DSLs).

Languages for number crunching. As I wrote for the scripting and AI-specific languages, there is a lot of inspiration to be found looking at "obscure" task specific languages. Parallelism, concurrency and memory bandwidth are not only problems nowadays, they were problems even for the early supercomputers. Most of these languages are never going to be seen in a game (and many of them are hard to find in applications anywhere!) but understanding them is great, especially if you're going to design a DSL.
Nowadays "functional" is considered cool because it should make concurrency easier (immutability, no shared data). It's patently false, as demonstrated by practice: there is no purely functional language that scales well for number crunching (and yes, I know how Haskell can parallelize things using sparks), while there are purely imperative languages that easily scale to thousands of processors (Cg, HLSL, on GPUs). Mutability is not a problem (especially if you don't share...), controlling data is! Making a list of "interesting" languages is pointless, but I'd suggest here to keep an eye on dataflow languages, like Sisal and Sa-C.

01 April, 2011

Version Control for the next-gen?

Source code - a DVCS, like Mercurial
Source art - A distributed, versioning (copy on write) filesystem. Lustre and ZFS (combined). I also hear nice things about using a VCS plus a dependency tracking system to sync only what's needed for a given task (see Shotgun and Tactic)...
Build art/code - continuous build machines publishing tested builds to a distribution system (i.e. bittorrent)


Thoughts?

27 March, 2011

Stable Cascaded Shadow Maps - Ideas

Stable CSM intro

A "stable" cascade is nothing else than a fixed projection of your entire world on a giant texture, of which we render a fixed window that fits around the projection of view frustum each frame, making sure that we always slide this fixed window by an integral number of texels each frame. 
As we have to be sure that the "window" will fit the frustum in all cases, one way to determine its size is to fit the frustum in a sphere and then size the window using the radius of that sphere.
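A rough sketch of those two steps, sphere fitting and texel snapping. All the names here (Vec3, CascadeWindow, BoundingSphere, FitStableWindow) are hypothetical, and I'm assuming the frustum corners are already expressed in light space, with x and y aligned to the shadow map axes:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

struct CascadeWindow {
    float centerX, centerY; // light-space center, snapped to the texel grid
    float halfSize;         // half of the window side, in light-space units
};

// Sphere around the 8 corners of a frustum slice: centroid plus the distance
// to the farthest corner. Not the minimal sphere, but simple and conservative.
float BoundingSphere(const Vec3 (&corners)[8], Vec3& center) {
    center = Vec3{0.0f, 0.0f, 0.0f};
    for (const Vec3& c : corners) {
        center.x += c.x; center.y += c.y; center.z += c.z;
    }
    center.x /= 8.0f; center.y /= 8.0f; center.z /= 8.0f;

    float maxDistSq = 0.0f;
    for (const Vec3& c : corners) {
        const float dx = c.x - center.x, dy = c.y - center.y, dz = c.z - center.z;
        maxDistSq = std::max(maxDistSq, dx * dx + dy * dy + dz * dz);
    }
    return std::sqrt(maxDistSq);
}

// Size the window from the sphere and snap its center to whole texels, so
// that between frames the window only slides by an integral number of texels.
CascadeWindow FitStableWindow(const Vec3& centerLS, float radius, int shadowMapSize) {
    assert(radius > 0.0f && shadowMapSize > 0);
    const float texel = (2.0f * radius) / static_cast<float>(shadowMapSize);
    CascadeWindow w;
    w.centerX = std::floor(centerLS.x / texel) * texel;
    w.centerY = std::floor(centerLS.y / texel) * texel;
    w.halfSize = radius;
    return w;
}
```

Snapping moves the window by up to one texel, so in practice you would probably pad the radius a little, and round it so that floating-point noise does not change the window size from frame to frame.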

Implementing CSM, especially on consoles, is not that easy. For an open world game you'll notice that you need quite a lot of resolution to get decent results, and cascade rendering can quickly become a problem. On 360, from what I've seen, resolving big shadowmaps from EDRAM to the shared memory is very expensive too, so it becomes important to pack the shadowmaps aggressively. Some random "good" ideas are:
  • Render shadows to a deferred shadow buffer, enabling the possibility of rendering one cascade at a time. It also makes it way easier to cross-fade cascades and makes it possible to render shadows at half-res (that is a good idea... upsampling with bilateral filtering or similar). It's possible to use hi-stencil and hi-z (on ps3, also depth range) in various ways to accelerate this.
  • Tune cascade shadow filtering to try to match filter size across different resolutions (that's to say, filter less far cascades).
  • Shadow a pixel using the best cascade that contains that pixel, instead of relying on the frustum split planes (this makes it a bit harder to fade between cascades, but not too much). Use scissors or clipping planes to avoid rendering stuff that is already rendered in previous cascades into coarser ones. Microsoft has a pair of nice articles.
  • Compute light near-far planes to be tight around the frustum but avoid culling objects before the near plane (a.k.a. "pancake": clamp depth in the vertex shader, it's not a big deal as the projection is orthographic but it can screw self shadowing of such clipped objects, you need to give a bit of "buffer" space to the near plane). The downside is that you get more raster pressure as the hi-z will not reject the objects that are compressed on the near plane... you can solve that either by giving a small linear range for the pancaked objects or marking and using stencil/hi-stencil where they get drawn.
  • Cull small objects aggressively from distant cascades. Avoid rendering objects in far cascades if they were rendered completely in the previous ones.
  • Pack shadowmaps! Do not render things behind the frustum and maximize the area in front of it! This and this articles have some good ideas. You can also pack two shadowmaps into a two-channel 16-bit target if double-depth fill is not giving you a big speedup.
Still, after doing all this, you might end up needing more performance...

Crysis

I'm playing Crysis 2. Nice game, starts a bit weak with a too forced story but it improves A LOT later on. Graphically it is great, as I'm sure you've all noticed. Long story short, I still probably love Modern Warfare and Red Dead a bit more, but it does not disappoint. Somehow the art direction on Crysis 2 looks a bit "hyperrealistic" to me most of the time, with very soft and exaggerated ambient fill, even more accentuated by the huge bloom. But well, technically it is impressive and it is surely a good game.

Now of course if you're a rendering engineer, first thing you do with such a game is to walk slowly everywhere and check out the rendering techniques. And so did I. Some notes:
  • Lods pop noticeably, small objects are faded out pretty aggressively. Still during "normal" gameplay it's not too evident.
  • DOF is pretty smart. It seems to filter with a "ring" pattern that I guess is both an optimization and a way to simulate bokeh. It looks like what you get from a catadioptric mirror lens, but it's reasonable also because most lenses will have a sharp out of focus either before or after the focal plane, as the bokeh shape of one is the inverse of the other (so if a lens has a nice gaussian-like out of focus after the focal plane, it will get a harsh negative-gaussian one before). It also manages to correctly blur objects before the focal plane, kudos for that.
  • Huge screenspace bloom/lens flares.
  • Motion blur (camera only?)
  • Decent post-filtering AA, even if with some defects (ghosting of objects in motion), not the best I've seen but good.
  • Shadows. Stable CSM. A weird circular filter is applied to them. No fading between cascades. A dithering pattern that seems to be linked to the light space. Far cascades are updated every other frame.
Ok. So the last item caught my attention. How to do that? Well, it's not that hard if you think about it. If you observe the update of the CSM, you'll notice that even when you rotate the view your far cascades move only by a few texels, so we could just add a bit of space there and assume that updating these cascades every other frame won't create problems.

Caching

But what if we want to be accurate? Well it turns out it's not really hard at all! We know what is the window we rendered last frame, and where we should render this frame. Most of the new frame is already rendered in the last one, we could just shift the data in the right place. 

It turns out, we don't even need that, if we want to apply this incremental update only once and then re-render, we can just shift our "zero" of the shadowmap uv and wrap. We still need to render the new data and resolve it, but that is only a few texels wide border! Even culling the objects to render only the ones that fall in that border is really trivial.
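The wrap-around bookkeeping can be sketched per axis as follows. ScrollAxis, WrapTexel and ScrollResult are hypothetical names, and I'm assuming the window motion is already expressed as a whole number of texels, which the stable CSM snapping guarantees:

```cpp
#include <cassert>
#include <cstdlib>

// Positive modulo, used to wrap texel coordinates around the texture.
inline int WrapTexel(int v, int size) {
    const int m = v % size;
    return m < 0 ? m + size : m;
}

struct ScrollResult {
    int newOffset;  // texel in the texture where window column 0 now lives
    int dirtyStart; // first window-space column without valid cached data
    int dirtyCount; // how many columns must be re-rendered this frame
};

// One axis of the wrap-around update. deltaTexels is how far the window
// moved along this axis since the cached frame; call it once for x, once for y.
ScrollResult ScrollAxis(int oldOffset, int deltaTexels, int size) {
    assert(size > 0);
    ScrollResult r;
    r.newOffset = WrapTexel(oldOffset + deltaTexels, size);
    const int moved = std::abs(deltaTexels);
    if (moved >= size) {
        // Nothing overlaps with the cached window: re-render everything.
        r.dirtyStart = 0;
        r.dirtyCount = size;
    } else if (deltaTexels >= 0) {
        // Moved forward: the new border is at the far end of the window.
        r.dirtyStart = size - moved;
        r.dirtyCount = moved;
    } else {
        // Moved backward: the new border is at the start of the window.
        r.dirtyStart = 0;
        r.dirtyCount = moved;
    }
    return r;
}
```

The shadow lookup then adds newOffset (converted to uv) to its coordinates with wrap addressing, and only objects overlapping the dirty strip need to be drawn.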

Really, we could do an incremental update for every cascade... forever! If it wasn't for two things: moving objects, and the fact that we can't fix our cascade (light) near/far z, as to maximize the resolution we usually need to refit it each frame (or so).

We could alleviate the latter problem by having the "shifting" shader also re-range the "cached" last frame data into the new near/far range. The moving objects one can be solved by having them rendered into a separate buffer or a copy of the buffer. Both solutions though need more memory and bandwidth (resolve time on 360) so they can be good only if that is not already a major bottleneck (that's to say, if you packed your cascades well).
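The re-ranging itself is simple arithmetic. Here it is sketched in C++ even though it would live in a shader; RerangeDepth is a made-up name, and I'm assuming an orthographic light projection with depth stored linearly in [0, 1] between the near and far planes:

```cpp
#include <algorithm>
#include <cassert>

// Convert a cached depth, stored relative to last frame's near/far planes,
// into the same light-space distance expressed in this frame's range.
float RerangeDepth(float cachedDepth, float oldNear, float oldFar,
                   float newNear, float newFar) {
    assert(oldFar > oldNear && newFar > newNear);
    // Back to an absolute light-space distance...
    const float z = oldNear + cachedDepth * (oldFar - oldNear);
    // ...then forward into the new normalized range.
    const float d = (z - newNear) / (newFar - newNear);
    // Depths that fall outside the new range are clamped onto its planes.
    return std::min(1.0f, std::max(0.0f, d));
}
```

Every re-range is also a requantization of the stored depth, so repeating it for many frames on a low-precision shadowmap format will accumulate error.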

26 March, 2011

Debugging DirectX9 is so stressful!

I've been working on console games for the past five years, so I don't know much about the PC tech these days. Now I'm on a console/pc title and I just had to debug the PC build.

Oh. My. God.

I got so stressed I actually almost got sick that night. And then yes, it turned out to be a really small bug that I could totally have debugged easier without hooking these tools anyway.

Pix for Windows is a joke, but still it's the best tool I tried... It's a bit better on DX10/11 (faster refresh)

NVidia perfHud is rather useless (even if I hear it's better than Pix for profiling, which I believe as Pix currently is unable to do any profiling at all) and the Intel GPA thing did not seem to really work at all (it took 20 minutes just to load the capture and it gave me some weird results, even if it looks better than Pix, it's promising I guess) - Update: newer versions of GPA seem to work fine, and actually it's now my preferred DX9 tool!

ApiTrace is a new tool which might be good... I had a look at one of the early versions which did not work for DX9, now it seems to have added support for all APIs...

ATI has a Gpu PerfStudio thing which is decent, but it deprecated DX9, the current version is for DX10/11 only.

For some things I would even say the old 3d reaper (or ripper) and DXexplorer are better tools!

I really fucking hope that the new Nvidia Parallel NSight (a.k.a. Nexus) and ATI Gpu PerfStudio 2 are great, I could not try them as they're dx10/11 only and I'm currently on dx9. Overall it really shows how much the industry is committed to PC these days...