Stable CSM intro
A "stable" cascade is nothing else than a fixed projection of your entire world on a giant texture, of which we render a fixed window that fits around the projection of view frustum each frame, making sure that we always slide this fixed window by an integral number of texels each frame.
As we have to be sure that the "window" will fit the frustum in all cases, to determine its size a way is to fit the frustum in a sphere and then size the window using the radius of such sphere.
Implementing CSM, especially on consoles is not that easy. For an open world game you'll notice that you need quite a lot of resolution to get decent results, and cascade rendering can become quickly a problem. On 360 from what I've seen, resolving big shadowmaps from EDRAM to the shared memory is very expensive too, so it becomes important to pack the shadowmaps aggressively. Some random "good" ideas are:
- Render shadows to a deferred shadow buffer, enabling the possibility of rendering one cascade at a time. It also makes way easier to cross-fade cascades and possible rendering shadows at half-res (that is a good idea... upsampling with bilateral filtering or similar). It's possible to use hi-stencil and hi-z (on ps3, also depth range) in various ways to accelerate this.
- Tune cascade shadow filtering to try to match filter size across different resolutions (that's to say, filter less far cascades).
- Shadow a pixel using the best cascade that contains that pixel, instead of relying of the frustum split planes (this makes a bit harder to fade between cascades, but not too much). Use scissors or clipping planes to avoid rendering stuff that is already rendered in previous cascades into more coarse ones. Microsoft has a pair of nice articles.
- Compute light near-far planes to be tight around the frustum but avoid culling objects before the near plane (a.k.a. "pancake": clamp depth in the vertex shader, it's not a big deal as the projection is orthographic but it can screw self shadowing of such clipped objects, you need to give a bit of "buffer" space to the near plane). The downside is that you get more raster pressure as the hi-z will not reject the objects that are compressed on the near plane... you can solve that either by giving a small linear range for the pancaked objects or marking and using stencil/hi-stencil where they get drawn.
- Cull small objects aggressively from distant cascades. Avoid rendering objects in far cascades if they were rendered completely in the previous ones.
- Pack shadowmaps! Do not render things behind the frustum and maximize the area in front of it! This and this articles have some good ideas. You can also pack two shadowmaps into a two-channel 16-bit target if double-depth fill is not giving you a big speedup.
Crysis
I'm playing Crysis 2. Nice game, starts a bit weak with a too forced story but it improves A LOT later on. Graphically is great as I'm sure you've all noticed, ok long story short, I still probably love Modern Warfare and Red Dead a bit more but it does not disappoint. Somewhat the art direction on Crysis 2 looks a bit "hyperrealistic" to me most of the times with very soft and exaggerated ambient fill, even more accentuated by the huge bloom. But well, technically is impressive and it is surely a good game.
Now of course if you're a rendering engineer, first thing you do with such a game is to walk slowly everywhere and check out the rendering techniques. And so did I. Some notes:
- Lods pop noticeably, small objects are faded out pretty aggressively. Still during "normal" gameplay it's not too evident.
- DOF is pretty smart. It seems to filter with a "ring" pattern that I guess is both an optimization and a way to simulate bokeh. It looks like what you get from a catadioptric mirror lens, but it's reasonable also because most lenses will have a sharp out of focus either before of after the focal plane, as the bokeh shape of one is the inverse of the other (so if a lens has a nice gaussian-like out of focus after the focal plane, it will get an harsh negative-gaussian one before). It also manages to blur correctly objects before the focal plane, kudos for that.
- Huge screenspace bloom/lens flares.
- Motion blur (camera only?)
- Decent post-filtering AA, even if with some defects (ghosting of objects in motion), not the best I've seen but good.
- Shadows. Stable CSM. A weird circular filter is applied to them. No fading between cascades. A dithering pattern that seems to be linked to the light space. Far cascades are updated every other frame.
Ok. So the last item caught my attention. How to do that? Well, it's not that hard if you think about it. If you observe the update of the CSM, you'll notice that even when you rotate the view your far cascades move only by a few texels, so we could just add a bit of space there and assume that updating these cascaded every other frame won't create problems.
Caching
But what if we want to be accurate? Well it turns out it's not really hard at all! We know what is the window we rendered last frame, and where we should render this frame. Most of the new frame is already rendered in the last one, we could just shift the data in the right place.
It turns out, we don't even need that, if we want to apply this incremental update only once and then re-render, we can just shift our "zero" of the shadowmap uv and wrap. We still need to render the new data and resolve it, but that is only a few texels wide border! Even culling the objects to render only the ones that fall in that border is really trivial.
Really, we could do an incremental update for every cascade... forever! If it wasn't for two things: moving objects and the fact that we can't fix our cascade (light) near/far z, but we usually to maximize the resolution need to fit it each frame (or so).
We could alleviate the latter problem by having the "shifting" shader also re-range the "cached" last frame data into the new near/far range. The moving objects one can be solved by having them rendered into a separate buffer or a copy of the buffer. Both solutions though need more memory and bandwidth (resolve time on 360) so they can be good only if that is not already a major bottleneck (that's to say, if you packed your cascaded well).