
25 August, 2009

Experiment: DOF with Pyramidal Filters

I'm back from a short trip to Montreal. A lot of people have already blogged exhaustively about SIGGRAPH and such things, so I won't.

Instead, as promised, here is the source code from my DOF experiment. It works in FX Composer 2.5; I couldn't use 1.x this time because it crashed on my MacBook... 2.5 has other bugs, but I managed to work around them (see the code).

As with other snippets I've published, these are small tests I did at home. I hope they can inspire someone, maybe even in other domains, more than being something I'll use right now in a game (the catch is, if they were, I couldn't publish them anyway ;)

A little background:

Pyramid filtering is a very useful technique for implementing a wide class of continuous convolution kernels. It can be used for a wide range of applications, from image upsampling to inpainting (see: Strengert, 2007. Pyramid Methods in GPU-Based Image Processing. Conference on Computer Graphics Theory and Applications (GRAPP'07), volume GM-R, pp.21-28).

Like separable filters, pyramid filters can be computed in linear time, but unlike separable filters, they make it easy to compute the convolution for several kernel sizes at once.
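To make the linear-time claim a bit more concrete, here is a small Python sketch that counts the pixels touched by a pyramid. It assumes the textbook case where every level halves the resolution in each dimension; the function name is made up for illustration, and note that the listing in this post sizes its buffers as 1/n of the viewport instead, so the numbers there are different.

```python
def pyramid_pixel_count(width, height):
    """Total pixels over all levels, halving (rounding up) until 1x1.

    Assumption: each level is half the size of the previous one in
    both dimensions, which is the usual mipmap-style pyramid.
    """
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
    return total


base = 1024 * 1024
total = pyramid_pixel_count(1024, 1024)
# The ratio stays below 4/3 (geometric series 1 + 1/4 + 1/16 + ...),
# so the whole pyramid costs a constant factor more than one pass.
print(total / base)
```

With a fixed, small kernel per level, the work per pass is proportional to these pixel counts, and doing both analysis and synthesis only doubles that constant.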

A pyramid filter works by applying a convolution, with a small kernel, on the source image, and downsampling the result into a smaller texture. This step is called analysis, and it's repeated multiple times. Each new analysis level allows us to compute our kernel over a wider area. After a given number of analysis steps, we perform the same number of synthesis steps, where we start on the smallest level of our image pyramid, and go up by convolving and upsampling the image.
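As a rough illustration of the two phases, here is a 1D Python sketch. It is not the shader from this post: it assumes a plain two-tap box filter for analysis, linear interpolation for synthesis, and a signal whose length is a power of two. All names are hypothetical.

```python
def analyze(signal):
    """One analysis step: average neighbouring pairs, halving the resolution."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]


def synthesize(coarse, fine_len):
    """One synthesis step: upsample to fine_len samples by linear interpolation."""
    out = []
    for i in range(fine_len):
        # Position of fine sample i in coarse coordinates (sample centers aligned),
        # clamped to the valid range like a clamp addressing mode would.
        pos = (i + 0.5) / 2.0 - 0.5
        pos = min(max(pos, 0.0), len(coarse) - 1.0)
        lo = int(pos)
        hi = min(lo + 1, len(coarse) - 1)
        t = pos - lo
        out.append(coarse[lo] * (1.0 - t) + coarse[hi] * t)
    return out


def pyramid_blur(signal, levels):
    """Run `levels` analysis steps down, then the same number of synthesis steps up."""
    sizes = []
    current = list(signal)
    for _ in range(levels):
        sizes.append(len(current))
        current = analyze(current)
    for size in reversed(sizes):
        current = synthesize(current, size)
    return current


impulse = [0.0] * 16
impulse[8] = 1.0
for levels in (1, 2, 3):
    blurred = pyramid_blur(impulse, levels)
    print(levels, sum(1 for v in blurred if v > 0.0))
```

Feeding it an impulse shows the point: the footprint of the result grows with the number of levels, even though every step only uses a tiny kernel.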

This pyramid lets us vary the size of our filtering kernel in the synthesis step. As the kernel size depends on the depth of the pyramid, deciding on which level a given pixel starts its synthesis process affects the size of the filter applied to that region of the image.
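A minimal Python sketch of that per-pixel level selection, for a single pixel with scalar "colors". It mirrors what the synthesis pass in the listing does (a blend factor of saturate(coc - level), blended over the finer level), but the function names and the list layout are assumptions made for this example.

```python
def saturate(x):
    """Clamp to [0, 1], like the HLSL intrinsic."""
    return min(max(x, 0.0), 1.0)


def level_weight(coc, level):
    """Blend factor for a level: 0 up to `level`, ramping to 1 one unit above it."""
    return saturate(coc - level)


def composite(level_colors, coc):
    """Blend pyramid levels for one pixel.

    level_colors[0] is the sharp image, higher indices are blurrier levels
    (hypothetical layout). Walks from coarse to fine, blending the coarser
    result over the next finer level.
    """
    result = level_colors[-1]
    for level in range(len(level_colors) - 1, 0, -1):
        w = level_weight(coc, level)
        result = w * result + (1.0 - w) * level_colors[level - 1]
    return result


levels = [0.0, 1.0, 2.0, 3.0]  # stand-in values, one per pyramid level
for coc in (0.0, 1.5, 10.0):
    print(coc, composite(levels, coc))
```

A pixel with a small circle of confusion only ever sees the fine levels, while a pixel with a large one effectively starts its synthesis from the bottom of the pyramid.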

There are a few problems to be solved if you want to implement DOF with this.

The first one is that you need to mask some samples during your blurring pass, and it's not obvious how to choose which ones, as the filtering is now a two-pass process. Ideally you'd want to mask different things during the first and the second pass, but that's not really possible.

The second problem is how to deal with foreground blur, that is, bleeding out the blur outside foreground object borders (see Starcraft II Effects & Techniques. Advances in Real-Time Rendering in 3D Graphics and Games Course - SIGGRAPH 2008).
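For the masking problem, the listing in this post weights every neighbouring sample by how close its depth is to the center sample's depth. Here is the same idea as a small Python sketch on (color, depth) pairs; the function names are hypothetical, colors are scalars for brevity, and the default tolerance of 0.1 just matches the default of the ZSpillingTolerance slider.

```python
def anti_spill_weight(center_z, sample_z, tolerance=0.1):
    """1 when the depths match, falling linearly to 0 at `tolerance` apart."""
    return max(tolerance - abs(sample_z - center_z), 0.0) / tolerance


def masked_average(center, neighbours, tolerance=0.1):
    """Depth-aware average of a center sample and its neighbours.

    `center` and each entry of `neighbours` are (color, depth) pairs.
    The center always has weight 1, so the sum of weights is never zero.
    """
    center_color, center_z = center
    color_sum, z_sum, weight_sum = center_color, center_z, 1.0
    for color, z in neighbours:
        w = anti_spill_weight(center_z, z, tolerance)
        color_sum += color * w
        z_sum += z * w
        weight_sum += w
    return color_sum / weight_sum, z_sum / weight_sum


# Neighbours at the same depth are averaged in...
print(masked_average((1.0, 0.5), [(0.0, 0.5)] * 4))
# ...while neighbours far away in depth are masked out.
print(masked_average((1.0, 0.5), [(0.0, 0.9)] * 4))
```

This keeps a sharp object from soaking up color from something at a very different depth, but it does nothing for the second problem, since a blurred foreground needs to bleed out over its borders rather than be kept inside them.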

I found a solution that does not look too bad. It can still be improved (a lot) and tweaked, but I didn't work further on it because it's still unable to achieve a good bokeh, and I think that's really something we have to improve in our DOF effects now. It's probably better suited to other effects, where you don't have so many discontinuities in your blur radius. Smoke and fog scattering, for example, could work very well.

float Script : STANDARDSGLOBAL <
string UIWidget = "none"; // suppress UI for this variable
string ScriptClass = "scene"; // this fx will render the scene AND the postprocessing
string ScriptOrder = "standard";
string ScriptOutput = "color";
string Script = "Technique=Main;";
> = 0.8; // FX Composer supports SAS .86 (directX 1.0 version does not support scripting)

#define COLORFORMAT "A16B16G16R16F"
//#define COLORFORMAT "A8B8G8R8"

// FX Composer 2.5, on Vista, on my MacBook, does ignore "Clear=Color;" and similar... So I have to FAKE IT!
#define USEFAKECLEAR

// -- Untweakables, without UI

float4x4 WorldViewProj : WorldViewProjection < uiwidget="None">;

float2 ViewportPixelSize : VIEWPORTPIXELSIZE
<
string UIName="Screen Size";
string UIWidget="None";
>;

// -- Tweakables

float ZSpillingTolerance
<
string UIWidget = "slider";
float UIMin = 0;
float UIMax = 2;
float UIStep = 0.01;
string UIName = "Blending tolerance for background to foreground";
> = 0.1;

float DOF_ParamA
<
string UIWidget = "slider";
float UIMin = 0;
float UIMax = 20;
float UIStep = 0.01;
string UIName = "Depth of field";
> = 1;

float DOF_ParamB
<
string UIWidget = "slider";
float UIMin = -10;
float UIMax = 10;
float UIStep = 0.1;
string UIName = "Depth distance";
> = -5;

// -- Buffers and samplers

texture DepthStencilBuffer : RENDERDEPTHSTENCILTARGET
<
float2 ViewportRatio = {1,1};
string Format = "D24X8";
string UIWidget = "None";
>;

// No version of FX Composer lets me bind a mipmap surface as a render target, that's why I need all this:
#define DECLAREBUFFER(n) \
texture ColorBuffer##n : RENDERCOLORTARGET \
< \
float2 ViewportRatio = {1./n,1./n}; \
string Format = COLORFORMAT; \
string UIWidget = "None"; \
int MipLevels = 1; \
>; \
sampler2D ColorBuffer##n##Sampler = sampler_state \
{ \
texture = <ColorBuffer##n>; \
MagFilter = Linear; \
MinFilter = Linear; \
AddressU = Clamp; \
AddressV = Clamp; \
}; \

DECLAREBUFFER(1)
DECLAREBUFFER(2)
DECLAREBUFFER(3)
DECLAREBUFFER(4)
DECLAREBUFFER(5)
DECLAREBUFFER(6)

// -- Data structures

struct GeomVS_In
{
float4 Pos : POSITION;
float2 UV : TEXCOORD0;
};

struct GeomVS_Out
{
float4 Pos : POSITION;
float4 PosCopy : TEXCOORD1;
float2 UV : TEXCOORD0;
};

struct FSQuadVS_InOut // fullscreen quad
{
float4 Pos : POSITION;
float2 UV : TEXCOORD0;
};

#ifdef USEFAKECLEAR
struct FakeClear_Out
{
float4 Buffer0 : COLOR0;
float Depth : DEPTH;
};
#endif

// -- Vertex shaders

GeomVS_Out GeomVS(GeomVS_In In)
{
GeomVS_Out Out;

Out.Pos = mul( In.Pos, WorldViewProj );
Out.PosCopy = Out.Pos;
Out.UV = In.UV;

return Out;
}

FSQuadVS_InOut FSQuadVS(FSQuadVS_InOut In)
{
return In;
}

float4 tex2DOffset(sampler2D tex, float2 UV, float2 texTexelSize, float2 pixelOffsets)
{
// DirectX requires a half pixel shift to fetch the center of a texel! (fx composer 2.5)
return tex2D(tex, UV + (texTexelSize * (0.5f + pixelOffsets)));
}

float4 SceneBakePS(GeomVS_Out In) : COLOR
{
float linZ = 1/In.PosCopy.z;

float3 color = frac(In.UV.xyx * 3); // just a test color...

return saturate(float4(color, linZ));
}

#ifdef USEFAKECLEAR
FakeClear_Out FakeClearPS(FSQuadVS_InOut In)
{
FakeClear_Out Out = (FakeClear_Out)0;
Out.Depth = 1.f;

return Out;
}
#endif

float ComputeDofCoc(float linZ)
{
return abs(DOF_ParamA * ((1/linZ) + DOF_ParamB));
}

float ComputeAntiZSpillWeight(float linZcenter, float linZ)
{
return max(ZSpillingTolerance - abs(linZ - linZcenter),0) / ZSpillingTolerance;
}

float4 AnalysisPS(
FSQuadVS_InOut In,
uniform float level,
uniform sampler2D InColor,
uniform float2 SrcUVTexelSize
) : COLOR
{
const float AnalysisRadius = 1;

// Reduction step / gathering
float4 col = tex2DOffset( InColor, In.UV, SrcUVTexelSize, 0.f.xx );
float centerLinZ = col.w;
float weight = 1;

float4 res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(-1,-1) * AnalysisRadius);
float resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(1,-1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(-1,1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(1,1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

col /= weight; // normalize
//col.w = centerLinZ; // leave Z as untouched as possible

return col;
}

float4 SynthesisPS(
FSQuadVS_InOut In,
uniform float level,
uniform sampler2D InColor,
uniform float2 SrcUVTexelSize,
uniform sampler2D PrevInColor,
uniform float2 PrevSrcUVTexelSize
) : COLOR
{
// Always sampling the first level to obtain linZ
float linZ = tex2DOffset(ColorBuffer1Sampler, In.UV, 1.f/ViewportPixelSize, 0.f.xx).w;
float dof_coc = ComputeDofCoc(linZ);

// Expansion step / scattering
float4 col = tex2DOffset(InColor, In.UV, SrcUVTexelSize, 0.f.xx);
float4 prevcol = tex2DOffset(PrevInColor, In.UV, PrevSrcUVTexelSize, 0.f.xx);

// Blur out
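// Assumption: the right-hand operand of this comparison was lost in the page markup; ComputeDofCoc(prevcol.w) is a guess
if(ComputeDofCoc(col.w) < ComputeDofCoc(prevcol.w)) col.xyz = prevcol.xyz;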
//dof_coc = max(dof_coc, max(ComputeDofCoc(col.w), ComputeDofCoc(prevcol.w)));

float useThisLevel = saturate(dof_coc - (level /* *2 */));

//return float4(useThisLevel.xxx, 1.f); // DEBUG
return float4(col.xyz, useThisLevel);
}

float4 FSQuadBlitPS(FSQuadVS_InOut In, uniform sampler2D InColor, uniform float2 InTexelSize) : COLOR
{
return tex2DOffset(InColor, In.UV, InTexelSize, 0.f.xx).xyzw;
}

// -- Technique

#define DECLAREANALYSIS(s,d) \
pass Analysis##s \
< \
string Script = \
"RenderColorTarget0=ColorBuffer"#d";" \
"Draw=Buffer;"; \
> \
{ \
ZEnable = false; \
ZWriteEnable = false; \
AlphaBlendEnable = false; \
VertexShader = compile vs_3_0 FSQuadVS(); \
PixelShader = compile ps_3_0 AnalysisPS( s, ColorBuffer##s##Sampler, 1.f/(ViewportPixelSize/s) ); \
}

#define DECLARESYNTHESIS(prevs,s,d) \
pass Synthesis##d \
< \
string Script = \
"RenderColorTarget0=ColorBuffer"#d";" \
"Draw=Buffer;"; \
> \
{ \
ZEnable = false; \
ZWriteEnable = false; \
AlphaBlendEnable = true; \
SrcBlend = SrcAlpha; \
DestBlend = InvSrcAlpha; \
ColorWriteEnable = 7; \
VertexShader = compile vs_3_0 FSQuadVS(); \
PixelShader = compile ps_3_0 SynthesisPS( d, ColorBuffer##s##Sampler, 1.f/(ViewportPixelSize/s), ColorBuffer##prevs##Sampler, 1.f/(ViewportPixelSize/prevs) ); \
}

// debug-test stuff:
#define ENABLE_EFFECT
#define ENABLE_SYNTH

technique Main
<
string Script =
#ifdef USEFAKECLEAR
"Pass=FakeClear;"
#endif
"Pass=Bake;"
#ifdef ENABLE_EFFECT
"Pass=Analysis1;"
"Pass=Analysis2;"
"Pass=Analysis3;"
"Pass=Analysis4;"
"Pass=Analysis5;"
#ifdef ENABLE_SYNTH
"Pass=Synthesis5;"
"Pass=Synthesis4;"
"Pass=Synthesis3;"
"Pass=Synthesis2;"
"Pass=Synthesis1;"
#endif
#endif
"Pass=Blit;"
;
>
{
#ifdef USEFAKECLEAR
pass FakeClear
<
string Script =
"RenderColorTarget0=ColorBuffer1;"
"RenderDepthStencilTarget=DepthStencilBuffer;"
"Draw=Buffer;";
>
{
ZEnable = true;
ZWriteEnable = true;
ZFunc = Always;

VertexShader = compile vs_3_0 FSQuadVS();
PixelShader = compile ps_3_0 FakeClearPS();
}
#endif

pass Bake
<
string Script =
"RenderColorTarget0=ColorBuffer1;"
"RenderDepthStencilTarget=DepthStencilBuffer;"
"Clear=Color;"
"Clear=Depth;"
"Draw=Geometry;";
>
{
ZEnable = true;
ZWriteEnable = true;

VertexShader = compile vs_3_0 GeomVS();
PixelShader = compile ps_3_0 SceneBakePS();
}

#ifdef ENABLE_EFFECT
DECLAREANALYSIS(1,2)
DECLAREANALYSIS(2,3)
DECLAREANALYSIS(3,4)
DECLAREANALYSIS(4,5)
DECLAREANALYSIS(5,6)
DECLARESYNTHESIS(6,6,5)
DECLARESYNTHESIS(6,5,4)
DECLARESYNTHESIS(5,4,3)
DECLARESYNTHESIS(4,3,2)
DECLARESYNTHESIS(3,2,1)
#endif

pass Blit
<
string Script =
"RenderColorTarget0=;"
"RenderDepthStencilTarget=;"
"Clear=Color;"
"Draw=Buffer;";
>
{
ZEnable = false;
ZWriteEnable = false;
VertexShader = compile vs_3_0 FSQuadVS();
PixelShader = compile ps_3_0 FSQuadBlitPS( ColorBuffer1Sampler, 1.f/(ViewportPixelSize/1) );
}
};

2 comments:

Anonymous said...

Thanks - I am all for code-sharing, but am also a lazy bugger. If you could do an overview description of your method, it would be much appreciated! :)

Thanks again!

DEADC0DE said...

Well, long story short, it's similar to the idea of using a mipmap pyramid to have your image blurred at different radii, but instead of building the pyramid only going down, it adds a phase that goes the other way as well, from the coarsest level back to the top. Those are called pyramid filters, and there's a reference to a paper in the introduction of the post that explains them pretty well. Adapting that to DOF is kind of natural, but not so easy. The author of the first paper I reference also made a DOF system out of pyramid filtering; it's more complicated than this one and works better, but it's not for realtime.