31 August, 2009

This year's buzzword: Image-Space

It shouldn't be news that the quality of the publications at Siggraph has been getting progressively worse over the last few years, while other conferences are becoming more and more important: Eurographics, but also Siggraph Asia (see for example those really neat publications).

A lot of politics, sponsors, pressure from universities, younger reviewers... Nowadays Siggraph is more an occasion to meet people and see what's going on in the industry than a showcase of the best graphics research on the planet.

I didn't see anything groundbreaking, and a lot of publications were addressing problems that are not, in my view at least, so crucial. Still, I don't think this year's Siggraph was bad, and you'll find plenty of coverage of the event online, so I won't write about that.

I have the impression that non-realtime rendering, and GI in particular, has seen a slowdown recently, but it may also be that my interest has shifted away from those subjects, so I don't have a good picture.

To me what's more interesting, at least now, is realtime graphics, and I'm probably more sensitive to publications in that field. At the main conference, unsurprisingly, the most exciting realtime 3D presentation was done by Crytek (see this), but I was really looking forward to the papers from HPG, one of Siggraph's side conferences.

Generally, there is some pretty good stuff there, like the Morphological Antialiasing paper, and many others... But you have to filter out the buzz, and I was really bothered by some papers that, in my opinion, simply should not have been there. I don't really know why they bother me; probably it's because in the past I've seen ideas published that I didn't bother to publish myself, thinking they would have been rejected anyway, or maybe it's just that I have too many friends in the research community with good ideas and little luck.

Hardware-Accelerated Global Illumination by Image Space Photon Mapping. Wow! Let's read...
And what's that? Well, if you've followed any GPU GI research in the last 1-2 years, it's really easy. They're using an RSM for the lights' first hit, they read it back on the CPU and use that data for normal photon tracing (claiming that the first hit is the slowest part, so that's the only part they care about doing on the GPU), then they splat the photons using... photon splatting.

It's mostly a tech demo; it would be cool if they had published it as such, and maybe with better assets it could be a worthy addition to the other demos NVidia has. Maybe they could have published this applied research in NVidia's GPU Gems. But Siggraph?

Why "image space" anyway? And the worst part, why they don't say "RSM" or "splatting"? They cite those works as "related research", and that's it. They don't use any of those terms, they replaced everything with something else that makes it sound better and new... Photon splatting sounds slow, let's use "Image Space", is way more cool (doesn't matter if there's nothing happening in image space there). RSM are well-known... let's call them... bounce maps (genius)!

Image Space Gathering. Even worse! And it came right after the previous one at the HPG conference! It's something really minor: the only application seems to be blurry reflections, and from the images it doesn't look so nice for that either.

The algorithm? Render your image, and then blur it. But hey, preserve the edges using the Z-buffer, and make your kernel proportional to that too. Wow! Don't blow my mind with such advanced shit man!

They say "image space" and "gathering" and in the abstract,they also use "cross bilateral filtering". The idea is simple and little more than a not-so-neat trick, a curiosity with limited applications. But there's the buzz!

I think it would be easy to write a buzz-meter: check the frequency of some keywords in the abstracts and build a filtering system that intelligently filters out all the noise...

25 August, 2009

Experiment: DOF with Pyramidal Filters

I'm back from a short trip to Montreal. A lot of people have exhaustively blogged about Siggraph and such things, so I won't.

Instead, as promised, here is the source code from my DOF experiment. It works in FX Composer 2.5; I couldn't use 1.x this time because it crashed on my MacBook... 2.5 has other bugs I had to work around, but I managed to solve those (see the code).

As with some other snippets I've published before, these are small tests I did at home; I hope they can be inspiring for someone, maybe even in other domains, more than something I'll use right now in a game (the catch is, if they were, I couldn't publish them anyway ;)

A little background:

Pyramid filters are a very useful technique for implementing a wide class of continuous convolution kernels. They can be used for a wide range of applications, from image upsampling to inpainting (see: Strengert, 2007. Pyramid Methods in GPU-Based Image Processing. Conference on Computer Graphics Theory and Applications (GRAPP'07), volume GM-R, pp. 21-28).

As with separable filters, they can be computed in linear time; but unlike separable filters, they make it easy to compute the convolution for several different kernel sizes simultaneously.

A pyramid filter works by applying a convolution with a small kernel on the source image and downsampling the result into a smaller texture. This step is called analysis, and it's repeated multiple times; each new analysis level lets us compute our kernel over a wider area. After a given number of analysis steps, we perform the same number of synthesis steps, where we start from the smallest level of our image pyramid and go back up by convolving and upsampling the image.

This pyramid lets us vary the size of our filtering kernel in the synthesis step. As the kernel size depends on the depth of the pyramid, deciding on which level a given pixel starts its synthesis process affects the size of the filter applied to that region of the image.
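
To make that concrete: in the code below the choice is soft rather than a hard per-pixel starting level. Each synthesis pass blends the coarser data over the finer level with a weight driven by the pixel's circle of confusion:

// Per-pixel blend factor used in SynthesisPS below ('level' is the destination level):
// a CoC above the level index lets the blurred, coarser data win; below it, the
// finer level's data shows through.
float useThisLevel = saturate(dof_coc - level);

So a pixel with a CoC of, say, 3.4 ignores levels 5 and above, blends 40% of level 4 over level 3, and from there the result propagates untouched down to full resolution.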

There are a few problems to be solved if you want to implement DOF with this.

The first one is that you need to mask some samples during your blurring pass, and it's not so obvious how to do that, as the filtering is now a two-pass process. Ideally you'd want to mask different things during the first and the second pass, but that's not really possible.
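
The compromise I settled on is a per-tap weight based on depth similarity during the analysis passes (it's the ComputeAntiZSpillWeight function in the code below):

// Taps whose stored 1/z differs from the centre sample's by more than the
// tolerance contribute nothing; closer taps fade in linearly.
float weight = max(ZSpillingTolerance - abs(linZ - linZcenter), 0) / ZSpillingTolerance;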

The second problem is how to deal with foreground blur, that is, bleeding the blur outside the borders of blurred foreground objects (see Starcraft II Effects & Techniques. Advances in Real-Time Rendering in 3D Graphics and Games Course - SIGGRAPH 2008).

I found a solution that does not look too bad; it's still improvable (a lot) and tweakable, but I didn't work further on it because it's still unable to achieve a good bokeh, and I think that's really something we have to improve in our DOF effects now. It's probably better suited for other effects where you don't have so many discontinuities in your blur radius; smoke and fog scattering, for example, could work very well.

float Script : STANDARDSGLOBAL <
string UIWidget = "none"; // suppress UI for this variable
string ScriptClass = "scene"; // this fx will render then scene AND the postprocessing
string ScriptOrder = "standard";
string ScriptOutput = "color";
string Script = "Technique=Main;";
> = 0.8; // FX Composer supports SAS 0.86 (the DirectX10 version does not support scripting)

#define COLORFORMAT "A16B16G16R16F"
//#define COLORFORMAT "A8B8G8R8"

// FX Composer 2.5, on Vista, on my MacBook, does ignore "Clear=Color;" and similar... So I have to FAKE IT!
#define USEFAKECLEAR

// -- Untweakables, without UI

float4x4 WorldViewProj : WorldViewProjection < string UIWidget="None"; >;

float2 ViewportPixelSize : VIEWPORTPIXELSIZE
<
string UIName="Screen Size";
string UIWidget="None";
>;

// -- Tweakables

float ZSpillingTolerance
<
string UIWidget = "slider";
float UIMin = 0;
float UIMax = 2;
float UIStep = 0.01;
string UIName = "Blending tollerance for background to foreground";
> = 0.1;

float DOF_ParamA
<
string UIWidget = "slider";
float UIMin = 0;
float UIMax = 20;
float UIStep = 0.01;
string UIName = "Depth of field";
> = 1;

float DOF_ParamB
<
string UIWidget = "slider";
float UIMin = -10;
float UIMax = 10;
float UIStep = 0.1;
string UIName = "Depth distance";
> = -5;

// -- Buffers and samplers

texture DepthStencilBuffer : RENDERDEPTHSTENCILTARGET
<
float2 ViewportRatio = {1,1};
string Format = "D24X8";
string UIWidget = "None";
>;

// No version of FX Composer lets me bind a mipmap surface as a rendertarget; that's why I need all this:
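// Each "pyramid" level is its own rendertarget scaled 1/n of the viewport
// (1, 1/2, 1/3, ...), with a matching clamped bilinear sampler.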
#define DECLAREBUFFER(n) \
texture ColorBuffer##n : RENDERCOLORTARGET \
< \
float2 ViewportRatio = {1./n,1./n}; \
string Format = COLORFORMAT; \
string UIWidget = "None"; \
int MipLevels = 1; \
>; \
sampler2D ColorBuffer##n##Sampler = sampler_state \
{ \
texture = <ColorBuffer##n>; \
MagFilter = Linear; \
MinFilter = Linear; \
AddressU = Clamp; \
AddressV = Clamp; \
}; \

DECLAREBUFFER(1)
DECLAREBUFFER(2)
DECLAREBUFFER(3)
DECLAREBUFFER(4)
DECLAREBUFFER(5)
DECLAREBUFFER(6)

// -- Data structures

struct GeomVS_In
{
float4 Pos : POSITION;
float2 UV : TEXCOORD0;
};

struct GeomVS_Out
{
float4 Pos : POSITION;
float4 PosCopy : TEXCOORD1;
float2 UV : TEXCOORD0;
};

struct FSQuadVS_InOut // fullscreen quad
{
float4 Pos : POSITION;
float2 UV : TEXCOORD0;
};

#ifdef USEFAKECLEAR
struct FakeClear_Out
{
float4 Buffer0 : COLOR0;
float Depth : DEPTH;
};
#endif

// -- Vertex shaders

GeomVS_Out GeomVS(GeomVS_In In)
{
GeomVS_Out Out;

Out.Pos = mul( In.Pos, WorldViewProj );
Out.PosCopy = Out.Pos;
Out.UV = In.UV;

return Out;
}

FSQuadVS_InOut FSQuadVS(FSQuadVS_InOut In)
{
return In;
}

float4 tex2DOffset(sampler2D tex, float2 UV, float2 texTexelSize, float2 pixelOffsets)
{
// DirectX requires a half pixel shift to fetch the center of a texel! (fx composer 2.5)
return tex2D(tex, UV + (texTexelSize * (0.5f + pixelOffsets)));
}

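// Scene pass: writes a procedural test colour in RGB and 1/z of the transformed
// position in alpha; the DOF passes use that alpha as their depth value.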
float4 SceneBakePS(GeomVS_Out In) : COLOR
{
float linZ = 1/In.PosCopy.z;

float3 color = frac(In.UV.xyx * 3); // just a test color...

return saturate(float4(color, linZ));
}

#ifdef USEFAKECLEAR
FakeClear_Out FakeClearPS(FSQuadVS_InOut In)
{
FakeClear_Out Out = (FakeClear_Out)0;
Out.Depth = 1.f;

return Out;
}
#endif

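// Circle of confusion (measured in pyramid levels) from the stored reciprocal depth:
// |DOF_ParamA * (z + DOF_ParamB)|, which is zero at the focal distance -DOF_ParamB.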
float ComputeDofCoc(float linZ)
{
return abs(DOF_ParamA * ((1/linZ) + DOF_ParamB));
}

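// Depth-tolerance weight used to mask samples during the analysis blur: taps whose
// stored 1/z differs from the centre's by more than ZSpillingTolerance get weight 0.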
float ComputeAntiZSpillWeight(float linZcenter, float linZ)
{
return max(ZSpillingTolerance - abs(linZ - linZcenter),0) / ZSpillingTolerance;
}

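// Analysis (reduction) pass: a 5-tap kernel (centre plus four diagonals at +/-AnalysisRadius)
// weighted by the anti-spill term, normalized and written into the next, smaller level.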
float4 AnalysisPS(
FSQuadVS_InOut In,
uniform float level,
uniform sampler2D InColor,
uniform float2 SrcUVTexelSize
) : COLOR
{
const float AnalysisRadius = 1;

// Reduction step / gathering
float4 col = tex2DOffset( InColor, In.UV, SrcUVTexelSize, 0.f.xx );
float centerLinZ = col.w;
float weight = 1;

float4 res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(-1,-1) * AnalysisRadius);
float resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(1,-1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(-1,1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

res = tex2DOffset(InColor, In.UV, SrcUVTexelSize, float2(1,1) * AnalysisRadius);
resweight = ComputeAntiZSpillWeight(centerLinZ, res.w);
weight += resweight;
col += res * resweight;

col /= weight; // normalize
//col.w = centerLinZ; // leave Z as untouched as possible

return col;
}

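// Synthesis (expansion) pass: brings the coarser results back up towards full
// resolution; the output alpha (useThisLevel) drives the alpha-blend over the finer
// level, according to the per-pixel circle of confusion.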
float4 SynthesisPS(
FSQuadVS_InOut In,
uniform float level,
uniform sampler2D InColor,
uniform float2 SrcUVTexelSize,
uniform sampler2D PrevInColor,
uniform float2 PrevSrcUVTexelSize
) : COLOR
{
// Always sampling the first level to obtain linZ
float linZ = tex2DOffset(ColorBuffer1Sampler, In.UV, 1.f/ViewportPixelSize, 0.f.xx).w;
float dof_coc = ComputeDofCoc(linZ);

// Expansion step / scattering
float4 col = tex2DOffset(InColor, In.UV, SrcUVTexelSize, 0.f.xx);
float4 prevcol = tex2DOffset(PrevInColor, In.UV, PrevSrcUVTexelSize, 0.f.xx);

// Blur out
if(ComputeDofCoc(col.w) < ComputeDofCoc(prevcol.w)) col.xyz = prevcol.xyz; // if the coarser sample is blurrier, bleed its colour outwards
//dof_coc = max(dof_coc, max(ComputeDofCoc(col.w), ComputeDofCoc(prevcol.w)));

float useThisLevel = saturate(dof_coc - (level /* *2 */));

//return float4(useThisLevel.xxx, 1.f); // DEBUG
return float4(col.xyz, useThisLevel);
}

float4 FSQuadBlitPS(FSQuadVS_InOut In, uniform sampler2D InColor, uniform float2 InTexelSize) : COLOR
{
return tex2DOffset(InColor, In.UV, InTexelSize, 0.f.xx).xyzw;
}

// -- Technique

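// DECLAREANALYSIS(s,d): pass Analysis##s reads ColorBuffer##s and writes the
// reduced result into the smaller ColorBuffer##d (no blending).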
#define DECLAREANALYSIS(s,d) \
pass Analysis##s \
< \
string Script = \
"RenderColorTarget0=ColorBuffer"#d";" \
"Draw=Buffer;"; \
> \
{ \
ZEnable = false; \
ZWriteEnable = false; \
AlphaBlendEnable = false; \
VertexShader = compile vs_3_0 FSQuadVS(); \
PixelShader = compile ps_3_0 AnalysisPS( s, ColorBuffer##s##Sampler, 1.f/(ViewportPixelSize/s) ); \
}

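// DECLARESYNTHESIS(prevs,s,d): pass Synthesis##d reads ColorBuffer##s (and the previously
// synthesized ColorBuffer##prevs) and alpha-blends the result over ColorBuffer##d.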
#define DECLARESYNTHESIS(prevs,s,d) \
pass Synthesis##d \
< \
string Script = \
"RenderColorTarget0=ColorBuffer"#d";" \
"Draw=Buffer;"; \
> \
{ \
ZEnable = false; \
ZWriteEnable = false; \
AlphaBlendEnable = true; \
SrcBlend = SrcAlpha; \
DestBlend = InvSrcAlpha; \
ColorWriteEnable = 7; \
VertexShader = compile vs_3_0 FSQuadVS(); \
PixelShader = compile ps_3_0 SynthesisPS( d, ColorBuffer##s##Sampler, 1.f/(ViewportPixelSize/s), ColorBuffer##prevs##Sampler, 1.f/(ViewportPixelSize/prevs) ); \
}

// debug-test stuff:
#define ENABLE_EFFECT
#define ENABLE_SYNTH

technique Main
<
string Script =
#ifdef USEFAKECLEAR
"Pass=FakeClear;"
#endif
"Pass=Bake;"
#ifdef ENABLE_EFFECT
"Pass=Analysis1;"
"Pass=Analysis2;"
"Pass=Analysis3;"
"Pass=Analysis4;"
"Pass=Analysis5;"
#ifdef ENABLE_SYNTH
"Pass=Synthesis5;"
"Pass=Synthesis4;"
"Pass=Synthesis3;"
"Pass=Synthesis2;"
"Pass=Synthesis1;"
#endif
#endif
"Pass=Blit;"
;
>
{
#ifdef USEFAKECLEAR
pass FakeClear
<
string Script =
"RenderColorTarget0=ColorBuffer1;"
"RenderDepthStencilTarget=DepthStencilBuffer;"
"Draw=Buffer;";
>
{
ZEnable = true;
ZWriteEnable = true;
ZFunc = Always;

VertexShader = compile vs_3_0 FSQuadVS();
PixelShader = compile ps_3_0 FakeClearPS();
}
#endif

pass Bake
<
string Script =
"RenderColorTarget0=ColorBuffer1;"
"RenderDepthStencilTarget=DepthStencilBuffer;"
"Clear=Color;"
"Clear=Depth;"
"Draw=Geometry;";
>
{
ZEnable = true;
ZWriteEnable = true;

VertexShader = compile vs_3_0 GeomVS();
PixelShader = compile ps_3_0 SceneBakePS();
}

#ifdef ENABLE_EFFECT
DECLAREANALYSIS(1,2)
DECLAREANALYSIS(2,3)
DECLAREANALYSIS(3,4)
DECLAREANALYSIS(4,5)
DECLAREANALYSIS(5,6)
DECLARESYNTHESIS(6,6,5)
DECLARESYNTHESIS(6,5,4)
DECLARESYNTHESIS(5,4,3)
DECLARESYNTHESIS(4,3,2)
DECLARESYNTHESIS(3,2,1)
#endif

pass Blit
<
string Script =
"RenderColorTarget0=;"
"RenderDepthStencilTarget=;"
"Clear=Color;"
"Draw=Buffer;";
>
{
ZEnable = false;
ZWriteEnable = false;
VertexShader = compile vs_3_0 FSQuadVS();
PixelShader = compile ps_3_0 FSQuadBlitPS( ColorBuffer1Sampler, 1.f/(ViewportPixelSize/1) );
}
};