So, first a little "announcement": I'm crafting a small DX11 rendering framework in my spare time. I want to open-source it, and it's based on MJP's excellent SampleFramework11.
The goal is to provide an environment roughly as fast to iterate on as FXComposer was (I consider it dead now...), but for programmers, without being a "shader editor".
If you're interested in collaborating, send me an email at c0de517e (it's a gmail account) with a brief introduction; there is an interesting list of things to do.
That said, this is a little bit of functionality Maurizio Cerrato and I have been working on for a couple of days: a "printf"-like function for pixel (and compute) shaders. It all started while chatting with Daniel Sewell (a brilliant guy, he was my rendering lead on Fight Night): he pointed out that, when working on compute shaders, he found that a neat way to debug them was to display all kinds of interesting debug visualizations by having geometry shaders "decode" buffers and emit lines.
if(IsDebuggedPixel(input.PositionSS.xy)) DebugDrawFloat(float2(ssao, bloom.x), clipPos);
You could emit such data for every PS invocation and later sift through it and display what you need in a meaningful way, but that would be quite slow (and at that point you might want to consider just packing everything into some MRT outputs). The idea behind using append buffers is to do the work only for a handful of invocations (e.g. chosen screen positions: if the current SV_Position equals the pixel to "debug", then GPU printf...).
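To make the selection test concrete, here is a minimal CPU-side sketch in C++ of what the shader library's IsDebuggedPixel and SVPosToClipspace (listed further down) do in the single-pixel mode. It assumes SV_Position arrives at pixel centers (integer coordinate plus 0.5) and no MSAA; the `Float2` type and the `...CPU` function names are made up for this illustration and are not part of the framework.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D vector type, only for this sketch.
struct Float2 { float x, y; };

// Mirrors the debugType == 1 branch of IsDebuggedPixel: svPos is the pixel
// center (integer coordinate + 0.5), debugPixel the integer pixel to inspect.
bool IsDebuggedPixelCPU(Float2 debugPixel, Float2 svPos)
{
    const float dx = std::fabs(debugPixel.x - svPos.x + 0.5f);
    const float dy = std::fabs(debugPixel.y - svPos.y + 0.5f);
    return dx + dy <= 0.01f;
}

// Mirrors SVPosToClipspace: maps pixel coordinates to [-1,1] clip space,
// with y flipped so that the top of the screen is +1.
Float2 SVPosToClipspaceCPU(Float2 svPos, Float2 oneOverDisplaySize)
{
    return { svPos.x * oneOverDisplaySize.x *  2.0f - 1.0f,
             svPos.y * oneOverDisplaySize.y * -2.0f + 1.0f };
}
```

Only the invocation whose pixel center matches passes the test, so the append (and everything needed to compute the printed values) happens for a single pixel per frame rather than for the whole screen.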
To keep everything snappy we also minimize the size of the structure we store in the append buffer. You can't really printf strings: so far the debugger supports only one to three floats with a color and a position, or lines. Lines are where we started, really: our struct contains two end-points, a color (index) and a flag which distinguishes lines from float printfs. Floats just reinterpret one of the endpoints as the data to print.
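For reference, a CPU-side mirror of that structure could look like the C++ sketch below. The field layout follows the HLSL ShaderDebugLine in the shader library further down; the numeric values of the SHADER_DEBUG_* flags are NOT shown in this post, so the ones here are purely hypothetical, and MakeDebugFloat3 and IsLine are illustrative helpers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values: the post does not show the real definitions.
// Low bits select the primitive kind, higher bits the coordinate space.
enum : uint32_t {
    SHADER_DEBUG_PRIM_LINE          = 0u,
    SHADER_DEBUG_PRIM_FLOAT1        = 1u,
    SHADER_DEBUG_PRIM_FLOAT2        = 2u,
    SHADER_DEBUG_PRIM_FLOAT3        = 3u,
    SHADER_DEBUG_PRIM_MASKBITS      = 3u,
    SHADER_DEBUG_FLAG_2D            = 1u << 2,
    SHADER_DEBUG_FLAG_3D_VIEWSPACE  = 1u << 3,
    SHADER_DEBUG_FLAG_3D_WORLDSPACE = 1u << 4,
};

// CPU mirror of the HLSL ShaderDebugLine: 3 + 3 floats and 2 uints.
struct ShaderDebugLine {
    float    posStart[3];
    float    posEnd[3];   // for float printfs this holds the values to print
    uint32_t color;       // index into a color table
    uint32_t flag;        // primitive kind | coordinate space
};
static_assert(sizeof(ShaderDebugLine) == 32, "expected a 32-byte element stride");

// Packs a three-float "printf" the same way DebugDrawFloat does.
ShaderDebugLine MakeDebugFloat3(const float pos[3], const float value[3], uint32_t color)
{
    ShaderDebugLine l{};
    for (int i = 0; i < 3; ++i) { l.posStart[i] = pos[i]; l.posEnd[i] = value[i]; }
    l.color = color;
    l.flag  = SHADER_DEBUG_PRIM_FLOAT3 | SHADER_DEBUG_FLAG_2D;
    return l;
}

// The geometry shader uses the same mask test to tell lines from numbers.
bool IsLine(const ShaderDebugLine& l)
{
    return (l.flag & SHADER_DEBUG_PRIM_MASKBITS) == SHADER_DEBUG_PRIM_LINE;
}
```

At 32 bytes per element, the struct is small enough that the append buffer stays cheap even with a few thousand entries.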
The append buffer is then fed to a VS/GS that is invoked twice for each element in the buffer (via draw indirect; you need a small CS to multiply the count by two). Remember, you can't emit the start/end vertices as two separate append calls, because the order of appends is not deterministic and the vertices would end up all mixed in the buffer! The GS emits extra lines when we're printing floats, to display a small line-based font.
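The bookkeeping is easier to see outside the GPU. The C++ sketch below reproduces, on the CPU, what the one-thread compute shader writes into the indirect-arguments buffer and how the vertex shader turns SV_VertexID back into a buffer element and an endpoint. The struct and function names are invented for this illustration; the four-uint argument layout is the one DrawInstancedIndirect reads.

```cpp
#include <cassert>
#include <cstdint>

// Argument layout consumed by DrawInstancedIndirect in D3D11.
struct DrawInstancedIndirectArgs {
    uint32_t vertexCountPerInstance;
    uint32_t instanceCount;
    uint32_t startVertexLocation;
    uint32_t startInstanceLocation;
};

// What the compute shader does: CopyStructureCount wrote the number of
// appended elements, and each element needs two line vertices.
DrawInstancedIndirectArgs MakeDebugDrawArgs(uint32_t appendedElements)
{
    return { appendedElements * 2u, 1u, 0u, 0u };
}

struct VertexSource { uint32_t elementIndex; bool isEndPoint; };

// What the vertex shader does with SV_VertexID: even ids read posStart,
// odd ids read posEnd, both from the same buffer element.
VertexSource DecodeVertexID(uint32_t vertexID)
{
    return { vertexID / 2u, (vertexID & 1u) != 0u };
}
```

Because both endpoints live in one element, the two vertices of a line always come from the same append call, whatever order the appends landed in.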
If you're thinking that is lame, well, it is. There are limits on the number of primitives the GS can emit that effectively cap the number of digits you can display, and you have to be careful about that. I "optimized" the code to display as many digits as possible, which unfortunately gives you a very low-precision 3-float printf and higher-precision 2-float and 1-float versions (you could, though, call the 1-float version three times... as there the ordering of the three calls doesn't matter).
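The arithmetic behind that limit can be checked with a few lines of C++. This is a sketch under two assumptions: the per-invocation GS output limit in D3D11 is 1024 32-bit values, and each emitted vertex is the vsOut struct from the listing below (a float4 position plus a float3 color, so 7 scalars). The offsets table is copied from the HLSL font; the `k...` constants and function names are illustrative.

```cpp
#include <cassert>

// Same offsets table as the HLSL font: glyph g uses vertices
// [offsets[g], offsets[g+1]), two vertices per line segment.
constexpr int kDigitFontOffsets[] = { 0, 8, 10, 20, 30, 38, 48, 58, 62, 72, 82, 84, 86 };
constexpr int kGlyphCount = sizeof(kDigitFontOffsets) / sizeof(kDigitFontOffsets[0]) - 1;

// vsOut is a float4 position plus a float3 color: 7 scalars per vertex.
constexpr int kScalarsPerVertex = 4 + 3;
// Assumed D3D11 limit on 32-bit values one GS invocation can output.
constexpr int kMaxGsOutputScalars = 1024;

// Largest number of line segments any glyph in the font needs.
int MaxLinesPerGlyph()
{
    int best = 0;
    for (int g = 0; g < kGlyphCount; ++g) {
        const int lines = (kDigitFontOffsets[g + 1] - kDigitFontOffsets[g]) / 2;
        if (lines > best) best = lines;
    }
    return best;
}

// Worst-case vertices for `floats` values of `digitsPerFloat` digits:
// each digit costs up to 2 * maxLines vertices, each float adds 4 vertices
// for '.' and '-', and the marker cross adds 4 more.
int GsVertexBudget(int floats, int digitsPerFloat, int maxLines)
{
    return floats * (digitsPerFloat * 2 * maxLines + 4) + 4;
}
```

With 5 lines per glyph, three floats of 4 digits need 136 vertices, i.e. 952 scalars, which fits; five digits per float would need 166 vertices, i.e. 1162 scalars, which does not. The 1-float (12 digits, 128 vertices) and 2-float (6 digits each, 132 vertices) variants both stay within the same 136-vertex declaration.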
Keeping the same number of printed digits, the point has to float...
Anyhow, together with shader hot-reloading (which everybody has, right?), this is quite a handy trick. Bonus: on a similar note, have a look at this shadertoy snippet by my coworker Paul Malin... brilliant guy!
Some code, without doubt full of bugs:
Snippet from the CPU/C++ side, drawing the debug lines...
void ShaderDebugDraw(ID3D11DeviceContext* context, const Float4x4& viewProjectionMatrix, const Float4x4& projMatrix)
{
    SampleFramework11::PIXEvent marker(L"ShaderDebug Draw");

    context->CopyStructureCount(AppendBufferCountCopy, 0, AppendBuffer.UAView);

    // We need a compute shader to write BufferCountUAV, as we need to multiply CopyStructureCount by two
    ID3D11ShaderResourceView* srViews[] = { AppendBuffer.SRView };
    ID3D11UnorderedAccessView* uaViews[] = { AppendBufferCountCopyUAV };
    UINT uavsCount[] = { 0 };
    context->CSSetUnorderedAccessViews(1, 1, uaViews, uavsCount);
    context->CSSetShader(DebugDrawShader.AcquireCS(), NULL, 0);
    context->Dispatch(1,1,1);
    context->CSSetShader(NULL, NULL, 0);
    uaViews[0] = NULL;
    context->CSSetUnorderedAccessViews(1, 1, uaViews, uavsCount);

    // Set all IA stage inputs to NULL, since we're not using it at all.
    void* nulls[D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT] = { NULL };
    context->IASetVertexBuffers(0, D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT, (ID3D11Buffer**)nulls, (UINT*)nulls, (UINT*)nulls);
    context->IASetInputLayout(NULL);
    context->IASetIndexBuffer(NULL, DXGI_FORMAT_UNKNOWN, 0);

    // Draw debug lines
    srViews[0] = AppendBuffer.SRView;
    context->VSSetShaderResources(0, 1, srViews);
    context->GSSetShaderResources(0, 1, srViews);
    context->GSSetShader(DebugDrawShader.AcquireGS(), NULL, 0);
    context->VSSetShader(DebugDrawShader.AcquireVS(), NULL, 0);
    context->PSSetShader(DebugDrawShader.AcquirePS(), NULL, 0);

    shaderDebugDrawDataVS.Data.ViewProjection = viewProjectionMatrix;
    shaderDebugDrawDataVS.Data.Projection = projMatrix;
    shaderDebugDrawDataVS.ApplyChanges(context);
    shaderDebugDrawDataVS.SetVS(context, 0);

    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);
    context->DrawInstancedIndirect(AppendBufferCountCopy, 0);

    [...]

This is roughly how the shader library looks for emitting debug lines/debug numbers from pixel shaders:
struct ShaderDebugLine
{
    float3 posStart;
    float3 posEnd;
    uint color;
    uint flag;
};

cbuffer ShaderDebugData : register(b13)
{
    float2 debugPixelCoords;
    float2 oneOverDisplaySize;
    int debugType;
};

void DebugDrawFloat(float3 number, float3 pos, int color = 0, uint spaceFlag = SHADER_DEBUG_FLAG_2D)
{
    ShaderDebugLine l;
    l.posStart = pos;
    l.color = color;
    l.posEnd = number;
    l.flag = SHADER_DEBUG_PRIM_FLOAT3|spaceFlag;
    ShaderDebugAppendBuffer.Append(l);
}

float2 SVPosToClipspace(float2 svPos, float2 oneOverDisplaySize)
{
    return (svPos * oneOverDisplaySize) * float2(2,-2) + float2(-1,1);
}

bool IsDebuggedPixel(float2 svPos)
{
    // This is a bit tricky because it depends on the MSAA pattern
    if(debugType == 1)
        return dot(abs(debugPixelCoords - svPos + float2(0.5,0.5)), 1.0.xx) <= 0.01f;
    else if(debugType == 2)
        return dot(abs(svPos % float2(100,100)), 1.0.xx) <= 1.01f;
    else
        return false;
}

And finally, the VS/GS/CS shaders needed to draw the debug buffer emitted from the various PS executions:
static const int DigitFontOffsets[] = { 0, 8, 10, 20, 30, 38, 48, 58, 62, 72, 82, 84, 86 };
static const float DigitFontScaling = 0.03;
static const float DigitFontWidth = 0.7 * DigitFontScaling; // The font width is 0.5, but we add spacing
static const int DigitFontMaxLinesPerDigit = 5;

static const float2 DigitFont[] =
{
    /* 0 */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.5f, 0.f), float2(0.5f, -1.f),
    float2(0.5f, -1.f), float2(0.f, -1.f),
    float2(0.f, -1.f), float2(0.f, 0.f),
    /* 1 */
    float2(0.5f, 0.f), float2(0.5f, -1.f),
    /* 2 */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.5f, 0.f), float2(0.5f, -0.5f),
    float2(0.5f, -0.5f), float2(0.f, -0.5f),
    float2(0.f, -0.5f), float2(0.f, -1.f),
    float2(0.f, -1.f), float2(0.5f, -1.f),
    /* 3 */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.5f, 0.f), float2(0.5f, -0.5f),
    float2(0.5f, -0.5f), float2(0.f, -0.5f),
    float2(0.5f, -0.5f), float2(0.5f, -1.f),
    float2(0.5f, -1.f), float2(0.f, -1.f),
    /* 4 */
    float2(0.f, 0.f), float2(0.f, -0.5f),
    float2(0.f, -0.5f), float2(0.5f, -0.5f),
    float2(0.5f, -0.5f), float2(0.5f, 0.f),
    float2(0.5f, -0.5f), float2(0.5f, -1.f),
    /* 5 */
    float2(0.f, 0.f), float2(0.f, -0.5f),
    float2(0.f, -0.5f), float2(0.5f, -0.5f),
    float2(0.5f, -0.5f), float2(0.5f, -1.f),
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.f, -1.f), float2(0.5f, -1.f),
    /* 6 */
    float2(0.f, 0.f), float2(0.f, -1.f),
    float2(0.f, -0.5f), float2(0.5f, -0.5f),
    float2(0.5f, -0.5f), float2(0.5f, -1.f),
    /* avoidable */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.f, -1.f), float2(0.5f, -1.f),
    /* 7 */
    float2(0.5f, 0.f), float2(0.5f, -1.f),
    float2(0.5f, 0.f), float2(0.f, 0.f),
    /* 8 */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.5f, 0.f), float2(0.5f, -1.f),
    float2(0.5f, -1.f), float2(0.f, -1.f),
    float2(0.f, -1.f), float2(0.f, 0.f),
    float2(0.f, -0.5f), float2(0.5f, -0.5f),
    /* 9 */
    float2(0.f, 0.f), float2(0.5f, 0.f),
    float2(0.5f, 0.f), float2(0.5f, -1.f),
    float2(0.5f, -0.5f), float2(0.f, -0.5f),
    float2(0.f, -0.5f), float2(0.f, 0.f),
    float2(0.5f, -1.f), float2(0.f, -1.f),
    /* - */
    float2(0.5f, -0.5f), float2(0.f, -0.5f),
    /* . */
    float2(0.8f, -0.9f), float2(0.9f, -1.f),
};

cbuffer ShaderDebugDrawData : register(b0)
{
    float4x4 Projection;
    float4x4 ViewProjection;
};

struct vsOut
{
    float4 Pos : SV_Position;
    float3 Color : TexCoord0;
};

// Bound as a shader resource in slot 0 for the VS and GS by ShaderDebugDraw
StructuredBuffer<ShaderDebugLine> ShaderDebugStructuredBuffer : register(u0);
RWBuffer<uint> StructureCount : register(u1);

void DebugDrawDigit(int digit, float4 pos, inout LineStream<vsOut> GS_Out, float3 color)
{
    // Each pair of entries in DigitFont is one line segment of the glyph
    for (int i = DigitFontOffsets[digit]; i < DigitFontOffsets[digit+1] - 1; i+=2)
    {
        vsOut p;
        p.Color = color;
        p.Pos = pos + float4(DigitFont[i] * DigitFontScaling, 0, 0);
        GS_Out.Append(p);
        p.Pos = pos + float4(DigitFont[i +1] * DigitFontScaling, 0, 0);
        GS_Out.Append(p);
        GS_Out.RestartStrip();
    }
}

// Draws numdigit digits, least significant first, moving left; returns the final position
float4 DebugDrawIntGS(int numberAbs, uint numdigit, float4 pos, inout LineStream<vsOut> GS_Out, float3 color)
{
    while(numdigit > 0)
    {
        DebugDrawDigit(numberAbs % 10u , pos, GS_Out, color);
        numberAbs /= 10u;
        --numdigit;
        pos.x -= DigitFontWidth;
    }
    return pos;
}

void DebugDrawFloatHelperGS(float number, float4 pos, inout LineStream<vsOut> GS_Out, float3 color, int totalDigits)
{
    float numberAbs = abs(number);
    uint intPart = (int)numberAbs;
    uint intDigits = 0;
    if(intPart > 0)
        intDigits = (uint) log10 ((float) intPart) + 1;
    uint fractDigits = max(0, totalDigits - intDigits);

    // Get the fractional part
    uint fractPart = round(frac(numberAbs) * pow(10, (fractDigits-1)));

    // Draw the fractional part
    pos = DebugDrawIntGS(fractPart, fractDigits, pos, GS_Out, color * 0.5 /* make fractional part darker */);

    // Draw the .
    pos.x -= DigitFontWidth * 0.5;
    DebugDrawDigit(11, pos, GS_Out, color);
    pos.x += DigitFontWidth * 0.25;

    // Draw the int part
    if (numberAbs > 0)
    {
        pos = DebugDrawIntGS(intPart, intDigits, pos, GS_Out, color);
        if (number < 0)
            DebugDrawDigit(10 /* draw a minus sign */, pos, GS_Out, color);
    }
}

vsOut VS(uint VertexID : SV_VertexID)
{
    uint index = VertexID/2;
    uint col = ShaderDebugStructuredBuffer[index].color;
    uint flags = ShaderDebugStructuredBuffer[index].flag;

    float3 pos;
    if((VertexID & 1)==0) // we're processing the start of the line
        pos = ShaderDebugStructuredBuffer[index].posStart;
    else // we're processing the end of the line
        pos = ShaderDebugStructuredBuffer[index].posEnd;

    vsOut output = (vsOut)0;
    output.Color = ShaderDebugColors[col];

    if(flags & SHADER_DEBUG_FLAG_2D)
        output.Pos = float4(pos.xy,0,1);
    else if (flags & SHADER_DEBUG_FLAG_3D_VIEWSPACE)
        output.Pos = mul( float4(pos.xyz,1.0) , Projection);
    else // we just assume SHADER_DEBUG_FLAG_3D_WORLDSPACE otherwise
        output.Pos = mul( float4(pos.xyz,1.0) , ViewProjection);

    return output;
}

// Turns the copied structure count into DrawInstancedIndirect arguments
[numthreads(1,1,1)]
void CS(uint3 id : SV_DispatchThreadID)
{
    StructureCount[0] *= 2; // two vertices per buffer element
    StructureCount[1] = 1;
    StructureCount[2] = 0;
    StructureCount[3] = 0;
}

float4 PS(vsOut input) : SV_Target0
{
    return float4(input.Color, 1.0f);
}

// Worst case we print 3 floats... 4 digits per float, plus we need 4 vertices for the . and -, and another 4 for the cross
[maxvertexcount(3 * (4*(2*DigitFontMaxLinesPerDigit)+4) + 4)]
void GS(line vsOut gin[2], inout LineStream<vsOut> GS_Out, uint PrimitiveID : SV_PrimitiveID)
{
    // We'll get two vertices, one primitive, out of the VS for each element in ShaderDebugStructuredBuffer...
    // TODO: we could avoid reading ShaderDebugStructuredBuffer if we passed the number flag along from the VS
    ShaderDebugLine dbgLine = ShaderDebugStructuredBuffer[PrimitiveID];

    // If we got a line, then just re-emit the line coordinates
    if((dbgLine.flag & SHADER_DEBUG_PRIM_MASKBITS) == SHADER_DEBUG_PRIM_LINE)
    {
        GS_Out.Append(gin[0]);
        GS_Out.Append(gin[1]);
        GS_Out.RestartStrip();
        return;
    }

    float4 pos = gin[0].Pos;

    // Draw cross
    vsOut p;
    p.Color = gin[0].Color;
    p.Pos = pos + float4(DigitFontWidth*0.5,0,0,0);
    GS_Out.Append(p);
    p.Pos = pos + float4(-DigitFontWidth*0.5,0,0,0);
    GS_Out.Append(p);
    GS_Out.RestartStrip();
    p.Pos = pos + float4(0,DigitFontWidth*0.5,0,0);
    GS_Out.Append(p);
    p.Pos = pos + float4(0,-DigitFontWidth*0.5,0,0);
    GS_Out.Append(p);
    GS_Out.RestartStrip();

    // Draw the numbers, as lines
    pos += float4(0,-DigitFontWidth*1.5,0,0);
    float3 number = gin[1].Pos.xyz;

    if ((dbgLine.flag & SHADER_DEBUG_PRIM_MASKBITS) == SHADER_DEBUG_PRIM_FLOAT1)
    {
        // Less floats drawn means we can afford more precision without exceeding maxvertexcount
        DebugDrawFloatHelperGS(number.x, pos, GS_Out, gin[0].Color, 12);
    }
    else if ((dbgLine.flag & SHADER_DEBUG_PRIM_MASKBITS) == SHADER_DEBUG_PRIM_FLOAT2)
    {
        // Less floats drawn means we can afford more precision without exceeding maxvertexcount, 12/2 = 6 digits
        DebugDrawFloatHelperGS(number.x, pos, GS_Out, gin[0].Color, 6);
        pos.y -= DigitFontWidth * 2;
        DebugDrawFloatHelperGS(number.y, pos, GS_Out, gin[0].Color, 6);
    }
    else //if ((dbgLine.flag & SHADER_DEBUG_PRIM_MASKBITS) == SHADER_DEBUG_PRIM_FLOAT3)
    {
        // 3*4 we draw 12 digits here...
        DebugDrawFloatHelperGS(number.x, pos, GS_Out, gin[0].Color, 4);
        pos.y -= DigitFontWidth * 2;
        DebugDrawFloatHelperGS(number.y, pos, GS_Out, gin[0].Color, 4);
        pos.y -= DigitFontWidth * 2;
        DebugDrawFloatHelperGS(number.z, pos, GS_Out, gin[0].Color, 4);
    }
}
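As a cross-check of the digit budgeting in DebugDrawFloatHelperGS, here is a small CPU-side C++ sketch of how a fixed number of digits gets split between the integer and fractional parts, and of the order in which DebugDrawIntGS walks the digits. It is not a port of the shader: it counts integer digits with a loop instead of log10 and clamps the subtraction explicitly rather than relying on max() over unsigned operands. The names are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct DigitSplit { uint32_t intDigits; uint32_t fractDigits; };

// Splits a budget of totalDigits between the integer and fractional parts:
// the more integer digits the value needs, the fewer decimals are left.
DigitSplit SplitDigitBudget(float number, uint32_t totalDigits)
{
    const uint32_t intPart = static_cast<uint32_t>(std::fabs(number));
    uint32_t intDigits = 0;
    for (uint32_t v = intPart; v > 0; v /= 10u) ++intDigits;
    // Explicit clamp: never let the unsigned subtraction wrap around.
    const uint32_t fractDigits = intDigits < totalDigits ? totalDigits - intDigits : 0u;
    return { intDigits, fractDigits };
}

// Digits in the order they are drawn: least significant first, each one
// placed to the left of the previous, padded with zeros up to numDigits.
std::vector<uint32_t> DigitsRightToLeft(uint32_t value, uint32_t numDigits)
{
    std::vector<uint32_t> digits;
    for (uint32_t i = 0; i < numDigits; ++i) {
        digits.push_back(value % 10u);
        value /= 10u;
    }
    return digits;
}
```

With a 4-digit budget, 3.14159 gets one integer digit and three decimals, while 123.5 gets three integer digits and a single decimal: same digit count, floating point.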
3 comments:
Fantastic!
"I want to have it opensourced, and it's based on MJP's excellent SampleFramework11."
Btw did you publish your code on any public repository?
What about SampleFramework11, is it open-sourced too? I could not find any repo on MJP's site.
SampleFramework11 is not on a repo, but you can download it with any of MJP's samples and it includes an MIT license.
My additions are on a private repo, way too messy to make them public yet.