Showing posts with label Stupid programming tricks. Show all posts

10 February, 2014

Design Pattern: The Push Updater

Just kidding.

Let me tell you a story instead. Near the end of Space Marines we were, as often happens at that stage, scrambling to find the extra few frames per second needed to ship (I maintain that a project should be shippable from start to end, or close to it, but I digress).

Now, as I did such a good job (...) at optimizing the GPU side of things, we got to the point where the CPU was our bottleneck, and in rendering we were mostly bound by the number of material changes we could do per frame (quite a common scenario). It turned out that we couldn't afford our texture indirections, that is, traversing for each material the pointers that led to our texture classes, which in turn contained pointers to the actual hardware textures (or well, graphics API textures). Bad.

Most of the thrashing happened in distant objects, so one of the solutions we tried was to collapse the number of materials needed in LODs. We thought of a couple of ways in which artists could specify replacement materials for given ones in an object, collapsing the number of unique materials needed to draw an LOD (thus making the process easier than manual editing of the LODs). Unfortunately it would have required some changes in the editor, and it was too late for those. We tried something along the lines of baking textures to vertex colors in the distance (a very wise thing actually), but again time was too short for that. Multithreading the draw thread was too risky as well (and on consoles we were already saturating the memory bandwidth, so we wouldn't have gained anything by doing that).

In the end we managed to find a decent balance by sorting stuff smartly, asking for a lot of elbow grease from the artists (sorry) and doing lots of other rearrangements of the data structures (splitting by access, removing some handles and so on). We ended up shipping at a solid 30fps and were quite happy.

A decent trick is to cache the frequently-accessed parts of an indirection, together with a global counter that signals when any of the objects of that kind changed and a local copy of the counter. If you saw no changes (the local copy of the counter equals the global one) you can avoid following the indirection and use the local cache instead... This can get more complex by having a small number of counters instead of just one, hashing the objects into buckets somehow, or by keeping a full record of the changes that happened and having a way for an object to see if its indirection was in the list of changed things... We did some smart stuff, but that was still a sore point. Indirections. These bastards.
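To make the counter trick concrete, here is a minimal C++ sketch of the single-counter variant. All the names (`Texture`, `HardwareTexture`, `Material`, `retarget`, `g_textureGeneration`) are made up for illustration and are not from the actual engine; it also assumes every change to a texture goes through one function that bumps the counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the engine types, for illustration only.
struct HardwareTexture { int apiHandle; };
struct Texture { HardwareTexture* hw; };

// Global counter: bumped whenever ANY texture changes (load, unload, hot-swap...).
static std::uint32_t g_textureGeneration = 0;

// Assumption: all changes to a Texture go through this function.
void retarget(Texture& texture, HardwareTexture* hw) {
    texture.hw = hw;
    ++g_textureGeneration;
}

struct Material {
    explicit Material(Texture* t)
        : texture(t), cachedHw(nullptr), seenGeneration(0) {
        assert(t != nullptr);
        cachedHw = t->hw;                      // fill the cache up front
        seenGeneration = g_textureGeneration;  // and remember when we did
    }

    // Hot path: one integer compare instead of chasing texture->hw.
    HardwareTexture* hardwareTexture() {
        if (seenGeneration != g_textureGeneration) {
            cachedHw = texture->hw;            // slow path, follow the indirection
            seenGeneration = g_textureGeneration;
        }
        return cachedHw;
    }

    Texture* texture;               // the indirection we would like to skip
    HardwareTexture* cachedHw;      // local cache of the frequently-accessed part
    std::uint32_t seenGeneration;   // local copy of the global counter
};
```

With a single counter, one texture change makes every material take the slow path once, which is why the bucketed or change-list variants start to look attractive when changes are not that rare.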

So, wrapping up the project we went to the drawing board and started tinkering and writing down plans for the next revision of the engine. We estimated that without these indirections we could push 40% more objects each frame. But why do you have these to begin with? Well, the bottom line is that you often want to update some data that is shared among objects. In case of the textures the indirection served our reference counting system which was used to load and unload them, hot-swapping during development and streaming in-game.

Here comes the "push pattern" to the rescue. The idea is simple: instead of going through an indirection to fetch the updated data, create an object (you can call it UpdateManager and create it with a Factory and maybe template it with some policies, if that's what turns you on) that will store the locations of all the copies of a piece of data (sort of like a database version of a garbage collector), so every time you make a copy or destroy a copy you register this fact. Now, if creations, destructions and updates are infrequent compared to accesses, having copies all around instead of indirections will significantly speed up the runtime, while you can still do global updates via the manager by poking the new data into all the registered locations.
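A bare-bones C++ sketch of such a manager could look like the following. `PushUpdater`, `registerCopy`, `unregisterCopy` and `DrawItem` are names I'm inventing here, not anything we shipped, and the sketch assumes that registered copies never move in memory without being unregistered first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: T is the small piece of shared data
// (for example a hardware texture handle).
template <typename T>
class PushUpdater {
public:
    explicit PushUpdater(T initial) : value_(initial) {}

    // Call when a copy is created: fills it in and remembers where it lives.
    void registerCopy(T* location) {
        assert(location != nullptr);
        *location = value_;
        locations_.push_back(location);
    }

    // Call before a copy is destroyed (or moved somewhere else in memory).
    void unregisterCopy(T* location) {
        auto it = std::find(locations_.begin(), locations_.end(), location);
        if (it != locations_.end()) {
            *it = locations_.back();  // swap-and-pop, order does not matter
            locations_.pop_back();
        }
    }

    // Rare path: poke the new value into every registered location.
    void update(T newValue) {
        value_ = newValue;
        for (T* location : locations_) *location = newValue;
    }

    std::size_t copyCount() const { return locations_.size(); }

private:
    T value_;
    std::vector<T*> locations_;
};

// A consumer holds a plain copy, so reading it at draw time is a direct load.
struct DrawItem { int textureHandle = 0; };
```

The linear search in `unregisterCopy` is fine only because destroying copies is assumed to be rare; if it isn't, you would want each copy to remember its slot in the manager.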

A nifty thing is that the manager could even sort the updates to be propagated by memory location, thus pushing many updates at once with potentially fewer cache misses. This is basically what we do in some subsystems in an implicit way. Think about culling for example: if you have some bounding volumes which contain an array of pointers to objects, and as these bounding volumes are found visible you append the pointers to a visible object list, you're "pushing" an (implicit) message that said objects were found visible...
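The sorting idea can be sketched as a small queue of pending writes that gets flushed in address order. Again this is C++ with invented names (`PendingWrite`, `WriteQueue`), it stores plain `int` values to keep things short, and whether the sort actually pays for itself depends entirely on how many writes you batch and how scattered they are.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical: one queued "poke this value at this address" request.
struct PendingWrite {
    int* location;
    int value;
};

class WriteQueue {
public:
    void push(int* location, int value) {
        assert(location != nullptr);
        pending_.push_back({location, value});
    }

    // Applies all queued writes in ascending address order and
    // returns how many were applied.
    std::size_t flush() {
        // stable_sort keeps the queue order for writes to the same
        // address, so the last one queued still wins.
        std::stable_sort(pending_.begin(), pending_.end(),
            [](const PendingWrite& a, const PendingWrite& b) {
                // std::less gives a total order even for unrelated pointers.
                return std::less<int*>()(a.location, b.location);
            });
        for (const PendingWrite& w : pending_) *w.location = w.value;
        const std::size_t applied = pending_.size();
        pending_.clear();
        return applied;
    }

private:
    std::vector<PendingWrite> pending_;
};
```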

13 May, 2013

Peek'n'Poke

Sometimes I write tools small and stupid enough to be contained in a blog post. This is one of them...

I always wanted to have graphical visualizers inside Visual Studio, to see matrices, points, images and such things from raw memory locations. It turns out that's very simple if you just ReadProcessMemory from an external tool, even simpler than writing a Visual Studio extension. Of course, this doesn't work when remote debugging (the simplest option there would be to write a server, or just something intrusive in the code).



This small C# sample displays images from a process's memory, refreshing every 33ms. It supports a few formats (r8 is broken as I was too lazy to set the palette; expect bugs in general...) but it could easily be extended to do whatever you need (e.g. graph floats over time...).

Enjoy!

P.S. If you extend/fix/find anything incredibly dumb in the code below, leave a comment! Thanks...

In the future, it would be really cool to have a dynamic debugging/program visualization tool. There is already quite some work in this area, especially if you look at the reversing/hacking community.


Update: Now with floating point images support and endian swaps...
// See http://blackandodd.blogspot.ca/2012/12/c-read-and-write-process-memory-in.html
// and http://www.mpgh.net/forum/250-c-programming/298510-c-writeprocessmemory-readprocessmemory.html
 
using System;
 
namespace Peek
{
    class Program
    {
#region Kernel Imports
        // http://msdn.microsoft.com/en-us/library/windows/desktop/ms684880(v=vs.85).aspx
        const uint ACL_DELETE = 0x00010000;
        const uint ACL_READ_CONTROL = 0x00020000;
        const uint ACL_WRITE_DAC = 0x00040000;
        const uint ACL_WRITE_OWNER = 0x00080000;
        const uint ACL_SYNCHRONIZE = 0x00100000;
        const uint ACL_END = 0xFFF; //if you have Windows XP or Windows Server 2003 you must change this to 0xFFFF
        const uint PROCESS_VM_READ = 0x0010;
        const uint PROCESS_VM_WRITE = 0x0020;
        const uint PROCESS_VM_OPERATION = 0x0008;
        const uint PROCESS_ALL_ACCESS = (ACL_DELETE | ACL_READ_CONTROL | ACL_WRITE_DAC | ACL_WRITE_OWNER | ACL_SYNCHRONIZE | ACL_END);
 
        [System.Runtime.InteropServices.DllImport("kernel32.dll")]
        static extern uint OpenProcess(uint dwDesiredAccess, bool bInheritHandle, int dwProcessId);
        [System.Runtime.InteropServices.DllImport("kernel32.dll")]
        static extern bool ReadProcessMemory(uint hProcess, UIntPtr lpBaseAddress, IntPtr buffer, uint size, uint lpNumberOfBytesRead);
        /*[System.Runtime.InteropServices.DllImport("kernel32.dll")]
        static extern bool WriteProcessMemory(uint hProcess, UIntPtr lpBaseAddress, byte[] buffer, uint size, uint lpNumberOfBytesWritten);
        [System.Runtime.InteropServices.DllImport("kernel32.dll")]
        static extern bool WriteProcessMemory(uint hProcess, UIntPtr lpBaseAddress, IntPtr buffer, uint size, uint lpNumberOfBytesWritten);*/
 
        class UnmanagedMemWrapper // should we GC.AddMemoryPressure?
        {
            public UnmanagedMemWrapper(uint size)
            {
                this.ptr = System.Runtime.InteropServices.Marshal.AllocHGlobal((int)size);
            }
            ~UnmanagedMemWrapper()
            {
                System.Runtime.InteropServices.Marshal.FreeHGlobal(ptr);
            }
            
            public IntPtr ptr;
        }
#endregion // Kernel Imports
 
 
        // Utility, half2float, could use DirectXMath DirectX::PackedVector functions instead...
        [System.Runtime.InteropServices.DllImport("d3dx9_35.dll")]
        public static extern void D3DXFloat16To32Array(float[] output, IntPtr input, uint nfloats);
 
        static void PrintUsageAndErrors(string error)
        {
            System.Console.WriteLine("Peek");
            System.Console.WriteLine("----");
            System.Console.WriteLine();
            System.Console.WriteLine("Arguments: process name, instance number, pointer address, [peek mode]");
            System.Console.WriteLine("Note that multiple processes can have the same name...");
            System.Console.WriteLine();
            System.Console.WriteLine("Peek mode:");
            System.Console.WriteLine(" img [format] xsize ysize -- draws a 2d image");
            System.Console.WriteLine("  Supported formats: argb8 argb16 rgb8 rgb16 r8 r16 argb32f argb16f rgb32f rgb16f r32f r16f");
            System.Console.WriteLine();           
 
            if(error.Length!=0)
            {
                System.Console.WriteLine("Error!");
                System.Console.WriteLine(error);
            }
        }
 
        [STAThread]
        static void Main(string[] args)
        {
            if (args.Length < 5)
            {
                PrintUsageAndErrors("Not enough arguments"); return;
            }
 
            var procs = System.Diagnostics.Process.GetProcessesByName(args[0]);
            UInt32 procNumber = 0;
 
            if (!UInt32.TryParse(args[1], out procNumber))
            {
                PrintUsageAndErrors("Can't parse process number"); return;
            }
 
            if (procs.Length <= procNumber)
            {
                PrintUsageAndErrors("Process instance not found"); return;
            }
 
            var proc = procs[procNumber];
            uint procHandle = OpenProcess(PROCESS_VM_READ, false, proc.Id);
 
            if (procHandle == 0)
            {
                PrintUsageAndErrors("Failed to open process"); return;
            }
 
            switch (args[3])
            {
                case "img":
                    {
                        UInt32 xsize, ysize;
                        // img needs format, xsize and ysize, so 7 arguments in total
                        if ((args.Length < 7) || (!UInt32.TryParse(args[5], out xsize)) || (!UInt32.TryParse(args[6], out ysize)))
                        {
                            PrintUsageAndErrors("Can't parse img size"); return;
                        }
 
                        switch (args[4])
                        {
                            case "argb8":
                                PeekImg(procHandle, args[2], xsize, ysize, 4, System.Drawing.Imaging.PixelFormat.Format32bppArgb, ImgOP.NONE);
                                break;
                            case "rgb8":
                                PeekImg(procHandle, args[2], xsize, ysize, 3, System.Drawing.Imaging.PixelFormat.Format24bppRgb, ImgOP.NONE);
                                break;
                            case "argb16":
                                PeekImg(procHandle, args[2], xsize, ysize, 8, System.Drawing.Imaging.PixelFormat.Format64bppArgb, ImgOP.NONE);
                                break;
                            case "rgb16":
                                PeekImg(procHandle, args[2], xsize, ysize, 6, System.Drawing.Imaging.PixelFormat.Format48bppRgb, ImgOP.NONE);
                                break;
                            case "r8":
                                PeekImg(procHandle, args[2], xsize, ysize, 1, System.Drawing.Imaging.PixelFormat.Format8bppIndexed, ImgOP.NONE);
                                break;
                            case "r16":
                                PeekImg(procHandle, args[2], xsize, ysize, 2, System.Drawing.Imaging.PixelFormat.Format16bppGrayScale, ImgOP.NONE);
                                break;
                            case "argb32f":
                                PeekImg(procHandle, args[2], xsize, ysize, 4, System.Drawing.Imaging.PixelFormat.Format32bppArgb, ImgOP.F32_TO_I8);
                                break;
                            case "rgb32f":
                                PeekImg(procHandle, args[2], xsize, ysize, 3, System.Drawing.Imaging.PixelFormat.Format24bppRgb, ImgOP.F32_TO_I8);
                                break;
                            case "argb16f":
                                PeekImg(procHandle, args[2], xsize, ysize, 4, System.Drawing.Imaging.PixelFormat.Format32bppArgb, ImgOP.F16_TO_I8);
                                break;
                            case "rgb16f":
                                PeekImg(procHandle, args[2], xsize, ysize, 3, System.Drawing.Imaging.PixelFormat.Format24bppRgb, ImgOP.F16_TO_I8);
                                break;
                            case "r32f":
                                PeekImg(procHandle, args[2], xsize, ysize, 1, System.Drawing.Imaging.PixelFormat.Format8bppIndexed, ImgOP.F32_TO_I8);
                                break;
                            case "r16f":
                                PeekImg(procHandle, args[2], xsize, ysize, 1, System.Drawing.Imaging.PixelFormat.Format8bppIndexed, ImgOP.F16_TO_I8);
                                break;
                            default:
                                PrintUsageAndErrors("Unknown image format");
                                return;
                        }
 
                        break;
                    }
                default:
                    PrintUsageAndErrors("Unknown peek options");
                    return;
            }
        }
        
        enum ImgOP { NONE, F16_TO_I8, F32_TO_I8 }
 
        class PeekImgForm : System.Windows.Forms.Form
        {
            public PeekImgForm()
            {
                DoubleBuffered = true;
                Text = "Peeker";
 
                Controls.Add(memControl); 
                Controls.Add(hdrScale);
                Controls.Add(noAlphaButton); 
                Controls.Add(endianSwapButton);
                Controls.Add(fillBlackButton);                
                Controls.Add(RBSwapButton);
                Controls.Add(xresControl);
                Controls.Add(yresControl);
                Controls.Add(resetButton);
 
                resetButton.Click += delegate(object sender, System.EventArgs e)
                {
                    CreateBuffers();
                };
 
                var background = new System.Drawing.Drawing2D.HatchBrush(
                    System.Drawing.Drawing2D.HatchStyle.LargeCheckerBoard, System.Drawing.Color.Black, System.Drawing.Color.White);
 
                Paint += delegate(object sender, System.Windows.Forms.PaintEventArgs e)
                {
                    if (!ReadProcessMemory(procHandle, pointer, unmanagedMemory.ptr, readSize, 0))
                    {
                        e.Graphics.FillRectangle(System.Drawing.Brushes.Red, 0, 0, Bounds.Width, Bounds.Height);
                        return;
                    }
 
                    float scale = (float)hdrScale.Value * 255.0f;
 
                    if ((format == System.Drawing.Imaging.PixelFormat.Format64bppArgb) ||
                        (format == System.Drawing.Imaging.PixelFormat.Format48bppRgb)) // these are not 16bpp, but 13, really
                    {
                        unsafe
                        {
                            ushort* ushortPtr = (ushort*)unmanagedMemory.ptr;
                            for (int i = 0; i < imageSize / 2; i++)
                                ushortPtr[i] >>= 3;
                        }
                    }
 
                    if (imgOp == ImgOP.F16_TO_I8)
                    {
                        D3DXFloat16To32Array(tempHalfToFloatMemory, unmanagedMemory.ptr, (uint)tempHalfToFloatMemory.Length);
                        unsafe
                        {
                            fixed (float* floatPtr = tempHalfToFloatMemory)
                            {
                                byte* bytePtr = (byte*)unmanagedMemory.ptr;
                                for (int i = 0; i < imageSize; i++)
                                {
                                    float scaledVal = floatPtr[i] * scale;
                                    bytePtr[i] = (byte)(scaledVal > 255.0f ? 255.0f : scaledVal);
                                }
                            }
                        }
                    }
                    else if (imgOp == ImgOP.F32_TO_I8)
                    {
                        unsafe
                        {
                            byte* bytePtr = (byte*)unmanagedMemory.ptr;
                            float* floatPtr = (float*)unmanagedMemory.ptr;
                            /*for (int i = 0; i < imageSize; i += 4)
                            {
                                floatPtr[i] /= floatPtr[i + 3];
                                floatPtr[i+1] /= floatPtr[i + 3];
                                floatPtr[i+2] /= floatPtr[i + 3];
                            }*/
                            for (int i = 0; i < imageSize; i++)
                            {
                                float scaledVal = floatPtr[i] * scale;
                                bytePtr[i] = (byte)(scaledVal > 255.0f ? 255.0f : scaledVal);
                            }
                        }
                    }
 
                    if (endianSwapButton.Checked)
                    {
                        unsafe
                        {
                            byte* bytePtr = (byte*)unmanagedMemory.ptr;
                            for (int i = 0; i < imageSize; i += 4)
                            {
                                byte temp = bytePtr[i + 3];
                                bytePtr[i + 3] = bytePtr[i];
                                bytePtr[i] = temp;
                                temp = bytePtr[i + 2];
                                bytePtr[i + 2] = bytePtr[i + 1];
                                bytePtr[i + 1] = temp;
                            }
                        }
                    }
 
                    if (RBSwapButton.Checked) // Loop again, I don't want to code the variants...
                    {
                        unsafe
                        {
                            byte* bytePtr = (byte*)unmanagedMemory.ptr;
                            for (int i = 0; i < imageSize; i += 4)
                            {
                                byte temp = bytePtr[i + 2];
                                bytePtr[i + 2] = bytePtr[i];
                                bytePtr[i] = temp;
                            }
                        }
                    }
 
                    /*var data = bitmap.LockBits(new System.Drawing.Rectangle(0, 0, bitmap.Width, bitmap.Height)
                        , System.Drawing.Imaging.ImageLockMode.WriteOnly, bitmap.PixelFormat);
                    System.Diagnostics.Debug.Assert(data.Scan0 == unmanagedMemory.ptr);
                    bitmap.UnlockBits(data);*/
 
                    if (fillBlackButton.Checked)
                        e.Graphics.FillRectangle(System.Drawing.Brushes.Black, 0, 0, Bounds.Width, Bounds.Height);
                    else
                        e.Graphics.FillRectangle(background, 0, 0, Bounds.Width, Bounds.Height); // Draw a pattern to be able to "see" alpha...
 
                    if (noAlphaButton.Checked) // TODO: add scaling options...
                        e.Graphics.DrawImage(bitmap, new System.Drawing.Rectangle(0, 60, bitmap.Width, bitmap.Height), 0, 0, bitmap.Width, bitmap.Height, System.Drawing.GraphicsUnit.Pixel, imageAttributesKillAlpha);
                    else
                        e.Graphics.DrawImageUnscaled(bitmap, 0, 60);
                };
            }
 
            public void SetParams(uint procHandle, string ptrString, System.Drawing.Imaging.PixelFormat format, uint xsize, uint ysize, uint bytesPP, ImgOP imgOp, bool enableImgButtons, bool enableHDRButtons)
            {
                memControl.Text = ptrString;
                xresControl.Value = xsize;
                yresControl.Value = ysize;
 
                this.bytesPP = bytesPP;
                this.imgOp = imgOp;
                this.procHandle = procHandle;
                this.format = format;
 
                if (!enableImgButtons)
                {
                    endianSwapButton.Enabled = false;
                    RBSwapButton.Enabled = false;
                }
 
                if (!enableHDRButtons)
                {
                    hdrScale.Enabled = false;
                }
 
                Refresh();
            }
 
            public void CreateBuffers()
            {
                imageSize = (uint)xresControl.Value * bytesPP * (uint)yresControl.Value;
                readSize = imageSize;
                if (imgOp == ImgOP.F16_TO_I8)
                {
                    tempHalfToFloatMemory = new float[imageSize];
                    readSize *= 2;
                }
                else if (imgOp == ImgOP.F32_TO_I8)
                {
                    readSize *= 4;
                }
                unmanagedMemory = new UnmanagedMemWrapper(readSize);
 
                bitmap = new System.Drawing.Bitmap(
                    (int)xresControl.Value, (int)yresControl.Value, (int)(xresControl.Value * bytesPP), format, unmanagedMemory.ptr
                );
 
                System.Drawing.Imaging.ColorPalette palette = bitmap.Palette;
                if (palette.Entries.Length != 0)
                {
                    for (int i = 0; i < palette.Entries.Length; i++)
                        palette.Entries.SetValue(System.Drawing.Color.FromArgb(255, i, i, i), i);
                    bitmap.Palette = palette; // weird dance...
                }
 
                imageAttributesKillAlpha = new System.Drawing.Imaging.ImageAttributes();
 
                float[][] colorMatrixElements = { 
                    new float[] {1, 0, 0, 0, 0}, // red scale
                    new float[] {0, 1, 0, 0, 0}, // green scale
                    new float[] {0, 0, 1, 0, 0}, // blue scale
                    new float[] {0, 0, 0, 1, 0}, // alpha scale
                    new float[] {0, 0, 0, 1, 1}}; // translation
                imageAttributesKillAlpha.SetColorMatrix(
                    new System.Drawing.Imaging.ColorMatrix(colorMatrixElements), System.Drawing.Imaging.ColorMatrixFlag.Default, System.Drawing.Imaging.ColorAdjustType.Bitmap
                ); // TODO: RGB swaps and R-G-B channel selections and so on can/should be done with a matrix instead of the way they are currently implemented (i.e. endianSwapButton...)
 
                UInt64 pointerInt = 0;
                if (memControl.Text.StartsWith("0x"))
                {
                    try
                    {
                        pointerInt = Convert.ToUInt64(memControl.Text.Substring(2), 16);
                    }
                    catch (System.Exception) { memControl.Text = "Can't parse ptr"; }
                }
                else if (!UInt64.TryParse(memControl.Text, out pointerInt))
                {
                    memControl.Text = "Can't parse ptr";
                }
                pointer = new UIntPtr(pointerInt);
 
                Refresh();
            }
 
            uint imageSize = 0;
            uint readSize = 0;
            float[] tempHalfToFloatMemory = null;
            UnmanagedMemWrapper unmanagedMemory = null;
            UIntPtr pointer = new UIntPtr(0);
            System.Drawing.Bitmap bitmap = null;
 
            uint bytesPP;
            ImgOP imgOp;
            uint procHandle;
            System.Drawing.Imaging.PixelFormat format;
            System.Drawing.Imaging.ImageAttributes imageAttributesKillAlpha;
 
            // Meh, there was no reason to do all this by hand really...
            System.Windows.Forms.CheckBox noAlphaButton = new System.Windows.Forms.CheckBox() { Text = "NoAlpha", Left = 0, Width = 70 };
            System.Windows.Forms.CheckBox endianSwapButton = new System.Windows.Forms.CheckBox() { Text = "Endian", Left = 70, Width = 70 };
            System.Windows.Forms.CheckBox RBSwapButton = new System.Windows.Forms.CheckBox() { Text = "RB Swap", Left = 140, Width = 70 };
            System.Windows.Forms.NumericUpDown hdrScale = new System.Windows.Forms.NumericUpDown() { DecimalPlaces = 2, Minimum = -999999, Maximum = 999999, Increment = 0.25m, Value = 1, Left = 330, Width = 50 };
            System.Windows.Forms.CheckBox fillBlackButton = new System.Windows.Forms.CheckBox() { Text = "Black Backgr.", Left = 380, Width = 70 };
 
            System.Windows.Forms.NumericUpDown xresControl = new System.Windows.Forms.NumericUpDown() { Minimum = 0, Maximum = 9999, Top = 25, Left = 0, Width = 105 };
            System.Windows.Forms.NumericUpDown yresControl = new System.Windows.Forms.NumericUpDown() { Minimum = 0, Maximum = 9999, Top = 25, Left = 105, Width = 105 };
            System.Windows.Forms.TextBox memControl = new System.Windows.Forms.TextBox() { Left = 210, Width = 170, Top = 25 };
            System.Windows.Forms.Button resetButton = new System.Windows.Forms.Button() { Text = "Region Update", Left = 380, Top = 25, Width = 100 };
          
        }
 
        static void PeekImg(
            uint procHandle, string ptrString, UInt32 xsize, UInt32 ysize, UInt32 bytesPP,
            System.Drawing.Imaging.PixelFormat format, ImgOP imgOp = ImgOP.NONE
        ) // TODO: move the format params into a drop-down of the form, instead of having to specify by hand in the commandline...
        {
            using (var form = new PeekImgForm())
            {
 
                var timer = new System.Windows.Forms.Timer() { Interval = 33, Enabled = true };
                timer.Tick += delegate(object sender, EventArgs e)
                {
                    //form.Refresh(); // TODO: Enable-Disable auto refresh switch, via a command-line switch or a checkbox
                };
 
                form.SetBounds(0, 0, xsize > 600 ? (int)xsize : 600, (int)ysize + 100);
                form.SetParams(procHandle, ptrString, format, xsize, ysize, bytesPP, imgOp,
                    format == System.Drawing.Imaging.PixelFormat.Format32bppArgb, 
                    imgOp != ImgOP.NONE
                ); 
                form.CreateBuffers();         
 
                // Run...
                System.Windows.Forms.Application.EnableVisualStyles();
                form.Show(); form.Focus(); timer.Start();
                System.Windows.Forms.Application.Run(form);
            }
        }
    }
}