I spent a day and a half finding out that this line:
should have been this one instead:
return &GetPrevReadBuffer(mWritePose)[mWritePose * NUMVECTOR4PERPOSEMATRIX];
And I'd had other problems with that code before, too. Basically, I'm using a double-buffered (and interleaved... in a complicated way) three-dimensional vector4 array to store animation data...
Note to self: DON'T ever do that again. DON'T use pointer arithmetic, ever. Wrap arrays in classes. I didn't do that because (to make things more complicated) I have three different representations of that data: the "simulation" side sees it as scale-quaternion-translation classes, the replay sees it as compressed versions of the same, and the rendering expands the compressed versions into affine matrices...
Now I have to go back to debugging, because there's still a problem in the interleaving and interpolation code that, even though I've added debug asserts and test cases everywhere, is still hiding somewhere. AAAAAAAAAAAAAAAAAAAAA!!!!
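For the record, here is roughly what "wrap arrays in classes" could look like for a double-buffered pose array. This is only a sketch, not my actual engine code: `PoseBuffers`, `Vector4`, `kVector4PerPoseMatrix`, and the value 3 for rows per pose are all made up for illustration, and it ignores the interleaving entirely. The point is that the index math lives in one private function with asserts, so nobody multiplies by the wrong thing at a call site.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-float vector; stands in for whatever the engine uses.
struct Vector4 { float x, y, z, w; };

// Assumption for this sketch: a pose matrix is stored as 3 Vector4 rows.
constexpr std::size_t kVector4PerPoseMatrix = 3;

// Two buffers of pose rows: one is written this frame, the other is read.
class PoseBuffers {
public:
    explicit PoseBuffers(std::size_t poseCount) : mPoseCount(poseCount) {
        for (auto& buffer : mBuffers) {
            buffer.resize(poseCount * kVector4PerPoseMatrix);
        }
    }

    // Flip the roles: the buffer just written becomes the read buffer.
    void Swap() { mWriteIndex ^= 1u; }

    // Mutable access to one row of one pose in the write buffer.
    Vector4& WriteAt(std::size_t pose, std::size_t row) {
        return mBuffers[mWriteIndex][Offset(pose, row)];
    }

    // Read-only access to one row of one pose in the read buffer.
    const Vector4& ReadAt(std::size_t pose, std::size_t row) const {
        return mBuffers[mWriteIndex ^ 1u][Offset(pose, row)];
    }

    std::size_t PoseCount() const { return mPoseCount; }

private:
    // The only place where (pose, row) is turned into a flat index.
    std::size_t Offset(std::size_t pose, std::size_t row) const {
        assert(pose < mPoseCount);
        assert(row < kVector4PerPoseMatrix);
        return pose * kVector4PerPoseMatrix + row;
    }

    std::size_t mPoseCount;
    std::size_t mWriteIndex = 0;
    std::array<std::vector<Vector4>, 2> mBuffers;
};
```

Callers would write something like `buffers.WriteAt(pose, row) = value;` and never see a raw pointer. Each of the three representations could get its own thin view class on top of the same idea.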
P.S. Direct access to arrays is bad from a performance standpoint too. If you wrap your arrays with setters and getters, it's easier to change the in-memory layout of your elements later to reduce cache misses... There are many cases where good code design also helps performance, not directly, but by making changes after profiling easier!
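To make the layout point concrete, here is a small sketch with hypothetical names (`TranslationArray`, `Translation`). Callers only ever use `Get` and `Set`. Internally this version happens to use a structure-of-arrays layout, but it could be swapped for an array of structs, or something interleaved, without touching any calling code. Whether one layout actually beats another is something only a profiler can tell you.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value type handed to and from callers.
struct Translation { float x, y, z; };

class TranslationArray {
public:
    explicit TranslationArray(std::size_t count)
        : mX(count), mY(count), mZ(count) {}

    // Gather the three components of element i into one value.
    Translation Get(std::size_t i) const {
        Check(i);
        return {mX[i], mY[i], mZ[i]};
    }

    // Scatter one value into the three component arrays.
    void Set(std::size_t i, const Translation& t) {
        Check(i);
        mX[i] = t.x;
        mY[i] = t.y;
        mZ[i] = t.z;
    }

    std::size_t Size() const { return mX.size(); }

private:
    void Check(std::size_t i) const { assert(i < Size()); }

    // Structure-of-arrays storage: an implementation detail that callers
    // cannot see, so it can change after profiling.
    std::vector<float> mX, mY, mZ;
};
```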