
10 April, 2022

DOS Nostalgia: On using a modern DOS workstation.


This blog post is useless. And rambling. As useless as the machine I'm typing it on, a Pentium 3 subnotebook from the '90s. You have been warned!

But it might be entertaining, and I suspect many of the people doing what I do and reading what I write are in a similar demographic, starting to feel nostalgic, thinking back to their formative years and wondering if they're worth revisiting...


I wanted to find a DOS machine, not for retrogaming (only), but to do actual "work". Even more narrowly, I had an idea of trying to compile an old DOS demo I made in the nineties, the only production of a short-lived Italian group called "day zero deflection" (you won't find it).

Monotasking. No internet. These things are so appealing to me right now. One tries to escape the dopamine rush of doomscrolling on all the connected devices that surround us. The flesh is weak, and instead of trying to muster the required willpower, shopping for a hardware solution seems so much more attractive. Of course, it's a fool's errand, but hey, I said this post was going to be useless.

A long intermezzo of personal history.

(skip this!) 

It's interesting how memory works. So non-linear, and unreliable. I have used a lot of computers in my life, and I started early - I began programming around six or seven years old.

This past Christmas, as the pandemic eased up, I was able to fly again and spend time with my family in southern Italy. I found one of the Commodore 64s we had.

The c64 in question. Yes, it needed some love - albeit to my surprise, all my disks worked, with my childhood code! The video glitch is actually a quite mysterious defect, but it's a story for another time...

"We", because I grew up with my older cousins. My mother is the last of eleven siblings, so I have a lot of cousins, many of them close to my house: my family used to be farmers, and thus had land that eventually became buildings, with many of my aunts and uncles ending up living in the same complex.

These older cousins taught me programming, and I was using their computers before having my own. In fact, the c64 I found is most likely theirs, as mine was eventually donated to some relative that needed it more.

I remember a lot of this, in detail, albeit I don't know anymore what details are real and what ended up as images remixed from different time eras.

We were in the basement of my aunt's villa, just next door to the building I grew up in, where we had an apartment on the top floor. We would transfer things between the two by lowering a rope from the balcony down to the villa's garden. Later, when we had PCs and network cards, we moved bits between the buildings, having suspended a coax cable that ran from the second floor of my building (where another cousin lived) to my floor, to the villa.

The basement was originally the studio of my uncle, who was the town's priest. I was named after him. He and one of his sisters died in a car accident when I was little, so I am not sure I really remember him, sadly.

But I remember the basement, the Commodore 64, and later an 8086 with an external hard drive the same size and shape as the main unit. An amber monochrome monitor, I think - or perhaps it was both amber and green, with a configuration switch.

I easily remember all of the c64 games we played. I remember bits of my coding journey, the books we used to study, and my cousin once being dismayed that I could not figure out how to make a cursor move on the screen (the math to go to the next/previous row), even if it was mostly a misunderstanding.

I remember playing with my Amiga 600 there too, Body Blows - I switched to the Amiga after visiting... another cousin, this time, in Milan.

I remember the first Pentium they had, because it allowed me to use more 3d graphics software. 3d studio 4 without having to resort to software 387 emulation! At the time I had an IBM PS/2 with a 486sx, which the seller persuaded my father was better than the 486dx another guy was offering us - who needs a math coprocessor, and IBM is a much better brand than something home-made... And I know that numerous times I lost all the data on these computers that I did not own, often by typing "format" too fast and putting in the wrong drive letter.

And then, nothing? Everything more modern than that I sort of lost - or rather, it becomes more confused. I know the places where I went shopping for (pirated) software and hardware, maybe some of the faces, not sure.

I know I used to lug my PC tower for the few kilometers that separated my house in Scafati from the "shop" (really a private apartment) I used to go to in Pompei - I was a kid, and of course did not have a car.

And that tells me that I had lots of different PC configurations over the years, LOTS of them: AMD, Intel, Voodoo cards, a Matrox of some sort, even a Sound Blaster AWE32 at one point, a CD-ROM and the early CD games. I remember the excitement for each new accessory and card, and the intense hate for cable and thermal management, especially on more modern setups.

I remember scanners - the first were hand-held (a Logitech ScanMan, then a Trust) - printers, joysticks, graphics tablets when I got into photography, the very first digital camera I had (an Olympus, I think). It's all "PC" for me; I have no idea of what I was using in which year.

At some point, around university, I switched to primarily using laptops. Acer or Asus - something cheap and powerful, but they would break often (cheap plastics). Then finally the MacBook Pro, and that one has remained a constant, still my primary personal machine today.

So. My nostalgia is really about three machines, even if I had dozens. The Commodore 64 is the one I remember the most. I am eager to play around with that one more - I ordered all sorts of hardware - but I have no intention of using it "daily". That one belongs in a museum.

The MiSTer FPGA c64 core is great and can output 50Hz!

The Amiga, which for some reason I don't care as much about anymore - I suspect mostly because I used it primarily for games, so I did not create as much on it. I think that was the key.

I had some graphics programs, but I was not a great 2d artist (DeluxePaint), and I did not understand enough of the 3d tools I happened to get my hands on (Real3D, VistaPro)... and I did no coding on it. At one point I had a (pirate) copy of AMOS, but no manual.

Swapping disks, real or virtual, is also not fun.

And then the PC, specifically the 486sx that I used both for programming again (QBasic, PowerBasic, Assembly then C with DJGPP), for graphics (Imagine, then Lightwave among others), photography, the internet...

That 486 captures all of my PC memories, even if I know it's wrong. For example, during my C demo-coding times I must have had a different computer, because the demo we were making would never run on a 486: it was sVGA. I even remember coding our sVGA layer, fixing a bug in the Matrox VESA BIOS - it was out of spec, not setting the viewport to be the same as the screen resolution when changing the latter, and many demos ran with the wrong line pitch because of that. Not mine! And the demo was, for some reason, writing buffers in separate R,G,B planes, with some MMX code I wrote to shuffle them back into the display frame.

So, it could not have been the 486 - but this is great, it gives me the freedom of not trying to recreate a particular setup but instead going for that same feeling and toolset I remember using, on an entirely different system. 

What do we "need"? 

Here's the plan. First and foremost, we'll get a laptop, because I don't have space in my apartment - no, in my life - for a retrocomputing desktop or tower. Also, I want to go to hipster coffee shops and write on my hipster retro workstation, as I am doing right now.

I planned, regardless of the machine I would end up getting, to rip out the cells from the battery pack and reconstruct it. Batteries are mostly a liability in old computers, and I prefer the weight savings of not having them. This also means that, technically, "luggable" computers could be considered.

We will look for:

  • Something fast, because if I'm buying something it must be the best I can get! I don't even care about being period-accurate, this will be a monotasking monster, not a museum piece.
  • Something I can program on, because hey, what if I like it and want to make modern retro-demos? Ideally, this means a Pentium or Pentium MMX - beautiful in-order CPUs with predictable pipelines I still (sort of) know how to cycle-count. But anything short of the dreadful Pentium 4 will do; the Pentium Pro, P2s and P3s are out-of-order, but still understandable enough.
  • RAM is not really an issue, and we will max out whatever configuration we settle on.
  • Storage is not a problem either, because we will replace whatever HDD the machine comes with with an SSD (yes, an actual SSD, albeit most people use compact-flash adapters instead) via an mSATA-to-PATA/IDE 2.5" enclosure, which can fit any half-size mSATA SSD (I got a 64GB one just to be "safe", as you never know the limits of old motherboards and firmware). You do want to make sure that the machine originally supported HDDs of a decent size (tens of GB), though.
  • A DOS-compatible (SoundBlaster-compatible) sound card is a must.
  • A TFT screen is also a must. The resolution doesn't really matter, but we want something as modern as possible, because old LCDs were really terrible. Ideally, 640x480 would get us the best DOS compatibility, but in practice it's not a problem.
  • Ideally an sVGA card with good VESA/VBE compatibility, and with good scaling from the VGA resolutions (640x480 text, 320x200 graphics) to whatever the LCD resolution is (that means, either integer-scaling and the right LCD resolution or good quality filters when upsampling).
  • A USB port is highly recommended, as we want to be able to plug in a USB storage device to easily transfer files to and from modern, internet-connected machines. Setting up networking, using PCMCIA cards, etc. would be much more painful.
  • We want a good keyboard. And, because we can, we want something cool looking, maybe an iconic piece of design, not some random garbage brand. Also, something that is easy to service.
  • Reasonably priced. There is no way I'll burn $1000 on this just because certain hardware is "hot" right now - I find it borderline immoral.

Expectations vs Reality.

After long, long deliberations, research on forums, scouting eBay and so on, I landed on an IBM ThinkPad 240x. ThinkPads are amazing machines: easy to service, iconic, with great keyboards, and the TrackPoint is usable in a pinch.

Beautiful! Pro-tip: a bit of 303 protectant makes the plastics look like new!

I paid around $200 for it. You will see people getting these for $5 at a garage sale or the like, but I'm ok paying more for something that the seller has verified is running, has no issues, and so on. More than that, I think, is crazy - but you do you...

When it arrived it looked amazing. Yes, it had scratches on the top, and even some hairline cracks, one near a hinge and one on the bottom of the chassis, but these are not a problem as I planned to disassemble the thing anyway, see if I needed to clean the internals, replace batteries, check for any leak, re-apply thermal paste if needed and so on.

Regardless of how much research you have done, the reality of the actual machine will surprise you in good and bad ways.

All the hardware setup was trivial, and all the things I thought would be hard were not. 

I gutted the battery as planned (the cells were already bulging a bit). What I feared most was the initial OS setup, but my strategy worked flawlessly: I bought an IDE-to-USB adapter, connected the SSD in its SSD-to-IDE enclosure, and mapped it as a virtual drive in a VirtualBox VM running Windows 98.

That allowed me to use Win98's fdisk and format to create something I knew would be recognized by the ThinkPad - I was not at all sure the same would happen with modern tools. For extra safety, I also made two partitions under 2GB, to be able to format them with FAT16, and the remainder of the space was left in a third partition using FAT32.

Installing the OS was a breeze, and Lenovo still hosts all the latest IBM drivers - Windows 98 just works.

The first tiny hurdle I had to overcome was the firmware update: IBM's tools are adamant about having a charged battery to perform the update... which I clearly did not have. But in reality the tool just calls a second executable, and even though the binaries have different extensions than the flashing tool expects by default, it did not take too long to figure out the right switches to use.

Upgrading the OS was also trivial. Some people have made install packs with all the official patches and lots of unofficial fixes (I used MDGx's; htasoft's is an alternative); I just grabbed one and it mostly worked. The only issue I had is that the first time around the OS stopped booting with some DMA error, but disabling a specific patch having to do with enabling DMA on drives solved it. Re-installing the OS via the SSD is relatively fast, and I also used an old copy of Norton Ghost to create snapshots.

To my surprise, even USB in DOS mostly worked (via Bret Johnson's drivers, albeit many options exist). It is not 100% reliable, nor is it fast... but it does work! Same for the TrackPoint, via cutemouse.

I ended up with the classic config.sys/autoexec.bat multiple-choice menu for things like EMM386 and so on. I remember these being so painful to deal with, but in this case it was all easy - probably also because this machine has so much RAM.
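For reference, this is roughly the shape of such a menu - a minimal sketch with illustrative paths (CTMOUSE.EXE is cutemouse's actual executable name; the rest is a typical Win98-era layout, adjust to your install):

```
REM --- CONFIG.SYS: multiple-choice boot menu ---
[MENU]
MENUITEM=PLAIN, Plain DOS (no memory managers)
MENUITEM=EMS, HIMEM + EMM386 with EMS
MENUDEFAULT=EMS, 5

[PLAIN]
REM nothing extra: maximum compatibility

[EMS]
DEVICE=C:\WINDOWS\HIMEM.SYS
DEVICE=C:\WINDOWS\EMM386.EXE RAM
DOS=HIGH,UMB

[COMMON]
FILES=40
BUFFERS=20
```

AUTOEXEC.BAT can then branch on the chosen entry via the %CONFIG% variable (e.g. `IF "%CONFIG%"=="EMS" ...`), loading things like the mouse driver (`LH C:\DRIVERS\CTMOUSE.EXE`) only where wanted.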

That is not to say there aren't problems. There are, but in a way, luckily for me, they seem to be unfixable, so I don't need to spend a ludicrous amount of time trying to overcome them (alright, alright, I already did spend more time than it's worth, using DosBox-debug and a few different decompilers to reverse an audio TSR... but I won't anymore, I swear). And I did not foresee them.

First, there is the VGA. I obsessed over resolutions, because I knew that most laptops of this era do not do resolution scaling well. I had an epiphany, though, that allowed me to stop worrying about it. It's true that, ideally, 640x480 makes you not have to worry about scaling. But! Laptops with 640x480 screens tend to have incredibly crappy, small LCDs - so much so that the unscaled 640x480 area on a more modern laptop (say, an 800x600 panel) ends up covering more screen real estate and looking better!

So, problem solved, right? Yes - if you get a card with good firmware! Unfortunately, the laptop I got has an obscure chipset that not only has crappy VESA/VBE support, but is also not software-patchable via UniVBE.

Some TSRs help a bit (vbeplus, fastvid), adding more modes by using other resolutions and forcing the viewport to clip, and you can play around with caching modes, but most DOS sVGA demos do not work. 

To be honest, that was just plain unlucky - most laptops would not be this bad at sVGA... but I guess you should expect to find at least one bit of "unlucky" hardware you did not think about in your machine.

The other issue is with DOS audio and this is a biggie. 

Yes, I paid attention, and I got a chipset that does support DOS SoundBlaster emulation. But OMG, nobody told me it was going to be this crappy! It's basically useless, with most software just not working at all, especially when it comes to digital audio. The OPL3 FM music fares better, it tends to work, albeit it might not sound great.

It's sad, but most DOS software, especially demos, has a much higher chance of running in Windows 98 than in pure DOS, as when Windows is loaded the audio emulation is much, much better.

This is something that, apparently, one simply has to live with. No PCI sound card has great DOS support, I have now learned, especially in laptops, as DOS audio support over PCI relies on a combination of the right sound card, the right motherboard and the right firmware.

It doesn't help that often, when people online report audio working in DOS, they mean DOS-under-Windows, not pure DOS... And if you get a laptop from the pre-PCI era, then you're likely on a 486 or less, which not only will be worse in all other areas - many of these laptops did not bundle any audio card at all, so they are strictly worse.

That's not to say that there are no Pentium laptops with built-in ISA audio - there are, and I was probably again unlucky, with the 240x being a rare combination of a DOS-compatible-ish PCI sound device on a "bad" motherboard (apparently the Intel 440MX chipset, which does not support DDMA). But again... expect some issues: there are no perfect laptops, and even back in the day there was hardly a configuration that would run everything flawlessly...


Was it worth it? Should you do it? Yes and no...

It's small!

For retro gaming, or in general, passive consumption (demos, etc), it's overall a terrible idea, I'm pretty confident all laptops would be terrible, and even most desktops.

The early PC landscape was just a mess of incompatible devices, buggy, unpatched software, and crashes. You were lucky when things worked, and this is true today as well. DosBox is a million times more compatible than any real hardware. Yes, it has bugs, and lots of things can be more accurate, but on average it is better than real hardware.

There are many DosBox builds out there, and I'm sure this is going to be quickly outdated, but at the time of writing I recommend:

  • On Windows, primarily DosBox-X
    • I also keep vanilla DosBox around for debugger-enabled builds (you can even get a DosBox plugin for IDA Pro, but that's for another time), plus DosBox-ECE
  • On Mac, Boxer (the Madds branch) and vanilla DosBox
    • Last time I tried, DosBox-X had issues on Mac with the mouse emulation - might have been fixed by now.

On Windows, and especially if you care about Windows of any kind, there is 86Box (a fork of PCem), which is a lower-level, more accurate emulator. DosBox does not work great even with Win3.11, due to odd mouse emulation problems that seem to be different in each fork.

If, like me, you want a monotasking machine that you can grab for a few hours at a time for a simpler, more focused experience, then I'd say these laptops are great fun!

I'm even collecting a bit of a digital retro-library, mirroring old websites (often grabbed from the Wayback Machine) and grabbing old magazines from the Internet Archive, to recreate the kind of reading materials I had back then...

Overall, setting this up took me less time and energy than tinkering with a Raspberry Pi or say, trying to install a fully functional Linux on a random contemporary laptop. It's one of the least annoying projects I have embarked upon.

My conscience feels ok too. It won't become garbage, I hate clutter, I hate having too much stuff, too many things I don't need in my life, especially digital crap that creates more problems than it really solves... With this one, I know I can sell or donate the hardware the moment I don't want to use it anymore, it's not going to be in a landfill, it's not another stupid gadget with a short lifespan.

The best part: all the software is portable. DOS doesn't really care about the hardware - you only need to replace a few lines in your config.sys if you have specific drivers - so I can migrate everything I have on this laptop to a DosBox setup (even today I keep the two in sync) or to a different machine.

Not bad. Do you want to try? Luckily it's easy - this is what I learned! You don't have to stress over the hardware (as I did), because none of it is perfect.

I went for something relatively "modern" - a laptop that in its prime would have run Windows 98/NT/2000 - and "downgraded" it to do mostly DOS. I think that's a good choice, but I don't think it ended up working much better or worse than any other option I was considering.

02 February, 2022

WTF is the Metaverse?!

Disclaimer! Yes, I work at Roblox. It's been a decade or so since I could pretend this space was anonymous, and many years ago I made it clear that c0de517e/deadc0de = Angelo Pesce. And yes, my work makes me think about what this "metaverse" thing is more than the average person on the street does (Roblox has been a metaverse company since long, long before it was "cool"). I guess like an engineer at Google might think about "the internet" more than the average person... But the following truly is not about what we are building at Roblox, which is something quite specific - these are my opinions, and other people might agree with them to some degree, or disagree.

I don't like hype cycles.

It is somewhat frustrating to see how supposedly experienced and rational people jump on the latest shiny bandwagon. At the same time, I guess it's comfortingly human. But that's a topic for another time...

Thing is, the metaverse is undoubtedly "hot" right now, so hot that every company, regardless of what they do, wants to have a claim to it. Mostly harmless, even cute, and for some, validating years of effort pushing these ideas... But, at the same time, it dilutes the concept, it makes words mean little to nothing when you can slap them onto any product.

So, let's give it a try and think about what the metaverse really is, and how, if at all, it is different from what we have today.

In the most general sense, "the metaverse" evokes ideas of synthetic, alternative places for social interactions, entertainment, perhaps even work... living our lives.

And let's set aside the possible dystopian scenarios - not the point of this, albeit, these are always important to seriously consider, while also reminding ourselves that they are levied against most society-affecting technology, from the printing press onwards.

This definition is just plain... boring!

It's boring because we have always been doing that - at least since we had the ability to connect computers together. We are social animals; obviously, we want to imagine any new technology in a social space. BBSes were alternative places for social interaction. And entertainment. And work. And from there on we had all kinds of shared virtual worlds, from IRC to the Mii Channel, from MUDs to World of Warcraft, from Club Penguin to Second Life, and so on.

LucasFilm's Habitat. Now live!

The entire internet fits the bill, through that lens, and we don't need a new word for old ideas - outside marketing perhaps.

So, let's try to find some true meaning for this word. What's new now? Is it VR/AR/XR perhaps? Web 3.0 and NFTs? The "fediverse"?

Or perhaps there is nothing new, really - we just ran out of ideas, having already explored the space of conventional social media startups, and are now trying to see if some old concept can be successful: throw a few things at the wall and see what sticks...

My thesis? Agency.

Agency is the real differentiating factor. 

Really, it's right there, staring at us. Like a high school kid facing an essay, sometimes it's good to look at the word itself, what does the dictionary tell us? Yes, we're going there: "In its most basic use, meta- describes a subject in a way that transcends its original limits, considering the subject itself as an object of reflection".

If you're controlling your virtual, alternative, synthetic universe, you are creating something that might be spectacular, engaging, entertaining, powerful... but it's not a metaverse. 

Videogames are not the metaverse, not even MMORPGs... Sandboxes/UGC/modding is not the metaverse. Virtual worlds are not the metaverse! 

Yes, I'm "disqualifying" Minecraft, Second Life, Gather.Town, GTA 5, Decentraland, Skyrim, Fortnite, Eve Online, the lot - not because of the quality of these products, but because we don't need new words for existing concepts, we really don't... 

Obviously, the line is somewhat blurry, but if you're making most of the rules you are "just" creating a world, with varying degrees of freedom.

A metaverse is an alternative living space (universe... world...) that is mostly owned by the participants, not centrally directed. Users create, share creations and make all of the rules (the meta- part).

Why does this distinction matter? Why is it interesting? 

At a shallow level, obviously, it gives you more variety than a single virtual world. It has all the interesting implications of any platform where you do not control content. You are not really asking people to enter your world or use your product; you are there to provide a service for others to create what they want to create and market it, form communities, and engage with them...

But I think it's more than that. This extra agency works to create a qualitatively different community, one that is centered around the creation and sharing of creations, an economy you might call it. Something quite different from passive consumption or social co-experience.

Ironically, through this lens, most of Web 3.0 "gets it wrong", focusing on decentralizing a transaction ledger of virtual ownership, but making that ownership simply parts of strictly controlled virtual universes. You own a certificate to a plot of digital land that someone else created and controls.

Regardless of the fact that you only own the certificate, and not the actual land, which can disappear at any moment... these kinds of worlds seem at best a coat of paint over very old and limited concepts.

To me, even outside the blockchain, the entire notion of centralized versus decentralized systems, proprietary, closed versus interoperable open standards, all these concepts are really a "how", not a "what", they might be appropriate choices for a given product at a given time, but they should never be what the product "is".

Without wanting to sell the metaverse as the future, I personally think that these "fake" or "weak" metaverses, together with the current hype, are what pushes people away from something that could be truly interesting.

Note also that nothing of this idea of social creativity, giving a platform for people to create and share in others' creations, has to do with new technologies. 

You don't need VR for any of this. You don't need hand tracking, machine learning and 3d scanning, you don't even need 3d rendering at all! 

These are all tools that might or might not be appropriate, but you could have perfectly great metaverses that are text only if you wanted to (remember MUDs? add the "meta" part...). And at the same time, just because you have some cool 3d technology, it does not mean you have something for the metaverse...

E.g. you could have a server hosting community-created ROMs for a Commodore 64, add built-in networking to allow the ROMs to be about co-experience, add a pinch of persistence to allow people to express themselves, and you'd have a perfectly great, exciting metaverse... Or you could take something like UXN and the vision of permacomputing as the foundation, to reference something more contemporary...

BBS Door Games - more proto-metaverse-y than most of today's virtual worlds.

In summary, these are to me the key attributes of this metaverse idea:

  1. Inherently Social and interactive - as we are social animals and we want to inhabit spaces that allow socialization. This mostly means real-time networking, allowing users to connect, create and experience together.
  2. User-Created: participants have full agency over the worlds. Otherwise, you're just making a conventional virtual world. This is the "meta" part: you should not have control over the worlds; users should be able to take pieces of the universe and shape them, or completely subvert everything, and own their creations.
    • Litmus test: if your users are "playing X", then X is not a metaverse. If they are playing X in Y, then Y might be a metaverse :)
  3. Must have Shareable Persistence. Users should be able, in-universe, to store and share what they create - creating an economy, connecting worlds and people. And at the very least, the world must allow for a persistent, shared representation of self (Avatars). Otherwise, you're only making a piece of middleware, a game engine.

It's a social spin over the old, OG hacker's ethos of tinkering, creating with computers, owning their creations and sharing them. It has nothing to do with the particular implementation and it is not even about laws, copyright, or politics. It's a community that creates together, makes its own rules, and has full agency over these virtual creations. 

One more thing? In a truly creator-centric economy, you don't need to base all your revenue on ads, and the dark patterns they create.

Perhaps, to shape that future, it's more useful to revisit old, lost ideas than to think about shiny new overhyped toys. More Smalltalk's idea of Personal Computing and Plan 9, less NFTs and XR...

27 December, 2020

Why Raytracing won't simplify AAA real-time rendering.

"The big trick we are getting now is the final unification of lighting and shadowing across all surfaces in a game - games had to do these hacks and tricks for years now where we do different things for characters and different things for environments and different things for lights that move versus static lights, and now we are able to do all of that the same way for everything..."

Who said this?

Jensen Huang, presenting NVidia's RTX? 

Not quite... John Carmack, in 2001, at Tokyo's MacWorld, showing Doom 3 for the first time. It was, though, on NVidia hardware just a bit less powerful than today's 20xx/30xx series: a GeForce 3.

You can watch the recording on YouTube for a bit of nostalgia.

And of course, the unifying technology at that time was stencil shadows - yes, we were at a time before shadowmaps were viable.

Now. I am not a fan of making long-term predictions, in fact, I believe there is a given time horizon after which things are mostly dominated by chaos, and it's just silly to talk about what's going to happen then.

But if we wanted to make predictions, a good starting point is to look at the history, as history tends to repeat. What happened last time that we had significant innovation in rendering hardware? 

Did compute shaders lead to simpler rendering engines, or more complex? What happened when we introduced programmable fragment shaders? Simpler, or more complex? What about hardware vertex shaders - a.k.a. hardware transform and lighting...

And so on and so forth, we can go all the way back to the first popular accelerated video card for the consumer market, the 3dfx.

Memories... A 3dfx Voodoo. PCem has some emulation for these, if one wants to play...

Surely it must have made things simpler, not having to program software rasterizers specifically for each game, for each kind of object, for each CPU even! No more assembly. No more self-modifying code, s-buffers, software clipping, BSPs... 

No more crazy tricks to get textures on screen, we suddenly got it all done for us, for free! Z-buffer, anisotropic filtering, perspective correction... Crazy stuff we never could even dream of is now in hardware. 
Imagine that - overnight you could have taken the bulk of your 3d engine and deleted it. Did it make engines simpler, or more complex? 
Our shaders today, powered by incredible hardware, are much more code, and much more complexity, than the software rasterizers of decades ago!

Are there reasons to believe this time it will be any different?

Spoiler alert: no. 

At least not in AAA real-time rendering. Complexity has nothing to do with technologies. 
Technologies can enable new products, true, but even the existence of new products is always about people first and foremost.

The truth is that our real-time rendering engines could have been dirt-simple ten years ago, there's nothing inherently complex in what we got right now.

Getting from zero to a reasonable, real-time PBR renderer is not hard. The equations are there: just render one light at a time, brute-force shadowmaps, loop over all objects and shadows, and you can get there. Use MSAA for antialiasing...
Of course, you would need to trade off some performance and quality for such relatively "brute-force" approaches... But it's doable, and it will look reasonably good.
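To give an idea of how small the core of such a brute-force renderer really is, here is a toy, scalar Python sketch of "loop over all lights, one at a time" shading (Lambert diffuse plus a GGX-style specular). All names and constants are illustrative - a sketch of the idea, not anyone's actual implementation:

```python
import math

# Toy scalar sketch; vectors are 3-tuples, colors are single floats.
def dot(a, b): return sum(x * y for x, y in zip(a, b))

def normalize(v):
    l = math.sqrt(dot(v, v))
    return tuple(x / l for x in v)

def shade_pixel(albedo, f0, roughness, n, v, lights):
    """Brute force: loop over every light, no culling, no clustering."""
    out = 0.0
    for light_dir, intensity in lights:
        l = normalize(light_dir)
        n_dot_l = max(dot(n, l), 0.0)
        if n_dot_l <= 0.0:
            continue  # Surface faces away from this light
        h = normalize(tuple(a + b for a, b in zip(l, v)))
        n_dot_h = max(dot(n, h), 0.0)
        n_dot_v = max(dot(n, v), 1e-4)
        diffuse = albedo / math.pi                       # Lambert
        a2 = (roughness * roughness) ** 2                # GGX alpha^2
        d = a2 / (math.pi * (n_dot_h * n_dot_h * (a2 - 1.0) + 1.0) ** 2)
        f = f0 + (1.0 - f0) * (1.0 - max(dot(h, l), 0.0)) ** 5  # Schlick
        k = roughness * roughness / 2.0                  # Crude Smith visibility
        vis = 1.0 / (4.0 * (n_dot_v * (1 - k) + k) * (n_dot_l * (1 - k) + k))
        out += (diffuse + d * f * vis) * n_dot_l * intensity
    return out
```

Shadowing, area lights, and tone mapping aside, that's honestly most of the "hard" math of a direct-lighting pass.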

Even better? Just download Unreal, and hire -zero- rendering engineers. Would you not be able to ship any game your mind can imagine?

The only reason we do not... is in people and products. It's organizational, structural, not technical.

We like our graphics to be cutting edge, as graphics and performance still sell games, sell consoles, get talked about.
And it's relatively inexpensive, in the grand scheme of things - rendering engineers are a small fraction of the engineering effort which in turn is not the most expensive part of making AAA games...

So pretty... Look at that sky. Worth its complexity, right?

In AAA it is perfectly ok to have someone work for, say, a month producing new, complicated code paths to save one millisecond of frame time. Often it's perfectly ok to spend a month to save a tenth of a millisecond!
As long as this equation holds, we will always sacrifice engineering simplicity, and thus accept bigger and bigger engines and more complex rendering techniques, in order to have larger, more beautiful worlds, rendered faster!

It has nothing to do with hardware, nor with the inherent complexity of photorealistic graphics.
We write all this code because we're not in the business of making disruptive new games: AAA is not where risks are taken, it's where blockbuster productions are made.

It's the nature of what we do, we don't run scrappy experimental teams, but machines with dozens of engineers and hundreds of artists. We're not trying to make the next Fortnite - that would require entirely different attitudes and methodologies.

And so, engineers gonna engineer: if you have a dozen rendering people on a game, its rendering will never be trivial. And once that's a thing people do in the industry, it's hard not to do it - you have to keep competing on every dimension if you want to stay at the top of the game.

The cyclic nature of innovation.

Another point of view, useful to make some prediction, comes from the classic works of Clayton Christensen on innovation. These are also mandatory reads if you want to understand the natural flow of innovation, from disruptive inventions to established markets.
One of the phenomena that Christensen observes is that technologies evolve in cycles of commoditization, bringing costs down and scaling, and de-commoditization, leveraging integrated, proprietary stacks to deliver innovation.

In AAA games, rendering has not been commoditized, and the trend does not seem to be heading towards commoditization yet.
Innovation is still the driving force behind real-time graphics, not scale of production. Even if we have been saying for years, perhaps decades, that we were at the tipping point, in practice we never seemed to reach it.

We are not even, at least in the big titles, close to the point where production efficiency for artists and assets is really the focus.
It's crazy to say, but still today our rendering teams typically dwarf the efforts put into tooling and asset-production efficiency.

We live in a world where it's imperative for most AAA titles to produce content at a steady pace. Yet we don't see this percolating into the technology stack: look at the actual engines (if you have experience of them), look at the talks and presentations at conferences. We are still focusing on features, quality, and performance more than anything else.

We do not like to accept tradeoffs on our stacks, we run on tightly integrated technologies because we like the idea of customizing them to the game specifics - i.e. we have not embraced open standards that would allow for components in our production stacks to be shared and exchanged.

Alita - rendered with Weta's proprietary (and RenderMan-compatible) Manuka

I do not think this trend will change at the top end for the next decade or so at least - the only time horizon over which I would even care to make predictions.
I think we will see a focus on the efficiency of artist tooling - this shift in attention is already underway - but engines themselves will only keep growing in complexity, and the same goes for rendering overall.

We have seen just recently, in the movie industry (which is another decent way of "predicting" the future of real-time), that production pipelines are becoming somewhat standardized around common interchange formats.
For the top studios, rendering itself is not, with most big ones running their own proprietary path-tracing solutions...

So, is it all pain? And will it always be?

No, not at all! 

We live in a fantastic world full of opportunities for everyone. There is definitely a lot of real-time rendering that has been completely commoditized and abstracted.
People can create incredible graphics without knowing anything at all about how things work underneath, and this is definitely something incredibly new and exciting.

Once upon a time, you had to be John friggin' Carmack (and we went full circle...) to make a 3d engine, create Doom, and be legendary because of it. Hardcore pixel-pushing ability enabled entire game genres that were impossible to create without the very best technical skills.

Today? I believe an FPS template ships for free with Unity; you can download Unreal with its source code for free; you have Godot... All products that invest in art efficiency and ease of use first and foremost.

Everyone can create any game genre with little complexity, without caring about technology - the complicated stuff is only there for cutting-edge "blockbuster" titles where bespoke engines matter, and only to deliver better features (e.g. fidelity, performance, etc.), not to fundamentally enable the game to exist...

And that's already professional stuff - we can do much better!

Three.js is the most popular 3d engine on github - you don't need to know anything about 3d graphics to start creating. We have Roblox, Dreams, Minecraft and Fortnite Creative. We have Notch, for real-time motion graphics...
Computer graphics has never been simpler, and at the same time, at the top end, never been more complex.

Roblox creations are completely tech-agnostic.


AAA will stay AAA - and for the foreseeable future it will keep being wonderfully complicated.
Slowly we will invest more in productivity for artists and asset production - as it really matters for games - but it's not a fast process.

It's probably easier for AAA to become relatively irrelevant (compared to the overall market size, which expands faster in other directions than in the established AAA one) than for it to radically embrace change.

Other products and other markets are where real-time rendering is commoditized and radically different. It -is- already: all these products already exist, and we already have huge market segments that do not need to bother at all with technical details. And the quality and scope of these games grow year after year.

This market was facilitated by the fact that we now have 3d hardware acceleration in pretty much any device - but at the same time, new hardware is not going to change any of that.

Raytracing will only -add- complexity at the top end. It might make certain problems simpler, perhaps (note - right now people seem to underestimate how hard it is to make good RT-shadows or, even worse, RT-reflections, which are truly hard...), but it will also make the overall effort to produce a AAA frame bigger, not smaller - like all technologies before it.
We'll see incredible hybrid techniques, and if we have today dozens of ways of doing shadows and combining signals to solve the rendering equation in real-time, we'll only grow these more complex - and wonderful, in the future.

Raytracing will eventually percolate to the non-AAA space too, as all technologies do.

But that won't change complexity or open new products there either, because people who are making real-time graphics with higher-level tools already don't have to care about the technology that drives them - technology there will always evolve under the hood, never to be seen by the users...

17 December, 2020

Hallucinations re: the rendering of Cyberpunk 2077


Two curses befall rendering engineers. First, we lose the ability to look at reality without being constantly reminded of how fascinatingly hard it is to solve light transport and model materials.

Second, when you start playing any game, you cannot refrain from trying to reverse-engineer its rendering technology (which is particularly infuriating for multiplayer titles - stop shooting at me, I'm just here to look at how rocks cast shadows!).

So when I bought Cyberpunk 2077 I had to look at how it renders a frame. It's very simple to take RenderDoc captures of it, so I had really no excuse.

The following are speculations on its rendering techniques, observations made while skimming captures, and playing a few hours.

It's by no means a serious attempt at reverse engineering. For that, I lack both the time and the talent. I also rationalize doing a bad job at this by the following excuse: it's actually better this way. 

I think it's better to dream about how rendering (or anything really) could be, just with some degree of inspiration from external sources (in this case, RenderDoc captures), rather than exactly knowing what is going on.

If we know, we know, there's no mystery anymore. It's what we do not know that makes us think, and sometimes we guess exactly what's going on, but other times we do one better: we hallucinate something new... Isn't that wonderful?

The following is mostly a read-through of a single capture. I did open a second one to try to fill some blanks, but so far, that's all.

This is the frame we are going to look at.

I made the captures at high settings, without RTX or DLSS, as RenderDoc does not allow these (yet?). I disabled motion blur and other uninteresting post-fx, and made sure I was moving in all captures to be able to tell a bit better when passes access previous frame(s) data.

I am also not relying on insider information for this. Makes everything easier and more fun.

The basics

At a glance, it doesn't take long to describe the core of Cyberpunk 2077 rendering.

It's a classic deferred renderer, with a fairly vanilla g-buffer layout. We don't see the crazy amount of buffers of, say, Sucker Punch's PS4 launch title Infamous: Second Son, nor complex bit-packing and re-interpretation of channels.

Immediately recognizable g-buffer layout
  • Normals, with the 2-bit alpha reserved to mark hair
  • Albedo. Not clear what the alpha is doing here; it seems to just be set to one for everything drawn, but that might just be the captures I got
  • Metalness, Roughness, Translucency and Emissive, in this order (RGBA)
  • Z-buffer and Stencil. The latter seems to isolate object/material types. Moving objects are tagged. Skin. Cars. Vegetation. Hair. Roads. Hard to tell / would take time to identify the meaning of each bit, but you get the gist...
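As an illustration of this kind of stencil tagging (the actual bit assignments here are my invention, not reverse-engineered from the game), a small Python sketch:

```python
# Hypothetical stencil bit layout - the real bit meanings in the game are
# unknown; these names are just for illustration.
STENCIL_MOVING     = 1 << 0
STENCIL_SKIN       = 1 << 1
STENCIL_CAR        = 1 << 2
STENCIL_VEGETATION = 1 << 3
STENCIL_HAIR       = 1 << 4
STENCIL_ROAD       = 1 << 5

def tag(*flags):
    """Combine object/material tags into an 8-bit stencil reference value."""
    value = 0
    for f in flags:
        value |= f
    return value & 0xFF

def has_tag(stencil, flag):
    """Deferred passes can test bits to specialize shading per category."""
    return (stencil & flag) != 0
```

Later passes (lighting, decals, SSS) can then cheaply select or reject pixels per category without touching the g-buffer itself.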

If we look at the frame chronologically, it starts with a bunch of UI draws (that I didn't investigate further), a bunch of copies from a CPU buffer into VS constants, then a shadowmap update (more on this later), and finally a depth pre-pass.

Some stages of the depth pre-pass.

This depth pre-pass is partial (not drawing the entire scene) and is only used to reduce the overdraw in the subsequent g-buffer pass.

Basically, all the geometry draws are using instancing and some form of bindless textures. I'd imagine this was a big part of updating the engine from The Witcher 3 to contemporary hardware. 

Bindless also makes it quite annoying to look at the capture in RenderDoc, unfortunately - by spot-checking I could not see too many different shaders in the g-buffer pass - perhaps a sign of not having allowed artists to make shaders via visual graphs?

Other wild guesses: I don't see any front-to-back sorting in the g-buffer, and the depth prepass renders all kinds of geometries, not just walls, so it would seem that there is no special authoring for these (brushes forming a BSP), nor have artists hand-tagged objects for the prepass, as some relatively "bad" occluders make the cut. I imagine that after culling, a list of objects is sorted by shader, and from there instanced draws are dynamically formed on the CPU.

The opening credits do not mention Umbra (which was used in The Witcher 3) - so I guess CDPr rolled their own visibility solution. Its effectiveness is really hard to gauge, as visibility is a GPU/CPU balance problem, but there seem to be quite a few draws that do not contribute to the image, for what it's worth. It also looks like at times the rendering can display "hidden" rooms, so it's probably not a cell-and-portal system - I am guessing that for such large worlds it's impractical to ask artists to do lots of manual work for visibility.

A different frame, with some of the pre-pass.
Looks like some non-visible rooms are drawn then covered by the floor - which might hint at culling done without old-school brushes/BSP/cell&portals?

Lastly, I didn't see any culling done GPU side, with depth pyramids and so on, no per-triangle or cluster culling or predicated draws, so I guess all frustum and occlusion culling is CPU-side.

Note: people are asking if "bad" culling is the reason for the current performance issues, I guess meaning on ps4/xb1. This inference cannot be made, nor can the visibility system be called "bad" - as I wrote already. FWIW - it seems mostly that consoles struggle with memory and streaming more than anything else. Who knows...

Let's keep going... After the main g-buffer pass (which seems to be always split in two - not sure if there's a rendering reason or perhaps these are two command buffers done on different threads), there are other passes for moving objects (which write motion vectors - the motion vector buffer is first initialized with camera motion).

This pass includes avatars, and the shaders for these objects do not use bindless (perhaps that's used only for world geometry) - so it's much easier to see what's going on there if one wants to.

Finally, we're done with the main g-buffer passes, depth-writes are turned off and there is a final pass for decals. Surprisingly these are pretty "vanilla" as well, most of them being mesh decals.

Mesh decals bind as input (a copy of) the normal buffer, which is interesting, as one might imagine the 10.10.10 format was chosen to allow for easy hardware blending, but it seems that some custom blend math is used as well - something important enough to pay the price of making a copy (on PC at least).
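For reference, packing a signed normal into such a 10.10.10.2 format looks roughly like this (a hedged sketch; the function names are mine, and the real engine may use a different signed/unsigned mapping):

```python
def pack_unorm10(x):
    """Map a [-1,1] float component to a 10-bit unsigned integer."""
    return max(0, min(1023, int(round((x * 0.5 + 0.5) * 1023.0))))

def pack_normal_10_10_10_2(n, alpha2bit=0):
    """Pack a normal into 30 bits plus a 2-bit alpha tag (e.g. hair)."""
    x, y, z = (pack_unorm10(c) for c in n)
    return x | (y << 10) | (z << 20) | ((alpha2bit & 0x3) << 30)

def unpack_normal_10_10_10_2(p):
    """Inverse mapping: returns ((x, y, z), alpha2bit)."""
    u = lambda v: (v / 1023.0) * 2.0 - 1.0
    n = (u(p & 0x3FF), u((p >> 10) & 0x3FF), u((p >> 20) & 0x3FF))
    return n, (p >> 30) & 0x3
```

With only ~0.001 quantization step per component, hardware blending on such a target is plausible, which is why the custom-blend copy is the interesting part.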

A mesh decal - note how it looks like the original mesh with the triangles that do not map to decal textures removed.

It looks like only triangles carrying decals are rendered, using special decal meshes, but other than that everything is remarkably simple. It's not bindless either (only the main static geometry g-buffer pass seems to be), so it's easier to see what's going on here.

At the end of the decal pass we sometimes see projected decals as well. I haven't investigated dynamic ones created by weapons, but the static ones on the levels are just applied with tight boxes around geometry (hand-made, I guess), without any stencil-marking technique (which would probably not help in this case) to try to minimize the shaded pixels.

Projected decals do bind depth-stencil as input as well, obviously as they need the scene depth, to reconstruct world-space surface position and do the texture projection, but probably also to read stencil and avoid applying these decals on objects tagged as moving.

A projected decal, on the leftmost wall (note the decal box in yellow)

As for the main g-buffer draws, many of the decals might end up not contributing at all to the image, and I don't see much evidence of decal culling (as some tiny ones are drawn) - but it also might depend on my chosen settings.

The g-buffer pass is quite heavy, but it has lots of detail, and it's of course the only pass that depends on scene geometry - still, a fraction of the overall frame time. E.g. look at the normals on the ground, pushed beyond the point of aliasing. At least in this PC capture, textures seem even biased towards aliasing, perhaps knowing that temporal will resolve them later (which it absolutely does in practice; rotating the camera often reveals texture aliasing that immediately gets resolved when stopping - not a bad idea, especially as noise during view rotation can be masked by motion blur).

1:1 crop of the final normal buffer

A note re:Deferred vs Forward+

Most state-of-the-art engines are deferred nowadays. Frostbite, Guerrilla's Decima, Call of Duty BO3/4/CW, Red Dead Redemption 2, Naughty Dog's Uncharted/TLOU and so on.

On the other hand, the amount of advanced trickery that Forward+ allows is unparalleled, and it has been adopted by a few to do truly incredible rendering - see for example the latest Doom games, or have a look at the mind-blowing tricks behind Call of Duty: Modern Warfare / Warzone (and the previous Infinite Warfare, which was the first time that COD line moved from being a crazy complex forward renderer to a crazy complex forward+).

I think the jury is still out on all this, and as with most things rendering (or well, coding!), we don't know anything about what's optimal; we just make/inherit choices and optimize around them.

That said, I'd wager this was a great idea for CP2077 - and I'm not surprised at all to see this setup. As we'll see in the following, CP2077 does not seem to have baked lighting, relying instead on a few magic tricks, most of which operating in screen-space.

For these to work, you need to know materials and normals before lighting, so you need to write a g-buffer anyway. You also need temporal reprojection, so you want motion vectors, and to compute lighting effects in separate passes (that you can then appropriately reproject, filter, and composite).

I would venture to say also that this was done not because of the need for dynamic GI - there's very little from what I've seen in terms of moving lights and geometry is not destructible. I imagine instead, this is because the storage and runtime memory costs of baked lighting would be too big. Plus, it's easier to make lighting interactive for artists in such a system, rather than trying to write a realtime path-tracer that accurately simulates what your baking system results would be...

Lastly, as we're already speculating, I'd imagine that CDPr wanted to really focus on artists and art. A deferred renderer can help there in two ways. First, its performance is less coupled with the number of objects and vertices on screen, as only the g-buffer pass depends on them, so artists can be a smidge less "careful" about these.
Second, it's simpler overall - and in an open-world game you already have to care about so many things that having to carefully tune your gigantic forward+ shaders for occupancy is not a headache you want to have to deal with...

Lighting part 1: Analytic lights

Obviously, no deferred rendering analysis can stop at the g-buffer; we split shading in two, and we now have to look at the second half, how lighting is done.

Here things become a bit dicier, as in the modern age of compute shaders, everything gets packed into structures that we cannot easily see. Even textures can be hard to read when they do not carry continuous data but pack who-knows-what into integers.

Normal packing and depth pyramid passes.

Regardless, it's pretty clear that after all the depth/g-buffer work is said and done, an uber-summarization pass kicks in, taking care of a bunch of depth-related stuff.

RGBA8 packed normal (&roughness). Note the speckles that are a tell-tale of best-fit-normal encoding.
Also, note that this happens after hair rendering - which we didn't cover.

It first packs normal and roughness into a RGBA8 using Crytek's lookup-based best-fit normal encoding, then it creates a min-max mip pyramid of depth values.
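The min-max pyramid step is simple to sketch: every output texel keeps both the nearest and the farthest depth of its 2x2 footprint, so later passes can conservatively reject or accept whole screen regions. Illustrative Python, with flat row-major lists standing in for textures:

```python
def minmax_downsample(depth, w, h):
    """One mip step of a min-max depth pyramid: each output texel stores the
    (min, max) of its 2x2 footprint. `depth` is a flat, row-major list of
    (min, max) pairs; the base level would store (z, z) per pixel."""
    ow, oh = w // 2, h // 2
    out = []
    for y in range(oh):
        for x in range(ow):
            quad = [depth[(2 * y + dy) * w + (2 * x + dx)]
                    for dy in (0, 1) for dx in (0, 1)]
            out.append((min(q[0] for q in quad), max(q[1] for q in quad)))
    return out, ow, oh
```

Repeating this until 1x1 gives the full pyramid; the light-clustering pass below is a natural consumer of these per-tile depth bounds.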

The pyramid is then used to create what looks like a volumetric texture for clustered lighting.

A slice of what looks like the light cluster texture, and below one of the lighting buffers partially computed. Counting the pixels in the empty tiles, they seem to be 16x16 - while the clusters look like 32x32?

So - from what I can see it looks like a clustered deferred lighting system. 

The clusters seem to be 32x32 pixels in screen-space (froxels), with 64 z-slices. The lighting though seems to be done at a 16x16 tile granularity, all via compute shader indirect dispatches.
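Assuming the observed numbers (32x32-pixel tiles, 64 z-slices) and the common exponential depth slicing, the froxel lookup for a pixel might look something like this - note that the near/far planes here are made-up values, and the real slicing scheme is a guess:

```python
import math

# Observed-looking parameters: 32x32-pixel tiles, 64 exponential z-slices.
# NEAR/FAR are made-up values, just for the sketch.
TILE, Z_SLICES = 32, 64
NEAR, FAR = 0.1, 1000.0

def cluster_coords(px, py, view_z):
    """(tile_x, tile_y, z_slice) of the froxel containing a pixel at view_z."""
    tx, ty = px // TILE, py // TILE
    # Exponential slicing: each slice covers an equal *ratio* of depth
    s = int(Z_SLICES * math.log(view_z / NEAR) / math.log(FAR / NEAR))
    return tx, ty, max(0, min(Z_SLICES - 1, s))
```

Each froxel then stores (indices to) the lights overlapping it, so the shading pass only loops over a handful of lights per pixel instead of the whole scene.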

I would venture this is because CS are specialized by both the materials and lights present in a tile, and then dispatched accordingly - a common setup in contemporary deferred rendering systems (e.g. see Call of Duty Black Ops 3 and Uncharted 4 presentations on the topic).

The analytic lighting pass outputs two RGBA16 buffers, which seem to be the diffuse and specular contributions. Regarding the options for scene lights, I would not be surprised if all we have are spot/point/sphere lights and line/capsule lights. Most of Cyberpunk's lights are neons, so line-light support is definitely a must.

You'll also notice that a lot of the lighting is unshadowed, and I don't think I ever noticed multiple shadows under a single object/avatar. I'm sure that the engine does not have limitations in that aspect, but all this points at lighting that is heavily "authored" with artists carefully placing shadow-casting lights. I would also not be surprised if the lights have manually assigned bounding volumes to avoid leaks.

Final lighting buffer (for analytic lights) - diffuse and specular contributions.

Lighting part 2: Shadows

But what we just saw does not mean that shadows are unsophisticated in Cyberpunk 2077, quite the contrary, there are definitely a number of tricks that have been employed, most of them not at all easy to reverse!

First of all, before the depth-prepass, there are always a bunch of draws into what looks like a shadowmap. I suspect this is a CSM, but in the capture I have looked at, I have never seen it used, only rendered into. This points to a system that updates shadowmaps over many frames, likely with only static objects?

Is this a shadowmap? Note that there are only a few events in this capture that write to it, none that reads - it's just used as a depth-stencil target, if RenderDoc is correct here...

These multi-frame effects are complicated to capture, so I can't say if there are further caching systems (e.g. see the quadtree compressed shadows of Black Ops 3) at play. 

One thing that looks interesting is that if you travel fast enough through a level (e.g. in a car) you can see that the shadows take some time to "catch up" and they fade in incrementally in a peculiar fashion. It almost appears like there is a depth offset applied from the sun point of view, that over time gets reduced. Interesting!

This is hard to capture in an image, but note how the shadow in time seems to crawl "up" towards the sun.

Sun shadows are pre-resolved into a screen-space buffer prior to the lighting compute pass, I guess to simplify compute shaders and achieve higher occupancy. This buffer is generated in a pass that binds quite a few textures, two of which look CSM-ish. One is clearly a CSM, with in my case five entries in a texture array, where slices 0 to 3 are different cascades, but the last slice appears to be the same cascade as slice 0 but from a slightly different perspective. 

There's surely a lot to reverse-engineer here if one was inclined to do the work!

The slices of the texture on the bottom (in red) are clearly CSM. The partially rendered slices in gray are a mystery. The yellow/green texture is, clearly, resolved screen-space sun shadows, I've never, so far, seen the green channel used in a capture.

All other shadows in the scene are some form of VSMs, computed again incrementally over time. I've seen 512x512 and 256x256 used, and in my captures, I can see five shadowmaps rendered per frame, but I'm guessing this depends on settings. Most of these seem only bound as render targets, so again it might be that it takes multiple frames to finish rendering them. One gets blurred (VSM) into a slice of a texture array - I've seen some with 10 slices and others with 20.
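For reference, the core of a VSM lookup is the Chebyshev upper bound computed from the two blurred moments stored in the map; a sketch of the standard math (not the game's actual code):

```python
def vsm_visibility(moments, receiver_depth, min_variance=1e-4):
    """Chebyshev upper bound on the probability that a receiver at
    `receiver_depth` is lit, from the two (blurred) shadowmap moments
    (E[z], E[z^2])."""
    mean, mean_sq = moments
    if receiver_depth <= mean:
        return 1.0  # In front of the average occluder: fully lit
    variance = max(mean_sq - mean * mean, min_variance)
    d = receiver_depth - mean
    return variance / (variance + d * d)
```

Because the two moments can be filtered (blurred) like regular color data, VSMs give cheap soft edges - at the cost of the well-known light-bleeding artifacts the `min_variance` clamp tries to hide.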

A few of the VSM-ish shadowmaps on the left, and artefacts of the screen-space raymarched contact shadows on the right, e.g. under the left arm, the scissors and other objects in contact with the plane...

Finally, we have what the game settings call "contact shadows" - which are screen-space, short-range raymarched shadows. These seem to be computed by the lighting compute shaders themselves, which would make sense as these know about lights and their directions...

Overall, shadows are both simple and complex. The setup, with CSMs, VSMs, and optionally raymarching is not overly surprising, but I'm sure the devil is in the detail of how all these are generated and faded in. It's rare to see obvious artifacts, so the entire system has to be praised, especially in an open-world game!

Lighting part III: All the rest...

Since booting the game for the first time I had the distinct sense that most lighting is actually not in the form of analytic lights - and indeed looking at the captures this seems to not be unfounded. At the same time, there are no lightmaps, and I doubt there's anything pre-baked at all. This is perhaps one of the most fascinating parts of the rendering.

First pass highlighted is the bent-cone AO for this frame, remaining passes do smoothing and temporal reprojection.

First of all, there is a very good half-res SSAO pass. This is computed right after the uber-depth-summarization pass mentioned before, and it uses the packed RGBA8 normal-roughness instead of the g-buffer one. 

It looks like it's computing bent normals and aperture cones - impossible to tell the exact technique, but it's definitely doing a great job, probably something along the lines of HBAO/GTAO. First, depth, normal/roughness, and motion vectors are all downsampled to half-res. Then a pass computes the current-frame AO, and subsequent ones do bilateral filtering and temporal reprojection. The dithering pattern is also quite regular; if I had to guess, probably Jorge's gradient noise?
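If that guess is right, the pattern is tiny - Jimenez's interleaved gradient noise is just a couple of fractional multiplies per pixel:

```python
import math

def interleaved_gradient_noise(x, y):
    """Jimenez's interleaved gradient noise: a cheap per-pixel dither whose
    value varies in a regular, gradient-like way across neighboring pixels."""
    f = 0.06711056 * x + 0.00583715 * y
    return math.modf(52.9829189 * math.modf(f)[0])[0]
```

It is a popular choice for rotating AO sample kernels per pixel, precisely because its screen-space regularity plays well with small spatial blurs and TAA.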

It's easy to guess that the separate diffuse-specular emitted from the lighting pass is there to make it easier to occlude both more correctly with the cone information.

One of many specular probes that get updated in an array texture, generating blurred mips.

Second, we have to look at indirect lighting. After the light clustering pass there are a bunch of draws that update a texture array of what appear to be spherically (or dual-paraboloid?) unwrapped probes. Again, this is distributed across frames; not all slices of this array are updated per frame. It's not hard to see in captures that some part of the probe array gets updated with new probes, generating mipmaps on the fly, presumably GGX-prefiltered.

A mysterious cubemap. It looks like it's compositing the sky (which I guess dynamically updates with time of day) with some geometry. Is the red channel an extremely thin g-buffer?

The source of the probe data is harder to find, though. In the main capture I'm using there seems to be something that looks like specular cubemap relighting happening; it's not obvious to me whether this is a different probe from the ones in the array, or the source for the array data later on.

Also, it's hard to say whether these probes are hand-placed in the level. If the relighting assumption is true, I'd imagine the locations are fixed, and perhaps artists placed volumes or planes to define the influence area of each probe and avoid leaks.

A slice of the volumetric lighting texture, and some disocclusion artefacts and leaks in a couple of frames.

We have your "standard" volumetric lighting, computed in a 3d texture, with temporal reprojection. The raymarching is clamped using the scene depth, presumably to save performance, but this in turn can lead to leaks and reprojection artifacts at times - though not too evident in most cases.
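The final compositing of such a froxel volume typically marches front-to-back, accumulating in-scattered light attenuated by transmittance; a scalar sketch under the assumption of per-froxel (scattering, extinction) values:

```python
import math

def integrate_scattering(froxels, step):
    """Front-to-back march along one froxel column: accumulate in-scattered
    light attenuated by transmittance. Each froxel holds (scattering,
    extinction); `step` is the slice depth in world units."""
    radiance, transmittance = 0.0, 1.0
    for scattering, extinction in froxels:
        slice_t = math.exp(-extinction * step)
        # Analytic integral of constant in-scattering over the slice
        radiance += transmittance * scattering * (1.0 - slice_t) / max(extinction, 1e-6)
        transmittance *= slice_t
    return radiance, transmittance
```

Clamping the march at the scene depth, as the game seems to do, simply means stopping this loop early - which explains both the performance win and the occasional disocclusion leaks.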

Screen-Space Reflections

Now, things get very interesting again. First, we have an amazing Screen-Space Reflection pass, which again uses the packed normal/roughness buffer and thus supports blurry reflections, and, at least at my rendering settings, is done at full resolution.

It uses previous-frame color data (before UI compositing) for the reflection, using motion vectors to reproject. And there's quite a lot of noise, even if it employs a blue-noise texture for dithering!

Diffuse/Ambient GI, reading a volumetric cube, which is not easy to decode...

Then, an indirect diffuse/ambient GI pass. It binds the g-buffer and a bunch of 64x64x64 volume textures that are hard to decode. From the inputs and outputs one can guess the volume is centered around the camera and contains indices to some sort of computed irradiance, maybe spherical harmonics or such.

The lighting is very soft/low-frequency, and indirect shadows are not really visible in this pass. This might even be dynamic GI!

It certainly is volumetric, which has the advantage of being "uniform" across all objects, moving or not, and this coherence shows in the final game.

Final lighting composite, diffuse plus specular, and specular-only.

And finally, everything gets composited together: specular probes, SSR, SSAO, diffuse GI, analytic lighting. This pass emits again two buffers, one which seems to be final lighting, and a second with what appears to be only the specular parts.

And here is where we can see what I said at the beginning. Most lighting is not from analytic lights! We don't see the usual tricks of the trade, with a lot of "fill" lights added by artists (albeit the light design is definitely very careful), instead indirect lighting is what makes most of the scene. This indirect lighting is not as "precise" as engines that rely more heavily on GI bakes and complicated encodings, but it is very uniform and regains high-frequency effects via the two very high-quality screen-space passes, the AO and reflection ones.

The screen-space passes are quite noisy, which in turn makes temporal reprojection really fundamental, and this is another extremely interesting aspect of this engine. Traditional wisdom says that reprojection does not work in games that have lots of transparent surfaces. The sci-fi worlds of Cyberpunk definitely qualify for this, but the engineers here did not get the news and made things work anyway!

And yes, sometimes it's possible to see reprojection artifacts, and the entire shading can have a bit of "swimming" in motion, but in general it's solid and coherent - qualities that even many engines using lightmaps cannot claim to have. Light leaks are not common, and silhouettes are usually well shaded and properly occluded.

All the rest

There are lots of other effects in the engine we won't cover - for brevity, and to keep my sanity. Hair is very interesting, appearing to render multiple depth slices and inject itself partially into the g-buffer, with some pre-lighting and a weird normal (fake anisotropic?) effect. Translucency/skin shading is surely another important effect I won't dissect.

Looks like charts caching lighting...

Before the frame is over though, we have to mention transparencies - as more magic is going on here for sure. First, there is a pass that seems to compute a light chart, I think for all transparencies, not just particles.

Glass can blur whatever is behind it, and this is done with a specialized pass: first, transparent geometry is rendered into a buffer that accumulates the blur amount, then a series of compute shaders creates three mips of the screen, and finally everything is composited back into the scene.
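One plausible way such a compositing can work (purely my speculation, with scalars standing in for pixel colors) is to lerp between the sharp screen and the pre-blurred mips by the accumulated blur amount:

```python
def glass_blur(sharp, mips, blur_amount):
    """Lerp between the sharp screen and progressively blurrier mips by the
    accumulated blur amount in [0,1]. Scalars stand in for pixel colors."""
    levels = [sharp] + mips
    t = max(0.0, min(1.0, blur_amount)) * (len(levels) - 1)
    i = min(int(t), len(levels) - 2)   # lower mip level to sample
    frac = t - i                       # blend factor towards the next mip
    return levels[i] * (1.0 - frac) + levels[i + 1] * frac
```

This is essentially trilinear filtering across blur levels - cheap, and enough to fake varying glass roughness per pixel.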

After the "glass blur", transparencies are rendered again, together with particles, using the lighting information computed in the chart. At least at my rendering settings, everything here is done at full resolution.

Scene after glass blur (in the inset) and with the actual glass rendered on top (big image)

Finally, the all-mighty temporal reprojection. I would really like to see the game without this; the difference before and after the temporal reprojection is quite amazing. There is some sort of dilated-mask magic going on, but to be honest I can't see anything too bizarre - it's astonishing how well it works.

Perhaps there are some very complicated secret recipes lurking somewhere in the shaders or beyond my ability to understand the capture.
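A minimal sketch of the usual temporal resolve with neighborhood clamping (a guess at the general recipe, not CDPr's shader; scalars stand in for colors):

```python
def temporal_resolve(current, history, neighborhood, alpha=0.1):
    """Clamp the reprojected history to the min/max of the current frame's
    3x3 neighborhood (rejecting stale/disoccluded samples), then blend a
    small fraction of the current frame in."""
    lo, hi = min(neighborhood), max(neighborhood)
    clamped = max(lo, min(hi, history))
    return clamped * (1.0 - alpha) + current * alpha
```

The clamp is what makes reprojection survive all the noisy screen-space passes: wildly wrong history (disocclusions, moving transparencies) gets pulled back into the range of plausible current-frame colors before it is blended.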

On the left, current and previous frame, on the right, final image after temporal reprojection.

This is from a different frame, a mask that is used for the TAA pass later on...

I wrote "finally" because I won't look further, i.e. into the details of the post-effect stack - things here are not too surprising. Bloom is a big part of it, of course, almost adding another layer of indirect lighting, and it's top-notch as expected: stable and wide.

Depth of field, of course, tone-mapping and auto-exposure... There are of course all the image-degradation fixings you'd expect and probably want to disable: film grain, lens flares, motion blur, chromatic aberration... Even the UI compositing is non-trivial, all done in compute, but who has the time... Now that I got all this off my chest, I can finally try to go and enjoy the game! Bye!