
01 December, 2013

On Mantle

So it seems nobody so far has made a fool of himself being opinionated on AMD's recent Mantle API announcement. Allow me to fill that spot!


- Mantle could be a -great- idea! 

I think AMD nailed it this time... One of the biggest barriers that all kinds of innovation face is adoption. For gaming hardware that's often due to a feedback loop: you need developers on board for the technology to be utilized, but they won't invest money if the user base is not there, and the user base won't be there if developers don't commit to making products with a given technology.
That's the reason why, for example, you don't see most of the big players committing to exclusives early on when a new console generation comes out, and why I still believe Oculus will have a tough time regardless of how amazing it is...

One way out of this is to be somewhat scalable, offering a technology that works without your proprietary stuff but works much better with it. Take PhysX: the hardware acceleration board for physics in games was initially an extremely bad idea, they sold nothing and they deserved to sell less (I feel for the people who wasted their money on it). Nowadays though it works, both because NVidia bought them and made the library work on their much more popular GPUs, lowering the barrier to entry, and because they offer a CPU fallback, so it's not exclusive to NVidia's hardware. Great job, and I'd like them to follow suit with GameWorks as well, by the way...

Now, it seems that AMD is not following suit here. I guess it would be possible to make an emulation layer that falls back from Mantle to DX11 (or at least parts of it), but they aren't.
Consider that the next-gen consoles are already on AMD's hardware, and that console game development is an incomparably bigger market than PC gaming (for most gamedevs), so most companies are more than willing to invest in all kinds of proprietary tech if it runs on consoles.

So what AMD really should do here (and I think they will) is find a way to leverage the position they have on consoles (which some analysts say probably doesn't bring in huge quantities of cash) to help their PC line as well. By keeping Mantle at least close to PS4's API (as I expect it to be... or anyhow by providing a Mantle layer for PS4...) they are "recycling" an investment developers already have to make, lowering the barrier to entry. Also, developers will already have a DX11-ish renderer ready for XB1 (and I expect Microsoft to similarly keep the next versions of DX in sync with the XBone to leverage a similar effect), thus allowing the "PC build" of a game to just be a merge of two technologies studios already have to employ: an XB1-ish renderer and a PS4-ish renderer.

- Why does DirectX 11 suck (not!)?

What are DirectX's big issues?


Well, mostly they all relate to the fact that it doesn't break the tradition of requiring delayed work at draw time, forcing the driver to do extra work to keep track of things and then pull them together to submit to the GPU...

The hardware state is not (and in an abstraction API never will be) one-to-one with the DirectX bits of state, so the driver has little choice but to store what DirectX tells it to set in the hardware, wait until a draw appears, and then look at -all- the state it stored and translate it into the appropriate bits and pieces that go into the hardware.
Another way the driver is forced to do extra work is that state lifetime is all "wrong" too: certain objects in DirectX can simply be updated with new information, but that's not possible for the driver, as it has to keep the old ones around until the GPU has finished processing them, so it has to make duplicates, patch things, and all kinds of horrors.
The final nail in the coffin is how object updates are seen across threads (deferred contexts). Unfortunately, the DX11 model allows certain objects (e.g. dynamic buffers) to be seen across threads, and the final order of updates depends on the order in which the deferred contexts are invoked from the immediate device, forcing the driver to leave holes in the contexts (or not to generate a real GPU command buffer at all!) for later patching. This has to happen for a few other objects too, really.
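To make the above a bit more concrete, here is a minimal sketch of the "shadow the state, resolve at draw" pattern I'm describing. This is hypothetical driver-internals code, every name invented for illustration; real drivers are vastly more complex:

```cpp
// Hypothetical sketch of the deferred work a D3D11-style driver is forced
// into: "set" calls only write a CPU-side shadow copy, and the expensive
// API-to-hardware translation happens at draw time, once -all- the state
// is known. All names are invented for illustration.
#include <cstdint>

struct ShadowState {
    uint32_t blendDesc = 0;         // stand-ins for the real, much larger,
    uint32_t rasterDesc = 0;        // state blocks
    uint32_t depthStencilDesc = 0;
    bool     dirty = false;         // anything changed since the last draw?
};

struct HwPacket { uint32_t words[8]; uint32_t count = 0; };

// Only here can the driver compute the hardware register values, because
// individual API states don't map one-to-one to hardware bits (e.g. the
// hardware might pack blend and raster bits into a single register).
static HwPacket ResolveToHardware(const ShadowState& s) {
    HwPacket p;
    p.words[p.count++] = (s.blendDesc << 16) | (s.rasterDesc & 0xFFFFu);
    p.words[p.count++] = s.depthStencilDesc;
    return p;
}

struct Driver {
    ShadowState shadow;

    // API "set" calls are cheap: record and mark dirty...
    void SetBlendState(uint32_t d)  { shadow.blendDesc = d;  shadow.dirty = true; }
    void SetRasterState(uint32_t d) { shadow.rasterDesc = d; shadow.dirty = true; }

    // ...and all the real work lands on every draw.
    void Draw() {
        if (shadow.dirty) {
            HwPacket p = ResolveToHardware(shadow);
            SubmitToRing(p);            // write into the GPU command ring
            shadow.dirty = false;
        }
        SubmitDrawCommand();
    }

    void SubmitToRing(const HwPacket&) { /* stubbed */ }
    void SubmitDrawCommand()           { /* stubbed */ }
};

int main() {
    Driver d;
    d.SetBlendState(0x12);
    d.SetRasterState(0x42);
    d.Draw();   // the translation work happens here, not in the Set calls
}
```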

Does this make DirectX a bad API? I'd argue it makes it less good than it could have been, surely. But it's still fairly ok, and I suspect that most of the issues could be sidestepped...

Yes, ideally, state could have been grouped into bigger entities (ideally one big block for all HW state except memory buffers, i.e. textures, vertices, shaders and constants): if a single "set" command sets a lot of state, chances are the driver will more often have all the information it needs to do some work. Buffer updates could have defaulted to "discard", and do we really need refcounting in this day and age? Persistent state (having a state machine) is also quite bad (OpenGL doesn't fare better in that regard)... Many other little adjustments could have been made; remember though that there is also a balance with ease of programming (less of an issue for gamedevs maybe, but DirectX is not only for games), compatibility with legacy hardware and so on...
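As a sketch of what those "bigger entities" could look like, here is a purely hypothetical API design (not DirectX, not Mantle) where all pipeline state is baked into one immutable object at creation time, moving the translation work off the per-draw path:

```cpp
// Purely hypothetical "one big state block" API, for illustration only:
// validation and API-to-hardware translation happen once, at creation,
// so draw time just copies pre-baked words into the command buffer.
#include <cstdint>

struct PipelineDesc {            // everything except memory resources
    uint32_t blendDesc;
    uint32_t rasterDesc;
    uint32_t depthStencilDesc;
    uint32_t vertexShaderId;
    uint32_t pixelShaderId;
};

struct PipelineState {           // immutable once created
    uint32_t baked[8];           // hardware-ready, precomputed words
    uint32_t count = 0;
};

PipelineState CreatePipelineState(const PipelineDesc& d) {
    PipelineState ps;            // done at load time, off the hot path
    ps.baked[ps.count++] = (d.blendDesc << 16) | (d.rasterDesc & 0xFFFFu);
    ps.baked[ps.count++] = d.depthStencilDesc;
    ps.baked[ps.count++] = d.vertexShaderId;
    ps.baked[ps.count++] = d.pixelShaderId;
    return ps;
}

void Draw(const PipelineState& ps) {
    // Nothing left to figure out at draw time:
    // SubmitToRing(ps.baked, ps.count);
    (void)ps;
}
```

With a design like this the driver never has to guess which combination of states it will see next; the application pays the translation cost once, up front.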

- A frigging hundred thousand drawcalls. And do I care?

Why is Mantle cool for developers? Assuming it's not a big investment, so it's approachable, why should we care? Well, one of the biggest bullet points AMD has been putting out there so far is that the API, being faster, slimmer and more low-level, will allow for many more drawcalls, more unique objects on screen.

I think this is mostly marketing (I might be proved wrong...), and it's unfortunate that AMD hasn't yet published any real numbers of real games doing really better, for real. I think that's because for the most part, they won't.
Now, it's true that there are situations where PC games are CPU- and driver-bound; if you have an extremely powerful GPU that might quite often be the case, and there are certain games that are genuinely optimized to generate a LOT of draws, e.g. forward-rendering engines that rely on splitting geometry at light and shader "intersections", or engines that rely on multipass rendering... It's nice to have a speedup in these circumstances, but let's be honest: will the industry really jump on the idea, or rather let (AMD) PCs be slightly-less-than-optimal and circumvent that with their extra power (as it always has been)?

To answer that, I think you should really look at DirectX 11 and the amount of effort the industry put into it. DirectX 11 came out in 2009, over four years ago now! How many games have shipped on it? How much better were they compared to DirectX 9? Are there games that are still faster on 9 than on 11? What about tools? How much did the industry care? How much did we use compute shaders? Geometry shaders? Tessellation?
The truth is you will only now start seeing better DirectX 11 games, because now consoles are on DX11-ish hardware. Before, nobody really cared much, and once your assets are made to work on consoles there is only so much you can do to "dress them up" with extra features... Everybody remembers Crysis 2's tessellation issues, but almost all games did the same thing: just minor cosmetic dressing in their DirectX 11 versions, when they cared to put one out. If you have actually worked with DirectX 11 you will have found that a big issue is still driver bugs and various things that don't work as they should...

Really, the API is not perfect. But mostly it's that nobody, neither the gamedevs nor the GPU vendors, really invested a lot. I would argue that the performance issues, for example, are as much a fault of the API as they are a fault of the drivers. It could go faster, even right now; in the end, even in the golden era of PC graphics we always worked with the GPU vendors to "carve" certain hot paths where the drivers were fast, a sort of contract on a language, basically an API inside the API. And I don't believe (I might be wrong) that DX11 rules out a proper multi-threaded driver (or at least one more multi-threaded than today's) given some restrictions on how we use the API (a "fast path" that falls back if you do the wrong calls). I think it's that building such things is quite hard and won't make anybody rich, and thus nobody really invested a lot of money in them...

So where does all this leave us? Well, it turns out that not only are we "good" at writing engines that work within the thousands-of-draws-per-frame limit, but I suspect we will still need to be, because not all platforms will be able to do hundreds of thousands. And if not all platforms will, it means we still need to think about art assets and graphics techniques in a way that works in thousands of draws. Once you have a world that works like that you're set; you won't really be able to use that ability to draw hundreds of thousands.

Nowadays in any game you have, even if you bump the draw distance all the way up, you still won't generate that many draws. It would probably be cool to think about a new generation of engines that structurally work on a hundreds-of-thousands-of-draws assumption; I think it could very well change the way we think about culling and instancing, and suggest entirely new approaches.
But I don't think it will happen. I suspect it will remain a marketing thing, and games will do better just because they can be a bit faster on certain hardware configurations at doing the draws they already do.
At best, we'll have more particles or similar dressing-style effects, but I can't really imagine an application right now, because we're already good at doing these things with very few draws (as the instancing sketch below shows). Anyhow, it will be hard to fully employ any ability that requires thinking about assets in different ways, if said ability doesn't work everywhere...
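For context on why we cope well within a few thousand draws, instancing is the classic tool: one API call can draw thousands of copies of a mesh. A minimal D3D11-flavored fragment, assuming the context, buffers and shaders have been created and bound elsewhere:

```cpp
// Minimal D3D11 fragment: 10,000 instances of the same mesh in a single
// draw call. Assumes the device context, vertex/index buffers, shaders and
// a per-instance transform buffer are already set up; no error handling.
#include <d3d11.h>

void DrawForest(ID3D11DeviceContext* ctx, UINT indexCountPerInstance) {
    const UINT kInstances = 10000;  // e.g. one transform per tree, read by
                                    // the vertex shader via SV_InstanceID
    ctx->DrawIndexedInstanced(indexCountPerInstance, // indices per instance
                              kInstances,            // instance count
                              0,                     // start index location
                              0,                     // base vertex location
                              0);                    // start instance location
}
```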

- Mantle and NVidia? Mantle and EA? Mantle and Steam Machines?

Even if AMD says the opposite, I don't think Mantle will be a cross-platform API. Actually, I don't even think it will be a cross-generation API. It could be, because really it's not hard to imagine how a more modern API should work (I bet DX12 will solve most of DX11's issues, and OpenGL via certain extensions is already getting there: bindless, multidrawindirect...), but I don't think it even should...
As I wrote at the beginning, it would be best if Mantle were as close to PS4 as possible, lowering the investment needed from gamedevs. Even if it works only on GCN hardware, GCN will be the architecture they use more or less for this entire console generation, and that would be plenty. No gaming-oriented graphics API (and probably no graphics API in general) really has to think on longer timeframes; technology changes anyway.

If Mantle is close to PS4 then it probably won't map perfectly to NVidia's hardware, but I don't think it's hard to believe that the API could run, maybe even well, on NVidia's GPUs; there are ways to abstract things just well enough. But I don't think NVidia will join the program. Developers would surely be happy if that were the case, but, politics... Also, if AMD really wanted to make an "open" API, well, they surely shouldn't have been designing it behind closed doors with EA|Dice and nobody else...

Speaking of which... What's in it for EA? When I first heard about Mantle I was puzzled, and in a way I still am. For the reasons I sketched above I don't think that EA is going to directly make more money from it. I don't think it will be as revolutionary as it might sound to begin with; I see it as an optimization, and I will be surprised to see significant graphical features locked to Mantle. And even if such things existed, we're talking about an influence on less than half of the PC market, which per se is significant only for very few EA games to begin with (only Battlefield?).

I don't really see them selling more copies because of Mantle, to the degree that I even suspected AMD might have simply paid to get Dice on board, or done most of the porting work themselves. On the other hand, I wouldn't discount the sheer passion of Dice's team for graphics and technology. And I can't really get a feeling for how much Frostbite (and Ignite) as brands per se help market EA's games, to the degree where being on top of whatever graphical innovation there is directly strengthens the brand and skews people into thinking that everything Frostbite is a must-buy...

About Steam Machines, I really don't know. I don't know how Steam Machines will succeed to begin with, even less Mantle on them. They are PCs. They cost in the ballpark of what a PC with similar hardware would (from what we know so far), just without Windows, where the vast majority of games still are. And I don't think Valve can really publish their games on SteamOS only (HL3...) as certain people say, for the same reasons no huge title lands on new consoles at launch. Too small a market.
Even if SteamOS can be installed alongside Windows on the PC you might already have, it would simply hurt the sales of the game, piss off the Windows Steam users who are the "core" of Valve's market, and be a crazy move in every way. Yes, I guess Mantle could run on Linux, perhaps even easily. But it won't help. You would still need an OpenGL path for NVidia's hardware anyway, so it doesn't really lower the cost of entry. Unless it gets so crazy popular that some studios will be willing to do Mantle exclusives... Hard to imagine.

- Tl;Dr - Conclusions and expectations

Mantle could be a great move for AMD. I hope it will be very easy to port from PS4; if so, we will see titles using it as a performance improvement. That's the minimum expectation: AMD hardware gets a framerate boost on some or many titles, depending on how easy the port is. Both low-end configurations and the very high end (where a single-threaded CPU driver can easily stall the GPU) might benefit.

I also expect some savings on the GPU side, not only the CPU, especially if they expose better ways to control the scheduling of draws and compute on the GPU, aiding efficiency, but also from other things like being able to avoid certain operations altogether. Actually, I would say there is a lot in compute that is not exposed today by DirectX; being able to schedule that better might be a bigger win than the 100k-draws-per-frame figure, which I think is mostly marketing.
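To illustrate the kind of control I mean, here is an entirely speculative sketch (Mantle's actual API is not public as I write this, and none of these types exist in any shipping API) of explicit command buffers submitted to separate graphics and compute queues:

```cpp
// Entirely hypothetical API sketch: explicit command buffers and hardware
// queues, i.e. the scheduling control DX11 does not expose. All names
// are invented; this is not Mantle's real API.
struct CmdBuffer { /* opaque recorded GPU commands */ };

struct Queue {
    void submit(const CmdBuffer&) { /* kick to this hardware queue */ }
};

struct Device {
    Queue graphics;
    Queue compute;   // separate queue: compute could overlap raster work
    CmdBuffer record(void (*body)(CmdBuffer&)) {
        CmdBuffer cb;
        body(cb);    // record on any thread, no hidden driver locks
        return cb;
    }
};

void Frame(Device& dev) {
    CmdBuffer shadows = dev.record([](CmdBuffer&) { /* draw shadow maps */ });
    CmdBuffer post    = dev.record([](CmdBuffer&) { /* dispatch compute */ });

    // The application, not the driver, decides what overlaps with what:
    dev.graphics.submit(shadows);
    dev.compute.submit(post);
}
```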

If they expose more of certain GPU details which are not accessible under DirectX/OpenGL, there might even be certain effects that are available only on Mantle. We might see better streaming and texture usage, enabling nicer textures, so we'd get better looks from there instead of just better framerates. Stuff like stereo 3D rendering (rendering the scene twice) might benefit more from Mantle, as could all kinds of algorithms that need to submit the scene to different buffers (think for example more shadowmapped lights, etc.; a DX11-flavored sketch of that multi-pass case follows this paragraph).
Overall though, my "best" scenario is that it might enable "minor" cosmetic additions that are hard or too expensive to do without it, the kind of things you saw in the DirectX 10/11 versions of DirectX 9 games. I doubt games will look significantly different, I doubt assets will be made exclusively for it, and I doubt NVidia will jump on board.
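For reference, the closest DX11 gets to that multi-pass case today is recording each pass on a deferred context and replaying the command lists on the immediate context. A hedged sketch, assuming the device and immediate context exist, and with RenderShadowPass as a hypothetical helper:

```cpp
// D3D11 sketch: record one shadow pass per light on deferred contexts
// (potentially from worker threads), then replay in order. Error handling
// omitted; RenderShadowPass is a hypothetical helper, not a D3D11 call.
#include <d3d11.h>
#include <vector>

void RenderShadowPass(ID3D11DeviceContext* /*ctx*/, int /*lightIndex*/) {
    // ...bind the light's shadow map and draw the scene (elided)...
}

void RenderAllShadowMaps(ID3D11Device* device,
                         ID3D11DeviceContext* immediate, int numLights) {
    std::vector<ID3D11CommandList*> lists(numLights, nullptr);
    for (int i = 0; i < numLights; ++i) {
        ID3D11DeviceContext* deferred = nullptr;
        device->CreateDeferredContext(0, &deferred);
        RenderShadowPass(deferred, i);
        deferred->FinishCommandList(FALSE, &lists[i]); // bake the commands
        deferred->Release();
    }
    for (ID3D11CommandList* cl : lists) {              // single submission
        immediate->ExecuteCommandList(cl, FALSE);      // point, in order
        cl->Release();
    }
}
```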

Mantle is also interesting because it will open the "lower-level" layer to researchers and groups who don't usually get to work on consoles; some good stuff can come out of that, stuff that I can't foresee today and that might change the landscape...

But most importantly, it will -surely- tell NVidia and Microsoft to "get their shit together" (having Frostbite on Mantle is already enough to call it a success and make these companies worry, imho). I think Mantle could be cross-platform; I don't think it will be (NVidia won't adopt it)... which will lead to both a better DirectX (which Microsoft will likely leverage to make XB1 better too) and better drivers... If they feel threatened and they care, put money on that, they might even succeed at making Mantle "obsolete" (less attractive) faster than it can spread... We'll see...

9 comments:

Anonymous said...

Hi. codedivine from twitter here. Some other people think that Mantle won't matter at all on top-end cards such as the 7870+, but one configuration where Mantle *may* provide a better experience is on weaker hardware such as AMD APUs. Do you think so too?

Frogblast said...

From AMD's APU presentations, the takeaway I got is that the degree of alignment with the PS4 API varies depending on the part of the API:

(I'm reading between the lines, as I do not have access to libGNM).

- resource allocation: probably very closely related to the PS4 API, where the user performs their own suballocation out of larger resources. This allows framebuffer aliasing, but also keeps the number of WDDM resource binds to a minimum (at the likely cost of much coarser VRAM paging?). I suspect they also both expose GCN's resource descriptors.

- explicit command buffer management: probably similar to PS4, allowing parallel encode and explicit submission of command buffers to a particular hardware queue.

- 3D state setting: probably very dissimilar to PS4. I suspect that libGNM expresses something very close to GCN's command buffer format (directly exposing the orthogonality/non-orthogonality in GCN), whereas Mantle has a single coarse 'state object' for most 3D pipeline state, to avoid requiring that the driver handle non-orthogonal state by deferring work to draw time. This is where AMD is reserving the most ability to make HW changes in the future, including moving the balance between fixed-function units and shaders.

DEADC0DE said...

Codedivine: I'm not sure. In general I agree that it might be a move towards low-power configurations, but even there I'm not entirely sure, because we tend to put relatively high-resolution screens everywhere, so the GPU might become a bottleneck as soon as the CPU is.
In theory it's actually better for very high-end configurations, where it's easy for the GPU to be much, much more powerful than a single CPU thread (remember that drivers still run on one thread nowadays), SLI configurations even more so. But on the other hand, I guess that to CPU-bottleneck these you still have to push your engine a lot, which might not be "easy" once the assets are made a given way... Of course you can ignore LODs (material ones...), push view distances and manage, but I don't know at which point you hit diminishing returns, really...

Honestly, I'm quite far from PC development. I've done some games that also shipped on PC and even supervised their performance, but the overall amount of time allocated to the PC version has to be small enough that it doesn't make me in any way an expert. Mostly I cared about making sure it ran, first; then on some very specific minimum-requirement configurations we would see what to do to make it faster...

DEADC0DE said...

Frogblast: I totally agree. I expect it to be higher-level in a lot of ways (state being one, as you mention) but still easy to drop in, as most of the "hard" stuff (namely, manual management of memory and lifetimes) would still work with the logic we already wrote and tested on PS4.

If they really want to push it they could even have a PS4 layer that sits on top of libGNM. I honestly don't know/didn't ask if that is the case, and I can see reasons for them not to (namely, that most devs would still prefer the native PS4 stuff on PS4).

eXile said...

Interesting thoughts. However, in my opinion the whole Mantle discussion has a dire lack of facts -- which is not really surprising, because the "official" information about Mantle consists of two AMD presentations by Johan Andersson.

So here are some more "facts" ... well ... actually they are twitter messages and blog posts. Apparently the best kind of facts you can get in the clandestine world of GPUs.

1. PC/Windows: OpenGL should get all Mantle features exposed as OpenGL extension, and apparently with the same performance. https://twitter.com/grahamsellers/status/383002166329237504 (Please read all tweets in that thread.)
2. PC/Linux: In the second AMD presentation, Johan said only Windows 7 and 8/8.1 are currently supported (probably because of the WDDM), but Linux might possibly follow in the future.
3. XBox One: No Mantle. Via Microsoft decree. http://blogs.windows.com/windows/b/appbuilder/archive/2013/10/14/raising-the-bar-with-direct3d.aspx
4. PlayStation 4: No Mantle. https://twitter.com/AMDRadeon/statuses/389889549016391680

So either I got something wrong (or, more precisely, one of the sources above is wrong), or it seems that Mantle might be substitutable by OpenGL, feature- and performance-wise. I personally doubt the latter, but that is only my opinion.

DEADC0DE said...

1) I don't think that makes a big difference but it's nice to have

2) I don't think porting to Linux would be hard. On the other hand, I don't think Linux matters.

3) Well, that's for sure

4) If gamedevs really want to use Mantle and find it too different from PS4, they might push for it to be on PS4, and if gamedevs push, things happen. But I suspect that if it's too different it will simply fail and most games won't use it.

Anonymous said...

I've lost you on the "hardware state" bits. Modern GPUs don't have any "state" whatsoever, unless by "state" you are referring to the values stored in the register file.
This is why DX9/10/11/12... are bad: they cling to "state" that just does not exist anymore.
And then there's the issue of draw calls; it's the same problem: modern games do not use many draw calls because it is prohibitively expensive in D3D, not because they "don't need it".
Proper geometry LODs, animations, or any dynamic objects on screen: each of them produces draw calls.

DEADC0DE said...

I agree that if we could universally use more draws we would find ways to employ them. But if it's only one platform, we won't.
