Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Size of fully vendored gpu-allocator with "d3d12" is quite large #181

Open
jimblandy opened this issue Oct 3, 2023 · 6 comments
Open

Size of fully vendored gpu-allocator with "d3d12" is quite large #181

jimblandy opened this issue Oct 3, 2023 · 6 comments

Comments

@jimblandy
Copy link

jimblandy commented Oct 3, 2023

Firefox would very much like to use gpu-allocator in its Direct3D12 backend for WebGPU on Windows, but we cannot:

  • Firefox has a policy of vendoring all our dependency crates into the source tree, to meet security requirements.

  • Firefox currently has a policy of not accepting the windows as a dependency, despite it being the officially supported bindings crate for Windows APIs, because of its size when the unpacked crate sources are vendored. As large as the Firefox sources already are, vendoring windows into the tree would be a notable increase. (However, Firefox does permit depending on the windows-sys crate.)

As a result, Firefox's WebGPU implementation uses wgpu's Direct3D12 backend without enabling the windows-rs feature, meaning that we use an older, slower path that does not suballocate buffers. We'd like to put this behind us and begin using gpu-allocator to do things right.

Right now, running cargo vendor in a crate that simply depends on gpu-allocator with the "d3d12" feature enabled produces a 238MiB vendor subdirectory.

Is there any way that gpu-allocator could reduce the size of its vendored footprint?

  • Could we use windows-bindgen at build time to generate only the bindings we need?

  • Are the old winapi bindings close enough to those from windows-rs that gpu-allocator could have some optional configuration that went back to depending on winapi, using the macros to generate whatever bindings are missing?

Those are just two strategies that occur to me, but any tactic to avoid bringing in windows-rs will serve.

Mozilla is willing to do the work. We're just looking for a plan that would be acceptable for inclusion in this code base; forking would be a waste of time and attention in all the generally-understood ways.

Thanks very much for considering our situation!

Background and prior discussion

#107 goes into the reasons gpu-allocator migrated to windows-rs - a move that makes perfect sense.

gfx-rs/wgpu#3207 is the wgpu issue covering this question.

cc: @ErichDonGubler

@jimblandy jimblandy changed the title Size of fully vendored gpu-allocator on Windows is quite large Size of fully vendored gpu-allocator with "d3d12" is quite large Oct 3, 2023
@Elabajaba
Copy link
Contributor

Elabajaba commented Oct 3, 2023

Afaik bindgen isn't really usable for public apis because you end up with unique types that aren't compatible with the ones from eg. windows-rs or winapi.

edit: The only solution for this I've been able to come up with is a gfx-windows crate (or just reworking the d3d12 crate) of windows-bindgen generated bindings to the needed dx12 subset of windows-rs. It's a crappy solution since it wouldn't interoperate with people using windows-rs, and requires either forking or getting upstream crates to maintain a gfx-windows path (and releasing when wgpu needs a release so it can bump the bindings version).

@MarijnS95
Copy link
Member

We previously had a winapi interface in the public API (maybe we still have it) and could do the same for windows by generalizing the default interfaces to use raw Rust types (e.g. void pointers) and rely on the COM bindings via windows-bindgen internally. Then, an optional windows-rs feature could turn on the convenient/type-correct interfaces, even if this requires casting from that type to the internally generated (but identical) type on every API entry point.

@MarijnS95
Copy link
Member

We did try another approach on the microsoft-directx branch where a more minimal d3d12 crate generated from the Agility SDK was used, but that effort was ultimately abandoned since the Agility SDK bindings are now immediately consumed by win32metadata and windows-rs.

@Jasper-Bekkers
Copy link
Member

Jasper-Bekkers commented Oct 3, 2023

Thanks for filing the issue; I understand the concern and we're also constantly fighting windows-rs because of the compile times and it's size. For us, however, winapi is a dead-end street. I'm not willing to invest in it anymore and I'd rather work with Microsoft to resolve the issues around windows-rs then to keep supporting winapi. (We used to do this, but we had to keep painfully add new Dx12 apis to winapi every time we wanted to use them so we abandoned it).

Therefor, I'm very open to the suggestion of replacing windows-rs with windows-bindgen, I think it's the path forward for most of these kinds of "low level" crates. I think before we go down that path, I'd like to highlight that because Traverse Research is doing mostly cutting-edge research into state of the art rendering we'd like to always keep the option to use whatever new Dx12 api's become available either through version updates or agility sdk updates. Having said that, I'm ok with having to "opt in" to those features for example (through build-time features even), but we should at least have an easy way to update the bindings if we do go down this path.

@jimblandy and @Elabajaba If we go down the path of having Firefox ship with gpu-allocator (something that sounds appealing to me), I think it would be good to set up some more process on our end as well and potentially make one of you two maintainers on this project. We're currently only using gpu-allocator with our own renderer (so not with wgpu) for example and we could do with a bit more testing as part of our release process. I'm not entirely sure what the expectations are on Firefox' side but I suspect they to be a bit more stringent then just slinging a new release every once in a while.

@jimblandy
Copy link
Author

jimblandy commented Oct 3, 2023

For us, however, winapi is a dead-end street. I'm not willing to invest in it anymore

Understood. This is not unexpected.

I'm very open to the suggestion of replacing windows-rs with windows-bindgen, I think it's the path forward for most of these kinds of "low level" crates.

Okay. Then this sounds like our best path.

Yes, Firefox would need gpu-allocator to work on plain Windows installations, without extensions like Direct3D's Agility SDK (link for people like me who had to look that up). Given that Cargo features are supposed to be additive, having the shiny new things be "opt in" would probably be necessary. Maybe a single "agility" feature, which you could extend however you like, would be all that's necessary?

I don't think Firefox would need too much process from Traverse; most of the burden will fall on us. We already depend on lots of third-party crates much less organized than this one. We have an internal vetting process we must follow, driven by cargo vet, which requires a full audit for the first import and then incremental audits as we update. Releases would be helpful, but don't need to have everything on crates.io; we're already bringing in specific git commits of wgpu anyway.

At the moment Mozilla's attention is on just getting our implementation stood up the way it needs to be (good CTS results in CI, etc). I think it's going to be a few months before we'll be seeing issues like efficient buffer allocation showing up at the top of our lists. At that point, I'd probably ask my co-worker @ErichDonGubler to take point on this work.

@Elabajaba is an independent wgpu contributor, not associated with Mozilla, so if they want to work on this, that would be lovely - but that's up to them.

@MarijnS95
Copy link
Member

Let's be more clear here. The Agility SDK (with shipping out-of-Windows DLLs and exporting magic symbols from an exe) isn't something that has to be used. Rather, the canonical latest version of the Direct3D12 API is provided by Microsoft out-of-tree at https://github.com/microsoft/DirectX-Headers. It contains types and COM objects for newer interfaces on objects, but those are optional and cannot even be used if QueryInterface() doesn't give the handle containing that interface to you in the first place. You're in the clear here for plain Windows installations as long as you manually audit and limit what interface versions are used (and hence what functionality is available to you).

A great example of this is #168 where we use enums to accept a newer D3D12Device object to use the new (from enhanced barriers) BARRIER_LAYOUT type insted of the older RESOURCE_STATES, but it's optional.

Hence I don't think there's a need for an agility feature unless we really use cutting-edge features inside gpu-allocator (enhanced barriers should be well stabilized, but support is optional at runtime anyway).


Rather, I think there could/should be a feature that enables some convenience APIs that allow users to pass windows types into this crate, rather than untyped void pointers, or reexporting our locally-generated D3D12* types. Their layout, behaviour and implementation is the same (identical) but Rust will consider them different because they're defined in different places.


Back on topic: that microsoft-directx crate/branch serves as an older example. A more recent example from us employing windows-bindgen is Traverse-Research/amd-ext-d3d-rs@fa55fdb but it generates from custom metadata that utilizes windows-rs types rather than generating a minimal windows-rs crate with just D3D12 code. It should only be an argument or two in bindings.txt to get there though. This crate will then have to depend on windows-core for the most basic of Winodws COM functionality - is Mozilla okay with that? It's a requirement for windows-bindgen but I've only seen an ack for windows-sys thus far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants