New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hal/vk] Rework Submission and Surface Synchronization #5681
base: trunk
Are you sure you want to change the base?
[hal/vk] Rework Submission and Surface Synchronization #5681
Conversation
Works fine for me running the cube example with VK validation layers enabled
|
This fixes an annoying access violation termination (that I thought was a driver bug) on program exit after a present. |
Unfortunately, it does not fix #5644 for me. Same constant validation error on presentation:
The actual validation error only happens with a Flamegraph of For comparison, same drivers with Even if I enable |
@hecrj could you try again with these latest commits, someone else who can reproduce the validation error says that this fixed it. |
ea6e6ba
to
329513b
Compare
@cwfitzgerald Yep! The latest changes fix the validation errors. Thanks! I'm still noticing a bit of worse performance in |
I'm actually rewriting these comments as I review so if you're not attached to them then don't worry too much about fixing anything. |
Yeah that's fine - I was just trying to get something there. |
@cwfitzgerald What do you think of this? da543fb51 My hope was that this would make the states of [edit: made this a little nicer still: |
@cwfitzgerald Here's an even more radical suggestion: b5d52fe Just use a single semaphore for all queue ordering. |
@jimblandy i think we can go for it, the only question is if we want to still avoid this deadlock in anv mentioned in the old code, it if we consider that old enough to ignore.
I should have perseved that comment somehow. |
Ahh, yes, I was reviewing by checking out the branch and just wandering around with rust-analyzer, so I didn't notice that the patch removed the comment. If that's the sole rationale for the entire semaphore-swapping doohickey then we had definitely better have the comment. Kvark filed that bug in Oct 2021. That's not that long ago, so I think we'd better keep it. It sounds like a bear to debug, and I'd rather not have to debug it again. |
So I guess the remaining question is what you think of this one. |
So here's the latest version of my suggestion: |
Looks fine to me, could you either PR to my fork or just push to it? |
I still want to get a handle on the swapchain semaphores. I've got some work-in-progress docs for those too. I'll take care of this tomorrow morning. |
for &surface_texture in surface_textures { | ||
wait_stage_masks.push(vk::PipelineStageFlags::TOP_OF_PIPE); | ||
wait_semaphores.push(surface_texture.wait_semaphore); | ||
// We lock access to all of the semaphores. This may block if two submissions are in flight at the same time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my understanding, but I think this can't actually block. All submissions using any given SurfaceTexture
are required to use the same Fence
, and as explained in the docs for wgpu_hal::Queue::submit
, all submissions using the same Fence
must be synchronized to occur in a particular order.
wgpu-hal/src/vulkan/instance.rs
Outdated
// Wait for the previously acquired image used by the semaphore to be available. | ||
swapchain.device.wait_for_fence( | ||
fence, | ||
locked_swapchain_semaphores.previously_used_submission_index, | ||
timeout_ns, | ||
)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this actually accomplish? Aren't we going to wait for the acquire
semaphore anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must wait until the cpu timeline is sure that the semaphore has been signaled and reset. Otherwise there will be two acquire operations that could signal the same semaphore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, in other words:
If we were to remove this wait, then acquisition, command submission, and presentation would all set stuff in motion in the presentation engine or on the GPU, but as far as the CPU is concerned, they'd all return without blocking, so the CPU could just run the process full speed ahead, blow through the entire swapchain, come around to the front of the ring buffer again, and make a fresh call to vkAcquireNextImage
passing the same semaphore as last time. This wait is the only thing that rate-limits that process to match actual presentation speed.
Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed another commit documenting this. If my explanation is correct, please mark this as resolved.
6d2daa8
to
c530f01
Compare
@cwfitzgerald I just pushed two doc commits. Could you look them over for accuracy? |
// debug_assert_eq!( | ||
// Arc::as_ptr(&texture.surface_semaphores), | ||
// Arc::as_ptr(&ssc.surface_semaphores[ssc.next_semaphore_index]), | ||
// "Trying to use a surface texture that does not belong to the current swapchain." | ||
// ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should either be uncommented or removed, but we shouldn't leave it in limbo like this.
a26aba1
to
a5a000c
Compare
rebased on current trunk |
Introduce a utility function for making binary semaphores, and use it where appropriate.
This allows us to delete `should_wait`, and makes `RelaySemaphoreState` the same as `RelaySemaphores`, so we can just delete the former, and have `RelaySemaphores::advance` return a clone of `self`'s current state.
a5a000c
to
4cf108c
Compare
Connections
Description
As described in #5559, our submission and surface synchronization was totally messed up. I've revamped it, made smarter classes to help keep all the bookkeeping straight, and generally make the code cleaner and easier to understand.
Testing
works on my machine lel