Update GS rendering #6352

slimbuck
Member

@slimbuck slimbuck commented May 10, 2024

This PR updates our runtime Gaussian splat (GS) renderer. It improves GPU render performance and sorting performance, and reduces jank during camera movement.

Before:
Screenshot 2024-05-10 at 16 07 53

After:
Screenshot 2024-05-10 at 16 08 24

The code changes are summarised as follows:

  • at load time we reorder splat data to be GPU cache friendly (using Morton order)
    • this is skipped when loading the compressed PLY format because that data is already ordered
  • the CPU sorter writes its results to a texture and calculates the number of splats in front of the camera
  • rendering uses a two-phase approach:
    • phase1: (preprocess) transform splat data into camera space and store the resulting data in render order (this is only required when the camera moves)
    • phase2: (render) consume the preprocessed data and render quads
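
The load-time reordering described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the function names (`part1By2`, `morton3`, `mortonOrder`), the 10-bits-per-axis quantisation and the interleaved `Float32Array` position layout are all assumptions made for illustration.

```javascript
// Spread the low 10 bits of x so that two zero bits separate each input bit.
function part1By2(x) {
    x &= 0x000003ff;
    x = (x ^ (x << 16)) & 0xff0000ff;
    x = (x ^ (x << 8)) & 0x0300f00f;
    x = (x ^ (x << 4)) & 0x030c30c3;
    x = (x ^ (x << 2)) & 0x09249249;
    return x;
}

// Interleave three 10-bit coordinates into a single 30-bit Morton code.
function morton3(x, y, z) {
    return (part1By2(x) | (part1By2(y) << 1) | (part1By2(z) << 2)) >>> 0;
}

// positions: Float32Array of interleaved xyz (assumed layout).
// Returns splat indices sorted by the Morton code of their quantised position.
function mortonOrder(positions) {
    const count = positions.length / 3;
    const min = [Infinity, Infinity, Infinity];
    const max = [-Infinity, -Infinity, -Infinity];
    for (let i = 0; i < count; i++) {
        for (let a = 0; a < 3; a++) {
            const v = positions[i * 3 + a];
            if (v < min[a]) min[a] = v;
            if (v > max[a]) max[a] = v;
        }
    }

    // Map each axis onto the integer range 0..1023.
    const scale = min.map((lo, a) => (max[a] > lo ? 1023 / (max[a] - lo) : 0));
    const quantise = (v, a) => Math.min(1023, Math.floor((v - min[a]) * scale[a]));

    const codes = new Uint32Array(count);
    const order = new Uint32Array(count);
    for (let i = 0; i < count; i++) {
        codes[i] = morton3(
            quantise(positions[i * 3], 0),
            quantise(positions[i * 3 + 1], 1),
            quantise(positions[i * 3 + 2], 2)
        );
        order[i] = i;
    }
    order.sort((a, b) => codes[a] - codes[b]);
    return order;
}
```

Splats that are close in space end up close in the returned order, which is what makes neighbouring texture fetches more likely to hit the GPU cache. The sort and the temporary arrays also illustrate why this step costs time and memory at load.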

Other changes:

  • splats are rendered as instances of a 128-quad mesh (one batch of 128 splats per instance) to improve occupancy
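
As a rough illustration of the batching arithmetic, the sketch below shows how many instances of a 128-quad mesh are needed and how an instance and quad pair could map to a slot in the sorted order. The names `BATCH_SIZE`, `instanceCount` and `splatSlot` are hypothetical, and the real mapping lives in the shaders rather than in JavaScript.

```javascript
// Quads per mesh instance, as stated in the PR description.
const BATCH_SIZE = 128;

// Number of mesh instances needed to cover every splat.
// The last instance may be only partly used.
function instanceCount(numSplats) {
    return Math.ceil(numSplats / BATCH_SIZE);
}

// Map (instance, quad within instance) to a slot in the sorted splat order.
// Returns -1 for slots past the end, which a shader would discard.
function splatSlot(instanceId, quadId, numSplats) {
    const slot = instanceId * BATCH_SIZE + quadId;
    return slot < numSplats ? slot : -1;
}
```

For one million splats this gives 7813 instances, all submitted through instancing rather than as separate draws.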

Issues:

  • the load-time reordering is time and memory intensive. We may want to perform this on a worker.
  • this technique uses more GPU memory than before (to store the phase1 output). This might be an issue for mobile devices.
  • phase1 output stores splat XYZW in an RGBA16F buffer. This appears to be enough precision, but more testing is needed to confirm.
  • a follow-up PR will restructure shaders to accommodate the rendering customisations we need in, for example, SuperSplat
  • VR mode may require work. This will be a follow-up PR.
  • the texture upload API doesn't support uploading partial data, so we can't upload only the required sorting data.
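
To make the last point concrete, here is a simplified sketch of a CPU sorter that produces a render order and counts the splats in front of the camera. It is an assumption-laden illustration rather than the sorter in this PR: `sortSplats`, its parameters and the comparison sort are all hypothetical choices.

```javascript
// centers: Float32Array of interleaved xyz splat centres (assumed layout).
// cameraPos, cameraDir: arrays of 3 numbers; cameraDir is the view direction.
// Returns the order array plus the number of splats in front of the camera.
function sortSplats(centers, cameraPos, cameraDir) {
    const count = centers.length / 3;
    const depths = new Float32Array(count);
    const order = new Uint32Array(count);

    let front = 0;    // next free slot for splats in front of the camera
    let back = count; // splats behind the camera fill the array from the end
    for (let i = 0; i < count; i++) {
        const dx = centers[i * 3] - cameraPos[0];
        const dy = centers[i * 3 + 1] - cameraPos[1];
        const dz = centers[i * 3 + 2] - cameraPos[2];
        depths[i] = dx * cameraDir[0] + dy * cameraDir[1] + dz * cameraDir[2];
        if (depths[i] > 0) {
            order[front++] = i;
        } else {
            order[--back] = i;
        }
    }

    // Sort only the visible range, farthest first (back to front).
    order.subarray(0, front).sort((a, b) => depths[b] - depths[a]);

    return { order, numVisible: front };
}
```

Only the first `numVisible` entries of `order` are needed for rendering, which is the portion a partial texture upload would send if the API supported it.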

TODO:

  • implement rendering directly from the compressed format (we currently decompress the data at load time). This should give a good memory saving and performance speedup, and should fit nicely into the existing rendering pipeline.
  • investigate fragment bandwidth optimisation by:
    • switching to rendering the splats front to back (instead of back to front as we do now)
    • rendering half the splats
    • stencilling out/marking pixels that are fully opaque
    • rendering the second half of the scene, skipping already filled pixels
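
The reasoning behind the front-to-back idea can be sketched for a single pixel and a single colour channel. The code below is a conceptual illustration only (the function names and the `{ color, alpha }` splat shape are assumptions, and it uses a per-pixel early exit rather than the stencil pass proposed above): both orders give the same colour, but front-to-back accumulation can stop once the pixel is opaque.

```javascript
// splats: array of { color, alpha } for one pixel, ordered front to back.

// Back-to-front "over" blending, as the renderer does today.
function compositeBackToFront(splats) {
    let color = 0;
    for (let i = splats.length - 1; i >= 0; i--) {
        const s = splats[i];
        color = s.color * s.alpha + color * (1 - s.alpha);
    }
    return color;
}

// Front-to-back blending that tracks the remaining transmittance and stops
// as soon as the accumulated opacity reaches the threshold.
function compositeFrontToBack(splats, opaqueThreshold = 1) {
    let color = 0;
    let transmittance = 1;
    let used = 0;
    for (const s of splats) {
        if (1 - transmittance >= opaqueThreshold) break;
        color += transmittance * s.color * s.alpha;
        transmittance *= 1 - s.alpha;
        used++;
    }
    return { color, used };
}
```

Every splat skipped by the early exit is fragment work that back-to-front rendering would still have paid for.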

@slimbuck slimbuck added the area: graphics (Graphics related issue) and enhancement labels May 10, 2024
@slimbuck slimbuck requested a review from a team May 10, 2024 15:38
@slimbuck slimbuck self-assigned this May 10, 2024
@mvaligursky
Contributor

What's the impact to WebXR rendering using two viewports?

@Maksims
Contributor

Maksims commented May 10, 2024

Is it possible to configure size of batches? As in case of 1m splats, it will lead to ~7.8k drawcalls?

Comment on lines +154 to +160
centerTexture;

colorTexture;

v1v2RenderTarget;

v1v2RenderPass;
Contributor

Add types?

@@ -120,6 +307,10 @@ class GSplatInstance {

createMaterial(options) {
this.material = createGSplatMaterial(options);
// this.material.setParameter('splatOrder', this.orderTexture);
Contributor

Delete?

Contributor

@willeastcott willeastcott left a comment

A true work of art. 🖼️

@willeastcott
Contributor

@slimbuck Would it be reasonable for this PR to close #6344?

@slimbuck
Member Author

> Is it possible to configure size of batches? As in case of 1m splats, it will lead to ~7.8k drawcalls?

@Maksims, the 128-quad mesh is instanced as many times as necessary with a single draw call.

@slimbuck
Member Author

> What's the impact to WebXR rendering using two viewports?

Yeah, thanks, I forgot to mention it in the issues section. The plan is to address VR in a follow-up PR.

@slimbuck
Member Author

> @slimbuck Would it be reasonable for this PR to close #6344?

Hopefully! :D

scope.resolve('transformB').setValue(this.splat.transformBTexture);
scope.resolve('transformC').setValue(this.splat.transformCTexture);

const cameraMatrix = new Mat4();
Contributor

these seem like per frame allocations?
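
The usual remedy for the concern raised here is to allocate scratch objects once and reuse them. The sketch below shows the pattern with a plain `Float32Array` standing in for the engine's `Mat4`. All names are hypothetical and this is not the code under review.

```javascript
// Module-scope scratch buffer: allocated once, reused on every frame.
const scratchMatrix = new Float32Array(16);

// Write a 4x4 identity matrix into `out` and return it.
function writeIdentity(out) {
    out.fill(0);
    out[0] = out[5] = out[10] = out[15] = 1;
    return out;
}

// Allocates a new array on every call, creating garbage each frame.
function updateAllocating() {
    return writeIdentity(new Float32Array(16));
}

// Reuses the shared scratch buffer, so steady-state frames allocate nothing.
function updateReusing() {
    return writeIdentity(scratchMatrix);
}
```

The trade-off is that callers must not hold on to the returned buffer across frames, since the next call overwrites it.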

@slimbuck
Member Author

Closing in favour of #6357.

@slimbuck slimbuck closed this May 13, 2024
Labels
area: graphics Graphics related issue enhancement
4 participants