Try out winnow #86

Draft: stefnotch wants to merge 10 commits into master

Conversation

stefnotch (Contributor)

I took a stab at replacing one of the existing pieces of code with winnow.

I went for winnow instead of the usual nom library because epage's winnow is significantly easier to use and debug. It's also the second most popular parser combinator library, so it's a reasonably well-tested piece of technology.

The code is twice as long as it should be, because I tried to keep the existing "line by line" logic. It'll get much shorter as I convert more of the source code.
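
For a flavour of what this looks like, here's a minimal winnow parser in the same spirit (a hedged sketch, not code from this PR; the directive shape and names are made up for illustration):

use winnow::ascii::{alpha1, space1};
use winnow::combinator::preceded;
use winnow::prelude::*;

// Parse a `#define IDENT` line and return the identifier.
// (alpha1 is a simplification; real identifiers would need a broader matcher.)
fn define_directive<'a>(input: &mut &'a str) -> PResult<&'a str> {
    preceded(("#define", space1), alpha1).parse_next(input)
}

fn main() {
    let mut input = "#define cat";
    let name = define_directive(&mut input).unwrap();
    assert_eq!(name, "cat");
}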

stefnotch (Contributor Author) commented on the diff, at the closing lines of the parser (the final .parse_next(input)?; and its closing braces):

Up until here is the actual parser code. Everything below is a dance to keep the existing behaviour mostly intact, while also prepping to replace more code.

stefnotch (Contributor Author)

By the way, are there any extra repositories with shaders that I should ideally test?

robtfm (Collaborator) commented Apr 28, 2024

> By the way, are there any extra repositories with shaders that I should ideally test?

bevy would be the obvious choice: run a few of the graphics examples (e.g. lighting, 3d_scene, bloom, fog, something from the 2d examples) and check for errors and equivalent output (though 99% of the time, if it doesn't error it'll match).

stefnotch (Contributor Author) commented Apr 28, 2024

I also benchmarked winnow's compile times vs. nom's compile times, and the two are very comparable (under 2 seconds).
I'll have to benchmark the compile times for this PR at the end, just to make sure that I'm not causing any major regressions.

Elabajaba (Contributor)

I'd be interested in runtime benchmarks as well. Naga_oil can take quite a while to assemble bevy shaders.

stefnotch (Contributor Author)

@Elabajaba I see, then I'd better add criterion in a separate PR. Then we'd have a before and after for comparison.

Should I let such runtime benchmarks run in GitHub actions as well? (Would involve a bit more setup and a slightly different library)
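
For reference, a minimal criterion bench could look something like this (a sketch only; the bench name, the inline shader, and the exact setup are assumptions, not code from this repository):

use criterion::{criterion_group, criterion_main, Criterion};
use naga_oil::compose::{Composer, NagaModuleDescriptor};

fn bench_compose(c: &mut Criterion) {
    // Tiny standalone shader so the sketch is self-contained; a real bench
    // would feed in bevy's pbr shaders and their imports instead.
    let source = "fn forty_two() -> f32 { return 42.0; }";
    c.bench_function("make_naga_module", |b| {
        b.iter(|| {
            let mut composer = Composer::default();
            composer
                .make_naga_module(NagaModuleDescriptor {
                    source,
                    file_path: "bench.wgsl",
                    ..Default::default()
                })
                .unwrap()
        })
    });
}

criterion_group!(benches, bench_compose);
criterion_main!(benches);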

robtfm (Collaborator) commented Apr 29, 2024

it's not ideal, but there is an example which runs a bunch of compiles over various configurations: cargo run --release --example pbr_compose_test

run it twice, so the first run populates the driver cache - really we want to check the runtime performance for a shader that's been used before.

stefnotch (Contributor Author)

@robtfm
I've been taking a look at the existing code, and wanted to ask if I understood everything before considering a bigger refactor.

Status quo

Currently, the composer will start by parsing every module that's being added:

pub fn add_composable_module(

This parsing step mainly figures out "which imports" and "which defines", which get_preprocessor_metadata returns.
We're also doing a few secondary tasks, like assigning values to @binding(auto).

Then, when it is time to build the final module, we have to put everything together:

pub fn make_naga_module(

It resolves the imports, using self.module_sets from the previous step.
Then we create a derived module, and add everything from the imports to it.

naga_oil/src/compose/mod.rs, lines 1674 to 1685 (at 37a472c):

let mut derived = DerivedModule::default();
let mut already_added = Default::default();
for import in &composable.imports {
self.add_import(
&mut derived,
import,
&shader_defs,
false,
&mut already_added,
);
}

Finally, we do a bit of finishing work, like dealing with overrides and validation.

Is that a reasonably accurate overview so far?

Refactoring proposal

Currently, we are preprocessing shader code multiple times.
With a new parsing approach, I would be interested in doing the following:

First we parse the shader code, and we generate a syntax tree. This syntax tree only understands our preprocessor code, and normal WGSL code is just a raw string in the tree.

Module [
  Version [ 450 ]
  Imports [ ... ]
  
  HashtagDefine [ cat ]
  
  HashtagIf [
    Condition [
      cat == 3
    ]
    IfBranch [
      some code
    ]
    ElseBranch[
      other code
    ]
  ]
  
  ...
]

Then, when it's time to build the final module, we only need to look at the syntax tree. Lots of tasks become very easy, such as dealing with an #ifdef/#endif pair: all we have to do is, when printing the tree, only print the branches that are true (see the sketch below).
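
As a rough sketch of how that could look in code (hypothetical types; conditions are reduced to a plain define lookup here, while the real tree would store expressions like cat == 3):

use std::collections::HashMap;

// Hypothetical preprocessor syntax tree: plain WGSL stays as raw text,
// only the directives get structure.
enum Node {
    Raw(String),           // untouched WGSL source
    Define(String),        // #define NAME
    If {
        condition: String, // e.g. "FOO"; real conditions would be expressions
        if_branch: Vec<Node>,
        else_branch: Vec<Node>,
    },
}

// "When printing the tree, only print branches that are true."
fn print(nodes: &[Node], defines: &HashMap<String, bool>, out: &mut String) {
    for node in nodes {
        match node {
            Node::Raw(text) => out.push_str(text),
            Node::Define(_) => {} // directives emit no output themselves
            Node::If { condition, if_branch, else_branch } => {
                let taken = *defines.get(condition.as_str()).unwrap_or(&false);
                print(if taken { if_branch } else { else_branch }, defines, out);
            }
        }
    }
}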

Does that make sense, and would it be reasonable? If yes, I'll go ahead with taking a stab at it.
Also, thank you so much for all your patience and excellent answers to my questions. I appreciate it. :)

robtfm (Collaborator) commented Apr 30, 2024

> Is that a reasonably accurate overview so far?

yep, that's accurate.

> Does that make sense, and would it be reasonable?

yes, if you can keep the preprocessor api functionally unchanged, that should work just fine. one subtlety to note: the required imports are based on usage of items in the code (conditional for the PreprocessedOutput and unconditional for the PreprocessorMetaData), so you will still need to parse the raw wgsl at the preprocessor level currently.

stefnotch (Contributor Author) commented Apr 30, 2024

This actually raises an interesting future question for me:
Should a preprocessor be an actual preprocessor with its own parsing pass, or should it conceptually be integrated into the language? C's preprocessor works like the first option; Rust's cfg works like the second.

Status quo

As in, if I write code like this, then the scoping changes depending on what the preprocessing result is:

fn main() {
  let a = 7;
  {
    let a = vec3(1., 1., 1.); // This is a different `a` variable

#ifdef FOO
  }

  {
#endif
    let b = a + 9;
    // Do something with a
  }
}

It could turn into either of these options:

fn main() {
  let a = 7; // Pointless
  {
    let a = vec3(1., 1., 1.); // This a is used

    let b = a + 9;
  }
}


fn main() {
  let a = 7; // This a is used
  {
    let a = vec3(1., 1., 1.); // Pointless
  }

  {
    let b = a + 9;
  }
}

Or it could change whether something is an identifier or not, with constructions along the lines of:

#ifdef FOO
enable 
#else
var
#endif
rounding_mode; // Either this enables an extension, or is a variable called rounding_mode

Alternative option

Rust looked at this and went "we don't want that", and instead only lets cfg appear before whole "blocks". In terms of a syntax tree, cfg can never split a node in weird ways, like splitting a scope or a variable declaration into pieces.
Rust's approach has the upside of being easier for IDE tooling. Syntax highlighting is a really good example: which color does rounding_mode deserve? A global-variable coloring, or a known-extension coloring?
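
For comparison, a Rust cfg version of the rounding_mode example (the feature name is hypothetical) attaches to a whole item, so the tree can never be split mid-node:

// Either the item exists in this configuration or it doesn't;
// cfg can't turn half a declaration into something else.
#[cfg(feature = "foo")]
fn rounding_mode() -> u32 { 1 }

#[cfg(not(feature = "foo"))]
fn rounding_mode() -> u32 { 0 }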

stefnotch (Contributor Author) commented May 7, 2024

I've been working on the AST rewrite and a few other things. Now I'm realising that if I do everything at once, this PR would explode.

So I wanted to ask: what would your preferred way forward be?

For example, I could start with a rewrite, where I keep the public API as similar as possible. I'll definitely have a few places where there will technically be breaking changes, but I'd test that with Bevy and the unit tests.
Then I could extract the comment parsing into a separate PR, since that can be extracted reasonably cleanly.
After that, I'd propose changing a few bits in the public API. And then I'd submit the big PR (maybe split into pieces).

An alternative option would be: I break compatibility wherever reasonably justified, and submit one massive pull request. This is obviously less effort for me, and should lead to faster results. I would also take on the task of updating Bevy.
The major downside here is that it's a lot harder to review.

And finally, there's the option of "first submit massive pull request, then decide how we're going to break it down and integrate it".
