Important performance regression since commit 7fd45d6 on a very big file #1212

sroddy · 2023-04-28T23:32:04Z

I've discovered this issue when I tried to bump the version of dart_style from 2.2.5 to 2.3.0.

My build_runner command gets stuck (I've left it running for more than 10 minutes before killing it) when trying to format the output of a particularly big generated code file.
I've bisected the git log and figured out that the first commit in which the issue is present is after the merge of #1182 (commit 7fd45d6). Before that commit, the step takes 2/5 seconds at most.
I've put a breakpoint and managed to dump the input string of the DartFormatter.format function inside the build step that causes it to hang.
I would prefer to avoid attaching that string here, as it contains some proprietary code, but I can send it via email if it can help you fix the issue.

Thanks

The text was updated successfully, but these errors were encountered:

sroddy · 2023-04-29T15:30:29Z

I've managed to reproduce the regression on the attached snippet of code, this is the output (elapsed time) on my machine before and after:

87ms with commit 35a5d9f
1857ms with commit 7fd45d6

regression.dart.zip

munificent · 2023-05-01T18:31:04Z

Thanks, I'll take a look.

munificent · 2023-05-05T23:29:19Z

OK, I know what's going on. The change to allow inline case bodies (c9e5191) incidentally fixed a long-standing bug where metadata splitting behavior across parameters wasn't correct. It was always the intent that if a metadata annotation on a parameter splits, then all metadata annotations across all parameters in the parameter list should split.

Unfortunately, because of the bug, the line splitter didn't enforce that rule. The above commit fixes the behavior. But in order to fix that, all parameters in a parameter list must go through the line splitting algorithm process instead of line splitting each parameter independently. Line splitting is the slowest part of formatting and can get very slow if you throw more code at it at once.

Your program happens to contain a particularly long parameter list (85 parameters), so line splitting takes too long and it gives up.

I'll spend some time and see if I can come up with a better way to handle this.

@required

Also, don't indent parameters with metadata annotations. Before this change, the formatter indents any parameter that contains metadata annotations that end up splitting: function( @required int n, @deprecated('Some long deprecation string.') String s) { ... } It also treats all parameters in a parameter list as a unit when deciding how to split their metadata. If any annotation in any parameter splits, then they all do. That's why there's a split after `@required` in the above example. The intent of both of these is so that a parameter with unsplit metadata doesn't get confused for an annotation on the next parameter: function( @FIRST('some string') int n, @second('another string forces a split') String s) { ... } Note here that there are two parameters, `n`, and `s`, and not just a single parameter `s` with two annotations. Unfortunately line splitting all of the parameters as a unit is very bad for performance when there is a large number of parameters (#1212). Also, it's not very helpful. In practice, parameter metadata is rare and most parameters that have any annotations only have one. And the indentation is just strange looking and inconsistent with how annotations are formatted elsewhere. It also means that parameters with split metadata don't align with parameters that don't have metadata. This change determines whether each parameter's annotations should split independently from the other parameters' and removes that indentation. The above example becomes: function( @required int n, @deprecated('Some long deprecation string.') String s) { ... } This improves performance on large parameter lists and I think looks better on real-world examples. I ran it on a large corpus (2,112,352 lines in 6,911 files) and I think the impact is small enough to not go through the full change process: 293 insertions + 443 deletions = 736 changes 1 changed line for every 2870.04 lines of code 0.3484 changed lines for every 1,000 lines of code The full diff is: https://gist.github.com/munificent/1dc7361438934a3587f6149049682e29 Fix #1212.

@required

Also, don't indent parameters with metadata annotations. Before this change, the formatter indents any parameter that contains metadata annotations that end up splitting: function( @required int n, @deprecated('Some long deprecation string.') String s) { ... } It also treats all parameters in a parameter list as a unit when deciding how to split their metadata. If any annotation in any parameter splits, then they all do. That's why there's a split after `@required` in the above example. The intent of both of these is so that a parameter with unsplit metadata doesn't get confused for an annotation on the next parameter: function( @FIRST('some string') int n, @second('another string forces a split') String s) { ... } Note here that there are two parameters, `n`, and `s`, and not just a single parameter `s` with two annotations. Unfortunately line splitting all of the parameters as a unit is very bad for performance when there is a large number of parameters (#1212). Also, it's not very helpful. In practice, parameter metadata is rare and most parameters that have any annotations only have one. And the indentation is just strange looking and inconsistent with how annotations are formatted elsewhere. It also means that parameters with split metadata don't align with parameters that don't have metadata. This change determines whether each parameter's annotations should split independently from the other parameters' and removes that indentation. The above example becomes: function( @required int n, @deprecated('Some long deprecation string.') String s) { ... } This improves performance on large parameter lists and I think looks better on real-world examples. I ran it on a large corpus (2,112,352 lines in 6,911 files) and I think the impact is small enough to not go through the full change process: 293 insertions + 443 deletions = 736 changes 1 changed line for every 2870.04 lines of code 0.3484 changed lines for every 1,000 lines of code The full diff is: https://gist.github.com/munificent/1dc7361438934a3587f6149049682e29 Fix #1212.

munificent added the performance label May 1, 2023

munificent mentioned this issue May 8, 2023

Split metadata on parameters independently. #1217

Merged

munificent closed this as completed in #1217 May 9, 2023

munificent mentioned this issue May 10, 2023

[CP] Roll dart_style 156f5c8c3bfe4e3b8e7adc95f7cf24d9500f11c6 into the SDK dart-lang/sdk#52346

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Important performance regression since commit 7fd45d6 on a very big file #1212

Important performance regression since commit 7fd45d6 on a very big file #1212

sroddy commented Apr 28, 2023

sroddy commented Apr 29, 2023

munificent commented May 1, 2023

munificent commented May 5, 2023

Important performance regression since commit 7fd45d6 on a very big file #1212

Important performance regression since commit 7fd45d6 on a very big file #1212

Comments

sroddy commented Apr 28, 2023

sroddy commented Apr 29, 2023

munificent commented May 1, 2023

munificent commented May 5, 2023