exact number of empty lines for document start/end #340

wookietreiber · 2020-10-30T10:20:23Z

I would like to enforce having exactly one empty line after document start and exactly one empty line before document end, i.e.:

good

---

foo: bar

...

bad

---
foo: bar
...

---


foo: bar


...

options

I have already forked and wanted to start hacking away, but then I realized from reading through the documentation again and looking at the code and tests, that there are actually two places where this feature could be added:

directly with document-start and document-end

I imagined something the likes of a new option empty-line, allowing either a bool or the string whatever, defaulting to whatever (compatibility). The behavior would have been:
- the empty line option applies only if present: true
- if true enforce a single empty line after document start / before document end, respectively
- if false the empty line is forbidden
- if whatever, don't care
It would also be possible to turn these into an int to enforce an exact number of empty lines.
with empty-lines which already contains max-start and max-end

The feature could also be implemented with empty-lines and two new options called min-start and min-end.

guidance

Before implementing any of these, I would like to ask your guidance about how to implement this. In my opinion, adding this to document-start and document-end would be the most intuitive, but this could potentially conflict with max-start and max-end, while the conflict could also be delegated to the user. Please advise.

The text was updated successfully, but these errors were encountered:

adrienverge · 2020-10-30T14:58:05Z

Hello Christian,

Thanks for your crystal-clear message.

I think empty-lines is not really related, because max-start and max-end refer to the file start / end, not the document start / end (a file can contain multiple YAML documents, and the --- can be on line 2 for instance).
```
                 ← empty-lines max-start
---
                 ← new rule or option
foo:bar
                 ← new rule or option
...
                 ← empty-lines max-end
```
document-start and document-end seem the best way to go.

Your proposal with empty-line: int is cool, but I don't see how a user could enforce a range of values (e.g. I want at least 1 empty line, but not more than 2).

Maybe we could define options like this (inspired by rules braces, brackets, colons and commas)?

document-start:
  present: true
  min-empty-lines-after: 1
  max-empty-lines-after: 2

wookietreiber · 2020-11-02T10:20:06Z

@adrienverge Thanks for the explanation! Your thoughts about [min|max]-empty-lines-[after|before] are very reasonable, so I'll get started on the PR for that.

wookietreiber · 2020-11-02T12:44:03Z

@adrienverge I need your advice again :-)

At the moment, there are three types of rules, i.e. TYPE = 'comment|line|token'. For comparison, yamllint/rules/document_start.py uses token to check for DocumentStartToken, while yamllint/rules/empty_lines.py uses line to count empty lines.

It seems that line counting can't be done with the token-based parser, because it only goes up to nextnext and we'd need to peek further than that to be able to count the empty lines and, from looking at the pyyaml documentation, there does not seem to be a newline token that could be used.

To support the new [min|max]-empty-lines-[after|before] options, the diff might be a bit bigger than simply adding a little bit new logic to the existing rules. I can think of the following variants:

Add a new rule TYPE for document-start and document-end, i.e. a mixture of both token-based parsing to check for DocumentStartToken and line-based parsing to check on the number of empty lines encountered.
Add the new [min|max]-empty-lines-[after|before] options to empty-lines instead of document-start and document-end.
Split document-start and document-end in two parts/rules, one that checks for the token and one that checks for the empty lines.

While still getting to know the code-base, I'm not sure about the implications and practicability of these variants. How do you want to proceed?

adrienverge · 2020-11-04T14:10:05Z

Hello @wookietreiber, indeed that's an interesting point.

It seems that line counting can't be done with the token-based parser, because it only goes up to nextnext and we'd need to peek further than that to be able to count the empty lines and, from looking at the pyyaml documentation, there does not seem to be a newline token that could be used.

Are you sure? I did a simple test with:
```
---

string
```
... and tokens seem to be:
```
0:0 --> 0:0	StreamStartToken(encoding=None)
0:0 --> 0:3	DocumentStartToken()
2:0 --> 2:0	BlockMappingStartToken()
```

If it doesn't work, the best way to go is your point 1. What about a refactor that would allow any rule to be both/any of comment|line|token? For example:

 ID = 'document-start'
-TYPE = 'token'
 CONF = {'present': bool}
 DEFAULT = {'present': True}
 
 
-def check(conf, token, prev, next, nextnext, context):
+def check_line(conf, line):
+    # do stuff
+
+
+def check_token(conf, token, prev, next, nextnext, context):
     if conf['present']:
         if (isinstance(prev, (yaml.StreamStartToken,
                               yaml.DocumentEndToken,

wookietreiber · 2021-07-05T07:04:43Z

It seems that line counting can't be done with the token-based parser, because it only goes up to nextnext and we'd need to peek further than that to be able to count the empty lines and, from looking at the pyyaml documentation, there does not seem to be a newline token that could be used.

Are you sure? I did a simple test with:
---

string
... and tokens seem to be:
0:0 --> 0:0	StreamStartToken(encoding=None)
0:0 --> 0:3	DocumentStartToken()
2:0 --> 2:0	BlockMappingStartToken()

@adrienverge So, if I understand this correctly, BlockMappingStartToken(2:0 --> 2:0) refers to the line where string starts, but doesn't actually include the string itself. There is no token for an empty line, because 1:0 is mentioned nowhere. I could theoretically infer from the line difference in DocumentStartToken(0:0 --> 0:3) and WhateverToken(2:0 --> 2:0) that there is one empty line at 1:0?

wookietreiber linked a pull request Jul 5, 2021 that will close this issue

adds min/max empty lines after document start/end #393

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

exact number of empty lines for document start/end #340

exact number of empty lines for document start/end #340

wookietreiber commented Oct 30, 2020

adrienverge commented Oct 30, 2020

wookietreiber commented Nov 2, 2020

wookietreiber commented Nov 2, 2020

adrienverge commented Nov 4, 2020

wookietreiber commented Jul 5, 2021

exact number of empty lines for document start/end #340

exact number of empty lines for document start/end #340

Comments

wookietreiber commented Oct 30, 2020

good

bad

options

guidance

adrienverge commented Oct 30, 2020

wookietreiber commented Nov 2, 2020

wookietreiber commented Nov 2, 2020

adrienverge commented Nov 4, 2020

wookietreiber commented Jul 5, 2021