Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support type alias statements in simple statement positions #8916

Merged
merged 2 commits into from
Nov 30, 2023

Conversation

charliermarsh
Copy link
Member

Summary

Our SoftKeywordTokenizer only respected soft keywords in compound statement positions -- for example, at the start of a logical line:

type X = int

However, type aliases can also appear in simple statement positions, like:

class Class: type X = int

(Note that match and case are not valid keywords in such positions.)

This PR upgrades the tokenizer to track both kinds of valid positions.

Closes #8900.
Closes #8899.

Test Plan

cargo test

@charliermarsh charliermarsh added bug Something isn't working parser Related to the parser labels Nov 30, 2023
/// The lexer is within brackets, with the given bracket nesting depth.
Nested(u32),
/// The lexer is some other location.
Other,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and Nested could be combined, since this is just Nested with a depth of 0. Alternatively, Nested could be NonZeroU32. But both felt clunkier in practice. Open to changing, though.

Copy link
Contributor

github-actions bot commented Nov 30, 2023

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overall like the solution. I'm a bit worried that the colon detection is a bit too permissive.

Comment on lines 165 to 167
Tok::Colon if matches!(self.position, Position::Other) => {
self.position = Position::SimpleStatement;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You could implement Eq on Position so that you can use == instead of matches

// ```python
// class Class: type X = int
// ```
Tok::Colon if matches!(self.position, Position::Other) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looking at the colon itself might not be sufficient because colons are also used to separate type annotations, slice indices, etc). I'm not sure if this is an issue here, but let's add a few more tests, e.g. a: type X = int (which should not be valid)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the logical line rules, we have some more extensive logic to ensure that the colon follows a statement keyword (like class). I could add that here...

@@ -93,7 +98,10 @@ where
// 2. The type token is immediately followed by a name token.
// 3. The name token is eventually followed by an equality token.
Tok::Type => {
if self.start_of_line {
if matches!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to your change: It's somewhat unfortunate that it's necessary to track whether we're at the start of the line, because we already know this inside of the Lexer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. Perhaps we could move this into the lexer...

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overall like the solution. I'm a bit worried that the colon detection is a bit too permissive.

@charliermarsh charliermarsh merged commit 20782ab into main Nov 30, 2023
16 checks passed
@charliermarsh charliermarsh deleted the charlie/soft-keyword branch November 30, 2023 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working parser Related to the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fails to parse type aliases in same line body Fails to parse type aliases after semicolon
3 participants