New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extensions: Select language extension based on longest matching path extension #11697
base: main
Are you sure you want to change the base?
extensions: Select language extension based on longest matching path extension #11697
Conversation
dfd5e80
to
a4f651a
Compare
I like where your head is at! To add to this, Rails has erb files with the extension |
@@ -1,6 +1,6 @@ | |||
name = "TypeScript" | |||
grammar = "typescript" | |||
path_suffixes = ["ts", "cts", "d.cts", "d.mts", "mts"] | |||
path_suffixes = ["ts", "cts", "mts"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not important to this PR – and to be honest, I didn't actually test this – but I noticed while working on this that these suffixes a. would never work (because of the issue this PR is trying to fix) and b. are redundant (because cts
and mts
are already handled).
@@ -533,8 +542,10 @@ impl LanguageRegistry { | |||
}, | |||
); | |||
if path_matches_custom_suffix { | |||
2 | |||
} else if path_matches_default_suffix || content_matches { | |||
usize::MAX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used MAX
to ensure that user overrides always win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the user override needs to always win. I think we could use the longest match, but in the event of a tie, favor the user-defined path suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to implement that is to multiple the longest matching suffix's length by two and then add one if it's a user-defined suffix. Curious what you think of that.
let path_matches_custom_suffix = user_file_types | ||
.and_then(|types| types.get(language_name)) | ||
.unwrap_or(&empty) | ||
.iter() | ||
.any(|suffix| path_suffixes.contains(&Some(suffix.as_str()))); | ||
.any(|suffix| filename.ends_with(suffix)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I forgot to update this like I did above to support the full file name and prevent (eg) rs
also matching .cars
. I'll leave this as is for now, pending input from someone else on how to proceed. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we should use the same logic for finding the suffix length, for both the default path suffixes and the user-defined path suffixes.
I'd write it something like this:
let matching_suffix_len = config
.path_suffixes
.iter()
.filter(|suffix| {
filename == suffix ||
filename.strip_suffix(suffix).map_or(false, |prefix| prefix.ends_with('.'))
})
.map(|suffix| suffix.len())
.max();
a4f651a
to
bd9174b
Compare
@wassson Thank you! I think you should already be OK with |
@claytonrcarter haha duh yes that sounds right to me! |
bd9174b
to
72db72d
Compare
72db72d
to
8c60680
Compare
This isn't ready for merge, but it is ready for input from staff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be a good change! I left some comments.
@@ -533,8 +542,10 @@ impl LanguageRegistry { | |||
}, | |||
); | |||
if path_matches_custom_suffix { | |||
2 | |||
} else if path_matches_default_suffix || content_matches { | |||
usize::MAX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the user override needs to always win. I think we could use the longest match, but in the event of a tie, favor the user-defined path suffix.
@@ -533,8 +542,10 @@ impl LanguageRegistry { | |||
}, | |||
); | |||
if path_matches_custom_suffix { | |||
2 | |||
} else if path_matches_default_suffix || content_matches { | |||
usize::MAX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to implement that is to multiple the longest matching suffix's length by two and then add one if it's a user-defined suffix. Curious what you think of that.
let path_matches_custom_suffix = user_file_types | ||
.and_then(|types| types.get(language_name)) | ||
.unwrap_or(&empty) | ||
.iter() | ||
.any(|suffix| path_suffixes.contains(&Some(suffix.as_str()))); | ||
.any(|suffix| filename.ends_with(suffix)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we should use the same logic for finding the suffix length, for both the default path suffixes and the user-defined path suffixes.
I'd write it something like this:
let matching_suffix_len = config
.path_suffixes
.iter()
.filter(|suffix| {
filename == suffix ||
filename.strip_suffix(suffix).map_or(false, |prefix| prefix.ends_with('.'))
})
.map(|suffix| suffix.len())
.max();
This is a "let's see if I can get it to work" (aka hack) to try address #8408 and #10997. In short: the current language extension matcher is only considering the "final extension" of a file when seeing which language extensions would apply to the file. (This is because that's how Rust's
path.extension()
works.) This works fine for most cases, but there are some cases where a language extension may want to match on a "compound file extension", and this is currently not supported at all.Example:
blade.php
"extension"index.blade.php
path_suffixes = ["blade.php"]
index.blade.php
, and then try to match a lang ext based on any of thepath_suffixes
exactly matching either"php"
or"index.blade.php"
"blade.php" != "php"
nor"index.blade.php"
Proposed Solution
Basically, instead of doing an exact match on the suffix and either the file extension or filename, this change:
Languages that may use this
blade.php
"extension"component.html
Open questions
filename.ends_with(&ext) || filename == suffix
is slower thanpath_suffixes.contains(&Some(suffix.as_str()))
? If so, how bad is it?path_suffixes
in the ext API intopath_suffixes
andfilenames
, and then we could preprocess these when the language ext is first loaded to reduce overheadunintended matches: I realized while writing this that it may introduce behavior where (eg) a lang ext for D (suffixedit I resolved this when I found the tests that cover this change and go this to pass.d
) may be matched to Markdown (md
). In that case, Markdown would still win, though (because of length) ... so may that's not an issue?Release Notes: