[RFC] Add syntax to include grammar by resetting their base #1276
base: master
A scope name followed by two hashes (`##`) resets the base grammar for an inclusion rule.

Examples:

- `source.ruby#` Include `source.ruby` with the current base as base. Equivalent to `source.ruby`.
- `source.ruby##` Include `source.ruby` with `source.ruby` as base.
- `source.ruby#regexp` Include the rule `regexp` from `source.ruby`, using the current base as base.
- `source.ruby##regexp` Include the rule `regexp` from `source.ruby`, but use `source.ruby` as base.
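The four forms above can be told apart with a small parser. The following is only a minimal sketch of how an include string might be split, not the code from this pull request: `IncludeRef` and `parse_include` are hypothetical names, and the special `$self`/`$base` references and local `#rule` references are deliberately left out.

```python
from typing import NamedTuple, Optional


class IncludeRef(NamedTuple):
    """Hypothetical parsed form of an include string."""
    scope: str            # e.g. "source.ruby"
    rule: Optional[str]   # e.g. "regexp", or None for the whole grammar
    reset_base: bool      # True when the proposed "##" form was used


def parse_include(ref: str) -> IncludeRef:
    # Check "##" before "#": the double-hash form also contains a single hash.
    if "##" in ref:
        scope, _, rule = ref.partition("##")
        return IncludeRef(scope, rule or None, True)
    scope, _, rule = ref.partition("#")
    return IncludeRef(scope, rule or None, False)
```

Under these assumptions, `parse_include("source.ruby#")` and `parse_include("source.ruby")` give the same result, which matches the equivalence stated in the first example.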
This reverts commit 6dcfcf2.
This key has the same behavior as the previous implementation, but with the advantage of being backwards compatible.
@sorbits: I gave this some thought overnight and decided that maybe something like vmg@e0bf32e would be a better idea. Adding a separate flag means that we can fix grammars, e.g.
And the new version of TextMate will load these properly, whilst not breaking backwards compatibility with the previous version (or other parsers that don't support this feature). This is slightly less pretty, but I assume backwards compatibility is a huge deal for you (and rightly so), so I'm leaning towards this approach.
I think your proposal of adding syntax to the include rule is the best As for the actual syntax, I propose we treat
Alternatively we can set There have been a few requests for user-specified variables, so the So with that syntax, it would look like this:
The general use case for user-specified variables is mainly to define an What do you think? cc: @joachimm On 13 Nov 2014, at 12:21, Vicent Marti wrote:
@sorbits: Personally, I believe the variables feature should be orthogonal to this. Adding a If you add a variable for an Note that I really like the idea of variable replacements, but applying them recursively for includes sounds like a recipe for disaster. For this specific use case, we want to reset the base of the include, not replace it with something else. Also, being able to replace the value of So I would lean towards something like
So I've been thinking about this for a couple of days. It seems to me that what triggers this issue is the creation of a new context. In this case an embedded block of C is created, so would it not make sense for the block to reset the base rather than the include? I guess it doesn't make much real difference; it just seems that the block is what creates the new context, and that the include needs to honor it is incidental.
I'm not sure I follow. What do you mean exactly by a context?
I mean that it is the begin/end rule that creates a new context, which in this case is a block of embedded
If that were all that were desired, we could just change the C grammar to include
Yes, I believe that resetting the base should be a choice when including a subgrammar, and not really related to the block it's included in. And going back to @sorbits' suggestion: it's becoming increasingly clear to me that, although variable substitution would be a great thing to implement (and heck -- I personally wouldn't mind writing the patchset myself and sending you a PR, Allan, it sounds like a very useful thing to have), both Hence, my suggestion for a syntax to clear the base rule for an inclusion, but not to arbitrarily replace it, because that would surely lead to chaos.
For variables, there are two possibilities.
I think the flexibility of dynamic variables has some good use-cases as well. We could e.g. do a common line comment rule which would be used like this:
Though a better reason for dynamic variables is probably the current C, C++, Objective-C, and Objective-C++ grammars. The C grammar has a rule to match stdlib functions, and the Objective-C grammar has a similar rule for Cocoa functions. The C++ grammar has rules for matching braces to introduce scopes for namespaces and classes, so it needs to include the C grammar’s functions inside these new scopes. The problem is with Objective-C++: it includes the Objective-C and C++ grammars, and the latter includes only C functions in its brace scopes, but it should also include the Objective-C functions when included from Objective-C++. This could be solved by having the C grammar include As for “fuzzy logic” for variable scopes: I think we can define our way out of that. So back to I think there is value in keeping our options open, and also trying to limit the number of special constructs to a minimum. As for setting Ideally though we would use injection to match embedded code, but the example still shows that there could be value in being able to redefine Anyway, for a start I think it’s fine to not allow Come to think of it, the shared line comment rule from my first example might use
I like this approach. Let me see what an implementation looks like.
Was this solved? I am using VS Code and have this problem: Markdown code block syntax highlighting is broken for C and C++ #34525
Is there an example of Objective-C, C++, and Objective-C++ code that does not work without `$base`?
Allan,
Here's a small proposal to fix a (non-critical) issue we found while deploying TextMate grammars to production.
As you obviously know (since you designed the format, haha), `include` rules in grammars can use the `$self` and `$base` magic variables to recursively include themselves or the base grammar at the root of the parse tree. This is a crucial feature to parse many programming languages that have some kind of recursion in their syntax.

Grammars like C (`source.c`), however, routinely include `$base` instead of `$self` for their recursion rules. This is because the C grammar is included from other languages (like `source.cpp` or `source.objc`) to provide basic syntactic parsing, and when including itself recursively, we want the base grammar to be included again (or else chunks of `source.cpp` would be parsed as `source.c`, as they would be missing the C++ rules).

The bug, which I believe is not trivial to fix, arises on languages that include a grammar like `source.c` not to extend their syntax, but to parse a chunk of code as a different language.

Two obvious examples of this are `source.lua` and `source.ruby`, which include `source.c` to highlight an external block of C declarations or a heredoc with C code, respectively.

The result looks like this:

In this case, when `source.lua` includes `source.c`, and as soon as C does a recursive include (when parsing the inside of a struct definition), the `$base` rule is obviously Lua, so all the C parsing breaks. We're now parsing Lua inside C inside of Lua. This is not what we want!

So, how can we work around this? I propose the following small change in syntax: In an include rule, a scope name followed by two hashes (`##`) resets the base grammar for the inclusion.

Examples:

- `source.ruby#` Include `source.ruby` with the current base as base. Equivalent to `source.ruby`.
- `source.ruby##` Include `source.ruby` with `source.ruby` as base.
- `source.ruby#regexp` Include the rule `regexp` from `source.ruby`, using the current base as base.
- `source.ruby##regexp` Include the rule `regexp` from `source.ruby`, but use `source.ruby` as base.

This change will allow any languages that need to require a sub-language in an isolated way to do so. I believe this is the least intrusive way to fix this issue.
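To make the difference concrete, here is a rough model of how `$base` could be resolved along a chain of nested includes. It is a sketch under simplifying assumptions, not TextMate's actual resolution logic: `resolve_base` is a hypothetical helper, and the chain is just a list of include strings ordered from the root grammar downward.

```python
def resolve_base(include_chain):
    """Return the scope that `$base` would refer to inside the last grammar.

    `include_chain` lists include strings from the root grammar downward,
    e.g. ["source.lua", "source.c"]. This is an illustrative model only.
    """
    base = None
    for ref in include_chain:
        scope = ref.split("#", 1)[0]  # drop any "#rule" or "##rule" suffix
        # The root grammar is the initial base; the proposed "##" form resets it.
        if base is None or "##" in ref:
            base = scope
    return base


# Extending a language: C++ includes C, and `$base` inside C stays C++.
print(resolve_base(["source.cpp", "source.c"]))
# Embedding a language today: `$base` inside C is still Lua (the bug).
print(resolve_base(["source.lua", "source.c"]))
# Embedding with the proposed syntax: `$base` inside C becomes C.
print(resolve_base(["source.lua", "source.c##"]))
```

In this model the `source.cpp` case keeps working unchanged, because only an include written with `##` alters the base.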
@sorbits: Do you think this is reasonable, or can you come up with a more elegant way to fix the issue? I'm all ears and eager for your feedback. I'd love to get this fixed! :)
Cheers,
vmg