Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark 'file-only-linters' using ${real_file} in cmd #1022

Closed

Conversation

kaste
Copy link
Contributor

@kaste kaste commented Feb 26, 2018

Explains #954 (comment)

Maybe fixes #954

Here we use ${real_file} in cmd as a marker to mark 'file-only-linters', we use ${tmp_file} to mark linters that operate on temporary files, everything else (except the compatibility branching) uses STDIN.

@braver
Copy link
Member

braver commented Feb 26, 2018

I think it’s kinda ugly to switch the implementation on magic words inside a string. I think I’d prefer a separate setting.

@kaste
Copy link
Contributor Author

kaste commented Feb 26, 2018

Think that the user can set 'cmd' via the settings. This is rather expressive. For the implementation you could easily switch to a getter like linter.is_file_only_linter() which hides the implementation.

My proposal in #954 (comment) is based on making 'cmd' 'expressive'. I don't have other ideas atm.

@braver
Copy link
Member

braver commented Feb 26, 2018

Ok, that’s not a bad idea. I guess I don’t have any strong opinions or brilliant ideas on this one.

@FichteFoll
Copy link
Contributor

FichteFoll commented Feb 26, 2018

This pattern does not support the following situations:

  • cmd = 'eslint --stdin-filename ${real_file} -'
    eslint working on stdin but declaring the original file name for prettier output (flake8 can do this too). It's not necessary, so this isn't significant.

  • mypy --shadow-file ${real_file} ${temp_file} ${real_file}

    Mypy does project-aware linting because it can follow imports, so the above command would lint the file ${real_file} but substitude its contents with ${temp_file} internally.

The former could be detected by the additional presence of a temp_file argument, but I liked the idea of exposing the three lint modes and a mode-specific cmd override more.

Example config (pseudo):

lint_mode = TEMPORARY_FILE | REAL_FILE  # no STDIN, which would override TEMPORARY_FILE
cmd = 'mypy --shadow-file ${real_file} ${temp_file} ${real_file}'`
cmd_real = 'mypy ${file}' # expands to ${real_file} in REAL_FILE mode; chose to override this mode since the other two are exclusive and this only requires the addition of one "alias"

@kaste
Copy link
Contributor Author

kaste commented Feb 26, 2018

You should use ${file} here

  • cmd = 'eslint --stdin-filename ${file} -'

likewise for 'mypy'

@FichteFoll
Copy link
Contributor

FichteFoll commented Feb 26, 2018

But ${file} would be the temp file's path. But mypy doesn't support stdin, and the presence of ${temp_file} causes ${file} to be replaced with the temp file name as well.

Okay, nvm. I was confusing some lines a but during the diff review.

Using 'mypy --shadow-file ${file} ${temp_file} ${file}' would probably work. The only reason for ${real_file} to exist would be to signal the sole support of linting actual files.

I'm not too fond of the idea as I liked being able to supply the lint modes explicitly instead of this (slightly confusing) magic, but at least it works, unlike I assumed initially. I'd like to see what @braver has to say about my proposal, since he wasn't a fan of the magic either.

@FichteFoll
Copy link
Contributor

As an update, there are probably better ways to provide different command lines for different lint modes than a separate "cmd_real" attribute/setting. I could imagine a dict, for example:

cmd = {
    TEMPORARY_FILE: 'mypy --shadow-file ${real_file} ${temp_file} ${real_file}',
    REAL_FILE: 'mypy ${file}',
}

(Note that I edited my earlier comment a couple times.)

@kaste
Copy link
Contributor Author

kaste commented Feb 27, 2018

There is no expressed user story in the original issue why you would want to declare different 'cmd's (statically) upfront.

Note also that there is no magic here. We don't guess anything, no opinions, no heuristics. (The notion of magic at least involves that something is automatic.)

Some explanations. Currently, we (already) have a setting that actually branches all possible ways. It is called unfortunately tempfile_suffix. It borrows the following semantics, and the main switch here is in Linter.run:

	falsy (mostly None)  => STDIN
    '-'                  => REAL_FILE
    otherwise            => TEMPORARY_FILE

Note (@braver) that the 'magic' '-' already 'leaked' throughout the code (into backend.py and commands.py). T.i. the implementation was never hidden (Usually you give it a name by defining a getter for something like this.).

Now, if we have two settings (tempfile_suffix now, maybe input_mode in the future, and cmd) we face the problem that both must actually match. T.i. you cannot declare 'input_mode' 'TEMPORARY_FILE' without having ${temp_file} in cmd. Likewise, you MUST not have ${temp_file} in cmd if the mode is 'STDIN'.

So, if you're in a situation like this, it is probably better to make the branching based on one setting (here: cmd). But bc we have three possible states, we need another marker to switch into 'REAL_FILE'-mode. That is the reasoning to introduce ${real_file}.

Generally, it is important to keep in mind that cmd will (hopefully) become a user setting (just like selector).

(If you define 'cmd' as a dict, should every plugin provide three different commands to be a nice citizen? How do we actually switch between them?)

@braver
Copy link
Member

braver commented Feb 27, 2018

Note (@braver) that the 'magic' '-' already 'leaked' throughout the code (into backend.py and commands.py). T.i. the implementation was never hidden

I know, it's terrible. With a getter, as you suggest, it would be a lot better.

Now, if we have two settings (tempfile_suffix now, maybe input_mode in the future, and cmd) we face the problem that both must actually match.

A very good point, I didn't think that true.

There is no expressed user story in the original issue why you would want to declare different 'cmd's (statically) upfront.

Isn't having multiple cmd's a different discussion altogether? I think I kinda get why you'd want that, but it's not a big thing IMO. Perhaps focus on replacing the '-' that's leaked everywhere with something self-explanatory + a getter.

@FichteFoll
Copy link
Contributor

FichteFoll commented Feb 28, 2018

There is no expressed user story in the original issue why you would want to declare different 'cmd's (statically) upfront.

That's true. The requirement was just to get rid of the '-' value, because that's not self-explanatory. The suggestion to expose lint_mode was later made by @braver, which is why I tried iterating on it.

We don't guess anything, no opinions, no heuristics. (The notion of magic at least involves that something is automatic.)

It's simpler to specify only a single value instead of having the "redundancy" of a separate setting and specific variables in the cmd, but it irks me that ${real_file} completes to the same value as ${file} and is only used for signaling. This kind of implicit behavior changes based on the usage of a different variable just doesn't strike me as a good usage pattern.

Now, if we have two settings (tempfile_suffix now, maybe input_mode in the future, and cmd) we face the problem that both must actually match. T.i. you cannot declare 'input_mode' 'TEMPORARY_FILE' without having ${temp_file} in cmd. Likewise, you MUST not have ${temp_file} in cmd if the mode is 'STDIN'.

(If you define 'cmd' as a dict, should every plugin provide three different commands to be a nice citizen? How do we actually switch between them?)

I have a suggestion about that (see below).


Time for summarizations. Here's your current proposal:

Current proposal

  1. use REAL_FILE mode if ${real_file} occurs within cmd (highest priority)
  2. use TEMPORARY_FILE mode if ${temp_file} occurs within cmd. This would probably not error if ${real_file} is also provided, but it would unnecessarily cause a temporary to be created.
  3. use STDIN otherwise
  • ${temp_file} is replaced with the temporary file path
  • ${real_file} is replaced with the actual file's path
  • ${file} is replaced with the actual file's path

My proposal

  1. A linter may have multiple lint modes, REAL_FILE and either of TEMPORARY_FILE or STDIN (since they are functionally equivalent)
  2. The lint modes to be active for a linter are explicitly communicated either by a lint_mode setting consisting of bit flags or by having cmd be a dict. cmd may still be a string. If cmd is a dict and lint_mode is specified also, lint_mode is preferred.
  • ${temp_file} is replaced with the temporary file path
  • ${real_file} is also replaced with the actual file's path
  • ${file} is replaced with the temp file path, real file path or - depending on the lint mode

Advantages

  1. For most situations, only a single cmd using ${file} needs to be specified, with the lint_modes being communicated through the separate setting.
  2. This allows for the user to easily configure it by just overriding lint_mode
    and add support for lint_mode options to for specific linters #244 for free.
  3. The linter would operate on the file if it was unchanged. The temporary or stdin lint modes would only be used for dirty/unsaved files.
  4. If a linter needs a fundamentally different command for the non-real case (mypy), specifying multiple commands in a dict allows them to do so.

Disadvantages

  1. If both lint_mode and cmd as a dict are provided, one needs to have higher priority over the other. For easier user configurability, this would be lint_mode.
  2. Potential for errors if cmd is a dict but no command was provided for the requested lint mode (via lint_mode).
  3. ${temp_file} may only be used in TEMPORARY_FILE mode (or if only that mode is allowed). This makes sense, though.
  4. More code, probably.

Disadvantages for both

  1. tempfile_suffix still exists, although it is largely redundant now and doesn't need to be used most of the time. Not specifying an extension will determine one from the file that is being linted and the only linter that seemingly requires the extention to be something specific was the php linter. This also applies to the other proposal. For my proposal, it would also be used to detect the requested lint mode as TEMPORARY_FILE if it was set to -.

@kaste
Copy link
Contributor Author

kaste commented Feb 28, 2018

This feels like a mixing of user story and implementation details.

My proposal
A linter may have multiple lint modes, REAL_FILE and either of TEMPORARY_FILE or STDIN (since they are functionally equivalent)

That's confusing, they're not functionally equivalent. From the point of view of SL TEMP and STDIN share that they can operate on unnamed and dirty buffers. From the point of the linters REAL AND TEMP share that they both operate on real files.

The lint modes to be active for a linter are explicitly communicated either by a lint_mode setting consisting of bit flags or by having cmd be a dict. cmd may still be a string. If cmd is a dict and lint_mode is specified also, lint_mode is preferred.

'lint_mode' would be a user setting, so no BIT_FLAGS pure strings.

${temp_file} is replaced with the temporary file path

${temp_file} would not be needed anymore bc ${file} changes its values based on the mode.

${file} is replaced with the temp file path, real file path or - depending on the lint mode

We want the option to have cmd as a user setting, and then ${file} would be automatically replaced via our settings system. So that a minus.

Generally, the same setting with various meanings is 'implicit behavior change[s] based on the usage of a different variable' This is confusing.

Advantages

For most situations, only a single cmd using ${file} needs to be specified, with the lint_modes being communicated through the separate setting.

I don't think that this is true. You cannot set eslint ${file} and it will, actually magical, just work for different modes. It is eslint --stdin --stdinname $file for STDIN mode, eslint $file for REAL mode and TEMP MODE. For mypy the TEMP mode has an really ugly cmd line, but a simple one for REAL mode, and none for STDIN of course. I stated in #954 that we cannot write one cmd for all modes, but usually if you change the mode, you also change the cmd.

This allows for the user to easily configure it by just overriding lint_mode
and add support for #244 for free.

You get a part of #244 for free for both proposals. But lint_mode is otherwise completely unrelated. It also has 'manual'.

The linter would operate on the file if it was unchanged. The temporary or stdin lint modes would only be used for dirty/unsaved files.

This is the meat here. This is interesting, but probably nobody wants it. E.g. there is no real user interest to lint the real file and then use the buffer contents via STDIN in quick alternations. Why would you want this with eslint. The only difference here would be the line-endings on Windows. (So you fix the one issue why does SL not show the CRLF errors I get on CLI.)

It's simpler to specify only a single value instead of having the "redundancy" of a separate setting and specific variables in the cmd

This is about 'consistency'. Trying to reduce invalid states of a program.

@FichteFoll
Copy link
Contributor

FichteFoll commented Feb 28, 2018

That's confusing, they're not functionally equivalent.

I realized the wording was poor, but was hoping you'd understand anyway. Yes, they are not functionally equivalent, but they have the same results and you don't need two things producing the same thing at the same time.

${temp_file} would not be needed anymore bc ${file} changes its values based on the mode.

Good thinking. I'll take that as an advantage?

We want the option to have cmd as a user setting, and then ${file} would be automatically replaced via our settings system.

Bummer, I forgot about that. It'd need to be escaped (in JSON even, so "\\$"), but that's not the best "user experience". Either way, both implementations would suffer from this, because:

>>> sublime.expand_variables("$undefined", {})
''

You cannot set eslint ${file} and it will, actually magical, just work for different modes.

I did say "for most situations", though. Obviously that's just an estimate, but no linter I use requires --stdin to read the input file from stdin. For example, flake8 understands - just fine.

You get a part of #244 for free for both proposals.

Yes, but with your proposal they'd need to understand the implications of variable usages in cmd first.

there is no real user interest to lint the real file and then use the buffer contents via STDIN in quick alternations.

True, users don't care about this.

This is about 'consistency'. Trying to reduce invalid states of a program.

This can actually be beneficial, too. If a user overrides lint_mode but the plugin didn't provide a command for that mode because it's not supported, the error will probably be easier to understand compared to running the executable with arguments it can't work with. Just theoretical, though, since I'd expect most linters to have all supported modes enabled by default.

@kaste
Copy link
Contributor Author

kaste commented Feb 28, 2018

May I remind you that we have currently

image

for that feature/enhancement. You cannot justify such a huge change for a 1% problem. You cannot say that making cmd a polymorphic type isn't something you avoid already if you can. Then introduce another setting input_mode, also expose lint_mode bc it cannot be satisfied with that. Then either change ${file} so our current system can't use it, or you have to escape it. (With my proposal you don't have to change it.) BTW we always had markers in 'cmd' people had to lookup, we just wanted to give them explicit names, so we introduced e.g. ${args}. I mean in old SL3 we would name ${real_file} probably #. And if the user really wants to tweak cmd, she already expects breakage. If you just set input_mode, it is compared to that more like a setting that should just work.

Let's say I have to think about this more.

@FichteFoll
Copy link
Contributor

Then either change ${file} so our current system can't use it, or you have to escape it. (With my proposal you don't have to change it.)

If cmd is to be provided as a setting, then ${temp_file} and ${real_file} will be replace with nothing. Anyway, let's keep this to when this actually happens, as it's unrelated to this change and both proposals are affected by it.

BTW we always had markers in 'cmd' people had to lookup, we just wanted to give them explicit names, so we introduced e.g. ${args}.

Yes, but you wouldn't need to modify cmd for #244 if lint_mode existed. Only if you actually need to modify the command.


I put the "probably more code" in the disadvantages section for a reason, because I most certainly do believe it will increase code complexity, but don't think it would actually increase UX complexity.

I may just be a single person, but the mere presence of a certain substring inside a setting having a side effect on how that string is used (i.e. changing the lint mode) started ringing alarm bells inside my head. It's not how I would design my software, although I may struggle to articulate for reasons. Similar to saying what is considered "pythonic" and what is not. I guess I prefer having things purpose-built.

Either way, I believe it is my responsibility as a considerate user and developer myself to share ideas that I think might result in improvements. It is not my decision whether to implement them or the way of that implementation, which is why I provided the comparsion earlier and try to make my thought process as easy to follow as I can.

We wouldn't want this to be a breaking change, so take your time. It might also help to ask someone else about their opinion, but they'd have to be involved with the current process and understand the two seperate proposals, which takes at least some time.

@kaste
Copy link
Contributor Author

kaste commented Mar 1, 2018

Now I get this example

>>> sublime.expand_variables("$undefined", {})
''

But this is not true bc e.g. ${args} is currently a marker. So we do sublime.expand_variables("$args", {'args': '__ARGS_MARKER__'}). But we cannot do this with $file bc it has a value that can be used in various settings. Basically we had to introduce a different name/marker for $file.

Now, I summarize the usage again:

Current:

STDIN:  cmd = 'flake8 -'   # done
REAL:   cmd = 'flake8 $file' 
        tempfile_suffix = '-'
TEMP:   cmd = 'flake8 $temp_file'
        tempfile_suffix = 'foobar'  # As long as the linter does not check the filename 
                                    # ext, you can set this to any truthy value except '-'


After this PR:

STDIN:  cmd = 'flake8 -'           # done
REAL:   cmd = 'flake8 $real_file'  # done
TEMP:   cmd = 'flake8 $temp_file'  # done, unless the linter checks the filename ext

# maybe $file_on_disk is a better name
# So, we get rid of tempfile_suffix for common usage

@kaste
Copy link
Contributor Author

kaste commented Mar 1, 2018

Oops, that was posted too early.

@FichteFoll

STDIN:  cmd = 'flake8 $input'
        input_mode = 'stdin'
REAL:   cmd = 'flake8 $input'
        input_mode = 'file_on_disk'
TEMP:   cmd = 'flake8 $input'
        input_mode = 'temp_file'

# T.i. if you substitue $input with - otherwise
        cmd = 'eslint --stdin'
        input_mode = 'stdin'

# This one looks nice for the idealized case, in reality the 'cmd's are usually 
# not the same!

@kaste kaste closed this Mar 26, 2019
@kaste kaste deleted the better-mark-for-file-only-linters branch March 26, 2019 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Introduce a separate attribute for declaring a linter to only operate on the exact files
3 participants