Refine and strengthen diagnostics for problematic URLs #1342

EliahKagan · 2024-04-12T09:03:04Z

This pull request fixed CVE-2024-32884 (RUSTSEC-2024-0335, GHSA-98p4-xjmm-8mfh). Those linked advisories, which are largely equivalent to each other, are the main source of information relevant to these changes.

This PR can be seen as a sequel to #1032, which addressed RUSTSEC-2023-0064 (GHSA-rrjw-j4m2-mf34). As there, https://secure.phabricator.com/T12961 has information on the type of vulnerability patched here.

This pull request fixed a variant of that vulnerability where the non-mandatory username portion of a URL could still begin with a hyphen even when used at the beginning of an argument to an external SSH client, thereby smuggling command-line options to the client. It added and used a user_as_argument method to Url to facilitate matching the safety status of the username, and also an analogous host_as_argument to allow the existing correct host handling code to be made clearer. While adding tests of the fix and the new methods, it also made testing a bit more robust for other variants that were already fixed.

Some more information is in commit messages. Some information of lesser interest than the advisory, commit messages, and the above summary is retained below.

Old PR description

Click to expand

This adds more unit tests and journey tests for existing error messages about URLs that are often not usable as remotes because they cause problems when used with external commands. It also distinguishes a further subcategory of these URLs, adds both kinds of test cases for them, and ensures they are identified and that the associated error messages are accurate.

For background on what led me to some of the new Url methods, see that comment from an intermediate stage. I had at that time thought the existing ambiguous method (and the new ambiguous method I added) should be removed, but now I think that may not be the case, though I've moved code over to using the new ones.

This may be ready, though I strongly suspect some revisions may be needed, stylistically, in naming, or otherwise.

Two things I had thought of while working on this but that seemed better to leave out of this PR, unless requested, were:

Fuzz testing already touches on some of the kinds of inputs that are already handled properly, for which this PR only adds more unit and journey tests. I am unsure if fuzz testing should be extended to cover the other cases that this PR addresses, and if so, if there is anything this PR should do for that.

As before, checking for strange paths is done separately from the gix-url facilities related to that. This doesn't seem like a problem, but I am curious if something ought to be documented to avoid regressions and/or to advise application developers. I haven't identified such an area though; the documentation seems to cover the issues.

The most important points and considerations are not those, but rather are presented outside this PR, such as in code comments/documentation and commit messages.

- Harmonize minor nearby related inconsistencies in documentation. - Add the new assertions that don't require the new method (these are less important than the forthcoming ones that will call it).

This returns `None` if the username begins with a `-`, which would confuse command-line applications. It is analogous to the `Url::host_argument_safe()` and `Url::path_argument_safe()` methods (introduced in d80b5f6), but for usernames rather than hosts or paths.

- Add assertions where user_argument_safe returns None, but because there really is no username, rather than it starting with a `-`. - Add tests where more parts are present. (Most existing tests did not have the optional username part. Most also did/do not have the optional password part, but there's no password_argument_safe method, and probably no need for one because a password is not passed explicitly on the command line as an argument or the front of a larger argument--hopefully never, but if so, then it would be an operand argument following an option argument and would be prevented from being interpreted as an option that way.) - Add comments to clarify the assertions that specifically verify that unsafe parts of URLs are rejected when using *_argument_safe methods.

The committed snapshot out of the output was renamed in f72ecce to fix a spelling error. This updates references to it from gix.sh.

This also adds a snapshot from running them, which shows the expected failure. (However, in this commit the only snapshot that is added is the one that was actually generated on the non-CI test run. This pertains to the `clone` subcommand. Journey tests of the `receive` subcommand are only run on CI, including the new test. It should have a committed snapshot, but that's not included in this commit.) These differ from the ambiguous host tests, in that ambiguous hosts are dangerous in the `gix` command when they appear in an `ssh://` URL (with no `username@` preceding them), while ambiguous paths are instead dangerous in the `gix` command when they appear in a `username@host:path/repo` "URL". The new tests should pass already, and the test of the `clone` subcommand is confirmed to pass (as noted above, journey tests of the `receive` subcommand are only run on CI). This is because ambiguous hosts and ambiguous paths are already handled. In contrast, ambiguous usernames, which have no journey tests yet, are not yet handled in gix-transport.

This test should currently pass, but it, like the other journey tests of the `receive` subcommand, runs only on CI. This is an expected snapshot, so that if the test, code under test, or expectation expressed in the snapshot is wrong, then it will be revealed that there is a problem. This also expresses the expectation to humans, which may help in fixing the new test if it turns out to be wrong. Note however that this does mean that the journey snapshot here is not strictly being used for approval testing.

These should not pass yet, because gix-transport and the gix binary do not yet have special handling to reject ambiguous usernames that can be interpreted as command-line options to ssh clients. These new journey tests may need further refinement.

The tests appear to work, at least partially, in that a test failure, of the kind expected due to gix-transport not yet checking for usernames that ssh clients can interpret as an option argument, is observed now that the current output has snapshots to compare to (the comparison fails, which at this point is intended). Not all tests ran, because of the runner script's fail-fast behavior and because the `receive` subcommand tests only run on CI. While I believe having these snapshots in now will make it eaiser to ensure nothing essential was forgotten, once ambiguous username handling is implemented, this does make these journey tests no longer strictly approval tests (since the snapshots incorporate foreknowledge of the intended behavior).

Not all of these tests can pass yet, since gix-transport does not yet detect and refuse to proceed with leading-hypnen usernames. Some pass; those that do not are, as expected: - ambiguous_user_is_disallowed_explicit_ssh - ambiguous_user_is_disallowed_implicit_ssh - ambiguous_user_and_host_remain_disallowed_together_explicit_ssh - ambiguous_user_and_host_remain_disallowed_together_implicit_ssh This also adds AmbiguousUserName in one of the enums that will need to have it, but nothing fails with this error yet; it is introduced now only to facilitate writing unit tests that assert it.

This detects ambiguous usernames in dangerous cases where they would be passed to external commands to form SSH connections, if they would be misinterpreted as option arguments. This change is analogous to b06a0dd, hardening `gix-transport` and applications that use it against options smuggled in URLs, but for the non-mandatory username portion of a URL, rather than the host and path portions that were covered there. For example, commands like these no longer pass `-F...` options to `ssh`: gix clone 'ssh://-Fconfigfile@example.com/abc' gix clone -- '-Fconfigfile@example.com:abc/def' Instead, they refuse to run `ssh`, producing the error: Error: Username '-Fconfigfile' could be mistaken for a command-line argument

To note that if the `-G` check ever also uses the username, then that has to be checked to ensure it is safe.

This renames it to `looks_like_command_line_option`, because both options and non-option arguments are arguments. The condition it checks for is that something looks like an option and therefore should *not* be used as a command-line argument.

This isn't correct yet, because: - It doesn't compile because of the overcomplicated way I'm doing borrowing. The enum can just have references as type arguments instead. - The enum should be public. (Those are noted here rather than in the next commit that will fix them, because it is not useful to include them in a message generated from it if it is titled as a conventional commit.) In addition, while some existing methods are implemented in terms of these now so there is some test coverage that way, it's far from complete and these really need their own tests.

This adds three methods to `Url`: - `Url::user_as_argument` - `Url::host_as_argument` - `Url::path_as_argument` They return `ArgumentSafety` values that distingiush three cases: 1. There is no wrapped value (`Absent`). 2. The wrapped value is usable and presumably safe (`Usable`). 3. The wrapped value is dangerous and should not be passed as a command-line argument because it could be interpreted as an option due to starting with `-`. The value itself may still be useful and safe to include in error messages. `user_as_argument` and `host_as_argument` are the most useful ones, as they serve as alternatives to `user_argument_safe` and `host_argument_safe` whose return values are unambiguous. The `user_argument_safe` and `host_argument_safe` methods don't distinguish between the absence of a username or host, and the presence of a username or host that is unsafe to pass as an argument. `path_as_argument` included for parity, in case it is useful to write code that handles the three cases similarly. However, there is no underlying ambiguity in the return value of the corresponding `path_argument_safe` method, since unlike `user` and `host`, the `path` is not an option type.

The three new `Url::*_as_argument` methods were only tested indirectly before (and the path one not at all). This adds tests for them so they have the same direct coverage as the corresponding `*_argument_safe` methods. The `Debug` and `PartialEq` traits added to `ArgumentSafety` are primarily added to allow the new assertions to be written in a more streamlined way (without matching).

Right now this is more locked down than necessary, serving to check that ArgumentSafety can be easily matched on in scenarios where one may wish to reject both usernames and hosts that are ambiguous. This is verified by all tests passing except these that fail: - ambiguous_host_is_allowed_with_user_explicit_ssh - ambiguous_host_is_allowed_with_user_implicit_ssh Prohibiting ambiguous hosts even when the username is present and cannot itself be confused with a command-line option isn't needed in gix-transport, which passes URLs of the form `user@host` anytime the username is present. However, if an application ever passes the username as a separate argument rather than as part of the netloc with the host, then being able to express that in a natural and non-error-prone way is important.

Since the `user@` prefix makes it safe as gix-transport uses it, and there are test cases specifying that it should be permitted.

This is to test a code path where a host is passed to an ssh command and it's important to ensure it doesn't start with `-`. This code path uses `-G` to help figure out what SSH client is being used, and most of the time it is not followed. The existing check is sufficiently simple it is most likely correct. But it seems never to to have been covered in automated tests: neither unit tests nor any journey tests. This does not yet manage to test is successfully, because it does not succeed at getting that specific code path to be followed.

This test was absent before, but seems important to add to verify that the code there was correct (it was), but also because I plan to refactor that code to use `host_as_argument` rather than the older (in some cases potentially ambiguous) `host_argument_safe`. This is specific to the case where a command like `ssh -G` -- but where `ssh` is an uncertain command such that gix-transport tries to figure out what is -- is used. A netloc is passed as part of that, but the netloc used consists only of a host and not a username, even when this is a step in a longer process that will later use (and check the safety of) the username. So only the host has to be checked (at that stage) to make sure it cannot masquerade as an option argument.

Instead of `Url::host_argument_safe()`. This also removes the comment about how the username should be handled if added, mostly because I didn't really need to add that comment.

- put more standard-derives on `ArgumentSafety` to allow more convenient usage. - remove `path_as_argument()` method as it wasn't adding anything. This allowed the `ArgumentSafety` wrapper to be simpler. - turn assertion comments into messages.

Byron

This is a fantastic PR and it's like you have been doing Rust for a long time.

I love how you picked up on the commenting style to selectively incorporate changelog messages. On that note, I have a removed a method that was certainly mentioned in a commit, and I accept that the changelog will now be slightly misleading without rewriting history. It's a systematic shortcoming.

Further, the addition of ArgumentSafety is a great one which leads to nicer error messages than was possible before. The previous implementation didn't handle this very well as a the possible states weren't correctly represented.

I will now see to creating a release once CI is green.

Byron · 2024-04-13T08:28:48Z

tests/journey/gix.sh

@@ -345,12 +345,33 @@ title "gix commit-graph"
              }
            )
          )
+          (with "an ambiguous ssh username which could be mistaken for an argument"


I appreciate having this tested in all layers of the application, making it very clear that the safety checks are actually made in the right place and effective.

This comment is just to mention that I avoid adding more journey tests these days especially if we know that the gix (CLI) has to go through plumbing crates, which are typically tested on unit or integration level, hence we would only duplicate efforts here.

These days, I'd only use journey test if the behaviour is truly not tested anywhere else, which is also something I try to avoid. Journey tests aren't used anymore to pin down features or behaviour of the CLI either, as it's a 'can have breaking changes at any time' development tool. Rust is typically great at assuring that if it compiles, it works, assuming that each plumbing crate and the gix crate have the needed tests to validate behaviour.

Byron · 2024-04-13T08:31:53Z

tests/snapshots/plumbing/no-repo/pack/clone/fail-ambiguous-path

@@ -0,0 +1 @@
+[2KError: The repository path '-oProxyCommand=open$IFS-aCalculator/bar' could be mistaken for a command-line argument


It's interesting that it seems to inject terminal escape codes ([2K) here, but not in the snapshots further below.
I'd expect it not to do that, but it's not important enough to even look into it right now.

Byron · 2024-04-13T09:15:02Z

gix-transport/src/client/blocking_io/ssh/program_kind.rs

-                host.into()
-            }
+
+        let host_maybe_with_user_as_ssh_arg = match (url.user_as_argument(), url.host_as_argument()) {


This is a great use of match and one of the main reasons Rust is so powerful! Never again can one miss handling a possible program state, as long as one remembers to match on them that is :).

EliahKagan · 2024-04-13T21:09:25Z

Now that the advisory GHSA-98p4-xjmm-8mfh is published, I've updated this PR description to link to it and provide a small amount of additional information. In the near future, I'll likely edit it further to add more information, including about the approach taken, as well as to give it a clearer title.

I hope it's okay that I have also changed the "Affected versions" field of the advisory itself (the repository-local advisory, since as far as I know it is not yet in advisory databases) from <= 0.41.3 to < 0.42.0. This makes clearer that there are no existing or forthcoming versions of the gix-transport crate lower than 0.42.0 that contain a fix. If this change is not wanted, it can of course be undone.

I am uncertain if there are any other crates--such as related to the gix CLI tool?--that should be listed as affected. If so, they could be added to the advisory. This can be done without replacing gix-transport, which is the most important affected crate: the advisory could list both.

Byron · 2024-04-14T06:04:50Z

I hope it's okay that I have also changed the "Affected versions" field of the advisory itself (the repository-local advisory, since as far as I know it is not yet in advisory databases) from <= 0.41.3 to < 0.42.0. This makes clearer that there are no existing or forthcoming versions of the gix-transport crate lower than 0.42.0 that contain a fix. If this change is not wanted, it can of course be undone.

Yes, thanks a lot for catching this. I somehow behaved like the bound of unpatched versions can't be changed.

I am uncertain if there are any other crates--such as related to the gix CLI tool?--that should be listed as affected. If so, they could be added to the advisory. This can be done without replacing gix-transport, which is the most important affected crate: the advisory could list both.

Right, I wasn't aware that multiple items can be added. Instead, I added the affected items downstream as a 'Tip' right after the summary. I have corrected that by adding two more items accordingly and removed the 'Tip'.

EliahKagan · 2024-04-15T04:04:25Z

I see the RUSTSEC advisory was published! 🎉

I've updated the pull request description to include a link to that, but I still do plan to expand it further later, as noted.

EliahKagan · 2024-04-15T19:25:38Z

It looks like I should have included CWE-88 in the weaknesses list in GHSA-98p4-xjmm-8mfh. Its Example 1 clarifies that it encompasses this kind of vulnerability. That also looks like the (only) CWE that was attempted to be listed in the earlier GHSA-rrjw-j4m2-mf34. (That actually lists no CWEs, but CWE-88 was intended.) I can edit the repository advisory to add CWE-88, and it looks like RUSTSEC advisories don't list CWEs so nothing has to be added there.

But should CWE-88 be added as a second CWE, or should it replace CWE-77? CWE-77 is often overused, but I think this is mostly because of uses where CWE-78 should've been used instead. It seems to me that CWE-77 still applies and may not be completely covered by CWE-88, but I'm not sure.

Edit: Updated to clarify the relevance of CWE-88 in the history of the earlier advisory.

Byron · 2024-04-15T19:41:15Z

To me, as a layperson (strangely enough), and judging only by the title, they all seem pretty similar and I feel that they all somewhat apply. Since CWE assignments are probably useful for people to find (or better understand) an advisory, maybe having multiple slightly overlapping CWEs isn't the worst thing to do.

But I am definitely not sure either.

EliahKagan · 2024-04-15T19:54:22Z

That sounds reasonable. I'll try to look a bit more into how, if at all, CWE-77 and CWE-88 tend to be used together, then I'll add CWE-88 while also keeping CWE-77 unless I've found a reason to do otherwise.

riking · 2024-04-16T16:22:17Z

Although ssh supports aliases for hosts, even if git-upload-pack could be made an alias, that is made difficult by the URL-encoding transformation.

Note that DNS Search Domains could turn that into a valid domain: git-upload-pack.corp.example.com

However, it will be difficult to place a malicious host in a machine's search domains, as the search domains typically have common control with the machine they are configured on.

EliahKagan added 13 commits April 11, 2024 21:51

Reorder tests and add username assertion placeholders

6cbe65d

Prepare for adding Url::user_argument_safe

1bdfdd9

- Harmonize minor nearby related inconsistencies in documentation. - Add the new assertions that don't require the new method (these are less important than the forthcoming ones that will call it).

Clarify circumstances when *_argument_safe should be used

e8e2463

Fix ambiguous host journey test snapshot names

0356345

The committed snapshot out of the output was renamed in f72ecce to fix a spelling error. This updates references to it from gix.sh.

Add ambiguous username journey tests

d639b3b

These should not pass yet, because gix-transport and the gix binary do not yet have special handling to reject ambiguous usernames that can be interpreted as command-line options to ssh clients. These new journey tests may need further refinement.

Comment gix_transport::client::blocking_io::ssh::connect

Loading
Loading status checks…

2e7517e

To note that if the `-G` check ever also uses the username, then that has to be checked to ensure it is safe.

EliahKagan force-pushed the strange-usernames branch from df7db43 to 2e7517e Compare April 12, 2024 16:03

EliahKagan added 10 commits April 12, 2024 18:09

Reallow user@-arg... in prepare_invocation

2911623

Since the `user@` prefix makes it safe as gix-transport uses it, and there are test cases specifying that it should be permitted.

Use Url::host_as_argument() in ssh::connect()

cf59f57

Instead of `Url::host_argument_safe()`. This also removes the comment about how the username should be handled if added, mostly because I didn't really need to add that comment.

(Re)add a short, more specific comment about user@

Loading
Loading status checks…

03fb64a

EliahKagan force-pushed the strange-usernames branch from 28a5018 to 03fb64a Compare April 13, 2024 00:40

EliahKagan marked this pull request as ready for review April 13, 2024 01:07

Byron self-requested a review April 13, 2024 08:10

Byron added 2 commits April 13, 2024 10:55

refactor gix-url

09311b0

- put more standard-derives on `ArgumentSafety` to allow more convenient usage. - remove `path_as_argument()` method as it wasn't adding anything. This allowed the `ArgumentSafety` wrapper to be simpler. - turn assertion comments into messages.

refactor gix-transport with minor edits to comments

Loading
Loading status checks…

996310b

Byron approved these changes Apr 13, 2024

View reviewed changes

Byron merged commit 1272542 into GitoxideLabs:main Apr 13, 2024
19 checks passed

EliahKagan deleted the strange-usernames branch April 13, 2024 14:36

This was referenced Apr 26, 2024

[GHSA-98p4-xjmm-8mfh] gix-transport indirect code execution via malicious username github/advisory-database#4316

Merged

[GHSA-rrjw-j4m2-mf34] gix-transport code execution vulnerability github/advisory-database#4317

Merged

EliahKagan mentioned this pull request Jun 27, 2024

gix clone ignores global core.symlinks on Windows #1353

Open

EliahKagan mentioned this pull request Nov 7, 2024

Test cross-platform symlink and +x permission expectations on Windows #1652

Merged

EliahKagan mentioned this pull request Jan 26, 2025

various improvements #1800

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

Refine and strengthen diagnostics for problematic URLs #1342

Refine and strengthen diagnostics for problematic URLs #1342

EliahKagan commented Apr 12, 2024 •

edited

Loading

Byron left a comment

Byron Apr 13, 2024

Byron Apr 13, 2024

Byron Apr 13, 2024

EliahKagan commented Apr 13, 2024 •

edited

Loading

Byron commented Apr 14, 2024

EliahKagan commented Apr 15, 2024 •

edited

Loading

EliahKagan commented Apr 15, 2024 •

edited

Loading

Byron commented Apr 15, 2024

EliahKagan commented Apr 15, 2024

riking commented Apr 16, 2024

		@@ -0,0 +1 @@
		[2KError: The repository path '-oProxyCommand=open$IFS-aCalculator/bar' could be mistaken for a command-line argument

Refine and strengthen diagnostics for problematic URLs #1342

Refine and strengthen diagnostics for problematic URLs #1342

Conversation

EliahKagan commented Apr 12, 2024 • edited Loading

Old PR description

Byron left a comment

Choose a reason for hiding this comment

Byron Apr 13, 2024

Choose a reason for hiding this comment

Byron Apr 13, 2024

Choose a reason for hiding this comment

Byron Apr 13, 2024

Choose a reason for hiding this comment

EliahKagan commented Apr 13, 2024 • edited Loading

Byron commented Apr 14, 2024

EliahKagan commented Apr 15, 2024 • edited Loading

EliahKagan commented Apr 15, 2024 • edited Loading

Byron commented Apr 15, 2024

EliahKagan commented Apr 15, 2024

riking commented Apr 16, 2024

EliahKagan commented Apr 12, 2024 •

edited

Loading

EliahKagan commented Apr 13, 2024 •

edited

Loading

EliahKagan commented Apr 15, 2024 •

edited

Loading

EliahKagan commented Apr 15, 2024 •

edited

Loading