Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine and strengthen diagnostics for problematic URLs #1342

Merged
merged 25 commits into from Apr 13, 2024

Commits on Apr 11, 2024

  1. Configuration menu
    Copy the full SHA
    6cbe65d View commit details
    Browse the repository at this point in the history
  2. Prepare for adding Url::user_argument_safe

    - Harmonize minor nearby related inconsistencies in documentation.
    
    - Add the new assertions that don't require the new method (these
      are less important than the forthcoming ones that will call it).
    EliahKagan committed Apr 11, 2024
    Configuration menu
    Copy the full SHA
    1bdfdd9 View commit details
    Browse the repository at this point in the history

Commits on Apr 12, 2024

  1. feat: Add Url::user_argument_safe()

    This returns `None` if the username begins with a `-`, which would
    confuse command-line applications.
    
    It is analogous to the `Url::host_argument_safe()` and
    `Url::path_argument_safe()` methods (introduced in d80b5f6), but
    for usernames rather than hosts or paths.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    db40382 View commit details
    Browse the repository at this point in the history
  2. Expand and introduce tests of more combinations

    - Add assertions where user_argument_safe returns None, but because
      there really is no username, rather than it starting with a `-`.
    
    - Add tests where more parts are present. (Most existing tests did
      not have the optional username part. Most also did/do not have
      the optional password part, but there's no password_argument_safe
      method, and probably no need for one because a password is not
      passed explicitly on the command line as an argument or the front
      of a larger argument--hopefully never, but if so, then it would
      be an operand argument following an option argument and would be
      prevented from being interpreted as an option that way.)
    
    - Add comments to clarify the assertions that specifically verify
      that unsafe parts of URLs are rejected when using *_argument_safe
      methods.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    169a698 View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    e8e2463 View commit details
    Browse the repository at this point in the history
  4. Fix ambiguous host journey test snapshot names

    The committed snapshot out of the output was renamed in f72ecce to
    fix a spelling error. This updates references to it from gix.sh.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    0356345 View commit details
    Browse the repository at this point in the history
  5. Add ambiguous path journey tests

    This also adds a snapshot from running them, which shows the
    expected failure. (However, in this commit the only snapshot that
    is added is the one that was actually generated on the non-CI test
    run. This pertains to the `clone` subcommand. Journey tests of the
    `receive` subcommand are only run on CI, including the new test. It
    should have a committed snapshot, but that's not included in this
    commit.)
    
    These differ from the ambiguous host tests, in that ambiguous hosts
    are dangerous in the `gix` command when they appear in an `ssh://`
    URL (with no `username@` preceding them), while ambiguous paths are
    instead dangerous in the `gix` command when they appear in a
    `username@host:path/repo` "URL".
    
    The new tests should pass already, and the test of the `clone`
    subcommand is confirmed to pass (as noted above, journey tests of
    the `receive` subcommand are only run on CI). This is because
    ambiguous hosts and ambiguous paths are already handled. In
    contrast, ambiguous usernames, which have no journey tests yet, are
    not yet handled in gix-transport.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    8c94fd4 View commit details
    Browse the repository at this point in the history
  6. Add expected snapshot for ambiguous path receive

    This test should currently pass, but it, like the other journey
    tests of the `receive` subcommand, runs only on CI. This is an
    expected snapshot, so that if the test, code under test, or
    expectation expressed in the snapshot is wrong, then it will be
    revealed that there is a problem.
    
    This also expresses the expectation to humans, which may help in
    fixing the new test if it turns out to be wrong. Note however that
    this does mean that the journey snapshot here is not strictly being
    used for approval testing.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    a38f646 View commit details
    Browse the repository at this point in the history
  7. Add ambiguous username journey tests

    These should not pass yet, because gix-transport and the gix binary
    do not yet have special handling to reject ambiguous usernames that
    can be interpreted as command-line options to ssh clients.
    
    These new journey tests may need further refinement.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    d639b3b View commit details
    Browse the repository at this point in the history
  8. Add expected snapshots for ambiguous username journey tests

    The tests appear to work, at least partially, in that a test
    failure, of the kind expected due to gix-transport not yet checking
    for usernames that ssh clients can interpret as an option argument,
    is observed now that the current output has snapshots to compare to
    (the comparison fails, which at this point is intended).
    
    Not all tests ran, because of the runner script's fail-fast
    behavior and because the `receive` subcommand tests only run on CI.
    
    While I believe having these snapshots in now will make it eaiser
    to ensure nothing essential was forgotten, once ambiguous username
    handling is implemented, this does make these journey tests no
    longer strictly approval tests (since the snapshots incorporate
    foreknowledge of the intended behavior).
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    a51003d View commit details
    Browse the repository at this point in the history
  9. Add ambiguous user unit tests, and more for hostname

    Not all of these tests can pass yet, since gix-transport does not
    yet detect and refuse to proceed with leading-hypnen usernames.
    Some pass; those that do not are, as expected:
    
    - ambiguous_user_is_disallowed_explicit_ssh
    - ambiguous_user_is_disallowed_implicit_ssh
    - ambiguous_user_and_host_remain_disallowed_together_explicit_ssh
    - ambiguous_user_and_host_remain_disallowed_together_implicit_ssh
    
    This also adds AmbiguousUserName in one of the enums that will need
    to have it, but nothing fails with this error yet; it is introduced
    now only to facilitate writing unit tests that assert it.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    5428609 View commit details
    Browse the repository at this point in the history
  10. fix: Prevent usernames with leading - from being passed to SSH

    This detects ambiguous usernames in dangerous cases where they
    would be passed to external commands to form SSH connections, if
    they would be misinterpreted as option arguments.
    
    This change is analogous to b06a0dd, hardening `gix-transport` and
    applications that use it against options smuggled in URLs, but for
    the non-mandatory username portion of a URL, rather than the host
    and path portions that were covered there.
    
    For example, commands like these no longer pass `-F...` options to
    `ssh`:
    
        gix clone 'ssh://-Fconfigfile@example.com/abc'
        gix clone -- '-Fconfigfile@example.com:abc/def'
    
    Instead, they refuse to run `ssh`, producing the error:
    
        Error: Username '-Fconfigfile' could be mistaken for a command-line argument
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    f56ad39 View commit details
    Browse the repository at this point in the history
  11. Comment gix_transport::client::blocking_io::ssh::connect

    To note that if the `-G` check ever also uses the username, then
    that has to be checked to ensure it is safe.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    2e7517e View commit details
    Browse the repository at this point in the history
  12. Give looks_like_argument a more accurate name

    This renames it to `looks_like_command_line_option`, because both
    options and non-option arguments are arguments. The condition it
    checks for is that something looks like an option and therefore
    should *not* be used as a command-line argument.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    beef8d2 View commit details
    Browse the repository at this point in the history
  13. Sketch *_as_argument methods and supporting enum

    This isn't correct yet, because:
    
    - It doesn't compile because of the overcomplicated way I'm doing
      borrowing. The enum can just have references as type arguments
      instead.
    
    - The enum should be public.
    
    (Those are noted here rather than in the next commit that will fix
    them, because it is not useful to include them in a message
    generated from it if it is titled as a conventional commit.)
    
    In addition, while some existing methods are implemented in terms
    of these now so there is some test coverage that way, it's far from
    complete and these really need their own tests.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    29941e6 View commit details
    Browse the repository at this point in the history
  14. feat: Add ArgumentSafety and Url::*_as_argument() methods

    This adds three methods to `Url`:
    
    - `Url::user_as_argument`
    - `Url::host_as_argument`
    - `Url::path_as_argument`
    
    They return `ArgumentSafety` values that distingiush three cases:
    
    1. There is no wrapped value (`Absent`).
    
    2. The wrapped value is usable and presumably safe (`Usable`).
    
    3. The wrapped value is dangerous and should not be passed as a
       command-line argument because it could be interpreted as an
       option due to starting with `-`. The value itself may still be
       useful and safe to include in error messages.
    
    `user_as_argument` and `host_as_argument` are the most useful ones,
    as they serve as alternatives to `user_argument_safe` and
    `host_argument_safe` whose return values are unambiguous. The
    `user_argument_safe` and `host_argument_safe` methods don't
    distinguish between the absence of a username or host, and the
    presence of a username or host that is unsafe to pass as an
    argument.
    
    `path_as_argument` included for parity, in case it is useful to
    write code that handles the three cases similarly. However, there
    is no underlying ambiguity in the return value of the corresponding
    `path_argument_safe` method, since unlike `user` and `host`, the
    `path` is not an option type.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    5457998 View commit details
    Browse the repository at this point in the history
  15. Give ArgumentSafety traits; test *_as_argument methods

    The three new `Url::*_as_argument` methods were only tested
    indirectly before (and the path one not at all). This adds tests
    for them so they have the same direct coverage as the corresponding
    `*_argument_safe` methods.
    
    The `Debug` and `PartialEq` traits added to `ArgumentSafety` are
    primarily added to allow the new assertions to be written in a more
    streamlined way (without matching).
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    1b0af07 View commit details
    Browse the repository at this point in the history
  16. Start on using {user,host}_as_argument in prepare_invocation

    Right now this is more locked down than necessary, serving to check
    that ArgumentSafety can be easily matched on in scenarios where one
    may wish to reject both usernames and hosts that are ambiguous.
    
    This is verified by all tests passing except these that fail:
    
    - ambiguous_host_is_allowed_with_user_explicit_ssh
    - ambiguous_host_is_allowed_with_user_implicit_ssh
    
    Prohibiting ambiguous hosts even when the username is present and
    cannot itself be confused with a command-line option isn't needed
    in gix-transport, which passes URLs of the form `user@host` anytime
    the username is present. However, if an application ever passes
    the username as a separate argument rather than as part of the
    netloc with the host, then being able to express that in a natural
    and non-error-prone way is important.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    4dda375 View commit details
    Browse the repository at this point in the history
  17. Reallow user@-arg... in prepare_invocation

    Since the `user@` prefix makes it safe as gix-transport uses it,
    and there are test cases specifying that it should be permitted.
    EliahKagan committed Apr 12, 2024
    Configuration menu
    Copy the full SHA
    2911623 View commit details
    Browse the repository at this point in the history

Commits on Apr 13, 2024

  1. Try, so far unsuccessfully, to add missing -G test

    This is to test a code path where a host is passed to an ssh
    command and it's important to ensure it doesn't start with `-`.
    This code path uses `-G` to help figure out what SSH client is
    being used, and most of the time it is not followed. The existing
    check is sufficiently simple it is most likely correct. But it
    seems never to to have been covered in automated tests: neither
    unit tests nor any journey tests.
    
    This does not yet manage to test is successfully, because it does
    not succeed at getting that specific code path to be followed.
    EliahKagan committed Apr 13, 2024
    Configuration menu
    Copy the full SHA
    524739b View commit details
    Browse the repository at this point in the history
  2. Test that leading-- host names aren't used in -G check

    This test was absent before, but seems important to add to verify
    that the code there was correct (it was), but also because I plan
    to refactor that code to use `host_as_argument` rather than the
    older (in some cases potentially ambiguous) `host_argument_safe`.
    
    This is specific to the case where a command like `ssh -G` -- but
    where `ssh` is an uncertain command such that gix-transport tries
    to figure out what is -- is used. A netloc is passed as part of
    that, but the netloc used consists only of a host and not a
    username, even when this is a step in a longer process that will
    later use (and check the safety of) the username. So only the host
    has to be checked (at that stage) to make sure it cannot masquerade
    as an option argument.
    EliahKagan committed Apr 13, 2024
    Configuration menu
    Copy the full SHA
    902367f View commit details
    Browse the repository at this point in the history
  3. Use Url::host_as_argument() in ssh::connect()

    Instead of `Url::host_argument_safe()`.
    
    This also removes the comment about how the username should be
    handled if added, mostly because I didn't really need to add that
    comment.
    EliahKagan committed Apr 13, 2024
    Configuration menu
    Copy the full SHA
    cf59f57 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    03fb64a View commit details
    Browse the repository at this point in the history
  5. refactor gix-url

    - put more standard-derives on `ArgumentSafety` to allow more convenient usage.
    - remove `path_as_argument()` method as it wasn't adding anything. This allowed the `ArgumentSafety` wrapper to be simpler.
    - turn assertion comments into messages.
    Byron committed Apr 13, 2024
    Configuration menu
    Copy the full SHA
    09311b0 View commit details
    Browse the repository at this point in the history
  6. Configuration menu
    Copy the full SHA
    996310b View commit details
    Browse the repository at this point in the history