Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed file picker #5645

Merged
merged 15 commits into from Mar 31, 2024
Merged

Conversation

xJonathanLEI
Copy link
Contributor

@xJonathanLEI xJonathanLEI commented Jan 23, 2023

I just started learning Git internals so I might be doing things wrong. Please feel free to correct me :)

See known issues here

Connects #5362.

This PR adds a "Git status" file picker that's intended to mimic the list of files from a git status command invocation.

I find this feature very useful for picking up previous work when launching a new editing session. In fact, it's useful even when you're in the middle of a session - as you open more buffers from LSP commands etc. it gets harder and harder to find the files you're actually editing, as the number of buffers opened grows.

I set the default key binding of this picker to <SPACE>g and moved the deubgging menu to <SPACE>G because I think it makes the most sense:

  • we already have ]g and [g for navigating through the VCS changes; and
  • <SPACE>f and <SPACE>g are close both in location and functionality.

I added the sha1 dependency as SHA1 performance matters a lot. This is the same crate gitoxide uses if we enable the fast-sha1 feature on the crate (we didn't). I could have enabled the feature and use the Sha1 from gitoxide instead of importing it myself. I didn't do it beacuse unfortuantely gitoxide uses a wrapper around Sha1 and the wrapper doesn't implement Write, making it impossible to utilize the std::io::copy method.

This PR is obviously in no way complete. I'm still submitting it first as I believe this feature has a lot of value even in its current form. Missing stuffs (at least):

  • parallelization of working tree traversal and SHA1 calculation
  • detection of renaming w/ minor edits (which git status is able to do)
  • handling of CRLF transformation of huge files (see code comments)

(edit: these are no longer the case with the latest gix integration; see thread for details)

With that said, the feature itself works fine. It's been tested on Ubuntu, Windows 11 (with autocrlf of course, not WSL), macOS. All working perfectly.

Screenshot on macOS with stock config:

Screenshot on macOS

@pascalkuthe pascalkuthe added C-bug Category: This is a bug E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer. A-vcs Area: Version control system interaction C-enhancement Category: Improvements and removed C-bug Category: This is a bug labels Jan 23, 2023
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I haven't looked at the implementation too much but here are already things I noticed while scanning

helix-term/src/commands.rs Outdated Show resolved Hide resolved
helix-term/src/commands.rs Outdated Show resolved Hide resolved
@haikyuu
Copy link

haikyuu commented Jan 23, 2023

This will definitely become my most used picker right away. Thanks for working on this @xJonathanLEI 🙏

@xJonathanLEI xJonathanLEI changed the title Git status file picker Changed file picker Jan 23, 2023
@pascalkuthe pascalkuthe self-requested a review January 23, 2023 16:05
@Byron
Copy link
Contributor

Byron commented Jan 23, 2023

I didn't do it beacuse unfortuantely gitoxide uses a wrapper around Sha1 and the wrapper doesn't implement Write, making it impossible to utilize the std::io::copy method.

Good news, what you were looking for actually exists :)

  • parallelization of working tree traversal and SHA1 calculation
  • detection of renaming w/ minor edits (which git status is able to do)
  • handling of CRLF transformation of huge files (see code comments)

Even though nothing speaks against having a first version of these in this codebase, especially if you have the time and fun doing it, I truly think the right place for these capabilities is gitoxide itself. @pascalkuthe and I will be putting our heads together to find a good solution for this so that everybody is happy.

Please note that right now I am super busy with the cargo integration and thus can't jump on this myself, but I will support the efforts together with you to be part of getting the best possible solution both for helix-editor and for gitoxide :).

Exciting!

@xJonathanLEI
Copy link
Contributor Author

First off thank you for taking your time looking at this! @Byron

Good news, what you were looking for actually exists :)

Thanks for thr pointer! I somehow missed this type when looking at that file. However, it looks like a write multiplxer: it writes to an internal writer and the hasher at the same time. I don't really have the second writer to write to and actually only want the Sha1 wrapper to implement Write (which is reasonable to do I assume given it's just a newtype wrapper). Should I submit the Write implementation patch to gitoxide and wait for a new release for this PR?

I truly think the right place for these capabilities is gitoxide itself.

I agree. I started out building this PR by looking for similar features in gitoxide but no luck. IMO having the majority (other than the UI part of course) of this PR implemented in gitoxide makes the most sense simply because of specialization - Helix gets incremental improvements over time for free.

That's why I stopped after having this (reasonably) working POC - leaving the hard part (perf & edge cases) to Git masters like yourself and @pascalkuthe. I do see Helix abandoning this implementation in favor of a native out-of-the-box solution from gitoxide in the future.

On the other hand, I do believe this would be a highly wanted feature itself. That's why I'm still submitting it depsite things not being perfect.

@pascalkuthe and I will be putting our heads together to find a good solution for this so that everybody is happy.

Amazing! We're in good hands!

@xJonathanLEI
Copy link
Contributor Author

While using this on one my of other repos, I realized that it's wrong that I thought a Commit entry cannot appear in the HEAD tree - turns out it can when it's a submodule. As a result, the diff provider chokes whenever there's a submodule. I just pushed a patch that simply ignores all submodules for now.

What do maintainers think? Should I complete the support for change detection in submodules in this PR or do we leave it for future work?

I have a feeling that the UX here is kinda hard to get right. If we just extend the current changed file list to include the submodule files, things are actually pretty messy: files are listed together but you can't really commit them together - they belong to different repositories. We probably need some kind of grouping when submodules exist.

@xJonathanLEI
Copy link
Contributor Author

Just realized there's a small bug: on Linux if a file is was committed as CRLF then it would be incorrectly detected as modified. This is actually not a new bug per ser. I followed the CRLF handling logic in #4995 by @pascalkuthe. Unsurprisingly, the file that is getting incorrectly flagged also has Git gutters on all lines.

I had a feeling that our CRLF handling was too naive and it turns out to be true. I guess we need to mimic how Git handles the option to get rid of this issue (along with the gutters).

@xJonathanLEI
Copy link
Contributor Author

xJonathanLEI commented Jan 24, 2023

Update on the CRLF issue: after investigating more I realized it's not related to the CRLF handling logic itself. The project I encountered this issue on has a .gitattributes file:

# Default.
*           text eol=lf

# Treat patch files as binaries but let diff'ing them
# as normal text.
*.diff      binary diff
*.patch     binary diff
*.patch32   binary diff
*.patch64   binary diff
*.patch.*   binary diff

# Powershell scripts.
*.ps1       text eol=crlf

# Binaries.
*.gpg       binary
*.gz        binary
*.jpg       binary
*.png       binary
*.tar       binary
*.tar.*     binary

To resolve this we need to be able to respect this file instead of blindly following core.autocrlf. I already implemented binary file detection but still need to add support for this one.

Again I'm not sure if this should be left as future work. Input from maintainers would be nice. Thanks!

Edit: just read #4995 more carefully and turns out it's a known issue

helix-vcs/src/git.rs Outdated Show resolved Hide resolved
@pascalkuthe
Copy link
Member

I am pretty sure the diff gutter doesn't work for submodules either so I would be fine with leaving that as future work for. Especially because git status also doesn't report submodule changes I believe

Update on the CRLF issue: after investigating more I realized it's not related to the CRLF handling logic itself. The project I encountered this issue on has a .gitattributes file:

# Default.
*           text eol=lf

# Treat patch files as binaries but let diff'ing them
# as normal text.
*.diff      binary diff
*.patch     binary diff
*.patch32   binary diff
*.patch64   binary diff
*.patch.*   binary diff

# Powershell scripts.
*.ps1       text eol=crlf

# Binaries.
*.gpg       binary
*.gz        binary
*.jpg       binary
*.png       binary
*.tar       binary
*.tar.*     binary

To resolve this we need to be able to respect this file instead of blindly following core.autocrlf. I already implemented binary file detection but still need to add support for this one.

Again I'm not sure if this should be left as future work. Input from maintainers would be nice. Thanks!

Edit: just read #4995 more carefully and turns out it's a known issue

I believe attributes aren't fully supported yet in gitoxide. The diff gutter doesn't support then for this reason either so its fine if you ignore them to now

@xJonathanLEI
Copy link
Contributor Author

PR conflicts with master now due to Cargo.lock. I just rebased and squashed.

@poliorcetics
Copy link
Contributor

Resolves #5362.

This PR doesn't resolve it, that's not the (only) picker I had in mind for the issue

self.providers
.iter()
.find_map(|provider| provider.get_changed_files(cwd).ok())
.ok_or_else(|| anyhow::anyhow!("no diff provider returns success"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this better than using a negation, it avoids having 'success' as the last word

Suggested change
.ok_or_else(|| anyhow::anyhow!("no diff provider returns success"))
.ok_or_else(|| anyhow::anyhow!("all diff providers failed to list changed files"))

fn git_meta_from_path(
path: &Path,
autocrlf: bool,
) -> Result<Option<(FileEntryMode, ObjectId)>, std::io::Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
) -> Result<Option<(FileEntryMode, ObjectId)>, std::io::Error> {
) -> std::io::Result<Option<(FileEntryMode, ObjectId)>> {

Comment on lines 348 to 393
// Windows doesn't support symlinks. This block runs fine but is just wasting CPU cycles.
#[cfg(not(windows))]
match path.symlink_metadata() {
Ok(meta) => {
if meta.is_symlink() {
let link_content = std::fs::read_link(path)?;
let link_content = link_content.to_string_lossy();
let link_content = link_content.as_bytes();

let mut hasher = sha1::Sha1::default();
hasher.update(b"blob ");
hasher.update(format!("{}", link_content.len()).as_bytes());
hasher.update(b"\0");
hasher.update(link_content);

let hash: [u8; 20] = hasher.finalize().into();

return Ok(Some((FileEntryMode::Link, ObjectId::from(hash))));
}
}
Err(_) => return Ok(None),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Windows doesn't support symlinks. This block runs fine but is just wasting CPU cycles.
#[cfg(not(windows))]
match path.symlink_metadata() {
Ok(meta) => {
if meta.is_symlink() {
let link_content = std::fs::read_link(path)?;
let link_content = link_content.to_string_lossy();
let link_content = link_content.as_bytes();
let mut hasher = sha1::Sha1::default();
hasher.update(b"blob ");
hasher.update(format!("{}", link_content.len()).as_bytes());
hasher.update(b"\0");
hasher.update(link_content);
let hash: [u8; 20] = hasher.finalize().into();
return Ok(Some((FileEntryMode::Link, ObjectId::from(hash))));
}
}
Err(_) => return Ok(None),
};
// Windows doesn't support symlinks. This block runs fine but is just wasting CPU cycles.
if cfg!(not(windows)) {
match path.symlink_metadata() {
Ok(meta) => {
if meta.is_symlink() {
let link_content = std::fs::read_link(path)?;
let link_content = link_content.to_string_lossy();
let link_content = link_content.as_bytes();
let mut hasher = sha1::Sha1::default();
hasher.update(b"blob ");
hasher.update(format!("{}", link_content.len()).as_bytes());
hasher.update(b"\0");
hasher.update(link_content);
let hash: [u8; 20] = hasher.finalize().into();
return Ok(Some((FileEntryMode::Link, ObjectId::from(hash))));
}
}
Err(_) => return Ok(None),
}
}

When the code compiles on all platform, I find it better to use cfg!(...) because it allows autocompletion everywhere, even when not compiling for the targted feature or arch, unlike #[cfg(...)], so it provides a nicer contributor experience :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation level can be reduced using a tuple match, but I'm not sure it's worth it here

Comment on lines 372 to 490
Ok(match path.metadata() {
Ok(meta) => {
if meta.is_file() {
let entry_mode = {
#[cfg(unix)]
{
use std::os::unix::prelude::PermissionsExt;
if meta.permissions().mode() & 0o111 != 0 {
FileEntryMode::BlobExecutable
} else {
FileEntryMode::Blob
}
}

#[cfg(not(unix))]
{
FileEntryMode::Blob
}
};

let oid = {
let mut file = std::fs::File::open(path)?;

// `git::features::hash::Sha1` doesn't implement `Write` so we use the
// underlying crate directly for max perf.
let mut hasher = sha1::Sha1::default();
hasher.update(b"blob ");

if autocrlf {
// When autocrlf is on, we either have to fit the whole file into memory,
// or we read the file twice. Either way is not optimal. How should we
// handle this?
//
// With the current implementation, there's no way we can handle huge files
// that do not fit into memory. Maybe we can set a size limit? Anything
// over a certain size will simply be read twice: once for getting the
// normalized size, and once for the hasher updates?
const BUFFER_SIZE: usize = 8 * 1024;
let mut buffer = [0u8; BUFFER_SIZE];

let mut len = file.read(&mut buffer)?;
if content_inspector::inspect(&buffer[..len])
== content_inspector::ContentType::BINARY
{
// No CRLF handling! We update the part already read + the remaining
// content in the file.
hasher.update(format!("{}", meta.len()).as_bytes());
hasher.update(b"\0");

hasher.update(&buffer[..len]);
std::io::copy(&mut file, &mut hasher)?;
} else {
// It's a text file. CRLF transformation as planned.
let mut normalized_file = Vec::with_capacity(meta.len() as usize);
let mut was_cr = false;

loop {
buffer[..len].iter().for_each(|byte| {
if was_cr && *byte == b'\n' {
normalized_file.pop();
}
normalized_file.push(*byte);
was_cr = *byte == b'\r';
});

if len < BUFFER_SIZE {
break;
}
len = file.read(&mut buffer)?;
}

hasher.update(format!("{}", normalized_file.len()).as_bytes());
hasher.update(b"\0");

hasher.update(&normalized_file);
}
} else {
hasher.update(format!("{}", meta.len()).as_bytes());
hasher.update(b"\0");

std::io::copy(&mut file, &mut hasher)?;
}

let hash: [u8; 20] = hasher.finalize().into();
ObjectId::from(hash)
};

Some((entry_mode, oid))
} else {
// It's a non-symlink folder. Git doesn't track folders. Same as deletion.
None
}
}
Err(_) => None,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the indentation levels here can be reduced by using this pattern, it would avoid the line Err(_) => None at the end that is difficult to link to the earlier path.metadata() call and the if meta.is_file() that is 99% of the Ok branch of the match

let metadata = match path.metadata() {
    Ok(m) if m.is_file() => m,
    // Ignore errors and non-files
    _ => return Ok(None),
}

/// ...

fn from(mut raw: RawChanges) -> Self {
let mut status_entries = vec![];

let additions_left = if !raw.additions.is_empty() && !raw.deletions.is_empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inverting the condition would mean having the one-line else branch at the top and the big branch below, making it much easier to remember the condition when getting to the else

})?;

let mut head_tree_set = HashSet::new();
let mut submodule_paths = vec![];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the macro for an empty Vec seems wasteful

path: PathBuf,
}

#[allow(unused)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary :/ ?


#[cfg(test)]
mod test;

pub struct Git;

/// A subset of `git_repository::objs::tree::EntryMode` that actually makes sense for tree nodes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use [..] so that people building the doc locally will have a link to the enum

Suggested change
/// A subset of `git_repository::objs::tree::EntryMode` that actually makes sense for tree nodes.
/// A subset of [`git_repository::objs::tree::EntryMode`] that actually makes sense for tree nodes.

Comment on lines 128 to 138
if item.mode == Mode::COMMIT {
submodule_paths.push(full_path);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removes an indentation level, but for this one I'm not sure it's better

Suggested change
if item.mode == Mode::COMMIT {
submodule_paths.push(full_path);
} else {
if item.mode == Mode::COMMIT {
submodule_paths.push(full_path);
continue;
}
// ...

Comment on lines 142 to 160
let entry_mode_changed = {
#[cfg(unix)]
{
new_entry_mode != old_entry_mode
}

#[cfg(not(unix))]
{
false
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let entry_mode_changed = {
#[cfg(unix)]
{
new_entry_mode != old_entry_mode
}
#[cfg(not(unix))]
{
false
}
};
let entry_mode_changed = cfg!(unix) && new_entry_mode != old_entry_mode;

@xJonathanLEI
Copy link
Contributor Author

Rebased on master due to merge conflicts.

@xJonathanLEI
Copy link
Contributor Author

xJonathanLEI commented Mar 26, 2024

gix can do the desired thing already

Yeah that's what I meant. While it is true that currently gix status early terminates as soon as an error is encountered, it doesn't have to. It's a simple fix to just gracefully handle the Err from iteration. I couldn't make it work from Helix for some reasons: into_index_worktree_iter would already return Err. I'd say it's quite an edge case though, as it only impacts repos with permission issues. So maybe we don't need to deal with it in this PR.

Definitely possible to narrow down the issue. Was raising there just in case you know what's going on. But all good otherwise.

helix-vcs/src/diff.rs Outdated Show resolved Hide resolved
helix-vcs/src/git.rs Outdated Show resolved Hide resolved
helix-term/src/ui/menu.rs Outdated Show resolved Hide resolved
@pascalkuthe pascalkuthe added this to the next milestone Mar 27, 2024
helix-term/src/commands.rs Outdated Show resolved Hide resolved
helix-term/src/commands.rs Outdated Show resolved Hide resolved
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a couple minor things that can be improved with regarrd readability while doing one last pas.

It took me a bit to figure out what exactly I wanted so I already made the changes locally and pushed them as seperate commits (nothing is changed regarding functionality just some readability nits). Hope you don't mind. I think with these changes this looks good to go

DiffProviderRegistry { providers }
}
}

/// A union type that includes all types that implement [DiffProvider]. We need this type to allow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was planning to replace this with an enum anyway but I don't think its necessary to keep the trait once you have an enum

.ok_or_else(|| anyhow::anyhow!("working tree not found"))?
.to_path_buf();

let mapper = |item: std::result::Result<Item, gix::status::index_worktree::Error>| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mapper is reptty hard to read I think its jsut leftover from when you ahd an iterator, moving this directly into the forloop and using continue instead of none makes this less cluttered

@@ -61,6 +70,91 @@ impl Git {

Ok(res)
}

/// Emulates the result of running `git status` from the command line.
fn status<FC, FE>(repo: &Repository, on_change: FC, on_err: FE) -> Result<()>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of two seperate callbacks you can use a single one

Copy link
Member

@the-mikedavis the-mikedavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, I'm excited to use this!

I edited the description to remove the closing keywords for #5362. I think it's worth exploring/evaluating the other pickers in that issue (I have one locally in the-mikedavis@9fd4fcf that I'm trying out now) so I don't want to close it too soon

Comment on lines +3024 to +3034
let (sign, style, content) = match self {
Self::Untracked { path } => ("[+]", data.style_untracked, process_path(path)),
Self::Modified { path } => ("[~]", data.style_modified, process_path(path)),
Self::Conflict { path } => ("[x]", data.style_conflict, process_path(path)),
Self::Deleted { path } => ("[-]", data.style_deleted, process_path(path)),
Self::Renamed { from_path, to_path } => (
"[>]",
data.style_renamed,
format!("{} -> {}", process_path(from_path), process_path(to_path)),
),
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For #9647 we will want to change the UI slightly. Like the symbol pickers instead of adding icons / glyphs for each kind of change I prefer words, so the symbols don't need a legend. I have this merged into my driver branch with a UI like this:

foo3

so that the symbols stay but are accompanied by a git status-like description of the change. But we can hash this out on #9647 once I rebase rather than here

@the-mikedavis the-mikedavis merged commit a224ee5 into helix-editor:master Mar 31, 2024
6 checks passed
@xJonathanLEI xJonathanLEI deleted the dev/vcs_picker branch April 1, 2024 00:24
@mehmetkursataydin
Copy link

Wow, I love this!

Desdaemon pushed a commit to Desdaemon/helix that referenced this pull request Apr 3, 2024
Co-authored-by: WJH <hou32hou@gmail.com>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Pascal Kuthe <pascalkuthe@pm.me>
Byron added a commit to Byron/gitoxide that referenced this pull request Apr 3, 2024
Even inactive submodules are shown in the status by `git status`,
so `gix` should do the same.

First observed in helix-editor/helix#5645 (comment)
postsolar pushed a commit to postsolar/helix that referenced this pull request Apr 4, 2024
Co-authored-by: WJH <hou32hou@gmail.com>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Pascal Kuthe <pascalkuthe@pm.me>
postsolar pushed a commit to postsolar/helix that referenced this pull request Apr 4, 2024
Co-authored-by: WJH <hou32hou@gmail.com>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Pascal Kuthe <pascalkuthe@pm.me>
postsolar pushed a commit to postsolar/helix that referenced this pull request Apr 4, 2024
Co-authored-by: WJH <hou32hou@gmail.com>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Pascal Kuthe <pascalkuthe@pm.me>
github-merge-queue bot pushed a commit to knope-dev/knope that referenced this pull request Apr 13, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [gix](https://togithub.com/Byron/gitoxide) | dependencies | minor |
`0.61.0` -> `0.62.0` |

---

### Release Notes

<details>
<summary>Byron/gitoxide (gix)</summary>

###
[`v0.62.0`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.62.0):
gix v0.62

[Compare
Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.1...gix-v0.62.0)

Please note that this release contains a security fix originally
implemented in `gix-transport` via [this
PR](https://togithub.com/Byron/gitoxide/pull/1342) which prevents `ssh`
options to be smuggled into the `ssh` command-line invocation with a
username provided to a clone or fetch URL.

Details can be found [in the
advisory](https://togithub.com/Byron/gitoxide/security/advisories/GHSA-98p4-xjmm-8mfh).

##### Bug Fixes

- `into_index_worktree_iter()` now takes an iterator, instead of a Vec.
    This makes the API more consistent, and one can pass `None`
    as well.
-   show submodules in status independently of their active state.
    Even inactive submodules are shown in the status by `git status`,
    so `gix` should do the same.

First observed in
[helix-editor/helix#5645 (comment)
- forward `curl` rustls feature from `gix-transport` to avoid `curl` in
`gix`.
This removes the `curl` dependency just for configuring it, and removes
    a hazard which became evident with reqwest.

##### Bug Fixes (BREAKING)

- Make `topo` more similar to `Ancestors`, but also rename `Ancestors`
to `Simple`

##### Commit Statistics

- 16 commits contributed to the release over the course of 20 calendar
days.
-   22 days passed between releases.
- 4 commits were understood as
[conventional](https://www.conventionalcommits.org/).
- 1 unique issue was worked on:
[Byron/gitoxide#1328

##### Thanks Clippy

[Clippy](https://togithub.com/rust-lang/rust-clippy) helped 1 time to
make code idiomatic.

##### Commit Details

-
**[Byron/gitoxide#1328
- Forward `curl` rustls feature from `gix-transport` to avoid `curl` in
`gix`.
(Byron/gitoxide@98cfbec)
-   **Uncategorized**
- Prepare changelogs prior to release
(Byron/gitoxide@5755271)
- Merge pull request
[Byron/gitoxide#1341
from szepeviktor/typos
(Byron/gitoxide@55f379b)
- Fix typos
(Byron/gitoxide@f72ecce)
- Merge branch 'add-topo-walk'
(Byron/gitoxide@b590a9d)
- Adapt to changes in `gix-traverse`
(Byron/gitoxide@1cfeb11)
- Make `topo` more similar to `Ancestors`, but also rename `Ancestors`
to `Simple`
(Byron/gitoxide@2a9c178)
- Adapt to changes in `gix-traverse`
(Byron/gitoxide@6154bf3)
- Thanks clippy
(Byron/gitoxide@7f6bee5)
- Merge branch 'status'
(Byron/gitoxide@45edd2e)
- `into_index_worktree_iter()` now takes an iterator, instead of a Vec.
(Byron/gitoxide@18b2921)
- Show submodules in status independently of their active state.
(Byron/gitoxide@719ced8)
- Make it easier to discover `is_path_excluded()` in documentation
(Byron/gitoxide@c136329)
- Adapt to changes in `gix-index`
(Byron/gitoxide@1e1fce1)
- Merge branch 'patch-1'
(Byron/gitoxide@9e9c653)
- Remove dep reqwest from gix
(Byron/gitoxide@e3eedd8)

###
[`v0.61.1`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.61.1):
gix v0.61.1

[Compare
Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.0...gix-v0.61.1)

This release also updates `reqwest` to v0.12, bringing hyper 1.0 and a
more recent `rustls` version.

##### Bug Fixes

-   missing closing backtick in gix lib documentation

##### Commit Statistics

- 7 commits contributed to the release over the course of 2 calendar
days.
-   3 days passed between releases.
- 1 commit was understood as
[conventional](https://www.conventionalcommits.org).
-   0 issues like '(#ID)' were seen in commit messages

##### Commit Details

<csr-read-only-do-not-edit/>

<details><summary>view details</summary>

-   **Uncategorized**
- Prepare changelogs prior to release
([`7018a92`](https://togithub.com/Byron/gitoxide/commit/7018a92))
- Merge branch 'patch-1'
([`8fde62b`](https://togithub.com/Byron/gitoxide/commit/8fde62b))
- Turn`curl` into a workspace package
([`adee500`](https://togithub.com/Byron/gitoxide/commit/adee500))
- Make reqwest a workspace package
([`369cf1b`](https://togithub.com/Byron/gitoxide/commit/369cf1b))
- Merge pull request
[#&#8203;1325](https://togithub.com/Byron/gitoxide/issues/1325) from
kdelorey/fix/simple-docs-formatting
([`3b34699`](https://togithub.com/Byron/gitoxide/commit/3b34699))
- Fixed opening of backtick in documentation.
([`f1bc4cd`](https://togithub.com/Byron/gitoxide/commit/f1bc4cd))
- Missing closing backtick in gix lib documentation
([`e1fec3c`](https://togithub.com/Byron/gitoxide/commit/e1fec3c))

</details>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/knope-dev/knope).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4yNjkuMiIsInVwZGF0ZWRJblZlciI6IjM3LjI2OS4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
LeoniePhiline added a commit to LeoniePhiline/quote-of-the-week that referenced this pull request Apr 13, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [gix](https://togithub.com/Byron/gitoxide) | dependencies | minor |
`0.61.0` -> `0.62.0` |

---

### Release Notes

<details>
<summary>Byron/gitoxide (gix)</summary>

###
[`v0.62.0`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.62.0):
gix v0.62

[Compare
Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.1...gix-v0.62.0)

Please note that this release contains a security fix originally
implemented in `gix-transport` via [this
PR](https://togithub.com/Byron/gitoxide/pull/1342) which prevents `ssh`
options to be smuggled into the `ssh` command-line invocation with a
username provided to a clone or fetch URL.

Details can be found [in the
advisory](https://togithub.com/Byron/gitoxide/security/advisories/GHSA-98p4-xjmm-8mfh).

##### Bug Fixes

- `into_index_worktree_iter()` now takes an iterator, instead of a Vec.
    This makes the API more consistent, and one can pass `None`
    as well.
-   show submodules in status independently of their active state.
    Even inactive submodules are shown in the status by `git status`,
    so `gix` should do the same.

First observed in
[helix-editor/helix#5645 (comment)
- forward `curl` rustls feature from `gix-transport` to avoid `curl` in
`gix`.
This removes the `curl` dependency just for configuring it, and removes
    a hazard which became evident with reqwest.

##### Bug Fixes (BREAKING)

- Make `topo` more similar to `Ancestors`, but also rename `Ancestors`
to `Simple`

##### Commit Statistics

- 16 commits contributed to the release over the course of 20 calendar
days.
-   22 days passed between releases.
- 4 commits were understood as
[conventional](https://www.conventionalcommits.org/).
- 1 unique issue was worked on:
[Byron/gitoxide#1328

##### Thanks Clippy

[Clippy](https://togithub.com/rust-lang/rust-clippy) helped 1 time to
make code idiomatic.

##### Commit Details

-
**[Byron/gitoxide#1328
- Forward `curl` rustls feature from `gix-transport` to avoid `curl` in
`gix`.
(Byron/gitoxide@98cfbec)
-   **Uncategorized**
- Prepare changelogs prior to release
(Byron/gitoxide@5755271)
- Merge pull request
[Byron/gitoxide#1341
from szepeviktor/typos
(Byron/gitoxide@55f379b)
- Fix typos
(Byron/gitoxide@f72ecce)
- Merge branch 'add-topo-walk'
(Byron/gitoxide@b590a9d)
- Adapt to changes in `gix-traverse`
(Byron/gitoxide@1cfeb11)
- Make `topo` more similar to `Ancestors`, but also rename `Ancestors`
to `Simple`
(Byron/gitoxide@2a9c178)
- Adapt to changes in `gix-traverse`
(Byron/gitoxide@6154bf3)
- Thanks clippy
(Byron/gitoxide@7f6bee5)
- Merge branch 'status'
(Byron/gitoxide@45edd2e)
- `into_index_worktree_iter()` now takes an iterator, instead of a Vec.
(Byron/gitoxide@18b2921)
- Show submodules in status independently of their active state.
(Byron/gitoxide@719ced8)
- Make it easier to discover `is_path_excluded()` in documentation
(Byron/gitoxide@c136329)
- Adapt to changes in `gix-index`
(Byron/gitoxide@1e1fce1)
- Merge branch 'patch-1'
(Byron/gitoxide@9e9c653)
- Remove dep reqwest from gix
(Byron/gitoxide@e3eedd8)

###
[`v0.61.1`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.61.1):
gix v0.61.1

[Compare
Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.0...gix-v0.61.1)

This release also updates `reqwest` to v0.12, bringing hyper 1.0 and a
more recent `rustls` version.

##### Bug Fixes

-   missing closing backtick in gix lib documentation

##### Commit Statistics

- 7 commits contributed to the release over the course of 2 calendar
days.
-   3 days passed between releases.
- 1 commit was understood as
[conventional](https://www.conventionalcommits.org).
-   0 issues like '(#ID)' were seen in commit messages

##### Commit Details

<csr-read-only-do-not-edit/>

<details><summary>view details</summary>

-   **Uncategorized**
- Prepare changelogs prior to release
([`7018a92`](https://togithub.com/Byron/gitoxide/commit/7018a92))
- Merge branch 'patch-1'
([`8fde62b`](https://togithub.com/Byron/gitoxide/commit/8fde62b))
- Turn`curl` into a workspace package
([`adee500`](https://togithub.com/Byron/gitoxide/commit/adee500))
- Make reqwest a workspace package
([`369cf1b`](https://togithub.com/Byron/gitoxide/commit/369cf1b))
- Merge pull request
[#&#8203;1325](https://togithub.com/Byron/gitoxide/issues/1325) from
kdelorey/fix/simple-docs-formatting
([`3b34699`](https://togithub.com/Byron/gitoxide/commit/3b34699))
- Fixed opening of backtick in documentation.
([`f1bc4cd`](https://togithub.com/Byron/gitoxide/commit/f1bc4cd))
- Missing closing backtick in gix lib documentation
([`e1fec3c`](https://togithub.com/Byron/gitoxide/commit/e1fec3c))

</details>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/LeoniePhiline/quote-of-the-week).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4yNjkuMiIsInVwZGF0ZWRJblZlciI6IjM3LjI2OS4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->
bors added a commit to rust-lang/cargo that referenced this pull request Apr 16, 2024
…epage

chore(deps): update rust crate gix to 0.62.0 [security]

[![Mend Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [gix](https://togithub.com/Byron/gitoxide) | workspace.dependencies | minor | `0.61.0` -> `0.62.0` |

### GitHub Vulnerability Alerts

#### [GHSA-98p4-xjmm-8mfh](https://togithub.com/Byron/gitoxide/security/advisories/GHSA-98p4-xjmm-8mfh)

### Summary

`gix-transport` does not check the username part of a URL for text that the external `ssh` program would interpret as an option. A specially crafted clone URL can smuggle options to SSH. The possibilities are syntactically limited, but if a malicious clone URL is used by an application whose current working directory contains a malicious file, arbitrary code execution occurs.

### Details

This is related to the patched vulnerability GHSA-rrjw-j4m2-mf34, but appears less severe due to a greater attack complexity. Since [Byron/gitoxide#1032, `gix-transport` checks the host and path portions of a URL for text that has a `-` in a position that will cause `ssh` to interpret part of all of the URL as an option argument. But it does not check the non-mandatory username portion of the URL.

As in Git, when an address is a URL of the form `ssh://username@hostname/path`, or when it takes the special form `username@hostname:dirs/repo`, this is treated as an SSH URL. `gix-transport` will replace some characters in `username` with their `%`-based URL encodings, but otherwise passes `username@hostname` as an argument to the external `ssh` command. This happens even if `username` begins with a hyphen. In that case, `ssh` treats that argument as an option argument, and attempts to interpret and honor it as a sequence of one or more options possibly followed by an operand for the last option.

This is harder to exploit than GHSA-rrjw-j4m2-mf34, because the possibilities are constrained by:

- The difficulty of forming an option argument `ssh` accepts, given that characters such as `=`, `/`, and `\`, are URL-encoded, `:` is removed, and the argument passed to `ssh` contains the ``@`` sign and subsequent host identifier, which in an effective attack must be parseable as a suffix of the operand passed to the last option.

  The inability to include a literal `=` prevents the use of `-oNAME=VALUE` (e.g., `-oProxyCommand=payload`). The inability to include a literal `/` or `\` prevents smuggling in a path operand residing outside the current working directory, incuding on Windows. (Although a `~` character may be smuggled in, `ssh` does not perform its own tilde expansion, so it does not form an absolute path.)

- The difficulty, or perhaps impossibility, of completing a connection (other than when arbitrary code execution has been achieved). This complicates or altogether prevents the use of options such as `-A` and `-X` together with a connection to a real but malicious server. The reason a connection cannot generally be completed when exploiting this vulnerability is that, because the argument `gix-transport` intends as a URL is treated as an option argument, `ssh` treats the subsequent non-option argument `git-upload-pack` as the host instead of the command, but it is not a valid host name.

  Although `ssh` supports aliases for hosts, even if `git-upload-pack` could be made an alias, that is made difficult by the URL-encoding transformation.

However, an attacker who is able to cause a specially named `ssh` configuration file to be placed in the current working directory can smuggle in an `-F` option referencing the file, and this allows arbitrary command execution.

This scenario is especially plausible because programs that operate on git repositories are often run in untrusted git repositories, sometimes even to operate on another repository. Situations where this is likely, such that an attacker could predict or arrange it, may for some applications include a malicious repository with a malicious submodule configuration.

Other avenues of exploitation exist, but appear to be less severe. For example, the `-E` option can be smuggled to create or append to a file in the current directory (or its target, if it is a symlink). There may also be other significant ways to exploit this that have not yet been discovered, or that would arise with new options in future versions of `ssh`.

### PoC

To reproduce the known case that facilitates arbitrary code execution, first create a file in the current directory named `configfile@example.com`, of the form

```text
ProxyCommand payload
```

where `payload` is a command with an observable side effect. On Unix-like systems, this could be `date | tee vulnerable` or an `xdg-open`, `open`, or other command command to launch a graphical application. On Windows, this could be the name of a graphical application already in the search path, such as `calc.exe`.

(Although the syntax permitted in the value of `ProxyCommand` may vary by platform, this is not limited to running commands in the current directory. That limitation only applies to paths directly smuggled in the username, not to the contents of a separate malicious configuration file. Arbitrary other settings may be specified in `configfile@example.com` as well.)

Then run:

```sh
gix clone 'ssh://-Fconfigfile@example.com/abc'
```

Or:

```sh
gix clone -- '-Fconfigfile@example.com:abc/def'
```

(The `--` is required to ensure that `gix` is really passing the argument as a URL for use in `gix-transport`, rather than interpreting it as an option itself, which would not necessarily be a vulnerability.)

In either case, the payload specified in `configfile@example.com` runs, and its side effect can be observed.

Other cases may likewise be produced, in either of the above two forms of SSH addresses. For example, to create or append to the file `errors@example.com`, or to create or append to its target if it is a symlink:

```sh
gix clone 'ssh://-Eerrors@example.com/abc'
```

```sh
gix clone -- '-Eerrors@example.com:abc/def'
```

### Impact

As in GHSA-rrjw-j4m2-mf34, this would typically require user interaction to trigger an attempt to clone or otherwise connect using the malicious URL. Furthermore, known means of exploiting this vulnerability to execute arbitrary commands require further preparatory steps to establish a specially named file in the current directory. The impact is therefore expected to be lesser, though it is difficult to predict it with certainty because it is not known exactly what scenarios will arise when using the `gix-transport` library.

Users who use applications that make use of `gix-transport` are potentially vulnerable, especially:

- On repositories with submodules that are automatically added, depending how the application manages submodules.
- When operating on other repositories from inside an untrusted repository.
- When reviewing contributions from untrusted developers by checking out a branch from an untrusted fork and performing clones from that location.

---

### Release Notes

<details>
<summary>Byron/gitoxide (gix)</summary>

### [`v0.62.0`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.62.0): gix v0.62

[Compare Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.1...gix-v0.62.0)

Please note that this release contains a security fix originally implemented in `gix-transport` via [this PR](https://togithub.com/Byron/gitoxide/pull/1342) which prevents `ssh` options to be smuggled into the `ssh` command-line invocation with a username provided to a clone or fetch URL.

Details can be found [in the advisory](https://togithub.com/Byron/gitoxide/security/advisories/GHSA-98p4-xjmm-8mfh).

##### Bug Fixes

-   `into_index_worktree_iter()` now takes an iterator, instead of a Vec.
    This makes the API more consistent, and one can pass `None`
    as well.
-   show submodules in status independently of their active state.
    Even inactive submodules are shown in the status by `git status`,
    so `gix` should do the same.

    First observed in [helix-editor/helix#5645 (comment)
-   forward `curl` rustls feature from `gix-transport` to avoid `curl` in `gix`.
    This removes the `curl` dependency just for configuring it, and removes
    a hazard which became evident with reqwest.

##### Bug Fixes (BREAKING)

-   Make `topo` more similar to `Ancestors`, but also rename `Ancestors` to `Simple`

##### Commit Statistics

-   16 commits contributed to the release over the course of 20 calendar days.
-   22 days passed between releases.
-   4 commits were understood as [conventional](https://www.conventionalcommits.org/).
-   1 unique issue was worked on: [Byron/gitoxide#1328

##### Thanks Clippy

[Clippy](https://togithub.com/rust-lang/rust-clippy) helped 1 time to make code idiomatic.

##### Commit Details

-   **[Byron/gitoxide#1328
    -   Forward `curl` rustls feature from `gix-transport` to avoid `curl` in `gix`. (Byron/gitoxide@98cfbec)
-   **Uncategorized**
    -   Prepare changelogs prior to release (Byron/gitoxide@5755271)
    -   Merge pull request [Byron/gitoxide#1341 from szepeviktor/typos (Byron/gitoxide@55f379b)
    -   Fix typos (Byron/gitoxide@f72ecce)
    -   Merge branch 'add-topo-walk' (Byron/gitoxide@b590a9d)
    -   Adapt to changes in `gix-traverse` (Byron/gitoxide@1cfeb11)
    -   Make `topo` more similar to `Ancestors`, but also rename `Ancestors` to `Simple` (Byron/gitoxide@2a9c178)
    -   Adapt to changes in `gix-traverse` (Byron/gitoxide@6154bf3)
    -   Thanks clippy (Byron/gitoxide@7f6bee5)
    -   Merge branch 'status' (Byron/gitoxide@45edd2e)
    -   `into_index_worktree_iter()` now takes an iterator, instead of a Vec. (Byron/gitoxide@18b2921)
    -   Show submodules in status independently of their active state. (Byron/gitoxide@719ced8)
    -   Make it easier to discover `is_path_excluded()` in documentation (Byron/gitoxide@c136329)
    -   Adapt to changes in `gix-index` (Byron/gitoxide@1e1fce1)
    -   Merge branch 'patch-1' (Byron/gitoxide@9e9c653)
    -   Remove dep reqwest from gix (Byron/gitoxide@e3eedd8)

### [`v0.61.1`](https://togithub.com/Byron/gitoxide/releases/tag/gix-v0.61.1): gix v0.61.1

[Compare Source](https://togithub.com/Byron/gitoxide/compare/gix-v0.61.0...gix-v0.61.1)

This release also updates `reqwest` to v0.12, bringing hyper 1.0 and a more recent `rustls` version.

##### Bug Fixes

-   missing closing backtick in gix lib documentation

##### Commit Statistics

-   7 commits contributed to the release over the course of 2 calendar days.
-   3 days passed between releases.
-   1 commit was understood as [conventional](https://www.conventionalcommits.org).
-   0 issues like '(#ID)' were seen in commit messages

##### Commit Details

<csr-read-only-do-not-edit/>

<details><summary>view details</summary>

-   **Uncategorized**
    -   Prepare changelogs prior to release ([`7018a92`](https://togithub.com/Byron/gitoxide/commit/7018a92))
    -   Merge branch 'patch-1' ([`8fde62b`](https://togithub.com/Byron/gitoxide/commit/8fde62b))
    -   Turn`curl` into a workspace package ([`adee500`](https://togithub.com/Byron/gitoxide/commit/adee500))
    -   Make reqwest a workspace package ([`369cf1b`](https://togithub.com/Byron/gitoxide/commit/369cf1b))
    -   Merge pull request [#&#8203;1325](https://togithub.com/Byron/gitoxide/issues/1325) from kdelorey/fix/simple-docs-formatting ([`3b34699`](https://togithub.com/Byron/gitoxide/commit/3b34699))
    -   Fixed opening of backtick in documentation. ([`f1bc4cd`](https://togithub.com/Byron/gitoxide/commit/f1bc4cd))
    -   Missing closing backtick in gix lib documentation ([`e1fec3c`](https://togithub.com/Byron/gitoxide/commit/e1fec3c))

</details>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Mend Renovate](https://www.mend.io/free-developer-tools/renovate/). View repository job log [here](https://developer.mend.io/github/rust-lang/cargo).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4yOTMuMCIsInVwZGF0ZWRJblZlciI6IjM3LjI5My4wIiwidGFyZ2V0QnJhbmNoIjoibWFzdGVyIiwibGFiZWxzIjpbXX0=-->
postsolar pushed a commit to postsolar/helix that referenced this pull request Apr 20, 2024
Co-authored-by: WJH <hou32hou@gmail.com>
Co-authored-by: Michael Davis <mcarsondavis@gmail.com>
Co-authored-by: Pascal Kuthe <pascalkuthe@pm.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-vcs Area: Version control system interaction C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants