Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ** matches to explore multiple possible matches #15

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

zkry
Copy link
Contributor

@zkry zkry commented Jan 20, 2019

fixes #14

Previously the doublestar token would match greedily as much as it could, so for example, the double star token of the pattern **/*/file, when applied to a directory string a/b/c/file, would greedily match all of a/b/c/file and not leave enough for the rest of the pattern (*/file).

The solution implemented here changes the token interface so that the Consume method returns a slice of paths([]string) that the token could match. These results are added to a stack and iterated over until a solution is found. So in the above example, instead of just returning an empty slice of strings (because the ** consumed everything), it would return the following slice:

[
  [""]
  ["file"]
  ["c" "file"]
  ["b" "c" "file"]
]

So now, the rest of the pattern (file) can match the second item returned.

Copy link
Contributor

@ttaylorr ttaylorr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure that the "longest match" solution is a sufficient condition to guarantee that matches can progress forward in all cases. Having said that, I also don't have a witness to prove it otherwise, for which you'll have to forgive me.

Is it possible to use another strategy here? One interesting thought that comes to mind is "trading" the last component between a foo/**/*/bar-style match, and then backtracking through the resulting structure.

@zkry
Copy link
Contributor Author

zkry commented Jan 25, 2019

Thanks for the feedback. I definitely could have missed something and it might not work in some cases.

Actually, this problem actually has two components; the first is determining whether or not a match string matches a path name. This part seems to work as intended. I tried to cover the test cases but maybe there is a test case that could break it though.

The second problem is that if it doesn't match, then it should return the remaining directory paths. This is actually what that longest path is for. As we are going through all of the match possibilities, it keeps a record of the longest path it has seen. This was less clear as none of the tests cover what this []string value should be. Maybe another test could be made which tests this []string from the consume method.

@ttaylorr
Copy link
Contributor

This part seems to work as intended. I tried to cover the test cases but maybe there is a test case that could break it though.

That's good to hear. I am still fearful that there is a particularly bad case that neither of us are thinking of. @git-lfs/core: are there any such cases that you can think of?

The second problem is that if it doesn't match, then it should return the remaining directory paths.

I think that the current implementation of consume() provides us with this information without the requirement we modify it. For instance, consume() returns a ([]string, bool) indicating (1) which directory components were matched and (2) whether or not there is any left.

If there are directory components left over, I think that we could do an array-wise prefix difference from initial directory components and the ones that was matched. Anything in the suffix of the original directory structure (e.g., the parts not in the dirs slice) is what is left over.

Base automatically changed from master to main February 1, 2021 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mismatch of **/* for multiple directories
2 participants