Fix traversing the directory hierarchy on WindowsOS / Ant-style path matching #1615

Merged

paul-dingemans (Collaborator) commented Aug 27, 2022

Description

Globs always use "/" as the directory separator on every OS. Because Windows users are likely to assume that "\" may be used, input patterns containing a "\" on Windows are transformed to use "/".

On Windows, "\" in the file path is transformed to "/" before the filename is compared with the regular expression (of the glob), which always uses "/" as separator.
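
The separator normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `normalizeSeparators` and the `onWindows` parameter are hypothetical, introduced only so the behavior can be exercised on any OS.

```kotlin
// Hypothetical helper, not taken from the ktlint source.
// On Windows, "\" is rewritten to "/" so that both input patterns and file
// paths use the separator that globs expect. On other systems the input is
// left untouched, because "\" is a legal file name character there.
fun normalizeSeparators(pathOrPattern: String, onWindows: Boolean): String =
    if (onWindows) pathOrPattern.replace('\\', '/') else pathOrPattern

fun main() {
    // A pattern as a Windows user might type it.
    println(normalizeSeparators("src\\main\\kotlin\\**\\*.kt", onWindows = true))
    // The same call with onWindows = false returns the input unchanged.
    println(normalizeSeparators("src\\main\\kotlin\\**\\*.kt", onWindows = false))
}
```

Passing the OS check in as a parameter, instead of reading `os.name` inside the helper, keeps the sketch testable on a non-Windows build agent.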

Refactor all logic that creates globs based on an input path.

  • If a path (absolute or relative) points to a directory, that path is expanded to the default globs (*.kt, *.kts) in that specific directory or any of its subdirectories.
  • If a path (absolute or relative) does not point to a directory, e.g. it points to a file or is a pattern, it is used as a pattern. See the "**" replacement below.
  • On Windows, patterns containing a "*" (or "**") cannot be resolved with the default Paths utilities. In that case the given input pattern is handled as is. See the "**" replacement below.
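
The three cases above could be sketched along these lines. This is an illustration under stated assumptions, not the refactored code itself: `toGlobs` and `defaultExtensions` are hypothetical names, and for simplicity the sketch treats any input containing "*" as a pattern on every OS, whereas the description only requires that on Windows.

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Default extensions as named in the description (*.kt, *.kts).
val defaultExtensions = listOf("kt", "kts")

// Hypothetical helper, not taken from the ktlint source.
fun toGlobs(input: String, rootDir: Path): List<String> {
    // Inputs with a wildcard are handled as is. Checking this first also
    // avoids resolving "*" as a path, which is not possible on Windows.
    if ('*' in input) return listOf(input)

    val resolved = rootDir.resolve(input).normalize()
    return if (Files.isDirectory(resolved)) {
        // A directory expands to the default globs for that directory
        // and any of its subdirectories.
        val base = resolved.toString().replace('\\', '/')
        defaultExtensions.map { "$base/**/*.$it" }
    } else {
        // A file (or a path that does not exist) is used as a pattern.
        listOf(input)
    }
}

fun main() {
    val root = Paths.get("").toAbsolutePath()
    println(toGlobs(".", root))           // two globs rooted at the working directory
    println(toGlobs("src/**/*.kt", root)) // returned unchanged
}
```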

Patterns that contain one or more occurrences of "**" are split into multiple patterns so that files in that specific path and in its subdirectories are matched.

  • For example, for path "some/path/**/*.kt" an additional pattern "some/path/*.kt" is generated to make sure that not only the "*.kt" files in a subdirectory of "some/path/" are found but also the "*.kt" files in directory "some/path" itself. This is in sync with the "**" notation in a glob, which should be interpreted as zero or more intermediate subdirectories.
  • For example, for path "some/**/path/**/*.kt", multiple additional patterns are generated. As it contains two "**" patterns, 2 x 2 patterns are needed to match all possible combinations:
       - "some/**/path/**/*.kt"
       - "some/**/path/*.kt"
       - "some/path/**/*.kt"
       - "some/path/*.kt"
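
One way to generate these combinations is sketched below. The function name `expandDoubleStar` is hypothetical and the approach (keep or drop each "**" segment) is an assumption about how such an expansion can be done, not a copy of the implementation in this PR.

```kotlin
// Hypothetical helper, not taken from the ktlint source.
// Every "**" path segment is either kept or dropped, so a pattern with
// n such segments yields 2^n patterns.
fun expandDoubleStar(pattern: String): List<String> {
    var results = listOf(emptyList<String>())
    for (segment in pattern.split("/")) {
        results = if (segment == "**") {
            // Branch: one variant keeps the "**" segment, one omits it.
            results.flatMap { prefix -> listOf(prefix + segment, prefix) }
        } else {
            results.map { prefix -> prefix + segment }
        }
    }
    return results.map { it.joinToString("/") }.distinct()
}

fun main() {
    expandDoubleStar("some/**/path/**/*.kt").forEach(::println)
    // prints:
    // some/**/path/**/*.kt
    // some/**/path/*.kt
    // some/path/**/*.kt
    // some/path/*.kt
}
```

A pattern without any "**" segment comes back as a single-element list, so the helper can be applied to every input pattern unconditionally.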

Finally, on Windows more fixes are needed, as the resulting globs must not contain a drive destination (such as "D:") at the start of the path. Such a drive destination is replaced with a "**". So "D:/some/path/*.kt" becomes "/some/path/*.kt". Note that the resulting glob is less strict than the original pattern, as it could also match on drives other than "D:/".

Extend trace logging.

Closes #1600
Closes #1601

Checklist

  • PR description added
  • tests are added
  • KtLint has been applied on source code itself and violations are fixed
  • documentation is updated
  • CHANGELOG.md is updated

Comment on lines +214 to +218
private val onWindowsOS
    get() =
        System
            .getProperty("os.name")
            .startsWith("windows", true)
Contributor


Suggested change
- private val onWindowsOS
-     get() =
-         System
-             .getProperty("os.name")
-             .startsWith("windows", true)
+ private val onWindowsOS: Boolean
+     get() = System.getProperty("os.name").startsWith("windows", true)

…matching

Globs always use "/" as the directory separator on every OS. Because Windows
users are likely to assume that "\" may be used, input patterns containing a
"\" on Windows are transformed to use "/".

On Windows, "\" in the file path is transformed to "/" before the filename is
compared with the regular expression (of the glob), which always uses "/" as
separator.

Refactor all logic that creates globs based on an input path.
- If a path (absolute or relative) points to a directory, that path is expanded
  to the default globs (*.kt, *.kts) in that specific directory or any of its
  subdirectories.
- If a path (absolute or relative) does not point to a directory, e.g. it
  points to a file or is a pattern, it is used as a pattern. See the "**"
  replacement below.
- On Windows, patterns containing a "*" (or "**") cannot be resolved with the
  default Paths utilities. In that case the given input pattern is handled as
  is. See the "**" replacement below.

Patterns that contain one or more occurrences of "**" are split into multiple
patterns so that files in that specific path and in its subdirectories are
matched.
 - For example, for path "some/path/**/*.kt" an additional pattern
   "some/path/*.kt" is generated to make sure that not only the "*.kt" files in
   a subdirectory of "some/path/" are found but also the "*.kt" in directory
   "some/path" as well. This is in sync with the "**" notation in a glob which
   should be interpreted as having zero or more intermediate subdirectories.
 - For example, for path "some/**/path/**/*.kt", multiple additional patterns
   are generated. As it contains two "**" patterns, 2 x 2 patterns are needed
   to match all possible combinations:
   - "some/**/path/**/*.kt"
   - "some/**/path/*.kt"
   - "some/path/**/*.kt"
   - "some/path/*.kt"

Finally, on Windows more fixes are needed, as the resulting globs must not
contain a drive destination (such as "D:") at the start of the path. Such a
drive destination is replaced with a "**". So "D:/some/path/*.kt" becomes
"/some/path/*.kt". Note that the resulting glob is less strict than the
original pattern, as it could also match on drives other than "D:/".

Extend trace logging.

Closes pinterest#1600
Closes pinterest#1601
paul-dingemans changed the title from "Enable unit test on windows build to get better understanding why it breaks" to "Fix traversing the directory hierarchy on WindowsOS / Ant-style path matching" on Sep 1, 2022
paul-dingemans added this to the 0.47.1 milestone on Sep 3, 2022
paul-dingemans merged commit 3c71f71 into pinterest:master on Sep 3, 2022
paul-dingemans deleted the 1600-path-traversal-windows branch on September 3, 2022 18:35