Some combinations of wildcards inside alternates fail #50

Open
hackery opened this issue May 11, 2021 · 2 comments

@hackery
hackery commented May 11, 2021

This glob library is used for a critical bit of functionality in Telegraf: metric filtering. There appear to be a number of cases where it silently fails to match, and one which has caused us much grief recently involves multiple * wildcards inside a { } alternate construct.

Telegraf is configured with a list of patterns for a namepass/namedrop function, and internally it composes the list into a single pattern with alternates. I've reduced our test case to one similar to the samples in glob_test.go; here is the failing test surrounded by variations which all work:

glob(true, "yandex:*.exe:page.*", "yandex:service.exe:page.12345"),
glob(true, "*yandex:*.exe:page.*", "yandex:service.exe:page.12345"),
glob(true, "{*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"), // FAIL
glob(true, "{google.?,*yandex:*.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:service.exe:page.*}", "yandex:service.exe:page.12345"),
glob(true, "{google.*,*yandex:*.exe:*.12345}", "yandex:service.exe:page.12345"),

The result of running this test in current master branch is:

--- FAIL: TestGlob (0.00s)
    --- FAIL: TestGlob/#64 (0.00s)
        glob_test.go:190: pattern "{google.*,*yandex:*.exe:page.*}" matching "yandex:service.exe:page.12345" should be true but got false
            <btree:[<nil><-<any_of:[<text:`google.`>,<btree:[<contains:[yandex:]><-<text:`.exe:page.`>-><nil>]>]>-><super>]>
FAIL
exit status 1
FAIL    github.com/gobwas/glob  0.004s

This has become a serious problem for us: large numbers of metrics are being sent to InfluxDB, which over time overwhelm it (and post-hoc deletions are unusably slow).
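
For readers unfamiliar with the composition step described above, here is a minimal sketch of how a list of namepass/namedrop patterns might be folded into a single alternates pattern. The helper name `composeAlternates` is hypothetical and this is not taken from Telegraf's source; it only illustrates how two individually working patterns end up as the failing `{google.*,*yandex:*.exe:page.*}` form.

```go
package main

import (
	"fmt"
	"strings"
)

// composeAlternates is a hypothetical helper (an assumption for illustration,
// not Telegraf's actual code). It joins several glob patterns into one
// pattern using the {a,b,...} alternates syntax.
func composeAlternates(patterns []string) string {
	switch len(patterns) {
	case 0:
		return ""
	case 1:
		// A single pattern needs no braces.
		return patterns[0]
	default:
		return "{" + strings.Join(patterns, ",") + "}"
	}
}

func main() {
	fmt.Println(composeAlternates([]string{"google.*", "*yandex:*.exe:page.*"}))
	// prints: {google.*,*yandex:*.exe:page.*}
}
```

Each input pattern matches correctly when compiled on its own (see the first two test lines), so the failure appears only after this kind of composition.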

@j3h
j3h commented May 27, 2022

I ran into a bug that may or may not be the same, and created a minimal reproduction. Short version is that {,*}x works fine, but {*,}x does not.

@matthewmueller
matthewmueller commented Nov 29, 2022

I also ran into this in Bud where the following test was failing (in this case matching unexpectedly):

func TestMatchSubdir(t *testing.T) {
	is := is.New(t)
	matcher, err := glob.Compile(`{generator/**.go,bud/internal/generator/*/**.go}`)
	is.NoErr(err)
	is.True(!matcher.Match("bud/internal/generator/generator.go"))
}

I was able to fix this by manually expanding the globs into generator/**.go and bud/internal/generator/*/**.go.

Then you can compile each one individually and create a matcher:

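// Compile expands the alternates in pattern and compiles each resulting
// glob separately, working around the alternates bug.
func Compile(pattern string) (Matcher, error) {
	patterns, err := Expand(pattern)
	if err != nil {
		return nil, err
	}
	// Use names that do not shadow the globs type or the glob package.
	matchers := make(globs, len(patterns))
	for i, p := range patterns {
		g, err := glob.Compile(p)
		if err != nil {
			return nil, err
		}
		matchers[i] = g
	}
	return matchers, nil
}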

// globs matches a path if any one of its compiled globs matches.
type globs []glob.Glob

func (gs globs) Match(path string) bool {
	for _, g := range gs {
		if g.Match(path) {
			return true
		}
	}
	return false
}
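
The `Expand` helper called above is not shown in this comment. As a rough sketch of what such a helper could look like, the following function (named `expandBraces` here, a hypothetical stand-in rather than the author's implementation) rewrites each `{a,b,...}` group into one pattern per alternative. It assumes braces are never escaped and never appear inside `[...]` character classes, which a production version would need to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// expandBraces is a hypothetical stand-in for the Expand helper. It expands
// the first {a,b,...} group into one pattern per alternative, then recurses
// so that later and nested groups are expanded too.
// Assumption: no escaped braces and no braces inside character classes.
func expandBraces(pattern string) ([]string, error) {
	open := strings.IndexByte(pattern, '{')
	if open < 0 {
		return []string{pattern}, nil
	}
	var alts []string
	depth, start, closeIdx := 0, open+1, -1
	for i := open; i < len(pattern) && closeIdx < 0; i++ {
		switch pattern[i] {
		case '{':
			depth++
		case '}':
			depth--
			if depth == 0 {
				alts = append(alts, pattern[start:i])
				closeIdx = i
			}
		case ',':
			// Only commas directly inside the outer group split alternatives.
			if depth == 1 {
				alts = append(alts, pattern[start:i])
				start = i + 1
			}
		}
	}
	if closeIdx < 0 {
		return nil, fmt.Errorf("unbalanced '{' in pattern %q", pattern)
	}
	prefix, suffix := pattern[:open], pattern[closeIdx+1:]
	var out []string
	for _, alt := range alts {
		expanded, err := expandBraces(prefix + alt + suffix)
		if err != nil {
			return nil, err
		}
		out = append(out, expanded...)
	}
	return out, nil
}

func main() {
	got, err := expandBraces("{*,}x")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", got) // prints: ["*x" "x"]
}
```

Expanding before compiling means the glob library never sees a `{ }` construct at all, so the workaround should sidestep the alternates bug at the cost of compiling and matching several globs instead of one.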
