Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create pkg/regexp to better handle init regex #1465

Merged
merged 1 commit into from
Jan 12, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion .cirrus.yml
Original file line number Diff line number Diff line change
Expand Up @@ -116,7 +116,7 @@ lint_task:
echo "deb http://deb.debian.org/debian stretch-backports main" > /etc/apt/sources.list.d/backports.list
apt-get update
apt-get install -y libbtrfs-dev libdevmapper-dev
test_script: make local-validate && make lint
test_script: make TAGS=regex_precompile local-validate && make lint && make clean


# Update metadata on VM images referenced by this repository state
Expand Down
10 changes: 3 additions & 7 deletions pkg/idtools/usergroupadd_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,11 +2,12 @@ package idtools

import (
"fmt"
"regexp"
"sort"
"strconv"
"strings"
"sync"

"github.com/containers/storage/pkg/regexp"
)

// add a user and/or group to Linux /etc/passwd, /etc/group using standard
Expand All @@ -24,8 +25,7 @@ var (
"usermod": "-%s %d-%d %s",
}

idOutOnce sync.Once
idOutRegexp *regexp.Regexp
idOutRegexp = regexp.Delayed(`uid=([0-9]+).*gid=([0-9]+)`)
// default length for a UID/GID subordinate range
defaultRangeLen = 65536
defaultRangeStart = 100000
Expand All @@ -37,10 +37,6 @@ var (
// /etc/sub{uid,gid} ranges which will be used for user namespace
// mapping ranges in containers.
func AddNamespaceRangesUser(name string) (int, int, error) {
idOutOnce.Do(func() {
idOutRegexp = regexp.MustCompile(`uid=([0-9]+).*gid=([0-9]+)`)
})

if err := addUser(name); err != nil {
return -1, -1, fmt.Errorf("adding user %q: %w", name, err)
}
Expand Down
214 changes: 214 additions & 0 deletions pkg/regexp/regexp.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,214 @@
package regexp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure about using the same name as the standard library package, the two are not interoperable in any sense. OTOH there’s a certain elegance to this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the same package name does make it tricky to use this and call regular regexp global functions from the same file (you'd have to rename one of them), so i agree with this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH that could be useful as a speed bump to make the user consider whether that call should use the delayed package as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for dynaminc use (i.e. create a regexp in a function) you never want the lazy stuff.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a good point.


import (
"io"
"regexp"
"sync"
)

// Regexp is a wrapper struct used for wrapping MustCompile regex expressions
// used as global variables. Using this stucture helps speed the startup time
// of apps that want to use global regex variables. This library initializes them on
// first use as opposed to the start of the executable.
type Regexp struct {
once sync.Once
regexp *regexp.Regexp
val string
}

func Delayed(val string) Regexp {
re := Regexp{
val: val,
}
if precompile {
re.regexp = regexp.MustCompile(re.val)
}
return re
}

func (re *Regexp) compile() {
if precompile {
return
}
re.once.Do(func() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don’t do this if precompile?

On the one hand, the current version makes run-time profiling of the precompile version representative of the other one.

On the other hand, making compilation conditional here would allow an easy experiment to see if precompilation, in total, pays off in a particular situation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so you want the re.once.Do command not to be executed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My very weak preference would be to not execute it, but I’m completely fine with either one.

re.regexp = regexp.MustCompile(re.val)
})
}

func (re *Regexp) Expand(dst []byte, template []byte, src []byte, match []int) []byte {
re.compile()
return re.regexp.Expand(dst, template, src, match)
}

func (re *Regexp) ExpandString(dst []byte, template string, src string, match []int) []byte {
re.compile()
return re.regexp.ExpandString(dst, template, src, match)
}
func (re *Regexp) Find(b []byte) []byte {
re.compile()
return re.regexp.Find(b)
}

func (re *Regexp) FindAll(b []byte, n int) [][]byte {
re.compile()
return re.regexp.FindAll(b, n)
}

func (re *Regexp) FindAllIndex(b []byte, n int) [][]int {
re.compile()
return re.regexp.FindAllIndex(b, n)
}

func (re *Regexp) FindAllString(s string, n int) []string {
re.compile()
return re.regexp.FindAllString(s, n)
}

func (re *Regexp) FindAllStringIndex(s string, n int) [][]int {
re.compile()
return re.regexp.FindAllStringIndex(s, n)
}

func (re *Regexp) FindAllStringSubmatch(s string, n int) [][]string {
re.compile()
return re.regexp.FindAllStringSubmatch(s, n)
}

func (re *Regexp) FindAllStringSubmatchIndex(s string, n int) [][]int {
re.compile()
return re.regexp.FindAllStringSubmatchIndex(s, n)
}

func (re *Regexp) FindAllSubmatch(b []byte, n int) [][][]byte {
re.compile()
return re.regexp.FindAllSubmatch(b, n)
}

func (re *Regexp) FindAllSubmatchIndex(b []byte, n int) [][]int {
re.compile()
return re.regexp.FindAllSubmatchIndex(b, n)
}

func (re *Regexp) FindIndex(b []byte) (loc []int) {
re.compile()
return re.regexp.FindIndex(b)
}

func (re *Regexp) FindReaderIndex(r io.RuneReader) (loc []int) {
re.compile()
return re.regexp.FindReaderIndex(r)
}

func (re *Regexp) FindReaderSubmatchIndex(r io.RuneReader) []int {
re.compile()
return re.regexp.FindReaderSubmatchIndex(r)
}

func (re *Regexp) FindString(s string) string {
re.compile()
return re.regexp.FindString(s)
}

func (re *Regexp) FindStringIndex(s string) (loc []int) {
re.compile()
return re.regexp.FindStringIndex(s)
}

func (re *Regexp) FindStringSubmatch(s string) []string {
re.compile()
return re.regexp.FindStringSubmatch(s)
}

func (re *Regexp) FindStringSubmatchIndex(s string) []int {
re.compile()
return re.regexp.FindStringSubmatchIndex(s)
}

func (re *Regexp) FindSubmatch(b []byte) [][]byte {
re.compile()
return re.regexp.FindSubmatch(b)
}

func (re *Regexp) FindSubmatchIndex(b []byte) []int {
re.compile()
return re.regexp.FindSubmatchIndex(b)
}

func (re *Regexp) LiteralPrefix() (prefix string, complete bool) {
re.compile()
return re.regexp.LiteralPrefix()
}

func (re *Regexp) Longest() {
re.compile()
re.regexp.Longest()
}

func (re *Regexp) Match(b []byte) bool {
re.compile()
return re.regexp.Match(b)
}

func (re *Regexp) MatchReader(r io.RuneReader) bool {
re.compile()
return re.regexp.MatchReader(r)
}
func (re *Regexp) MatchString(s string) bool {
re.compile()
return re.regexp.MatchString(s)
}

func (re *Regexp) NumSubexp() int {
re.compile()
return re.regexp.NumSubexp()
}

func (re *Regexp) ReplaceAll(src, repl []byte) []byte {
re.compile()
return re.regexp.ReplaceAll(src, repl)
}

func (re *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte {
re.compile()
return re.regexp.ReplaceAllFunc(src, repl)
}

func (re *Regexp) ReplaceAllLiteral(src, repl []byte) []byte {
re.compile()
return re.regexp.ReplaceAllLiteral(src, repl)
}

func (re *Regexp) ReplaceAllLiteralString(src, repl string) string {
re.compile()
return re.regexp.ReplaceAllLiteralString(src, repl)
}

func (re *Regexp) ReplaceAllString(src, repl string) string {
re.compile()
return re.regexp.ReplaceAllString(src, repl)
}

func (re *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string {
re.compile()
return re.regexp.ReplaceAllStringFunc(src, repl)
}

func (re *Regexp) Split(s string, n int) []string {
re.compile()
return re.regexp.Split(s, n)
}

func (re *Regexp) String() string {
re.compile()
return re.regexp.String()
}

func (re *Regexp) SubexpIndex(name string) int {
re.compile()
return re.regexp.SubexpIndex(name)
}

func (re *Regexp) SubexpNames() []string {
re.compile()
return re.regexp.SubexpNames()
}
6 changes: 6 additions & 0 deletions pkg/regexp/regexp_dontprecompile.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
//go:build !regexp_precompile
// +build !regexp_precompile

package regexp

const precompile = false
6 changes: 6 additions & 0 deletions pkg/regexp/regexp_precompile.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
//go:build regexp_precompile
// +build regexp_precompile

package regexp

const precompile = true
29 changes: 29 additions & 0 deletions pkg/regexp/regexp_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
package regexp

import (
"testing"
)

func TestMatchString(t *testing.T) {
r := Delayed(`[0-9]+`)

if !r.MatchString("100") {
t.Fatalf("Should have matched")
}

if r.MatchString("test") {
t.Fatalf("Should not have matched")
}
}

func TestFindStringSubmatch(t *testing.T) {
r := Delayed(`a=([0-9]+).*b=([0-9]+)`)

if len(r.FindStringSubmatch("a=1,b=2")) != 3 {
t.Fatalf("Should have matched 3")
}

if len(r.FindStringSubmatch("a=1")) != 0 {
t.Fatalf("Should not have matched 0")
}
}
19 changes: 4 additions & 15 deletions pkg/stringid/stringid.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,35 +9,26 @@ import (
"math"
"math/big"
"math/rand"
"regexp"
"strconv"
"strings"
"sync"
"time"

"github.com/containers/storage/pkg/regexp"
)

const shortLen = 12

var (
validShortID *regexp.Regexp
validHex *regexp.Regexp
onceRegex sync.Once
validShortID = regexp.Delayed("^[a-f0-9]{12}$")
validHex = regexp.Delayed(`^[a-f0-9]{64}$`)

rngLock sync.Mutex
rng *rand.Rand // A RNG with seeding properties we control. It can only be accessed with randLock held.
)

func initRegex() {
onceRegex.Do(func() {
validShortID = regexp.MustCompile("^[a-f0-9]{12}$")
validHex = regexp.MustCompile(`^[a-f0-9]{64}$`)
})
}

// IsShortID determines if an arbitrary string *looks like* a short ID.
func IsShortID(id string) bool {
initRegex()

return validShortID.MatchString(id)
}

Expand Down Expand Up @@ -88,8 +79,6 @@ func GenerateNonCryptoID() string {

// ValidateID checks whether an ID string is a valid image ID.
func ValidateID(id string) error {
initRegex()

if ok := validHex.MatchString(id); !ok {
return fmt.Errorf("image ID %q is invalid", id)
}
Expand Down