utils: ConvertToBytes #1875

Merged
merged 5 commits into from Apr 24, 2022
5 changes: 5 additions & 0 deletions utils/README.md
@@ -84,4 +84,9 @@ Benchmark_Trim/default-16 18457221 66.0 ns
Benchmark_Trim/default-16 18177328 65.9 ns/op 32 B/op 1 allocs/op
Benchmark_Trim/default.trimspace-16 188933770 6.33 ns/op 0 B/op 0 allocs/op
Benchmark_Trim/default.trimspace-16 184007649 6.42 ns/op 0 B/op 0 allocs/op

Benchmark_ConvertToBytes/fiber-8 3541161 317.9 ns/op 128 B/op 2 allocs/op
Benchmark_ConvertToBytes/fiber-8 3465039 336.8 ns/op 128 B/op 2 allocs/op
Benchmark_ConvertToBytes/default-8 311490284 3.980 ns/op 0 B/op 0 allocs/op
Benchmark_ConvertToBytes/default-8 313441532 3.835 ns/op 0 B/op 0 allocs/op
```
31 changes: 31 additions & 0 deletions utils/common.go
@@ -8,6 +8,9 @@ import (
"crypto/rand"
"encoding/binary"
"encoding/hex"
"regexp"
"strconv"
"strings"

"net"
"os"
@@ -22,6 +25,11 @@ import (
const (
toLowerTable = "\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\u007f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
toUpperTable = "\x00\x01\x02\x03\x04\x05\x06\a\b\t\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~\u007f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
kb = 1000
mb = 1000 * kb
gb = 1000 * mb
tb = 1000 * gb
pb = 1000 * tb
)

// Copyright © 2014, Roger Peppe
@@ -32,6 +40,8 @@ var (
uuidSeed [24]byte
uuidCounter uint64
uuidSetup sync.Once
unitsMap = map[string]int64{"k": kb, "m": mb, "g": gb, "t": tb, "p": pb}
sizeRegex = regexp.MustCompile(`(?i)^(\d+(\.\d+)*) ?([kmgtp])?b?$`)
Member

Do we really need a regex here?
Can you explain this regex in more detail? I would like to try doing this with string operations instead of a regex.

Contributor Author

(Based on my extensive experience with regexp patterns.)

This regexp is a simple pattern for extracting the float value and the unit name.

It reads roughly as:

(start of string)(integer or float value)(optional space)(optional unit letter)(optional "b")(end of string)

So all of the following examples are valid:

42
42m
42 m
42 Mb
42.5 Mb

The pattern is correct, and many developers use it across different packages, frameworks, and languages.

And I really don't know how to parse a human-readable value using only string functions.
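
To make the capture groups described above concrete, here is a minimal sketch that runs the same pattern over the listed examples. The helper name `splitSize` is hypothetical and only exists for this illustration; the pattern itself is copied from the diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as sizeRegex in the diff.
var sizeRegex = regexp.MustCompile(`(?i)^(\d+(\.\d+)*) ?([kmgtp])?b?$`)

// splitSize is a hypothetical helper for illustration only. It returns the
// numeric part and the unit letter captured by the pattern, and ok=false
// when the input does not match at all.
func splitSize(s string) (value, unit string, ok bool) {
	m := sizeRegex.FindStringSubmatch(s)
	if m == nil {
		return "", "", false
	}
	// m[0] is the whole match, m[1] the number, m[2] the last fractional
	// part (unused here), m[3] the optional unit letter.
	return m[1], m[3], true
}

func main() {
	for _, in := range []string{"42", "42m", "42 m", "42 Mb", "42.5 Mb", "MB"} {
		value, unit, ok := splitSize(in)
		fmt.Printf("%q -> value=%q unit=%q ok=%v\n", in, value, unit, ok)
	}
}
```

The unit letter comes back in its original case, which is why the function in the diff lowercases it before looking it up in `unitsMap`.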

Member

I tried the string-based recognition:
4491602

old:
Benchmark_ConvertToBytes/fiber-8     3541161        317.9 ns/op          128 B/op          2 allocs/op
Benchmark_ConvertToBytes/fiber-8     3465039        336.8 ns/op          128 B/op          2 allocs/op

new:
Benchmark_ConvertToBytes/fiber-12    32883782       33.76 ns/op            0 B/op          0 allocs/op
Benchmark_ConvertToBytes/fiber-12    36084900       33.47 ns/op            0 B/op          0 allocs/op

Member

@webdevium are you okay with the new code? It is a little more code now, but also about 10 times faster.

Contributor Author

@ReneWerner87 Awesome!
I don't have enough golang experience to do things like this without regular expressions :)

Thank you for your help.

Contributor Author

@ReneWerner87 But the method panics if the human-readable string is empty :(
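
For readers following the thread, here is one way a regex-free parser could look. This is only a sketch under assumptions: it is not the code from the commit referenced above, and the names `parseSize` and `isDecimal` are hypothetical. Every index into the string is guarded by a length check, so an empty input returns 0 instead of panicking.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isDecimal reports whether s is made of digits with at most one '.'
// that has digits on both sides (for example "42" or "42.5").
func isDecimal(s string) bool {
	dot := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= '0' && c <= '9':
		case c == '.' && !dot && i > 0 && i < len(s)-1:
			dot = true
		default:
			return false
		}
	}
	return len(s) > 0
}

// parseSize is a hypothetical string-based alternative to the regex version.
// It returns 0 for anything it does not recognize, including "".
func parseSize(s string) int {
	// Drop an optional trailing "b" or "B".
	if n := len(s); n > 0 && (s[n-1] == 'b' || s[n-1] == 'B') {
		s = s[:n-1]
	}
	// Pick up an optional unit letter (decimal multiples, as in the diff).
	mul := 1.0
	if n := len(s); n > 0 {
		switch s[n-1] {
		case 'k', 'K':
			mul = 1e3
		case 'm', 'M':
			mul = 1e6
		case 'g', 'G':
			mul = 1e9
		case 't', 'T':
			mul = 1e12
		case 'p', 'P':
			mul = 1e15
		}
		if mul != 1 {
			s = s[:n-1]
		}
	}
	// Allow a single space between the number and the unit.
	s = strings.TrimSuffix(s, " ")
	if !isDecimal(s) {
		return 0
	}
	size, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0
	}
	return int(size * mul)
}

func main() {
	for _, in := range []string{"", "42", "42 KB", "42.5MB", "MB"} {
		fmt.Printf("%q -> %d\n", in, parseSize(in))
	}
}
```

Validating the digits by hand before calling `strconv.ParseFloat` keeps inputs such as "Inf" or "1e3", which `ParseFloat` would otherwise accept, out of the result.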

)

// UUID generates an universally unique identifier (UUID)
@@ -109,3 +119,24 @@ func IncrementIPRange(ip net.IP) {
}
}
}

// ConvertToBytes returns integer size of bytes from human-readable string, ex. 42kb, 42M
// Returns 0 if string is unrecognized
func ConvertToBytes(humanReadableString string) int {
matches := sizeRegex.FindStringSubmatch(humanReadableString)
if len(matches) != 4 {
return 0
}

size, err := strconv.ParseFloat(matches[1], 64)
if err != nil {
return 0
}

unitPrefix := strings.ToLower(matches[3])
if mul, ok := unitsMap[unitPrefix]; ok {
size *= float64(mul)
}

return int(size)
}
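
As a quick usage check, the following standalone program reproduces the function from this diff (under the hypothetical lowercase name `convertToBytes`, so it does not claim to be the package API) and runs it on a few edge cases. It suggests three things worth knowing: an empty string does not match the pattern and returns 0, a fractional byte count is truncated, and an input such as "1.2.3" passes the regex because of the `(\.\d+)*` group but is then rejected by `strconv.ParseFloat`.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Pattern and multipliers copied from the diff.
var (
	unitsMap  = map[string]int64{"k": 1e3, "m": 1e6, "g": 1e9, "t": 1e12, "p": 1e15}
	sizeRegex = regexp.MustCompile(`(?i)^(\d+(\.\d+)*) ?([kmgtp])?b?$`)
)

// convertToBytes mirrors ConvertToBytes from the diff; the lowercase name
// is only used to keep this example self-contained.
func convertToBytes(humanReadableString string) int {
	matches := sizeRegex.FindStringSubmatch(humanReadableString)
	if len(matches) != 4 {
		return 0 // no match, which includes the empty string
	}
	size, err := strconv.ParseFloat(matches[1], 64)
	if err != nil {
		return 0 // e.g. "1.2.3" matches the regex but is not a float
	}
	if mul, ok := unitsMap[strings.ToLower(matches[3])]; ok {
		size *= float64(mul)
	}
	return int(size) // truncates any fractional byte
}

func main() {
	for _, in := range []string{"", "1.5", "1.5k", "42 MB", "1.2.3"} {
		fmt.Printf("%q -> %d\n", in, convertToBytes(in))
	}
}
```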
42 changes: 42 additions & 0 deletions utils/common_test.go
@@ -7,6 +7,7 @@ package utils
import (
"crypto/rand"
"fmt"
"strconv"
"testing"
)

@@ -89,3 +90,44 @@ func Benchmark_UUID(b *testing.B) {
AssertEqual(b, 36, len(res))
})
}

func Test_ConvertToBytes(t *testing.T) {
t.Parallel()
AssertEqual(t, 42, ConvertToBytes("42"))
AssertEqual(t, 42, ConvertToBytes("42b"))
AssertEqual(t, 42, ConvertToBytes("42B"))
AssertEqual(t, 42, ConvertToBytes("42 b"))
AssertEqual(t, 42, ConvertToBytes("42 B"))

AssertEqual(t, 42*1000, ConvertToBytes("42k"))
AssertEqual(t, 42*1000, ConvertToBytes("42K"))
AssertEqual(t, 42*1000, ConvertToBytes("42kb"))
AssertEqual(t, 42*1000, ConvertToBytes("42KB"))
AssertEqual(t, 42*1000, ConvertToBytes("42 kb"))
AssertEqual(t, 42*1000, ConvertToBytes("42 KB"))

AssertEqual(t, 42*1000000, ConvertToBytes("42M"))
AssertEqual(t, int(42.5*1000000), ConvertToBytes("42.5MB"))
AssertEqual(t, 42*1000000000, ConvertToBytes("42G"))

AssertEqual(t, 0, ConvertToBytes("string"))
AssertEqual(t, 0, ConvertToBytes("MB"))
}

// go test -v -run=^$ -bench=Benchmark_ConvertToBytes -benchmem -count=2

func Benchmark_ConvertToBytes(b *testing.B) {
var res int
b.Run("fiber", func(b *testing.B) {
for n := 0; n < b.N; n++ {
res = ConvertToBytes("42B")
}
AssertEqual(b, 42, res)
})
b.Run("default", func(b *testing.B) {
for n := 0; n < b.N; n++ {
res, _ = strconv.Atoi("42")
}
AssertEqual(b, 42, res)
})
}