Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify string isAllWhitespace implementation #235

Merged

Conversation

LucianoPAlmeida
Copy link
Contributor

This patch is really very simple one, but I believe can be both a simplification on implementation and a perf improvement in the sense that as it is if we take for example as reference the implementation of trimmingCharacters we note that it will do a lot extra work as allocating buffers or a whole new string maybe, when this only need a simple check that can exit early and also avoid those extra allocations.

I'm not sure if is significant, because I cannot tell if this is used in hot paths within the project and also I didn't measure, so perf improvement is just a guess that makes sense to me, and since it was a simple change I thought in open the PR. So let me know if this makes sense :)

@jshier
Copy link

jshier commented Nov 8, 2021

(Hope you don't mind a drive by nerd snipe.)

I took the liberty of doing some benchmarking here. Each case was 10,000 loops checking isAllWhitespace against a String of 10,000 characters. First case is all space, second is all as, and third is 5,000 as with 2,500 spaces on either end.

extension String {
    func isAllWhitespace() -> Bool {
        trimmingCharacters(in: .whitespacesAndNewlines) == ""
    }

//    func isAllWhitespace() -> Bool {
//        allSatisfy { $0.isNewline || $0.isWhitespace }
//    }
}

let allWhitespace = String(repeating: " ", count: 10_000)
//let noWhitespace = String(repeating: "a", count: 10_000)
//let mixed = String(repeating: " ", count: 2500) + String(repeating: "a", count: 5000) + String(repeating: " ", count: 2500)

var results: [Bool] = []
results.reserveCapacity(10_000)

for _ in 0..<10_000 {
    results.append(allWhitespace.isAllWhitespace())
}

precondition(results.allSatisfy { $0 == true })

Here are the results.

jshier@Jons-iMac isWhitespace % hyperfine './allWhitespaceTrimming'            
Benchmark 1: ./allWhitespaceTrimming
  Time (mean ± σ):     920.7 ms ±   5.6 ms    [User: 911.1 ms, System: 8.5 ms]
  Range (min … max):   911.2 ms … 927.5 ms    10 runs
 
jshier@Jons-iMac isWhitespace % hyperfine './allWhitespaceAllSatisfy' 
Benchmark 1: ./allWhitespaceAllSatisfy
  Time (mean ± σ):      2.143 s ±  0.004 s    [User: 2.122 s, System: 0.019 s]
  Range (min … max):    2.139 s …  2.153 s    10 runs

jshier@Jons-iMac isWhitespace % hyperfine './noWhitespaceTrimming'  
Benchmark 1: ./noWhitespaceTrimming
  Time (mean ± σ):       6.8 ms ±   0.2 ms    [User: 5.3 ms, System: 1.0 ms]
  Range (min … max):     6.5 ms …   7.4 ms    318 runs

jshier@Jons-iMac isWhitespace % hyperfine './noWhitespaceAllSatisfy'     
Benchmark 1: ./noWhitespaceAllSatisfy
  Time (mean ± σ):       3.5 ms ±   0.1 ms    [User: 2.1 ms, System: 0.9 ms]
  Range (min … max):     3.2 ms …   4.1 ms    489 runs
 
  Warning: Command took less than 5 ms to complete. Results might be inaccurate.
 
jshier@Jons-iMac isWhitespace % hyperfine './mixedTrimming'            
Benchmark 1: ./mixedTrimming
  Time (mean ± σ):     492.6 ms ±   3.9 ms    [User: 474.9 ms, System: 16.7 ms]
  Range (min … max):   485.7 ms … 499.4 ms    10 runs
 
jshier@Jons-iMac isWhitespace % hyperfine './mixedAllSatisfy' 
Benchmark 1: ./mixedAllSatisfy
  Time (mean ± σ):     574.6 ms ±   1.9 ms    [User: 567.6 ms, System: 5.9 ms]
  Range (min … max):   571.1 ms … 577.6 ms    10 runs

This isn't exhaustive, but as you can see allSatisfy only beats trimmingCharacters when the string is entirely whitespace. I can't quite explain why allSatisfy is so much slower in the longer case (perhaps checking the properties is slower than set inclusion?) but for larger strings, the more whitespace they contain the bigger the trimmingCharacters advantage. Crossover in the test above seems to be about 1,000 spaces padding 8,000 as.

You can also take a look at the rough source of trimmingCharacters and see that it operates on the underlying String buffer and subranges until it needs to materialize the result, it doesn't directly mutate or otherwise create copies until then.

In the end, depending on typical use, this may be a useful change for small strings of that usually not whitespace, but I'm not sure the difference on the small end is work the larger difference on the high end. That's up to the project.

@MaxDesiatov
Copy link
Collaborator

I wonder if benchmarks could show the same performance difference on Linux and Windows, since closed-source Foundation on Mac may behave in unexpected ways.

Also, if we could infer some typical length of strings where one method is beneficial over the other, maybe we could keep both applying them conditionally?

@LucianoPAlmeida
Copy link
Contributor Author

Thank you @jshier for running benchmarks, as always perf hints can be really surprising when we measure it.

I can't quite explain why allSatisfy is so much slower in the longer case (perhaps checking the properties is slower than set inclusion?) but for larger strings, the more whitespace they contain the bigger the trimmingCharacters advantage.

Changed to operate directly on unicode view using char set, seems to close the huge gap...

lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./allWhiteSpaceAllsatisfy --prepare true 
Benchmark 1: ./allWhiteSpaceAllsatisfy
  Time (mean ± σ):     754.6 ms ±   0.2 ms    [User: 752.6 ms, System: 1.4 ms]
  Range (min … max):   754.4 ms … 754.9 ms    10 runs
 
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./allWhiteSpaceTrimmed --prepare true 
Benchmark 1: ./allWhiteSpaceTrimmed
  Time (mean ± σ):     806.3 ms ±   1.1 ms    [User: 804.1 ms, System: 1.6 ms]
  Range (min … max):   805.3 ms … 808.2 ms    10 runs
 
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./noWhitespacesAllsatisfy --prepare true 
Benchmark 1: ./noWhitespacesAllsatisfy
  Time (mean ± σ):      59.7 ms ± 169.7 ms    [User: 5.7 ms, System: 1.2 ms]
  Range (min … max):     3.5 ms … 542.7 ms    10 runs
 
  Warning: Command took less than 5 ms to complete. Results might be inaccurate.
 
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./noWhiteSpaceTrimmed --prepare true 
Benchmark 1: ./noWhiteSpaceTrimmed
  Time (mean ± σ):      23.7 ms ±  60.6 ms    [User: 6.7 ms, System: 0.8 ms]
  Range (min … max):     5.1 ms … 225.2 ms    13 runs
 
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./mixedAllsatisfy --prepare true 
Benchmark 1: ./mixedAllsatisfy
  Time (mean ± σ):     215.6 ms ±  75.8 ms    [User: 194.4 ms, System: 0.9 ms]
  Range (min … max):   191.3 ms … 431.3 ms    10 runs
 
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./mixedWhiteSpaceTrimmed --prepare true 
Benchmark 1: ./mixedWhiteSpaceTrimmed
  Time (mean ± σ):     441.5 ms ±  78.3 ms    [User: 414.3 ms, System: 5.5 ms]
  Range (min … max):   416.3 ms … 664.4 ms    10 runs

Then I think is just see if is worth for the project, as mixed also seems to be an improvement

@MaxDesiatov
Copy link
Collaborator

Did all of these benchmarks run in release mode? I still would love to see how it performs on Linux and Windows, but overall this is definitely something I'd consider merging 👍

@LucianoPAlmeida
Copy link
Contributor Author

I wonder if benchmarks could show the same performance difference on Linux and Windows, since closed-source Foundation on Mac may behave in unexpected ways.

It should be possible since this tool work in Linux and Windows as well, but unfortunately I don't have the environment.

Did all of these benchmarks run in release mode? I still would love to see how it performs on Linux and Windows, but overall this is definitely something I'd consider merging 👍

Yeah, I just ran in with -O but could consider experimenting with -Osize if you guys think is worth it :)

@MaxDesiatov
Copy link
Collaborator

It should be possible since this tool work in Linux and Windows as well, but unfortunately I don't have the environment.

I wanted to set up some reproducible benchmarks that measure relative changes in performance between main and PR branches across all platforms. I'm looking forward to running that comparison for this PR when that's ready, but it may take some time.

Yeah, I just ran in with -O but could consider experimenting with -Osize if you guys think is worth it :)

Great, I assume -O is equivalent to swift build -c release?

@LucianoPAlmeida
Copy link
Contributor Author

Great, I assume -O is equivalent to swift build -c release?

I compiled a single standalone file, so using swiftc directly, but as far as I understand swift build using release should invoke the compiler with -O

@jshier
Copy link

jshier commented Nov 8, 2021

I've made some benchmarks using swift-collection-benchmark. Here's the first set of results for 100% whitespace strings up to 16M elements.

chart

As you can see, trimming has a lower floor in this case, possibly due to its raw iteration over the String's buffer instead of going through an actual Swift Iterator.

I'll work on generating cases with different combinations of whitespace.

@jshier
Copy link

jshier commented Nov 8, 2021

Don't really have any other interesting results, but I've added another run that's 50% leading whitespace. Similar curve, but the allSatisfy / trimmingCharacters crossover happens later, possible due to more work materializing the output string for comparison.

chart

@LucianoPAlmeida
Copy link
Contributor Author

Thank you for the benchmarks @jshier, my initial guess was exactly that avoid materialization of string for compare would be a perf win, but I guess the benchmarks show that it is not true. Have you tested cases with white space in the back, I think this would be a win for allSatisfy because it can bail early while trimming has to look both sides.
And again, thanks for the benchmarks :)

@jshier
Copy link

jshier commented Nov 9, 2021

Back half is the same as just having no whitespace at all: both implementations early out, so there's a fixed difference between them, just like the early part of these graphs where the setup cost of trimmingCharacters hasn't yet paid off in faster iteration.

@MaxDesiatov
Copy link
Collaborator

Thanks for the detailed charts @jshier! What tools did you use to generate them and to collect benchmark data? I was going to use https://github.com/google/swift-benchmark for CI benchmarking, but I see you probably use something fancier?

@LucianoPAlmeida
Copy link
Contributor Author

Back half is the same as just having no whitespace at all: both implementations early out, so there's a fixed difference between them, just like the early part of these graphs where the setup cost of trimmingCharacters hasn't yet paid off in faster iteration.

That is unexpected, because for a string with no whitespaces in front I would expect allSafisfy to be noop, because it can bail on first char, while trimming has to loop and trim char in the back. I run the bench locally and it does show the expected results

chart

Here are the generators:

// 50%
let inputGenerator: (Int) -> String = { size in
    return String(repeating: "a", count: size / 2) + String(repeating: " ", count: size / 2)
}
// 25% 50%(non-whitespaces) 25%
let inputGenerator: (Int) -> String = { size in
    return String(repeating: " ", count: size / 4) + String(repeating: "a", count: size / 2) + String(repeating: " ", count: size / 4)
}

@LucianoPAlmeida
Copy link
Contributor Author

I was going to use https://github.com/google/swift-benchmark for CI benchmarking, but I see you probably use something fancier?

@MaxDesiatov That would be https://github.com/apple/swift-collections-benchmark, it does produce the results in a json format as well, so it probably can be used on a CI setting too :)

@jshier
Copy link

jshier commented Nov 10, 2021

Yes, I used swift-collection-benchmark as @LucianoPAlmeida mentioned.

@LucianoPAlmeida you're right, the backside benchmarks are more interesting as you show. In that case trimmingCharacters doesn't get to early out like allSatisfy does. I'll put together some final results but it seems likely the smaller string performance wins for allSatisfy may be the most important here, but I haven't checked where isAllWhitespace() is actually called.

@jshier
Copy link

jshier commented Nov 10, 2021

Here are some final results. Unfortunately the label is a bit in the way but you can get the overall idea.

chart

@jshier
Copy link

jshier commented Nov 10, 2021

Here's my bencher if you want to play with it.

isWhitespaceBencher.zip

@LucianoPAlmeida
Copy link
Contributor Author

Thank you @jshier for the benchmarks, really good overview the last one!
In my opinion for my understanding of the results, overall it seems like a nice improvement because where it loses the difference is almost negligible but when it can bail early the difference is quite good. Let me know your thoughts about it and if that's feels like a good change for the project :)

@MaxDesiatov
Copy link
Collaborator

LGTM, thank you both for the detailed investigation!

@MaxDesiatov MaxDesiatov merged commit 57d3b8f into CoreOffice:main Nov 11, 2021
@LucianoPAlmeida LucianoPAlmeida deleted the minor-extension-modification branch May 14, 2022 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants