Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid using Double in HashTable implementation #73583

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kubamracek
Copy link
Contributor

Using floating point in the HashTable implementation is (1) unfriendly to embedded environments which might not have floating point support in hardware and/or would need to pay the codesize cost of software fp libraries, (2) seemingly not necessary.

@kubamracek kubamracek requested a review from lorentey May 11, 2024 22:44
@kubamracek kubamracek requested a review from a team as a code owner May 11, 2024 22:44
@kubamracek
Copy link
Contributor Author

@swift-ci please test

@kubamracek
Copy link
Contributor Author

@swift-ci please test

@kubamracek kubamracek added the embedded Embedded Swift label May 12, 2024
}

internal static func capacity(forScale scale: Int8) -> Int {
let bucketCount = (1 as Int) &<< scale
return Int(Double(bucketCount) * maxLoadFactor)
return bucketCount * maxLoadFactor.numerator / maxLoadFactor.denominator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 32b and smaller systems, we plausibly might care about intermediate overflow here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions? We could ifdef this change to apply just to #if $Embedded, perhaps it would be reasonable to assume that 32-bit embedded systems will never have use a hashtable that's on the order of ~1 billion entries?

Or would using 64-bit arithmetics even on 32-bit systems be reasonable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect to want to tweak the load factor, @lorentey?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but not on a whim! It'd be fine if it took some effort to tweak it.

(75% is very much on the lower end of the slider, so we're trading a bit of memory for lookup speed.)

capacity(forScale:) and scale(forCapacity:) are only ever called when allocating a new storage instance, so they do not need to be super quick. (Both the scale and the capacity are stored in the storage header.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, so would it be reasonable to replace the multiply-then-divide (possible overflow) with divide-then-multiply (no overflow) plus handling the remainder separately?

Int((Double(capacity) / maxLoadFactor).rounded(.up)),
func divideRoundingUp(n: Int, by: Int) -> Int { (n + (by - 1)) / by }
let minimumEntries = Swift.max(divideRoundingUp(
n: capacity * maxLoadFactor.denominator, by: maxLoadFactor.numerator),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embedded Embedded Swift
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants