- Sponsor
-
Notifications
You must be signed in to change notification settings - Fork 459
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement SampleUniform for char #1068
Conversation
It would be nice to document the behavior that skips some chars. |
src/distributions/uniform.rs
Outdated
fn new_(mut low: u32, mut high: u32) -> Self { | ||
if low >= 0xD800 { | ||
low -= 0xE000 - 0xD800; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth to add the following helper functions:
/// UTF-16 surrogate range start
const RANGE_START: u32 = 0xD800
/// UTF-16 surrogate range size
const RANGE_SIZE: u32 = 0xE000 - RANGE_START;
/// Convert `char` to compressed `u32`
fn to_comp_u32(c: char) -> u32 {
match c as u32 {
c if c >= RANGE_START => c - RANGE_SIZE,
c => c,
}
}
/// Convert compressed `u32` to `char`
fn from_comp_u32(mut c: u32) -> u32 {
if c >= RANGE_START {
c += RANGE_SIZE;
}
// SAFETY: checked that `c` is a legal unicode value, since `c` is initially
// constructed from `char` it can not be larger than 0xFFFE, i.e. safety of
// this function relies on the constructor methods
unsafe { core::char::from_u32_unchecked(c) }
}
In addition to make code easier to understand it would allow to remove the new_
method.
IIRC it's allowed to have private functions safety of which relies on code in the current module (e.g. it applies to the Vec
code).
I wonder if in future it will be possible to define a sampler which would accept several ranges: fn new<const N: usize>(ranges: [RangeInclusive<char>; N]) -> Self { .. } It would allow us to write |
Updated. I prefer to keep |
The idea was to reduce number of fn new<B1, B2>(low_b: B1, high_b: B2) -> Self
where
B1: SampleBorrow<Self::X> + Sized,
B2: SampleBorrow<Self::X> + Sized,
{
let low = to_comp_u32(*low_b.borrow());
let high = to_comp_u32(*high_b.borrow());
let sampler = UniformInt::<u32>::new(low, high);
Self { sampler }
}
fn new_inclusive<B1, B2>(low_b: B1, high_b: B2) -> Self
where
B1: SampleBorrow<Self::X> + Sized,
B2: SampleBorrow<Self::X> + Sized,
{
let low = to_comp_u32(*low_b.borrow());
let high = to_comp_u32(*high_b.borrow());
let sampler = UniformInt::<u32>::new_inclusive(low, high);
Self { sampler }
}
fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Self::X {
from_comp_u32(self.sampler.sample(rng))
} I think it will be a bit easier to understand. Note that we do not need |
It's the same number of branches (other than you removing some asserts — ones that should be there for good error reporting and should allow the ones within UniformInt to be optimised out). But okay, it does look a little neater. |
I implemented your suggestion @newpavlov. The error messages would be identical (other than code location), so they are mostly redundant. |
This would require a different implementation. It could be generalised but then the distribution object would require allocation (unless we used something like |
Updated with a couple of fixes. @vks @newpavlov are you happy with this now? |
Looks good to me! |
Apparently this is a problem now:
The I guess the solution is to import |
BTW once CI passes I still need a "green review" from one of you to be able to merge this. |
Sorry, I struggled a bit how to do that with the GitHub app... |
The same error is still reported... ideas? I can't actually reproduce the error message locally, I just noticed the same symptoms. |
I also don't get the warning, but the link is not working anyway. IIRC, we always had that problem with the reexported distributions? |
Maybe (doesn't look like a new problem), but I thought CI used to pass. Whatever. |
So it seems an invalid link is generated: Note that the Maybe a rustdoc bug? |
Additional fixes: - Nightly: rename of partition_at_index → select_nth_unstable - Fix doc links in rand_distr
This enables
rng.gen_range('A'..='Z')
.I don't recall why we didn't do this before but I don't see any real issues. Thoughts @vks / @newpavlov?