Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lazy generation of Arb.localDate() edge-cases #3878

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

oliverblaha
Copy link
Contributor

See #3877.

This slightly changes current behavior, as exceptional years (can be divided by four, but are not leap years, like 1900 or 2100) will also have an edge case on the day after February 28.

I considered excluding them, leading to either some imbalance in edge-case probabilities or to some more complex (and less readable) calculations. However, essentially I believe that - due to those years commonly being considered exceptions to the "usual" rule of "every four years", it's probably even desirable to have March 1 being an edge case for those years.

Comment on lines +70 to +76
val r = rs.random.nextInt(potentialLeapYears + 1)
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(firstPotentialLeapYear + (r - 1) * 4, 2, 28)
.plusDays(rs.random.nextLong(2))
}
}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val r = rs.random.nextInt(potentialLeapYears + 1)
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(firstPotentialLeapYear + (r - 1) * 4, 2, 28)
.plusDays(rs.random.nextLong(2))
}
}) {
val leapYear = rs.random.nextInt(1..potentialLeapYears) * 4
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(leapYear, 2, 28).plusDays(rs.random.nextLong(0..1))
}
}) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fairly sure firstPotentialLeapYear got lost in the suggestion, didn't it?
Also the case r == 0 is gone now. Both the +1 and the -1 were important. (number of potential leap years, two cases each, PLUS the two cases for min and max - even distribution)

Means, the random leapYear now obstructs the purpose of both variable and the + 1.
I'm struggling to come up with a concise yet readable alternative... but how about this?

val edgeCase = rs.random.nextInt(potentialLeapYears + 1)
when {
  edgeCase == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
  else -> (firstPotentialLeapYear + (edgeCase - 1) * 4).let { potentialLeapYear -> LocalDate.of(potentialLeapYear, 2, 28).plusDays(rs.random.nextLong(0..1))
}

or

val edgeCase = rs.random.nextInt(potentialLeapYears + 1)
when {
  edgeCase == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
  else -> LocalDate.of(firstPotentialLeapYear, 2, 28).plusYears((edgeCase - 1) * 4).plusDays(rs.random.nextLong(0..1))
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a new arbitrary-builder which lets us provide edgecases as a Sequence instead? Then you could do yield the minDate + maxDate first and then focus on random leap years afterwards?

I'd like to get rid of the +/- 1, since off by one errors are tricky. Is there any way to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the idea, having a sequence would basically fix the order of edge cases, and I'm wondering if that's a good or a bad thing. This brings up some more fundamental questions:

  • Are all edge-cases equal?
  • If not, how to define their weight?
  • How could users override their weight?

In this particular case, any single edge-case has equal weight. That's nice if you want to cover as many different edge-case dates as possible. However, if you have a really wide date range, the likelihood of min and max still being generated is getting really small.

So in this particular case, it's maybe best to change the approach, and first classify edge cases and then keep an equal distribution between those. We might even want to add more edge cases. E.g.:

  1. actual edges (min/max)
  2. (potential) leap years (28, day after 28)
  3. first and last days of a month/year/decade/century/millennium (that would be new - but maybe too much?)

However, I'd at least like to go forward and simply split between min/max and leap-year-cases equally. If I can find an easy solution for it, I also want to separate the "exceptional years" so that those are tested more often. Either option also eliminates the +/- 1.

Any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good approach. It feels like one would expect min/max to be tested and not get diluted to the point of not being tested by the presence of leap years. So I think we can consider there to be classes of edge cases, and some more important than others.. I'm not sure if there's any built-in way that emulates this currently. @myuwono do you know?

I think your suggestion of testing min/max and leap-years equally sounds like a good solution in this case. 🚀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm that I didn't forget about this; unfortunately didn't get around to it yet, but will eventually.


val february28s = leapYears.map { LocalDate.of(it, 2, 28) }
val february29s = february28s.map { it.plusDays(1) }
val firstPotentialLeapYear = (minDate.year + 3) and (0.inv() shl 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nigh un-readable to me. Could we just do this instead?

Suggested change
val firstPotentialLeapYear = (minDate.year + 3) and (0.inv() shl 2)
val firstPotentialLeapYear = (minDate.year..minDate.year+8).first { isLeap(it.toLong()) }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, forgot to reply.
I don't mind making it more readable, but using isLeap makes it inconsistent.

As mentioned above, I opted for "potential leap years" due to it both being simpler and it probably even being an advantage to also have those "special cases" as edge cases for March 1. However, isLeap does only return true on actual leap years, so we'd be mixing two different things.

However, we could just replace it by first { it % 4 == 0 }. Just not sure if that's really that more readable; it will be a (tiny) bit slower though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought:

val canDivideByFour = 0.inv() shl 2
val firstPotentialLeapYear = (minDate.year + 3) and canDivideByFour

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me at least it % 4 == 0 clearly signals divisibility by 4. If you want to keep the bitwise arithmetic, I think your latest suggestion is better but I would also add a comment on the 2nd line explaining that you're masking away the 2 least significant bits to get a number divisible by 4, or something like it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe calling the variable "divisibleByFourBitmask" does the trick?

val divisibleByFourBitmask = -4 also avoids the bit shift, but the "minus" feels rather weird. However, as it's a signed int, I can't really get around that. The actual bitmask would be written as -0b100, so also not overly obvious.

I'm not a big fan of using the range + first - it might somewhat fool-proof it, but also introduces an obscure constant (+8) that reads weird at first. Maybe really just go for:

val divisibleByFourBitmask = -4
val firstPotentialLeapYear = (minDate.year + 3) and divisibleByFourBitmask

That explains there's a bitmask involved and anyone who doesn't know enough about bitmasks at least has a starting point to look it up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of the last suggestion. Thanks for putting up with the nitpicking.. 👍🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants