Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lazy generation of Arb.localDate() edge-cases #3878

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,8 @@ fun Arb.Companion.localDate() = Arb.Companion.localDate(LocalDate.of(1970, 1, 1)
* This generator creates randomly generated LocalDates, in the range [[minDate, maxDate]].
*
* If any of the years in the range contain a leap year, the date [29/02/YEAR] will always be a constant value of this
* generator.
* generator. For exceptional years that are not leap years but can be divided by four, also the date [01/03/YEAR] will
* be a constant value.
*
* @see [localDateTime]
* @see [localTime]
Expand All @@ -59,12 +60,20 @@ fun Arb.Companion.localDate(
minDate > maxDate -> throw IllegalArgumentException("minDate must be before or equal to maxDate")
minDate == maxDate -> Arb.constant(minDate)
else -> {
val leapYears = (minDate.year..maxDate.year).filter { isLeap(it.toLong()) }

val february28s = leapYears.map { LocalDate.of(it, 2, 28) }
val february29s = february28s.map { it.plusDays(1) }
val firstPotentialLeapYear = (minDate.year + 3) and (0.inv() shl 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nigh un-readable to me. Could we just do this instead?

Suggested change
val firstPotentialLeapYear = (minDate.year + 3) and (0.inv() shl 2)
val firstPotentialLeapYear = (minDate.year..minDate.year+8).first { isLeap(it.toLong()) }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies, forgot to reply.
I don't mind making it more readable, but using isLeap makes it inconsistent.

As mentioned above, I opted for "potential leap years" due to it both being simpler and it probably even being an advantage to also have those "special cases" as edge cases for March 1. However, isLeap does only return true on actual leap years, so we'd be mixing two different things.

However, we could just replace it by first { it % 4 == 0 }. Just not sure if that's really that more readable; it will be a (tiny) bit slower though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thought:

val canDivideByFour = 0.inv() shl 2
val firstPotentialLeapYear = (minDate.year + 3) and canDivideByFour

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me at least it % 4 == 0 clearly signals divisibility by 4. If you want to keep the bitwise arithmetic, I think your latest suggestion is better but I would also add a comment on the 2nd line explaining that you're masking away the 2 least significant bits to get a number divisible by 4, or something like it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe calling the variable "divisibleByFourBitmask" does the trick?

val divisibleByFourBitmask = -4 also avoids the bit shift, but the "minus" feels rather weird. However, as it's a signed int, I can't really get around that. The actual bitmask would be written as -0b100, so also not overly obvious.

I'm not a big fan of using the range + first - it might somewhat fool-proof it, but also introduces an obscure constant (+8) that reads weird at first. Maybe really just go for:

val divisibleByFourBitmask = -4
val firstPotentialLeapYear = (minDate.year + 3) and divisibleByFourBitmask

That explains there's a bitmask involved and anyone who doesn't know enough about bitmasks at least has a starting point to look it up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of the last suggestion. Thanks for putting up with the nitpicking.. 👍🏼

val potentialLeapYears = when {
maxDate.year < firstPotentialLeapYear -> 0
else -> (maxDate.year - firstPotentialLeapYear) / 4 + 1
}

arbitrary(february28s + february29s + minDate + maxDate) {
arbitrary(edgecaseFn = { rs ->
val r = rs.random.nextInt(potentialLeapYears + 1)
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(firstPotentialLeapYear + (r - 1) * 4, 2, 28)
.plusDays(rs.random.nextLong(2))
}
}) {
Comment on lines +70 to +76
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val r = rs.random.nextInt(potentialLeapYears + 1)
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(firstPotentialLeapYear + (r - 1) * 4, 2, 28)
.plusDays(rs.random.nextLong(2))
}
}) {
val leapYear = rs.random.nextInt(1..potentialLeapYears) * 4
when {
r == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
else -> LocalDate.of(leapYear, 2, 28).plusDays(rs.random.nextLong(0..1))
}
}) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fairly sure firstPotentialLeapYear got lost in the suggestion, didn't it?
Also the case r == 0 is gone now. Both the +1 and the -1 were important. (number of potential leap years, two cases each, PLUS the two cases for min and max - even distribution)

Means, the random leapYear now obstructs the purpose of both variable and the + 1.
I'm struggling to come up with a concise yet readable alternative... but how about this?

val edgeCase = rs.random.nextInt(potentialLeapYears + 1)
when {
  edgeCase == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
  else -> (firstPotentialLeapYear + (edgeCase - 1) * 4).let { potentialLeapYear -> LocalDate.of(potentialLeapYear, 2, 28).plusDays(rs.random.nextLong(0..1))
}

or

val edgeCase = rs.random.nextInt(potentialLeapYears + 1)
when {
  edgeCase == 0 -> if (rs.random.nextBoolean()) minDate else maxDate
  else -> LocalDate.of(firstPotentialLeapYear, 2, 28).plusYears((edgeCase - 1) * 4).plusDays(rs.random.nextLong(0..1))
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a new arbitrary-builder which lets us provide edgecases as a Sequence instead? Then you could do yield the minDate + maxDate first and then focus on random leap years afterwards?

I'd like to get rid of the +/- 1, since off by one errors are tricky. Is there any way to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the idea, having a sequence would basically fix the order of edge cases, and I'm wondering if that's a good or a bad thing. This brings up some more fundamental questions:

  • Are all edge-cases equal?
  • If not, how to define their weight?
  • How could users override their weight?

In this particular case, any single edge-case has equal weight. That's nice if you want to cover as many different edge-case dates as possible. However, if you have a really wide date range, the likelihood of min and max still being generated is getting really small.

So in this particular case, it's maybe best to change the approach, and first classify edge cases and then keep an equal distribution between those. We might even want to add more edge cases. E.g.:

  1. actual edges (min/max)
  2. (potential) leap years (28, day after 28)
  3. first and last days of a month/year/decade/century/millennium (that would be new - but maybe too much?)

However, I'd at least like to go forward and simply split between min/max and leap-year-cases equally. If I can find an easy solution for it, I also want to separate the "exceptional years" so that those are tested more often. Either option also eliminates the +/- 1.

Any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good approach. It feels like one would expect min/max to be tested and not get diluted to the point of not being tested by the presence of leap years. So I think we can consider there to be classes of edge cases, and some more important than others.. I'm not sure if there's any built-in way that emulates this currently. @myuwono do you know?

I think your suggestion of testing min/max and leap-years equally sounds like a good solution in this case. 🚀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm that I didn't forget about this; unfortunately didn't get around to it yet, but will eventually.

minDate.plusDays(it.random.nextLong(ChronoUnit.DAYS.between(minDate, maxDate) + 1))
}.filter { it in minDate..maxDate }
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ import io.kotest.property.checkAll
import io.kotest.property.forAll
import java.text.SimpleDateFormat
import java.time.LocalDate
import java.time.LocalDate.MAX
import java.time.LocalDate.MIN
import java.time.LocalDate.of
import java.time.LocalDateTime
import java.time.LocalTime
Expand Down Expand Up @@ -85,6 +87,12 @@ class DateTest : WordSpec({
}

"Arb.localDate(minDate, maxDate)" should {
"be able to handle a very large delta between boundaries" {
shouldNotThrowAny {
Arb.localDate(MIN, MAX).take(10_000).toList()
}
}

"Work when min date == max date" {
val date = of(2021, 1, 1)
Arb.localDate(date, date).take(10).toList() shouldBe List(10) { date }
Expand Down