Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Behavior of Gen.listOfN behaves different to what I was assuming based on the documentation #844

Closed
L7R7 opened this issue Oct 14, 2021 · 11 comments · May be fixed by #845
Closed

Behavior of Gen.listOfN behaves different to what I was assuming based on the documentation #844

L7R7 opened this issue Oct 14, 2021 · 11 comments · May be fixed by #845

Comments

@L7R7
Copy link

L7R7 commented Oct 14, 2021

I have a Generator for a list of elements. To limit the number of elements in that list, I used Gen.listOfN instead of Gen.listOf.

The Scaladoc says "Generates a list with at most the given number of elements", so I was assuming that the generated lists will have different lengths, but never more than the provided value for the argument n.
But instead what I see is that the lists always seem to be of length n.

I would assume that this is the intended behavior and not a bug (based on a quick look at buildableOfN, it seems to be intended to aim for n elements in the result), so probably I was fooled by the docs.

Theoretically I could go with Gen.list(...).map(_.take(n)), but that wouldn't be efficient because the first generator would generate a lot of values that are discarded by take.

Or something like:

def listGen[T](n: Int)(implicit gen: Gen[T]): Gen[List[T]] = {
  for {
    length <- Gen.choose(0, n)
    ts <- Gen.listOfN(length, gen)
  } yield ts
}

which feels primitive enough to make me expect it in scalacheck directly.

Am I missing something here? Is there a different way to write an efficient generator for a list that produces lists of different lengths with a given maximum length?

@satorg
Copy link
Contributor

satorg commented Oct 15, 2021

I think the scaladoc here is quite misleading. The listOfN function calls buildableOfN which generalizes the generation for any Buildable and which docs says the following:

The size of the generated container is limited by n. Depending on what kind of container that is generated, the resulting container may contain fewer elements than n, but not more.

That basically means that if the supplied Buildable can drop some elements then the result size can be less than n. For example:

Gen.buildableOfN[Set[Char], Char](37, Gen.numChar)

Apparently this generator will never produce more than 10 elements since the inner generator produces chars in range 0..9.

But I cannot come up with any example where listOfN could generate less then the passed n.

@rossabaker
Copy link
Member

It does what I expect from the name and not what I expect from the docs and I usually end up sampling to remember what it really does.

I don't know that it should be a primitive, but I've had luck mixing it with Poisson when there's no hard max. It gives me outliers without killing test performance:

val g = Gen.poisson(10).flatMap(n => Gen.listOfN(n, Gen.asciiChar))

@ashawley
Copy link
Contributor

ashawley commented Oct 15, 2021

One concern with fixing is that people have written generators and tests that also do what they "expected", and what ScalaCheck does(!), and not what the doc describes. However, we could probably try making this change in a minor release, which would be 1.16.0, and add release notes and see how it goes.

@ashawley
Copy link
Contributor

Indeed, there's probably not a way to change buildableOfN because it is the underlying implementation of other collections like Set and Map and would have averse outcomes for even trivial generators.

@ashawley
Copy link
Contributor

ashawley commented Oct 15, 2021

I opened an initial PR, #845, to give more variation to the size for Gen.listOfN. This variation only modifies this method for Lists, and won't affect other collections, for the reasons I said above. There's a chance we could try it for others.

There were tests even in ScalaCheck's test suite that were disrupted by this change. Changing a popular method like listOfN is likely going to cause failures for any users who also have fragile tests that are expecting to generate lists of an exact size of N.

@ashawley
Copy link
Contributor

Sergey commented on the PR and suggested that these generator methods shouldn't behave inconsistently.

I am not sure that it is a good idea overall. Having listOfN which behaves differently from containerOfN and buildableOfN would be really confusing. Personally I like to have all of them behaving in a similar way.

Wouldn't it be better to create a separate function (or functions) for creating generators like this one above? Something like listOfAtMostN, containerOfAtMostN and buildableOfAtMostN

@ashawley
Copy link
Contributor

It's a valid point and why I did plan on investigating the other collection types besides List. The current naming conventions in Gen aren't always ideal, and the duplicate implementations aren't really helping avoid confusion, so this may be an opportunity to make improvements.

@ashawley
Copy link
Contributor

Georgi Krastev mades a similar conclusion on the PR on what it is that could be done:

I will just weigh in my opinion as a user - I think changing the behaviour in this way has the potential to break too many tests. Also the current behaviour is consistent with the name listOfN. For these reasons I think it would be better to just change the documentation of listOfN and add new methods with "at most" semantics. If it's possible, calling buildableOfN for Set and Map could then be deprecated.

@L7R7
Copy link
Author

L7R7 commented Oct 18, 2021

Sounds like I was tricked by the documentation. In fact, the name of the method is pretty accurate and the scaladoc is not.

Wouldn't it be better to create a separate function (or functions) for creating generators like this one above? Something like listOfAtMostN, containerOfAtMostN and buildableOfAtMostN

That sounds like exactly what I was looking for, I'd love to see them in scalacheck :) If that's something the maintainers want and nobody else would like to work on that (and if it's a good first issue), I'd be happy to spend some time on implementing them.

I don't know that it should be a primitive, but I've had luck mixing it with Poisson when there's no hard max. It gives me outliers without killing test performance:

val g = Gen.poisson(10).flatMap(n => Gen.listOfN(n, Gen.asciiChar))

I'll give that one a try for now and see if it helps in my particular case

@satorg
Copy link
Contributor

satorg commented Apr 2, 2022

@L7R7 do you think this issue is safe to be closed now?

@L7R7
Copy link
Author

L7R7 commented Apr 2, 2022

Yes, can be closed.

@L7R7 L7R7 closed this as completed Apr 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants