Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Use a single seed value per faker function invocation #2664

Open
ST-DDT opened this issue Feb 12, 2024 · 4 comments · May be fixed by #1499
Open

Proposal: Use a single seed value per faker function invocation #2664

ST-DDT opened this issue Feb 12, 2024 · 4 comments · May be fixed by #1499
Assignees
Labels
c: feature Request for new feature p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug s: on hold Blocked by something or frozen to avoid conflicts
Milestone

Comments

@ST-DDT
Copy link
Member

ST-DDT commented Feb 12, 2024

Clear and concise description of the problem

Glossary:

  • seed value: direct or indirect invocation of randomizer.next() or equivalents thereof

Currently when invoking a faker function it may consume an unknown number of seed values (0-n).
While that isn't bad by itself, it has the side effect, that any change to the implementation affects the user by generating different subsequent values. This is especially relevant for functions that generate multiple values such as multiple and unique, each of the generated elements affect the following elements generated in them and all elements afterwards.

Suggested solution

Let each method use only a single seed value by deriving it, if it uses more than one.
Each method would be responsible for itself to consume only a single seed value.

function multiple(fakerCore: Faker(Core), generator: (fakerCore, index) => T, options): T[] {
  // consumes a single seed from the original
  const derived = fakerCore.derive(); 
  // consumes a seed value from the derived
  const count = rangeToNumber(derived, options.count); 
  // each call on the generator consumes another seed from derived
  // if the generator would need more than one single value it would derive by itself
  // even if it doesn't the multiple function upholds its contract and still behaves better than a simple for loop
  return Array.from({length: count}, (_, i) => generator(derived, i));
}

Important usage detail, the fakerCore instance must be passed on and used by any nested code.

Deriving an instance does come at a performance cost, but we could make it cheap, if we keep that as a priority during the derive implementation and use standalone functions like teased in the code example.

E.g. by not re-initializing the twister from scratch, but only copy and transforming the state.
derive() {
  let random = this.state.next();
  const stateCopy = this.state.map((old) => old + random + aRandomStaticValue + 0 * (random = old));
  return new Twister(stateCopy);
}

(We should measure though how much "re-initializing" vs "copy and transforming the state" actually has of a performance impact)

It also comes at a cost of additional code, we could hide that in our potential meta framework though.

This section is largely unrelated to this proposal and should just demonstrate, how the derive could be included in the potential meta framework.

function multiple(fakerCore: Faker(Core), generator: (fakerCore, index) => T, options): T[] {
  const count = rangeToNumber(fakerCore, options.count); 
  return Array.from({length: count}, (_, i) => generator(fakerCore, i));
}

--- Autogenerated same file

declare function boundMultiple(generator: (fakerCore, index) => T, options): T[];
[...]
export const multiple = fakerize<typeof multiple, typeof boundMultiple>(multiple, {derive: true, isCallable: ...});
[...]
function fakerize<TRaw, TBound>(fn: TRaw, options): Fakerized<TRaw, TBound> {
  if (options.derive) {
    fn = (fakerCore, ...args) => fn(fakerCore.derive(), ...args);
  }
  fn.bindTo = (fakerCore) => (...args) => fn(fakerCore, ....args) as TBound;
  fn.isCallable = ...;
  return fn;
}

Usage

multiple(fakerEN, (fakerCore) => firstName(fakerCore), 5);
// or multiple(fakerEN, firstName, 5);

const multipleEN = multiple.bindTo(fakerEN);
multipleEN((fakerCore) => firstName(fakerCore), 5);
// or multipleEN(firstName, 5);

// Why not like this?
const firstNameEN = firstName.bindTo(fakerEN);
multipleEN(() => firstNameEN(), 5);
// or
multiple(fakerCore, () => firstNameEN(), 5);

// Because firstNameEN would consume the seeds directly from the bound fakerEN instance
// and thus bypass most benefits from `derive`.

Alternative

Don't change the current behavior.

Additional context

Relevant issues/PRs:

Potentially impacted issues/PRs:


We already talked about this feature quite a bit, but I would like to have this issue, to bring everybody on the same level.
And give everybody the chance to comment and react.

@ST-DDT ST-DDT added c: feature Request for new feature p: 1-normal Nothing urgent s: needs decision Needs team/maintainer decision labels Feb 12, 2024
@ST-DDT ST-DDT added this to the v9.0 milestone Feb 12, 2024
@ST-DDT ST-DDT self-assigned this Feb 12, 2024
@ST-DDT ST-DDT linked a pull request Feb 12, 2024 that will close this issue
@matthewmayer
Copy link
Contributor

How would that work though with fake patterns since different fake patterns can have different numbers of placeholders?

@ST-DDT
Copy link
Member Author

ST-DDT commented Feb 12, 2024

Fake would assume that the pattern requires multiple seed values and always derive at the start of the fake method.

@ST-DDT
Copy link
Member Author

ST-DDT commented Feb 29, 2024

Team Decision

  • We want to do this conceptually, but aren't sure about the exact implementation requirements.

@ST-DDT ST-DDT added s: accepted Accepted feature / Confirmed bug and removed s: needs decision Needs team/maintainer decision labels Feb 29, 2024
@ST-DDT
Copy link
Member Author

ST-DDT commented Feb 29, 2024

@ST-DDT ST-DDT added the s: on hold Blocked by something or frozen to avoid conflicts label Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: feature Request for new feature p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug s: on hold Blocked by something or frozen to avoid conflicts
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants