Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Population Data Provider #1527

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

jojijac0b
Copy link
Contributor

@jojijac0b jojijac0b commented Mar 18, 2024

Create Population Data Provider Simulator w/ tests

@wfa-reviewable
Copy link

This change is Reviewable

@jojijac0b jojijac0b requested review from stevenwarejones and removed request for stevenwarejones March 18, 2024 19:28
Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @jojijac0b)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 1 at r2 (raw file):

// Copyright 2021 The Cross-Media Measurement Authors

2024


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 281 at r2 (raw file):

        modelRolloutsStub.listModelRollouts(
          listModelRolloutsRequest { parent = measurementSpecModelLineName }
        )

i don't think this method is guaranteed to return this sorted. You should sort them yourself imo.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 291 at r2 (raw file):

    // Retrieves latest ModelRollout from list.
    val latestModelRollout = listModelRolloutsResponse.modelRolloutsList[0]

.modelRolloutsList.first()


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 314 at r2 (raw file):

  ) {
    // Calculates total sum by running the populationBucketsList through a program that will filter
    // out irrelevant populations and sum the relevant ones.

.... that will sum matching populations I think is simpler


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 352 at r2 (raw file):

          // find better way to do this
          &&
            Timestamp.parseFrom(it.validStartTime.toByteString()).seconds >=

this won't work. See the javadocs on DateTime -- https://cloud.google.com/java/docs/reference/proto-google-common-protos/latest/com.google.type.DateTime -- or I think its fine to make both of these Timestamps.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 354 at r2 (raw file):

            Timestamp.parseFrom(it.validStartTime.toByteString()).seconds >=
              requisitionIntervalStartTime.seconds &&
            Timestamp.parseFrom(it.validEndTime.toByteString()).seconds >=

there is a scenario where validEndTime is not set (open-ended) - that should return true as well.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 357 at r2 (raw file):

              requisitionIntervalEndTime.seconds
        }
        .distinct()

you should assume they are already distinct.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 359 at r2 (raw file):

        .distinct()

    for (populationBucket in validPopulationBucketsList) {
return validPopulationBucketsList.sumBy {
      val program = EventFilters.compileProgram(TestEvent.getDescriptor(), filterExpression)

      // Use program to check if Person field in the PopulationBucket contains the filterExpression.
      if (EventFilters.matches(populationBucket.event, program)) {
        populationBucket.populationSize
      } else {
        0L
      }

}

src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 36 at r2 (raw file):

    (google.api.field_behavior) = UNORDERED_LIST
  ];
  google.type.DateTime valid_start_time = 4;

see my earlier comment - I think its fine to make these Timestamp for convenience

…ier comparison to requisitionInterval start and end time. Sort ModelRolloutsList. Use sumOf() to return total population in getTotalPopulation().
Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @jojijac0b)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 293 at r3 (raw file):

    val sortedModelRolloutsList =
      listModelRolloutsResponse.modelRolloutsList.sortedWith { a, b ->
        Timestamps.compare(a.updateTime, b.updateTime) * -1

you should use the oneof of rollout_deploy_period not updateTime here

also, can't you just do listModelRolloutsResponse.modelRolloutsList.sortedWith(TimeStamps.comparator


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 355 at r3 (raw file):

      populationBucketsList.filter {
        it.modelReleasesList.contains(modelRelease.name) &&
          Timestamps.compare(it.validStartTime, requisitionIntervalStartTime) >= 0 &&

the start time needs to be less than the requisitionIntervalEndTime. the validendTime needs to be null or greater than the validStartTime

jojijac0b added 2 commits March 25, 2024 03:21
Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 13 files reviewed, 5 unresolved discussions (waiting on @jojijac0b, @SanjayVas, and @stevenwarejones)


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 27 at r4 (raw file):

option java_multiple_files = true;

message PopulationBucket {

Can we use the population spec in this PR instead? world-federation-of-advertisers/cross-media-measurement-api#206

Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 13 files reviewed, 5 unresolved discussions (waiting on @jojijac0b, @SanjayVas, and @stevenwarejones)


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 27 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Can we use the population spec in this PR instead? world-federation-of-advertisers/cross-media-measurement-api#206

After talking with @stevenwarejones I'm going to make some updates to my PR. I'll ping this comment when it's ready. Hopefully later today.

Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 13 files reviewed, 5 unresolved discussions (waiting on @jojijac0b, @SanjayVas, and @stevenwarejones)


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 27 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

After talking with @stevenwarejones I'm going to make some updates to my PR. I'll ping this comment when it's ready. Hopefully later today.

It's been updated.

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r3, 2 of 11 files at r4, all commit messages.
Reviewable status: 4 of 13 files reviewed, 12 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @stevenwarejones)

a discussion (no related file):
Check in the root cert and root private key for pdp1 as well.



src/main/k8s/testing/secretfiles/BUILD.bazel line 48 at r4 (raw file):

    srcs = [
        "aggregator_root.pem",
        "kingdom_root.pem",

Add pdp1_root.pem here too, as the MC will need to trust the PDP in order to verify signed results.


src/main/k8s/testing/secretfiles/BUILD.bazel line 217 at r4 (raw file):

    "pdp1_enc_private.tink",
    "pdp1_enc_public.tink",
    "pdp1_result_cs_cert.der",

You don't need a separate result cert here. Note that the only EDP with a separate result cert is edp1, and that's just for a corresponding test.


src/main/k8s/testing/secretfiles/pdp1_cs_cert.der line 0 at r4 (raw file):
I downloaded and checked and this file looks like it is just a copy of edp1_cs_cert.der. That won't work - each entity must have its own root cert with a unique ID. You'll need to generate a new root cert and key, and then generate all the corresponding certs with 10-year expiration.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 137 at r4 (raw file):

    Exception(message, cause)

  private fun verifySpecifications(

This function and the supporting data classes appear to be identical to those in EdpSimulator.

Extract common code to a separate library so it can be shared. If need be, this could be an abstract DataProviderSimulator base class as EDP and PDP are just two types of Data Providers.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 226 at r4 (raw file):

      try {
        logger.info("Processing requisition ${requisition.name}...")

Why was the TODO from the original version of this function from EdpSimulator.kt removed? It still applies here, just with PDP instead of EDP.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 323 at r4 (raw file):

through a program that will sum matching populations.

What? Is this talking about the CEL program? That doesn't do any aggregation, it's an implementation detail of how the CEL library we're using for filtration.

Code quote:

    // Calculates total sum by running the populationBucketsList through a program that will sum
    // matching populations.

@SanjayVas
Copy link
Member

Attaching a diff of EdpSimulator.kt to PdpSimulator.kt since it's clear that the latter is based on the former.

Simulator.patch.txt

Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 11 files at r4, all commit messages.
Reviewable status: 11 of 13 files reviewed, 12 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @SanjayVas)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 323 at r4 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

through a program that will sum matching populations.

What? Is this talking about the CEL program? That doesn't do any aggregation, it's an implementation detail of how the CEL library we're using for filtration.

Maybe Filters populationBucketsList through a CEL program and sums the result.

Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 13 unresolved discussions (waiting on @jojijac0b, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 92 at r4 (raw file):

/** A simulator handling PDP businesses. */
class PdpSimulator(

Why would the simulator be any different than the production PDP?

Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 13 files reviewed, 13 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 281 at r2 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

i don't think this method is guaranteed to return this sorted. You should sort them yourself imo.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 359 at r2 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…
return validPopulationBucketsList.sumBy {
      val program = EventFilters.compileProgram(TestEvent.getDescriptor(), filterExpression)

      // Use program to check if Person field in the PopulationBucket contains the filterExpression.
      if (EventFilters.matches(populationBucket.event, program)) {
        populationBucket.populationSize
      } else {
        0L
      }

}

sumBy is deprecated so used sumOf instead


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 293 at r3 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

you should use the oneof of rollout_deploy_period not updateTime here

also, can't you just do listModelRolloutsResponse.modelRolloutsList.sortedWith(TimeStamps.comparator

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 355 at r3 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

the start time needs to be less than the requisitionIntervalEndTime. the validendTime needs to be null or greater than the validStartTime

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 226 at r4 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Why was the TODO from the original version of this function from EdpSimulator.kt removed? It still applies here, just with PDP instead of EDP.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 323 at r4 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

Maybe Filters populationBucketsList through a CEL program and sums the result.

Done.

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 17 of 18 files reviewed, 15 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/DataProviderSimulator.kt line 93 at r8 (raw file):

  /** A sequence of operations done in the simulator. */
  suspend fun run() {
    dataProvidersStub.replaceDataAvailabilityInterval(

This is specific to EDPs and not applicable to PDPs.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 92 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Why would the simulator be any different than the production PDP?

Good point. @jojijac0b , can you extract out both the shared code and what is currently called the PdpSimulator to a non-test package? Probably rename it to PdpFulfillmentDaemon or something.


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 27 at r4 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

It's been updated.

Given how long PopulationSpec has been in the API, I agree that this PR should be migrated over to it before merging at this point.

Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 8 files at r6, 1 of 3 files at r7, 2 of 2 files at r8, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @SanjayVas)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/EdpSimulator.kt line 191 at r8 (raw file):

  /** A sequence of operations done in the simulator. */
  suspend fun run() {

as Sanjay stated - restore this here


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 355 at r3 (raw file):

Previously, jojijac0b wrote…

Done.

i don't think we need this logic anymore since we are just mapping populationIds


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 55 at r8 (raw file):

  private val modelRolloutsStub: ModelRolloutsCoroutineStub,
  private val modelReleasesStub: ModelReleasesCoroutineStub,
  private val populationBucketsList: List<PopulationBucket>,

as discussed, change to

: Map<PopulationKey, PopulationSpec>

src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 190 at r8 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Also specify the methodology.

yeah - just use DeterministicSum


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 32 at r6 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

since model_releases map to populations, lets just use Population here

https://github.com/world-federation-of-advertisers/cross-media-measurement-api/blob/7c8702ed832c51fdcc2b22917ee4a91c0fb11377/src/main/proto/wfa/measurement/api/v2alpha/model_release.proto#L44

correction: as discussed - just remove this file and use populationSpec instead

Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm out next week but if this gets to a good place where Craig and Sanjay approve, I'm okay if you dismiss my comments and merge.

Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @SanjayVas)

Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 9 of 19 files reviewed, 13 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/DataProviderSimulator.kt line 93 at r8 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

This is specific to EDPs and not applicable to PDPs.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/EdpSimulator.kt line 477 at r8 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Can you put the TODO here that was originally in verifySpecifications before that function was moved?

// TODO(@uakyol): Validate that collection interval is not outside of privacy landscape.

Done.


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 27 at r4 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Given how long PopulationSpec has been in the API, I agree that this PR should be migrated over to it before merging at this point.

Done.


src/main/proto/wfa/measurement/api/v2alpha/populations/testing/population_bucket.proto line 32 at r6 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

correction: as discussed - just remove this file and use populationSpec instead

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 355 at r3 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

i don't think we need this logic anymore since we are just mapping populationIds

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 92 at r4 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Good point. @jojijac0b , can you extract out both the shared code and what is currently called the PdpSimulator to a non-test package? Probably rename it to PdpFulfillmentDaemon or something.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulator.kt line 55 at r8 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

as discussed, change to

: Map<PopulationKey, PopulationSpec>

Done.

Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 11 files at r4, 2 of 8 files at r6, 1 of 3 files at r7, 1 of 2 files at r8.
Reviewable status: 9 of 19 files reviewed, 18 unresolved discussions (waiting on @jojijac0b, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 44 at r10 (raw file):

import org.wfanet.measurement.loadtest.dataprovider.DataProviderSimulator

/** A simulator handling PDP businesses. */

Update the comment.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 52 at r10 (raw file):

  trustedCertificates: Map<ByteString, X509Certificate>,
  measurementConsumerName: String,
  private val populationSpecMap: Map<PopulationKey, PopulationSpec>,

You may want to run the PopulationSpecs through the PopulationSpecValidator.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 53 at r10 (raw file):

  measurementConsumerName: String,
  private val populationSpecMap: Map<PopulationKey, PopulationSpec>,
  private val populationId: PopulationKey

Why pass both the map and the populationID when you could pass the PopulationSpec by itself?

Or maybe pass the populationId to one of the functions instead of the constructor?


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 55 at r10 (raw file):

  private val populationId: PopulationKey
) :
  DataProviderSimulator(

I don't think you want to inherit simulator for a non-simulator. Perhaps "DataProviderSimulator" needs to be renamed? It's fine for the EDPSimualtor to inherit from a non-simulator base class.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 180 at r10 (raw file):

   */
  private fun isValidAttributesList(attributeList: List<Any>, filterExpression: String): Boolean {
    val program = EventFilters.compileProgram(TestEvent.getDescriptor(), filterExpression)

This code should not use TestEvent directly. It needs to interrogate the "Any" and use what is pointed to by that.

You should also add some checks to ensure that all subpopulations are using the same types.

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 8 files at r9, 7 of 10 files at r10, all commit messages.
Reviewable status: 16 of 19 files reviewed, 20 unresolved discussions (waiting on @jojijac0b and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 320 at r10 (raw file):

)

kt_jvm_library(

This needs to be moved out to some non-testonly package so that the PDP daemon can depend on it.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/DataProviderSimulator.kt line 156 at r10 (raw file):

    }

    // TODO(@uakyol): Validate that collection interval is not outside of privacy landscape.

The TODO is specific to EDPs. Move it to the call site in EdpSimulator.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/BUILD.bazel line 8 at r10 (raw file):

package(
    default_testonly = True,

This should not be testonly.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 45 at r10 (raw file):

/** A simulator handling PDP businesses. */
class PdpFulfillmentDaemon(

nit: By convention, the class ending in "Daemon" should be runnable from the command-line, i.e. have a call to commandLineMain. Maybe call this PopulationRequisitionFulfiller and have another Daemon class as the runner? See the Herald vs. HeraldDaemon split.

Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 14 of 20 files reviewed, 20 unresolved discussions (waiting on @SanjayVas and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 320 at r10 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

This needs to be moved out to some non-testonly package so that the PDP daemon can depend on it.

I set the testonly flag to False for this specific library. Should I still move it to another package?


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/DataProviderSimulator.kt line 156 at r10 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

The TODO is specific to EDPs. Move it to the call site in EdpSimulator.

Done.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/BUILD.bazel line 8 at r10 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

This should not be testonly.

Done.

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r11, 2 of 2 files at r12.
Reviewable status: 15 of 20 files reviewed, 18 unresolved discussions (waiting on @jojijac0b and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 320 at r10 (raw file):

Previously, jojijac0b wrote…

I set the testonly flag to False for this specific library. Should I still move it to another package?

Yes, as everything under the loadtest package should be testonly.


src/main/proto/wfa/measurement/api/v2alpha/event_templates/testing/BUILD.bazel line 5 at r13 (raw file):

package(
    default_testonly = True,

Anything in a testing package should remain testonly.

…derSimulator to DataProviderRequisitionFulfiller. Moved DataProviderRequisitionFulfiller to a new non-test package. Updated tests and references to DataProviderRequisitionFulfiller.
Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 21 files reviewed, 18 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 320 at r10 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Yes, as everything under the loadtest package should be testonly.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/EdpSimulator.kt line 191 at r8 (raw file):

Previously, stevenwarejones (Steven Ware Jones) wrote…

as Sanjay stated - restore this here

Done.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 44 at r10 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Update the comment.

Done.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 52 at r10 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

You may want to run the PopulationSpecs through the PopulationSpecValidator.

Done.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 55 at r10 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

I don't think you want to inherit simulator for a non-simulator. Perhaps "DataProviderSimulator" needs to be renamed? It's fine for the EDPSimualtor to inherit from a non-simulator base class.

Done.

Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 21 files reviewed, 18 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)


src/main/proto/wfa/measurement/api/v2alpha/event_templates/testing/BUILD.bazel line 5 at r13 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Anything in a testing package should remain testonly.

I need to check if the attributes in the population spec are of type Person so this is needed in PdpRequisitionFulfiller. How can I use this? Should I move it?

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 11 files at r14, all commit messages.
Reviewable status: 14 of 21 files reviewed, 23 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/integration/common/InProcessEdpSimulator.kt line 51 at r14 (raw file):

import org.wfanet.measurement.eventdataprovider.privacybudgetmanagement.PrivacyBudgetManager
import org.wfanet.measurement.eventdataprovider.privacybudgetmanagement.testing.TestPrivacyBucketMapper
import org.wfanet.measurement.populationdataprovider.DataProviderData

Why is this Event DataProvider simulator depending on something from a Population DataProvider package?


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 136 at r14 (raw file):

    name = "measurement_results",
    srcs = ["MeasurementResults.kt"],
    testonly = False,

This must be testonly, or else it needs to be moved to another package.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/EdpSimulatorRunner.kt line 38 at r14 (raw file):

import org.wfanet.measurement.common.throttler.MinimumIntervalThrottler
import org.wfanet.measurement.loadtest.config.PrivacyBudgets.createNoOpPrivacyBudgetManager
import org.wfanet.measurement.populationdataprovider.DataProviderData

Same here re: Event DataProvider code depending on something from a Population DataProvider package.


src/main/proto/wfa/measurement/api/v2alpha/event_templates/testing/BUILD.bazel line 5 at r13 (raw file):

I need to check if the attributes in the population spec are of type Person

You in fact do not need to do this. The population requisition fulfiller must work for any event template types. This is why we use fields of type Any. The types are not known until runtime and are controlled by passing in the appropriate FileDescriptorSet via a command-line option. See Craig's comment about not referencing TestEvent directly.

These are all test messages and should only be used in tests.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/DataProviderRequisitionFulfiller.kt line 15 at r14 (raw file):

// limitations under the License.

package org.wfanet.measurement.populationdataprovider

Isn't this the base class for both Event and Population? If so, it shouldn't be in the Population package.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/DataProviderRequisitionFulfiller.kt line 71 at r14 (raw file):

)

abstract class DataProviderRequisitionFulfiller(

nit: maybe just RequisitionFulfiller, since only DataProviders can fulfill Requisitions?

Suggestion:

RequisitionFulfiller

src/main/kotlin/org/wfanet/measurement/populationdataprovider/PdpRequisitionFulfiller.kt line 46 at r14 (raw file):

/** A requisition fulfiller for PDP businesses. */
class PdpRequisitionFulfiller(

nit: maybe PopulationRequisitionFulfiller, to indicate which types of requisitions are being fulfilled?

Suggestion:

PopulationRequisitionFulfiller

Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 8 files at r9, 3 of 10 files at r10, 1 of 3 files at r11, 10 of 11 files at r14, all commit messages.
Reviewable status: 20 of 21 files reviewed, 23 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @SanjayVas)


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PdpRequisitionFulfiller.kt line 112 at r14 (raw file):

        val populationSpec = populationSpecMap.getValue(populationId)
        val populationSpecValidator = PopulationSpecValidator.validateVidRangesList(populationSpec)

you should also validate that all the population fields are set here


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PdpRequisitionFulfiller.kt line 168 at r14 (raw file):

      val attributesList = it.attributesList
      val vidRanges = it.vidRangesList
      val shouldSumPopulation = isValidAttributesList(attributesList, filterExpression)

you should be populating your testEvent proto with all these population and then filtering


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 180 at r10 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

This code should not use TestEvent directly. It needs to interrogate the "Any" and use what is pointed to by that.

You should also add some checks to ensure that all subpopulations are using the same types.

yes, add it to the key on your PopulationId map since event population may map to different Event protos

jojijac0b added 3 commits May 22, 2024 14:22
…lease of the latest model rollout connected to the model line in the measurement spec.
…ove it to its own dataprovider package. Rename PdpRequisitionFulfiller to PopulationRequisitionFulfiller and update test to reflect this.
Copy link
Contributor Author

@jojijac0b jojijac0b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 10 of 22 files reviewed, 20 unresolved discussions (waiting on @kungfucraig, @SanjayVas, and @stevenwarejones)


src/main/kotlin/org/wfanet/measurement/integration/common/InProcessEdpSimulator.kt line 51 at r14 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Why is this Event DataProvider simulator depending on something from a Population DataProvider package?

Moved RequisitionFulfiller(fka DataProviderRequisitionFulfiller), which contains DataProviderData, to its own dataprovider package.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/BUILD.bazel line 136 at r14 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

This must be testonly, or else it needs to be moved to another package.

Done.


src/main/kotlin/org/wfanet/measurement/loadtest/dataprovider/EdpSimulatorRunner.kt line 38 at r14 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Same here re: Event DataProvider code depending on something from a Population DataProvider package.

Moved RequisitionFulfiller(fka DataProviderRequisitionFulfiller), which contains DataProviderData, to its own dataprovider package.


src/main/proto/wfa/measurement/api/v2alpha/event_templates/testing/BUILD.bazel line 5 at r13 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

I need to check if the attributes in the population spec are of type Person

You in fact do not need to do this. The population requisition fulfiller must work for any event template types. This is why we use fields of type Any. The types are not known until runtime and are controlled by passing in the appropriate FileDescriptorSet via a command-line option. See Craig's comment about not referencing TestEvent directly.

These are all test messages and should only be used in tests.

Done.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/DataProviderRequisitionFulfiller.kt line 15 at r14 (raw file):

Previously, SanjayVas (Sanjay Vasandani) wrote…

Isn't this the base class for both Event and Population? If so, it shouldn't be in the Population package.

Moved RequisitionFulfiller(fka DataProviderRequisitionFulfiller), which contains DataProviderData, to its own dataprovider package.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/daemon/PdpFulfillmentDaemon.kt line 53 at r10 (raw file):

Previously, kungfucraig (Craig Wright) wrote…

Why pass both the map and the populationID when you could pass the PopulationSpec by itself?

Or maybe pass the populationId to one of the functions instead of the constructor?

Getting the population from the model release of the model line's(provided by the measurement spec) latest model rollout.

Copy link
Collaborator

@stevenwarejones stevenwarejones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 6 files at r15, 11 of 11 files at r16, all commit messages.
Reviewable status: all files reviewed, 30 unresolved discussions (waiting on @jojijac0b, @kungfucraig, and @SanjayVas)


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 68 at r16 (raw file):

  private val modelRolloutsStub: ModelRolloutsCoroutineStub,
  private val modelReleasesStub: ModelReleasesCoroutineStub,
  private val populationSpecMap: Map<PopulationKey, PopulationInfo>,

nit: rename populationSpecMap to populationInfoMap


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 126 at r16 (raw file):

        val modelRelease = getModelRelease(measurementSpec)

        val populationId = PopulationKey.fromName(modelRelease.population)

combine this and the next line

val populationKey = requireNotNull(...) {
  ...
}

src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 179 at r16 (raw file):

    // Sort list of ModelRollouts by descending updateTime.
    val sortedModelRolloutsList =

@Marco-Premier - can you take a look at this logic?


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 264 at r16 (raw file):

      // Ensure attribute is a field in the event descriptor.
      if(!isAttributeFieldInEvent(attributeMessage, eventDescriptor)){

you can just use require for these to


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 265 at r16 (raw file):

      // Ensure attribute is a field in the event descriptor.
      if(!isAttributeFieldInEvent(attributeMessage, eventDescriptor)){
        throw InvalidSpecException("Attribute is not part of Event Descriptor")

can you add a test that throws this error


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 270 at r16 (raw file):

      // Validate attribute message with descriptor from type registry.
      if(!isValidAttribute(attributeMessage, attributeDescriptor)){
        throw InvalidSpecException("Invalid Subpopulation Attribute")

can you add a test that throws this error


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 310 at r16 (raw file):

  private fun createAttributeMessage(attribute: Any, attributeDescriptor: Descriptor): Message {
    val attributeMessage = DynamicMessage.parseFrom(attributeDescriptor, attribute.toByteString()).toBuilder()

See my other comment. You don't need a builder. Also, you should reading the value field of the Any

val attributeMessage = DynamicMessage.parseFrom(attributeDescriptor, attribute.value)

src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 313 at r16 (raw file):

    attributeDescriptor.fields.forEach {
      attributeMessage.setField(it, attribute.unpackSameTypeAs(attributeMessage.build()).getField(it))

instead of setting the fields here, they should already all be set when you unpack. Instead, here, you should be validating that all the population fields are set.


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 316 at r16 (raw file):

    }

    return attributeMessage.build()

no need to build.


src/test/kotlin/org/wfanet/measurement/loadtest/dataprovider/PdpSimulatorTest.kt line 322 at r6 (raw file):

          RequisitionSpecKt.population {
            filter = eventFilter { expression = "person.gender == ${Person.Gender.MALE_VALUE}" }
            interval = interval {

we should have a test with an invalid interval (ie there is no modelRelease for that range) or at least one that maps to a population id.


src/test/kotlin/org/wfanet/measurement/loadtest/dataprovider/PopulationRequisitionFulfillerTest.kt line 175 at r16 (raw file):

}

val ATTRIBUTE_2 = any {

can't you just do

 ... = PERSON_2.pack()

src/test/kotlin/org/wfanet/measurement/loadtest/dataprovider/PopulationRequisitionFulfillerTest.kt line 335 at r16 (raw file):

    // Result should be the sum of SUB_POPULATION_1 and SUB_POPULATION_3
    assertThat(result.population.value).isEqualTo(400)

i would suggest adding a helper function.... then it could just be

...isEqualTo(VID_RANGE_1.size() + VID_RANGE_2.size())

src/test/kotlin/org/wfanet/measurement/loadtest/dataprovider/PopulationRequisitionFulfillerTest.kt line 391 at r16 (raw file):

    // Result should be the sum of SUB_POPULATION_1 and SUB_POPULATION_2
    assertThat(result.population.value).isEqualTo(300)

ditto

Copy link
Member

@SanjayVas SanjayVas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 10 files at r10, 1 of 11 files at r14, 1 of 6 files at r15, 8 of 11 files at r16, all commit messages.
Reviewable status: all files reviewed, 25 unresolved discussions (waiting on @jojijac0b and @kungfucraig)

Copy link
Member

@kungfucraig kungfucraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 22 unresolved discussions (waiting on @jojijac0b and @SanjayVas)


src/main/kotlin/org/wfanet/measurement/populationdataprovider/PopulationRequisitionFulfiller.kt line 133 at r16 (raw file):

        val populationInfo = populationSpecMap.getValue(populationId)
        val populationSpecValidator = PopulationSpecValidator.validateVidRangesList(populationInfo.populationSpec)
        if(populationSpecValidator.isFailure){

You can write: PopulationSpecValidator.validateVidRangesList(populationSpec).getOrThrow() instead and it will throw the encapsulated exception which will contain a lot of additional detail over what you have here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants