Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent toBinary error with undefined fields #848

Closed
wants to merge 1 commit into from

Conversation

mnito
Copy link

@mnito mnito commented May 17, 2024

I noticed that there was a subtle behavioral change introduced in version 1.8.0 that causes an error in toBinary when a field was set to undefined explicitly on a Message instance. At runtime this can be fairly common when setting proto fields based on the result of less type-safe functions.

I am not sure if this behavioral change was intentional or not. The proposed fix is a one-line change to use the custom writeField implementation that handles the undefined case explicitly.

This error can happen at runtime when a field on an existing
Message instance was set to undefined and toBinary is called
later in the lifetime of the Message.
@CLAassistant
Copy link

CLAassistant commented May 17, 2024

CLA assistant check
All committers have signed the CLA.

@timostamm
Copy link
Member

Hey Michael, thanks for the contribution. v1.8.0 added a couple of additional checks to avoid undefined behavior that can possibly lead to data corruption.

To clarify, you have a definition like this:

syntax = "proto3";
message User {
  string first_name = 1;
}

Which generates:

export declare class User extends Message<User> {
  firstName: string;
  // ...

And toBinary errors when setting the field to undefined:

const user = new User();
user.firstName = undefined as unknown as string;
user.toBinary(); // error

I see your point that it can be useful to skip undefined values in toBinary, but that only helps if you immediate discard the message after serialization. If the message is passed elsewhere, all application logic still has to check for undefined, even though the types say otherwise. So even if we supported this for toBinary, toJson, equals, and probably several other places, it still wouldn't be a good solution.

In my opinion, it's better to let strongly-typed messages be strongly typed, and have a separate layer to convert from unknown input.

For the example above, you could write a function that's safe:

export function createUser(firstName: unknown) {
  const user = new User();
  if (typeof firstName == "string") {
    user.firstName = firstName;
  }
  return user;
}

If you only need to handle undefined, it's even simpler:

declare const firstName: string | undefined;
new User({
  firstName, // the constructor skips fields with value undefined
})

Because of the issues described above, we do not want to support setting field values that do not match typings, and recommend to validate inputs before assigning to a message.

@timostamm timostamm closed this May 18, 2024
@mnito
Copy link
Author

mnito commented May 20, 2024

Hey Michael, thanks for the contribution. v1.8.0 added a couple of additional checks to avoid undefined behavior that can possibly lead to data corruption.

To clarify, you have a definition like this:

syntax = "proto3";
message User {
  string first_name = 1;
}

Which generates:

export declare class User extends Message<User> {
  firstName: string;
  // ...

And toBinary errors when setting the field to undefined:

const user = new User();
user.firstName = undefined as unknown as string;
user.toBinary(); // error

I see your point that it can be useful to skip undefined values in toBinary, but that only helps if you immediate discard the message after serialization. If the message is passed elsewhere, all application logic still has to check for undefined, even though the types say otherwise. So even if we supported this for toBinary, toJson, equals, and probably several other places, it still wouldn't be a good solution.

In my opinion, it's better to let strongly-typed messages be strongly typed, and have a separate layer to convert from unknown input.

For the example above, you could write a function that's safe:

export function createUser(firstName: unknown) {
  const user = new User();
  if (typeof firstName == "string") {
    user.firstName = firstName;
  }
  return user;
}

If you only need to handle undefined, it's even simpler:

declare const firstName: string | undefined;
new User({
  firstName, // the constructor skips fields with value undefined
})

Because of the issues described above, we do not want to support setting field values that do not match typings, and recommend to validate inputs before assigning to a message.

That's fair -- thanks for the detailed response! I implemented a workaround on our end for one of our shared libraries that uses Protobuf ES for serialization -- upgrading to 1.8.0 made some of our tests fail. We did a somewhat automated refactor from our former JS proto library which used setters and getters to Protobuf ES, so we have a lot of cases where we are not using the constructor and working with some less type-safe code in a large codebase.

I think a similar change may be worth considering a la Postel's law which I think is particularly important for libraries. Our use case does pretty much always discard messages after using them for serialization, so a change in the core library may help some folks avoid production issues after upgrading past 1.8.0. Fortunately our tests caught it, so the workaround I implemented in our wrapper library gives me peace of mind.

This library is awesome and really helped us address some tech debt -- appreciate all the good work from you and the team on Protobuf ES! It really is a game changer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants