Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the open/closed behavior for enums to documentation #522

Closed
Valodim opened this issue Jul 10, 2023 · 5 comments 路 Fixed by #523
Closed

Add the open/closed behavior for enums to documentation #522

Valodim opened this issue Jul 10, 2023 · 5 comments 路 Fixed by #523
Labels
Docs Things to do with documentation

Comments

@Valodim
Copy link

Valodim commented Jul 10, 2023

Hey there,

first of all, thanks a lot for maintaining this library, what a breath of fresh air to the grpc ecosystem 馃憤

We noticed that enum values are decoded "naively" into their integer values, without any bounds check. This may cause out-of-type values on the typescript side, whereas my expectations from protobuf would be to receive the default value for such cases.

I'm not sure what the best way to handle this is. Checking for value correctness comes with a significant performance overhead since it needs to compare against a specific set of known values. But on the other hand, out-of-type values in typescript are very unexpected and can lead to severe consequences. Perhaps a good middle ground is to simply document this behavior? Or make it a configurable performance vs correctness option?

For the record, we are now using this function on all incoming enums to ensure correctness:

function knownEnumOrDefault<T extends Enum>(value: T[keyof T], enumObject: T): T[keyof T] {
  // those next two lines are not cheap, unfortunately
  const enumValues = Object.values(enumObject);
  if (enumValues.includes(value)) {
    return value;
  }
  // Using the protobuf notion of "default", i.e. the first value. By convention, this is
  // a dedicated UNKNOWN value for all our enums.
  // This goes through the reverse mapping to get the actual entry with index 0.
  return enumObject[enumObject[0]] as T[keyof T];
}
@jcready
Copy link

jcready commented Jul 10, 2023

This is in line with the official specification for enums (at least for proto3/open enums). See the official and unofficial docs.

Enums have two distinct flavors (open and closed). They behave identically except in their handling of unknown values. Practically, this means that simple cases work the same, but some corner cases have interesting implications.

For the purpose of explanation, let us assume we have the following .proto file (we are deliberately not specifying if this is a syntax = "proto2" or syntax = "proto3" file right now):

enum Enum {
 A = 0;
 B = 1;
}

message Msg {
 optional Enum enum = 1;
}

The distinction between open and closed can be encapsulated in a single question:

What happens when a program parses binary data that contains field 1 with the value 2?

  • Open enums will parse the value 2 and store it directly in the field. Accessor will report the field as being set and will return something that represents 2.
  • Closed enums will parse the value 2 and store it in the message鈥檚 unknown field set. Accessors will report the field as being unset and will return the enum鈥檚 default value.

FWIW you could likely make your function less expensive by changing it to something like:

function knownEnumOrDefault<T extends Enum>(value: T[keyof T], e: T): T[keyof T] {
  return e[e[value]] ?? e[e[0]] as T[keyof T];
}

@Valodim
Copy link
Author

Valodim commented Jul 10, 2023

Thanks for the explanation, that sounds reasonable. It would be great to document this behavior, e.g. under Using enumerations 馃憤

@smaye81
Copy link
Member

smaye81 commented Jul 10, 2023

Fair point. We could make this a bit more evident through our docs.

@smaye81 smaye81 changed the title enum deserialization allows out-of-type values Add the open/closed behavior for enums to documentation Jul 10, 2023
@smaye81 smaye81 added the Docs Things to do with documentation label Jul 10, 2023
@jcready
Copy link

jcready commented Jul 11, 2023

Unfortunately the proto3/open enums don't play very well with TS enums (or rather, the developer experience around them). The Most Correct鈩笍 type for an field in a message interface would be something like:

enum SomeEnum {
    UNSET = 0,
    FOO = 1,
    BAR = 2
}

interface MyMessage {
    // number is for potentially unknown enum values in a proto3/open enum
    myEnum: SomeEnum | number;
}

Unfortunately because SomeEnum (when used as a type) in a union with number just reduces to number. This makes the type almost useless. Fortunately TS is kind enough to allow one to index into any enum with a number so that can be used to check that the number actually exists in the enum (for runtime purposes), but that check doesn't actually refine the value from SomeEnum | number to just SomeEnum.

declare var m: MyMessage;

if (SomeEnum[m.myEnum]) {
    m.myEnum;
    // ^? (property) MyMessage.myEnum: number
}

So you either have to cast it (馃あ) or make a function to cast it (馃槙):

if (SomeEnum[m.myEnum]) {
    m.myEnum as SomeEnum;
    // ^? (property) MyMessage.myEnum: SomeEnum
}

function isKnownSomeEnumValue(val: number): val is SomeEnum {
    return Boolean(SomeEnum[val]);
}

if (isKnownSomeEnumValue(m.myEnum)) {
    m.myEnum;
    // ^? (property) MyMessage.myEnum: SomeEnum
}

I think the only way to make the DX less bad (at least for performing the refinement) would be to represent protobuf enums as string unions and enum fields as: MyEnum | number. It's debatable whether the DX in general is better with TS enums or string unions.

This allows much easier refinements between known/unknown values for an enum field:

type SomeEnum = 'UNSET' | 'FOO' | 'BAR';

interface MyMessage {
    // number is for potentially unknown enum values in a proto3/open enum
    myEnum: SomeEnum | number;
}

if (typeof m.myEnum === 'string') {
    m.myEnum;
    // ^? (property) MyMessage.myEnum: SomeEnum
}

@timostamm
Copy link
Member

As @jcready commented, this is desired behavior for proto3, and - at least at the time the implementation was written - it matches the behavior of TypeScript enumerations, which also accept arbitrary numbers:

enum Foo {
  A = 1,
  B = 2,
}

const f: Foo = 3; // compiles with TS 4.9.5

TypeScript v5 changes the behavior of enums, making them closed in the type system - see the note at the bottom of the PR description here: microsoft/TypeScript#50528. We will have to see how we can improve the mismatch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Things to do with documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants