
Ideas #58

Open
timostamm opened this issue Dec 23, 2020 · 8 comments

Comments

@timostamm
Owner

timostamm commented Dec 23, 2020

This is a collection of ideas for further development:

@timostamm timostamm pinned this issue Dec 23, 2020
@timostamm timostamm unpinned this issue Dec 23, 2020
@bufdev

bufdev commented Apr 1, 2021

For what it's worth, https://developers.google.com/protocol-buffers/docs/reference/proto3-spec isn't accurate, not even close actually heh.

@odashevskii-plaid
Contributor

odashevskii-plaid commented Aug 27, 2021

@timostamm

generate "typeName" discriminator property in interfaces? a symbol?

I'm trying to convert a legacy protobuf.js-based system to protobuf-ts. The biggest problem I've encountered is that protobuf.js generates "classes": given a message object, you can access its type name, and this feature is used in the system (for example, messages can be passed to a logger as part of a deeply nested structure, and the logger applies redaction rules based on sensitivity options attached to fields in the proto).

Do you think it's ok to have something like

const protoReflection = Symbol("protoReflection");

export function reflectionCreate(info: MessageInfo): any {
    // ....
    // Use a shared, module-level symbol so callers can read the branding
    // back; a fresh Symbol() created per call would not be retrievable.
    return { ...msg, [protoReflection]: info };
}

to brand the created objects?

You are suggesting adding typeName, but without a central registry, it's hard to get from a name to the reflection information.
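To make the registry point concrete, here is a minimal sketch of what a central registry keyed by type name might look like; `MessageInfo`, `registerType`, and `lookupType` are hypothetical names, not part of the protobuf-ts API:

```ts
// Hypothetical sketch: a central registry mapping typeName -> reflection info.
// `MessageInfo` stands in for the runtime's reflection information.
interface MessageInfo {
  typeName: string;
}

const registry = new Map<string, MessageInfo>();

// Generated code (or the application) would register each type once:
function registerType(info: MessageInfo): void {
  registry.set(info.typeName, info);
}

// Given only a type name, look the reflection information back up:
function lookupType(typeName: string): MessageInfo | undefined {
  return registry.get(typeName);
}
```

Without something like this, a bare `typeName` string on a message is a dead end: there is no way to get from the name back to the reflection information.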

@timostamm
Owner Author

I've been thinking about this feature for a while, @odashevskii-plaid.

In my experiments, I found that it would be useful to have a symbol property (acting as a type branding) pointing to the IMessageType.

There is a downside with adding this property to the generated interfaces. It will be less obvious to users how to create such an object.

On the upside, having to pass the message as well as its message type - which you need to do most of the time for a function that acts generically on a message - would become unnecessary. And it allows for some dramatic changes in the public API. IMessageType as the public API to work with messages could be replaced by simple functions. For example, instead of having to write Foo.toBinary(foo), you would simply use toBinary(foo). toBinary would look at the type property and retrieve the reflection information (or the speed-optimized internalBinaryWrite).

This would ultimately allow the JSON format to be omitted from the webpack build output, as long as toJson() isn't used in the code, further reducing code size. Same for many other methods of IMessageType.

So I see technical benefits, and also benefits for the developer experience. There are quite a few things left to figure out though, and I am not entirely sure how the downside will manifest.
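As a rough illustration of the idea (names are illustrative only, not the current protobuf-ts API): a generated message object would carry its message type under a well-known symbol, and a free `toBinary()` would look it up from the message itself:

```ts
// Sketch only: MESSAGE_TYPE and this free toBinary() are assumptions,
// not the current protobuf-ts API.
const MESSAGE_TYPE = Symbol("protobuf-ts/message-type");

interface IMessageType<T extends object> {
  typeName: string;
  toBinary(message: T): Uint8Array;
}

// Free function: retrieve the message type from the message itself, so
// callers no longer pass the type alongside the message.
function toBinary(message: object): Uint8Array {
  const type = (message as any)[MESSAGE_TYPE] as
    IMessageType<object> | undefined;
  if (type === undefined) {
    throw new Error("message is not branded with its message type");
  }
  return type.toBinary(message);
}
```

Because only the functions actually called end up in the bundle, a code path that never imports `toJson()` would not pull the JSON format into the build.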

@jcready
Contributor

jcready commented Sep 14, 2021

Possibly provide a new package for parsing the currently unspecified text protobuf format.

@jcready
Contributor

jcready commented May 16, 2022

Regarding the FieldMask stuff on the IMessageType: wouldn't that require that the runtime (I assume) had access to a pre-generated FieldMask MessageType? How would that work?

@timostamm
Owner Author

Probably just a FieldMaskLike interface. It's a pretty simple definition, if you look at the fields. Field masks are a nice idea, but the spec is odd for repeated fields, and using field names instead of field numbers makes it more brittle than necessary.
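Since google.protobuf.FieldMask consists of a single `repeated string paths` field, a structural interface could be as small as this (a sketch; `FieldMaskLike` is the proposed name, not an existing export):

```ts
// Sketch of a structural FieldMaskLike, assuming the shape of
// google.protobuf.FieldMask (a single `repeated string paths` field).
interface FieldMaskLike {
  /** Field mask paths, e.g. "user.display_name". */
  paths: string[];
}

// Any object with a string[] `paths` property matches structurally,
// so no pre-generated FieldMask MessageType would be required:
const mask: FieldMaskLike = { paths: ["user.display_name", "photo"] };
```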

@jcready
Contributor

jcready commented Jun 4, 2022

While investigating the FieldMask stuff I noticed that the existing implementations (namely the Java and Python versions) both have "merge options" wherein the caller is able to configure the merge strategy. For example repeated fields on the source can either replace or append-to the target fields.

Another thing I noticed is that both implementations end up relying upon their Message#Merge() once reaching a "leaf" of the FieldMask when that leaf refers to a singular message field. This leads to a few complications assuming we'd like to have matching behaviors with regard to the unit tests.

So if someone like myself wanted to take a stab at implementing some of the FieldMask work you outlined above they'd be left with a choice to do one of the following:

  1. Ignore the existing implementations for FieldMask's merge behaviors and create unit tests which wouldn't match.
  2. Attempt to match the existing implementations' default merge behavior which means zero reliance on reflectionMergePartial (as it will "replace" repeated fields by default instead of "append") which means a lot of nearly identical code.
  3. Add some kind of new "merge options" to reflectionMergePartial which would default to the current reflectionMergePartial behavior, but allow the FieldMask's merge behavior to match the existing implementations' behavior by passing a custom "merge options" object when calling MessageType.mergePartial() on a "leaf".

Maybe something like

export enum MergeRepeated {
    /** 
     * Source will replace target
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1          }                   ]
     * ```
     */
    REPLACE = 0,
    /** 
     * Source will append to target
     * ```ts
     * target = [{ a: 9, b: true  }, { a: 8, b: true }          ]
     * source = [                                       { a: 1 }]
     * result = [{ a: 9, b: true  }, { a: 8, b: true }, { a: 1 }]
     * ```
     */
    APPEND = 1,
    /** 
     * Source will overwrite target by index
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1          }, { a: 8, b: true }]
     * ```
     */
    SHALLOW = 2,
    /** 
     * Source will deeply merge target by index
     * ```ts
     * target = [{ a: 9, b: true }, { a: 8, b: true }]
     * source = [{ a: 1          }                   ]
     * result = [{ a: 1, b: true }, { a: 8, b: true }]
     * ```
     */
    DEEP = 3,
}

export enum MergeMap {
    /** 
     * Source will replace target
     * ```ts
     * target = { foo: { a: 9, b: true }, bar: { a: 8, b: true }                }
     * source = {                                                 baz: { a: 1 } }
     * result = {                                                 baz: { a: 1 } }
     * ```
     */
    REPLACE = 0,
    /** 
     * Source will overwrite target by key
     * ```ts
     * target = { foo: { a: 9, b: true  }, bar: { a: 8, b: true }                }
     * source = { foo: {       b: false },                         baz: { a: 1 } }
     * result = { foo: {       b: false }, bar: { a: 8, b: true }, baz: { a: 1 } }
     * ```
     */
    SHALLOW = 1,
    /** 
     * Source will recursively merge
     * ```ts
     * target = { foo: { a: 9, b: true  }, bar: { a: 8, b: true } }
     * source = { foo: {       b: false },                         baz: { a: 1 } }
     * result = { foo: { a: 9, b: false }, bar: { a: 8, b: true }, baz: { a: 1 } }
     * ```
     */
    DEEP = 2,
}

export interface MergeOptions {
    /**
     * @default MergeMap.SHALLOW
     */
    map?: MergeMap;
    /**
     * If false, message fields will be replaced instead of recursively merged (if the source field is set).
     * @default true
     */
    message?: boolean;
    /**
     * @default MergeRepeated.REPLACE
     */
    repeated?: MergeRepeated;
}

@timostamm
Owner Author

Some context - from field_mask.proto:

If a repeated field is specified for an update operation, new values will
be appended to the existing repeated field in the target resource.

This merge behavior runs pretty deep in protobuf. It is also used in the binary format, meaning that deserializing a message multiple times into the same object will overwrite singular (scalar) fields, but append to repeated fields.

internalBinaryRead() does just that, but we do not have a matching merge() function that operates on two message instances. An implementation of FieldMask operations should definitely be compatible with this behavior. For example, a client might send a delta update for an entity using a field mask, and that only works if the algorithms on server and client match.
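The behavior can be illustrated with plain objects (a simplified sketch, not the actual internalBinaryRead(); the `Entity` type and field names are made up):

```ts
// Illustration of protobuf's binary merge semantics: parsing the same
// message twice into one target overwrites singular fields but appends
// to repeated fields. Names are illustrative only.
interface Entity {
  title: string;
  labels: string[];
}

function mergeFromParse(target: Entity, parsed: Entity): void {
  target.title = parsed.title;          // singular field: last value wins
  target.labels.push(...parsed.labels); // repeated field: append
}
```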

In my opinion, this merge behavior is really useful for the binary format, but not ideal for a general-use merge() function. The merge options in the Java and Python implementations seem to support that assessment. I think it's especially true for JavaScript, where we are used to Object.assign(), Lodash's merge(), and many others. That's why the only merge function we offer is MessageType.mergePartial().

Looking at the notes on field masks in the issue description, it looks like MessageType.mergeMessage() relying on extract() and mergePartial() would be a problem, and adding methods to MessageType should probably be avoided so that they don't increase code size for users who don't use field masks.

The two most important operations related to field masks seem to be:

  • Update a target message from a source message + field mask
  • Create a diff between two messages, resulting in a field mask + a message where only the different fields are present

I think just the implementation of those two pieces would be a substantial step forward. They could simply be two standalone functions that use the protobuf merge behavior. A follow-up could add merge options and then we should probably look into sharing code with mergePartial(), but I'm not sure they would be necessary for a start.
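A sketch of the first operation (the diff operation is omitted here). This is illustrative only: the function name and signature are assumptions, messages are plain objects rather than reflection-driven, and paths are dot-separated field names. It follows the protobuf merge behavior described above: repeated fields append, singular fields are replaced:

```ts
// Hypothetical standalone helper, not part of protobuf-ts.
interface FieldMaskLike {
  paths: string[];
}

// Update `target` from `source` for each path in the mask, using
// protobuf merge behavior: repeated fields append, singular replace.
function applyFieldMask<T extends Record<string, any>>(
  target: T,
  source: T,
  mask: FieldMaskLike
): void {
  for (const path of mask.paths) {
    const segments = path.split(".");
    let t: any = target;
    let s: any = source;
    // Walk down to the parent of the last path segment:
    for (const seg of segments.slice(0, -1)) {
      t = t?.[seg];
      s = s?.[seg];
    }
    if (t === undefined || s === undefined) continue;
    const last = segments[segments.length - 1];
    if (Array.isArray(t[last]) && Array.isArray(s[last])) {
      t[last].push(...s[last]); // repeated field: append
    } else {
      t[last] = s[last];        // singular field: replace
    }
  }
}
```

A real implementation would walk the reflection information instead of raw object keys, but the shape of the API could stay this simple: two standalone functions that tree-shake away when field masks aren't used.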
