Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ProtoBuf] Should we check duplicating of proto id? #2653

Open
xiaozhikang0916 opened this issue Apr 25, 2024 · 9 comments
Open

[ProtoBuf] Should we check duplicating of proto id? #2653

xiaozhikang0916 opened this issue Apr 25, 2024 · 9 comments
Labels

Comments

@xiaozhikang0916
Copy link

Right now the ProtoBuf format allow duplicating proto ids in one message, e.g.

@Serializable 
data class MessageWithDuplicateId(
    @ProtoNumber(3) val s: String, 
    @ProtoNumber(3) val d: Int
)

ProtoBuf.encodeToHexString(MessageWithDuplicateId("foo", 42))
// 1a03666f6f182a
// 3:LEN {"foo"} 3:VARINT 42

is a valid definition in this lib, and can be serialized into wire.

But when decoding from the same wire data will just throw message complaining wrong wire type.

Intent

IMO we should check that all proto ids should be unique, as described in the Protocol Buffer Documentation, before any deserializing and serializing happens.

With this check, developers can find out what is wrong ( or what will be wrong ) more easily, especially in messages with one-of properties (in #2546), whose ids may be in different classes.

@xiaozhikang0916
Copy link
Author

xiaozhikang0916 commented Apr 25, 2024

Note: It is legal to have duplicating ids in wire data.

@sandwwraith
Copy link
Member

IIRC duplicating ids in wire data are allowed for messages merging/updating purpose; they should represent the same type, and the last one should always win (i.e. being deserialized into actual class)

@pdvrieze
Copy link
Contributor

@sandwwraith Semantically they are supposed to refer to the same value/attribute. The question is whether/how this is verified. Putting it in the plugin is challenging (esp. to make it format independent - maybe allow the property to be annotated with a @Unique annotation), doing it at runtime is too late (and might not reliably trigger).
The best way I would see is in schema generation tools (which ideally would be integrated into gradle, with format specific plugins)

@xiaozhikang0916
Copy link
Author

And tried with duplicating SerialName:

    @Serializable
    data class Data(
        @SerialName("name") val name: String,
        @SerialName("name") val name2: String
    )

And IDE ( or the plugin? ) will just show error message

Serializable class has duplicate serial name of property 'name', either in the class itself or its supertypes

and also fails in compile time.

I believe that duplicating id should also fail, either in compile time or run time.

@pdvrieze
Copy link
Contributor

@xiaozhikang0916 The difference is that @SerialName is actually translated at compile time (by the serialization compiler plugin) and used to generate the descriptor. As such it can be reliably verified at compile time and an error generated. @ProtoId is a "user/format defined" annotation that is just recorded in the serial descriptor. While the protobuf format is an official part of the library it is still poor design to special case any specific format. Hence if we can add some information to the annotation definition this is something that can be picked up by the plugin and handled properly (such as making the protoId unique).

@xiaozhikang0916
Copy link
Author

Brainstorming:

We can have a base type interface SerialIdentifier: Comparable<SerialIdentifier> to be super type of SerialName and ProtoNumber, so that plugin can check every inheritors of it in class properties to be unique.

But it is still a challenge to check across classes as ProtoOneOf required.

@sandwwraith
Copy link
Member

@xiaozhikang0916 Annotations can't have super interfaces. But we can do this with meta-annotation, e.g.

@Target(AnnotationTarget.ANNOTATION_CLASS)
public annotation class SerialUnique

@xiaozhikang0916
Copy link
Author

Annotations can't have super interfaces.

Oh, sorry I have few experiance in annotation developing.


So then we can introduce an IdentifierProvider, to collect all identifiers recursively from Serializable and ProtoOneOf, and any other annotations needs to flatten ids.

With this, we can collect necessary info across classes for ProtoOneOf with sealed interface, but for those with open classes, static check is still impossible.

@pdvrieze
Copy link
Contributor

@xiaozhikang0916 What would happen is that the code that generates the serial descriptor (for a generated serializer) would verify that annotations with the given meta annotation are unique (in whatever way we define it, noting that some annotations have multiple parameters and may need to be unique across some of them).
Overall I would thus see the following parameters for the meta annotation:

  • Array of the annotation parameters that must be unique (probably by name, default: [value])
  • Uniqueness scope (handled carefully for both annotations on types and annotations on attributes): unique within struct; unique within module (as per internal)
  • Whether attribute annotations override type annotations (generally true)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants