Make it easy to write forward-compatible code: struct builders? #399

wchargin · 2020-12-02T03:08:56Z

In protobuf, adding a field to a message is considered a backward
compatible change. It’s compatible at the wire format level and also
at the source level for many sets of bindings, including the standard
bindings for Python, Go, C++, and Java.

Prost makes it easy to directly initialize structs of message type:

// message Person { string name = 1; string email = 2; }
let person = proto::Person {
    name: "A. U. Thor".into(),
    email: "author@example.com".into(),
};

But then if a new type is added to the message, the Rust code fails to
compile, because it’s no longer initializing all the fields.

One way to achieve compatibility is to spread in ..Default::default():

let person = proto::Person {
    name: "A. U. Thor".into(),
    email: "author@example.com".into(),
    ..Default::default() // more fields may be added later
};

This works, but (a) you have to remember to do it every time, and (b) it
can get a bit spammy when there are multiple layers of nesting:

            ..Default::default()
        },
        ..Default::default()
    },
    ..Default::default()
}

It would be nice to have a way to guarantee better compatibility. One
approach would be some kind of optional builder API:

let person = proto::Person::builder()
    .name("A. U. Thor".into())
    .email("author@example.com".into())
    .build();

I don’t know what exactly the right ergonomics are here, especially
around nested fields. (I half-wonder if the builder should take a
closure, but I think that that might get a bit hairy.)

A nice side-effect of such a builder pattern would be that the methods
could take impl Into<FieldType>, so you could pass &str literals
rather than having to write .to_string()/.into() everywhere.

An option to set #[non_exhaustive] on structs doesn’t sound quite
right to me, since (a) that doesn’t do anything inside the same crate
and (b) then you can’t use functional record updates (..default()).

It’s fine with me if the struct fields remain public; we can have a
style recommendation that says “always use builders when building
messages of a type that you don’t control”, or something. Though being
able to control that via a codegen option would be nice, too.

This is an important consideration for using Prost in shared or large
codebases, especially monorepos. It’s common to have a shared protobuf
definition with autogenerated bindings that many teams depend on.
Enabling the proto owners to make forward-compatible changes without
breaking other teams’ builds would be a requirement to use Prost.

The text was updated successfully, but these errors were encountered:

danburkert · 2020-12-02T19:11:07Z

I don't agree that adding a field to a message being a backwards compatible wire change is sufficient motivation to also guarantee that it's a source-compatible change. There are plenty of other guaranteed wire-compatible changes that typically are not source compatible in any (typed) language bindings, e.g. widening an integer, moving a field into a oneof, renaming a field, removing an optional field, etc.

I do think #[non_exhaustive] is the way forward for users who want to expose a prost-generated type in a public crate API that needs stability guarantees. I don't personally think this is a wise thing to do, but a lot of people seem to want to do it 🤷 . For workspace-style private codebase I find that the compiler pointing out where code needs to get updated is extremely helpful.

(b) then you can’t use functional record updates (..default())

This appears to work just fine: https://gist.github.com/a64fe9b1dd55d89ed8687381395f02d4

danburkert · 2020-12-02T19:15:52Z

This appears to work just fine

Maybe that's an artifact of the type being in the same crate? I haven't used #[non_exaustive] in anger since it's a relatively new feature, so I'm not exactly sure what its limitations are.

wchargin · 2020-12-02T22:08:09Z

Hi! Thanks for your response.

I don't agree that adding a field to a message being a backwards compatible wire change is sufficient motivation to also guarantee that it's a source-compatible change. There are plenty of other guaranteed wire-compatible changes that typically are not source compatible in any (typed) language bindings, e.g. widening an integer, moving a field into a oneof, renaming a field, removing an optional field, etc.

Agreed: not all wire-compatible changes are source-compatible. But
adding a field specifically is meant to be both wire-compatible and
source-compatible. The guidelines (also linked above) say that
fields “may be added to existing APIs in the same version”, and also
say a single version “must” be source- as well as wire- and
semantic-compatible, which implies the conclusion.

Correspondingly, even though Go, Java, and C++ are all typed, their
language bindings treat this as a source-compatible change. Java and C++
only expose getters and setters (and/or builders). Go has structs like
Prost does, but omitted fields in Go struct literals are implicitly set
to zero values.

For workspace-style private codebase I find that the compiler pointing out where code needs to get updated is extremely helpful.

Absolutely; I’m a huge fan of this workflow as well (“change one type,
fix all the errors, it works first try”). To quote from Abseil’s “Use
Exhaustive switch Statements Responsibly” (h/t @nfelt), for
this to work, you need one of three things:

The owner of the enum guarantees no new enumerators will be
added,

The owner of the enum is willing and able to fix our code when new
enumerators are added (e.g. the enum definition is part of the
same project),

The owner of the enum will not be blocked by breaking our build
(e.g. their code is in a separate source control repository), and
we’re willing to be forced to update our switch statements when
updating to the latest version of the enum-owner’s code.

The “change one type, fix all the errors, it works first try” flow is
basically picking option (2), but for published protos this doesn’t
work. For protos that aren’t set in stone, this leaves (3), which may be
impossible (monorepo) or undesirable (if you want to take the proto
semantics).

This appears to work just fine

Maybe that's an artifact of the type being in the same crate?

Yeah, that’s right:

$ cat sub.rs
#[non_exhaustive]
#[derive(Debug, Default)]
pub struct S {
    pub a: i32,
    pub b: i32,
}
$ cat main.rs
extern crate sub;
fn main() {
    let s = sub::S {
        a: 1,
        b: 2,
        ..Default::default()
    };
    dbg!(s);
}
$ rustc sub.rs --crate-type=lib
$ rustc main.rs -L.
error[E0639]: cannot create non-exhaustive struct using struct expression
 --> main.rs:3:13
  |
3 |       let s = sub::S {
  |  _____________^
4 | |         a: 1,
5 | |         b: 2,
6 | |         ..Default::default()
7 | |     };
  | |_____^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0639`.

By the way, I played with sketching out an API for this, and was pleased
to see that if you put the builder-related methods in traits,
then the whole thing works fine even if the proto happens to have a
field called build or something else pathological. (Maybe obvious to
you, but it’s not too often that I see the tangible benefits of trait
methods being namespaced as they are.)

danburkert · 2020-12-02T22:21:58Z

The guidelines (also linked above) say that
fields “may be added to existing APIs in the same version”, and also
says a single version “must” be source- as well as wire- and
semantic-compatible.

That link appears to be to Google Cloud Client guidelines, that's not really relevant to the design of a general purpose protobuf library. I haven't seen any protobuf docs or guidelines that say anything about source compatibility of generated code in any context, and in practice there is no such guarantee as I pointed out.

WRT #[non_exhaustive] that's pretty unfortunate, it makes it a little less useful of a tool. It's still possible to use #[non_exhaustive] without a builder API though, doing something like:

let mut msg = MyMessage::default();
msg.foo = "bar".to_owned();
...

danburkert · 2020-12-02T22:23:40Z

Also note that you could in theory attach a general purpose builder derive macro to prost generated types, e.g. https://crates.io/crates/derive_builder. I haven't tried this so maybe there's some sticking points, but I believe all the hooks are available in prost-build to make it possible.

Kixunil · 2021-07-06T18:07:09Z

Combination of #[non_exhaustive] and builder sounds like a best option to me since then it's possible to match on the struct and access the fields directly without creating compatibility hazard.

I kinda like the idea of general-purpose builder although I'm not sure if there's a reason why prost-specific could be better. One thing that comes to my mind is avoiding naming conflicts (if someone funny names protobuf messages Foo and FooBuilder).

My use case is a library for quite large gRPC protocol that I don't have the time to write cleaner bindings for. (I wish I had.)

damelLP · 2022-12-14T06:53:41Z

https://lib.rs/crates/proto-builder-trait

This seems to do the job?
Maybe it could be upstreamed?

mzabaluev · 2023-04-13T23:07:13Z

derive_builder does away with a lot of the boilerplate and is very flexible, but I think a code generator using specific knowledge of the protobuf type system would be able to provide a better API out of the box.

Consider repeated fields:

message HelloRequest {
    repeated string flags = 2;
}

The builder for HelloRequest could provide this setter method:

impl HelloRequestBuilder {
    pub fn flags<T>(mut self, flags: impl IntoIterator<Item = T>) -> Self
    where T: Into<String>,
    {
        self.inner.flags = flags.into_iter().map(Into::into).collect();
        self
    }
}

Stylistically, I don't like the builder type becoming a part of the API that you have to use by name. It's better to add an associated fn to the message type, and have the builder type tucked away in a submodule:

impl HelloRequest {
    pub fn builder() -> hello_request::HelloRequestBuilder {
        todo!()
    }
}

mzabaluev · 2023-09-14T09:45:49Z

As realized in #901, a protobuf-specific generator can provide much more ergonomic API than a general purpose derive.

wchargin mentioned this issue Dec 2, 2020

rust: add short style guide tensorflow/tensorboard#4399

Merged

Kixunil mentioned this issue Jan 17, 2022

Add usage example Kixunil/tonic_lnd#8

Open

damelLP mentioned this issue Dec 14, 2022

[prost-build] Add Message and Enum specific attributes #783

Closed

mzabaluev mentioned this issue Aug 28, 2023

Extend prost-build to generate message builders #901

Open

mzabaluev mentioned this issue Mar 24, 2024

A more ergonomic enumeration representation #276

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make it easy to write forward-compatible code: struct builders? #399

Make it easy to write forward-compatible code: struct builders? #399

wchargin commented Dec 2, 2020

danburkert commented Dec 2, 2020

danburkert commented Dec 2, 2020

wchargin commented Dec 2, 2020 •

edited

danburkert commented Dec 2, 2020

danburkert commented Dec 2, 2020

Kixunil commented Jul 6, 2021

damelLP commented Dec 14, 2022

mzabaluev commented Apr 13, 2023 •

edited

mzabaluev commented Sep 14, 2023 •

edited

Make it easy to write forward-compatible code: struct builders? #399

Make it easy to write forward-compatible code: struct builders? #399

Comments

wchargin commented Dec 2, 2020

danburkert commented Dec 2, 2020

danburkert commented Dec 2, 2020

wchargin commented Dec 2, 2020 • edited

danburkert commented Dec 2, 2020

danburkert commented Dec 2, 2020

Kixunil commented Jul 6, 2021

damelLP commented Dec 14, 2022

mzabaluev commented Apr 13, 2023 • edited

mzabaluev commented Sep 14, 2023 • edited

wchargin commented Dec 2, 2020 •

edited

mzabaluev commented Apr 13, 2023 •

edited

mzabaluev commented Sep 14, 2023 •

edited