Add support for proto3 optional fields in C# #7382

Merged
merged 6 commits on Apr 24, 2020

jskeet
Contributor

@jskeet jskeet commented Apr 15, 2020

This is the generator part of the C# proto3 optional fields work.

Still to do:

  • Reflection API changes
  • Write unit tests against the (now generated) UnittestProto3Optional.cs

Member

@haberman haberman left a comment

Thanks for doing this!

Contributor Author

@jskeet jskeet left a comment

Thanks for the speedy review. Will revert the first commit and use real_oneof_decl_count etc to simplify the second commit, then force push.

@jskeet
Contributor Author

jskeet commented Apr 16, 2020

Changes now back in - it was nice to remove the first commit and the frequent "is_synthetic" checks.

(It looks like some of the rest of the code could be tidied up by using real_containing_oneof and real_oneof_decl_count too, but I haven't tried to do that here.)
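
For readers unfamiliar with the distinction mentioned above: a proto3 optional field is wrapped by the compiler in a synthetic one-field oneof, and accessors such as real_containing_oneof and real_oneof_decl_count hide those synthetic oneofs from callers. The following is a minimal, self-contained C++ sketch of that idea. OneofModel, FieldModel, RealContainingOneof and RealOneofDeclCount are hypothetical stand-ins written for illustration; they are not the protobuf descriptor classes or their implementation.

```cpp
#include <vector>

// Hypothetical stand-in for a oneof descriptor (not the protobuf class).
struct OneofModel {
  bool synthetic;  // true when the compiler created it for a proto3 optional field
};

// Hypothetical stand-in for a field descriptor.
struct FieldModel {
  const OneofModel* containing_oneof;  // nullptr when the field is in no oneof
};

// Returns the containing oneof only if it was declared by the schema author;
// synthetic oneofs are reported as "no oneof".
const OneofModel* RealContainingOneof(const FieldModel& field) {
  const OneofModel* oneof = field.containing_oneof;
  return (oneof != nullptr && !oneof->synthetic) ? oneof : nullptr;
}

// Counts only the oneofs that are not synthetic.
int RealOneofDeclCount(const std::vector<OneofModel>& oneofs) {
  int count = 0;
  for (const OneofModel& oneof : oneofs) {
    if (!oneof.synthetic) ++count;
  }
  return count;
}
```

With helpers of this shape, generator code can ask one question ("is this field in a real oneof?") instead of pairing every oneof lookup with a separate "is it synthetic?" check.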

Contributor Author

@jskeet jskeet left a comment

Thanks for the next iteration - will see when I can find time to make the proposed changes.

I'll probably do those in a separate commit for now so it's easier to review, then rebase before merging. (The previous changes were sufficiently pervasive that it wasn't worth having them as extra commits.)

@haberman
Member

> I haven't changed anything about what the property returns. I was just trying to stick to this guideline in the implementation doc:
>
> > For consistency with proto2, proto3 optional fields should use exactly the same API as proto2 optional.
>
> Proto2 message fields have HasFoo/Clear() members. Personally I don't like that (for the reasons stated earlier) but I was trying to follow the guidelines. Definitely happy to make an exception in this case though.

Ah, I think I understand the quandary we are in then. It sounds like proto2 and proto3 are currently differing on this point? Proto2 returns a default instance for an unset message field, while proto3 returns null?

That means we have to choose our inconsistency. Either:

  1. proto2 optional message fields are inconsistent with proto3 optional message fields
  2. proto3 explicit optional message fields are inconsistent with proto3 implicit optional message fields.

I'm not sure yet what the right path is here.

@jskeet
Contributor Author

jskeet commented Apr 16, 2020

> Proto2 returns a default instance for an unset message field, while proto3 returns null?

No, both return null - it's just that Proto2 still has HasFoo and ClearFoo() members which just check for null or set null. Example:

public global::ProtobufTestMessages.Proto2.TestAllTypesProto2 RecursiveMessage {
  get { return recursiveMessage_; }
  set {
    recursiveMessage_ = value;
  }
}
/// <summary>Gets whether the recursive_message field is set</summary>
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public bool HasRecursiveMessage {
  get { return recursiveMessage_ != null; }
}
/// <summary>Clears the value of the recursive_message field</summary>
[global::System.Diagnostics.DebuggerNonUserCodeAttribute]
public void ClearRecursiveMessage() {
  recursiveMessage_ = null;
}

If we're content to break existing proto2 customers, we could remove those - I don't think they add value.

At that point we could be consistent between proto2, proto3-implicit-optional and proto3-explicit-optional.

@jskeet
Contributor Author

jskeet commented Apr 16, 2020

It sounds like the right thing to do for this PR is to omit presence for message fields. That's very easy to do.

@jskeet
Contributor Author

jskeet commented Apr 16, 2020

I've added two more commits - one with source code changes addressing your comments (hopefully - I may have missed some long lines, so please let me know if there's anything there), and another with the result of regenerating for comparison.

I would expect to fold these into the existing commits later - let me know when you want me to (or if you'd rather I didn't).

@haberman
Member

> It sounds like the right thing to do for this PR is to omit presence for message fields. That's very easy to do.

By "omit presence" do you just mean omitting the HasFoo() and ClearFoo() methods, and a hasbit?

That seems like a fine first step. :)

We would generally say that message fields are still tracking presence, since null is distinct from an empty sub-message. You could write HasFoo() and ClearFoo() methods that do nothing but test against null and set to null. But omitting them is fine for now too.

@jskeet
Contributor Author

jskeet commented Apr 16, 2020

Yes, I just mean removing the HasFoo property and ClearFoo() method for a foo field. (We already didn't allocate a presence bit for messages, given the use of null.) The "methods that do nothing but test against null and set to null" is precisely what happens in the proto2 generation at the moment - and personally I'd love to remove those, as they make the API surface much larger for no benefit IMO.

Sounds like we're on the same page - so I suspect the next step is writing unit tests for the code generated from unittest_proto3_optional.proto, and doing whatever's required for the reflection API. At that point I believe this PR would be ready to merge, and then we can consider the other suggestions that have come up in the review.

@haberman
Member

> At that point I believe this PR would be ready to merge, and then we can consider the other suggestions that have come up in the review.

Sounds great, thanks! :)

@jskeet jskeet changed the title from "Add support for proto3 optional fields in C# (NOT READY TO MERGE)" to "Add support for proto3 optional fields in C#" Apr 17, 2020
@jskeet jskeet marked this pull request as ready for review April 17, 2020 07:39
@jskeet jskeet requested a review from haberman April 17, 2020 07:41
@jskeet
Contributor Author

jskeet commented Apr 17, 2020

Okay, I believe this is now ready for review with an eye to merging.

@ObsidianMinor
Contributor

My thinking with HasFoo and ClearFoo for nullable types was that the API should stay the same for all fields, no matter how their presence is stored, to keep everything consistent with both proto2 non-nullable fields and proto3 nullable fields, since proto3 nullable fields don't accept null. I thought it might be confusing if some messages had nullable fields that took null and other messages had nullable fields that did not. Also, if consumers moved from proto2 to proto3, those field setters would silently change behavior, rather than HasFoo and ClearFoo simply no longer existing and consumers having to adjust accordingly.

@jskeet
Contributor Author

jskeet commented Apr 17, 2020

> since proto3 nullable fields don't accept null

I don't know what you mean by that. The wrapper type fields are represented by nullable value types, and you can set them to null. Message type fields are represented by references, and you can set them to null.

It's only bytes and string fields that don't accept null (except for wrapper type fields).

Given that the Has/Clear members for message type fields just check/set null references, and that's easy to do in C# already, I'd rather not have them in either proto2 or proto3. It's already quite confusing looking through the reference documentation for a generated C# class given how many members there are - having three members for every message field, without providing any usability benefit IMO, seems like a bad idea.

I don't see what you mean about field setters silently changing behavior - could you clarify what you mean by that? (If you can give a concrete example of a proto and what you understand the behavior to be before and after, that would be easiest.)

@ObsidianMinor
Contributor

> It's only bytes and string fields that don't accept null (except for wrapper type fields).

Ah, it's been a while. You're right: if messages don't already do null checks, then that was a waste of API space and a mistake on my part.

I believe when proto2 support was released we mentioned that all the API was experimental so I think it might be ok to remove those properties and methods.

@jskeet
Contributor Author

jskeet commented Apr 17, 2020

> I believe when proto2 support was released we mentioned that all the API was experimental so I think it might be ok to remove those properties and methods.

Great. I've filed #7395

I'd suggest only looking at implementing it after this is in, as then it'll just be a couple of lines of code, I believe.

Contributor

@jtattermusch jtattermusch left a comment

I reviewed the C# portion of the code, and it looks good to me (with a few very minor nits).

-throw new InvalidOperationException("HasValue is not implemented for proto3 fields");
+hasDelegate = message =>
+{
+throw new InvalidOperationException("HasValue is not implemented for non-optional proto3 fields");
Contributor

nit: s/implemented/supported/ - "not implemented" sounds like something that should be implemented, but isn't yet. "Not supported" sounds like this is by design. Leaving up to you whether you want to change this or not.

Contributor Author

Hmm. I really don't have strong feelings one way or another - I tend to lean towards "keep things as they are" when I'm on the fence.

@@ -48,13 +48,16 @@ namespace csharp {
// header. If you create your own protocol compiler binary and you want
// it to support C# output, you can do so by registering an instance of this
// CodeGenerator with the CommandLineInterface in your main() function.
-class PROTOC_EXPORT Generator : public CodeGenerator {
+class PROTOC_EXPORT CSharpGenerator : public CodeGenerator {
Contributor

just curious: why rename the class? It's under the csharp namespace already, so strictly speaking this shouldn't be necessary.

Contributor Author

I did it for consistency with the other classes I was looking at - namely JavaGenerator and CppGenerator. Looking at PHP, Ruby, and Python though, they just have Generator.

@haberman any preference on this?

@jskeet
Contributor Author

jskeet commented Apr 22, 2020

@haberman: Other than the question about renaming the generator class, do you have any further requested changes? Looks like the rename might have been the cause of the Kokoro build failing - I'll wait for comments about that before changing anything, as either I'll revert the rename, or update the tests.

@haberman
Member

@jskeet Naming of the code generator class is totally up to you. :) There aren't really any compatibility implications there to worry about.

What was the final outcome of the discussion about API?

  • does the API of an optional message field change between proto2 and proto3?
  • does the API of a message field change in proto3 if you add or remove the optional keyword?

Those are the two issues I'm the most focused on. I want the answers to both of these to be "no." :)

It sounds like your plan is to merge this PR, then remove hazzers from message fields in proto2 per #7395?

That seems like a fine resolution. If you are always representing non-present messages with null (instead of a default instance), then I can see why a separate generated HasFoo() method would seem superfluous.

@jskeet
Contributor Author

jskeet commented Apr 22, 2020

> @jskeet Naming of the code generator class is totally up to you. :) There aren't really any compatibility implications there to worry about.

In that case I'll revert that part of the change. Given that there's no need for the class name to change, and the rename is as much a negative as a positive, let's avoid the churn :)

> does the API of an optional message field change between proto2 and proto3?

Yes, at the moment.

> does the API of a message field change in proto3 if you add or remove the optional keyword?

No.

> It sounds like your plan is to merge this PR, then remove hazzers from message fields in proto2 per #7395?

Exactly - assuming we're willing to make a breaking change to the proto2 generated code.

(It doesn't yet, but will in the next commits...)
Most changes are:

- Introducing new helpers, SupportsPresenceApi and RequiresPresenceBit. This allows calling code to be a lot clearer about what it's interested in.
- Changing most previous IsProto2 calls to use one of the two new helper methods
- Avoiding treating synthetic oneofs as regular ones
- Some slight refactoring in csharp_primitive_field to avoid code duplication
- Comments explaining what we want when, so the next maintainer doesn't need to do the detective work I did!
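
As a rough illustration of why two separate helpers are useful, the sketch below models the two questions independently: "does the generated class expose Has/Clear members for this field?" and "does the field need a bit in the presence bitfield?". It is a toy model built only from what the review thread states (proto2 fields keep their existing Has/Clear members, proto3 explicit optional message fields omit them, and message fields use null rather than a presence bit). FieldModel and both Sketch functions are hypothetical names; the exclusion of repeated fields and of fields in a real oneof from presence bits is an assumption, and none of this is the code in csharp_helpers.h.

```cpp
enum class Syntax { kProto2, kProto3 };

// Hypothetical, simplified view of a field (not a protobuf FieldDescriptor).
struct FieldModel {
  Syntax syntax;
  bool is_message;            // message-typed field, represented by a reference
  bool is_repeated;
  bool has_optional_keyword;  // proto3 explicit `optional`
  bool in_real_oneof;         // member of a non-synthetic oneof
};

// Would Has/Clear members be generated? (Toy rules, see assumptions above.)
bool SketchSupportsPresenceApi(const FieldModel& f) {
  if (f.is_repeated) return false;                 // assumption: never for repeated
  if (f.syntax == Syntax::kProto2) return true;    // existing proto2 surface unchanged
  return f.has_optional_keyword && !f.is_message;  // proto3: explicit optional, non-message
}

// Would a presence bit be allocated? Message fields use null instead, and
// (by assumption) a real oneof tracks presence through its case.
bool SketchRequiresPresenceBit(const FieldModel& f) {
  return SketchSupportsPresenceApi(f) && !f.is_message && !f.in_real_oneof;
}
```

In this model a proto2 message field answers "yes" to the first question and "no" to the second, which is the kind of case a single IsProto2 check cannot express.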

This change deliberately doesn't modify the API surface of any
existing code. The only change to previously-generated C# should be
making presence bits more efficient in proto2.

Once proto3 optional fields are supported, we can consider further
changes to make the proto2 and proto3 generated API surface more
consistent (e.g. adding presence API for message fields and oneofs).
…ional.proto

The changes in the existing proto2 code are solely around presence bits. The new generator allocates presence bits more efficiently. (Previously bits were sometimes allocated but never used.)
(This isn't as exhaustive as it might be, but the behavior is basically the same as proto2 optional fields.)
Member

@haberman haberman left a comment

A few more comments.

Also, would it make sense to expose accessors like C++ does for real_containing_oneof(), real_oneof_count() etc.? Does C# support loading descriptors at runtime? If so we should probably enforce that synthetic oneofs are last, like in C++.
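
The "synthetic oneofs are last" rule can be checked with a single pass over the oneofs in declaration order. The following C++ sketch shows one way such a check might look when descriptors are loaded at runtime. SyntheticOneofsAreLast is a hypothetical helper operating on a plain list of flags; it is not the validation code used by the C++ or C# runtime.

```cpp
#include <vector>

// Hypothetical check: each element is true for a synthetic oneof, listed in
// declaration order. Returns true when no real oneof follows a synthetic one.
bool SyntheticOneofsAreLast(const std::vector<bool>& is_synthetic) {
  bool seen_synthetic = false;
  for (bool synthetic : is_synthetic) {
    if (synthetic) {
      seen_synthetic = true;
    } else if (seen_synthetic) {
      return false;  // a real oneof appears after a synthetic one
    }
  }
  return true;
}
```

When the ordering holds, the real oneofs form a prefix of the list, so a "real oneof count" is enough to separate the two groups.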

/// <summary>
/// Returns <c>true</c> if this field is a proto3 optional field; <c>false</c> otherwise.
/// </summary>
public bool IsProto3Optional => Proto.Proto3Optional;
Member

I don't want to expose this publicly. I think we paint ourselves into a corner if code is written to think about the difference between proto2 and proto3.

Could you follow the pattern of the C++ API? C++ is hiding this accessor, and instead exposing a more semantically meaningful accessor: has_presence().

Contributor Author

I'm happy to remove it from the public API for now, but I'll need to check what exact semantics we want for HasPresence. (In particular, it may well not be "is there a HasFoo property".) I'm not going to speculate on it late at night my time :)

@@ -59,7 +59,8 @@ public interface IFieldAccessor
object GetValue(IMessage message);

/// <summary>
-/// Indicates whether the field in the specified message is set. For proto3 fields, this throws an <see cref="InvalidOperationException"/>
+/// Indicates whether the field in the specified message is set.
+/// For proto3 fields that aren't explicitly optional, this throws an <see cref="InvalidOperationException"/>
Member

I would expect this to work for oneof fields also?

Contributor Author

It could do, but it didn't before this PR - I'd prefer to treat that as a separate feature to implement. Likewise we can do it for message fields whether or not they're explicitly optional.

Contributor Author

@jskeet jskeet left a comment

> Also, would it make sense to expose accessors like C++ does for real_containing_oneof(), real_oneof_count() etc.? Does C# support loading descriptors at runtime? If so we should probably enforce that synthetic oneofs are last, like in C++.

Yes, we can load descriptors at execution time. I think I'll want to do the extra oneof work in a separate commit. Will look into it tomorrow, but I may not have time to implement it this week. (It's probably worth mentioning this as a requirement in the internal implementation guide as well.)

@haberman
Member

haberman commented Apr 22, 2020

> I'm happy to remove it from the public API for now, but I'll need to check what exact semantics we want for HasPresence.

Yep that is fair. I think the best semantics for HasPresence are: Reflection.HasField returns a semantically meaningful result.

> It could do, but it didn't before this PR - I'd prefer to treat that as a separate feature to implement. Likewise we can do it for message fields whether or not they're explicitly optional.

Yep, agree on both counts (message fields should always work here, and it seems fine to implement this separately in a follow-up PR).

> It's probably worth mentioning this as a requirement in the internal implementation guide as well.

Good point, I will add this to: https://github.com/protocolbuffers/protobuf/blob/9ae5203712eb80f89261d6df8d5674efa5a0edb8/docs/implementing_proto3_presence.md

@jskeet
Contributor Author

jskeet commented Apr 24, 2020

@haberman: Please see the last commit for reflection changes. I'm pretty comfortable with those. Just to confirm, I'm then expecting to perform follow-up PRs of:

  • Breaking change for proto2 generation: don't generate Has/Clear for message fields
  • Add HasPresence for FieldDescriptor, and make it work wherever feasible (everywhere for proto2; oneofs + message + optional for proto3)
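
The second follow-up item can be read as a simple predicate over a field's properties. The C++ sketch below models the rule as stated in the bullet: every proto2 field, and in proto3 any field that is in a oneof, is a message, or is marked optional. FieldModel and SketchHasPresence are hypothetical names, and treating repeated fields as never having presence is an added assumption that the bullet does not spell out.

```cpp
enum class Syntax { kProto2, kProto3 };

// Hypothetical, simplified view of a field (not a protobuf FieldDescriptor).
struct FieldModel {
  Syntax syntax;
  bool is_repeated;
  bool is_message;
  bool in_oneof;
  bool has_optional_keyword;  // proto3 explicit `optional`
};

// Toy version of the proposed HasPresence rule.
bool SketchHasPresence(const FieldModel& f) {
  if (f.is_repeated) return false;               // assumption: repeated fields excluded
  if (f.syntax == Syntax::kProto2) return true;  // "everywhere for proto2"
  return f.in_oneof || f.is_message || f.has_optional_keyword;
}
```

Under this reading a plain proto3 scalar field is the only singular field without presence, which matches the case where HasValue currently throws.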

Does that all sound correct to you?

@jskeet
Contributor Author

jskeet commented Apr 24, 2020

Thanks - will fix up the commits to be a bit more pleasant, and investigate the Kokoro build failures, then merge on green.

This is more involved than might be expected because the synthetic oneofs don't generate the properties we would usually expect to see.
@jskeet jskeet merged commit 81c9b85 into protocolbuffers:master Apr 24, 2020
@jskeet jskeet deleted the presence-csharp branch April 24, 2020 16:39
@haberman
Member

> Does that all sound correct to you?

Yep, sounds great. :)
