Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve more java field accessor name conflicts #8198
Resolve more java field accessor name conflicts #8198
Changes from 6 commits
af6ce22
2529c60
7da00a5
1f8baae
4038de8
7c0c1fc
9f88f0c
8663cff
bbd504b
ce16ade
7427439
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think snakeCaseToUpperCamelCase is only ever used for checking against specialFieldNames. Can we just keep specialFieldNames in lower camel case instead? This would simplify our code and IMO is a bit more intuitive anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specialFieldNames are in UpperCamelCase to handle proto fields named
_my_field
, since_my_field
will beMyField
when it is converted to camel case (either upper or lower). It seemed more straightforward to keep the special field names in UpperCamelCase for comparison (since it handles proto fields named bothmy_field
,_my_field
, and__my_field
), rather than adding additional logic to handle proto fields starting with_
.Note the tests that use
ForbiddenWordsLeadingUnderscoreMessage
test this case (pun intended ;)).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why both __ and _? Is there a conflict if both cases uses a single underscore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single and double underscores referenced here are simply to match the java code that is currently generated by the protoc java compiler. Notice the previous code used the single and double underscore. In other words, my change does not change the existing semantics for single and double underscores.
The protoc java compiler currently (prior to my change) uses a single or double underscore depending on the proto field name.
The protoc java compiler always appends at least one underscore to java field names. This prevents those field names from clashing with java keywords. For example, if the proto field name was
int
, then the protoc java compiler would generate a java field name ofint_
.The protoc java compiler does not include the last trailing underscore in accessor method names for these fields. For example, the getter method for the proto field named
int
would begetInt()
.In addition, the protoc java compiler adds another underscore (for a total of two underscores) to the java field name for names that would result in accessor method names that clash with other method names. For example, if the proto field name was
class
, the java field name would beclass__
(double underscore). And again, the last trailing underscore is not included in the accessor method names. So the getter method for the proto field namedclass
would begetClass_()
This behavior existed prior to my change. My change simply adds more detection of field names that result in accessor method name clashes. Previously, only
cached_size
,serialized_size
, andclass
were detected. Unfortunately, those are only a subset of the proto field names that would actually cause java assessor method name conflicts. My change adds the complete set of proto field names that cause java accessor method name conflicts.Again, I'm not changing the semantics of single or double underscores. I am also not going to refactor the protoc compiler to change its current semantics for single and double underscores. I'm simply fixing a bug in the protoc compiler where it was generating non-compilable java code due to missing detection of field names that cause conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should field names here be field method accessor names as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the protoc compiler appends a single underscore specifically to the java field names to prevent clashes with java keywords. The protoc compiler does not include this last undescore in accessor method names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't split the ternary ?: operator across these lines. I found this really hard to read. Possibly this method should be split up, one for handling method names and one for handling field names. As is, I'm finding it very hard to grok. I did not initially understand what it was doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the ternary operator and improved the comments. As stated in my other comment, I'm not going to refactor the semantics of how the protoc compiler uses single and double underscores. I'm also not going to refactor how accessor method names are currently generated from field names. Both would be great improvements, but they are outside of the scope of this bugfix.
For comparison, take a look at the previous implementation of this method...
protobuf/java/core/src/main/java/com/google/protobuf/DescriptorMessageInfoFactory.java
Lines 592 to 598 in 3a4d931
I believe the comments in the new implementation have made it more clear, and are sufficient for this bugfix.