Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move DSL implementation from C to pure Ruby #8850

Merged
merged 9 commits into from Aug 2, 2021

Conversation

haberman
Copy link
Member

The Ruby DSL for defining a protobuf schema has always been implemented in C. Originally this was necessary, because upb could not consume descriptors directly. However for several years now the ultimate output of the DSL is just a protobuf (FileDescriptorProto), so the DSL could just as easily be implemented in Ruby.

This PR removes 1,352 lines of code from C and reimplements them as ~400 lines of Ruby. This will have the following benefits:

  • Reduces our overall C footprint, reducing the chance of SIGSEGV bugs (such as [Urgent] Seg fault using Ruby google-protobuf v3.17.3 #8842 which is a recent SEGV crash bug in the DSL).
  • Reduces our overall line count and maintenance burden.
  • Makes it possible to share the DSL code between MRI and JRuby.

It is possible that this PR will have some amount of performance regression in the startup time required to load generated foo_pb.rb files, since the DSL is evaluated using Ruby instead of C. However, the DSL was already calling back and forth between C and Ruby a lot, so hopefully the impact will not be too large. In any case, there are ways of mitigating this if it becomes an issue (see below).

This change requires us to add a new method DescriptorPool#add_serialized_file, which allows for defining messages using a serialized descriptor proto instead of the DSL. This is necessary to bootstrap google/protobuf/descriptor_pb.rb, which cannot use the DSL as these protos are used to implement the DSL.

Future Directions

In the future we may want to change the code generator to use DescriptorPool#add_serialized_file for all generated files, instead of the DSL. This would have the following benefits:

  • Performance: loading from a serialized descriptor will avoid the CPU cost of evaluating the DSL. For large schemas with many proto files, this could have a noticeable impact on app startup time.
  • Correctness/completeness: as we continue to implement missing features in Ruby protobuf, particularly custom options, the DSL will become more and more of a hindrance. Custom options have some very difficult edge cases that would be nearly impossible to accommodate in the DSL unless we resort to serialized descriptors at some level. Using binary descriptors avoids all of this complication.

There are various options for how we could keep the generated code readable, even if we move to binary descriptors. For example, we could do something like:

if false
  # Code for IDEs and human readers.
  class FooMessage
    attr_accessor :bar_field,
    attr_accessor :baz_field
  end
else
  # Code that is actually executed, implements semantics equivalent to the above.
  Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file("<unreadable binary data>")
  FooMessage = Google::Protobuf::DescriptorPool.generated_pool.lookup("FooMessage").msgclass
end

Since this will require some input and dialogue about what is the best fit for the Ruby community, I'm leaving this out of the current PR.

The DSL will continue to be available no matter what, since we've exposed it already as a public API.

@haberman
Copy link
Member Author

Hmm, looks like this is having problems on Ruby 3.0. Both the Linux and MacOS builds are getting ("argument stack underflow") on Ruby 3.0:

MacOS:

 <L010> [sp: 2]
   0081 getlocal_WC_0        3                                           ( 335)
*  0083 opt_send_without_block <calldata:false, -1326341328>             ( 334)
   0085 opt_ltlt             <calldata:<<, 1>                            ( 334)
   0087 leave                                                            ( 338)
 <L003> [sp: 0]
   0088 putnil                                                           ( 334)
 <L001> [sp: -1]
   0089 nop                                                              (   0)
   trace: 200
   0090 leave                                                            ( 338)
---------------------
/Volumes/BuildData/tmpfs/src/github/protobuf/ruby/lib/google/protobuf/descriptor_dsl.rb: /Volumes/BuildData/tmpfs/src/github/protobuf/ruby/lib/google/protobuf/descriptor_dsl.rb:334: argument stack underflow (-208732189) (SyntaxError)

Linux:

 <L010> [sp: 2]
   0081 getlocal_WC_0        3                                           ( 335)
*  0083 opt_send_without_block <calldata:false, 22188128>                ( 334)
   0085 opt_ltlt             <calldata:<<, 1>                            ( 334)
   0087 leave                                                            ( 338)
 <L003> [sp: 0]
   0088 putnil                                                           ( 334)
 <L001> [sp: -1]
   0089 nop                                                              (   0)
   trace: 200
   0090 leave                                                            ( 338)
---------------------
/tmp/protobuf/protobuf/ruby/lib/google/protobuf/descriptor_dsl.rb: /tmp/protobuf/protobuf/ruby/lib/google/protobuf/descriptor_dsl.rb:334: argument stack underflow (-679739421) (SyntaxError)

This is a very odd error, especially that it occurs on Ruby 3.0 only. When running locally I've also seen:

<OBJ_INFO:gc_mark_ptr@gc.c:6106> 0x000055afeb295360 [2 M   ] T_NONE                                                                                                                                                
/usr/local/google/home/haberman/code/protobuf/ruby/lib/google/protobuf.rb:54: [BUG] try to mark T_NONE object 
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux]

@haberman
Copy link
Member Author

The Ruby 3.0 errors appear to not reproduce under Ruby 3.0.2, only Ruby 3.0.0.

This leads me to believe that these are bugs in Ruby 3.0.0 that have since been fixed.

I'll see if I can update the Kokoro jobs to use Ruby 3.0.2 instead.

# above descriptor. We need to infer that "foo" is the package name, and not
# a message itself. */

def get_parent_name(msg_or_enum)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a getter that mutates it argument and returns a value is fairly surprising. Can we instead have this be

def split_parent_name(msg_or_enum)
  full_name = msg_or_enum.name
  idx = full.rindex(?.)
  if idx
    return (name[0...idx], name[idz+1..-1])
  else
    return nil, full_name
  end
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -527,6 +527,42 @@ bool MaybeEmitDependency(const FileDescriptor* import,
}
}

bool GenerateDSLDescriptor(const FileDescriptor* file, io::Printer* printer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateDslDescriptor for the google style guide

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

last_common_dot = nil
min.size.times { |i|
if min[i] != max[i] then break end
if min[i] == ?. then last_common_dot = i end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's switch to using strings over character literals, the syntax is confusing to non-ruby folks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Peredery
Copy link

so...

/usr/local/bundle/gems/google-protobuf-3.18.0/lib/google/protobuf/descriptor_dsl.rb:58:in `add_serialized_file': Unable to build file to DescriptorPool: duplicate file name (groups.proto) (Google::Protobuf::TypeError)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants