Protocol Buffers v3.0.0-alpha-2

@liujisi released this 26 Feb 09:49

Version 3.0.0-alpha-2 (C++/Java/Python/Ruby/JavaNano)

General

  • Introduced Protocol Buffers language version 3 (aka proto3).

    When protobuf was initially open-sourced, it implemented Protocol Buffers
    language version 2 (aka proto2), which is why the version number
    started from v2.0.0. From v3.0.0, a new language version (proto3) is
    introduced while the old version (proto2) will continue to be supported.

    The main intent of introducing proto3 is to clean up protobuf before
    pushing the language as the foundation of Google's new API platform.
    In proto3, the language is simplified, both for ease of use and to
    make it available in a wider range of programming languages. At the
    same time a few features are added to better support common idioms
    found in APIs.

    The following are the main new features in language version 3:

    1. Removal of field presence logic for primitive value fields, removal
      of required fields, and removal of default values. This makes proto3
      significantly easier to implement with open struct representations,
      as in languages like Android Java, Objective C, or Go.
    2. Removal of unknown fields.
    3. Removal of extensions, which are instead replaced by a new standard
      type called Any.
    4. Fixed semantics for unknown enum values.
    5. Addition of maps.
    6. Addition of a small set of standard types for representation of time,
      dynamic data, etc.
    7. A well-defined encoding in JSON as an alternative to binary proto
      encoding.
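
    Item 1 above can be illustrated with a small C++ sketch. It does not use
    the protobuf library: Bar, AppendVarint, and SerializeBar are hypothetical
    names written by hand for this example, and the field numbers are
    assumptions. The sketch only shows the intended effect: a scalar field
    equal to its zero value leaves no trace in the encoded bytes, so a reader
    cannot tell "unset" apart from "set to zero".

```cpp
#include <cstdint>
#include <string>

// Hypothetical message, written by hand for illustration only:
//   message Bar { int32 id = 1; string name = 2; }
struct Bar {
  int32_t id = 0;
  std::string name;
};

// Appends v as a base-128 varint: 7 payload bits per byte, with the high
// bit set on every byte except the last.
inline void AppendVarint(uint64_t v, std::string* out) {
  while (v >= 0x80) {
    out->push_back(static_cast<char>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out->push_back(static_cast<char>(v));
}

// Encodes Bar the way proto3 treats scalar fields: a field that equals its
// zero value (0 or the empty string) is skipped entirely.
inline std::string SerializeBar(const Bar& bar) {
  std::string out;
  if (bar.id != 0) {
    AppendVarint((1u << 3) | 0, &out);  // field 1, wire type 0 (varint)
    // int32 values are sign-extended to 64 bits before varint encoding.
    AppendVarint(static_cast<uint64_t>(static_cast<int64_t>(bar.id)), &out);
  }
  if (!bar.name.empty()) {
    AppendVarint((2u << 3) | 2, &out);  // field 2, wire type 2 (length-delimited)
    AppendVarint(bar.name.size(), &out);
    out.append(bar.name);
  }
  return out;
}
```

    With this sketch, a default-constructed Bar serializes to zero bytes, and
    a Bar whose id was explicitly assigned 0 serializes to exactly the same
    bytes. That is why proto3 has no "has" accessor for such fields.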

    This release (v3.0.0-alpha-2) includes partial proto3 support for C++,
    Java, Python, Ruby and JavaNano. Items 6 (well-known types) and 7
    (JSON format) in the above feature list are not implemented.

    A new "syntax" statement is introduced to specify whether a .proto file
    uses proto2 or proto3:

    // foo.proto
    syntax = "proto3";
    message Bar {...}
    

    If omitted, the protocol compiler will generate a warning and "proto2" will
    be used as the default. This warning will be turned into an error in a
    future release.

    We recommend that new Protocol Buffers users use proto3. However, we do not
    generally recommend that existing users migrate from proto2 to proto3 due
    to API incompatibility, and we will continue to support proto2 for a long
    time.

  • Added support for map fields (implemented in proto2 and proto3 C++/Java/JavaNano and proto3 Ruby).

    Map fields can be declared using the following syntax:

    message Foo {
      map<string, string> values = 1;
    }
    

    The data of a map field is stored in memory as an unordered map and can
    be accessed through generated accessors.
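
    As a rough illustration of what "unordered map behind generated accessors"
    means for calling code, here is a hand-written C++ stand-in for the class
    that protoc would generate from the Foo message above. The class is an
    assumption made for this example: the accessor names values() and
    mutable_values() follow the usual generated naming pattern, std::unordered_map
    stands in for the library's own container type, and GetOr is a hypothetical
    helper.

```cpp
#include <string>
#include <unordered_map>

// Stand-in for generated code (NOT produced by protoc):
//   message Foo { map<string, string> values = 1; }
class Foo {
 public:
  // Read-only view of the map field.
  const std::unordered_map<std::string, std::string>& values() const {
    return values_;
  }
  // Mutable access, used to insert, update, or erase entries.
  std::unordered_map<std::string, std::string>* mutable_values() {
    return &values_;
  }

 private:
  std::unordered_map<std::string, std::string> values_;
};

// Hypothetical helper: returns the value stored under key, or fallback
// when the key is absent. Lookup never inserts into the map.
inline std::string GetOr(const Foo& foo, const std::string& key,
                         const std::string& fallback) {
  auto it = foo.values().find(key);
  return it == foo.values().end() ? fallback : it->second;
}
```

    Because the storage is unordered, code that iterates over the map should
    not depend on the order in which entries come back.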

C++

  • Added arena allocation support (for both proto2 and proto3).

    Profiling shows that memory allocation and deallocation constitute a
    significant fraction of the CPU time spent in protobuf code, and arena
    allocation is a technique introduced to reduce this cost. With arena
    allocation, new objects are allocated from a large piece of preallocated
    memory, and deallocation of these objects is almost free. Early adoption
    shows 20% to 50% improvement in some Google binaries.
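
    The idea behind the technique can be sketched with a toy bump allocator.
    This is not google::protobuf::Arena and says nothing about how that class
    is implemented: ToyArena and its methods are hypothetical, the block never
    grows, and only trivially destructible types are supported. It is meant
    only to show why allocation is cheap (an offset is advanced) and why
    deallocation is almost free (the whole block is released at once).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Toy arena: hands out memory from one preallocated block by bumping an
// offset. Objects are never freed individually.
class ToyArena {
 public:
  explicit ToyArena(std::size_t capacity)
      : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

  // Returns size bytes aligned to align (a power of two), or nullptr when
  // the block is exhausted.
  void* Allocate(std::size_t size, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t base =
        reinterpret_cast<std::uintptr_t>(buffer_.get());
    const std::uintptr_t aligned =
        (base + used_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
    const std::size_t offset = static_cast<std::size_t>(aligned - base);
    if (offset > capacity_ || size > capacity_ - offset) return nullptr;
    used_ = offset + size;  // "Allocation" is just advancing the offset.
    return buffer_.get() + offset;
  }

  // Constructs a T inside the arena. Restricted to trivially destructible
  // types because this toy never runs destructors.
  template <typename T, typename... Args>
  T* Create(Args&&... args) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "ToyArena never calls destructors");
    void* p = Allocate(sizeof(T), alignof(T));
    return p ? new (p) T(std::forward<Args>(args)...) : nullptr;
  }

  std::size_t used() const { return used_; }

 private:
  // Freed in one step when the arena is destroyed.
  std::unique_ptr<unsigned char[]> buffer_;
  std::size_t capacity_;
  std::size_t used_;
};
```

    A caller would create one ToyArena per unit of work (for example, per
    request), allocate freely with Create<T>(...), and let the arena go out of
    scope instead of deleting objects one by one.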

    To enable arena support, add the following option to your .proto file:

    option cc_enable_arenas = true;
    

    The protocol compiler will generate additional code to make the generated
    message classes work with arenas. This does not change the existing API
    of protobuf messages and does not affect wire format. Your existing code
    should continue to work after adding this option. In the future we will
    make this option enabled by default.

    To actually take advantage of arena allocation, you need to use the arena
    APIs when creating messages. A quick example of using the arena API:

    {
      google::protobuf::Arena arena;
      // Allocate a protobuf message in the arena.
      MyMessage* message =
          google::protobuf::Arena::CreateMessage<MyMessage>(&arena);
      // All submessages will be allocated in the same arena.
      if (!message->ParseFromString(data)) {
        // Deal with malformed input data.
      }
      // Must not delete the message here. It will be deleted automatically
      // when the arena is destroyed.
    }
    

    Currently, arenas do not work with map fields. Enabling arenas in a .proto
    file containing map fields will result in compile errors in the generated
    code. This will be addressed in a future release.

Python

  • Python has received several updates, most notably support for proto3
    semantics in any .proto file that declares syntax="proto3".
    Messages declared in proto3 files no longer represent field presence
    for scalar fields (numbers, enums, booleans, or strings). You can
    no longer call HasField() for such fields, and they are serialized
    based on whether they have a non-zero/empty/false value.
  • One other notable change is in the C++-accelerated implementation.
    Descriptor objects (which describe the protobuf schema and allow
    reflection over it) are no longer duplicated between the Python
    and C++ layers. The Python descriptors are now simple wrappers
    around the C++ descriptors. This change should significantly
    reduce the memory usage of programs that use a lot of message
    types.

Ruby

  • We have added proto3 support for Ruby via a native C extension.

    The Ruby extension itself is included in the ruby/ directory, and details on
    building and installing the extension are in ruby/README.md. The extension
    will also be published as a Ruby gem. Code generator support is included as
    part of protoc with the --ruby_out flag.

    The Ruby extension implements a user-friendly DSL to define message types
    (also generated by the code generator from .proto files). Once a message
    type is defined, the user may create instances of the message that behave in
    ways idiomatic to Ruby. For example:

    • Message fields are exposed as ordinary Ruby properties (getter method
      foo and setter method foo=).
    • Repeated field elements are stored in a container that acts like a native
      Ruby array, and map elements are stored in a container that acts like a
      native Ruby Hash.
    • The usual well-known methods, such as #to_s, #dup, and the like, are
      present.

    Unlike several existing third-party Ruby extensions for protobuf, this
    extension is built on a "strongly-typed" philosophy: message fields and
    array/map containers will throw exceptions eagerly when values of the
    incorrect type are inserted.

    See ruby/README.md for details.

JavaNano

  • JavaNano is a code generator and runtime library designed specifically
    for resource-restricted systems, like Android. It is very resource-friendly
    in both the amount of code and the runtime overhead. Here is an overview
    of JavaNano features compared with the official Java protobuf:

    • No descriptors or message builders.
    • All messages are mutable; fields are public Java fields.
    • For optional fields only, encapsulation behind setter/getter/hazzer/
      clearer functions is opt-in, which provides proper 'has' state support.
    • For proto2, if not opted in, has state (field presence) is not available.
      Serialization outputs all fields not equal to their defaults.
      The behavior is consistent with proto3 semantics.
    • Required fields (proto2 only) are always serialized.
    • Enum constants are integers; protection against invalid values only
      when parsing from the wire.
    • Enum constants can be generated into container interfaces bearing
      the enum's name (so the referencing code is in Java style).
    • CodedInputByteBufferNano can only take byte[] (not InputStream).
    • Similarly CodedOutputByteBufferNano can only write to byte[].
    • Repeated fields are in arrays, not ArrayList or Vector. Null array
      elements are allowed and silently ignored.
    • Full support for serializing/deserializing repeated packed fields.
    • Support for extensions (in proto2).
    • Unset messages/groups are null, not an immutable empty default
      instance.
    • toByteArray(...) and mergeFrom(...) are now static functions of
      MessageNano.
    • The 'bytes' type translates to the Java type byte[].

    See javanano/README.txt for details.