Move transformations and code generation out of the intermediate representation #117

Open
jonas opened this issue Jul 12, 2018 · 9 comments
Labels
bindgen Binding generator rfc A proposal for a new functionality

Comments

@jonas
Member

jonas commented Jul 12, 2018

Code generation is currently part of the intermediate representation (IR) classes. To make it possible to rewrite the code generation in Scala, and potentially to use the IR for integration with third-party bindings, transformations and code generation should be implemented in a separate module.

I've experimented with writing a protobuf schema for the IR, and it looks like a simple way to move towards a cross-language serialization format. Scalameta's SemanticDB uses the same approach. Generating the IR classes from a schema will also help enforce that no custom logic is added to the IR.
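As a rough illustration of the separation proposed here, the sketch below keeps the IR as plain data and puts generation in a free function that only reads it. The struct layout follows the `EnumType` message from the schema later in this thread; the function name `generateEnum` and the emitted Scala shape are assumptions made for the example, not the actual bindgen output.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Plain data mirroring an enum in the IR: no generation logic lives here.
struct Enumerator {
    std::string name;
    std::int64_t value;
};

struct EnumType {
    std::string name;
    std::vector<Enumerator> enumerators;
};

// Code generation lives in a separate function (hypothetical name) that only
// reads the IR. The emitted Scala shape is illustrative, not the real output.
std::string generateEnum(const EnumType &e) {
    std::ostringstream out;
    out << "object " << e.name << " {\n";
    for (const Enumerator &en : e.enumerators) {
        out << "  final val " << en.name << " = " << en.value << "\n";
    }
    out << "}\n";
    return out.str();
}
```

With this split, a second backend (for example one written in Scala) could consume the same data without touching the IR types.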

@jonas jonas added rfc A proposal for a new functionality bindgen Binding generator labels Jul 12, 2018
@kornilova203
Member

I like the idea. I feel that the IR is getting too complex because of all the transformations, and it would be good to keep the IR immutable.
Could you share the schema that you wrote?
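To make the immutability point concrete, here is a minimal sketch of a transformation written as a pure function: it takes the IR by const reference and returns a new value. The trimmed-down `IR` and `Typedef` structs, the `used` flag, and the name `removeUnusedTypedefs` are all hypothetical and exist only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical, trimmed-down IR: just typedef names and whether they are used.
struct Typedef {
    std::string name;
    bool used;
};

struct IR {
    std::vector<Typedef> typedefs;
};

// A transformation takes the IR by const reference and returns a new value,
// so the input is never mutated.
IR removeUnusedTypedefs(const IR &ir) {
    IR result;
    for (const Typedef &t : ir.typedefs) {
        if (t.used) {
            result.typedefs.push_back(t);
        }
    }
    return result;
}
```

Because the input is untouched, passes like this can be chained or reordered without one pass observing another's side effects.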

@jonas
Member Author

jonas commented Jul 13, 2018

Here's the protobuf schema; feel free to edit it (via this comment).

syntax = "proto3";

package org.scalanative.bindgen;

option optimize_for = LITE_RUNTIME;

message IR {
  repeated EnumType enums = 1;
  repeated StructType structs = 2;
  repeated UnionType unions = 3;
  repeated TypedefType typedefs = 4;
  repeated Decl.FunctionDecl functions = 5;
  repeated Decl.VariableDecl variables = 6;
  //std::vector<std::shared_ptr<LiteralDefine>> literalDefines;
  //std::vector<std::shared_ptr<PossibleVarDefine>> possibleVarDefines;
  //std::vector<std::shared_ptr<VarDefine>> varDefines;
}

message Range {
  int32 start_line = 1;
  int32 start_character = 2;
  int32 end_line = 3;
  int32 end_character = 4;
}

message Location {
  string uri = 1;
  Range range = 2;
}

message Type {
  /*
  enum Kind {
    UNKNOWN = 0;
    PRIMITIVE = 1;
    ENUM = 2;
    POINTER = 3;
    ARRAY = 4;
    FUNCTION_POINTER = 5;
    UNION = 6;
    STRUCT = 7;
    TYPEDEF = 8;
  }
  */

  //Kind kind = 1; // FIXME: remove?
  Location location = 2;

  oneof kind {
    PrimitiveType primitiveType = 3;
    EnumType enumType = 4;
    PointerType pointerType = 5;
    ArrayType arrayType = 6;
    FunctionPointerType functionPointerType = 7;
    UnionType unionType = 8;
    StructType structType = 9;
    TypedefType typedefType = 10;
  }
}

message PrimitiveType {
  enum Modifier {
    NONE = 0;
    CONST = 0x1;
  }

  string type = 1;
  uint64 modifiers = 2;
}

message EnumType {
  message Enumerator {
    string name = 1;
    // Have both int64 and uint64?
    int64 value = 2;
  }

  string name = 1;
  repeated Enumerator enumerators = 2;
}

message PointerType {
  Type type = 1;
}

message ArrayType {
  Type type = 1;
  uint64 size = 2;
}

message FunctionPointerType {
  // FIXME: Include parameter name in doc string?
  Type returnType = 1;
  repeated Type parameterTypes = 2;
  bool isVariadic = 3;
}

message Field {
  string name = 1;
  Type type = 2;
}

message UnionType {
  string name = 1;
  repeated Field fields = 2;
  uint64 size = 3;
}

message StructType {
  // FIXME: packed attr
  string name = 1;
  repeated Field fields = 2;
  uint64 size = 3;
}

message TypedefType {
  string name = 1;
  Type type = 2;
}

message Decl {
  /*
  enum Kind {
    UNKNOWN = 0;
    FUNCTION = 1;
    VARIABLE = 2;
    VARIABLE_DEFINE = 3;
    LITERAL_DEFINE = 4;
  }
  */

  //Kind kind = 1;
  Location location = 2;

  oneof kind {
    FunctionDecl functionDecl = 11;
    VariableDecl variableDecl = 12;  
  }

  message FunctionDecl {
    message Parameter {
      string name = 1;
      Type type = 2;
    }
    
    string name = 1;
    Type returnType = 2;
    repeated Parameter parameters = 3;
    bool isVariadic = 4;
  }
  
  message VariableDecl {
    enum Modifier {
      NONE = 0;
      VOLATILE = 1;
    }
  
    string name = 1;
    Type type = 2;
    repeated Modifier modifiers = 3;
  }
}
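One detail of the schema that may be worth spelling out: `PrimitiveType.modifiers` is a `uint64`, so the `Modifier` values are presumably meant to be combined as bit flags. Below is a minimal sketch of how a consumer might pack and test them. The helper names are assumptions, and `RESTRICT` is a hypothetical second flag added only to show several bits combining.

```cpp
#include <cstdint>

// Mirrors PrimitiveType.Modifier from the schema. RESTRICT is a hypothetical
// extra flag, added here only to show how several bits combine.
enum Modifier : std::uint64_t {
    NONE = 0,
    CONST = 0x1,
    RESTRICT = 0x2, // assumption: not part of the schema
};

// Returns true when every bit of `flag` is set in the packed `modifiers`
// field. NONE never counts as set.
bool hasModifier(std::uint64_t modifiers, Modifier flag) {
    return flag != NONE && (modifiers & flag) == flag;
}

// Returns `modifiers` with the bits of `flag` added.
std::uint64_t withModifier(std::uint64_t modifiers, Modifier flag) {
    return modifiers | flag;
}
```

Note that `VariableDecl` takes the other route and stores `repeated Modifier modifiers`, so the two messages would be read differently unless one representation is chosen for both.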

@jonas
Member Author

jonas commented Jul 13, 2018

Added in branch protobuf-ir.

@kornilova203
Member

@jonas, do you think it would be better to switch to libclang and do both header parsing and code generation in Scala Native? We cannot use the C++ LibTooling API from Scala Native.

@kornilova203
Member

Protobuf does not support Scala or C out of the box, but FlatBuffers supports C, so it can be used with Scala Native.
I personally prefer FlatBuffers because the serialized data is also the in-memory representation (although that does not matter for this project, because the IR does not contain many objects).

@kornilova203
Member

I generated FlatBuffers C sources for a test schema, but the generated functions are defined in header files, so I cannot compile them into a library for use from Scala Native.
Renaming the .h files to .c causes errors.

@jonas
Member Author

jonas commented Jul 17, 2018

Protobuf messages can be consumed in Scala with ScalaPB (https://github.com/scalapb/ScalaPB). There's also a C implementation, but C++ would work too as long as the integration with Scala Native goes through a C API.

@kornilova203
Member

But what about LibTooling? It has a C++ API. Should we use libclang instead?

@jonas
Member Author

jonas commented Jul 20, 2018

As long as Scala Native doesn't have to interface directly with C++ APIs we should be okay.
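For reference, one common way to keep C++ behind a C API is an opaque handle plus `extern "C"` functions, sketched below. Every name here (`bindgen_ir`, `bindgen_ir_new`, and so on) is hypothetical and is not part of the bindgen code base. The `IR` class is a stand-in for whatever the C++ side actually holds.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// C++ side: free to use classes and the standard library.
class IR {
public:
    void addEnum(const std::string &name) { enums_.push_back(name); }
    std::size_t enumCount() const { return enums_.size(); }
    const std::string &enumName(std::size_t i) const { return enums_[i]; }

private:
    std::vector<std::string> enums_;
};

// C side: an opaque handle plus plain functions with C linkage, so callers
// never see a C++ type. All names below are hypothetical.
extern "C" {
typedef struct bindgen_ir bindgen_ir;

bindgen_ir *bindgen_ir_new(void) {
    return reinterpret_cast<bindgen_ir *>(new IR());
}

void bindgen_ir_add_enum(bindgen_ir *ir, const char *name) {
    reinterpret_cast<IR *>(ir)->addEnum(name);
}

std::size_t bindgen_ir_enum_count(const bindgen_ir *ir) {
    return reinterpret_cast<const IR *>(ir)->enumCount();
}

// Returns nullptr for an out-of-range index instead of throwing, since
// exceptions must not cross the C boundary.
const char *bindgen_ir_enum_name(const bindgen_ir *ir, std::size_t index) {
    const IR *self = reinterpret_cast<const IR *>(ir);
    return index < self->enumCount() ? self->enumName(index).c_str() : nullptr;
}

void bindgen_ir_free(bindgen_ir *ir) {
    delete reinterpret_cast<IR *>(ir);
}
}
```

Scala Native could then bind only to these C functions, while LibTooling or a C++ protobuf runtime stays an implementation detail of the library.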

2 participants