Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider adding IfElseStatement for if-then-else, separating from IfStatement for if-then #33

Open
arai-a opened this issue May 15, 2018 · 6 comments

Comments

@arai-a
Copy link

arai-a commented May 15, 2018

(moved from binast/binjs-ref#131)

Background

  • we want to generate bytecode directly from .binjs file, without generating on-memory AST structure
  • we don't want to seek or lookahead inside .binjs file, for simplicity and performance
  • we want to reduce the amount of modification to already-generated bytecode (including source note) (patching jump target is necessary anyway tho...)
  • IfStatement has optional alternate statement, and the existence of the alternate is unknown until we start parsing it, that means, it's unknown when generating branch opcode or generating bytecode for consequent
interface IfStatement : Node {
  attribute Expression test;
  // The first `Statement`.
  attribute Statement consequent;
  // The second `Statement`, if present.
  attribute Statement? alternate;
};

So, with current IfStatement interface, we should modify the branch's kind (source note) when it turns out that there's alternate.
It would be better that kind of information is known at the beginning.

Solution

Separate IfStatement into IfStatement without alternate, and IfElseStatement with alternate

interface IfStatement : Node {
  attribute Expression test;
  attribute Statement consequent;
};

interface IfElseStatement : Node {
  attribute Expression test;
  attribute Statement consequent;
  attribute Statement alternate;
};

Pros

  • Bytecode generation becomes simpler
  • IfStatement without alternate becomes smaller in .binjs file

Cons

  • The number of interfaces increases, that might increase the header size and tree size (depends on file format)
    (With multipart encoding, increasing interfaces may result in the number of tuple index exceeding 127, which results in 2-bytes data inside [TREE])

Consideration

  • This is very implementation specific requirement (at least for my case), and I'm not sure if it's a good idea to modify spec for such purpose
  • Similar issue (requiring a certain aspect of subtree, before the appearance of the actual child node) would happen to other interfaces as well, and separating all of those interfaces might explode the number of interfaces
@Yoric
Copy link
Collaborator

Yoric commented May 15, 2018

Anything that simplifies streaming bytecode compilation is a good idea.

So I'm for this.

@j-f1
Copy link

j-f1 commented May 15, 2018

How about having a “variant” byte after the node type (defaulting to 0), which could act as a hint to the engine?
For example, for IfStatement:

hint value meaning
0 (no hint provided, no optimization possible)
1 if (...) { ... }
2 if (...) { /* empty */ } else { ... }
3 if (...) { ... } else { ... }

@arai-a
Copy link
Author

arai-a commented May 16, 2018

having another variant for IfStatement with empty-then-clause sounds interesting.
it would be nice to figure out how common it is.

Then, about variant byte, if we can (or should?) throw error if the subtree violates the variant's structure, I think it's almost the same thing as adding dedicated interfaces for each, and just using dedicated interface should be simpler (we don't need extra byte for each node).

If the variant is just a hint and the tree may violate the hint, it won't so much simplify bytecode compilation.
(we have to support modifying already-emitted code anyway)

@Yoric
Copy link
Collaborator

Yoric commented May 16, 2018

I agree that the hint doesn't look useful to me. It takes more space, adds one more possible hint violation/syntax error, and I don't feel that it would improve anything.

The empty-then-clause is an interesting question. How would we use the information?

@arai-a
Copy link
Author

arai-a commented May 16, 2018

currently we're not emitting different bytecode for empty-then-clause case,
but we might be able to emit somewhat optimized code (like, flipping the condition and reduce the number of jumps),
and such modification requires the info about the existence of then-clause's body before emitting branch's bytecode,
which is possible if we have on-memory AST, but not with streaming compilation without seek/lookahead.
(I haven't checked how much it improves performance tho)

@Yoric
Copy link
Collaborator

Yoric commented May 21, 2018

Note that the format I'm currently experimenting doesn't care all that much about having more than 128 nodes. We only get an observable compression impact if we have more than 128 instances of Expression or more than 128 instances of Statement, etc. in a single file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants