Define CST elements #107
For the sake of being able to more easily tell the type of a source element, it would probably be helpful to include
I thought about that, but
```js
interface ConcreteNode <: Node {
    sourceElements: [ ChildReference | Token | Nontoken ];
```
Can `ConcreteNode` show up in `sourceElements`? If not, should it?
No and no, because `ConcreteNode` extends `Node` (which corresponds to a syntactic production) while `sourceElements` contains lexical input elements, which are not nodes.
I did it this way because I don't think adding `sourceElements: [ ChildReference | Token | Nontoken ] | null` to `Node` would be valid syntax, but I see that I missed noting that everything else should extend not `Node` but `ConcreteNode`. I'll update.
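To make the node/input-element distinction concrete, here is a minimal sketch of what a concrete node for `f(a, b)` might look like, together with a hypothetical `render` that walks it. The `{ element, value }` and `{ reference }` shapes follow the snippets in this thread; the `"Identifier"` element name, the `id` helper, and this particular `render` are assumptions for illustration, not part of the proposal.

```js
// Hypothetical shapes, inferred from the snippets in this thread:
//   token or nontoken: { element: "Punctuator", value: "(" }
//   child reference:   { reference: "callee" } or { reference: "arguments#next" }
function render(node) {
    const cursors = {}; // next unread index per list-valued property
    return node.sourceElements.map(function (input) {
        // Lexical input elements carry their own text
        if (input.reference === undefined) return input.value;
        // References point at child nodes, which are rendered recursively
        const [prop, mode] = input.reference.split("#");
        if (mode === "next") {
            const index = cursors[prop] || 0;
            cursors[prop] = index + 1;
            return render(node[prop][index]);
        }
        return render(node[prop]);
    }).join("");
}

// Illustrative leaf-node factory (an assumption, not from the proposal)
const id = name => ({
    type: "Identifier",
    name,
    sourceElements: [{ element: "Identifier", value: name }]
});

const call = {
    type: "CallExpression",
    callee: id("f"),
    arguments: [id("a"), id("b")],
    sourceElements: [
        { reference: "callee" },
        { element: "Punctuator", value: "(" },
        { reference: "arguments#next" },
        { element: "Punctuator", value: "," },
        { element: "WhiteSpace", value: " " },
        { reference: "arguments#next" },
        { element: "Punctuator", value: ")" }
    ]
};

console.log(render(call)); // f(a, b)
```

Note that `sourceElements` only ever holds references and lexical elements; the child nodes themselves stay in their usual AST properties.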
Thanks for the use cases, @nzakas; they're great! I'll give them a first-pass attempt, but understand that some sharp edges are explicitly avoided, and others I'm sure I just missed.
This one has some interesting properties; I'll come back to it.
/* Insert `newArg` between `a` and `b` in `node.arguments`, where `b` is at `argIndex` */
node.arguments.splice(argIndex, 0, newArg);
let srcIndex;
let fill = node.sourceElements.reduce(function(fill, input, i) {
    // Found an argument?
    if ( input.reference === "arguments#next" ) {
        // Record it by decrementing `argIndex`, and set `srcIndex` if we found `b`
        if ( argIndex-- === 0 ) srcIndex = i;
    // Copy everything between `a` and `b` except comments
    } else if ( argIndex === 0 && input.element.slice(0, 7) !== "Comment" ) {
        fill.push( Object.assign({}, input) );
    }
    return fill;
}, []);
node.sourceElements.splice(srcIndex, 0, { reference: "arguments#next" }, ...fill);
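The reference-counting scan above is what the later snippets in this thread call `indexOfRef`. Its definition is not given, so the following is only a guess at its shape: it assumes list children are referenced as `"<property>#next"`, and the third parameter with its `"body#next"` default is my own addition.

```js
// Hypothetical helper: index within `sourceElements` of the reference to the
// `n`-th (0-based) child of a list-valued property such as `body`.
function indexOfRef(sourceElements, n, refName = "body#next") {
    let remaining = n;
    for (let i = 0; i < sourceElements.length; i++) {
        // Count down once per matching reference; stop at the n-th one
        if (sourceElements[i].reference === refName && remaining-- === 0) {
            return i;
        }
    }
    return -1; // no such reference
}

const elements = [
    { element: "WhiteSpace", value: " " },
    { reference: "body#next" },
    { element: "LineTerminator", value: "\n" },
    { reference: "body#next" }
];
console.log(indexOfRef(elements, 1)); // 3
```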
I'll assume this is a function declaration, since replacing a function expression with a comment requires more information about the context (but is otherwise similar).

/* Replace the function declaration at index `i` in `node.body` with a comment */
srcIndex = indexOfRef(node.sourceElements, i); // uses the logic from above
node.sourceElements.splice(srcIndex, 1,
    createBlockCommentHead(),
    createCommentBody(
        // Escape `*/` sequences without touching existing escapes
        render(node.body[i]).replace(/\\([\w\W])|(\*)(\/)/g, "$2\\$1$3")
    ),
    createBlockCommentTail()
);
node.body.splice(i, 1);
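The escaping regex is dense, so here it is in isolation as a small worked example. The wrapper name `escapeCommentClose` is made up for illustration; the pattern and replacement string are copied from the snippet above.

```js
// Either alternative may match:
//   \\([\w\W])  an existing escape, re-emitted unchanged as "\" + $1
//   (\*)(\/)    a comment terminator, emitted as $2 + "\" + $3
// Groups that did not participate in the match expand to the empty string.
const escapeCommentClose = text =>
    text.replace(/\\([\w\W])|(\*)(\/)/g, "$2\\$1$3");

console.log(escapeCommentClose("a */ b"));  // a *\/ b
console.log(escapeCommentClose("a \\n b")); // a \n b   (unchanged)
console.log(escapeCommentClose("/a\\*/"));  // /a\*/    (unchanged)
```

The last case is one of the sharp edges: source text that already contains `\*/`, such as the regular expression literal `/a\*/`, is treated as an existing escape and passes through untouched, so it would still terminate the enclosing block comment.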
For simplicity, I'll avoid adding indentation and assume the same

/* Roll variable declarations into `decl`, at index `i` in `node.body` */
let declIndex = indexOfRef(node.sourceElements, i); // uses the logic from above
let next;
do {
    // Break out if there is no further declaration
    next = node.body[++i];
    if ( !next || next.type !== "VariableDeclaration" ) break;

    // Ensure the necessary comma
    let input, terminatorIndex = decl.sourceElements.length;
    while ( (input = decl.sourceElements[--terminatorIndex]) &&
        input.value !== ";" && !input.reference );
    if ( input.reference ) {
        // No `;` follows the last declarator, so add a punctuator after it
        decl.sourceElements.splice(terminatorIndex + 1, 0,
            (input = { element: "Punctuator" }));
    }
    input.value = ",";

    // Claim intervening source elements
    let nextIndex = declIndex;
    while ( (input = node.sourceElements[++nextIndex]) &&
        input.reference !== "body#next" );
    decl.sourceElements.push(
        // Pluck (but do not import) the reference to `next`;
        // the delete count covers `declIndex + 1` through `nextIndex`
        ...node.sourceElements.splice(declIndex + 1, nextIndex - declIndex).slice(0, -1)
    );

    // Claim _subordinate_ source elements (including declarator references)
    decl.sourceElements.push(
        // ...but excluding the `var`/`let`/`const` keyword
        ...next.sourceElements.filter( input => input.element !== "Keyword" )
    );

    // Update the AST, removing `next` and moving its declarators to `decl`;
    // step `i` back so the statement that shifts into this slot is examined next
    node.body.splice(i--, 1);
    decl.declarations.push(...next.declarations);
} while ( true );

Ok, back to the first:
I think this requires access to parent nodes, but we absolutely prohibit the multiple (and in fact circular) references necessary to get them directly from the ESTree data structure. However, programs processing trees are free to do whatever they need on input/output, be they attaching ids like my POC, using on-the-side (Weak)Maps, or even introducing outright cycles like CST. So, assuming some helper functions:

let reNonToken = /^WhiteSpace|^LineTerminator|^Comment/;
function isParenthetical( node ) {
    // Check inwards if the first token is an open parenthesis
    let inwardsParenthetical = node.sourceElements.reduce(function(answer, input) {
        if ( answer != null ) return answer;
        if ( reNonToken.test(input.element) ) return answer;
        if ( input.value === "(" ) return true;
        return false;
    }, null);
    if ( inwardsParenthetical ) return true;

    // Check upwards
    sides: for ( let dir of [-1, 1] ) {
        let expectedValue = dir === -1 ? "(" : ")";
        let categorize = dir === -1 ? $categorizeOpenParen : $categorizeCloseParen;
        let nut = node, parent; // "node under test"
        while ( (parent = $parentNode(nut)) ) {
            // Find the first preceding/following token
            let input, nutIndex = $sourceIndexOf(nut, parent);
            while ( (input = parent.sourceElements[nutIndex += dir]) &&
                reNonToken.test(input.element) );
            if ( input ) {
                // If it's not a grouping parenthesis, we know all we need to
                if ( input.value !== expectedValue ||
                    categorize(input, parent) !== "ParenthesizedExpression" ) {
                    return false;
                }
                // Otherwise, we may need to check the other side
                // (labeled, so this advances the outer loop, not the `while`)
                continue sides;
            }
            nut = parent;
        }
        // If we run out of "up", `node` is not parenthesized
        return false;
    }
    // We `continue`d twice, so `node` _is_ parenthesized
    return true;
}
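The `$parentNode` and `$sourceIndexOf` helpers are left undefined above. One way they could be backed by an on-the-side `WeakMap`, as mentioned earlier, is sketched here. Everything in this block, including `indexParents`, the child-detection rule (any object with a string `type`), and the `"<property>#next"` reference format, is an assumption for illustration; `$categorizeOpenParen` and `$categorizeCloseParen` are not sketched.

```js
// Hypothetical side table mapping each node to its parent; the tree itself
// stays free of cycles.
const parents = new WeakMap();

function indexParents(node, parent = null) {
    if (parent) parents.set(node, parent);
    for (const key of Object.keys(node)) {
        if (key === "sourceElements") continue; // lexical elements, not nodes
        const value = node[key];
        const children = Array.isArray(value) ? value : [value];
        for (const child of children) {
            if (child && typeof child.type === "string") indexParents(child, node);
        }
    }
}

const $parentNode = node => parents.get(node);

// Index of the reference to `child` within `parent.sourceElements`, or -1
function $sourceIndexOf(child, parent) {
    const cursors = {}; // next unread index per list-valued property
    for (let i = 0; i < parent.sourceElements.length; i++) {
        const ref = parent.sourceElements[i].reference;
        if (ref === undefined) continue;
        const [prop, mode] = ref.split("#");
        let target = parent[prop];
        if (mode === "next") {
            const index = cursors[prop] || 0;
            cursors[prop] = index + 1;
            target = parent[prop][index];
        }
        if (target === child) return i;
    }
    return -1;
}

// Usage with an illustrative tree for `f(a, b)`
const leaf = name => ({
    type: "Identifier",
    name,
    sourceElements: [{ element: "Identifier", value: name }]
});
const tree = {
    type: "CallExpression",
    callee: leaf("f"),
    arguments: [leaf("a"), leaf("b")],
    sourceElements: [
        { reference: "callee" },
        { element: "Punctuator", value: "(" },
        { reference: "arguments#next" },
        { element: "Punctuator", value: "," },
        { element: "WhiteSpace", value: " " },
        { reference: "arguments#next" },
        { element: "Punctuator", value: ")" }
    ]
};
indexParents(tree);
console.log($sourceIndexOf(tree.arguments[1], tree)); // 5
```

A `WeakMap` keeps the lookup table from pinning nodes in memory once the tree is discarded, which is presumably why the "(Weak)" option is mentioned.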
rough implementation at https://npmjs.com/cstify (repo); test at http://forivall.com/astexplorer/
Fixes gh-41
Based on #41 (comment), with slight modifications.