Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduces Symbol Provider DSL using @DslMarker annotations to limit scopes within lambdas #311

Closed
wants to merge 5 commits into from

Conversation

loradd
Copy link
Member

@loradd loradd commented Mar 12, 2024

feat: introduces Symbol Provider DSL using @DslMarker annotations to limit scopes within lambdas

…limit scopes within lambdas

Signed-off-by: Lorenzo Addazi <lorenzo.addazi@strumenta.com>
Signed-off-by: Lorenzo Addazi <lorenzo.addazi@strumenta.com>
@loradd
Copy link
Member Author

loradd commented Mar 12, 2024

In this PR, we also modify the Symbol Description API to support arbitrary values as Symbol Description properties. That is, the API now supports including properties as follows:

symbolProvider {
    symbolFor(SomeNode::class) {
        include("someProperty", UUID.randomUUID().toString()) // for example
        include("anotherProperty", "anything that can be converted to a value description")
    }
}

@loradd loradd self-assigned this Mar 12, 2024
@ftomassetti
Copy link
Member

I will wait for the PR to be marked as ready before reviewing it

…Node to promote ReferenceByNames as standalone nodes (required for lsp support)

Signed-off-by: Lorenzo Addazi <lorenzo.addazi@strumenta.com>
Signed-off-by: Lorenzo Addazi <lorenzo.addazi@strumenta.com>
…lasses, e.g. SymbolProvider

Signed-off-by: Lorenzo Addazi <lorenzo.addazi@strumenta.com>
@loradd loradd marked this pull request as ready for review April 12, 2024 11:53
@loradd loradd requested a review from ftomassetti April 12, 2024 11:53
Copy link
Member

@ftomassetti ftomassetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we miss tests for type calculation

import kotlin.reflect.full.isSuperclassOf

// TODO merge into NodeIdProvider.kt file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this TODO be solved before merging?

}
}
}
private val namesToSymbolKeys: MutableMap<String, String> = mutableMapOf()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't I have more than one symbol with the same name in the scope? For example, overloaded methods

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, multiple symbols for the same symbol key are possible - both in the local and external symbols map. This map is only keeping track of the original name and the corresponding key. For example, if I have a class 'Person' and a variable 'person' within a case-insensitive scope description, there will be two entries in this map: 'Person' -> 'person' and 'person' -> 'person'. The actual symbols will both be associated with the key 'person'. This allows us to keep track of the original names, which are used in features like code completion. Otherwise, we would lose this information when converting names to keys, e.g. Person would become person and be suggested in the wrong way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the overloaded methods support: it would require a specific symbol describing methods. I think this is a language-specific feature. In the scope description, we can have multiple symbols associated to a given name. However, we can only have one symbol per type in this list. Hence, the multiple overloaded implementations of a method should be represented inside the function symbol. Am I wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hence, the multiple overloaded implementations of a method should be represented inside the function symbol. Am I wrong?

I am not sure I understand this part

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that, in case of function overloading, we might have a FunctionSymbol containing an implementation property linking to the various actual FunctionDefinition instances. I imagine that also the logic matching the correct implementation depending on the parameter types would be inside the FunctionSymbol class definition. Otherwise, how should handle cases where multiple symbols of the same type and name are contained in a given scope? I am not sure this is something that should be handled inside ScopeDescription, but a rather something to consider when designing the symbol representations for a given language. Any suggestions about a possible generic solution?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view symbol resolution should figure out, among other things, which method get called. And that means also distinguishing between overloaded variants of methods with the same name. So after symbol resolution is performed I would expect the reference in a method call to point to the right method declaration. In this way other systems can just consume that information without having to know anything on the method resolution rules. For example, some editor could just use that information to implement a "go to declaration" functionality. Symbol resolution should fine the information necessary to perform the resolution by looking at the symbol definition.

Otherwise, how should handle cases where multiple symbols of the same type and name are contained in a given scope?

I was thinking to have a symbol for each overloaded variant, describing the signature of that method and symbol resolution would consider the information to pick the right symbol. If the actual symbol resolution is delegated to FunctionSymbol, then the user needs to know about it and calling the appropriate method.

Otherwise, how should handle cases where multiple symbols of the same type and name are contained in a given scope?

Considering not only the name but the whole symbol description. Otherwise why do we need to have any information but the name in the Symbol Description? The information is there only to be consumed during symbol resolution, or at least this was my understanding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the symbol resolver traverses the AST and aggregates all references. For each reference, it computes the corresponding scope and tries to resolve the reference using this. The resolution is generic and uses:

  • The name of the reference by name instance;
  • The type of the reference by name definition;

It absolutely makes sense to consider the signature in cases of overloaded functions, for example. I am not sure, however, how this could be implemented in a generic way. It looks to me like we might need some mechanism to specify the matching criteria. For example, in the case of a variable, it might be the name. In case of functions, it could integrate parameter type information as well.

The actual sequence of symbol resolution steps might change from node to node. For example, when resolving symbols for a function call expression, we might want to do it on its argument expressions first. I feel we could introduce support to specify resolution steps, for example:

resolve(InvocationExpr::class) { (invocationExpr) ->
   // resolve symbols in arguments first
   invocationExpr.arguments.forEach { argument -> resolve(argument) }
   // compute the scope for the invocationExpr::function property
   val scope = scopeProvider.scopeFor(invocationExpr.function)
   // use the scope to obtain a list of candidates
   // select the most appropriate candidate according to some criteria
   invocationExpr.function.identifier = /* the identifier of the selected candidate */
   // we could also differentiate between externalCandidates and localCandidates
   invocationExpr.function.referred = /* the node result from asking local candidates */
}

The workflow would consist in:

  • Defining a Symbol provider to specify which nodes can be referenced from the outside;
  • Defining a Scope provider to specify what is visible from a given Node;
  • Defining a Type provider to specify the type for a given Node (considers references as resolved);
  • Defining a Symbol resolver to specify how to resolve symbols;

In the definition of a symbol resolution rule, we might use a type provider to retrieve the type of a given node - with the assumption that all required symbols have been resolved if possible. A scope provider to return back the scope description from a given node. The scope provider can use the type provider, but the type provider cannot use the scope provider. Symbol resolution rules coordinate and make sure that symbols are resolved before being used. Does this make sense?

* @property fields the visible fields of the described node
*
* @author Lorenzo Addazi <lorenzo.addazi@strumenta.com>
**/
data class SymbolDescription(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not make SymbolDescription as a Node, because it is not part of an AST and Kolasu nodes represent only parts of the AST. Is it possible to do this change or would that have too much impact?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed and decided to define SymbolDescription as nodes some weeks ago and the internal implementations have been modified with this idea in mind. This could be avoided but it would have impacts on the Symbol Provider configuration API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I do not remember why we decided to make SymbolDescriptions to extend Node. Do you remember the reason? Is that reason still true?

In order not to delay merging this PR I would then write down the rationale for making SymbolDescription extend Node, and add a note that this could be later reconsidered

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, we thought it would have made integration with LionWeb easier - re-use logic for storing/loading Kolasu nodes. I don't know if that is still the case, but it would still be valid in case we would decide to store SymbolDescriptions inside a LionWeb repository. For the same reason, we defined the Symbol Description Language. Defining symbol descriptions as nodes is also useful as all traversal and other Kolasu APIs can be reused. Is there any document where we decided that Kolasu nodes only represent AST nodes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, we thought it would have made integration with LionWeb easier - re-use logic for storing/loading Kolasu nodes. I don't know if that is still the case,

I think this is not anymore the case. Sorry for the confusion. We can implement serialization to LionWeb (and deserialization) rather easily, without the need to force extending Node for that goal

Defining symbol descriptions as nodes is also useful as all traversal and other Kolasu APIs can be reused

I see the point

Is there any document where we decided that Kolasu nodes only represent AST nodes?

No, and there should be one.

I think that on a practical level is sometimes tempting to make something a Kolasu Node, to reuse the APIs we have. But on a conceptual level this causes confusion as we end up having Node indicating mostly AST nodes and sometimes other things, for which the class Node have features that makes sense to use for proper AST nodes, but not for other Nodes.

Maybe in Kolasu 1.6 we could have two interfaces: one called GenericNode and one called ASTNode extending GenericNode, but I do not see clearly what abstraction would GenericNode represent

*
* @author Lorenzo Addazi <lorenzo.addazi@strumenta.com>
**/
data class TypeDescription(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also make TypeDescription not nodes.
Also, for which purpose do we need TypeDescriptions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type descriptions are used to describe the type of the node described by a symbol - its name and possible supertypes. These are used when reasoning over the type of symbol without loading the corresponding node. I think we could define these without extending Node, is there any issue related to these being Node instances?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Sorry, I thought this represented types calculated for expressions. I instead understand that they represent instead the type of the Node (e.g., the AST class). In Kolasu 1.6., we have classes representing the metamodel (e.g., Concept) and we could use that for this goal.

is there any issue related to these being Node instances?

The issue is that a Node should represent an AST Node. Using it for other things could lead to confusion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keeping these as Nodes would allow us to re-use most of the Kolasu API for navigating, tracking positions, representing references. Is there a document where we decided that Kolasu nodes can only represent nodes strictly contained in the AST? I feel this is not really the current practice, e.g. nodes have also been used to represent types or external symbol representations in past discussions/projects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment about this above. I agree this should be something well documented and defined, while now it is a sort of grey area

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. I understand on this topic things are changing (based on what came up working on the Code Insight Studio and applying some ideas in practice) and that we today had limited time to discuss all aspects of it during the technical meeting. I am sorry for any complication caused by these updates and I am happy to have a meeting to discuss these things with more calm, if it is useful

* @param type the possible subtype
* @return true if [type] is a subtype, false otherwise
**/
fun isSuperTypeOf(type: TypeDescription?): Boolean {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we have rules for super typing and sub typing that are language specific? For example generics, variance, contra-variance, etc.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand. Type descriptions are currently used as light representations of the type of a node in symbol descriptions. For example, a binary expression node might have a type description indicating that the described node is of type 'BinaryExpression', which extends 'Expression'. When would we need to use information such as variance, contra-variance, etc? Would not this become too specific to Kotlin data structures?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing as above:

Oh, I see. Sorry, I thought this represented types calculated for expressions. I instead understand that they represent instead the type of the Node (e.g., the AST class). In Kolasu 1.6., we have classes representing the metamodel (e.g., Concept) and we could use that for this goal.


@Test
fun testScopeFor() {
// val internalNode = InternalNode(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should revise this one

@ftomassetti
Copy link
Member

ftomassetti commented Apr 12, 2024

Could you merge master on this or rebase this on master?

@loradd
Copy link
Member Author

loradd commented Jun 3, 2024

Replaced by #347

@loradd loradd closed this Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants