Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LS state & performance refactoring #1667

Merged
merged 18 commits into from
Jun 11, 2024
Merged

LS state & performance refactoring #1667

merged 18 commits into from
Jun 11, 2024

Conversation

dbanck
Copy link
Member

@dbanck dbanck commented Mar 22, 2024


TL;DR: This PR tries to solve two main problems:

  • slow startup performance & high resource usage for larger workspaces
  • not easy extendable internal state for new languages, like Terraform Test & Stacks

I would suggest that you review this PR commit by commit. First, I introduce an event bus to tie everything together. Then I introduce the concept of features to the codebase along with three new features:

  • A feature for root modules (.terraform, .terraform.lock.hcl)
  • A feature for Terraform files (*.tf, *.tf.json)
  • A feature for Terraform variable definition files (*.tfvars, *.tfvars.json)
    Each feature decouples and isolates all functionality related to handling these files (or directories). It has its own internal state, its own set of jobs, its own subscription to specific event topics, and its own decoder (if applicable).
    It's intended to have more features in the future: a stacks feature, a tests feature, and some for other Terraform-related languages/things.

Now that the features are in place, we can remove everything from the existing packages that belongs in a feature. Mainly we can clean up the global state, the terraform package and the walker & indexer. Only structures that are relevant to the language server as a whole (job scheduler, schema store) or that are intended to be used in conjunction with multiple features (change batches) should live at the global package level.

I also refactored the walker and removed the scheduling of jobs when walking a workspace after opening it. We still detect and collect all directories relevant to terraform-ls, but we don't index them. Only when a user opens a file do we schedule jobs to parse the modules, variables, and root module files if applicable. This results in faster startup times and overall less CPU and memory usage.

Also, the text synchronization handlers now don't block on incoming requests. Previously, we would wait for all jobs to finish processing before accepting the next didChange request. Now, instead, we wait in the language feature handlers (e.g. complete or hover) until all jobs for a directory have been processed before returning a result. This should result in a better user experience and better job deduplication in the queue.


Known issues:

Missing decoding of installed modules
I plan to address this in a separate PR. Since we don't index the whole .terraform directory at startup, we need to introduce another way to detect installed modules. We can extend the existing local module detection on open to detect installed modules as well. The only missing piece is a mapping from a module's source to its installation location on disk.

The global workspace symbol LSP method is now less accurate
Since we only keep open modules in memory, the handler will only report symbols for all modules that have been opened at least once. This is a tradeoff for not indexing the entire workspace at startup. We can still look at a low resource background index job or kick off indexing the workspace when a sends a workspace symbol request. At that point, we can report progress in the UI.

Flaky tests?
Keep monitoring


Rough performance estimates:
A a workspace with ~50k lines of Terraform code
B a workspace with ~600 lines of Terraform code
on an Intel i9 2,4Ghz 8 Core MacBook Pro

main

A: 19,09s 229MB (startup, unitialized)
A: 23,38s 282MB (3 open modules, unitialized)

B: 0,09s 11,9MB (startup, unitialized)
B: 0,23s 12,3MB (2 open modules, unitialized)

new

A: 0,14s 9MB (startup, unitialized)
A: 5,26s 204MB (3 open modules, unitialized)

B: 0,02s 6MB (startup, unitialized)
B: 0,11s 11,3MB (2 open modules, unitialized)

Closes:

@dbanck dbanck self-assigned this Mar 22, 2024
@dbanck dbanck force-pushed the c-refactoring-prep branch 6 times, most recently from b0ad1b4 to c8d820d Compare April 17, 2024 11:10
@dbanck dbanck force-pushed the c-refactoring-prep branch 5 times, most recently from 914ac8b to 0c5e3d6 Compare April 25, 2024 08:03
@dbanck dbanck force-pushed the c-refactoring-prep branch 2 times, most recently from 9a174bf to dd2dd71 Compare May 22, 2024 10:01
@dbanck dbanck force-pushed the c-refactoring-prep branch 2 times, most recently from 3d0d8f9 to 67551b4 Compare June 6, 2024 19:22
@dbanck dbanck changed the base branch from main to pre-release June 6, 2024 19:32
@dbanck dbanck force-pushed the c-refactoring-prep branch 2 times, most recently from 7fc8086 to c9759bc Compare June 7, 2024 10:58
The EventBus has a fixed list of topics that anyone can subscribe to.
Whenever an event is published on a topic, all subscribers are notified.
This moves all logic related to Terraform root modules into a feature.
The feature keeps track of installed providers and modules.
This moves all logic related to module files into a feature. The feature
contains all state, jobs, hooks, and decoder related files.
dbanck added 13 commits June 10, 2024 14:11
This moves all logic related to Terraform's variable definition
files into a feature.
All jobs are now scheduled from their respective feature. There is no
need for a separate indexer anymore.
The hooks are only relevant inside the modules feature and have been
moved into the feature.
We don't schedule any jobs when walking a workspace for the first time,
so the walker only needs to report directories and their files via
discover events.
This better reflects the new structure as the walker only walks
directories and fires events now. The state is managed in the separate
features after they get a discover event.
All parsing related logic has been moved to the specific features. All
job implementations now live in their respective features as well.
The global decoder now only acts as a "proxy" and will use a more
specific decoder from a feature depending on the language ID.
The state has been moved to separate features. The global state now
only contains entities that are relevant to everything.

The new change tracking has been partially lifted into the features and
only the generic part is kept in the global state.
We keep track of all features in this central location. This should be
the main place you need to extend whenever a new feature is introduced.
Document synchronization handlers will now fire an event for every
incoming request. Features can subscribe to these events.

Language feature handlers will now wait for running jobs for a
directory before attempting to complete the request and return data.
@dbanck dbanck marked this pull request as ready for review June 10, 2024 13:39
@dbanck dbanck requested a review from a team as a code owner June 10, 2024 13:39
Copy link
Contributor

@jpogran jpogran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +17 to +18
// TODO? Can features contribute commands, so we don't have to import
// the features here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to create issues for this somewhere? (I'm fine with leaving these comments as is)

Copy link
Member

@ansgarm ansgarm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@ansgarm
Copy link
Member

ansgarm commented Jun 11, 2024

I just noticed that we might need to update our docs (e.g. docs/architecture.md) in a follow-up PR to reflect the refactoring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants