-
Notifications
You must be signed in to change notification settings - Fork 9.8k
Refactoring of module source addresses and module installation #28854
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+2,039
−467
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
As the comment notes, this hostname is the default for provide source addresses. We'll shortly be adding some address types to represent module source addresses too, and so we'll also have DefaultModuleRegistryHost for that situation. (They'll actually both contain the the same hostname, but that's a coincidence rather than a requirement.)
This new package aims to encapsulate all of our interactions with go-getter to fetch remote module packages, to ensure that the rest of Terraform will only use the small subset of go-getter functionality that our modern module installer uses. In older versions of Terraform, go-getter was the entire implementation of module installation, but along the way we found that several aspects of its design are poor fit for Terraform's needs, and so now we're using it as just an implementation detail of Terraform's handling of remote module packages only, hiding it behind this wrapper API which exposes only the services that our module installer needs. This new package isn't actually used yet, but in a later commit we will change all of the other callers to go-getter to only work indirectly through this package, so that this will be the only package that actually imports the go-getter packages.
We've previously had the syntax and representation of module source addresses pretty sprawled around the codebase and intermingled with other systems such as the module installer. I've created a factored-out implementation here with the intention of enabling some later refactoring to centralize the address parsing as part of configuration decoding, and thus in turn allow the parsing result to be seen by other parts of Terraform that interact with configuration objects, so that they can more robustly handle differences between local and remote modules, and can present module addresses consistently in the UI.
It's been a long while since we gave close attention to the codepaths for module source address parsing and external module package installation. Due to their age, these codepaths often diverged from our modern practices such as representing address types in the addrs package, and encapsulating package installation details only in a particular location. In particular, this refactor makes source address parsing a separate step from module installation, which therefore makes the result of that parsing available to other Terraform subsystems which work with the configuration representation objects. This also presented the opportunity to better encapsulate our use of go-getter into a new package "getmodules" (echoing "getproviders"), which is intended to be the only part of Terraform that directly interacts with go-getter. This is largely just a refactor of the existing functionality into a new code organization, but there is one notable change in behavior here: the source address parsing now happens during configuration loading rather than module installation, which may cause errors about invalid addresses to be returned in different situations than before. That counts as backward compatible because we only promise to remain compatible with configurations that are _valid_, which means that they can be initialized, planned, and applied without any errors. This doesn't introduce any new error cases, and instead just makes a pre-existing error case be detected earlier. Our module registry client is still using its own special module address type from registry/regsrc for now, with a small shim from the new addrs.ModuleSourceRegistry type. Hopefully in a later commit we'll also rework the registry client to work with the new address type, but this commit is already big enough as it is.
Now that we (in the previous commit) refactored how we deal with module sources to do the parsing at config loading time rather than at module installation time, we can expose a method to centralize the determination for whether a particular module call (and its resulting Config object) enters a new external package. We don't use this for anything yet, but in later commits we will use this for some cross-module features that are available only for modules belonging to the same package, because we assume that modules grouped together in a package can change together and thus it's okay to permit a little more coupling of internal details in that case, which would not be appropriate between modules that are versioned separately.
We have some tests in this package that install real modules from the real registry at registry.terraform.io. Those tests were written at an earlier time when the registry's behavior was to return the URL of a .tar.gz archive generated automatically by GitHub, which included an extra level of subdirectory that would then be reflected in the paths to the local copies of these modules. GitHub started rate limiting those tar archives in a way that Terraform's module installer couldn't authenticate to, and so the registry switched to returning direct git repository URLs instead, which don't have that extra subdirectory and so the local paths on disk now end up being a little different, because the actual module directories are at a different subdirectory of the package.
2939109
to
4b34cc3
Compare
Previously we had a separation between ModuleSourceRemote and ModulePackage as a way to represent within the type system that there's an important difference between a module source address and a package address, because module packages often contain multiple modules and so a ModuleSourceRemote combines a ModulePackage with a subdirectory to represent one specific module. This commit applies that same strategy to ModuleSourceRegistry, creating a new type ModuleRegistryPackage to represent the different sort of package that we use for registry modules. Again, the main goal here is to try to reflect the conceptual modelling more directly in the type system so that we can more easily verify that uses of these different address types are correct. To make use of that, I've also lightly reworked initwd's module installer to use addrs.ModuleRegistryPackage directly, instead of a string representation thereof. This was in response to some earlier commits where I found myself accidentally mixing up package addresses and source addresses in the installRegistryModule method; with this new organization those bugs would've been caught at compile time, rather than only at unit and integration testing time. While in the area anyway, I also took this opportunity to fix some historical confusing names of fields in initwd.ModuleInstaller, to be clearer that they are only for registry packages and not for all module source address types.
jbardin
approved these changes
Jun 2, 2021
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
It's been a long while since we gave close attention to the codepaths for module source address parsing and external module package installation. Due to their age, these codepaths often diverged from our modern practices such as representing address types in the addrs package, and encapsulating package installation details only in a particular location.
In particular, this refactor makes source address parsing a separate step from module installation, which therefore makes the result of that parsing available to other Terraform subsystems which work with the configuration representation objects. The start of that is included here as some new
EntersNewPackage
methods in theconfigs
package for recognizing the difference between a local module call and a call that enters a new remote package; we'll make more use of that in later PRs.This also presented the opportunity to better encapsulate our use of go-getter into a new package
getmodules
(echoinggetproviders
), which is intended to be the only part of Terraform that directly interacts with go-getter.This is largely just a refactor of the existing functionality into a new code organization, but there is one notable change in behavior here: the
source
address parsing now happens during configuration loading rather than module installation, which may cause errors about invalid addresses to be returned in different situations than before. That counts as backward compatible because we only promise to remain compatible with configurations that are "valid", which means that they can be initialized, planned, and applied without any errors. This doesn't introduce any new error cases, and instead just makes a pre-existing error case be detected earlier.Our module registry client is still using its own special module address type from
registry/regsrc
for now, with a small shim from the newaddrs.ModuleSourceRegistry type
. Hopefully in a later commit we'll also rework the registry client to work with the new address type, but this PR is already big enough as it is.I tried to break this into some smaller commits for easier commit-by-commit review, although there is a big meaty middle where I had to change a bunch of things all at once in order for it to keep compiling. That middle commit should just be moving things around and not materially changing behavior (aside from parsing earlier as mentioned above).