Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow default ALC to be unloadable for plugin scenarios #1633

Closed
jazzdelightsme opened this issue Jan 11, 2020 · 27 comments
Closed

Allow default ALC to be unloadable for plugin scenarios #1633

jazzdelightsme opened this issue Jan 11, 2020 · 27 comments
Milestone

Comments

@jazzdelightsme
Copy link

There is a native app; let’s call it Foo.exe. Foo can host arbitrary plugins. I wrote a plugin primarily using managed code, but to get loaded into Foo, I provide a native DLL with certain exports, and then when those exports are called, I load up the CLR and run my managed code. Life was good.

Now .NET Core has come along. I think things have progressed far enough (for example, I was waiting for C++/CLI support) that I could move my code onto .NET Core—after all, that’s the live branch of .NET. However, there's a big snag.

In my old .NET code, I relied on my native host being able to create an alternate AppDomain to load my managed code in. That way, when my plugin got unloaded, I could unload the AppDomain. I had to leave the CLR alive, but my binaries could be deleted, and/or replaced, and/or reloaded.

I have heard that now we have the concept of a “collectible” assembly load context. But what that means is that I will need to have another C# shim DLL that stays loaded permanently, in order to host the “real” DLL in a collectible context, so that it can dump it when desired. And having that shim DLL be “pinned” in place for the lifetime of the process is highly undesirable—it means I can’t actually easily replace my plugin on disk.

In other words, being able to create collectible AssemblyLoadContexts for plugins is great, when the plugin host is a .NET Core application. But if the plugin host is native... it doesn't quite fit the bill.

I think .NET Core should be a viable technology to write add-ins for any application, but to do that credibly, it will need to be able to fully unload the add-ins. Perhaps this could be as simple as providing a way for the hostfxr-created AssemblyLoadContext to be collectible, I don't know. And of course the next-level goal would be to extend unloadability to coreclr itself, and side-by-side (imagine two plugins, each using a different coreclr).

TIA!

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jan 11, 2020
@timhe
Copy link

timhe commented Jan 21, 2020

WinDbg/WinDbgX debugger extensions could greatly benefit from the ability to unload. The Mex Debugger Extension would like to move to .NET core, but without the ability to unload, it would be a hard sell

@jeffschwMSFT
Copy link
Member

@vitek-karas

@jeffschwMSFT jeffschwMSFT added this to the Future milestone Jan 30, 2020
@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Jan 30, 2020
@agocke
Copy link
Member

agocke commented Apr 19, 2024

It sounds like the main request here is to be able to unload CoreCLR itself when it is loaded into a native application.

We don't have any plans to support that scenario. Unloading Native AOT DLLs might be more feasible, but is still not in the backlog. If you're already using Native AOT successfully and think this is feature would heavily improve the experience, I think we can open a new issue for that.

@agocke agocke closed this as completed Apr 19, 2024
@agocke agocke closed this as not planned Won't fix, can't repro, duplicate, stale Apr 19, 2024
@jazzdelightsme
Copy link
Author

@agocke You have misunderstood the Issue.

It sounds like the main request here is to be able to unload CoreCLR itself

No, that is not the request at all. It has never been possible to unload the CLR, yet I explicitly described how in old .NET Framework I was able to write an unloadable plugin in C# (even though the CLR itself must stay loaded). But in .NET Core, I don't see how it is possible anymore to write an unloadable / reloadable plugin for a native host, because you have to have a managed shim to do the ALC stuff... which then itself cannot be unloaded.

(ALCs have their own problems, too. 😢)

@jazzdelightsme
Copy link
Author

Here is the desired workflow for writing a plugin for a native app:

  1. Write some plugin code (in C#, yay).
  2. Build it.
  3. Load it into the native application through whatever its plugin mechanism is.
  4. Test it.
  5. Oops, there was a bug.
  6. Unload the plugin.
  7. Goto step 1.

Step 6 was possible in .NET Framework, but is no longer possible. When I get to step 2 the second time around and try to build my code, there will be a sharing violation with the shim assembly that does the ALC stuff.

@agocke
Copy link
Member

agocke commented Apr 19, 2024

I see, so the request here is to have the "default" ALC be unloadable, that is you want the entry-point assembly for your plugin to be unloadable.

@agocke agocke reopened this Apr 19, 2024
@agocke agocke changed the title .NET Core can host plugins, but can't BE a plugin Allow default ALC to be unloadable for plugin scenarios Apr 19, 2024
@agocke
Copy link
Member

agocke commented Apr 19, 2024

Also, you don't care if CoreCLR.dll and the rest of the CLR remains loaded into your app, even if the plugin DLL is unloaded?

@jkotas
Copy link
Member

jkotas commented Apr 19, 2024

When I get to step 2 the second time around and try to build my code, there will be a sharing violation with the shim assembly that does the ALC stuff.

It sounds like that helper code that creates the ALC is part of the plugin. Can you make the helper code that creates the ALC and loads the unloadable plugin into it to be loaded just once (ie make it part of your plugin hosting environment)? It is the recommended way to build plugin hosting systems with .NET Core/5+.

@jazzdelightsme
Copy link
Author

you don't care if CoreCLR.dll and the rest of the CLR remains loaded into your app, even if the plugin DLL is unloaded?

Correct. IMO, that would be nice, but it doesn't cause a problem for me.

Can you make the helper code that creates the ALC and loads the unloadable plugin into it to be loaded just once (ie make it part of your plugin hosting environment)?

I don't control the hosting environment (it's a native app; I write a native DLL that loads the CLR and then all my managed stuff), so no, I can't change that.

Leaving the root ALC shim loaded seems like it should be possible (haven't tried it), but annoying: either I have to just ignore sharing violations when I rebuild my plugin, or I have to split it out into a separate solution or something like that.

@jkotas, if you were going to spend some cycles improving ".NET plugin-related stuff", I think you should focus on #45285 first. That is the real blocker.

@jazzdelightsme
Copy link
Author

jazzdelightsme commented Apr 24, 2024

I think you should focus on #45285 first. That is the real blocker.

Actually... focus on whichever is easier, because I think I could get past my problems with either this Issue fixed, OR that one.

The problem there is that I also host the PowerShell runtime in my plugin, so tons of stuff ends up in the default ALC (and I can't change that, because I can't go rewrite all of PowerShell to do their loading differently for my plugin).

So you could fix how ALCs work, to make them an actual boundary instead of a chalk line on the ground; but if you fixed THIS Issue, then I wouldn't care so much that a bunch of random stuff got loaded in the default ALC, because I'd still be able to just unload it. (And in fact that seems like it might be the easier thing to do; not knowing anything about .NET internals, of course.)

@jkotas
Copy link
Member

jkotas commented Apr 24, 2024

I am sorry I do not expect #45285 or this issue to be ever fixed. You are basically asking to bring .NET Framework AppDomains back. We have cut AppDomains in .NET Core for number of good reasons. They are not coming back. If you need AppDomain-like isolation for your plugins, we recommend launching them in separate processes.

@jazzdelightsme
Copy link
Author

Thanks for letting me know. That's very disappointing, as I would really like to bring my project into the future with .NET, but I can't see how to do it. It is not only prohibitively expensive for me to rearchitect into a multi-process model; it is also problematic because I depend on libraries from the native app that expect to be in the host process.

(The native app is the Windows debugger ("windbg"). So not only would I have to figure out some layer of cross-process communication on the "top", but also when my code needs to call into e.g. dbghelp.dll, I would have to intercept that on the "bottom", to proxy over to the host process. That is a gigantic surface area, and I'm just one dude.)

@jazzdelightsme
Copy link
Author

We have cut AppDomains in .NET Core for number of good reasons.

Also, @jkotas, the way you say it like that, makes me think you don't think what I'm asking for is reasonable. Is that right? I really can't understand that: why bother having ALCs, if ALCs only work with some parts of .NET but not others? (Doesn't work with Assembly.Load / the pwsh runtime.) Should you get rid of Assembly.Load for the same "good reasons" that you got rid of AppDomains? (Should pwsh be rewritten using ... "something else"? If so... let's do that.)

(Are you saying that you can't shouldn't make Assembly.Load load into the current ALC, and/or you can't shouldn't make the ALC that Assembly.Load loads into unloadable??)

I understand that your statement of not having AppDomains come back has been the PoR for years now. But I still really think you should reconsider having a way for people to unload code. I mean, you did add unloadable ALCs, right? So you understand that people want to be able to do that?

(To no one in particular: Why do I have to be the unlucky person who depends on an API that doesn't work with that? 😭 )

@jkotas
Copy link
Member

jkotas commented Apr 24, 2024

The general plugin scenario is reasonable, but asking for it to be solved in the exact same way as in .NET Framework is not reasonable.

To reiterate, the reasons for cutting AppDomains were:

  • Maintenance cost / complex: Maintaining AppDomains was prohibitively expensive for the .NET team. AppDomains made every runtime feature multiple times more expensive to develop, and they created a hardening tax on all library code.
  • Reliability: AppDomains were endless source of reliability bugs. It is very hard to write code that gracefully recovers from being aborted at any point by non-cooperative appdomain unload. All libraries running in an unloadable appdomain has be hardened like that. A lot of code out there was not hardened properly (including parts of .NET Framework) and it would cause intermittent crashes when unloaded.
  • Performance: AppDomains introduced significant performance overhead that was often higher than equivalent process overhead. It was not unusual to see migrations from single process with multi-appdomain to multiple processes to improve performance (and reliability).

Assembly.Load or ALCs do not have any of these problems.

I still really think you should reconsider having a way for people to unload code

We do have a way, it is just not the same way as in .NET Framework. Many of our users are happy with unloadable ALCs. For example, https://unity.com/ is rebuilding their editor on top ALCs.

@jazzdelightsme
Copy link
Author

Thanks, @jkotas.

As I understand it, if a Unity plugin were to host the PowerShell runtime, or any other code that uses Assembly.Load, then the plugin would not be unloadable, because some random set of assemblies would accidentally end up in the default ALC. Is that correct? I suppose they just don't care about that problem? Or perhaps they just don't have any such assemblies--it's all new, all built on .NET, specifically for Unity, and no legacy dependencies? If they are happy, then I'm happy for them, too.

It is very hard to write code that gracefully recovers from being aborted at any point by non-cooperative appdomain unload.

I agree (aside: I would go even further and call it fundamentally impossible to do completely, at least on Windows, because on Windows, the process is the reliability boundary). But I don't need or want that feature. I really don't want AppDomains brought back. I just want to be able to unload my plugin, which happened to have called Assembly.Load. (And remember, I can't just rewrite my plugin to not use Assembly.Load, because it's actually the PowerShell runtime that calls Assembly.Load.) And I don't understand why the feeling that asking for this is tantamount to asking for AppDomains to be brought back, or asking for the CLR itself to be unloadable, or any other impossible-or-never-going-to-happen thing.

People wanted to do plugins, so y'all came up with a way to do plugins (ALCs), great! But the way you did it seems to be that not only the plugin but also all of its downstream dependencies has to have been purpose-written to stay in the correct ALC... and I just don't have the power to rewrite the world like that.

Is that actually what you think is the right course of action? That I should propose to the PowerShell team that they should update their code to become ALC-aware? And any other similar dependencies? (It seems like it would benefit a lot more people with a lot less effort and a lot less bugs to make ALCs work for everyone without requiring them all to change code.)

@jkotas
Copy link
Member

jkotas commented Apr 24, 2024

any other code that uses Assembly.Load, then the plugin would not be unloadable,

Assembly.Load(string) is not a problematic API for unloadability. This API gets the ALC of its caller and forwards the request to it. It is then up to that ALC to decide whether to load the dependency as unloadable or not. (For example, the BCL assemblies should be loaded as unloadable into the default ALC.)

There are other APIs that are problematic like Assembly.LoadFrom that have to be avoided in unloadable plugins. I would not call it purposely written. The amount of the code that needs to be aware of ALCs and unloadability is typically very small.

@jkotas
Copy link
Member

jkotas commented Apr 25, 2024

That I should propose to the PowerShell team that they should update their code to become ALC-aware?

Yes. It is up to every package maintainer to decide whether it makes sense to invest into making their package ALC-aware and unloadable.

It is very similar to supporting multiple OSes. Some packages work great on all OSes. Some packages are Windows-specific and the maintainers are not interested investing into support for non-Windows OSes.

@agocke
Copy link
Member

agocke commented Apr 25, 2024

Ok, it sounds like this issue and the linked one are both asking for a feature of “non-cooperative” assembly unload. I agree with @jkotas — we will likely never build such a feature. I’m going to close both of these issues to signal the appropriate guidance.

In .NET core, all plugins and their dependencies must be safe for unloading if they want to support unloading.

@agocke agocke closed this as not planned Won't fix, can't repro, duplicate, stale Apr 25, 2024
@jazzdelightsme
Copy link
Author

Assembly.Load(string) is not a problematic API for unloadability.

@jkotas: yes, yes; I assumed you would understand what I meant. What I meant was "the Assembly load methods that for whatever reason have been defined as loading stuff into the default ALC--you know, the ones that are causing all the problems here."

It is very similar to supporting multiple OSes.

Yes, but it sure would be nice if a .NET change made a whole bunch more stuff "just work" on other OSes.

it sounds like this issue and the linked one are both asking for a feature of “non-cooperative” assembly unload.

@agocke: I don't understand where the disconnect is here: I absolutely do not want non-cooperative unload. (In fact, I went so far as to call it impossible to do completely; AppDomains were doomed in that respect from day 1!)

It is frustrating that it feels like you keep misunderstanding what I am asking for. I do not want to unload the CLR. I do not want AppDomains brought back. I do not want non-cooperative unload.

If you came and explained "look, here is why Assembly.LoadFile has to load into the Default ALC; there is no alternate design that would work", that would be helpful. I've read the design docs (this one and this one), and I'm not seeing why that would be. (In fact the design of the "contextual reflection" stuff seems directly aimed at my problem, but what I didn't see explained was why e.g. Assembly.LoadFile is not updated to use this facility, to just automatically use the ALC of the calling assembly if there isn't a different one set.)

Or if you said there are no resources or not enough customers that want what I'm asking for, or something like that, that would be understandable, too.

But I don't want this closed out because you think I'm asking for something different. So for the record, again: I do not want non-cooperative unload.

All that said, I appreciate your transparency in your intention to not do anything about this.

@jkotas
Copy link
Member

jkotas commented Apr 25, 2024

Assembly.LoadFrom and Assembly.LoadFile APIs exist for compatibility with .NET Framework, to make most of the .NET Framework libraries that use these APIs work. @agocke has a pending doc update to make it clear https://github.com/dotnet/dotnet-api-docs/pull/9764/files#diff-f317cbaa519e4a60571d2ab44a4fb991bbfd90941c69552f5eb3ddb6c2739454R4636 .

There are many ways that this compat layer is imperfect. We are not interested in churning the behavior of these API in .NET Core. Any change in behavior of these APIs is a breaking change.

here is why Assembly.LoadFile has to load into the Default ALC

Nit: Assembly.LoadFile does not load into the Default ALC. It loads into its own isolated non-collectible ALC that is the best approximation of the .NET Framework behavior of this API.

@agocke
Copy link
Member

agocke commented Apr 25, 2024

It is frustrating that it feels like you keep misunderstanding what I am asking for. I do not want to unload the CLR. I do not want AppDomains brought back. I do not want non-cooperative unload.

You are misunderstanding what I mean by non-cooperative unload. I mean that you want to be able to unload assemblies which may load other assemblies that have not been modified to be "Unload-compatible." Modifying assemblies to only use unload-safe APIs and architectures is what I mean by cooperation.

The problem that you're hitting is precisely that one of the dependencies of your library (powershell) is non-cooperative in assembly unload. We will not support that.

@jazzdelightsme
Copy link
Author

I see, thanks for the explanation. What I did not (still do not) understand is why certain APIs were deemed unload-UNsafe in the first place. PowerShell is not doing anything "crazy"; just loading an assembly. But, according to Jan, that design is basically frozen, so nothing to do about it now.

@agocke
Copy link
Member

agocke commented Apr 25, 2024

The fundamental limitation on unloadability is that there can be no live GC references into the assembly that you want to unload. There are many APIs which could cause this, for instance a static cache of types could have a type from a collectible assembly added to it, and unless that cache is cleared, the target assembly could not unload.

Assembly.LoadFile is a good example of the problem here. If you load an assembly using LoadFile it loads into a non-collectible ALC. If you pass a reference to any type into that ALC, and it's then stored, then you will keep an object reference alive in the unloadable ALC, and that will keep your original ALC loaded.

This is why targets need to be "ALC-aware." There are some operations which can create static roots, and those assemblies need to be careful not to use them.

@jazzdelightsme
Copy link
Author

What you describe (type references crossing ALC boundaries) makes total sense. What does NOT make sense (to me) is why the design of ALCs (which you do not plan to change, so this is an academic discussion at this point, but hopefully will serve as a guidepost for some future technology decisions) made a bunch of existing "self-contained" code (PowerShell does not deliberately call into any other AppDomain or ALC or other such concept) suddenly become crossing these lines, by definition, by defining LoadFile to cross an ALC boundary. I'm sure there was a reason; I just don't like it, because it "moved the lines" for existing code (and, for me, in a way that I can't effectively deal with it).

@jkotas
Copy link
Member

jkotas commented Apr 28, 2024

by defining LoadFile to cross an ALC boundary

Assembly.LoadFile is not part of the core ALC design. The only reason why this API exists is compatibility with .NET Framework.

The implementation of Assembly.LoadFile is quite simple:

public static Assembly LoadFile(string path)
{
ArgumentNullException.ThrowIfNull(path);
if (PathInternal.IsPartiallyQualified(path))
{
throw new ArgumentException(SR.Format(SR.Argument_AbsolutePathRequired, path), nameof(path));
}
string normalizedPath = Path.GetFullPath(path);
Assembly? result;
lock (s_loadfile)
{
if (s_loadfile.TryGetValue(normalizedPath, out result))
return result;
// we cannot check for file presence on BROWSER. The files could be embedded and not physically present.
#if !TARGET_BROWSER && !TARGET_WASI
if (!File.Exists(normalizedPath))
throw new FileNotFoundException(SR.Format(SR.FileNotFound_LoadFile, normalizedPath), normalizedPath);
#endif // !TARGET_BROWSER && !TARGET_WASI
AssemblyLoadContext alc = new IndividualAssemblyLoadContext($"Assembly.LoadFile({normalizedPath})");
result = alc.LoadFromAssemblyPath(normalizedPath);
s_loadfile.Add(normalizedPath, result);
}
return result;
}
. How would you change the implementation of Assembly.LoadFile to address your issue while keeping the behavior compatible with .NET Framework as much as possible?

@jazzdelightsme
Copy link
Author

Pseudocode:

AssemblyLoadContext alc = null;
if (AssemblyLoadContext.CurrentContextualReflectionContext != null)
{
    alc = AssemblyLoadContext.CurrentContextualReflectionContext;
}
else
{
    alc = new IndividualAssemblyLoadContext($"Assembly.LoadFile({normalizedPath})"); 
}

(and of course similar changes would be needed in the other places in that file that new up one-off ALCs)

(and also, before new-ing up that one-off ALC, I wonder about checking AssemblyLoadContext.GetLoadContext for the calling assembly?)

I don't think it's a question of compatibility with .NET Framework--that is easy, because this change is 100% exactly as compatible with .NET Framework as the unchanged code is, because AssemblyLoadContext.CurrentContextualReflectionContext did not exist there, so nobody could be setting it.

For more modern code (post-.NET Framework) (including, importantly, where the host is modern, but the calling code is circa .NET Framework) it only changes the behavior in cases where the program seems to have clearly indicated that it wants the "current ALC" to be a particular ALC.

@jkotas
Copy link
Member

jkotas commented Apr 30, 2024

I don't think it's a question of compatibility with .NET Framework

The question is whether this behavior would make more .NET Framework libraries compatible.

You have mentioned Powershell. Where is the Assembly.LoadFile call in Powershell that this would "fix"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

No branches or pull requests

6 participants