Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What will Domain be replaced with? #10843

Open
isaacs opened this issue Jan 17, 2017 · 34 comments
Open

What will Domain be replaced with? #10843

isaacs opened this issue Jan 17, 2017 · 34 comments
Labels
domain Issues and PRs related to the domain subsystem. question Issues that look for answers.

Comments

@isaacs
Copy link
Contributor

isaacs commented Jan 17, 2017

I'm adding support for parallel tests to node-tap. tapjs/tapjs#306

One sticky point is that, like most test frameworks, node-tap needs to be able to cleverly handle thrown errors and tie them to the test environment where the error was thrown from (especially since many existing assertion libs turn things like expect(foo).to.match(/regexp/) into a thrown error).

When tests were not run in parallel, there was no problem. Just walk down the tree of test.currentChild or whatever, and a single process.on('uncaughtException') is sufficient.

I'm using Domains to manage thrown errors in child tests, and it works great, but the bright red deprecation notice in the docs is scary.

What will it be replaced with? Will it be feature-equivalent? If so, is there any reason why the existing API has to be broken, and not re-implemented in terms of the new thing? Is there an expected time frame for the Domain api to be broken?

Thanks.

@arciisine
Copy link

While not official, something like async-listener or zonejs should generally be able to fit the bill, with some caveats.

@addaleax addaleax added domain Issues and PRs related to the domain subsystem. question Issues that look for answers. labels Jan 17, 2017
@jasnell
Copy link
Member

jasnell commented Jan 17, 2017

At this point it's not clear at all what will replace it. There is this zones proposal being discussed at the TC-39 level but it's not something that is far enough along for us to be able to make any reasonable decisions on. At the same time, if there is going to be something happening at the TC-39 level, it doesn't make much sense for us to come up with a likely incompatible replacement. So at this point, the deprecation/replacement of domain is in a holding pattern and will be for the foreseeable future.

Any discussion on a potential replacement needs to happen in the node-eps repository.

@addaleax
Copy link
Member

My understanding was that async_hooks, when it’s ready, would be suitable for having domain re-implemented on top of it, with a similar or identical API?

/cc @nodejs/diagnostics

@jasnell
Copy link
Member

jasnell commented Jan 17, 2017

That's a possibility but still runs up against the possibility of being inconsistent/incompatible with whatever language level construct is settled on by TC-39. It won't do us much good to put the effort into a domains v2 if we end up having to deprecate it in favor of a language level thing, etc

@sam-github
Copy link
Contributor

Is there an expected time frame for the Domain api to be broken?

@isaacs I was under the impression that it is broken already. Besides https://github.com/nodejs/node/blob/master/doc/topics/domain-postmortem.md, promise users report problems with domains, and users of async/await have found them broken, too, IIRC.

If you don't use any of the broken bits, they seem to work fine, but its a bit of a mine field out there.

@Qard
Copy link
Member

Qard commented Jan 17, 2017

It's really been "broken" to some degree pretty much since day one. It just gets more and more broken as new language/runtime features are introduced and the necessary parts to support domains don't stay in sync. It never quite satisfied the needs and really just exposed a huge hole where people wanted features. That's a big part of why I pushed to start the diagnostics working group in the first place--to try to better deliver what was needed. The new 'async_wrap' stuff is definitely a step in the right direction, but I personally feel there's still a long way to go. I hope the zones effort grows into something more solid. TC39 stuff takes time though. 😕

@isaacs
Copy link
Contributor Author

isaacs commented Jan 18, 2017

@Qard @sam-github It's not "broken already", it is what it is. I am using "broken" in a very technical non-normative sense here, meaning "change the API in a way that requires changes to the code using it".

You might not like the Domains API that exists today, and I'd probably agree with your criticisms of it. But it does a specific job very reliably, which I can't do any other way.

My fear is that, after spending 2 years in a "deprecated" state, someone will decide it's time to rip it out without giving me time to upgrade (or worse, without there even being anything to upgrade to). Could the documentation at least provide some clarity as to the requirements that have to be met before Domains will be removed? Or better yet, could we just deprecate the parts that are known to be harmful? (d.enter(), for example, should probably never be used in userland code, and it was a mistake to expose it.)

@jasnell
Copy link
Member

jasnell commented Jan 18, 2017

There are no plans, nor has anyone suggested, nor would I expect anyone on @nodejs/ctc to suggest or even reasonably consider removing domain without a clear replacement in place. It is likely to be left alone as is without any modifications for the foreseeable future.

@Fishrock123
Copy link
Member

I'll try to read though this tomorrow, but the idea was to get async-hooks in in a state that would allow a user module to correctly replicate domains (and other variations of the async-error-handling idea), and then move towards removing it.

The resulting module would probably be more maintainable (perhaps even actually maintained!), and core's APIs for it a lot more platform-like and unoppinionated.

There are no plans, nor has anyone suggested, nor would I expect anyone on @nodejs/ctc to suggest or even reasonably consider removing domain without a clear replacement in place.

I would like to be clear that "the replacement" may not have the same level of API or do the same thing, but it will certainly be pushed towards allowing replication of the domain API as much as/if possible.

@Qard
Copy link
Member

Qard commented Jan 18, 2017

@isaacs I hear you on that, and I don't expect it'll just disappear immediately when something new comes along. It's possible the existing API might even just get ported to async-hooks in the future. It's really just the internal behaviour that's problematic: lack of support for promises and asyn/await, inconsistent timings between node versions, etc. My feeling is that the current deprecation is really more about it being potentially unreliable than there being any specific intention to fully replace it currently.

@joshgav joshgav added the diag-agenda Issues and PRs to discuss during the meetings of the diagnostics working group. label Feb 23, 2017
@joshgav joshgav removed the diag-agenda Issues and PRs to discuss during the meetings of the diagnostics working group. label Mar 24, 2017
@ORESoftware
Copy link
Contributor

ORESoftware commented Mar 29, 2017

I am in the same exact boat as @isaacs; every test runner in the Node.js ecosystem that is trying to parallelize tests will have the same exact problem - the problem being: the only reliable way to trap errors thrown in an arbitrary chunk of code* is domains.

As for native promises being broken with domains, you can try this:

const then = Promise.prototype.then;
Promise.prototype.then = function (fn1, fn2) {

  if (process.domain) {
    fn1 = fn1 && process.domain.bind(fn1);
    fn2 = fn2 && process.domain.bind(fn2);
  }
  return then.call(this, fn1, fn2);
};

Not super well-tested, just throwing that code out there.

Bluebird promises have had support for domains for more than 2 years TMK, so that's been a good thing for domain users for awhile.

*Where that code is user-provided or otherwise out of your control.

@syrnick
Copy link

syrnick commented Jul 14, 2017

Note on promises. Native v8 promises don't respect domains - the callbacks aren't called in the same domain. Polyfills work fine, because they are just library level constructs. Hence, domains mostly work with transpiled async/await and with most libraries that use promises if the global Promise is the polyfill, not the native Promise.

It's obviously a confusing state and using domains has all sorts of caveats and gotchas. Still, immensely useful creature.

@misterdjules
Copy link

@syrnick

Native v8 promises don't respect domains

Does #12489 solve this problem for your use case? (note: it's only available in node 8.x).

@syrnick
Copy link

syrnick commented Jul 14, 2017

nice! I'll test that out on 8.x.

@Trott
Copy link
Member

Trott commented Aug 10, 2017

domain does have that funny "pending deprecation" notice. Now that async-hooks has arrived, are we anywhere close to actually giving domain a doc-only deprecation that doesn't have the "pending deprecation" ambiguity/confusion? Or not yet? I have no interest in pushing for a runtime deprecation or anything like that, but I would like the docs to be clear about the status.

@BridgeAR
Copy link
Member

I think that is a good idea @Trott

@owenallenaz
Copy link

owenallenaz commented May 31, 2018

As I've been utilizing async and await, it seems apparent to me that with a proper execution of Promise chains, the domain module no longer seems necessary. Steps across the event loop can be maintained via the Promise chain, and thus errors can properly be try/catch no longer how deep and handled accordingly. Given correct implementation by Module implementors, which will take some time, the domain module no longer seems necessary. So far from my experimentation, I'm not aware of an edge case that I have used domains for which Promise chains do not handle and async and await provide the needed syntactic sugar to make it reasonable to utilize.

So then with that in mind, is the replacement, simply start using Promises.

@misterdjules
Copy link

@owenallenaz Just to make sure I understand what you suggest: do you mean that errors thrown from an async task not originally scheduled by a promise should be "wrapped" by a promise, such as for instance using the following to handle errors thrown from setTimeout callbacks?

const somePromise = new Promise((resolve, reject) => {
  setTimeout(function () {
    try {
      throw new Error('boom!');
    } catch (e) {
      reject(e)
    }
  })
})

async function mainAsync() {
  try {
    await somePromise;
  } catch (err) {
    console.error(err);
  }
}

mainAsync();

@owenallenaz
Copy link

owenallenaz commented Aug 22, 2018

With callback semantics we're currently forced to deal with errors in two ways. The first is via the callback and the second is thrown errors caught by domain. I spun up a simple site in a gist to showcase the problem.

https://gist.github.com/owenallenaz/545bacc327ff1969c56d69e1afb4cfa7

Visit / to see a list of the urls.

  • /callback_valid/ - Valid response
  • /callback_cb_error/ - Error cb'd due to DNS failure.
  • /callback_thrown_error/ - Error thrown by JSON.parse() failure, causes a process crash.
  • /callback_timeout/ - Error thrown instead a timeout, process crash.
  • /promise_valid/ - Valid promise result.
  • /promise_cb_error/ - Error cb'd due to DNS failure.
  • /promise_thrown_error/ - Error thrown by JSON.parse() failure. Does not crash.
  • /promise_timeout/ - Error thrown after a timeout. Does not crash.

In CPS style if the error is cb'd we're fine. If there is a thrown error, nested within the CPS chain there is simply no way to handle that error other than a domain. With Promise and async/await, we can have one form of error handling, try/catch, and it is able to handle the error no matter how it came to be, nor how deeply nested. Looking at the various promise urls helps to showcase that capability. This capability requires the whole nested chain to be properly executed. But if module developers ensure that their calls are properly handling promise chains, then they should bubble up until someone applys a then in traditional syntax or a try/catch via await syntax.

@Trott
Copy link
Member

Trott commented Nov 8, 2018

My impression is that the answer to the "when will domains be removed?" question is "quite possibly never." And the answer to the "what will domains be replaced with?" question is "kinda sorta async_hooks but mostly kinda sorta." While perhaps not the most satisfying answers, I wonder if that's basically what we have, and if so, perhaps this can/should be closed?

@misterdjules
Copy link

@Trott I feel the current state of domain does not send a clear message to users though: on one hand, domain is deprecated, and the documentation hints at "a replacement API", but it doesn't seem like any replacement has been considered.

Some users have been very vocal about the confusion that this message generates, and it seems that we could at least provide user-land alternatives that would allow for a reasonable migration path for users while allowing core to fully deprecate domain.

I've submitted a PR that enables user-land implementations of domain at #24279 (this is a work in progress, but I'd like to gather initial feedback on whether core would be open to merge something similar), and worked on a proof of concept of a user-land domain implementation that relies on those changes at https://github.com/misterdjules/domaine.

Any thoughts on that?

@devsnek
Copy link
Member

devsnek commented Nov 13, 2018

lots of replacements have been considered, just nothing has been very compelling (imo because use case itself is uncompelling). I think the best direction would be to suggest that users migrate away from the domain model, and properly handle errors.

@aidinrs
Copy link

aidinrs commented Nov 14, 2018

@devsnek how do you provide context for logs of your applications/servers?

@devsnek
Copy link
Member

devsnek commented Nov 14, 2018

@AiDirex you could use async_hooks for that. (check out an example here). the thing that has been the problem isn't the scoped context, it's been the error handling.

@aidinrs
Copy link

aidinrs commented Nov 14, 2018

@devsnek Thank you, but I know that modules like cls-hooked exist that use async-hooks to solve the issue of context. The problem is that some modules aren't compatible with async-hooks yet and their maintainers won't fix the compatibility issues because async-hooks module is still in an experimental state.

@devsnek
Copy link
Member

devsnek commented Nov 14, 2018

@AiDirex out of curiosity, can you point me to a module that is incompatible with async_hooks? every async primitive in node (via js or c++) should be tracking async_hook context.

@aidinrs
Copy link

aidinrs commented Nov 14, 2018

@devsnek one is bluebird which lots of other modules depend on it.

@isaacs
Copy link
Contributor Author

isaacs commented Jan 2, 2020

Update on this: I've been using async-hook-domain for the last several versions of tap, and it mostly works.

However, in order to handle errors, it has to do some unsavory things, like patch process._fatalException. And due to #26794, it fails to get the correct continuation flow when multiple promises are in use.

Domains were deprecated in node v4, after all. As of v13, still no answer as to what to use instead.

@devsnek
Copy link
Member

devsnek commented Jan 2, 2020

#26794 is unfortunate, but when it is fixed it should be possible to track promise rejection that originates from the context of your choosing. As to a replacement for domains, I don't think there will be one magic module to replace everything domain did. In particular, domain tied async context and error handling together, and it became increasingly contrived and complex trying to follow how js worked. Async context tracking is fairly inherent to node core, and we have async_hooks for that. Error handling though, is a very nuanced problem. People don't even agree on what constitutes exceptional behaviour (for example, the argument on unhandled rejection tracking), so leaving that to userland seems appropriate to me. Your async-hook-domain module feels to me like a good sign that we're on the right path. There are definitely kinks to be worked out, but you're tracking errors across async contexts, on your own terms.

@jasnell
Copy link
Member

jasnell commented Jan 11, 2021

Going to transition this thread into a discussion at this point. Scratch that, looks like the option to convert an issue into a discussion has been temporarily disabled.

@ORESoftware
Copy link
Contributor

ORESoftware commented Oct 31, 2023

there are things that @isaacs and I disagree on, but I agree 100% with his comments on this thread. Domains are a political/religious issue because most people don't seem to really understand how they work. I dont see why maintaining them in core would be any harder than maintaining the equivalent functionality with async-hooks - however there is no equivalent functionality with async-hooks? Some might be ok with having no such equivalent functionality - but many of us are not.

@isaacs
Copy link
Contributor Author

isaacs commented Oct 31, 2023

@ORESoftware i did end up scratching my own itch with async-hook-domain https://www.npmjs.com/package/async-hook-domain

If you need the error handling behavior of domains, then it's worth taking a look at.

@ORESoftware
Copy link
Contributor

ORESoftware commented Nov 1, 2023

@isaacs thanks - I learned a lot about domains by implementing npm package "domain-haven" ->
before version 14, a lot of times, I would be touching these a lot:

process.on('unhandledRejection', (e, p) => {
   // reference p.domain if it exists
});

process.on('uncaughtException', (e) => {
  // reference process.domain if it exists
});

questions about your library belong there but someone will probably use this thread someday - and I could probably poke through your codebase to find out - but does your library need to reference the above global handlers? It certainly would be nice if things were implemented in core (etc etc) so that whatever is "acting like domain" is consistent enough to not need to deal with global things like those handlers.

@isaacs
Copy link
Contributor Author

isaacs commented Nov 1, 2023

@ORESoftware It uses the newer process.setUncaughtExceptionCaptureCallback method to handle errors and the process.on('uncaughtExceptionMonitor') event to know when an error has occurred and look up the appropriate handler from the stack of Domains for the given async context. There is necessarily some "global" hooking that needs to be done, because errors that are not handled go unhandled to the global context, that's what "unhandled" means. But at least, the more recent interfaces are cleaner and less prone to interference from code that is doing other things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain Issues and PRs related to the domain subsystem. question Issues that look for answers.
Projects
None yet
Development

No branches or pull requests