Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disposable is leaky by default #159

Open
rixtox opened this issue Jun 4, 2023 · 84 comments
Open

Disposable is leaky by default #159

rixtox opened this issue Jun 4, 2023 · 84 comments

Comments

@rixtox
Copy link

rixtox commented Jun 4, 2023

I'm a bit uneasy with the fact that it's allowed to create a Disposable without registering its dispose method to some kind of disposal scope, either with using on the current scope, or add to some DisposableStack.

In fact, the disposal registration is opt-in, not opt-out. This gives a Disposable object leaky behavior by default. It's like still doing malloc() and free(), but added opt-in support for scope based RAII. This is especially concerning given most of the resource management in JavaScript to date has been implicit by garbage collection. A Disposable object can leak even after it getting garbage collected.

function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}

function main() {
    const handle = createHandle();
    handle.write(data);
    // ops you forget to dispose handle
}

main();
// now it's leaking quietly
// any because handle is garbage collected you can't even find it in heap dump

It's worse than leaking - it's leaking quietly, and it can disappear from heap dump because the dispose method can be garbage collected. Diagnosing a leak like this would be a nightmare.

I know, before the proposal people have already been doing explicit resource management in JavaScript. But that's because it's the only way to do it. I would expect a language level improvement on resource management to do better than just providing a syntactic sugar without addressing the unsafe nature of explicit resource management.

I'm not saying I have a solution to the problem. But I'd like to at least start a conversation on possibilities.

My proposal have improved for several iterations thanks to in-depth discussion. To help people to quickly follow up with the most recent proposal of mine, please check this mega post.

Maybe it's better to not exposing the dispose method on any returned value. A dispose method must be registered to some DisposeScope. While decoupling DisposeScope from block scope, making it similar to Autorelease Pool Blocks in Objective-C. So we don't need explicit transfer of ownership when a function is using some Disposable resource but wants to delegate its disposal to a parent DisposeScope. Also a DisposeScope can be stored and re-opened to support use cases like class.

So it would look something like this:

interface DisposeScope {
    defer<T>((value: T) => Promise<void>|void): void;
}

interface DisposableFunction<T, Args extends any[]> {
    (scope: DisposeScope, ...Args): T;
}

interface DisposableClass<T, Args extends any[]> {
    new (scope: DisposeScope, ...Args): T;
}

function Allocate(scope: DisposeScope, length: number): Array {
    const arr = new Array(length);
    scope.defer(() => arr.splice(0, arr.length));
    return arr;
}

function Bar() {
    const bar = using Allocate(10); // using always bind to the nearest dispose scope above the call stack
}

async dispose { // create a DisposeScope at this stack level
    const foo = using Allocate(5);
    {
        Bar();
    }
} // await bar dispose(); awiat foo dispose();

const baz = using Allocate(5); // error: no enclosing DisposeScope


// For class

class Resource {
    constructor(scope: DisposeScope) {
        this.#scope = scope;
    }
    async run(): Promise<void> {
        async dispose(this.#scope) { // reopen a DisposeScope
            this.#arr = using Allocate(5);
        } // disposal is being deferred to the owner of this.#scope
    }
}

async dispose { // create a DisposeScope at this stack level
    const resource = using new Resource();
} // await resource.#arr dispose();

Enclosing some code inside a DisposeScope means you want the scope to collect all the resource dispose methods, and call them when the DisposeScope is closed, and to make sure all dispose methods complete before executing the statements after the DisposeScope block.

You can even take it further to not pass DisposeScope as an argument. Instead making it an environment slot on the execution context. A defer keyword can be added to register a disposal method to the enclosing DisposeScope. However storing and re-opening a scope would require some changes to be supported.

So it can even look something like this:

function Allocate(length: number): Array {
    const arr = new Array(length);
    defer arr.splice(0, arr.length); // the expression is deferred for evaluation in the enclosing DisposeScope
    async defer { // a block expression can be used for defer
        await cleanup(); // await can be used after async defer
    }; // async defer can be used in a sync function as long as the enclosing DisposeScope is async dispose
    return arr;
}

function Bar() {
    const bar = Allocate(10); // function call always inherits the nearest dispose scope above the call stack
}

async dispose { // create a DisposeScope at this stack level
    const foo = Allocate(5);
    {
        Bar();
    }
} // await bar dispose(); awiat foo dispose();

const baz = Allocate(5); // error: no enclosing DisposeScope


// For class

function EnclosingDisposeScope(): unique symbol {
    // [native function]
    // returns a special symbol uniquely identifies the current enclosing DisposeScope
}

class Task {
    constructor() {
        this.#scope = EnclosingDisposeScope();
    }
    run(): void {
        dispose(this.#scope) { // reopen a DisposeScope identified by the symbol
            this.#arr = Allocate(5);
        } // disposal is being deferred to the scope identified by this.#scope
    }
}

async dispose { // create a DisposeScope at this stack level
    const task = new Task();
    dispose { // a new DisposeScope at top of the stack
        task.run();
    }
} // await task.#arr dispose();

In this way there's no need to define a Disposable type, and no need to use a DisposableStack. Even the using keyword can be omitted. A function created disposable resource no longer has to return a Disposable type. Every function just return whatever value they produce, and defer statement can be inserted at any location to dispose resource at the enclosing DisposeScope.

In this way, a disposal cannot be scheduled with defer if there's no enclosing DisposeScope. And the only way to register disposal operation is to use defer, and the dispose method is never exposed to the outside.

This would provide a safer default behavior when using Disposable resource. If you somehow "forgot" to create an new DisposeScope, resource created directly or indirectly from your function won't leak silently - it either fails to defer to an enclosing DisposeScope, resulting an error being thrown, or it's guarenteed to be disposed when the enclosing DisposeScope closed. If a resource is living longer than you expected, it's very likely being deferred to some long running DisposeScope, and you can easily find both the DisposeScope and the dispose method it referecnes in a heap dump. Diagnosing resource retaintion problems would be easier this way.

Extending the lifetime of a DisposeScope

Resource lifetime and execution context (functions, try-catch blocks etc) lifetime are loosely related. Resource references can be cloned, assigned, transferred, returned, down and up the call stack. You simply cannot confine a resouce in a fixed block of code enclosed by a pair of curly braces. Execution context on the other hand, is more ordered - it's always in a Last-In-First-Out stack. There are special behaviors like async/await, macrotasks, generator functions but if you generalize execution context enough, these can also follow the same rule.

A curly brace block syntax for a resource DisposeScope made it easy to implicitly pass the scope down the execution stack, but it doesn't capture the dynamic nature of resource passing between different execution contexts, and in many cases, outside the curly brace block the DisposeScope can enclose. When that happen, we will have use-after-free problems:

function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose {
    const handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}

Note that DisposableStack also has this problem:

function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}
function main() {
    using handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}

However, if we decouple the DisposeScope from execution block scope, we can define a different semantic for it. Just look at the following code and it would be very intuitive if it can just work:

function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose { // spoiler: this should be an async dispose
    const scope = EnclosingDisposeScope();
    const handle = createHandle();
    doSomething().then(() => {
        dispose(scope) { // re-open a captured scope on an async context
            handle.write(data);
        }
    });
}

An immediate observation tell us, we can make it work if we add the following rule for a DisposeScope disposal: Only dispose a DisposeScope when all the references to that scope were gone (unreachable / garbage collected).

We can manually analyze the bahavior by denoting reference counts of the scope.

dispose { // open the scope, refcount++: 1
    const scope = EnclosingDisposeScope(); // refcount++: 2
    const handle = createHandle();
    doSomething().then(/* ... */); // captures scope, refcount++: 3
    // const scope gone, refcount--: 2
} // close the scope, refcount--: 1

// on the async execution context:
() => {
    dispose(scope) { // re-open the scope, refcount++: 2
        handle.write(data);
    } // close the scope, refcount--: 1
} // captured scope gone, refcount--: 0, now we can dispose scope

We can see that after reaching the end of the dispose block that created the DisposeScope, there is still reference to the scope. So the dispose doesn't happen. It's until the scheduled async context finished, and the reference count of the scope dropped to 0, we can finally dispose the scope.

This however, still has a problem. The DisposeScope we created here is an synchronous scope. But the refcount of the scope only reach to 0 on a different async execution context. This means we should use an async dispose scope instead, even the deferred function is synchronous:

function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
async dispose {
    const scope = EnclosingDisposeScope();
    const handle = createHandle();
    doSomething().then(() => {
        async dispose(scope) { // re-open the captured scope on an async context
            handle.write(data);
        }
    });
}

We can write self managed resources:

function createHandle(): Handle {
    const scope = EnclosingDisposeScope();
    const { write, close } = nativeHandle();
    defer close();
    return { write, scope }; // extending the scope's lifetime to the lifetime of the returned object
}
async dispose {
    const handle = createHandle();
    doSomething().then(() => {
        // no need to capture the scope explicitly, handle will hold a reference to the scope already
        handle.write(data);
    });
}

// For class

class Task {
    constructor() {
        this.#scope = EnclosingDisposeScope(); // extending the scope's lifetime to the lifetime of the returned object
    }
    run(): void {
        dispose(this.#scope) {
            this.#arr = Allocate(5);
        }
    }
}
async dispose {
    const task = new Task();
    task.run();
}

We can even keep improving this by integrating async context tracking mechanism like Node.js AsyncLocalStorage / AsyncResource or the AsyncContext proposal. So the DisposeScope itself is an AsyncResource.

There are however some performance implications if we binds the lifetime of an async DisposeScope to its references. It means the dispose can only happen after a GC cycle. But personally I think the benifits are worth the tradeoff. We can achieve almost implicit usage of managed resource this way.

@rixtox
Copy link
Author

rixtox commented Jun 7, 2023

I made a proof-of-concept preliminary implimentation based on Node.js AsyncLocalStorage / AsyncResource and FinalizationRegistry.

https://github.com/lyfejs/DisposeContext-PoC/blob/main/src/DisposeContext.test.ts

For the test code below:

function log(...args: any[]): void {
    const time = Math.floor(performance.now());
    console.log(time, ...args);
}

async function sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
}

function Allocate(length: number): Array<any> {
    const arr = new Array(length).fill(length);
    DisposeContext.defer(() => log('deallocate end'));
    DisposeContext.deferAsync(async () => sleep(500));
    DisposeContext.defer(() => arr.splice(0, length));
    DisposeContext.defer(() => log('deallocate start'));
    return arr;
}

async function main() {
    log('main start');
    await DisposeContext.runAsync(async () => {
        log('AsyncDisposeContext scope start');
        const arr = Allocate(5);
        log('arr =', arr); // [5, 5, 5, 5, 5];
    
        setTimeout(() => {
            log('AsyncDisposeContext setTimeout start');
            log('arr =', arr); // [5, 5, 5, 5, 5];
            log('AsyncDisposeContext setTimeout end');
        }, 1000);

        log('AsyncDisposeContext scope end');
    });
    log('main end');
    exit(0);
}

if (typeof global.gc === 'function') {
    // force using a more frequent gc cycle (--expose-gc flag required)
    setInterval(() => {
        log('gc');
        gc!();
    }, 2000);
} else {
    // this sleep is required to keep the macrotask queue non-empty
    // otherwise when we wait for gc our microtask queue would be empty, and
    // the process would be terminated, before gc can kick-in
    sleep(60 * 1000);
}

main();

We can get the following output:

704 main start
705 AsyncDisposeContext scope start
706 arr = [ 5, 5, 5, 5, 5 ]
707 AsyncDisposeContext scope end
1709 AsyncDisposeContext setTimeout start
1710 arr = [ 5, 5, 5, 5, 5 ]
1711 AsyncDisposeContext setTimeout end
2717 gc
2730 deallocate start
3243 deallocate end
3245 main end

Note that the resource registered to the async DisposeContext doesn't start deallocation until all the async context captured the DisposeContext are finished. The main function only resumes after resource deallocation finish.

Also note that the resource returned by Allocate doesn't hold a reference to the enclosing DisposeContext but still the async setTimeout callback is able to extend the lifetime of the resource. This is achieved using AsyncLocalStorage. If you need to pass resource to other async context, you can make the resource subclassing AsyncContext and it would automatically capture the async context it was created on. For example:

class Foo extends AsyncResource {
    constructor(public value: string) {
        super('Foo');
        DisposeContext.defer(() => log('foo deallocate end'));
        DisposeContext.deferAsync(async () => sleep(500));
        DisposeContext.defer(() => log('foo deallocate start'));
    }
}

async function main() {
    log('main start');
    await DisposeContext.runAsync(async () => {
        log('AsyncDisposeContext scope start');
        const arr = Allocate(5);
        log('arr =', arr); // [5, 5, 5, 5, 5];
        let foo = new Foo('hello');
    
        setTimeout(() => {
            log('AsyncDisposeContext setTimeout 500 start');
            log('arr =', arr); // [5, 5, 5, 5, 5];
            log('AsyncDisposeContext setTimeout 500 end');
        }, 500);

        DisposeContext.runAsync(async () => {
            setTimeout(() => foo.runInAsyncScope(() => {
                log('AsyncDisposeContext setTimeout 1000 start');
                log('foo =', foo.value);
                log('AsyncDisposeContext setTimeout 1000 end');
            }), 1000);
        });

        log('AsyncDisposeContext scope end');
    });
    log('main end');
    exit(0);
}

This would output:

641 main start
642 AsyncDisposeContext scope start
643 arr = [ 5, 5, 5, 5, 5 ]
644 AsyncDisposeContext scope end
1148 AsyncDisposeContext setTimeout 500 start
1148 arr = [ 5, 5, 5, 5, 5 ]
1148 AsyncDisposeContext setTimeout 500 end
1658 AsyncDisposeContext setTimeout 1000 start
1659 foo = hello
1659 AsyncDisposeContext setTimeout 1000 end
2650 gc
4659 gc
4672 foo deallocate start
5187 foo deallocate end
5187 arr deallocate start
5697 arr deallocate end
5699 main end

Notice that even we enclosed second setTimeoue inside a new DisposeContext, we can reopen the async context of where foo was created by foo.runInAsyncScope(). This would extend the lifetime of the outmost DisposeContext to cover both the setTimeout callbacks.

@mhofman
Copy link
Member

mhofman commented Jun 7, 2023

I'm not sure I follow what the suggestion is.
The dispose mechanism is about explicit resource management, where the resource is explicitly owned and deterministically disposed of when exiting a scope.

You can already perform some implicit cleanup through FinalizationRegistry, however that is neither guaranteed to execute, nor deterministic as anything that is relying on garbage collection.

In my mind, those 2 things are completely different type of problems, with different use cases. I don't believe we should rely on garbage collection mechanisms to controls the disposal of held resources.

@mhofman
Copy link
Member

mhofman commented Jun 7, 2023

IMO forgetting to declare a disposable with using is similar to not awaiting an async function. There are type based linting rules that can help catch those mistakes, but I don't see how a dynamic language like JS could enforce anything (unless you somehow could shoehorn linear typing into the language, which you can't).

@rixtox
Copy link
Author

rixtox commented Jun 7, 2023

I'm not sure I follow what the suggestion is. The dispose mechanism is about explicit resource management, where the resource is explicitly owned and deterministically disposed of when exiting a scope.

I think we should not focus on whether the resource management is implicit or explicit, but rather think in terms of ownership. I believe ownership should be declared, not assigned. When you have implicit resource management, a resource ownership is declared when you initialize that variable. Assigning a variable to another is declaring the extension of the resource ownership lifetime to the other variable.

This is not true for the current Disposable approach. You can declare a variable but not owning its resource.

You can already perform some implicit cleanup through FinalizationRegistry, however that is neither guaranteed to execute, nor deterministic as anything that is relying on garbage collection.

In my mind, those 2 things are completely different type of problems, with different use cases. I don't believe we should rely on garbage collection mechanisms to controls the disposal of held resources.

That's fine. We don't have to leverage garbage collection to trigger disposal. I was just outlining what are possible for making resource ownership to be declarative rather than imperative.

IMO forgetting to declare a disposable with using is similar to not awaiting an async function

Not awaiting an async function won't leak - it's still scheduled to be run on task queue, you are just not joining with the async completion to your current execution context. And there are legit use cases for not awaiting for an async function.

But not using a Disposable has serious consequence - it leaks quietly even after it being garbage collected. I cannot think of a single legit reason of not binding a Disposable resource to some scope.

And by the way there are more problems in this design - can you dispose a resource twice? If I'm authoring a Disposable object, should I keep track of if the dispose method has been called already and produce an error or ignore subsequence release if someone is doing excess release? Can you bind a Disposable resource to multiple scopes/stacks? How can you unbind a resource? Of course these are all anti-patterns and misuses. But the current design is prone to these errors.

One thing I also proposed above is to not expose the dispose method as an interface. This would encourage misuses and anti-patterns to abuse it to do things it was not intended for. Like in #156 and #158. Instead bind the disposal directly to some scope would address both my concerns above. And you don't need a linter to enforce that. This can all be done with a proper design.

@mhofman
Copy link
Member

mhofman commented Jun 7, 2023

When you have implicit resource management, a resource ownership is declared when you initialize that variable. Assigning a variable to another is declaring the extension of the resource ownership lifetime to the other variable.

I think this is where I disagree. Assignment is not the same as gaining ownership.

As I mentioned, JS cannot have a notion of ownership through assignment, which is basically linear types.

The using mechanism is the way to declare ownership. And indeed nothing prevents you from having multiple code paths independently claiming ownership of an object, or none at all, as that is basically impossible to add to the JS language at this point.

@rixtox
Copy link
Author

rixtox commented Jun 7, 2023

I think this is where I disagree. Assignment is not the same as gaining ownership.

As I mentioned, JS cannot have a notion of ownership through assignment, which is basically linear types.

In a garbage collection based language like JavaScript, you are right there's no explicit notion of ownership. But objects and resource still have their lifetime. An object's lifetime starts at when it's been declared/initialized, and ends when it's no longer reachable from the root. So in a very generalized way, we can say in JavaScript everything is owned by the root variable context. You extend the lifetime of that ownership when you assign the object to another variable.

Of course JavaScript would never become linear typed like Rust. And I don't expect that to happen. But I do expect a resource management design to be safer under general usage, and can prevent common mistakes in normal use cases.

The using mechanism is the way to declare ownership.

The using mechanism does not "declare" an ownership. It assigns an ownership. IMO this mechanism is flawed mainly because of 2 reasons:

  1. An easy mistake of not using a Disposable would cause dangerous leak - this is just unacceptable. There's no justification of it being allowed. I made it the title of this issue for a reason. It's the biggest problem of the current proposal. It's so dangerous and illegal to not using a Disposable, much more damaging than not awaiting an async function. It's so serious that I believe the language has to prevent it from happening at all. We cannot allow it to happen and just tell people to use a linter to catch the mistake. And the solution to this issue is actually not that hard - the language should enforce the binding of a resource to a resource scope enclosing the resource creation, and throw an error if there's no such scope.

  2. using is binding resource to the block scope. Block scopes are part of execution context. I mentioned it couple times but let me empathize: Execution context and resource context should not be considered as the same thing and should be decoupled in program design. Execution context are generally stack based. Even async callbacks are scheduled with push and pop operations on some queue. You can correlate microtasks with some stack based context. This is what AsyncContext has been trying to make explicit of. However resource context is completely different. Resource can be passed around different levels of execution context and being shared, making them not bound to a single execution stack frame. Binding a resource only to a block scope, which is part of the execution context, is not that useful. You may want to return the resource. You may want to capture the resource in a closure. You may want to assign it to an outer scope or on some global data structures. Yes there's DisposableStack but the stack itself is a Disposable and itself has the same problem. The solution to this issue is also the same - just bind a resource to a resource scope, not an execution context scope. If you read my proposed possible solutions in the main issue article, you would understand how the decoupling of resource context and execution context actually work. This is already a thing in language like Objective-C where it has @autorelease {} blocks being a decoupled resource context than the execution context.

Although execution context and resource context should be decoupled, they need to work together to make sure correct program behavior:

  1. The execution context that created the resource context must join with (await on) that resource context to be collected (deallocated/disposed).
  2. A resource context may enclose multiple execution context and capture resources created during the execution.
  3. An execution context referencing a resource from outside of that execution context, should extend the lifetime of that resource's binding resource context to the end of the current execution context. Or use after free can happen.
  4. A resource is owned by its enclosing resource context. This ownership was established when the resource was declared, and cannot be assigned or changed. This achieves declarative ownership.
  5. You can safely pass the resource to different execution contexts - calling a function, making a closure as callback etc, as long as that execution context is enclosed by the resource's binding resource context.

@bakkot
Copy link

bakkot commented Jun 7, 2023

The approach in this proposal is similar to the approach in other languages like Python and Java, where it works fine. Yes, it's possible for a resource to get leaked, but that is inevitable because even in the presence of a perfect resource management system there would be no requirement to use it with any of the many resources which exist today. It is extremely unlikely that the committee is going to choose a different approach, especially at this late stage.

Separately, in my opinion the approach you suggest is significantly worse than the current approach in terms of ability to reason about code. Resources should get cleaned up at a predictable point, not based on what other stuff happens to be on the callstack. And it's important that users be able to create and explicitly clean up long-lived resources, which might outlive the callstack; with the approach you've outlined the only way I can see to do that is a reachability analysis of the "resource context", which is not viable, or by letting users manually manage disposal of "resource context"s, which is equivalent to the current approach.

@rixtox
Copy link
Author

rixtox commented Jun 7, 2023

It is extremely unlikely that the committee is going to choose a different approach, especially at this late stage.

That's very unfortunate. I also notice Ben Lesh, author of RxJS, also only discovered this proposal very recently, and he also expressed his concerns over different reasons. It would be a shame if an important language feature like this cannot have a chance to address community feedbacks before it ships.

Resources should get cleaned up at a predictable point, not based on what other stuff happens to be on the callstack.

That is not true. This is the first thing I mentioned how resource context should work:

The execution context that created the resource context must join with (await on) that resource context to be collected (deallocated/disposed).

The deallocation happens when the binding resource context closes.

And it's important that users be able to create and explicitly clean up long-lived resources

You can do this by declaring the resource on a specific resource context that you have finer control of. This is the same how Objective-C is doing. Just read the Use Local Autorelease Pool Blocks to Reduce Peak Memory Footprint section.

the only way I can see to do that is a reachability analysis of the "resource context", which is not viable, or by letting users manually manage disposal of "resource context"s, which is equivalent to the current approach.

It can be done in very simple and intuitive way. If you want to extend the lifetime of a resource context to some async function for example, just wrap that async function. So a resource context would open before starting the async function, and close after the async function completes. This is already a pattern used in e.g. AsyncContext.run() where you can wrap function calls within an async context.

@bakkot
Copy link

bakkot commented Jun 7, 2023

The deallocation happens when the binding resource context closes.

I appreciate that this is a property your design has. It is a property that I specifically think we should not have, because what the "binding resource context" is, and when it closes, depends on the full state of the callstack at the time the resource is allocated, which is very hard to reason about.

It can be done in very simple and intuitive way. If you want to extend the lifetime of a resource context to some async function for example, just wrap that async function.

What does it mean to "extend the lifetime of a resource context to some async function"? Does that mean the context is kept alive until all references to that function have been garbage collected? Like I said, approaches depending on GC are not viable. Or do you mean that you're immediately invoking the function and the resource context is kept alive only for a single execution of the function? That doesn't cover most use cases, where the function may not be executed immediately.

Here's a common pattern: I create a resource, and later - for example, in an event listener - I use the resource, and then immediately dispose of it. What are you imagining that looks like with your design? Please write it out in sample code.

For reference, here's one way to do that with the design in the proposal:

let resource = getResource();
addEventListener('click', () => {
  using res = resource;
  res.use();
  // resource is disposed of here
}, { once: true });

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

Here's a common pattern: I create a resource, and later - for example, in an event listener - I use the resource, and then immediately dispose of it. What are you imagining that looks like with your design? Please write it out in sample code.

For reference, here's one way to do that with the design in the proposal:

let resource = getResource();
addEventListener('click', () => {
  using res = resource;
  res.use();
  // resource is disposed of here
}, { once: true });

Without any syntax changes to ES6, I can do this:

await context.runAsync(async (context) => {
    const resource = getResource(context);
    context.runAsync(() => new Promise((resolve) => {
        addEventListener('click', () => {
            res.use();
            resolve();
        }, {once: true});
    }));
});
// resource is disposed of here

Where getResource also need to be aware of the context:

function getResource(context) {
    context.defer(() => teardown());
    // ...
}

Your code would fail in the following case:

let resource = getResource();
addEventListener('click', () => {
  using res = resource;
  res.use();
}, { once: true });

addEventListener('keydown', () => {
  using res = resource;
  res.use();
}, { once: true });

But my approach would still work:

await context.runAsync(async (context) => {
    const resource = getResource(context);
    context.runAsync(() => new Promise((resolve) => {
        addEventListener('click', () => {
            res.use();
            resolve();
        }, {once: true});
    }));
    context.runAsync(() => new Promise((resolve) => {
        addEventListener('keydown', () => {
            res.use();
            resolve();
        }, {once: true});
    }));
});

@bakkot
Copy link

bakkot commented Jun 8, 2023

In that example, if you fail to call resolve - for example, if res.use() throws - then your resource will leak forever. This is exactly the same problem you're worried about with the current proposal. (In fact it's worse, since now you're also leaking arbitrarily many other resources which happened to be declared in the same context.)

So the approach you're suggesting doesn't actually improve anything over the current proposal: it still requires discipline on the part of developers to ensure they're using the appropriate patterns to handle resources (or resource scopes). And it's also significantly more complex. So it seems strictly worse than the current proposal.

@mhofman
Copy link
Member

mhofman commented Jun 8, 2023

Also this approach is bound to the context object not being stored somewhere, which would allow whoever has a hold of the context object to perform another runAsync, extending the context's life. More broadly, AsyncContext is not yet a language feature, and there are ongoing discussions about what it means for a context to terminate (spoiler alert, I believe it is intrinsically tied to garbage collection).

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

No it’s not worse. Yes if your promise never resolves, sure your program may hang. But the uncaught error here will be visible to developers and actionable. Unlike the silent leak of not using a resource.

@bakkot
Copy link

bakkot commented Jun 8, 2023

An exception in an event listener is not the only way to fail to call resolve. In that specific example you might get an actionable error somewhere, but if you happened to have a try/catch around listener, or just had a branch where you omitted resolve, you would not.

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

If you try to run on a disposed context of course an error will be thrown. That’s the best we can provide so far. It’s the same as returning a resource that you bind to the current scope with using. Without object level or context level reference tracking this use case can never be supported. But this is not a feature I really need for now.

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

An exception in an event listener is not the only way to fail to call resolve. In that specific example you might get an actionable error somewhere, but if you happened to have a try/catch around listener, or just had a branch where you omitted resolve, you would not.

Of course to support any language feature, some disciplines need to be followed. In this case you need to make sure your asynchronous function completes properly. I think that’s a very reasonable ask.

You don’t need to await on the asynchronous function that you wrapped to run on the context though. The context can await it for you.

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

Even in the worst case where you made a mistake and your Promise doesn’t complete and leaking the resource, it visible in the heap dump, with all retain relations available to the developer.

While with using, your resource can disappear from heap dump when leaked, leaving you no trace to diagnose.

Which one is worst?

@bakkot
Copy link

bakkot commented Jun 8, 2023

In this case you need to make sure your asynchronous function completes properly. I think that’s a very reasonable ask.

Since the asynchronous function involved is necessary resolved only by an explicit call to resolve, that seems like very much the same kind of thing as "resources should be consumed using using", except that the latter is much easier for static analysis to provide errors for. If you're ok with requiring that level of discipline from developers, then the current proposal is fine.

While with using, your resource can disappear from heap dump when leaked, leaving you no trace to diagnose.

I do not think enough developers currently or will ever make use of heap dumps for it to make sense to design a feature on the basis of how it will affect a heap dump.

@rixtox
Copy link
Author

rixtox commented Jun 8, 2023

One main reason for having any resource management mechanism is to address leaks. And a heap dump is the ad hoc thing to try when diagnosing leaks. It would be outrageous if a resource management proposal doesn’t take leak diagnosis into consideration.

Just to ask the reverse, if your big code base had a ghost leak not showing in heap dump, what else can you do?

You don’t want to later tell people heap dump is not something common developers use so you didn’t bother to support it when designing the Disposable.

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 8, 2023

I'm a bit uneasy with the fact that it's allowed to create a Disposable without registering its dispose method to some kind of disposal scope, either with using on the current scope, or add to some DisposableStack.

In fact, the disposal registration is opt-in, not opt-out. This gives a Disposable object leaky behavior by default. It's like still doing malloc() and free(), but added opt-in support for scope based RAII. This is especially concerning given most of the resource management in JavaScript to date has been implicit by garbage collection. A Disposable object can leak even after it getting garbage collected.

function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}

function main() {
    const handle = createHandle();
    handle.write(data);
    // ops you forget to dispose handle
}

main();
// now it's leaking quietly
// any because handle is garbage collected you can't even find it in heap dump

It's worse than leaking - it's leaking quietly, and it can disappear from heap dump because the dispose method can be garbage collected. Diagnosing a leak like this would be a nightmare.

In a language like C#, this is often handled through the use of a finalizer to guarantee cleanup on GC:

public class ManagedHandle : Disposable
{
    private IntPtr handle;

    public ManagedHandle()
    {
        handle = GetNativeHandle();
    }

    ~ManagedHandle()
    {
        // We're doing cleanup implicitly, so just release the handle.
        ReleaseNativeHandle(this.handle);
    }

    public void Dispose()
    {
        // We're doing cleanup explicitly, so suppress the finalizer for this object
        // and release the native handle
        GC.SuppressFinalize(this);
        ReleaseNativeHandle(this.handle);
        this.handle = IntPtr.Zero;
    }
}

The same can be accomplished in JS with a FinalizationRegistry:

const managedHandleRegistry = new FinalizationRegistry(close => {
    // We're doing cleanup implicitly, so just release the handle.
    close();
});
function createHandle() {
    const { write, close } = nativeHandle();
    const unregisterToken = {};
    const managedHandle = {
        write,
        [Symbol.dispose]() {
            // We're doing cleanup explicitly, so suppress the finalizer for this object
            // and release the native handle
            managedHandleRegistry.unregister(unregisterToken);
            close();
        }
    };
    managedHandleRegistry.register(managedHandle, close, unregisterToken);
    return managedHandle;
}

function main() {
    const handle = createHandle();
    handle.write(data);
    // oops, forgot to dispose handle
}

main();
// no leak because the cleanup callback will trigger if we didn't explicitly dispose.

So it would look something like this:

[...]

In this way there's no need to define a Disposable type, and no need to use a DisposableStack. Even the using keyword can be omitted. A function created disposable resource no longer has to return a Disposable type. Every function just return whatever value they produce, and defer statement can be inserted at any location to dispose resource at the enclosing DisposeScope.

In this way, a disposal cannot be scheduled with defer if there's no enclosing DisposeScope. And the only way to register disposal operation is to use defer, and the dispose method is never exposed to the outside.

While I understand its the point of your proposal, I feel that the design is far too rigid. This makes it signifcantly harder to manually manage lifetime of an object if it has to be tied to some outer dispose scope if an object's lifetime needs to extend past any reasonable block scope. The only solution to that in your proposal is to reopen a scope and somehow leave it dangling, possibly with a generator or async function that becomes suspended, and that is far too esoteric of a solution.

This would provide a safer default behavior when using Disposable resource. If you somehow "forgot" to create an new DisposeScope, resource created directly or indirectly from your function won't leak silently - it either fails to defer to an enclosing DisposeScope, resulting an error being thrown, or it's guarenteed to be disposed when the enclosing DisposeScope closed. If a resource is living longer than you expected, it's very likely being deferred to some long running DisposeScope, and you can easily find both the DisposeScope and the dispose method it referecnes in a heap dump. Diagnosing resource retaintion problems would be easier this way.

This can already be accomplished with FinalizationRegistry, as I've shown above. While using a FinalizationRegistry is an opt-in behavior, its also a lot of overhead that most resources won't need. Instead, I think its better to opt-in to the additional overhead where warranted, rather than force it on all users. In addition, FinalizationRegistry was designed in such a way as to avoid holding onto finalizable object reference, which would itself either prevent finalization or require more complexity to track object ressurection. I'll get into more about why that's important in my response to "Extending the lifetime of a DisposeScope", below.

Extending the lifetime of a DisposeScope

Resource lifetime and execution context (functions, try-catch blocks etc) lifetime are loosely related. Resource references can be cloned, assigned, transferred, returned, down and up the call stack. You simply cannot confine a resouce in a fixed block of code enclosed by a pair of curly braces. Execution context on the other hand, is more ordered - it's always in a Last-In-First-Out stack. There are special behaviors like async/await, macrotasks, generator functions but if you generalize execution context enough, these can also follow the same rule.

A curly brace block syntax for a resource DisposeScope made it easy to implicitly pass the scope down the execution stack, but it doesn't capture the dynamic nature of resource passing between different execution contexts, and in many cases, outside the curly brace block the DisposeScope can enclose. When that happen, we will have use-after-free problems:

function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose {
    const handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}

Note that DisposableStack also has this problem:

function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}
function main() {
    using handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}

I would argue that in this case, your explicit lifetime scope may not match the intended lifetime, and should be rewritten:

async function main() {
    using handle = createHandle();
    await doSomething();
    handle.write(data);
}

If your intent was that handle shouldn't survive the async operation, then you would expect to get an error in the callback. If you need it to survive long enough for the callback to execute, then you would need to either change the lifetime scope (by using await), or switch to a slightly more imperative style:

function main() {
    using stack = new DisposableStack();
    const handle = stack.use(createHandle());
    const doSomethingPromise = doSomething();
    const saved = stack.move();
    doSomethingPromise.then(() => {
        // restore outer stack
        using stack = saved;
        handle.write(data);
    }, (e) => {
        // restore outer stack
        using stack = saved;
        throw e;
    });
}

However, if we decouple the DisposeScope from execution block scope, we can define a different semantic for it. Just look at the following code and it would be very intuitive if it can just work:

function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose { // spoiler: this should be an async dispose
    const scope = EnclosingDisposeScope();
    const handle = createHandle();
    doSomething().then(() => {
        dispose(scope) { // re-open a captured scope on an async context
            handle.write(data);
        }
    });
}

This is my major concern for your proposal, and the reason I brought up FinalizationRegistry earlier. As soon as you introduce scope capturing of an implicit surrounding scope, you have opened yourself up to memory and GC issues. If any callee can capture the implicit outer scope and save it, all of those resources lifetimes are no longer obvious to the caller. Since that scope can be restored at any time it can never be freed if any of the closures that capture it are still alive. And this capturing mechanism would need to keep alive all of the outer implicit scopes since there is no telling whether an inner resource has an ordered disposal dependency on an outer resource.

The goal of DisposableStack was to give you the kind capability you are suggesting here, but make it far more explicit how that lifetime is managed and extended. Implicit scoping makes it far less obvious what the impact will be.

An immediate observation tell us, we can make it work if we add the following rule for a DisposeScope disposal: Only dispose a DisposeScope when all the references to that scope were gone (unreachable / garbage collected).

We can manually analyze the bahavior by denoting reference counts of the scope.

dispose { // open the scope, refcount++: 1
    const scope = EnclosingDisposeScope(); // refcount++: 2
    const handle = createHandle();
    doSomething().then(/* ... */); // captures scope, refcount++: 3
    // const scope gone, refcount--: 2
} // close the scope, refcount--: 1

// on the async execution context:
() => {
    dispose(scope) { // re-open the scope, refcount++: 2
        handle.write(data);
    } // close the scope, refcount--: 1
} // captured scope gone, refcount--: 0, now we can dispose scope

We can see that after reaching the end of the dispose block that created the DisposeScope, there is still reference to the scope. So the dispose doesn't happen. It's until the scheduled async context finished, and the reference count of the scope dropped to 0, we can finally dispose the scope.

Ref counting isn't nearly as reliable here as you make it out to be. You can't count references statically because scope, or the arrow function capturing it, can be held in memory. Instead, you must wait until scope is garbage collected to decrement the reference. At that point, you are better off using FinalizationRegistry.

This however, still has a problem. The DisposeScope we created here is an synchronous scope. But the refcount of the scope only reach to 0 on a different async execution context. This means we should use an async dispose scope instead, even the deferred function is synchronous: [...]

There's no point in ref counting if the scope still can't survive past the dispose {} block it captures.

We can even keep improving this by integrating async context tracking mechanism like Node.js AsyncLocalStorage / AsyncResource or the AsyncContext proposal. So the DisposeScope itself is an AsyncResource.

Async context tracking is already a Stage 2 proposal. While I am generally supportive of that proposal, I'm strongly against language level implicit passdown of disposable scopes.

There are however some performance implications if we binds the lifetime of an async DisposeScope to its references. It means the dispose can only happen after a GC cycle. But personally I think the benifits are worth the tradeoff. We can achieve almost implicit usage of managed resource this way.

If dispose can only happen after GC, then your proposal does not give you the ability to enforce lifetime with timely cleanup. One of the main motivations of this proposal was to have an explicitly bounded lifetime for a resource that allow you to avoid race conditions with file IO and thread synchronization. Implicit scoping and capture would make those scenarios impossible, since you could never have an explicitly bounded lifetime.

@rixtox
Copy link
Author

rixtox commented Jun 9, 2023

@rbuckton First of all, I'd like to thank you for taking time to understand my concerns and provide a through response. I really appreciate your professional attitude. It would be better if stakeholders like RxJS, Angular, and other popular frameworks have been asked for feedback in early stages of the proposal. Important changes like this should get more attention from the developer community.

Apologize that I expressed several concerns mixing together. Making it harder to address them one by one. So let me separate them in the order of importance. And allow me to reorder your response into each category and address them with clear context and intent.

1. Leak diagnose

There should always be a retain path from the VM root to a living resource, or the teardown functions of that living resource. The current design made it easy to lose that info when a leak happens. This is not asking for preventing leaks from happening. This is an ask to provide the useful and actionable information developers need when there's a leak happening. Leaks are generally quiet by nature so the retain path is the most essential info for diagnosing leaks. A resource management mechanism should preserve the retain path info as much as possible. It's the bottom line for not being able to fully prevent leaks.

This can already be accomplished with FinalizationRegistry, as I've shown above. While using a FinalizationRegistry is an opt-in behavior, it's also a lot of overhead that most resources won't need. Instead, I think its better to opt-in to the additional overhead where warranted, rather than force it on all users.

I know I was the one first mentioned about FinalizationRegistry, but under a different context. Of course if a Disposable incorporated FinalizationRegistry to guarantee its finalization correctly like you showed above, it won't leak even not using it. But just like you said, this GC finalization behavior preventing leaks from happening should be opt-in. I completely agree, but I don't think it's something even experienced developers would want to use. However, as I emphasized, preventing leaks was never my intention.

Your response still doesn't address the undiagnosable nature of leaking by not using. We should not rely on linter to catch it. Linters can fail in may cases. I believe this can be done with a better language feature design.

2. Sharing resource to multiple detached async calls

I didn't state this issue in a clear sentence before. But one example I wrote above can demonstrate this issue. Let me adapt that example to show it more clear:

async function main() {
    using handle = createHandle();
    const data = await handle.readAsync();
    doSomething(data).then(async (newData) => {
        await handle.writeAsync(newData); // how to extend the handle's lifetime to here?
    });
    doSomethingElse(data).then(async (newData) => {
        await handle.writeAsync(newData); // how to extend the handle's lifetime to here?
    });
    return data;
}

I would argue that in this case, you're explicit lifetime scope may not match the intended lifetime... If you need it to survive long enough for the callback to execute, then you would need to either change the lifetime scope (by using await), or switch to a slightly more imperative style.

Calling async functions in detached mode (without using await) has legit use cases. The example above shows one of those cases. You want to read from the resource and return the result immediately, while scheduling some write operations to the resource without explicitly waiting for them to complete.

It is possible to do it imperatively with using, moving and restoring DisposableStacks around. But very soon you will find yourself decoupling resource context from execution context, reinventing resource context capturing in some way.

async function main() {
    using stack = new DisposableStack();
    const handle = stack.use(createHandle());
    const data = await handle.readAsync();
    const saved = stack.move();
    (async () => { // this is a scope created solely for extending resource lifetime
        using stack = saved;
        await Promise.allSettled([
            doSomething(data).then(async (newData) => {
                await handle.writeAsync(newData);
            }),
            doSomethingElse(data).then(async (newData) => {
                await handle.writeAsync(newData);
            }),
        ]);
    })();
    return data;
}

I don't think you can achieve this without making a dedicated resource context scope. Comparing to the other approach that separated resource context from execution context from the beginning, which is more intuitive to use:

async function main(context) {
    const handle = createHandle(context);
    const data = await handle.readAsync();
    context.runAsync(() => doSomething(data).then(async (newData) => {
        await handle.writeAsync(newData);
    }));
    context.runAsync(() => doSomethingElse(data).then(async (newData) => {
        await handle.writeAsync(newData);
    }));
    return data;
}

You can easily and intuitively extend resource context lifetime by running your async functions on it, in a declarative style.

3. Passing resource out of block scope without binding to some outer scope

While I understand it's the point of your proposal, I feel that the design is far too rigid. This makes it signifcantly harder to manually manage lifetime of an object if it has to be tied to some outer dispose scope if an object's lifetime needs to extend past any reasonable block scope.

I think you got a dilemma here:

  1. You don't want to force resources to be tied to a dispose scope at creation, so they can be passed around freely. Essentially making them dangling resources easy to be leaked.
  2. Leak diagnosis is still the biggest problem unsolved for not binding resources at creation.

You cannot have both for a language like JavaScript. So the question comes down to which is the right choice.

My reasoning to this problem is, at the end of the day, you still need to ensure your program is safe and sound. You can either make the language easy to write unsafe code, but hard to diagnose when developer made a mistake; or you can make the language encouraging developers to follow good practices, and provide useful diagnose support when they do make mistakes, while making it hard to write unsafe code.

To show you what I meant by this, and you probably already realized, it's possible to convert the other approach into explicit imperitive style:

// THIS IS UNSAFE
function unsafeRetain(context) {
    let release;
    context.runAsync(new Promise(resolve => release = resolve));
    return release;
}

Now you can pass the release method freely - but unsafe. It's now functioning in the same way as Disposable - without the using syntax etc. It can leak too if you forget to call release(). But it doesn't disappear from heap dump. This is the main difference from bare Disposable - it still has the context awaiting for the dispose to happen so it preserves the retain path for diagnosis.

that is far too esoteric of a solution

It's harder to turn it into this form but for a good reason - it's unsafe to do so. But you can gain that level of freedom if you know what you were doing.

4. Resource context capturing

I avoided using the dispose {} syntax so far because I wanted to seperate my context and intents. I want to dedicate this section for this topic.

The approach I proposed defined a contract to pass the resource context down the call stack. This has been done by explictly passing the context as function argument so far, but if we make it a language feature it may get a dedicated syntax.

As soon as you introduce scope capturing of an implicit surrounding scope, you have opened yourself up to memory and GC issues. If any callee can capture the implicit outer scope and save it, all of those resources lifetimes are no longer obvious to the caller.

As I mentioned above, although it's doable, it should be deemed unsafe and discoraged, unless you know what you are doing. This is at worst leaking with retain path info available. If you have nested DisposableStack binded together and you forgot to release the root stack, you leak everything on it too, without retain path info available.

And this capturing mechanism would need to keep alive all of the outer implicit scopes since there is no telling whether an inner resource has an ordered disposal dependency on an outer resource.

Yes. This is the same way if you constructed a nested DisposableStack, or have nested function calls each binded resources with using. This is the natural way nested resource dependencies should behave when the are released. You can break this pattern using unsafe methods of course, but we should discourage that.

4.1. Async resource context capturing and GC

I think this is the part most of the rest of your comments were about, and I agree with most of the things you said. Becuase myself also don't think we should leverage any GC features in a resource management mechanism. The intention of me bringing it up is to explore if it's possible to capture resource context across async executon contexts. I only said it's possible if GC is used, but I don't think it's a good idea to involve GC for any real implementation. I would be interested if there's another way to do it.

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 9, 2023

1. Leak diagnose

There should always be a retain path from the VM root to a living resource, or the teardown functions of that living resource. The current design made it easy to lose that info when a leak happens. This is not asking for preventing leaks from happening. This is an ask to provide the useful and actionable information developers need when there's a leak happening. Leaks are generally quiet by nature so the retain path is the most essential info for diagnosing leaks. A resource management mechanism should preserve the retain path info as much as possible. It's the bottom line for not being able to fully prevent leaks.

This can already be accomplished with FinalizationRegistry, as I've shown above. While using a FinalizationRegistry is an opt-in behavior, it's also a lot of overhead that most resources won't need. Instead, I think its better to opt-in to the additional overhead where warranted, rather than force it on all users.

I know I was the one first mentioned about FinalizationRegistry, but under a different context. Of course if a Disposable incorporated FinalizationRegistry to guarantee its finalization correctly like you showed above, it won't leak even not using it. But just like you said, this GC finalization behavior preventing leaks from happening should be opt-in. I completely agree, but I don't think it's something even experienced developers would want to use. However, as I emphasized, preventing leaks was never my intention.

I believe that runtimes will choose to adopt Disposable wrappers for most native handles, such as the NodeJS fs.promises.FileHandle, and they could easily leverage FinalizationRegistry for those wrappers. Runtimes may even choose to expose some kind of native handle wrapper as well:

interface SafeHandleData<T> {
    unsafeHandle: T;
    cleanup: (unsafeHandle: T) => void;
}

class SafeHandle<T> {
    static #dispose = ({ unsafeHandle, cleanup }: SafeHandleData<T>) => cleanup(unsafeHandle);
    static #registry = new FinalizationRegistry(SafeHandle.#dispose);
    #unregisterToken = {};
    #data: SafeHandleData<T> | undefined;

    constructor(unsafeHandle: T, cleanup: (unsafeHandle: T) => void) {
        this.#data = { unsafeHandle, cleanup };
        SafeHandle.#registry.register(this, this.#data, this.#unregisterToken);
    }

    get value() {
        if (!this.#data) throw new ReferenceError("Object is disposed");
        return this.#data.unsafeHandle;
    }

    dispose() {
        if (this.#data) {
            SafeHandle.#registry.unregister(this.#unregisterToken);
            const data = this.#data;
            this.#data = undefined;
            SafeHandle.#dispose(data);
        }
    }

    [Symbol.dispose]() {
        return this.dispose();
    }
}

It may even be worth a future add-on proposal to add something like SafeHandle natively.

If most native handles have Disposable wrappers, users won't need to reach for custom solutions, and thus those handles will either close on GC or appear in heap dumps. In other languages with this capability, the majority of user-written disposables are either some form of composition of existing disposables, or a disposable wrapper for some non-handle-based operation (such as tracing, transactions, etc.) that may not need such finalization behavior.

If all of the native resources you consume are already wrapped in some kind of safe handle wrapper, they will be readily visible in a heap snapshot.

Your response still doesn't address the undiagnosable nature of leaking by not using. We should not rely on linter to catch it. Linters can fail in may cases. I believe this can be done with a better language feature design.

Yes, leaks are possible with using. Yes, linters and type systems may not be able to be 100% reliable at catching this. However, I think this design is far more flexible than what you've suggested so far. Its far easier to write imperative code against Disposable when you explicitly don't want to leverage using. Its also far easier to integrate into the wealth of existing APIs without having to wholly restructure your code, which would otherwise be a severe adoption risk.

For example, I've been considering a follow-on proposal to add Symbol.dispose to ArrayBuffer to forcibly detach the array buffer so that its memory can be reclaimed. This proposal already adds [Symbol.dispose]() to built-in iterators that performs iter.return(), for cases where you need to execute an iterator imperatively rather than with for..of.

2. Sharing resource to multiple detached async calls

I didn't state this issue in a clear sentence before. But one example I wrote above can demonstrate this issue. Let me adapt that example to show it more clear:

async function main() {
    using handle = createHandle();
    const data = await handle.readAsync();
    doSomething(data).then(async (newData) => {
        await handle.writeAsync(newData); // how to extend the handle's lifetime to here?
    });
    doSomethingElse(data).then(async (newData) => {
        await handle.writeAsync(newData); // how to extend the handle's lifetime to here?
    });
    return data;
}

I would argue that in this case, you're explicit lifetime scope may not match the intended lifetime... If you need it to survive long enough for the callback to execute, then you would need to either change the lifetime scope (by using await), or switch to a slightly more imperative style.

Calling async functions in detached mode (without using await) has legit use cases. The example above shows one of those cases. You want to read from the resource and return the result immediately, while scheduling some write operations to the resource without explicitly waiting for them to complete.

It is possible to do it imperatively with using, moving and restoring DisposableStacks around. But very soon you will find yourself decoupling resource context from execution context, reinventing resource context capturing in some way.

async function main() {
    using stack = new DisposableStack();
    const handle = stack.use(createHandle());
    const data = await handle.readAsync();
    const saved = stack.move();
    (async () => { // this is a scope created solely for extending resource lifetime
        using stack = saved;
        await Promise.allSettled([
            doSomething(data).then(async (newData) => {
                await handle.writeAsync(newData);
            }),
            doSomethingElse(data).then(async (newData) => {
                await handle.writeAsync(newData);
            }),
        ]);
    })();
    return data;
}

I don't think you can achieve this without making a dedicated resource context scope. Comparing to the other approach that separated resource context from execution context from the beginning, which is more intuitive to use:

async function main(context) {
    const handle = createHandle(context);
    const data = await handle.readAsync();
    context.runAsync(() => doSomething(data).then(async (newData) => {
        await handle.writeAsync(newData);
    }));
    context.runAsync(() => doSomethingElse(data).then(async (newData) => {
        await handle.writeAsync(newData);
    }));
    return data;
}

You can easily and intuitively extend resource context lifetime by running your async functions on it, in a declarative style.

Intended resource lifetime can always vary based on need. using uses lexical block scoping to give you a clear boundary for a resource, while DisposableStack gives you both the ability to compose resources, as well as the ability to extend lifetime. If you want to have a resource lifetime that is associated with the execution context, you will already be able to do so programmatically with AsyncContext. But again, I am strongly opposed to natively supporting implicit pass down of disposable scopes as that approach is a GC hazard and can make lifetimes impossible to predict. That approach causes far more problems than it solves. If you wanted implicit pass down in your own code, you could do so with DisposableStack and it won't have any effect on using declarations in functions elsewhere in the call stack, so the authors of those functions can make reliable determinations about the lifetime of their resources.

3. Passing resource out of block scope without binding to some outer scope

While I understand it's the point of your proposal, I feel that the design is far too rigid. This makes it signifcantly harder to manually manage lifetime of an object if it has to be tied to some outer dispose scope if an object's lifetime needs to extend past any reasonable block scope.

I think you got a dilemma here:

  1. You don't want to force resources to be tied to a dispose scope at creation, so they can be passed around freely. Essentially making them dangling resources easy to be leaked.
  2. Leak diagnosis is still the biggest problem unsolved for not binding resources at creation.

You cannot have both for a language like JavaScript. So the question comes down to which is the right choice.

My reasoning to this problem is, at the end of the day, you still need to ensure your program is safe and sound. You can either make the language easy to write unsafe code, but hard to diagnose when developer made a mistake; or you can make the language encouraging developers to follow good practices, and provide useful diagnose support when they do make mistakes, while making it hard to write unsafe code.

JavaScript is often a language of convenience and has far more novice and hobbyist users than any other programming language. One of the goals of using is to take a large number of footguns related to resource management and distill it into a easily consumable syntax, but without sacrificing the flexibility and control that experienced developers need. There are often caveats and limitations that come along with that, and we must weigh those limitations against growing language complexity. I feel that using does a fair job at walking this line.

I also think that leak diagnosis will be less of a problem if runtimes provide easy-to-use safe handle wrappers like I mentioned above.

To show you what I meant by this, and you probably already realized, it's possible to convert the other approach into explicit imperitive style:

// THIS IS UNSAFE
function unsafeRetain(context) {
    let release;
    context.runAsync(new Promise(resolve => release = resolve));
    return release;
}

Now you can pass the release method freely - but unsafe. It's now functioning in the same way as Disposable - without the using syntax etc. It can leak too if you forget to call release(). But it doesn't disappear from heap dump. This is the main difference from bare Disposable - it still has the context awaiting for the dispose to happen so it preserves the retain path for diagnosis.

This does not function the same way as Disposable as it forces a dependency on asynchrony, and thus poisons synchronous code paths with await if they want to be able to properly handle exceptions thrown during disposal. It also doesn't make clear how those exceptions can be handled even in the async case, and potentially runs afoul of an unhandled promise rejection because the result of runAsync will go unwitnessed.

that is far too esoteric of a solution

It's harder to turn it into this form but for a good reason - it's unsafe to do so. But you can gain that level of freedom if you know what you were doing.

Therein lies the problem. An overly complex language feature will hamper adoption, and the benefits you've espoused with this design would do more to make the feature harder to reason over and harder to teach. using is simple. It may have a flaw with leaks, but I believe that flaw can be addressed with improvements to host APIs without sacrificing that simplicity. Those host APIs would need to be updated with your proposal as well.

4. Resource context capturing

I avoided using the dispose {} syntax so far because I wanted to seperate my context and intents. I want to dedicate this section for this topic.

The approach I proposed defined a contract to pass the resource context down the call stack. This has been done by explictly passing the context as function argument so far, but if we make it a language feature it may get a dedicated syntax.

As soon as you introduce scope capturing of an implicit surrounding scope, you have opened yourself up to memory and GC issues. If any callee can capture the implicit outer scope and save it, all of those resources lifetimes are no longer obvious to the caller.

As I mentioned above, although it's doable, it should be deemed unsafe and discoraged, unless you know what you are doing. This is at worst leaking with retain path info available. If you have nested DisposableStack binded together and you forgot to release the root stack, you leak everything on it too, without retain path info available.

And this capturing mechanism would need to keep alive all of the outer implicit scopes since there is no telling whether an inner resource has an ordered disposal dependency on an outer resource.

Yes. This is the same way if you constructed a nested DisposableStack, or have nested function calls each binded resources with using. This is the natural way nested resource dependencies should behave when the are released. You can break this pattern using unsafe methods of course, but we should discourage that.

No, this is not quite the same. A DisposableStack can be used to explicitly pass down lifetime, but otherwise has no effect on other using declarations. You opt-in to its use. What you've suggested so far is that syntax could be used to capture the surrounding scope. If you were able to capture the dispose scope within a nested function call, then this implies implicit pass-down of the scope and the scope's lifetime becomes unknowable. If you can only capture the dispose scope within the same lexical scope in which it was declared, then it is no different than just using const scope = new DisposableStack(), thus I don't see the benefit of syntax for that case.

Summary

I'd like to summarize my position as to what we've discussed so far:

Native implicit pass-down is not viable

Implicit pass-down of a dispose scope to permit capturing makes lifetime unknowable. If you cannot reason over the lifetime of a resource, then you cannot reliably use the feature. If you cannot reliably use the feature, then it is harder to teach and to learn and will hamper adoption. As a result, this is not a viable option for native support within the language.

If you "know what you're doing", you can easily implement implicit pass-down in your own code in such a way as to not affect surrounding code. You can do this by defining your own DisposableStack and passing it as an argument, or storing it in a field or variable or AsyncContext, and thus you know at the declaration site that the lifetime is potentially unknowable. This provides flexibility without sacrificing simplicity.

Hosts and built-ins can and should guard against leaky native handles

Yes, using can be leaky if you forget to use it. This can be guarded against for native resources by also leveraging FinalizationRegistry, and should be recommended in documentation, much as it is for C#. Host implementations like NodeJS and browsers should endeavour to guard resources in this way. This ensures that native resources are properly released when the handle itself is GC'd, and will also help with debugging and diagnostics when using heap snapshots.

Linters and type checkers can help as well

This can also be guarded against by linters and type checkers, even if only in part. For example, TypeScript might choose to implement a uses keyword to indicate to the caller that a Disposable will be tracked for cleanup, and provide an error if you fail to "use" it, return it, or store it in an outer variable or field:

class DisposableStack {
    ...
    use<T extends Disposable | null | undefined>(uses value: T): T & used;
    //                                           ^^^^                ^^^^
    //                                           |                   |
    // indicates the value passed to the parameter                   indicates the returned value should be considered
    //                should be considered as used                   as used
    ...
}

function foo() {
    const stack = new DisposableStack();
    //    ^^^^^ - ok: stack is returned, so its up to the caller to manage lifetime.

    const res1 = new SomeDisposable();
    //    ^^^^ - error: res1 was not used

    const res2 = new SomeDisposable();
    //    ^^^^ - ok: res1 is used below

    stack.use(res2);
    return stack;
}

function bar() {
    const res3 = new SomeDisposable() as used;
    //    ^^^^ - ok: explicitly declared as `used`, so we assume the user knows what they're doing
}

It should be easy to handle exceptions

It's important that a developer can reason over exceptions thrown during disposal. Both using and DisposableStack do this for you in a way that that works conveniently with try..catch and doesn't poison synchronous code with await just to retain lifetime.

Resource management should be easy to use

If we design this feature such that only the most experienced developers can use it reliably, then it won't be adopted by the majority. The inconvenience of APIs like FinalizationRegistry and Proxy are a necessity of their utility, and often only affect the producer of the object. The resulting code is often "write few, read many"—You only need to write a leak-safe Disposable around a native resource in a few places, yet the APIs that produce them will often be used orders-of-magnitude-more frequently, and by a combination of both experienced and inexperienced developers alike. It behooves us to design this feature with that in mind.

@rixtox
Copy link
Author

rixtox commented Jun 12, 2023

1. Leak diagnose

I believe that runtimes will choose to adopt Disposable wrappers for most native handles, such as the NodeJS fs.promises.FileHandle, and they could easily leverage FinalizationRegistry for those wrappers. Runtimes may even choose to expose some kind of native handle wrapper as well:

interface SafeHandleData<T> {
    unsafeHandle: T;
    cleanup: (unsafeHandle: T) => void;
}

class SafeHandle<T> {
    static #dispose = ({ unsafeHandle, cleanup }: SafeHandleData<T>) => cleanup(unsafeHandle);
    static #registry = new FinalizationRegistry(SafeHandle.#dispose);
    #unregisterToken = {};
    #data: SafeHandleData<T> | undefined;

    constructor(unsafeHandle: T, cleanup: (unsafeHandle: T) => void) {
        this.#data = { unsafeHandle, cleanup };
        SafeHandle.#registry.register(this, this.#data, this.#unregisterToken);
    }

    get value() {
        if (!this.#data) throw new ReferenceError("Object is disposed");
        return this.#data.unsafeHandle;
    }

    dispose() {
        if (this.#data) {
            SafeHandle.#registry.unregister(this.#unregisterToken);
            const data = this.#data;
            this.#data = undefined;
            SafeHandle.#dispose(data);
        }
    }

    [Symbol.dispose]() {
        return this.dispose();
    }
}

It may even be worth a future add-on proposal to add something like SafeHandle natively.

If most native handles have Disposable wrappers, users won't need to reach for custom solutions, and thus those handles will either close on GC or appear in heap dumps. In other languages with this capability, the majority of user-written disposables are either some form of composition of existing disposables, or a disposable wrapper for some non-handle-based operation (such as tracing, transactions, etc.) that may not need such finalization behavior.

If all of the native resources you consume are already wrapped in some kind of safe handle wrapper, they will be readily visible in a heap snapshot.

I still think it's not an easy task to author safe Disposable resource even for engine developers. But that's easier for me, because I'm not the one authoring native resources. 😂

Your response still doesn't address the undiagnosable nature of leaking by not using. We should not rely on linter to catch it. Linters can fail in may cases. I believe this can be done with a better language feature design.

Yes, leaks are possible with using. Yes, linters and type systems may not be able to be 100% reliable at catching this. However, I think this design is far more flexible than what you've suggested so far. Its far easier to write imperative code against Disposable when you explicitly don't want to leverage using. Its also far easier to integrate into the wealth of existing APIs without having to wholly restructure your code, which would otherwise be a severe adoption risk.

What you said here is not in alignment with what you previously said on another thread:

  1. We want the API to guide the user towards the safest and most reliable avenue for resource cleanup by making those use cases the easiest to address.
  2. We want to allow the user to do less safe things for specific cases, but there should be more resistance on that path to ensure the average use case stays on (1).

If you were being truthful to those doctrines, you would oppose to allowing Disposable to have leaky default behavior.

JavaScript is often a language of convenience and has far more novice and hobbyist users than any other programming language. One of the goals of using is to take a large number of footguns related to resource management and distill it into a easily consumable syntax, but without sacrificing the flexibility and control that experienced developers need. There are often caveats and limitations that come along with that, and we must weigh those limitations against growing language complexity. I feel that using does a fair job at walking this line.

Even in the name of usability I don't think you should relax number (1) that much. Memory/resource safety is hard - you either do it the hard way when you write it, or you learn it the hard way when you fix it. Your doctrine above was leaning toward the former, while your arguments to my concerns were emphasizing the later.

And more importantly, I believe what I proposed was not "sacrificing the flexibility and control that experienced developers need". I will outline a more complete contract in the next section to illustrate my point.

2. Sharing resource to multiple detached async calls

Intended resource lifetime can always vary based on need. using uses lexical block scoping to give you a clear boundary for a resource, while DisposableStack gives you both the ability to compose resources, as well as the ability to extend lifetime. If you want to have a resource lifetime that is associated with the execution context, you will already be able to do so programmatically with AsyncContext. But again, I am strongly opposed to natively supporting implicit pass down of disposable scopes as that approach is a GC hazard and can make lifetimes impossible to predict. That approach causes far more problems than it solves. If you wanted implicit pass down in your own code, you could do so with DisposableStack and it won't have any effect on using declarations in functions elsewhere in the call stack, so the authors of those functions can make reliable determinations about the lifetime of their resources.

You keep bringing up implicit pass down of disposable scopes even the examples I showed in this section have nothing to do with implicit scope capturing. I have a dedicated section below for that discussion because I don't want to pollute the context.

So let me rename the title of this section to clarify my intention, and address your related comments under the same context.

2. Resource context passing without implicit scope capturing

What I was trying to get to in this section was that resource context and execution context should be separated. using binds resource to block scope would quickly become not useful enough in many cases. Moving with DisposableStack can help, but it's adding unnecessary complexity making it harder to learn and write safe code. I completely agree with Ben on move being a overly complicated design. You can make it easier if you separated resource context and execution context from the begining.

Let me show you what I came up with as a simple programing contract to adopt resource context. I only did the async version but there could be a synchronous variant.

Let me emphasize that this is all without implicit scope capturing. What I want to get to in this section is that a programing contract can be established WITHOUT implicit scope capturing to achieve a safer resource management behavior than Disposable, without sacrifising usability, simple to learn, and library authors being more confident to adopt with a simple and safer contract having their back.

I call it the Lifecycle contract. The full implementation and type definitions can be found in this gist. Below is a detailed description of the contract. The text is quite long so I collapsed it to make it easier to navigate to my other comments. Expand it to have a read:

Description of the Lifecycle resource context contract

Resource context and Lifecycle

Resources are objects that require cleanup. The lifetime of a resource starts from it being created, and ends at it being cleaned up.

const { write, close } = open('file'); // created
write('data');
close(); // cleaned up
// cannot use write() again

A resource context owns resources by holding on to their teardown functions.

A resource context can own many resources and perform cleanup all together.

A resource has one and only one owning resource context and cannot be changed.

A resource context has a Lifecycle of three stages:

  • Alive stage - when the resource context was accepting resource ownership.
  • Dying stage - when the resource context was starting to perform cleanup.
  • Dead stage - when the resource context was finished with all cleanup.

A function can safely use a resource if the owning resource context was being kept alive throughout the function call.

const { write, close } = open('file'); // created
life.defer(() => close()); // declare ownership to context, deferred teardown
write('data'); // we can use it here because context is still alive
(async () => {
    await sleep(1000);
    write('data'); // but how about here? how to ensure context is still alive?
})();

To run a function safely, we can extend the Lifecycles of all the depending resources' owning contexts to cover the lifetime of that function.

To do this, we use life.extend(function).

const { write, close } = open('file'); // created
life.defer(() => close()); // declare ownership
life.extend(async () => { // extend resource context Lifecycle
    await sleep(1000);
    write('data'); // safe to use
});

A function created a resource context should wait for the Lifecycle of that context to finish. Which is after all its resources were cleaned up.

To create a new resource context, use Lyfe.spawn(function).

import { Lyfe } from 'lyfe';
await Lyfe.spawn(async (life) => {
    const { write, close } = open('file');
    life.defer(() => close());
    write('data');
});

Errors thrown from the function passed to spawn() and from the deferred callbacks during cleanup can be caught and handled by the await on spawn().

Declare resource context dependencies by spawning Lifecycle

Some resources need to be kept alive for longer, while some resources are better to be cleaned up more frequently. Resource context only cleanup all resources it owns in once. Multiple resource contexts are needed to have different lifetimes.

Resources have dependencies - a short living resource may depend on some long living resource.

Therefore, resource contexts should also have dependencies - a short living context should keep the depending long living context alive.

We achieve this by spawning short living context from a long living context. To do this, we use life.spawn().

const { accept, close } = serve();
longLife.defer(() => close());
while (true) {
    await longLife.spawn(async (shortLife) => {
        const { write, close } = accept();
        shortLife.defer(() => close());
        shortLife.extend(async () => {
            await sleep(1000);
            write('data');
        });
    });
}

In practice, all Lifecycles are descendants from a root Lifecycle called Lyfe.

import { Lyfe } from 'lyfe';
Lyfe.spawn(async (life) => {
    // life is spawned from the root Lifecycle `Lyfe`
});

Cancellation with AbortController

Note: Cancellation is not in-scope of our discussion, so I should just keep these collapsed.

Because resources are often cleaned up when the task was cancelled, it's useful
to provide resource management and cancellation in one simple-to-use API.

[AbortController][AbortController] is used for cancelling fetch
requests
. It's commonly available to many JavaScript
environments and easy to import polyfill if not. Its simple API also made it a
popular cancellation mechanism.

To set an [AbortSignal][AbortSignal] when you spawn a Lifecycle, use
life.withAbort(abort). Then, the signal will be available on the child
Lifecycle at life.signal.

const abort = new AbortController();
Lyfe.withAbort(abort).spawn(async (life) => {
    return fetch(url, { signal: life.signal });
});

To register custom callback when an abort signal is received, use
life.onAbort(callback).

const { send, cancel, close } = makeRequest();
life.defer(() => close());
life.onAbort(() => cancel());
await send();

By default, a child Lifecycle inherits the AbortSignal from the parent
Lifecycle. A parent Lifecycle's abort signal will travel to all its descendant
Lifecycles.

The Lifecycle contract

We define a simple contract for function and class authors to properly annotate their functions and classes, so they can be confident that when their code was used by other developers, the contract would enforce a set of safe behaviors to ensure resources to be properly cleaned up.

In general, as a function or class author, you need to ask one simple question: is my function/class delegating resource cleanup tasks to the caller/creator?

If the answer is no, you don't need to do anything to your function/class signature, just keep it like a normal function/class.

However, if the answer is yes, you would need to adopt the Lifecycle contract to indicate to callers that some cleanup is delegated to them.

On the other hand, if your function used resources that employed the Lifecycle contract, but you clean them up before your function returns, then your function doesn't delegate the cleanup to the caller, so you don't need to adopt the Lifecycle contract for this function.

Rule of thumb: A function should try to contain the cleanup of its resources inside it. Only declare a Lifecycle parameter on the function signature if the function returns a resource that requires cleanup delegated to the caller.

Value function

If a function returns a value, not a resource that need to be cleaned up, write it as a simple function.

function getValue(): Value;
async function findValue(): Promise<Value>;

A function like this may use Resources inside it, but it performs cleanup before return, so the returned value doesn't require cleanup.

async function findValue(): Promise<Value> {
    return Lyfe.spawn(async (life) => {
        const client = new Client();
        life.defer(() => client.close());
        return client.findValue();
    });
}

You should always try to write a function in this way, and only adopt the following patterns when necessary.

Cancellable value function

Note: Cancellation is not in-scope of our discussion, so I should just keep these collapsed.

If a function returns a value that doesn't require cleanup, but accepts a cancellation signal, pass the signal in the parameters, instead of accepting a Lifecycle parameter:

async function fetchValue(url: string, signal?: AbortSignal): Promise<Value> {
    const response = await fetch(url, { signal });
    return response.json<Value>();
}

async function findValue(signal?: AbortSignal): Promise<Value> {
    return Lyfe.withAbort(signal).spawn(async (life) => {
        const client = new Client(life); // client cancellable with the signal
        return client.findValue();
    });
}

LifeFunction that returns resource

If a function returns resource that requires cleanup delegated to the caller, it should accept the outer Lifecycle as its first parameter. A function satisfies this form is called a LifeFunction.

async function findResource(outerLife: Lifecycle): Promise<Resource> {
    return outerLife.spawn(async innerLife => {
        const client = new Client(innerLife); // client is cleaned up inside findResource
        const location = await client.locateResource();
        const resource = new Resource(location); // resource is returned to caller
        outerLife.defer(() => resource.close()); // delegating cleanup to the outer Lifecycle
        return resource;
    });
}

Adopting the LifeFunction signature is requesting the caller to follow the contract to properly clean up the returned resource. Therefore, it requires the caller to pass in a Lifecycle to this function in someway. And if the caller returns the resource to even an outer caller, the required adoption of passing Lifecycle would bubble up the call stack until a function consumes the resource and performs cleanup before it returns.

Interoperability with [Disposable][Disposable]

To use Disposable or AsyncDisposable resource inside a Lifecycle context, just use life.use(disposable).

If one module adopts the Lifecycle contract, and another module consumes AsyncDisposable, you can wrap a Lifecycle into an AsyncDisposable using life.unsafeSpawnAsyncDisposable().

// You have a resource the follows the Lifecycle contract
class Resource {
    constructor(life: Lifecycle) {
        const { read, write } = open();
        life.defer(() => close());
        this.read = read;
    }
    public read!: () => string;
}

// But some module consumes the resource expecting AsyncDisposable contract
async function consumeAsyncDisposable(resource: AsyncDisposable & Resource) {
    using usedResource = resource;
    const data = resource.read();
}

// You can convert Lifecycle into AsyncDisposable like this
async function consumeAsyncDisposableWithLifecycle() {
    const resource = Lyfe.unsafeSpawnAsyncDisposable(life => new Resource(life));
    await consumeAsyncDisposable(resource);
}

(End of the Lifecycle contract)


The Lifecycle contract and the Disposable contract are similar in one way: they both alter a function's signature if there are cleanup delegated to the caller. But they achieve this in different styles:

  • Disposable contract delegates cleanup to caller by changing the return type to a Disposable type
  • Lifecycle contract delegates cleanup to caller by requiring it to pass in a resource context as parameter

The key difference between these contracts is not the position of the change to a function's signature, but rather the level of enforcement on the contract.

The Disposable contract only defines how a value should be produced, but it doesn't enforce how it should be consumed, thus having its "flexibility" while sacrifised safety and lack in confidence as a function author.

The Lifecycle contract on the other hand has "requiring" in its wording, so it's enforcing how its value should be consumed, but not by changing the return type.

// THIS IS UNSAFE
function unsafeRetain(context) {
    let release;
    context.runAsync(new Promise(resolve => release = resolve));
    return release;
}

Now you can pass the release method freely - but unsafe. It's now functioning in the same way as Disposable - without the using syntax etc. It can leak too if you forget to call release(). But it doesn't disappear from heap dump. This is the main difference from bare Disposable - it still has the context awaiting for the dispose to happen so it preserves the retain path for diagnosis.

This does not function the same way as Disposable as it forces a dependency on asynchrony, and thus poisons synchronous code paths with await if they want to be able to properly handle exceptions thrown during disposal.

I should have said AsyncDisposable instead. The Lifecycle contract I gave above has a utility method to turn a Resource using Lifecycle contract into an AsyncDisposable.

// You have a resource the follows the Lifecycle contract
class Resource {
    constructor(life: Lifecycle) {
        const { read, write } = open();
        life.defer(() => close());
        this.read = read;
    }
    public read!: () => string;
}

// But some module consumes the resource expecting AsyncDisposable contract
async function consumeAsyncDisposable(resource: AsyncDisposable & Resource) {
    using usedResource = resource;
    const data = resource.read();
}

// You can convert Lifecycle into AsyncDisposable like this
async function consumeAsyncDisposableWithLifecycle() {
    const resource = Lyfe.unsafeSpawnAsyncDisposable(life => new Resource(life));
    await consumeAsyncDisposable(resource);
}

The execution context calling the asyncDispose function should await for teardown completion. Currently I only implemented the async version but it's possible to have a specialized LifecycleSync type that only accepts synchronous defers, so it would be possible to be turned into a synchronous Disposable.

It also doesn't make clear how those exceptions can be handled even in the async case, and potentially runs afoul of an unhandled promise rejection because the result of runAsync will go unwitnessed.

It's important that a developer can reason over exceptions thrown during disposal. Both using and DisposableStack do this for you in a way that that works conveniently with try..catch and doesn't poison synchronous code with await just to retain lifetime.

My implementation can propagate teardown errors to the caller of asyncDispose. It should now behave like an AsyncDisposable and still be bound to a resource context.

And for Lifecycle contract this is also simple - whoever started a new Lifecycle context should await for resources to be cleaned up. Errors thrown from teardown functions can be caught too. Currently my implementation only report the first error seen, but it's easy to employ something like SuppressedError to collect all errors.

It's harder to turn it into this form but for a good reason - it's unsafe to do so. But you can gain that level of freedom if you know what you were doing.

Therein lies the problem. An overly complex language feature will hamper adoption, and the benefits you've espoused with this design would do more to make the feature harder to reason over and harder to teach. using is simple. It may have a flaw with leaks, but I believe that flaw can be addressed with improvements to host APIs without sacrificing that simplicity. Those host APIs would need to be updated with your proposal as well.

Let me emphasize that the Lifecycle contract I showed so far has no usage of implicit scope capturing. Lifecycle context was being passed explicitly so far. What I want to show here is how a better programing contract can be designed even without any change to the language syntax.

In terms of "complexity", I already showed above both the Disposable and Lifecycle contracts share the same amount of change toward a function adopting these contracts. And they can all be summarized into one or two easy-to-follow rules when authoring a function adopting these contracts. It's just the Lifecycle contract made it a requirement to change how the function is called, while Disposable is being optimistic in whether the caller would use the returned value correctly.

In addition, as discussed above, the move mechanism and dual-existence of block scoped and stack-like resource context are not "simple" at all. It's "simple" to just use Disposable, but much more complicated to use it correctly.

It would be great if you can show me some examples where the Lifecycle contract is more complicated than the Disposable contract.

4. Implicit resource context capturing

I have to say that I don't have a solid solution in implicit resource context capturing, but nor that I think this is that much a need for me. Even if we only have the Lifecycle contract as a 3rd party library would be enough for me. Having something similar in JavaScript standard library would be better even without implicit resource context capturing. It already solved all my biggest concerns with Disposable.

So my intent for doing implicit resource context capturing is solely for improving usability, and adding syntax level enforcement on the contract. I'm totally fine if it turned out to be not feasible and not having it at all. But the Disposable contract is just not meeting my expectation and should have something better, especially if it's meant for the JavaScript standard library and even syntax changes.

I understand it's already a stage 3 proposal, engines may have started implementation. I also wish I was told about it in an earlier stage. But I think this feature would become as important as async/await and deserves every chance to get improved before shipping.

What I was hoping to gain with a syntax level change is to enforce awaiting a new Lifecycle at where it's being created. This is a common pattern in using Lifecycle contract:

async function writeData(file, data) {
    await Lyfe.spawn(life => {
//  ^^^^^ - a new Lifecycle should always be awaited
        const { write } = open(life, file);
        write(data);
    });
}

Not awaiting here would make errors uncaught, violate resource lifetime dependencies, and generally a programming mistake. I haven't encountered a use case where it's okay to not await a new Lifecycle. So I wish there can be a syntax that: spawn a new Lifecycle, call a LifeFunction on the new Lifecycle context, and await for the new Lifecycle to finish.

As to whether should we make the Lifecycle context variable completely hidden and only accessible with special syntax is something I don't really have a preference over. This could be something like implicit resource context capturing but if there's a better way to do this I'm open to possibilities.

Native implicit pass-down is not viable

Implicit pass-down of a dispose scope to permit capturing makes lifetime unknowable.

I'm afraid I would need some examples to fully understand what you mean by having an unknowable lifetime. Are you suggesting that any function you called is able to indefinitely extend the lifetime of your resource context? I would argue if that's how long it takes for that function to use your resource, then you should be holding the resource context to that extent.

Also as mentioned in the Lifecycle contract discription above, if you have long living and short living resources, you can bind them to different Lifecycle contexts to have more fine-grained control over their lifetimes.

Or are you refering to one of the possibility I mentioned before that a function can capture and save a reference of a resource context to keep it alive indefinitely? I would agree that it's not a good idea at all. As I said I don't have a solid solution in this area yet.

If you "know what you're doing", you can easily implement implicit pass-down in your own code in such a way as to not affect surrounding code. You can do this by defining your own DisposableStack and passing it as an argument, or storing it in a field or variable or AsyncContext, and thus you know at the declaration site that the lifetime is potentially unknowable. This provides flexibility without sacrificing simplicity.

It's doable, and with AsyncContext it's probably doable even without using FinalizationRegistry. But I will avoid using it in production if it relies on GC to do leak free.

Hosts and built-ins can and should guard against leaky native handles

Yes, using can be leaky if you forget to use it. This can be guarded against for native resources by also leveraging FinalizationRegistry, and should be recommended in documentation, much as it is for C#. Host implementations like NodeJS and browsers should endeavour to guard resources in this way. This ensures that native resources are properly released when the handle itself is GC'd, and will also help with debugging and diagnostics when using heap snapshots.

Linters and type checkers can help as well

This can also be guarded against by linters and type checkers, even if only in part. For example, TypeScript might choose to implement a uses keyword to indicate to the caller that a Disposable will be tracked for cleanup, and provide an error if you fail to "use" it, return it, or store it in an outer variable or field:

class DisposableStack {
    ...
    use<T extends Disposable | null | undefined>(uses value: T): T & used;
    //                                           ^^^^                ^^^^
    //                                           |                   |
    // indicates the value passed to the parameter                   indicates the returned value should be considered
    //                should be considered as used                   as used
    ...
}

function foo() {
    const stack = new DisposableStack();
    //    ^^^^^ - ok: stack is returned, so its up to the caller to manage lifetime.

    const res1 = new SomeDisposable();
    //    ^^^^ - error: res1 was not used

    const res2 = new SomeDisposable();
    //    ^^^^ - ok: res1 is used below

    stack.use(res2);
    return stack;
}

function bar() {
    const res3 = new SomeDisposable() as used;
    //    ^^^^ - ok: explicitly declared as `used`, so we assume the user knows what they're doing
}

You don't need any of these for the Lifecycle contract. It's already being enforced in function signature.

It should be easy to handle exceptions

It's important that a developer can reason over exceptions thrown during disposal. Both using and DisposableStack do this for you in a way that that works conveniently with try..catch and doesn't poison synchronous code with await just to retain lifetime.

You can also use try..catch together with Lifecycle contract. await is only required at the call site that created an async Lifecycle - where it accepts async defer callbacks, to collect the completion of the Lifecycle. As I mentioned, it's possible to write a synchronous version that only accepts synchronous defer callbacks, so you don't need to await. This is the same as a function that using an AsyncDisposable would be "polluted" to be async.

Resource management should be easy to use

If we design this feature such that only the most experienced developers can use it reliably, then it won't be adopted by the majority.

As I argued aboved, while it has a seemingly simple syntax, it's difficult to use Disposable contract "reliably".

The inconvenience of APIs like FinalizationRegistry and Proxy are a necessity of their utility, and often only affect the producer of the object. The resulting code is often "write few, read many"—You only need to write a leak-safe Disposable around a native resource in a few places, yet the APIs that produce them will often be used orders-of-magnitude-more frequently, and by a combination of both experienced and inexperienced developers alike. It behooves us to design this feature with that in mind.

I understand where you are getting with this. But it's still something native resource authors have to opt-in for. Developers will start wrapping ANYTHING inside a Disposable starting on day 1 - node:fs, node:net, you name it. And I'm sure not many of those Disposables can be "leak-safe". We will end up with a landscape of a few "safe" and many "unsafe" Disposables all over the place.

@rbuckton
Copy link
Collaborator

1. Leak diagnose

I still think it's not an easy task to author safe Disposable resource even for engine developers. But that's easier for me, because I'm not the one authoring native resources. 😂

If the perscriptive guidance for a Disposable over a native handle is to also use FinalizationRegistry, I think implementers or engine hosts (such as browsers) wouldn't have difficulty. They could either leverage FinalizationRegistry directly via spec-native intrinsics, or we could look into adding a host API to make that easier.

Yes, leaks are possible with using. Yes, linters and type systems may not be able to be 100% reliable at catching this. However, I think this design is far more flexible than what you've suggested so far. Its far easier to write imperative code against Disposable when you explicitly don't want to leverage using. Its also far easier to integrate into the wealth of existing APIs without having to wholly restructure your code, which would otherwise be a severe adoption risk.

What you said here is not in alignment with what you previously said on another thread:

  1. We want the API to guide the user towards the safest and most reliable avenue for resource cleanup by making those use cases the easiest to address.
  2. We want to allow the user to do less safe things for specific cases, but there should be more resistance on that path to ensure the average use case stays on (1).

I also said in this thread that the decisions we make must be balanced with ease of use and teachability. This was also about the API, and I've already discussed how leaking native handles and heap snapshot discoverability are already solvable via an API, where hosts offer convenient and safe handle wrappers for native APIs that do this work for you.

If you were being truthful to those doctrines, you would oppose to allowing Disposable to have leaky default behavior.

These are not doctrines, they are guiding principles, and I'm willing to bend them where necessary to make something that is useable. This proposal is easy to stitch onto existing NodeJS and DOM APIs without requiring a significant rewrite, or requring that users use wholly different methods to use, which your suggestions would likely entail.

2. Resource context passing without implicit scope capturing

What I was trying to get to in this section was that resource context and execution context should be separated. using binds resource to block scope would quickly become not useful enough in many cases. Moving with DisposableStack can help, but it's adding unnecessary complexity making it harder to learn and write safe code. I completely agree with Ben on move being a overly complicated design. You can make it easier if you separated resource context and execution context from the begining.

There is far too much in this section to discuss point by point, so I will summarize my position.

I think we have very different positions on how a resource's lifetime should be scoped. Lifecycle, as you have suggested, is very similar to DisposableStack. You can, if you so chose, pass down a DisposableStack as a parameter to register resources in the same way. Where they differ is in who controls lifetime. With a Lifecycle, the lifetime of the object is completely out of control of the resource consumer (in this case, the author of the Lifecycle instance). While it seems as if the Lifecycle creator owns the lifetime, they in fact give that control over to any function they pass it to as a parameter, as any of those functions could chose to extend() the lifetime beyond that which the Lifecycle creator expects. This design can lead to race conditions, which can be far harder to diagnose than leaks.

On the other hand, DisposableScope ensures the lifetime is bounded in a predictable way, as its lifetime is effectively owned by the creator. That may seem like a weak position, given that any callee could effectively call .move() to change the lifetime as well, except that a DisposableScope is an example of an affine type in a substructural type system. An affine type is one that can be used at most once, and that behavior is represented by .move():

const a = new DisposableStack();
a.use(getResource());

const b = a.move(); // 'b' is now the owner of the resource
const c = b.move(); // 'c' is now the owner of the resource

a.move(); // throws as resources have already been moved out of 'a'.

As a result of this behavior, the lifetime of a DisposableScope can be reliably known. If you are the creator, you can only .move() resources out of it if no one else has. If .move() doesn't throw, the creator knows they still own the lifetime. If it does throw, then the creator no longer knows the lifetime, and an exception is thrown. By the way, Rust's move semantics are another example of affine types, and are the inspiration behind the method name move() as well.

In your response you indicated that you believe "resource context and execution context should be separated", but I disagree. A resource's lifetime should be explicit, bounded, and knowable so as to avoid race conditions and properly order resource dependencies and dependent resource cleanup. If that lifetime does not have a known, explicit boundary, then you cannot make any reliable assumptions about resource state. This introduces data races and deadlocks, which cannot be solved as readily as leaks can. using declarations and DisposableStack both provide explicit lifetime boundaries and ownership guarantees:

  • using declarations provide an explicit lifetime that is both bounded by, and owned by, execution (i.e., lexical block scope).
  • DisposableStack provides a boundary that is bounded by new and [Symbol.dispose](), and maintains ownership via .move()'s "at most once" semantics.

If you want your resource lifetime to be divorced from execution context, we already have FinalizationRegistry. With a Lifecycle, extending a lifetime requires a method call, in an appropriate async context, and may still be even tied to the Lifecycle instance until it is GC'd. With a FinalizationRegistry you can "extend" lifetime by merely holding the resource, which is a far more obvious implication.

As I see it, you have essentially suggested three things:

  1. A way to ensure native handles aren't leaked (a) by mandating that an enclosing resource scope is provided at the time the resource is allocated (b).
  2. A way to make obvious an open native handle in a heap snapshot.
  3. A way to extend the lifetime of a resource beyond its initial lifetime.

I believe that (1.a) and (2) can be handled by hosts implementing appropriate safe native handle wrappers that leverage [Symbol.dispose]() and FinalizationRegistry. Most users will not need this level of complexity.

I believe that (1.b) is an unnecessary guard rail if safe native handles are provided by hosts. It also overcomplicates resource composition scenarios, and potentially requires huge API changes for the entire ecosystem to adopt.

I believe that (3) can be accomplished with DisposableStack, and in a way that guarantees the correct ownership of resource lifetime. I believe the approach offered by Lifecycle to be a very dangerous footgun as it makes lifetime extension non-obvious to the lifetime creator.

I believe that using and DisposableStack do a far better job at avoiding race conditions than Lifecycle does.

Finally, I don't see how using Lifecycle/Lyfe could work in non-async code. The only way to extend lifetime I can see would be to use an async function and resolve() it later (as you suggested), which makes it impossible to catch disposal exceptions in synchronous code. This complexity makes resource composition very difficult to do correctly.

@rixtox
Copy link
Author

rixtox commented Jun 13, 2023

I think we have very different positions on how a resource's lifetime should be scoped. Lifecycle, as you have suggested, is very similar to DisposableStack. You can, if you so chose, pass down a DisposableStack as a parameter to register resources in the same way. Where they differ is in who controls lifetime. With a Lifecycle, the lifetime of the object is completely out of control of the resource consumer (in this case, the author of the Lifecycle instance). While it seems as if the Lifecycle creator owns the lifetime, they in fact give that control over to any function they pass it to as a parameter, as any of those functions could chose to extend() the lifetime beyond that which the Lifecycle creator expects. This design can lead to race conditions, which can be far harder to diagnose than leaks.

I see what you mean, and it's a valid point. I only considered a function declaring a Lifecycle as input parameter for delegating cleanup to the caller, but didn't consider enough what it means by passing a Lifecycle to a function that is consuming a resource bound to that Lifecycle on a detached async context. You are right that it makes it harder to reason about the implication to resource lifetime when being used in that particular case. I don't have a solution to that question yet. So I will withdraw from pushing a change toward passing resource context.

If you want your resource lifetime to be divorced from execution context, we already have FinalizationRegistry. With a Lifecycle, extending a lifetime requires a method call, in an appropriate async context, and may still be even tied to the Lifecycle instance until it is GC'd. With a FinalizationRegistry you can "extend" lifetime by merely holding the resource, which is a far more obvious implication.

I just learned that Node.js FileHandle is already being made to be auto closed on GC but not based on FinalizationRegistry. So they already have the mechanism to do some of those things you suggested.

It would probably be better if there's a standard utility to help non-engine developers to author "garbage-collectable" Disposables. It can reduce the complexity in writing custom FinalizationRegistry integrations if developers want to wrap legacy handles in Disposable and opt-in for GC cleanup. Maybe something like FinalizationRegistry.disposeOnGC(disposable): void? But the question would be how can it know if a resource has been disposed when doing GC? There's no disposable.disposed: boolean field in the Disposable interface.

Probably something like FinalizationRegistry.makeDisposable<T extends object>(object: T, dispose: () => void): Disposable & T and FinalizationRegistry.makeAsyncDisposable<T extends object>(object: T, dispose: () => Promise<void>): AsyncDisposable & T can work better? But it would then raise question on how to use it for classes.

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 13, 2023

It would probably be better if there's a standard utility to help non-engine developers to author "garbage-collectable" Disposables. It can reduce the complexity in writing custom FinalizationRegistry integrations if developers want to wrap legacy handles in Disposable and opt-in for GC cleanup. Maybe something like FinalizationRegistry.disposeOnGC(disposable): void? But the question would be how can it know if a resource has been disposed when doing GC? There's no disposable.disposed: boolean field in the Disposable interface.

This is most likely not an option. When FinalizationRegistry was designed, it was an intended goal to avoid the complexity of resurrection and reachability semantics of GC in other languages, like that of C#. As such, a FinalizationRegistry cannot itself hold a reference to the garbage collected object. This is why the register() method takes three arguments: The target object whose collection will be monitored, a "held value" that can be used to perform actual cleanup when passed to the callback supplied to the registry's constructor, and an optional "unregister token" that can be used to remove the registry entry.

A disposeOnGC method would need to both monitor disposable and use it as the this when performing cleanup. Since neither FinalizationRegistry nor the JavaScript GC has the concept of object resurrection, this would mean that the disposable would be held indefinitely and could never be GC'd.

Probably something like FinalizationRegistry.makeDisposable<T extends object>(object: T, dispose: () => void): Disposable & T and FinalizationRegistry.makeAsyncDisposable<T extends object>(object: T, dispose: () => Promise<void>): AsyncDisposable & T can work better? But it would then raise question on how to use it for classes.

This is closer, but still has issues that prevent it from being a viable approach. It would again need to differentiate between the object whose GC is monitored (i.e., object) and some value that can be used to identify the resource for cleanup (i.e., the "held value"). Accepting a dispose callback that takes no arguments is likely to result in the user introducing a closure that captures object, which would, in turn, result in a cycle that prevents GC. It also seems to work by mutating an existing value, which isn't a common practice in the standard library. Finally, in most cases where you would need this functionality, you are often working with a native handle or file descriptor, which is usually represented by a number, not an object.

This is where the SafeHandle class I suggested earlier would make more sense. However, that suggestion was predicated on it being a host-defined capability rather than something that is part of the standard library. The main reason I would not propose SafeHandle for ECMA262 is that the core language itself has no notion of a "handle", nor does it give any particular meaning to a number. Instead, this meaning is given by a host like NodeJS and its file descriptors, or the DOM and its timers.

In any case, I'm not sure its wholly necessary to design an API for such use cases as native handle wrappers. These can more readily be handled by hosts themselves, who often have no need for a user-facing API as you are proposing. While I don't see it as being necessary for the MVP for this feature, it's definitely something we could pursue as a follow-on proposal given adequate feedback.

@rixtox
Copy link
Author

rixtox commented Jun 13, 2023

It would probably be better if there's a standard utility to help non-engine developers to author "garbage-collectable" Disposables. It can reduce the complexity in writing custom FinalizationRegistry integrations if developers want to wrap legacy handles in Disposable and opt-in for GC cleanup. Maybe something like FinalizationRegistry.disposeOnGC(disposable): void? But the question would be how can it know if a resource has been disposed when doing GC? There's no disposable.disposed: boolean field in the Disposable interface.

This is most likely not an option. When FinalizationRegistry was designed, it was an intended goal to avoid the complexity of resurrection and reachability semantics of GC in other languages, like that of C#. As such, a FinalizationRegistry cannot itself hold a reference to the garbage collected object. This is why the register() method takes three arguments: The target object whose collection will be monitored, a "held value" that can be used to perform actual cleanup when passed to the callback supplied to the registry's constructor, and an optional "unregister token" that can be used to remove the registry entry.

I'm sure many people found that register() semantics confusing, but I understand why it was designed that way. Which is why I would like to see a simple interface just for GC-guarding Disposable instead of letting people to go through the complexity of learning now garbage collector works.

A disposeOnGC method would need to both monitor disposable and use it as the this when performing cleanup. Since neither FinalizationRegistry nor the JavaScript GC has the concept of object resurrection, this would mean that the disposable would be held indefinitely and could never be GC'd.

Probably something like FinalizationRegistry.makeDisposable<T extends object>(object: T, dispose: () => void): Disposable & T and FinalizationRegistry.makeAsyncDisposable<T extends object>(object: T, dispose: () => Promise<void>): AsyncDisposable & T can work better? But it would then raise question on how to use it for classes.

This is closer, but still has issues that prevent it from being a viable approach. It would again need to differentiate between the object whose GC is monitored (i.e., object) and some value that can be used to identify the resource for cleanup (i.e., the "held value"). Accepting a dispose callback that takes no arguments is likely to result in the user introducing a closure that captures object, which would, in turn, result in a cycle that prevents GC. It also seems to work by mutating an existing value, which isn't a common practice in the standard library. Finally, in most cases where you would need this functionality, you are often working with a native handle or file descriptor, which is usually represented by a number, not an object.

Instead of mutating the object passed to makeDisposable, would it work better if it returns a Proxy object? It would also allow capturing the object in the dispose closure.

This is where the SafeHandle class I suggested earlier would make more sense. However, that suggestion was predicated on it being a host-defined capability rather than something that is part of the standard library. The main reason I would not propose SafeHandle for ECMA262 is that the core language itself has no notion of a "handle", nor does it give any particular meaning to a number. Instead, this meaning is given by a host like NodeJS and its file descriptors, or the DOM and its timers.

In any case, I'm not sure its wholly necessary to design an API for such use cases as native handle wrappers. These can more readily be handled by hosts themselves, who often have no need for a user-facing API as you are proposing. While I don't see it as being necessary for the MVP for this feature, it's definitely something we could pursue as a follow-on proposal given adequate feedback.

If we don't have a solution to this problem in this proposal, I would suggest making it clear in the API guideline that wrapping native handles inside Disposables can potentially leak and should be advised against.

@rbuckton
Copy link
Collaborator

rbuckton commented Jun 13, 2023

I'm sure many people found that register() semantics confusing, but I understand why it was designed that way. Which is why I would like to see a simple interface just for GC-guarding Disposable instead of letting people to go through the complexity of learning now garbage collector works.

There is no "simple interface for just GC-guarding" a disposable without completely rearchitecting GC in JavaScript. You cannot wait for GC on an object while still using that object to perform its own cleanup, as that means the object will always have some kind of reference after it has been GC'd, so it can never be GC'd. FinalizationRegistry was the simplest solution to the problem without needing to completely rearchitect GC on every engine.

Instead of mutating the object passed to makeDisposable, would it work better if it returns a Proxy object? It would also allow capturing the object in the dispose closure.

You cannot create a proxy for a primitive value, and a proxy still holds its target strongly, so a Proxy does not solve any of the concerns I presented.

SafeHandle is pretty much the simplest mechanism. You could potentially subclass it, but you would still need to pass a "held value" and a cleanup callback to the super constructor. Overriding [Symbol.dispose] would have no effect on finalization, however.

// node:fs
function closeSync(fd) { ... }

class FileHandle extends SafeHandle {
  constructor(fd) {
    super(fd, closeSync); // finalizer does not strongly hold a `FileHandle`, receives `fd` as input
  }
}

If we don't have a solution to this problem in this proposal, I would suggest making it clear in the API guideline that wrapping native handles inside Disposables can potentially leak and should be advised against.

That is my intent, yes. One actionable item might be to add a non-normative NOTE to the specification that includes this information as an advisory for hosts.

@bathos
Copy link

bathos commented Jun 13, 2023

Aside: This thread has been instructive — the responses have helped me understand the API and the motivations behind its design decisions better than I’d previously been able to work out from the readme/explainer.

@mhofman
Copy link
Member

mhofman commented Jul 7, 2023

The Symbol.create method doesn't have to return a new object, it can return this.

So the RHS expression can be a Creator & Writable & Disposable

@rixtox
Copy link
Author

rixtox commented Jul 7, 2023

@senocular
How does create integrate with existing APIs? And apologies if I'm not correctly understanding the new proposal entirely (long thread) but right now we generally have something to the effect of

const handle = new Writable

and with the current proposal this would become

using handle = new Writable & Disposable

where handle is the same value as before, now just also implementing Disposable. With create it seems instead we'd need/want an intermediary of

using handle = new Creator -> new Writable & Disposable

The Symbol.create contract is optional. Meaning you can adopt it if you don't have compatibility restrictions, like supporting existing API usages. For those use cases, you should just do new Writable & Disposable. The new proposal is to provide an opt-in option for those who authoring new functions and APIs as a safer calling convention that's less likely to be leaked by forgetting using.

@mhofman
The Symbol.create method doesn't have to return a new object, it can return this.

So the RHS expression can be a Creator & Writable & Disposable

I would strongly oppose to returning Creator & Writable & Disposable. It completely defeats the purpose of Creator - forcing the value to be consumed by using. The Creator object should not be usable in any way except being consumed by using.

More concretely, in the current proposal @@dispose is added to Iterator such that I could create a generator that could auto-return() when declaring with using. So given something like...

let locked = false;
function* countdown() {
  if (locked) throw new Error("Simultaneous launches prohibited");
  try {
    locked = true;
    yield* [3, 2, 1, "blastoff!"];
  } finally {
    locked = false;
  }
}

The use of using could help ensure the generator is properly cleaned up between launches.

{
  using launch = countdown();
  launch.next(); // 3
  // bad weather, launch cancelled...
} // return() called unlocking countdown
{
  using launch2 = countdown();
  launch2.next(); // 3 - OK
  // ...
}

Your above code still works under the new proposal since it's compatible with Ron's current proposal. We should not make generator to inherently adopt the Symbol.create contract, because it has compatibility restrictions.

How would create be applied here? Would there always be a need to wrap the generator function in another call that produced a Creator instead of the generator object (DisposableHelper)?

Again, Symbol.create should be an opt-in convention to force the value to be consumed by using. As explored extensively in this long thread, all solutions that can achieve this level of safety requires either breaking the compatibility of existing function signature (the defer contract or Symbol.create contract), or requires implicit scope capturing (AsyncContext), or requires active involvement of the garbage collector (FinalizationRegistry ).

It's just the Symbol.create contract is the solution most likely to be adopted at this stage. But you can definitely build other contracts like the defer contract on top of Symbol.create contract like @mhofman has pointed out.

@mhofman
Copy link
Member

mhofman commented Jul 7, 2023

I would strongly oppose to returning Creator & Writable & Disposable. It completely defeats the purpose of Creator - forcing the value to be consumed by using. The Creator object should not be usable in any way except being consumed by using.

I see the create symbol as a way for the API to choose to enforce using usage if it wants, but it doesn't have to.

This is up to the creator to decide if they're ok with their disposable resources being used with plain const/let instead of using explicit resource management. I expect most existing APIs to use this pattern which allows but does not require the new using syntax from their users (no breaking change), before potentially switching to requiring using (breaking change).

@rixtox
Copy link
Author

rixtox commented Jul 7, 2023

@mhofman
I see the create symbol as a way for the API to choose to enforce using usage if it wants, but it doesn't have to.

Yes, to enforce using, you should make the Creator object not usable, otherwise you are not enforcing it to be consumed by using. Because people can just let/const it and use it normally.

I expect most existing APIs to use this pattern which allows but does not require the new using syntax from their users (no breaking change)

For those cases, they should return Resource & Disposable, not Resource & Creator & Disposable.

@rbuckton
Copy link
Collaborator

rbuckton commented Jul 8, 2023

I've explicitly avoided completely emulating Python's semantics for __enter__/__exit__ because they are designed to enable support for context managers, which are a major step above disposables in complexity. Python's context managers allow you to intercept exceptions during dispose and potentially swallow them, which significantly complicates static analysis for tools as it breaks expectations for control flow analysis. There is some additional discussion on this in #49, if we want to dig further into exploring full Python-like semantics there. While I'm not strongly in favor of adding full context manager-like support to ECMAScript, I think it's feasible to extend using et al to support that in a follow-on proposal.

One issue I have with adapting Python's approach is that shoehorning it into existing APIs in the DOM or NodeJS would mean those existing APIs would have to add something like [Symbol.enter]() { return this } to their return values, which doesn't solve the concern about forgetting to use using over const. Existing APIs can't migrate to a separated enter/exit without introducing major web compatibility issues, so the only way to support separated enter/exit would be to add additional versions of existing APIs tailored for using. However, since those existing APIs will often need to remain for compatibility reasons, there becomes less incentive for developers to migrate.

I have concerns about a declaration form like using x = y not binding x with the value of y. Having it implicitly act like const x = y[Symbol.create]() to get the actual resource would be confusing at best as existing declaration forms just don't do that.

I might be willing to consider an approach where something like Symbol.create would be an opt-in mechanism for a resource producer to make it more likely that using is used, but I don't believe it should be a requirement to use using itself, i.e.:

// error, resource management protocol not implemented
using x = { }; 

// ok. object itself is the resource stored in 'x', and who's @@dispose is tracked.
using x = { [Symbol.dispose]() { } }; 

// ok. @@create is called to get the resource to store in 'x', who's @@dispose is then tracked
using x = { [Symbol.create]() { return { [Symbol.dispose]() { } }; } };

// ok...ish. but 'x' isn't the actual resource, you'd have to call @@create manually
const x = { [Symbol.create]() { return { [Symbol.dispose]() { } }; } };

If most existing API's would have to implement [Symbol.create]() { return this; } if it were mandatory, then making the presence of @@create optional is certainly viable.

Assuming we could get committee buy-in on using x = y initializing x to something other than y in the presence of @@create, then we could advance using as-is and support @@create as a follow-on to this proposal while also further considering the ramifications of a full context manager as is being discussed in #49.

@rbuckton
Copy link
Collaborator

rbuckton commented Jul 8, 2023

That said, I don't think { [Symbol.create]() { return disposable; } } actually matches Python's __enter__/__exit__ very well. Python doesn't call __exit__ on the return value of __enter__, but on the same object that has __enter__.

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Jul 11, 2023

What do you think about something like this to enforce using of using?

class File {
  #fd;
  #entered = false;

  constructor(path) {
    this.#fd = open(path);
  }

  read() {
    if (!this.#entered) throw new Error("You must setup `using` before using the File");
    return read(this.#fd);
  }

  close() {
    close(this.#fd);
  }

  get [Symbol.dispose]() {
    this.#entered = true;
    return this.close;
  }
}
try {
  let file1 = new File("a");
  file1.read(); // throws
} catch {}

{
  using file2 = new File("b");
  file2.read();
} // closes the file

@mhofman
Copy link
Member

mhofman commented Jul 11, 2023

Make [Symbol.dispose] a getter, why did I not think of that! I requires explicit guards on all methods though, so not the most ergonomic for the resource's internal implementation, but definitely possible with the proposal as written today.

@rixtox
Copy link
Author

rixtox commented Jul 11, 2023

@nicolo-ribaudo
Your example breaks the legacy usage:

const file = new File("a");
try {
  file.read(); // should be allowed but you would throw
} finally {
   file.close();
}

It makes the API not backward compatible.

Even without the requirement of backward compatibility, making [Symbol.dispose] a getter may work for a single resource, but the enforcement cannot be inherited by a composed resource, e.g. composed by a DisposableStack. It's more of a hack than a convention or contract. And accessing the [Symbol.dispose] property is too weak an indication to tell if the dispose method will actually be called or not.

@nicolo-ribaudo
Copy link
Member

nicolo-ribaudo commented Jul 12, 2023

@rixtox How can you at the same time force someone to use new File with using so that it's guaranteed that it will be disposed, and not break compatibility with existing code that does not use using?

In the example above you were using a createHandle function, which is a new API compared to the "previous one" that you might not want to break backward compatibility to. In that case, please read my example above as class NewFile extends File instead of modifying an existing File class.

@rbuckton
Copy link
Collaborator

@nicolo-ribaudo Your example breaks the legacy usage:

const file = new File("a");
try {
  file.read(); // should be allowed but you would throw
} finally {
   file.close();
}

It makes the API not backward compatible.

Even without the requirement of backward compatibility, making [Symbol.dispose] a getter may work for a single resource, but the enforcement cannot be inherited by a composed resource, e.g. composed by a DisposableStack. It's more of a hack than a convention or contract. And accessing the [Symbol.dispose] property is too weak an indication to tell if the dispose method will actually be called or not.

I don't think we would require that [Symbol.dispose]() be a getter, only that you could implement it thus if necessary. A legacy API or lightweight disposable could still just use a method. As far as using a getter-based resource imperatively outside of using and DisposableStack, one could do this:

const file = new File(...), dispose = file[Symbol.dispose];
try {
  file.read();
}
finally {
  dispose.call(file);
}

@rbuckton
Copy link
Collaborator

Make [Symbol.dispose] a getter, why did I not think of that! I requires explicit guards on all methods though, so not the most ergonomic for the resource's internal implementation, but definitely possible with the proposal as written today.

A well-written disposable resource should guard against its methods being used in a manner inconsistent with its state in any case. In the File case, above, attempting to read() on a closed file descriptor will throw even without the !this.#entered check, assuming the descriptor itself isn't reused.

@rbuckton
Copy link
Collaborator

[...] making [Symbol.dispose] a getter may work for a single resource, but the enforcement cannot be inherited by a composed resource, e.g. composed by a DisposableStack.

Once the resource has left your control, the consumer will always have the flexibility to use it as they see fit, and adding it to a DisposableStack is a perfectly valid use case, as it was designed to be. Though I'll be clear, I would be opposed to enforcing such a mechanism on DisposableStack itself as it would overcomplicate one if its main use cases, which is using it in a class per the example in the explainer.

And accessing the [Symbol.dispose] property is too weak an indication to tell if the dispose method will actually be called or not.

Maybe, but not much weaker than a [Symbol.enter]() would be. The only caveat being that using if (obj[Symbol.dispose]) to test for a disposable could trigger a false positive, while if (Symbol.dispose in obj) would not. Aside from that, there's really no other reason why a user might access obj[Symbol.dispose] prior to interacting with the object in any other way.

@rixtox
Copy link
Author

rixtox commented Jul 12, 2023

@nicolo-ribaudo
How can you at the same time force someone to use new File with using so that it's guaranteed that it will be disposed, and not break compatibility with existing code that does not use using?

You can't. My point is if it would already make the returned object not backward compatible, why not just implement or adopt a proper calling convention like [Symbol.create] or defer to enforce it. They are more reliable and explicit than a hack of using property getter.

@rbuckton
Though I'll be clear, I would be opposed to enforcing such a mechanism on DisposableStack itself as it would overcomplicate one if its main use cases, which is using it in a class per the example in the explainer.

Can you explain how it would overcomplicate DisposableStack's use case in class? I recall you mentioned DisposableStack should be an affine type and should always be consumed by using. And it's indeed the case in all examples in the explainer. It makes sense to me if there's a way to enforce using with Disposable (not necessarily the accessor approach), DisposableStack should adopt that mechanism.

And accessing the [Symbol.dispose] property is too weak an indication to tell if the dispose method will actually be called or not.

Maybe, but not much weaker than a [Symbol.enter]() would be. The only caveat being that using if (obj[Symbol.dispose]) to test for a disposable could trigger a false positive, while if (Symbol.dispose in obj) would not. Aside from that, there's really no other reason why a user might access obj[Symbol.dispose] prior to interacting with the object in any other way.

There's hardly any operations that can have a side effect of actually calling [Symbol.enter]() other than being used by using or equivalent methods. It's indeed a calling convention. But there can be operations having side effects of accessing [Symbol.dispose] that are not necessarily actually scheduling disposal for the resource. Checking / asserting on obj[Symbol.dispose] is one of those operations as you pointed out. It would basically disarm the enforcement mechanism. Comparing to a check / asset on obj[Symbol.enter] would not cause the enforcement to be disarmed.

@rbuckton
Copy link
Collaborator

rbuckton commented Jul 12, 2023

Can you explain how it would overcomplicate DisposableStack's use case in class? I recall you mentioned DisposableStack should be an affine type and should always be consumed by using. And it's indeed the case in all examples in the explainer. It makes sense to me if there's a way to enforce using with Disposable (not necessarily the accessor approach), DisposableStack should adopt that mechanism.

The example I linked explicitly does not use using for one case of DisposableStack as it contains the resources to be disposed when the class itself is disposed. The result of stack.move() is itself another DisposableStack. Requiring it go through using would easily break that example.

@zaygraveyard
Copy link

If stack.move() pre-disarms the new DisposableStack before returning it (by calling its [Symbol.enter]() or the accessing its [Symbol.dispose]), it would not require going through using.

Am I missing something?

@senocular
Copy link

The example I linked [link] explicitly does not use using for one case of DisposableStack as it contains the resources to be disposed when the class itself is disposed.

@rbuckton is this in reference to

this.#disposables = stack.move();

or

const disposables = this.#disposables;

?

I understand why stack.move() is not using using but I'm not sure I understand why const disposables is not.

@rixtox
Copy link
Author

rixtox commented Jul 12, 2023

@rbuckton
The example I linked explicitly does not use using for one case of DisposableStack as it contains the resources to be disposed when the class itself is disposed. The result of stack.move() is itself another DisposableStack. Requiring it go through using would easily break that example.

I have the same question as @senocular. Your example can be written as:

class AsyncPluginHost {
  // ...
  static async create() {
    // ...
    await using stack = new AsyncDisposableStack();
    host.#disposables = stack.move();
    // ...
  }
  // ...
  async [Symbol.asyncDispose]() {
    if (!this.#disposed) {
      this.#disposed = true;
      await using disposables = this.#disposables;
      this.#socket = undefined;
      this.#channel = undefined;
      this.#disposables = undefined;
    }
  }
}

It doesn't overcomplicate anything. And it works even if AsyncDisposableStack is enforced to be consumed by using, e.g. function AsyncDisposableStack(): Creator<AsyncDisposableStack> and also AsyncDisposableStack.move(): Creator<AsyncDisposableStack>.

@rbuckton
Copy link
Collaborator

@rbuckton
The example I linked explicitly does not use using for one case of DisposableStack as it contains the resources to be disposed when the class itself is disposed. The result of stack.move() is itself another DisposableStack. Requiring it go through using would easily break that example.

I have the same question as @senocular. Your example can be written as:

[...]

It doesn't overcomplicate anything. And it works even if AsyncDisposableStack is enforced to be consumed by using, e.g. function AsyncDisposableStack(): Creator<AsyncDisposableStack> and also AsyncDisposableStack.move(): Creator<AsyncDisposableStack>.

What is unfortunate here is that the user needs to introduce an otherwise unused binding disposables, though this was a known implication of dropping the void-binding syntax from this proposal.

This does make it harder for users to leverage DisposableStack in an imperative fashion, especially when the resources in question aren't backed by handles that might leak.

@rbuckton
Copy link
Collaborator

I'd also like to point out that Python's ExitStack uses the default implementation of __enter__:

https://github.com/python/cpython/blob/a276ce4505abc72dc3cf3a3a906ad65ef1306203/Lib/contextlib.py#L543-L544

def __enter__(self):
    return self

and as such does not require the use of with.

@rixtox
Copy link
Author

rixtox commented Jul 13, 2023

@rbuckton
The example I linked explicitly does not use using for one case of DisposableStack as it contains the resources to be disposed when the class itself is disposed. The result of stack.move() is itself another DisposableStack. Requiring it go through using would easily break that example.

I have the same question as @senocular. Your example can be written as:
[...]
It doesn't overcomplicate anything. And it works even if AsyncDisposableStack is enforced to be consumed by using, e.g. function AsyncDisposableStack(): Creator<AsyncDisposableStack> and also AsyncDisposableStack.move(): Creator<AsyncDisposableStack>.

What is unfortunate here is that the user needs to introduce an otherwise unused binding disposables, though this was a known implication of dropping the void-binding syntax from this proposal.

That's pretty common for RAII. e.g. in C++ we have std::lock_guard<std::mutex> guard(some_mutex);

This does make it harder for users to leverage DisposableStack in an imperative fashion, especially when the resources in question aren't backed by handles that might leak.

Using DisposableStack in imperative fashion defeats the propose of move() and is more error prone. It should be recommended against. Are there any legitimate use cases of such that cannot be done in the affine fashion?

@rbuckton
Copy link
Collaborator

A quick note: I've filed #195 to discuss the possible adoption of an optional [Symbol.enter]() method, in lieu of full Python-like Context Manager support mentioned in #49.

@rbuckton
Copy link
Collaborator

Using DisposableStack in imperative fashion defeats the propose of move() and is more error prone. It should be recommended against. Are there any legitimate use cases of such that cannot be done in the affine fashion?

A recommendation is fine, but I don't think mandating this in the API is a necessity. A user may need to employ more complicated cleanup in a finally in addition to disposing of a DisposableStack, and mandating using in that case could make such imperative cleanup far more complex than it needs to be. In addition, mandating a [Symbol.enter]() on DisposabelStack would make it far harder to subclass. I have a use case for subclassing related to VS Code that I think is important and would not like it to become arbitrarily more complex.

I do think the use of using purely to handle cleanup in the compositional case is a bit more awkward to read in a code review vs. the imperative approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests