Improve error handling for DispatchedTask being in invalid state: #3991

Open
werner77 opened this issue Dec 18, 2023 · 6 comments
Labels
docs KDoc and API reference enhancement native

Comments

@werner77

Describe the bug

The primary crash in our Kotlin Multiplatform project is related to the following force cast in DispatchedTask.kt:

final override fun run() {
    assert { resumeMode != MODE_UNINITIALIZED } // should have been set before dispatching
    val taskContext = this.taskContext
    var fatalException: Throwable? = null
    try {
        val delegate = delegate as DispatchedContinuation<T> // CRASH here: delegate is CompletedContinuation
        val continuation = delegate.continuation
        withContinuationContext(continuation, delegate.countOrElement) {
            // ... (excerpt ends here)

In Crashlytics, the exception is shown as follows:

Fatal Exception: kotlinx.coroutines.CoroutinesInternalError
Fatal exception in coroutines machinery for DispatchedContinuation[MainDispatcher, Continuation @ 0]. Please read KDoc to 'handleFatalException' method and report this incident to maintainers Caused by: kotlin.ClassCastException: class kotlin.coroutines.native.internal.CompletedContinuation cannot be cast to class kotlinx.coroutines.internal.DispatchedContinuation

Now I understand this crash may be caused by invalid usage of the API; however, a ClassCastException should never occur. It is also very hard to debug this way. So far, I have no clue whether we cause this error ourselves or whether one of the libraries (e.g. Ktor) causes it.

I propose that, at the point where the invalid call is made, an IllegalStateException or similar exception is thrown that describes the invalid usage of the API, so the code can be fixed at that point.
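
To illustrate the kind of check proposed here, below is a minimal sketch, not the actual kotlinx.coroutines code. The types `Cont`, `DispatchedCont`, `CompletedCont` and the function `requireDispatched` are hypothetical stand-ins invented for this example; the only point is to show a checked cast that fails with a descriptive IllegalStateException instead of a ClassCastException.

```kotlin
// Hypothetical stand-ins; these are NOT the kotlinx.coroutines or stdlib classes.
interface Cont
class DispatchedCont(val name: String) : Cont
object CompletedCont : Cont

// Checked cast: report the unexpected state with a descriptive message
// instead of letting a force cast throw a ClassCastException.
fun requireDispatched(delegate: Cont): DispatchedCont =
    delegate as? DispatchedCont
        ?: throw IllegalStateException(
            "Expected a DispatchedCont but found ${delegate::class.simpleName}; " +
                "the continuation may have been released or resumed more than once"
        )

fun main() {
    // Valid state: the cast succeeds and the value is returned.
    println(requireDispatched(DispatchedCont("ok")).name)

    // Invalid state: the failure is an IllegalStateException with a readable message.
    val failure = runCatching { requireDispatched(CompletedCont) }.exceptionOrNull()
    println(failure)
}
```

Whether such a check belongs at the cast site or earlier, at the call that puts the continuation into the invalid state, is a design question this sketch does not answer.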

Provide a Reproducer

Unfortunately, I don't know exactly how to reproduce this problem. I don't know which call causes this error.

@werner77 werner77 added the bug label Dec 18, 2023
@dkhalanskyjb dkhalanskyjb added the docs KDoc and API reference label Dec 18, 2023
@werner77
Author

werner77 commented Dec 22, 2023

I didn't mention it, but this happens only on iOS, Kotlin Native. The complete stack trace is here:

Fatal Exception: kotlinx.coroutines.CoroutinesInternalError
0  MeasurementKit                 0x217cb0 kfun:kotlinx.coroutines.DispatchedTask#handleFatalException(kotlin.Throwable?;kotlin.Throwable?){} + 55 (DispatchedTask.kt:55)
1  MeasurementKit                 0x217a30 kfun:kotlinx.coroutines.DispatchedTask#run(){} + 119 (DispatchedTask.kt:119)
2  MeasurementKit                 0x2259c4 kfun:kotlinx.coroutines.DarwinMainDispatcher.$dispatch$lambda$0$FUNCTION_REFERENCE$1.$<bridge-UNN>invoke(){}#internal + 43 (Dispatchers.kt:43)
3  MeasurementKit                 0x3cfce8 kfun:kotlin.Function0#invoke(){}1:0-trampoline + 1 ([K][Suspend]Functions:1)
4  MeasurementKit                 0x960bec ___6f72672e6a6574627261696e732e6b6f746c696e783a6b6f746c696e782d636f726f7574696e65732d636f72652f6f70742f6275696c644167656e742f776f726b2f343465633665383530643563363366302f6b6f746c696e782d636f726f7574696e65732d636f72652f6e617469766544617277696e2f7372632f44697370617463686572732e6b74_knbridge7_block_invoke
5  libdispatch.dylib              0x26a8 _dispatch_call_block_and_release
6  libdispatch.dylib              0x4300 _dispatch_client_callout
7  libdispatch.dylib              0x12998 _dispatch_main_queue_drain
8  libdispatch.dylib              0x125b0 _dispatch_main_queue_callback_4CF
9  CoreFoundation                 0x3700c __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
10 CoreFoundation                 0x33d18 __CFRunLoopRun
11 CoreFoundation                 0x33468 CFRunLoopRunSpecific
12 GraphicsServices               0x34f8 GSEventRunModal
13 UIKitCore                      0x22d004 -[UIApplication _run]
14 UIKitCore                      0x22c640 UIApplicationMain
15 Application                    0x63f8 main + 14 (main.swift:14)

@werner77
Author

To add to the above:

I see this code in the Kotlin stdlib, ContinuationImpl.kt:

internal abstract class ContinuationImpl(
    completion: Continuation<Any?>?,
    private val _context: CoroutineContext?
) : BaseContinuationImpl(completion) {
    constructor(completion: Continuation<Any?>?) : this(completion, completion?.context)

    public override val context: CoroutineContext
        get() = _context!! 

    private var intercepted: Continuation<Any?>? = null

    public fun intercepted(): Continuation<Any?> =
        intercepted
            ?: (context[ContinuationInterceptor]?.interceptContinuation(this) ?: this)
                .also { intercepted = it }

    protected override fun releaseIntercepted() {
        val intercepted = intercepted
        if (intercepted != null && intercepted !== this) {
            context[ContinuationInterceptor]!!.releaseInterceptedContinuation(intercepted)
        }
        this.intercepted = CompletedContinuation // just in case
    }
}

If releaseIntercepted() were called more than once for some reason, a ClassCastException would occur: after the first call, intercepted holds CompletedContinuation, which is neither null nor this, so the second call passes it to releaseInterceptedContinuation (CoroutineDispatcher.kt), whose implementation force-casts it to DispatchedContinuation.
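
The suspected sequence can be imitated with a small self-contained model. This is only a sketch under the assumption described above; `Cont`, `DispatchedCont`, `CompletedCont`, `Interceptor` and `Frame` are hypothetical stand-ins that copy the shape of the quoted code, not the real stdlib or kotlinx.coroutines classes.

```kotlin
// Hypothetical stand-ins that mimic the shape of the quoted code.
interface Cont
class DispatchedCont : Cont {
    var released = false
}
object CompletedCont : Cont

class Interceptor {
    // Mirrors the force cast described above.
    fun release(continuation: Cont) {
        (continuation as DispatchedCont).released = true
    }
}

class Frame(private val interceptor: Interceptor) : Cont {
    private var intercepted: Cont? = DispatchedCont()

    fun releaseIntercepted() {
        val current = intercepted
        if (current != null && current !== this) {
            interceptor.release(current)
        }
        intercepted = CompletedCont // sentinel, as in the quoted code
    }
}

fun main() {
    val frame = Frame(Interceptor())
    frame.releaseIntercepted() // first call: releases the DispatchedCont

    // Second call: the sentinel is neither null nor `this`, so it reaches the force cast.
    val second = runCatching { frame.releaseIntercepted() }.exceptionOrNull()
    println(second is ClassCastException)
}
```

The model says nothing about what would trigger a second release in the real library; it only shows that the sentinel and the force cast are incompatible if a second release ever happens.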

@qwwdfsad
Member

qwwdfsad commented Jan 23, 2024

> Now I understand this crash may be caused by invalid usage of the API

We typically don't allow CoroutinesInternalError for invalid API usages; in that case it is indeed an Illegal*Exception.
This particular exception (along with the CCE in releaseIntercepted) indicates that there is a non-trivial and obscure bug hiding somewhere, either in our implementation or in the compiler (given that it shows up on both platforms, it is more likely our implementation).

So, while there might be some workarounds, the only reasonable action here is for us to fix the problem. If you have any pointers regarding its reproducibility, please don't hesitate to share.

@werner77
Author

We made a fork in which we ensure that the continuation is always of type DispatchedContinuation when the class is created (no forced runtime casts), which seems to work. This way we were able to find the root cause, which seems to be a suspending function, declared in an interface, that was implemented on the iOS side. We changed this function into an ordinary function with a completion handler, and the crash went away.

See the attached diff for the changes we made in our fork.
diff.txt
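
For readers who want to try the same workaround, here is a minimal sketch of what a completion-handler version could look like on the Kotlin side. It is not taken from the attached diff. The interface `MeasurementSource`, the adapter `loadSuspending` and the helper `runNow` are hypothetical names, and the guard against repeated completion calls is an assumption about what a defensive adapter might do. Only stdlib `kotlin.coroutines` functions are used.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-style interface that the iOS side would implement
// instead of a suspending function.
interface MeasurementSource {
    fun load(completion: (value: String?, error: Throwable?) -> Unit)
}

// Kotlin-side adapter: suspends until the first completion call and ignores repeats.
suspend fun MeasurementSource.loadSuspending(): String = suspendCoroutine { cont ->
    var completed = false // not thread-safe; assumes callbacks arrive on one thread
    load { value, error ->
        if (completed) return@load
        completed = true
        when {
            error != null -> cont.resumeWithException(error)
            value != null -> cont.resume(value)
            else -> cont.resumeWithException(
                IllegalStateException("completion called with neither value nor error")
            )
        }
    }
}

// Demo helper: runs a suspending block that is expected to complete synchronously.
fun <T> runNow(block: suspend () -> T): Result<T> {
    var outcome: Result<T>? = null
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return outcome ?: error("block did not complete synchronously")
}

fun main() {
    val noisy = object : MeasurementSource {
        override fun load(completion: (String?, Throwable?) -> Unit) {
            completion("42", null)
            completion("43", null) // second call is ignored by the adapter
        }
    }
    println(runNow { noisy.loadSuspending() }.getOrNull())
}
```

In a real project the adapter would more likely use `suspendCancellableCoroutine` from kotlinx.coroutines and a thread-safe flag; the stdlib version is used here only to keep the example free of dependencies.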

@qwwdfsad
Member

That's a new usage pattern for us, thanks!
Maybe you could share a reproducer with the iOS code that triggered this issue?

@werner77
Author

  • Create an interface in Kotlin Multiplatform code with a suspending function.
  • Implement the interface on the iOS side and either call the completion with an error or call the completion multiple times. I am not sure which of the two causes it.
  • Run a test that calls this suspending function from a Kotlin Multiplatform test case.
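
The Kotlin side of these steps might look roughly like the sketch below. All names (`TokenProvider`, `FailingProvider`, `callProvider`) are hypothetical. The Kotlin class `FailingProvider` only stands in for the Swift implementation from the second step, so this code shows the shape of the setup and is not expected to reproduce the crash by itself. A real test would also run on a dispatcher with a kotlinx.coroutines test runner, which is left out here to keep the example stdlib-only.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Step 1: shared interface with a suspending function (hypothetical name).
interface TokenProvider {
    suspend fun fetchToken(): String
}

// Step 2 stand-in: in the reported scenario this implementation lives in Swift
// and completes with an error (or possibly completes more than once).
class FailingProvider : TokenProvider {
    override suspend fun fetchToken(): String =
        throw IllegalStateException("simulated failure from the platform side")
}

// Step 3: call the suspending function from Kotlin and capture the outcome.
fun callProvider(provider: TokenProvider): Result<String> {
    var outcome: Result<String>? = null
    val block: suspend () -> String = { provider.fetchToken() }
    block.startCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })
    return outcome ?: error("call did not complete synchronously")
}

fun main() {
    println(callProvider(FailingProvider()).exceptionOrNull()?.message)
}
```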
