NullPointerException when setting StateFlow value #3820

Closed
hakonschia opened this issue Jul 21, 2023 · 35 comments

@hakonschia

Describe the bug

After upgrading from version 1.6.4 to 1.7.1 (we have since bumped to 1.7.2), we started seeing NullPointerException crashes when updating a StateFlow value. I am not able to reproduce this locally; we are only seeing it through Firebase reports.

This is happening on Android with org.jetbrains.kotlinx:kotlinx-coroutines-android. It's happening on many different devices and Android versions, so it isn't specific to one manufacturer/version.

Fatal Exception: java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
       at java.lang.Class.isAssignableFrom(Class.java:589)
       at java.lang.Class.isInstance(Class.java:542)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:389)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:447)
       at kotlinx.coroutines.flow.StateFlowSlot.makePending(StateFlowSlot.java:59)
       at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlow.kt:349)
       at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlow.kt:316)

Provide a Reproducer

We have a lot of flows, and only one specific one is crashing. The only unique thing about the flow that is causing crashes is that it holds an Enum, but I don't know if that's causing it.

The StateFlow is nullable, but the value set on it after initialization is never null. The code below isn't our production code, but it shows what our code looks like:

enum class SomeEnum {
    Apple,
    Orange
}

class Class {
    val flow: StateFlow<SomeEnum?> get() = _mutableFlow
    private val _mutableFlow: MutableStateFlow<SomeEnum?> = MutableStateFlow(null)
    
    fun updateValue(value: SomeEnum) {
        _mutableFlow.value = value
    }
}
@hakonschia hakonschia added the bug label Jul 21, 2023
@vlad-kasatkin

vlad-kasatkin commented Jul 27, 2023

Seeing the same crash occasionally after an upgrade to 1.7.1.

In our case the flow receives a Kotlin object that extends a sealed class. The flow is updated from a dedicated dispatcher created from Executors.newSingleThreadExecutor().

private sealed class Status {
    data class WithSomeData(val someData: Map<String, String>) : Status()
    object NoData : Status()
    object Initial : Status()
}

private val status = MutableStateFlow<Status>(Status.Initial)

fun updateStateFlow(coroutineScope: CoroutineScope) {
    coroutineScope.launch {
        someOtherFlow
            .onEach { status.emit(Status.NoData) } // <= Crashes on this line
            .flowOn(Executors.newSingleThreadExecutor().asCoroutineDispatcher())
            .collect()
    }
}
java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
        at java.lang.Class.isAssignableFrom(Class.java:589)
        at java.lang.Class.isInstance(Class.java:542)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:421)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:479)
        at kotlinx.coroutines.flow.StateFlowSlot.makePending(Unknown:2)
        at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlow.kt:349)
        at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlow.kt:316)
        at kotlinx.coroutines.flow.StateFlowImpl.emit(StateFlow.kt:373)

The same StateFlow is updated with WithSomeData in another code path, but from the main thread, and that code path has not shown up in the stack traces.

@vlad-kasatkin

The last change to StateFlow.kt was in PR #3686, which would have shipped in the 1.7.0 release. We also upgraded from 1.6.4, so the issue was likely introduced in 1.7.0, unless that PR is unrelated.

@mshdabiola

Same exception after I upgraded to 1.7.2:

Fatal Exception: java.lang.NullPointerException:
at kotlinx.coroutines.flow.FlowKt__ChannelsKt.access$emitAllImpl$FlowKt__ChannelsKt(FlowKt__Channels.kt:12)
at kotlinx.coroutines.flow.FlowKt__ChannelsKt$emitAllImpl$1.invokeSuspend(Channels.kt:12)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:1)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:1)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.java:92)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:92)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:92)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:92)

@vlad-kasatkin

@mshdabiola This does not look the same at all, did you mean to paste a different trace?

@vlad-kasatkin

vlad-kasatkin commented Jul 29, 2023

We have received another instance of this crash while updating a different StateFlow. This case is much simpler, but it still runs on a background thread (the IO dispatcher pool in this case).

class MyServiceClass(private val originStateFlow: StateFlow<Boolean>) {
    val ioContext = Dispatchers.IO

    val stateFlowToUpdate = MutableStateFlow(false)
    val coroutineScope = CoroutineScope(ioContext)

    fun updateFlow() {
        coroutineScope.launch {
            originStateFlow
                .onEach(stateFlowToUpdate::value::set)
                .collect()
        }
    }
}

This instance of the crash is much less frequent than the one I originally reported (that one has been picking up in occurrences lately), but we have detailed analytics on how often this new crash code path runs. Out of 22_789_644 invocations it has crashed once, which suggests to me that the issue is more widespread and can affect even simpler cases.

Full stack trace with more details about how the StateFlow is updated.

The crash occurred on the first and only update to the StateFlow; there were no other updates.

java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
        at java.lang.Class.isAssignableFrom(Class.java:589)
        at java.lang.Class.isInstance(Class.java:542)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:421)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:479)
        at kotlinx.coroutines.flow.StateFlowSlot.makePending(Unknown:2)
        at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlow.kt:349)
        at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlow.kt:316)
        at com.example.package.MyServiceClass$updateFlow$2$2.set(MyServiceClass:55)
        at com.example.package.MyServiceClass$updateFlow$2.invokeSuspend$set(MyServiceClass:55)
        at com.example.package.MyServiceClass$updateFlow$2.access$invokeSuspend$set(Unknown)
        at com.example.package.MyServiceClass$updateFlow$2$1.invoke(MyServiceClass:55)
        at com.example.package.MyServiceClass$updateFlow$2$1.invoke(MyServiceClass:55)
        at com.example.package.MyServiceClass$updateFlow$2$1.invoke(MyServiceClass:55)
        at com.example.package.MyServiceClass$updateFlow$2$1.invoke(MyServiceClass:55)
        at kotlinx.coroutines.flow.FlowKt__TransformKt$onEach$$inlined$unsafeTransform$1$2.emit(Emitters.kt:223)
        at com.squareup.coroutines.util.MapStateFlowKt$mapStateFlow$1$collect$2.emit(MapStateFlow.kt:46)
        at kotlinx.coroutines.flow.StateFlowImpl.collect(StateFlow.kt:396)
        at com.squareup.coroutines.util.MapStateFlowKt$mapStateFlow$1.collect(MapStateFlow.kt:42)
        at kotlinx.coroutines.flow.FlowKt__TransformKt$onEach$$inlined$unsafeTransform$1.collect(SafeCollector.common.kt:113)
        at kotlinx.coroutines.flow.FlowKt__CollectKt.collect(Collect.kt:30)
        at kotlinx.coroutines.flow.FlowKt.collect(FlowKt:1)
        at com.example.package.MyServiceClass$updateFlow$2.invokeSuspend(MyServiceClass:56)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
        at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:115)
        at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:100)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:584)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:793)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:697)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:684)

I have been trying to reproduce with the simplified version (this new crash), but it's still very hard to hit the exact race here. Does this ring any bells at all, @qwwdfsad?

@vlad-kasatkin

1.7.0 is also the version that included a bump of atomicfu from 0.17.3 to 0.20.2, which is a more likely culprit here.

@hakonschia
Author

This has now happened on another StateFlow, one that holds a nullable data class.

@vlad-kasatkin

@hakonschia All our crashes are occurring when the MutableStateFlow is updated from the background thread close to the process instantiation. Are you seeing a similar pattern?

@hakonschia
Author

The original crash is happening on the main thread, and shouldn't be very close to process instantiation. The second is called from the IO dispatcher, but I don't think it should be close to process instantiation either. The second is also a lot rarer, only happening 3 times compared to 80 for the original.

@vlad-kasatkin

vlad-kasatkin commented Aug 14, 2023

We have now surpassed 100 crashes. A few additional details:

  • It's always Android 13 and Android 12 that are crashing.
  • It's not manufacturer specific - we see a wide range of devices and manufacturers.
  • Seems to be rare enough for the same user to not crash more than once ever, however this crash is one of our top crashes now.

@hakonschia
Author

It's always Android 13 and Android 12 that are crashing.

We are also seeing this on Android 11, but not on lower versions

Seems to be rare enough for the same user to not crash more than once ever

We have seen some repeats, but for the most part it has only happened once per user (100 crashes for 98 users).

Our crashes are mostly on Android TV and lower end phones/tablets

@vlad-kasatkin

Got a few reports from a third-party library as well:

java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
        at java.lang.Class.isAssignableFrom(Class.java:589)
        at java.lang.Class.isInstance(Class.java:542)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:389)
        at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:447)
        at kotlinx.coroutines.flow.StateFlowSlot.makePending(StateFlowSlot:59)
        at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlowImpl:349)
        at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlowImpl:316)
        at kotlinx.coroutines.flow.StateFlowImpl.emit(StateFlowImpl:373)
        at app.cash.sqldelight.coroutines.FlowQuery$mapToList$$inlined$map$1$2.emit(FlowQuery:223)
        at app.cash.sqldelight.coroutines.FlowQuery$mapToList$$inlined$map$1$2.emit$bridge(FlowQuery:181)
        at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke(SafeCollectorKt:15)
        at kotlinx.coroutines.flow.internal.SafeCollectorKt$emitFun$1.invoke(SafeCollectorKt:15)
        at kotlinx.coroutines.flow.internal.SafeCollector.emit(SafeCollector:87)
        at kotlinx.coroutines.flow.internal.SafeCollector.emit(SafeCollector:66)
        at app.cash.sqldelight.coroutines.FlowQuery$asFlow$1.invokeSuspend(FlowQuery:48)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(BaseContinuationImpl:33)
        at kotlinx.coroutines.DispatchedTask.run(DispatchedTask:108)
        at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher:115)
        at com.google.mlkit.common.sdkinternal.zza.run$bridge(zza:139)
        at kotlinx.coroutines.scheduling.TaskImpl.run(TaskImpl:103)
        at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler:584)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler:793)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler:697)
        at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler:684)

@qwwdfsad qwwdfsad self-assigned this Sep 11, 2023
@qwwdfsad
Member

It seems like a long-standing bug in Android 12/13, as a quick search for Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference shows that a whole plethora of APIs is subject to this bug.

It seems to be a bug on the edge of the DEX compiler, optimizer and specific runtime, which is almost impossible to reason about without a reproducer and access to the actual ART sources.

The only change in atomicfu I see that might have affected the reproducibility of the bug is modifiers on the corresponding field updater.

In 1.6.4 it was static final, but starting from 1.7.0 it's more precise: it's synthetic private static final; probably private is what triggers more aggressive optimizations, but I have no means to verify this hypothesis.
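
For anyone who wants to check which modifiers a field updater carries in a particular build, a minimal reflection sketch follows. The `Holder` class and its two fields are hypothetical stand-ins, not the real StateFlowSlot; pointing `describe` at a class from an actual coroutines artifact would depend on that build's class and field names, which are not assumed here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class FieldModifierInspector {
    // Hypothetical class with two updaters that differ only in visibility.
    public static class Holder {
        volatile Object state;

        static final AtomicReferenceFieldUpdater<Holder, Object> PACKAGE_FU =
                AtomicReferenceFieldUpdater.newUpdater(Holder.class, Object.class, "state");

        private static final AtomicReferenceFieldUpdater<Holder, Object> PRIVATE_FU =
                AtomicReferenceFieldUpdater.newUpdater(Holder.class, Object.class, "state");
    }

    // Returns the declared modifiers of a field, prefixed with "synthetic" when
    // the compiler marked the field as synthetic.
    public static String describe(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            String modifiers = Modifier.toString(field.getModifiers());
            return field.isSynthetic() ? "synthetic " + modifiers : modifiers;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field named " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Holder.class, "PACKAGE_FU"));
        System.out.println(describe(Holder.class, "PRIVATE_FU"));
    }
}
```

This only reports what the class file declares; it says nothing about how a given runtime optimizes access to the field.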

@qwwdfsad
Member

qwwdfsad commented Sep 11, 2023

I'm also a bit confused by the fact that it only reproduces in the makePending function, which seems to have a very specific bytecode/dex pattern.

The biggest issue here is ensuring the problem is gone -- poking at the implementation of makePending and waiting for a regular kotlinx.coroutines release might take way too much time. Reverting changes in atomicfu, unfortunately, is also not really an option -- it's now shipped with the Kotlin compiler, meaning it has a very restrictive cadence.

If anybody knows the way to reproduce it on their CI, device farm, and/or local environments, I can arrange dev-builds with changes that presumably help.

@hakonschia
Author

I'm not really sure what has changed, but this has stopped happening so it is no longer an issue for us.

@JosueB

JosueB commented Nov 6, 2023

I am still seeing this issue on version 1.6.0
Same Android OS as mentioned above (12, 13, and a few on 14)

@amalaev

amalaev commented Nov 10, 2023

We also started seeing this crash all of a sudden. We are on coroutines version 1.7.2. Our class is almost the same as the one the author posted at the beginning, but we have a slightly different stack trace.

It is updated multiple times from coroutines with different scopes.

Fatal Exception: java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
       at java.lang.Class.isAssignableFrom(Class.java:824)
       at java.lang.Class.isInstance(Class.java:774)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:421)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:479)
       at kotlinx.coroutines.flow.StateFlowSlot.g(StateFlow.kt:3)
       at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlow.kt:349)
       at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlow.kt:316)
       at kotlinx.coroutines.flow.StateFlowImpl.tryEmit(StateFlow.kt:368)
       at OurStateManager.updateOurState(OurStateManager.kt:15)

@dkhalanskyjb
Collaborator

See above: #3820 (comment). This is an Android-specific problem that lies not in our library but in Android's toolchain. If someone can give us a reliable reproducer, we can look into adding a workaround in the library itself.

@yangwuan55

Any updates? Is there a temporary solution to avoid the crash?

@Monabr

Monabr commented Nov 20, 2023

@qwwdfsad Is there any workaround to fix this? Some catch blocks or what should I do to prevent crashes?

@dkhalanskyjb
Collaborator

@yangwuan55 , @Monabr , the answer is literally on the same screen, you just need to scroll up a bit to see it: no, there are no updates, as it's an Android problem, not the problem with our library; we don't know what's causing this. If you can provide a reliable reproducer (a project that consistently crashes in the emulator or at least on some specific device), please do, and we'll try to introduce a workaround.

Specifically: #3820 (comment)

@2Ra66it

2Ra66it commented Nov 29, 2023

The problem continues to reproduce.
Android 12/13.

Caused by java.lang.NullPointerException: Attempt to invoke virtual method 'boolean java.lang.Class.isInterface()' on a null object reference
       at java.lang.Class.isAssignableFrom(Class.java:824)
       at java.lang.Class.isInstance(Class.java:774)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.accessCheck(AtomicReferenceFieldUpdater.java:421)
       at java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.get(AtomicReferenceFieldUpdater.java:479)
       at kotlinx.coroutines.flow.StateFlowSlot.makePending(StateFlow.kt:2)
       at kotlinx.coroutines.flow.StateFlowImpl.updateState(StateFlow.kt:349)
       at kotlinx.coroutines.flow.StateFlowImpl.setValue(StateFlow.kt:316)

It is necessary to change synthetic private static final back to static final so that “aggressive optimizations” stop working. There was no such problem in version 1.6.4.

@dkhalanskyjb
Collaborator

Unless someone can definitively test our changes, we don't know what is necessary to do. Do you have a way to reproduce this issue consistently and can help us check if a fix does help? Or do you maybe know something we don't and can say for sure that it's the synthetic private part that triggers the issue? If not, this is just guesswork.

@yangwuan55

yangwuan55 commented Nov 30, 2023

As a temporary solution, we might be able to circumvent this problem by forcing changes to the atomicfu version. But I don't know if it will cause other problems.

like this:


configurations.all {
        resolutionStrategy {
            force 'org.jetbrains.kotlinx:atomicfu:0.17.3'
        }
}

@LouisYubo

As a temporary solution, we might be able to circumvent this problem by forcing changes to the atomicfu version. But I don't know if it will cause other problems.

like this:


configurations.all {
        resolutionStrategy {
            force 'org.jetbrains.kotlinx:atomicfu:0.17.3'
        }
}

We tried this solution but sadly the issue still occurs 😞

@yangwuan55

As a temporary solution, we might be able to circumvent this problem by forcing changes to the atomicfu version. But I don't know if it will cause other problems.
like this:


configurations.all {
        resolutionStrategy {
            force 'org.jetbrains.kotlinx:atomicfu:0.17.3'
        }
}

We tried this solution but sadly the issue still occurs 😞

My scenario seems to be related to cold startup. There is no good fix at present; I catch the exception, which at least keeps the app from crashing.

@rantianhua

rantianhua commented Feb 4, 2024

Same issue on version 1.7.3.

@rossbacher

We just started getting this problem after updating our coroutines dependency from 1.6.4 to 1.8.0-RC2. We currently only have this happening in our beta release but so far it looks like it is limited to Android 13 and 14.
I did a bit of digging and wanted to share it here. I will also file a ticket with Google, as this seems to be an issue with the VM, or maybe with GC? 🤷

The crash originates in StateFlowSlot.makePending(), which before the atomicfu update looked like this (this is Android bytecode decompiled to Java):

public final void makePending() {
    Symbol symbol;
    Symbol symbol2;
    Symbol symbol3;
    Symbol symbol4;
    while (true) {
        Object obj = this._state;
        if (obj == null) {
            return;
        }
        symbol = StateFlowKt.PENDING;
        if (obj == symbol) {
            return;
        }
        symbol2 = StateFlowKt.NONE;
        boolean z16 = false;
        if (obj == symbol2) {
            AtomicReferenceFieldUpdater atomicReferenceFieldUpdater = _state$FU;
            symbol3 = StateFlowKt.PENDING;
            while (true) {
                if (!atomicReferenceFieldUpdater.compareAndSet(this, obj, symbol3)) {
                    if (atomicReferenceFieldUpdater.get(this) != obj) {
                        break;
                    }
                } else {
                    z16 = true;
                    break;
                }
            }
            if (z16) {
                return;
            }
        } else {
            AtomicReferenceFieldUpdater atomicReferenceFieldUpdater2 = _state$FU;
            symbol4 = StateFlowKt.NONE;
            while (true) {
                if (!atomicReferenceFieldUpdater2.compareAndSet(this, obj, symbol4)) {
                    if (atomicReferenceFieldUpdater2.get(this) != obj) {
                        break;
                    }
                } else {
                    z16 = true;
                    break;
                }
            }
            if (z16) {
                int i9 = k.f270577;
                ((CancellableContinuationImpl) obj).resumeWith(c0.f270561);
                return;
            }
        }
    }
}

after the update to atomicfu in 1.7.0 it looks like this:

public final void makePending() {
    Symbol symbol;
    Symbol symbol2;
    Symbol symbol3;
    Symbol symbol4;
    AtomicReferenceFieldUpdater atomicReferenceFieldUpdater = _state$volatile$FU;
    while (true) {
        Object obj = atomicReferenceFieldUpdater.get(this);
        if (obj == null) {
            return;
        }
        symbol = StateFlowKt.PENDING;
        if (obj == symbol) {
            return;
        }
        symbol2 = StateFlowKt.NONE;
        boolean z13 = false;
        if (obj == symbol2) {
            AtomicReferenceFieldUpdater atomicReferenceFieldUpdater2 = _state$volatile$FU;
            symbol3 = StateFlowKt.PENDING;
            while (true) {
                if (!atomicReferenceFieldUpdater2.compareAndSet(this, obj, symbol3)) {
                    if (atomicReferenceFieldUpdater2.get(this) != obj) {
                        break;
                    }
                } else {
                    z13 = true;
                    break;
                }
            }
            if (z13) {
                return;
            }
        } else {
            AtomicReferenceFieldUpdater atomicReferenceFieldUpdater3 = _state$volatile$FU;
            symbol4 = StateFlowKt.NONE;
            while (true) {
                if (!atomicReferenceFieldUpdater3.compareAndSet(this, obj, symbol4)) {
                    if (atomicReferenceFieldUpdater3.get(this) != obj) {
                        break;
                    }
                } else {
                    z13 = true;
                    break;
                }
            }
            if (z13) {
                int i16 = k.f159498;
                ((CancellableContinuationImpl) obj).resumeWith(c0.f159482);
                return;
            }
        }
    }
}

with the crash originating in this line: Object obj = atomicReferenceFieldUpdater.get(this);

Note: Even though StateFlow.kt had no meaningful code change between 1.6.4 and 1.7.0, the bytecode changed because of the atomicfu update. This is also the reason why the forced dependency downgrade of atomicfu will not "fix" this issue: the impacted code is already in the coroutines-core .jar file, so the downgrade during the app build does not really do anything anymore.

This is where it gets funky: This code calls get on AtomicReferenceFieldUpdaterImpl with this (StateFlowSlot) instance as parameter.

@SuppressWarnings("unchecked")
public final V get(T obj) {
  accessCheck(obj);
  return (V)U.getObjectVolatile(obj, offset);
}

with

private final void accessCheck(T obj) {
  if (!cclass.isInstance(obj))
    throwAccessCheckException(obj);
}

cclass is StateFlowSlot.class because it is the first parameter in

static final /* synthetic */ AtomicReferenceFieldUpdater _state$FU = AtomicReferenceFieldUpdater.newUpdater(StateFlowSlot.class, Object.class, "_state");

and the constructor of AtomicReferenceFieldUpdaterImpl assigns cclass from tclass, which is the first parameter:

this.cclass = (Modifier.isProtected(modifiers) &&
  tclass.isAssignableFrom(caller) &&
  !isSamePackage(tclass, caller))
  ? caller : tclass;

so this accessCheck is checking if our instance of type StateFlowSlot is an instance of StateFlowSlot.class
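
To make the access check concrete on a regular JVM, here is a minimal sketch. The `Slot` class and its `state` field are hypothetical stand-ins for StateFlowSlot and `_state`; only `AtomicReferenceFieldUpdater.newUpdater`, `get`, and `compareAndSet` are real JDK APIs. On a correctly behaving runtime, `get` passes the `isInstance` check for a `Slot`, while an object of another class pushed through a raw-typed updater is rejected by that same check.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

public class UpdaterAccessCheckDemo {
    // Hypothetical stand-in for StateFlowSlot with its volatile state field.
    public static final class Slot {
        volatile Object state = "NONE";

        // The first argument is the class that accessCheck() later tests
        // instances against via isInstance().
        static final AtomicReferenceFieldUpdater<Slot, Object> STATE_FU =
                AtomicReferenceFieldUpdater.newUpdater(Slot.class, Object.class, "state");
    }

    // Reads the field through the updater, which runs the access check first.
    public static Object read(Slot slot) {
        return Slot.STATE_FU.get(slot);
    }

    // Passes an object that is not a Slot through a raw-typed updater and
    // reports whether the access check rejected it.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static boolean rejectsForeignObject() {
        AtomicReferenceFieldUpdater raw = Slot.STATE_FU;
        try {
            raw.get("not a Slot");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Slot slot = new Slot();
        System.out.println(read(slot));
        Slot.STATE_FU.compareAndSet(slot, "NONE", "PENDING");
        System.out.println(read(slot));
        System.out.println(rejectsForeignObject());
    }
}
```

In the crash reported here the object is a genuine instance of the right class, so this check should trivially succeed; the sketch cannot reproduce the failure, it only shows the code path involved.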

on Android 13 the isInstance call looks like this:

public boolean isInstance(Object obj) {
  if (obj == null) {
    return false;
  }
  return isAssignableFrom(obj.getClass());
}

so this will get the class from the object (which itself is not null) that we called this with (which would be the existing instance of StateFlowSlot)

and pass it into

public boolean isAssignableFrom(Class<?> cls) {
  if (this == cls) {
    return true;  // Can always assign to things of the same type.
  } else if (this == Object.class) {
    return !cls.isPrimitive();  // Can assign any reference to java.lang.Object.
  } else if (isArray()) {
    return cls.isArray() && componentType.isAssignableFrom(cls.componentType);
  } else if (isInterface()) {
    // Search iftable which has a flattened and uniqued list of interfaces.
    Object[] iftable = cls.ifTable;
    if (iftable != null) {
      for (int i = 0; i < iftable.length; i += 2) {
        if (iftable[i] == this) {
          return true;
        }
      }
    }
    return false;
  } else {
    if (!cls.isInterface()) {
      for (cls = cls.superClass; cls != null; cls = cls.superClass) {
        if (cls == this) {
          return true;
        }
      }
    }
    return false;
  }
}

And this is where the crash occurs, on the line if (!cls.isInterface()) {.

Which means that the obj.getClass() call in isInstance returned a null class even though the instance is not null.

The Object implementation on Android seems to use some kind of backing field here:

public final Class<?> getClass() {
  return shadow$_klass_;
}

which I guess can be null, maybe a GC issue? Anyway, I will also file this with Google

Android standard lib code examples from: https://android.googlesource.com/platform/libcore/+/refs/heads/android13-d1-release/ojluni/src/main/java/java/lang

@rossbacher

Issue in Google issue tracker: https://issuetracker.google.com/issues/325123736

@rossbacher

rossbacher commented Feb 27, 2024

Update: This is happening on Android 12, 13 and 14 for us and we support 9+

@sgjesse

sgjesse commented Feb 28, 2024

Just an FYI on the loop around compareAndSet in the decompiled code, which is not in the source. It comes from a D8/R8 workaround for compareAndSet, which uses this code to work around an issue on Android 12 (arm32 only).
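
The loop shape visible in the decompiled makePending() can be sketched in plain Java as below. This uses `AtomicReference` instead of a field updater for brevity, and the helper name `compareAndSetWithRetry` is hypothetical; it illustrates only the control flow seen in the decompiled output, not the actual D8/R8 implementation or the nature of the Android 12 issue.

```java
import java.util.concurrent.atomic.AtomicReference;

public class CasRetrySketch {
    // Mirrors the decompiled loop: keep retrying while the CAS reports failure
    // but the stored value is still the expected one.
    public static <T> boolean compareAndSetWithRetry(
            AtomicReference<T> ref, T expected, T update) {
        while (true) {
            if (ref.compareAndSet(expected, update)) {
                return true;   // the swap happened
            }
            if (ref.get() != expected) {
                return false;  // a different value is stored: genuine failure
            }
            // The CAS failed although the value still equals `expected`: try again.
        }
    }

    public static void main(String[] args) {
        String none = "NONE";
        String pending = "PENDING";
        AtomicReference<String> state = new AtomicReference<>(none);
        System.out.println(compareAndSetWithRetry(state, none, pending));
        System.out.println(compareAndSetWithRetry(state, none, pending));
        System.out.println(state.get());
    }
}
```

On a runtime where compareAndSet behaves as documented, the retry branch is never taken and the helper is equivalent to a single compareAndSet call.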

qwwdfsad added a commit that referenced this issue Feb 29, 2024
Replace the specific place where ARFU gets misexecuted by specific Android toolchain

Fixes #3820
@qwwdfsad
Member

qwwdfsad commented Mar 8, 2024

We've merged a potential temporary workaround that is going to be included in the next release.

Please note that the workaround is temporary, and we expect to roll it back once Google has identified and fixed the issue (which we assume it will, given its severity and priority: https://issuetracker.google.com/issues/325123736).

@yangwuan55

We've merged a potential temporary workaround that is going to be included in the next release.

Please note that the workaround is temporary, and we expect to roll it back once Google has identified and fixed the issue (which we assume it will, given its severity and priority: https://issuetracker.google.com/issues/325123736).

So glad to hear that, thank you so much!

@Mustafaubaid4

Mustafaubaid4 commented Mar 9, 2024 via email

@rossbacher

We have a fork of 1.8.0 with the workaround from #4054 in prod now, and we have enough sessions to confirm that this issue no longer occurs with the workaround! 🙏 for the fix (workaround)!

knisht pushed a commit to JetBrains/intellij-deps-kotlinx.coroutines that referenced this issue Apr 15, 2024
Replace the specific place where ARFU gets misexecuted by a specific Android toolchain

Fixes Kotlin#3820