Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PGSQL] The reactor hanging issue with r2dbc-postgresql #5160

Open
vttranlina opened this issue Apr 12, 2024 · 7 comments
Open

[PGSQL] The reactor hanging issue with r2dbc-postgresql #5160

vttranlina opened this issue Apr 12, 2024 · 7 comments

Comments

@vttranlina
Copy link
Member

vttranlina commented Apr 12, 2024

Related to the hanging issue with r2dbc-postgresql + JOOQ:

Example methods:

1: PostgresExecutor#executeCount
2: PostgresExecutor#executeDeleteAndReturnList
3: AttachmentPartRetriever

Updates:

  1. The hanging issue is not related to Jooq. Refer to the issue [Postgresql] Investigate why PopulateEmailQueryViewTask gets stuck while running
    I successfully resolved the issue by re-running Quan's lab with the addition of the JVM argument -Dreactor.bufferSize.small=100000. Both Jooq and Native SQL passed the test after this adjustment.

Note: While testing with Native SQL, an error was encountered: Cannot exchange messages because the request queue limit is exceeded, which has occurred sporadically in the past. However, no error log was generated when testing with Jooq, => We missed configure logger error for Jooq

  1. For hanging issue when DELETE ... RETURNING(deleteByMailboxId)
    The hanging issue persists when using native queries (without Jooq). I reported bug on r2dbc-postgresql repository here: Hanging when execute statement DELETE and RETURNING pgjdbc/r2dbc-postgresql#650

Reproduction Code:

public Flux<UUID> deleteByMailboxIdAndReturning(PostgresMailboxId mailboxId) {
        AtomicInteger counter = new AtomicInteger(0);
        AtomicInteger doOnNextCounter = new AtomicInteger(0);

        return postgresExecutor.connection()
            .flatMapMany(con -> Flux.from(con.createStatement("DELETE FROM message_mailbox WHERE mailbox_id = $1 RETURNING message_id")
                    .bind(0, mailboxId.asUuid())
                    .execute())
                .flatMap(result -> result.map((row, rowMetadata) -> {
                    String msgIdWasDeleted = row.get(0, String.class);
                    System.out.println("msg was deleted(" + counter.incrementAndGet() + "): " + msgIdWasDeleted);
                    return msgIdWasDeleted;
                }), 5,15))
            .doOnNext(e -> {
                System.out.println("____doOnNext("+doOnNextCounter.incrementAndGet()+"): " + e);
            })
            .map(UUID::fromString);
    }
  • the counter is not euqal doOnNextCounter.
  • counter >= prefetchSize > doOnNextCounter

Resolution:
Resolved by replacing .collectList().flatMapIterable(list -> list) with .window(1).flatMap(flux -> flux). Both methods have been tested successfully.
WDYT about new way?

@vttranlina vttranlina changed the title [PGSQL [PGSQL] The reactor hanging issue with r2dbc-postgresql Apr 12, 2024
@vttranlina
Copy link
Member Author

Thread dump when running DeleteMessageListenerContract#deleteMessageListenerShouldSucceedWhenDeleteMailboxHasALotOfMessages
with Reproduction Code above
threads_report.txt

"deleteByMailboxIdAndReturning-1" prio=0 tid=0x0 nid=0x0 waiting on condition
     java.lang.Thread.State: WAITING
 on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@44e29989
	at java.base@21.0.2/jdk.internal.misc.Unsafe.park(Native Method)
	at java.base@21.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
	at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
	at java.base@21.0.2/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	at java.base@21.0.2/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1707)
	at java.base@21.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
	at java.base@21.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
	at java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
	at java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base@21.0.2/java.lang.Thread.runWith(Thread.java:1596)
	at java.base@21.0.2/java.lang.Thread.run(Thread.java:1583)

@vttranlina
Copy link
Member Author

Now I don't have any new ideas for this issue.
Wait for the response from r2dbc-postgresql maintainer on the bug reported

@chibenwa
Copy link
Member

Yes the threads are doing strictly nothing!

@chibenwa
Copy link
Member

Could you rerun your example but print thread name at the start of your debug messages?

@vttranlina
Copy link
Member Author

Could you rerun your example but print thread name at the start of your debug messages?

reactor-tcp-nio-1- msg was deleted(1): 018ed018-1402-7830-ac4f-626e05961402
reactor-tcp-nio-1____doOnNext(1): 018ed018-1402-7830-ac4f-626e05961402
reactor-tcp-nio-1- msg was deleted(3): 018ed018-14f3-7363-8111-6c8d0cb4f204
reactor-tcp-nio-1____doOnNext(2): 018ed018-14f3-7363-8111-6c8d0cb4f204
reactor-tcp-nio-1- msg was deleted(5): 018ed018-1517-7fe3-8c0d-f42ef83f6b51
reactor-tcp-nio-1____doOnNext(3): 018ed018-1517-7fe3-8c0d-f42ef83f6b51
reactor-tcp-nio-1- msg was deleted(7): 018ed018-1536-77a6-a157-80da658c0591
reactor-tcp-nio-1____doOnNext(4): 018ed018-1536-77a6-a157-80da658c0591
reactor-tcp-nio-1- msg was deleted(9): 018ed018-1551-7085-ba9f-d3ff0dbaa154
reactor-tcp-nio-1- msg was deleted(11): 018ed018-156f-7c5b-81e2-984b207faa39
reactor-tcp-nio-1- msg was deleted(13): 018ed018-158c-7248-8277-2ecde78841ee
reactor-tcp-nio-1- msg was deleted(15): 018ed018-15a5-7af8-af7c-fe0b661afc45
reactor-tcp-nio-1- msg was deleted(17): 018ed018-15be-759f-8789-868947c8b038
reactor-tcp-nio-1- msg was deleted(19): 018ed018-15da-71fd-aaad-eae1344ccc46
reactor-tcp-nio-1- msg was deleted(21): 018ed018-15f5-7f6a-b9ac-9d823fa26653
reactor-tcp-nio-1- msg was deleted(23): 018ed018-1613-76ed-aa05-6a535d1a6336
reactor-tcp-nio-1- msg was deleted(25): 018ed018-162a-763c-9342-61992c9ccdde
reactor-tcp-nio-1- msg was deleted(27): 018ed018-1640-7766-b433-aacd5f10e3d9
reactor-tcp-nio-1- msg was deleted(29): 018ed018-1658-7e51-9f28-bfa0a0c78183

@vttranlina
Copy link
Member Author

Full debug log
test-run1.log
Search keyword: "PostgresMailboxMessageDAO"

@vttranlina
Copy link
Member Author

I created a new simple maven project (independent of james)
for easy reproduction, debugging
https://github.com/vttranlina/r2dbc-postgresql-test.git
Feel free to debug it if interested

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants