Memory leak in Aurora Postgres since 3.9.1 #1435

Closed
al-kudryavtsev opened this issue Apr 29, 2024 · 4 comments


al-kudryavtsev commented Apr 29, 2024

Version

3.9.1, 4.5.7

Context

Our app had been on vertx-pg-client 3.9.0 for a while until we recently migrated to version 4.5.7.

Shortly after the upgrade, we noticed our AWS Aurora PG cluster running out of freeable memory and rebooting. The effect is directly correlated with the load on the service: during peak times, the DB can exhaust its memory (60 GB) in 45 minutes. We employed the maxLifetime parameter for the pool, which helps but is not ideal, as it causes latency spikes: we prewarm the pool on deployment, so all the connections expire at about the same time.
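For reference, the maxLifetime setup looks roughly like this (a minimal sketch; the host, credentials, pool size, and lifetime values are illustrative placeholders, not our production settings):

import io.vertx.pgclient.PgConnectOptions
import io.vertx.pgclient.PgPool
import io.vertx.sqlclient.PoolOptions

val connectOptions = PgConnectOptions()
    .setHost("aurora-cluster.example.us-east-1.rds.amazonaws.com") // placeholder
    .setDatabase("appdb") // placeholder
    .setUser("app")       // placeholder
    .setPassword("secret") // placeholder

val poolOptions = PoolOptions()
    .setMaxSize(16) // placeholder
    // Recycle connections after 30 minutes so leaked server-side
    // state is eventually reclaimed (the workaround described above).
    .setMaxLifetime(30 * 60 * 1000) // milliseconds

val pool = PgPool.pool(connectOptions, poolOptions)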

On 3.9.0, there was no issue with DB memory consumption; we rolled back the upgrade to confirm this.
On 3.9.1, we confirmed the issue is present.
This thread has a report of the same issue on 3.9.13 as well as on the 4.x branch.

Do you have a reproducer?

I don't have a reproducer; this might be AWS/aurora-specific. I can test potential solutions if that can help.

Extra

Pretty much all our queries are preparedQuery + execute or executeBatch. We also use standalone prepare with a cursor for paginated reads. Pool configuration: https://gist.github.com/al-kudryavtsev/3e6eeb3cfd200afc66df5a12932bba25
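For illustration, the typical shapes look roughly like this (a minimal sketch; the table and column names are made up):

import io.vertx.kotlin.coroutines.coAwait
import io.vertx.sqlclient.Pool
import io.vertx.sqlclient.Row
import io.vertx.sqlclient.Tuple

// Single prepared query: prepare, execute, and release in one step.
suspend fun findUser(pool: Pool, id: Long): Row? =
    pool.preparedQuery("SELECT id, name FROM users WHERE id = $1")
        .execute(Tuple.of(id))
        .coAwait()
        .firstOrNull()

// Batched prepared query.
suspend fun insertUsers(pool: Pool, names: List<String>) {
    pool.preparedQuery("INSERT INTO users (name) VALUES ($1)")
        .executeBatch(names.map { Tuple.of(it) })
        .coAwait()
}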

@tsegismont (Contributor) commented

Have you tried using DISCARD ALL as suggested in the group discussion?
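Something along these lines (a rough sketch, assuming the Kotlin coroutine bindings you are using; note it only resets the one connection it runs on):

val connection = pool.connection.coAwait()
try {
    // DISCARD ALL resets session state (prepared statements, plans,
    // temp tables) on the server side for this connection.
    connection.query("DISCARD ALL").execute().coAwait()
} finally {
    connection.close().coAwait()
}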

It would help to have some details about what is consuming memory, can you get that from your DBA?

In the 3.9.1 release notes, there is a link to #577.
If you use prepare, can you make sure the statement is closed properly?

@tsegismont tsegismont added this to the 4.5.8 milestone May 21, 2024
@tsegismont tsegismont self-assigned this May 21, 2024
@al-kudryavtsev (Author) commented

Thanks for looking into this!

We haven't tried DISCARD ALL; it might have a performance impact, as it will flush cached query plans when a connection is returned to the pool. We might try this if other options don't work.

If you use prepare, can you make sure the statement is closed properly?

The statement indeed wasn't closed. We were doing prepare(sql).coAwait().cursor(args) and then reading the cursor until it was exhausted, but never closed the result of prepare(sql). I changed this to:

flow<Row> {
    // The prepared statement must be closed explicitly; otherwise the
    // server-side prepared statement (and its memory) leaks.
    val preparedStatement = prepare(sql).coAwait()
    try {
        val cursor = preparedStatement.cursor(args)
        // Read page by page until the cursor is exhausted.
        do {
            cursor.read(pageSize).coAwait().forEach { emit(it) }
        } while (cursor.hasMore())
    } finally {
        preparedStatement.close().coAwait()
    }
}

It looks promising so far; we'll see if we can keep connections open indefinitely. Based on the docs, I'm assuming the cursor doesn't have to be closed since we read it until the end.
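(For completeness: if we ever stop reading early, my understanding is the cursor would then need an explicit close, roughly:)

val cursor = preparedStatement.cursor(args)
try {
    // Read only the first page, then bail out early.
    val firstPage = cursor.read(pageSize).coAwait()
    firstPage.forEach { /* process row */ }
} finally {
    // Needed when the cursor is not read to exhaustion; releases the
    // server-side portal.
    cursor.close().coAwait()
}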

@tsegismont (Contributor) commented

Good news, please keep us posted.

@al-kudryavtsev (Author) commented

Fully removed max lifetime today - DB memory looks good. Thanks for your help!

@tsegismont tsegismont closed this as not planned May 23, 2024
@tsegismont tsegismont added question and removed bug labels May 23, 2024
@tsegismont tsegismont removed this from the 4.5.8 milestone May 23, 2024