Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fails to error on queries inside a transaction after a lost connection #1454

Closed
vitaly-t opened this issue Sep 14, 2017 · 16 comments
Closed
Assignees
Labels

Comments

@vitaly-t
Copy link
Contributor

vitaly-t commented Sep 14, 2017

When we execute a lengthy operation inside a transaction, and the connection is suddenly lost during that time, then any query we execute on the client after that become stuck, i.e. they report nether success no error:

'use strict';

const Pool = require('pg-pool');

const cn = {
    database: 'newone',
    port: 5433,
    user: 'postgres'
};

const p = new Pool(cn);

p.on('error', error => {
    // do nothing
});

p.connect((err, client, release) => {
    if (err) {
        console.log('Failed to connect:', err);
        return;
    }
    const test = async function () {
        try {
            await client.query('BEGIN');
            await client.query('SELECT pg_sleep(10)'); // connection breaks during this one
            await client.query('COMMIT');
            console.log('success!');
        } catch (e) {
            console.log('about to rollback...'); // this is reported
            await client.query('ROLLBACK'); // this one is stuck, never reporting an error, if the error was due to lost connection
            console.log('rollback finished'); // this is never reported, if we are here because of lost connection
        } finally {
            release();
            p.end();
        }
    };

    test();
});

I've tried both pg-pool directly and via node-postgres, versions 6.x and 7.x - all the same.

Expected Behavior

Executing any query against client at that point should immediately report an error that informs us of the lost connectivity.

@vitaly-t vitaly-t changed the title Fails to report error on queries inside a transaction after a lost connection Fails to error on queries inside a transaction after a lost connection Sep 14, 2017
@charmander charmander added the bug label Sep 14, 2017
@charmander
Copy link
Collaborator

Related (and closed for some reason?): #632, #718

@vitaly-t
Copy link
Contributor Author

vitaly-t commented Sep 14, 2017

One of those old bugs was kind-of more specific, even though the issue was a broad one, and the other one had too much code, not easy to see what the issue is.

My example is much easier to read and debug, and it is 100% reproducible, and quite generic, i.e. it is not that the ROLLBACK fails at that point, it is any query.

Hopefully, it will help us find and fix the issue this time 🤞

@vitaly-t
Copy link
Contributor Author

@charmander seems like #1322 is yet another dupe. This is one devious bug! 😈

@charmander

This comment has been minimized.

@brianc
Copy link
Owner

brianc commented Sep 15, 2017

Nice! Thanks for reporting this - I'll try to look at this this weekend.

@brianc brianc self-assigned this Sep 15, 2017
@vitaly-t
Copy link
Contributor Author

@brianc any luck? Do you need help with it? 😉

@vitaly-t
Copy link
Contributor Author

Two week-ends out, still nothing?

@brianc
Copy link
Owner

brianc commented Sep 25, 2017

Sorry for the delay: I broke a bone & haven't been @ the computer much, and when I am at the computer I have to work so I can pay my bills. One thing that would help is adding to your example above some code to manually kill the connection. Right now to simulate a connection drop I have to kill postgres, which doesn't work if I want to run the test in CI or automatically. Would it be possible to create a fully self-contained script that demonstrates the error without requiring manual intervention to kill the connection? That would be super helpful.

@vitaly-t
Copy link
Contributor Author

vitaly-t commented Sep 25, 2017

One thing that would help is adding to your example above some code to manually kill the connection

To temporarily kill all connections to your test database, execute the following SQL in pgAdmin:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname='my-database-name';

You must provide the correct database name in the query, and make sure to execute it from a connection to a different database, or else you will be killing the current pgAdmin connection, which often causes it to crash the UI.


Get well soon! 🤧

@brianc
Copy link
Owner

brianc commented Sep 25, 2017

gotcha! Thanks! It's a slow road but I'm gettin' better 😄

@vitaly-t
Copy link
Contributor Author

vitaly-t commented Oct 29, 2017

@brianc Are you still on sick? It's been almost 2 month since the last commit.

(pressed close instead of comment by mistake, as per freaking usual)

@vitaly-t
Copy link
Contributor Author

vitaly-t commented Feb 19, 2018

UPDATE

This issue appears to be even more severe, reported against solutions running on a thin WiFi connection, i.e. the issue keeps happening even with small transactions that execute longer due to the slow network.

@charmander @brianc Guys, any chance to see a progress for this some day? 😄 This is a P1 bug 😉

@vitaly-t
Copy link
Contributor Author

vitaly-t commented Feb 19, 2018

While there is no fix for this, for now I am using the following work-around:

try {
      await client.query('BEGIN');
      await client.query('SELECT pg_sleep(10)'); // connection breaks during this one
      await client.query('COMMIT');
     } catch (e) {
        if(!isConnectivityError(e)) {
          await client.query('ROLLBACK');
       }
    }

And here's code from pg-promise of how I check for a connectivity error:

////////////////////////////////////////////
// Identifies a general connectivity error.
function isConnectivityError(err) {
    const code = err && typeof err.code === 'string' && err.code;
    const cls = code && code.substr(0, 2); // Error Class
    return code === 'ECONNRESET' || cls === '08' || cls === '57';
    // Code 'ECONNRESET' - Connectivity issue handled by the driver.
    // Class 08 - Connection Exception.
    // Class 57 - Operator Intervention.
    //
    // ERROR CODES: https://www.postgresql.org/docs/9.6/static/errcodes-appendix.html
}

@charmander

This comment has been minimized.

@charmander
Copy link
Collaborator

charmander commented Jul 26, 2018

Nevermind, I was testing on the wrong branch – confirming that #1503 does fix this as well.

@vitaly-t
Copy link
Contributor Author

Confirmed. This now works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants