Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard to handle idle-in-transaction errors #2852

Open
mgabeler-lee-6rs opened this issue Nov 3, 2022 · 8 comments
Open

Hard to handle idle-in-transaction errors #2852

mgabeler-lee-6rs opened this issue Nov 3, 2022 · 8 comments

Comments

@mgabeler-lee-6rs
Copy link

I've setup my application's pool with an on('error') hook to log errors that happen outside queries.

However, this hook is not always active between queries. If I acquire a connection from the pool to run a transaction, and that transaction falls afoul of the idle_in_transaction_session_timeout, the error thrown from there doesn't happen during a query, and so if I don't take extra steps to add an on('error') listener for each such connection, my app dies.

But I also have to make sure to manually remove those, since releasing the client back to the pool doesn't do so!

Is there a recommended way to do this "nicely"?

I see some possible ways to make this more ergonomic, which might be applied in combination:

  1. Apply that idle error listener from the pool even when the client is acquired
    • This probably would need to be combined with some logic to avoid calling it if a "per-acquire" error listener is attached
  2. Automatically clear client error listeners on release to the pool
    • e.g. call client.removeAllListeners('error') just before re-adding the idle listener here: https://github.com/brianc/node-postgres/blob/master/packages/pg-pool/index.js#L329
    • This still leaves a race window between when the idle listener is removed acquiring a client connection and when the calling code can add its custom listener.
    • This race may be avoidable when using the callback style, but it's pretty hard to avoid when using Promises.
@brianc
Copy link
Owner

brianc commented Nov 3, 2022

great question - would you be able to include a code snippet that demonstrates the problem? I'll take a look at it.

@mgabeler-lee-6rs
Copy link
Author

I've made a small demo here: https://github.com/mgabeler-lee-6rs/node-pg-issue-2852

Naturally imagine that there are many worker functions using the pool in various places.

I've poked through things and found a way to make this kinda work, but it's decidedly not-pretty: use the pool acquire event to remove and re-add a "durable" extra error handler to the client. The downside is that this stays attached to the client, in addition to the "idle" handler, when the connection is released, so this isn't production-ready imo, in addition to being a bit ugly.

@ackava
Copy link

ackava commented Jan 10, 2024

This is the error log.

error: terminating connection due to idle-in-transaction timeout
    at Parser.parseErrorMessage (/app/node_modules/pg-protocol/src/parser.ts:369:69)
    at Parser.handlePacket (/app/node_modules/pg-protocol/src/parser.ts:188:21)
    at Parser.parse (/app/node_modules/pg-protocol/src/parser.ts:103:30)
    at TLSSocket.<anonymous> (/app/node_modules/pg-protocol/src/index.ts:7:48)
    at TLSSocket.emit (node:events:514:28)
    at addChunk (node:internal/streams/readable:545:12)
    at readableAddChunkPushByteMode (node:internal/streams/readable:495:3)
    at Readable.push (node:internal/streams/readable:375:5)
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) {
  length: 122,
  severity: 'FATAL',
  code: '25P03',
  detail: undefined,
  hint: undefined,
  position: undefined,
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'postgres.c',
  line: '3372',
  routine: 'ProcessInterrupts'
}

Ideally this should not crash the application even if on("error",...) is not set, in this case, the next query or next step of reading should throw a timeout error.

@mgabeler-lee-6rs
Copy link
Author

Having just had to fight through some more manifestations of this and some related error patterns, I've come to what may be a controversial opinion: The use of the this.emit('error', ...) pattern is just unfriendly to library consumers.

My reason for this is that it creates a library whose behavior includes, "if I notice an unexpected network socket closure in the background, I will kill the entire application unless you do something special on each and every network socket to tell me you don't want this to happen." Phrased like this, I think one can see why this might be a shocking and unpleasant behavior for a networking library to exhibit. Every other networking library, and particularly database connector, I've worked with will simply hold these errors and surface them on the next attempt to use that connection.

Having the option to be proactively notified of background errors is good, so I don't think that should be removed. I simply think that the behavior should be changed such that, if no "user" error listener is installed, node-postgres should retain that error and use it to respond to the next interaction with the client connection object. So e.g. if my PG server goes down unexpectedly, and I don't have an error event listener connected, the "connection terminated by administrator command" or "connection terminated unexpectedly" or whatever error should become the result of the next query() call or the like. And if this happens while a connection is in the pool and not acquired by the library user code, then it should just be silently dropped without fuss in the case where no user error listener is installed1.

Footnotes

  1. Maybe this part is already the case, I'll admit that tracing through how errors are handled in this library has been confusing for me, and I've ended up adding error listeners in a lot of different places trying to stop my team's apps crashing any time pgbouncer or postgres restarts or some transient networking hiccup happens.

@charmander
Copy link
Collaborator

@mgabeler-lee-6rs Probably not controversial. I think attaching a no-op error listener to the pool by default has already been proposed, and is the most likely change? Queries erroring with the same connection error is already how it works (#1503).

@brianc
Copy link
Owner

brianc commented Apr 6, 2024

yah this is a really good idea - would probably need to be an "opt-in" feature via some config option until the next major version because even though it's a good change it's also technically a breaking change if someone is relying on the currently unfortunate behavior of "something goes wrong in the background? well unless you have an error listener kiss your process goodbye" behavior. But yeah definitely I'll add this to my ever growing list of "things to work on when I'm back from vacation" aka end of april. ❤️

@mgabeler-lee-6rs
Copy link
Author

Awesome, thank you!

@tswaters
Copy link

We've encountered this and worked around it with the following helper to manage client connections consistently -

async function withClient(pool, worker) {
	let client
	let clientReleased = false
	let error

	const onError = err => {
		if (clientReleased) return
		clientReleased = true
		client.release(err)
	}

	try {
		client = await pool.connect()
		client.once('error', onError)
		await client.query('BEGIN')
		const result = await worker(client)
		await client.query('COMMIT')
		return result
	} catch (err) {
		if (!clientReleased) await client.query('ROLLBACK')
		error = err
		throw error
	} finally {
		if (!clientReleased) {
			clientReleased = true
			client.removeListener('error', onError)
			client.release(error)
		}
	}
}

This is pretty similar to how the pool implements query, here:

https://github.com/brianc/node-postgres/blob/pg-pool%403.6.2/packages/pg-pool/index.js#L381-L436

i.e., attach an error listener to client and release if it hits an error outside the normal connect, query end end calls.

Usage:

await withClient(pool, async client => {
  await client.query('...')
  await client.query('...')
  await client.query('...')
})

This includes doing a BEGIN/COMMIT/ROLLBACK which may not be desired - but if someone is hitting this issue, something like this might be helpful for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants