
Client side socket timeout (not statement_timeout) #3124

Open

omerd-cyera opened this issue Jan 7, 2024 · 9 comments

Comments

@omerd-cyera

Hi, I'm wondering if there is an option to set a client-side query timeout for a pool.
I have a network issue where the socket stays open but is no longer connected to Postgres (I'm using some sort of proxy).
This can be replicated by a simple nc -nvlp 5432.

I am aware of statement_timeout, but that is a server-side timeout, which will not work in my case.
Ideally, I would like an option like socket_read_timeout that throws an error.
Thanks!
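
For context, a rough sketch of how the hang shows up (simplified; the connection string is a placeholder): with nc accepting the TCP connection but never speaking the Postgres protocol, connect() neither resolves nor rejects.

import { Client } from 'pg';

// nc accepts the TCP connection but never sends a Postgres handshake,
// so without a client-side timeout this await never settles.
const client = new Client({ connectionString: 'postgresql://localhost:5432/postgres' });
await client.connect();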

@Safricloud

My Azure App Service is doing the same thing.

@omerd-cyera
Author

@Safricloud I encountered this on Azure too, when using a virtual network gateway. I ended up upgrading it, and it stopped dropping the connections.

@c032

c032 commented Feb 12, 2024

@omerd-cyera

This can be replicated by a simple nc -nvlp 5432.

I can't run that command (nc: cannot use -p and -l), but assuming nc -nvl 5432 is the same as what you want, that one can be solved by the connectionTimeoutMillis option.

Reproducing a connection that hangs specifically during a call to the query method looks a bit more involved. But glancing at the code, it seems that setting query_timeout should cancel the query client-side (test).

Another useful setting to keep in mind during these situations (though it wouldn't have helped here specifically) is setting keepAlive to true.
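
Putting those together, a minimal sketch of the three client-side safeguards on one pool (connection string and values are placeholders):

import { Pool } from 'pg';

const pool = new Pool({
    connectionString: 'postgresql://localhost:5432/postgres', // placeholder

    // Fail if establishing a new connection takes longer than 5 seconds.
    connectionTimeoutMillis: 5000,

    // Reject a query client-side if no response arrives within 30 seconds.
    query_timeout: 30000,

    // Enable TCP keepalives on the underlying socket.
    keepAlive: true,
});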

@omerd-cyera
Author

one can be solved by the connectionTimeoutMillis option.

You are correct. I had this issue both while creating a new connection and while waiting for a response to a query.
I managed to replicate the second one with a simple Python proxy that relays the traffic. The script also waits for a key to be pressed and then stops relaying packets between the sockets. This replicated my issue locally.
Here is the script, if you want to try to replicate it too:

import socket
import sys
import threading
import select

PROXY_PORT = 54321
POSTGRES_PORT = 5432
POSTGRES_HOST = 'localhost'
HALT_PROXY_KEY = 'q'           # close the connection entirely
HALT_PROXY_TRANSFER_KEY = 'w'  # keep sockets open but stop relaying packets

def main():
  proxy_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  proxy_socket.bind(('', PROXY_PORT))
  proxy_socket.listen(10)
  print(f"Press {HALT_PROXY_KEY} to halt proxy, {HALT_PROXY_TRANSFER_KEY} to halt transfer (followed by Enter)")

  try:
    while True:
      client_socket, addr = proxy_socket.accept()
      print(f'Got a connection from {addr}')
      thread = threading.Thread(target=handle_request, args=(client_socket,))
      thread.start()
  except KeyboardInterrupt:
    pass
  finally:
    proxy_socket.close()

def handle_request(client_socket):
  postgres_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  postgres_socket.connect((POSTGRES_HOST, POSTGRES_PORT))
  postgres_socket.settimeout(1)

  sockets = [client_socket, postgres_socket]

  stop_transfer = False
  try:
    while True:
      # Keys only take effect after Enter, since stdin is line-buffered.
      if sys.stdin in select.select([sys.stdin], [], [], 0)[0]:
        key = sys.stdin.read(1)
        if key == HALT_PROXY_KEY:
          return
        elif key == HALT_PROXY_TRANSFER_KEY:
          stop_transfer = True

      try:
        read_sockets, _, _ = select.select(sockets, [], [])
        for sock in read_sockets:
          if sock == client_socket:
            data = client_socket.recv(1024)
            if not data:
              return  # client closed the connection
            print('> Packet from client')
            if not stop_transfer:
              postgres_socket.sendall(data)
          elif sock == postgres_socket:
            data = postgres_socket.recv(1024)
            if not data:
              return  # postgres closed the connection
            print('< Packet from postgres')
            if not stop_transfer:
              client_socket.sendall(data)
      except socket.timeout:
        print('timeout')
      except socket.error:
        print('error')
        return
  finally:
    client_socket.close()
    postgres_socket.close()

if __name__ == '__main__':
  main()
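
To reproduce: start the proxy, point the client at localhost:54321 instead of 5432, run queries in a loop, then press w followed by Enter to stop relaying packets while both sockets stay open.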

glancing at the code, it seems that setting query_timeout should cancel the query client-side (test).

I can't remember now exactly why I decided it didn't work. I'll try to get back to replicating this issue later this week with the proxy.

@c032

c032 commented Feb 17, 2024

@omerd-cyera

Yep, I started your script, connected from pg to it, started running queries in a loop, pressed w (and then Enter) in your script, and got an exception from the query method after the expected amount of time (I used 3000 ms in my test script):

client.js
import pg from 'pg'

const { Client } = pg;

function sleep(ms) {
    return new Promise((resolve) => {
        setTimeout(() => {
            resolve();
        }, ms);
    });
}

async function main() {
    const client = new Client({
        connectionString: "postgresql://localhost:54321",

        // This ///////////////////////////////////////////////////////////////
        query_timeout: 3000,
        ///////////////////////////////////////////////////////////////////////
    })

    await client.connect();

    for (let i = 0; i < 60; i++) {
        const result = await client.query('select current_timestamp ct;');

        console.log(result.rows[0].ct);

        await sleep(1000);
    }

    await client.end();
}

main();
/run/user/1000/tmp/node_modules/pg/lib/client.js:526
          Error.captureStackTrace(err);
                ^

Error: Query read timeout
    at /run/user/1000/tmp/node_modules/pg/lib/client.js:526:17
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async main (file:///run/user/1000/tmp/client.js:22:21)

Node.js v20.10.0

Hope this helps.

@omerd-cyera
Author

I'm not sure where my tests went wrong earlier. I'll check it out again.
Thanks for the help!

@Safricloud

I was able to solve my problem by setting this in pool connection options:

idleTimeoutMillis: 60000, // 60 seconds

This threshold matches the one Azure uses on its proxy to time out connections, so now my app just reconnects when needed and never assumes that a connection is alive when it's not.
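
For reference, a minimal sketch of that pool configuration (the connection string is a placeholder):

import { Pool } from 'pg';

const pool = new Pool({
    connectionString: 'postgresql://localhost:5432/postgres', // placeholder

    // Evict idle clients before the Azure proxy silently drops them.
    idleTimeoutMillis: 60000, // 60 seconds
});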

@omerd-cyera
Author

@Safricloud If I am not mistaken, this will only fix the problem for idle connections, not for connections that are waiting for a response from the db.

@Safricloud

Safricloud commented Feb 22, 2024

For active queries you can set a timeout with this pool/connection option:

query_timeout: 30000, // 30 seconds

It will throw an error if the query does not get a response within that time frame, but bear in mind that this may cut off long queries that would eventually return a result. That's why I preferred to find a way to verify that a connection is working before using it.

I was using this getWorkingConnection function to verify my results:

import { Pool, PoolClient } from 'pg';

declare const pool: Pool; // the app's existing pool, configured elsewhere

const testConnection = async (connection: PoolClient): Promise<boolean> => {
    const MAX_WAIT = 3 * 1000; // 3 seconds
    // Create a timeout promise that rejects after MAX_WAIT milliseconds
    const timeout = new Promise((_, reject) => {
        setTimeout(() => {
            reject(new Error('Connection test timed out'));
        }, MAX_WAIT);
    });
    // Execute a simple query to test the connection; keep a handle on the
    // promise so a late rejection (after the timeout wins) is not unhandled
    const check = connection.query('SELECT 1');
    check.catch(() => {});
    try {
        // Create a race between the timeout and the query execution
        await Promise.race([check, timeout]);
        return true; // The connection is working
    } catch (err) {
        console.error('Connection failed', err);
        return false;
    }
};

const getWorkingConnection = async (retryNumber = 0): Promise<PoolClient> => {
    const MAX_RETRY = pool.totalCount || 10;
    if (retryNumber >= MAX_RETRY) {
        throw new Error('Exceeded maximum attempts to get a working PostgreSQL connection');
    }
    const conn = await pool.connect(); // Get a connection from the pool
    const isConnected = await testConnection(conn); // Test it
    if (isConnected) {
        console.log('A working connection has been acquired from the pool.');
        return conn; // Return the working connection
    } else {
        conn.release(); // How can we destroy connections that are not working?
        // Try again
        return getWorkingConnection(retryNumber + 1);
    }
};
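
On the question in that comment: if I'm reading the node-postgres docs right, passing a truthy value (such as an Error, or simply true) to release destroys the client instead of returning it to the pool:

conn.release(true); // disconnect and remove the broken client from the pool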
