Support for passing in existing SSL connection to psycopg2.connect() #1421

Open
jackwotherspoon opened this issue Jan 28, 2022 · 7 comments

@jackwotherspoon

The Cloud SQL Python Connector would like to support database connections to Cloud SQL using psycopg2. To do so, we need either the ability to pass in an existing connection or the ability to configure connection-level SSL.

For reference, we currently support pg8000 through its ssl_context argument, which lets us pass in our pre-configured ssl.SSLContext object.

Let me know if this is potentially feasible in a future release? Happy to provide more information or assistance if needed. Thanks!
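For readers unfamiliar with the pg8000 mechanism mentioned above, here is a minimal sketch of how a pre-configured ssl.SSLContext can be handed to pg8000. The helper names, certificate paths, and connection values are placeholders for illustration, not the Cloud SQL Python Connector's actual code.

```python
import ssl


def build_ssl_context(cafile=None, certfile=None, keyfile=None):
    """Return a client-side SSLContext; all file paths are placeholders."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=cafile)
    if certfile is not None:
        # Optional client certificate, e.g. for mutual TLS.
        context.load_cert_chain(certfile, keyfile)
    return context


def connect_pg8000(user, password, host, database, context):
    """Open a pg8000 connection that uses the pre-configured context."""
    # Imported lazily so build_ssl_context() needs only the standard library.
    import pg8000.dbapi

    return pg8000.dbapi.connect(
        user=user,
        password=password,
        host=host,
        database=database,
        ssl_context=context,
    )
```

psycopg2 has no equivalent argument today, which is what this issue asks about.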

@dvarrazzo
Member

Hello Jack,

psycopg (both 2 and 3) uses libpq, the PostgreSQL client library, to connect to the database. I don't believe that libpq offers such functionality.

Maybe it's something that can be added, but it would have to be added to libpq; once it's available there, we would be happy to support it in psycopg.

Something unbelievably brutal we could attempt on our side would be to sneak an SSL structure into the connection after the structure is created but before the connection is attempted, to bypass this branch. However, that requires access to the PGconn structure internals, which are outside the public interface.

ISTM that it's possible to add a new function to libpq, to be called between PQconnectStart and PQconnectPoll, that initialises the SSL context from the outside (e.g. bool PQsetSSLContext(PGconn *conn, SSL_CTX *ctx)).

I would be open to working on the feature and proposing it upstream to the PostgreSQL developers, if the work has some financial support.

@enocom

enocom commented Oct 13, 2023

For some extra context, Java, Go, and Node all support a way to provide a custom socket creation function.

  • Java's JDBC Postgres driver has a SocketFactoryFactory where you provide a class name and the Postgres driver will use that class to create sockets
  • Go's pgx has a DialFunc that can be configured to create a socket.
  • Node's node-pg has a stream option that can be configured to create a socket as well

There are probably other examples, but these are the ones I'm most familiar with. There are many reasons to support this functionality:

  1. to support connecting over SSH tunnels
  2. to customize how DNS failures are handled
  3. to establish a secure connection beyond the usual Postgres mechanics.
  4. etc

In all cases, this custom socket creation replaces the Start TLS protocol in Postgres. In effect, if a caller provides such a function, the Postgres driver does no additional TLS handling and treats the created socket as if it were a typical TCP socket.

Ideally, we could expose something comparable in psycopg.

To make this work, libpq has to change first, as noted. Since the existing connection-related functions will likely never change, we're thinking of adding a callback function where C clients could register a function to call in place of creating a socket in the usual way.

What I'm uncertain about is how we'd expose such an interface in Python. The flow would be something like: Python client configures socket creation function, psycopg passes a reference to that function to libpq, libpq calls that function. That's hand wavy, but does it seem feasible?
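To make the flow described above concrete, here is a toy sketch in Python. Everything in it is hypothetical: `SocketFactory`, `ToyDriver`, and `tcp_factory` are illustration-only names, and neither psycopg nor libpq exposes such a hook today. It only shows the shape of the contract: the client supplies a function, the driver calls it, and the driver then treats the returned socket as a plain TCP socket with no further TLS handling.

```python
import socket
from typing import Callable

# Hypothetical callback type: given host and port, return a connected socket.
SocketFactory = Callable[[str, int], socket.socket]


class ToyDriver:
    """Stand-in for a database driver; this is not psycopg and not libpq."""

    def __init__(self, host: str, port: int, socket_factory: SocketFactory):
        # The driver delegates socket creation and does no TLS negotiation.
        self.sock = socket_factory(host, port)

    def send_startup(self, payload: bytes) -> None:
        # Whatever the factory returned is used like a plain TCP socket.
        self.sock.sendall(payload)


def tcp_factory(host: str, port: int) -> socket.socket:
    """Simplest possible factory: a plain TCP connection."""
    return socket.create_connection((host, port))
```

A real factory could instead return a socket that is tunnelled over SSH or already wrapped with `ssl.SSLContext.wrap_socket`. The hard part raised above, handing a Python callable to C code safely, is not addressed by this sketch.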

@enocom

enocom commented Oct 13, 2023

Side note. The title of the issue is one option: "Support for passing in existing SSL connection". I'm proposing a slightly different option: "Support for passing in a socket creation function." Both are probably worth considering, although passing in the socket creation function is more in line with other postgres drivers.

@dvarrazzo
Member

Looking a bit into this problem and into possible solutions, a couple seem the most promising:

  • adding a callback to create a socket, to be called by the libpq
  • adding a fileno connection parameter to the connection string, to let it use an existing socket.

The tradeoffs (unweighted), at first glance, for me are:

Socket creation callback:

  • +1 it is already used by other adapters (mysql, I understand)
  • +1 it is a config setting to set only once and will be called by every connection attempt (but it still needs to be changed in order to connect to different databases, so it cannot really be a global option)
  • -1 it requires a new libpq C api for the connection functions
  • -1 it requires a new Python API to pass the callback
  • -1 it requires a careful handover of a callback from a Python wrapper to C

"fileno" connection parameter:

  • -1 I don't know if it's used by other implementations
  • -1 the fileno must be set to a different value for each connection
  • +1 a new Python API to call the callback at connection time and pass the fileno to the connection string would be useful, but we can do without it, in a pinch
  • +1 it doesn't require a libpq API change
  • +1 passing an int parameter in a string is simpler and more portable than passing a callback; it can be used by any client language

By the look of it, the route of passing a FD number to the connection string seems easier, but feedback about points I may have overlooked is welcome.
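As a rough illustration of the file-descriptor route, the sketch below builds a connection string that carries the descriptor of an already-connected socket. The `fileno` keyword is hypothetical: libpq does not accept such a parameter today, so the resulting string cannot be passed to psycopg2.connect() as things stand. `dsn_with_fd` is likewise a made-up helper name.

```python
import socket


def dsn_with_fd(sock: socket.socket, dbname: str, keyword: str = "fileno") -> str:
    """Build a connection string carrying an already-connected socket's fd.

    The keyword is hypothetical: libpq has no such parameter at present.
    """
    # detach() hands over ownership of the descriptor, so Python will not
    # close it when the socket object is garbage collected.
    fd = sock.detach()
    return f"dbname={dbname} {keyword}={fd}"
```

One design point the sketch surfaces is ownership: once the descriptor is handed over, whoever consumes the connection string becomes responsible for closing it.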

@enocom

enocom commented Nov 30, 2023

Thanks @dvarrazzo.

FWIW the "fileno" approach would require a change in the supported keyword values. Perhaps a "hostfd".

Separately, we could add support for creating a Unix domain socket and then passing it into psycopg like any other database connection, but we'll explore that option separately.

cc @nmisch who has explored possible solutions.
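The Unix domain socket option mentioned above is not spelled out in the thread, so the following is only a guess at its shape. It relies on one existing libpq behaviour: when `host` is a directory, libpq connects to the socket file `.s.PGSQL.<port>` inside it. A local process could listen on that path and forward traffic over its own secured connection. The helper names are hypothetical and the forwarding logic is omitted.

```python
import os
import socket


def libpq_socket_path(directory: str, port: int = 5432) -> str:
    """Path libpq looks for when `host` is a directory."""
    return os.path.join(directory, f".s.PGSQL.{port}")


def listen_for_libpq(directory: str, port: int = 5432) -> socket.socket:
    """Bind a Unix domain socket where a libpq client would connect."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(libpq_socket_path(directory, port))
    server.listen()
    return server
```

A client would then connect with something like `psycopg2.connect(host=directory, port=5432, ...)`, with no libpq changes needed.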

@dvarrazzo
Member

Hi @enocom

> FWIW the "fileno" approach would require a change in the supported keyword values. Perhaps a "hostfd".

If you are writing this because I said that the fileno approach requires no libpq API change: I meant that it requires no new C function exposed to users, but of course it requires behavioural changes. By contrast, adding new functions to connect using an existing socket, or to connect using an existing SSH context, requires exposing new functions to end users.

@enocom

enocom commented Dec 1, 2023

Ah yes, I misunderstood. We're in agreement in that case.
