Postgres proxy on cloud #5354

FelixMalfait · 2024-05-11T08:01:35Z

Context

We have a multi tenant-cloud but use 1 schema per workspace.
We could leverage this to our advantage to let people access their data externally (e.g. via Metabase or Tableau)
We need to find a way to do that in a secure way (we cannot expose Postgres publicly/for all tenants)

Proposal

My initial proposal was to setup a new Postgres DB that we would use as a proxy with foreign data wrappers pointing to our read-replica. That way we would only expose the data for people that opt-in + we could add additional security layers (connection limit enforced by the read replica, read-only at prod db-level + read-replica level, additional logging tailored specifically for the proxy etc).

Then talking to @Freebios we realized that building a wrapper in Node wasn't actually complex. It feels a bit hacky as we're almost at the packet level but it seems to work reasonably well

Proof of concept

Here's a sample code that works (IP whitelisting part wasn't tested)

const net = require('net');

const HOST = 'localhost'; // PostgreSQL server host
const PG_PORT = 5432; // PostgreSQL server port
const PROXY_PORT = 5433; // Port for proxy server to listen on

// Create a server that listens for client connections
const server = net.createServer((clientSocket) => {
    console.log('Client connected.');

    const checkPassed = checkSecurity(clientSocket);
    if (!checkPassed) {
        console.log('Security check failed');
        socket.end('Security check failed\n');
        return;
    }

    // Connect to the PostgreSQL server
    const serverSocket = net.connect({ host: HOST, port: PG_PORT }, () => {
        console.log('Proxy connected to PostgreSQL server.');
    });

    // Relay data from client to PostgreSQL server
    clientSocket.on('data', (data) => {
        console.log('SQL Query:', data.toString()); // Log the SQL query
        serverSocket.write(data);
    });

    // Relay data from PostgreSQL server back to client
    serverSocket.on('data', (data) => {
        console.log('PostgreSQL Response:', data.toString()); // Log the PostgreSQL response
        clientSocket.write(data);
    });

    // Handle client disconnect
    clientSocket.on('end', () => {
        console.log('Client disconnected.');
        serverSocket.end();
    });

    // Handle server disconnect
    serverSocket.on('end', () => {
        console.log('PostgreSQL server connection closed.');
        clientSocket.end();
    });

    // Error handling for the client
    clientSocket.on('error', (err) => {
        console.log('Client Error:', err.message);
        clientSocket.destroy();
        serverSocket.end();
    });

    // Error handling for the server connection
    serverSocket.on('error', (err) => {
        console.log('Server Error:', err.message);
        serverSocket.destroy();
        clientSocket.end();
    });
});

server.listen(PROXY_PORT, () => {
    console.log(`Proxy server listening on port ${PROXY_PORT}`);
});


function checkSecurity(socket) {
    // Implement your security checks here
    // For example, check the IP address

    return true;

    const allowedIPs = ['192.168.1.100', '192.168.1.101', 'localhost'];
    return allowedIPs.includes(socket.remoteAddress);
}

(note this is not Typescript)

Implementation

Create a page for people to opt-in to Postgres exposure (will be within integrations, probably under "Metabase" or something like that intially), if they've opted in then they can click to reveal the postgres user/password/host.
On the backend side we should add a column to keep track whether Postgres exposure was enabled and if a password was set (we've done db-side encryption already for remoteServer options). I'd say we should do two columns for enablement (postgresEnabledAt) and password, that way if someone toggle on-off then password remains the same.
Create the proxy adapting the proof of concept above.
Improve the proxy to add additional security/logging (e.g. ActivityLogs + report malicious attempts to Sentry?), maybe only allow SELECT command?
Feature to whitelist IP

From a structure perspective I think the easiest would be to run this as a separate package twenty-postgres-proxy ; add a README, tests, etc. It's possible to tell NestJS to listen on 2 ports but I think that will add some un-necessary slowness and complexity. Since this could be a risky area, keeping the code small, extremely well tested and isolated seems like a good option.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Postgres proxy on cloud #5354

Postgres proxy on cloud #5354

FelixMalfait commented May 11, 2024

Postgres proxy on cloud #5354

Postgres proxy on cloud #5354

Comments

FelixMalfait commented May 11, 2024

Context

Proposal

Proof of concept

Implementation