
[pg-query-stream] Stream gets stuck after first few batches #3175

Open
anirudh1713 opened this issue Mar 22, 2024 · 1 comment


anirudh1713 commented Mar 22, 2024

Hi, thank you for this amazing project! 🙏

We're facing an issue with pg-query-stream; however, I'm not sure if it's the package or something else.

The issue

We're selecting all rows from a table (~25k rows) and aggregating results from another table (~350k rows) that we join. We use jsonb for the aggregation (which is slow, but does the job for now). The issue is that the stream just gets stuck at random (usually after one or two batches); no more events are emitted from either the stream or the connection. After a few seconds the connection is terminated with the error Connection terminated unexpectedly (at that point we get the error event on the connection).

While the query was stuck for a few seconds, pg_stat_activity showed:

wait_event | wait_event_type | state
ClientRead | Client          | active

The query looks something like this -

SELECT x.id AS "xId",
  jsonb_agg(jsonb_build_object('id', y.id, 'f1', y.f1, 'f2', y.f2, 'f3', y.f3, 'f99', z.f99)) AS "items"
FROM x
INNER JOIN xy xy ON xy."xId" = x."id"
INNER JOIN y ON y.id = xy."yId"
INNER JOIN z ON z.id = y."zId"
GROUP BY x.id;
And the code driving it looks like this -

const pool = new Pg.Pool({
  ...
});

const connection = await pool.connect();

// The query text is passed to QueryStream as a string (template literal here).
const query = new QueryStream(`
  SELECT x.id AS "xId",
    jsonb_agg(jsonb_build_object('id', y.id, 'f1', y.f1, 'f2', y.f2, 'f3', y.f3, 'f99', z.f99)) AS "items"
  FROM x
  INNER JOIN xy xy ON xy."xId" = x."id"
  INNER JOIN y ON y.id = xy."yId"
  INNER JOIN z ON z.id = y."zId"
  GROUP BY x.id
`);

const stream = connection.query(query);

await new Promise((resolve, reject) => {
  stream.on('data', (data) => {
    console.log('DATA ->  ', data.xId);
  });

  stream.on('end', () => {
    console.log('END');

    resolve(true);
  });

  stream.on('error', (err) => {
    console.log('ERROR', err);

    reject(err);
  });
});

Could this be because of the jsonb aggregate? IIRC it used to work fine. Also, it works fine if I don't select the other data (i.e. remove the aggregate function) -

SELECT x.id AS "xId"
FROM x
INNER JOIN xy xy ON xy."xId" = x."id"
INNER JOIN y ON y.id = xy."yId"
INNER JOIN z ON z.id = y."zId"
GROUP BY x.id;

This works too -

SELECT x.id AS "xId",
  count(y.id)
FROM x
INNER JOIN xy xy ON xy."xId" = x."id"
INNER JOIN y ON y.id = xy."yId"
INNER JOIN z ON z.id = y."zId"
GROUP BY x.id;

However, this one doesn't work either. It fetches more rows than the jsonb aggregate version before failing -

SELECT x.id AS "xId",
  array_agg(y.id) AS "itemIds"
FROM x
INNER JOIN xy xy ON xy."xId" = x."id"
INNER JOIN y ON y.id = xy."yId"
INNER JOIN z ON z.id = y."zId"
GROUP BY x.id;

Thank you for going through this, and thanks for your time. I'm aware this info on its own might not be enough; I'll try to get a setup running that reproduces the issue and share it here over the weekend.


brianc commented Mar 22, 2024

Hey, thanks for the detailed report! I'm curious which versions of Node, pg, and Postgres you're using.

Do you think you could create a self-contained reproduction? 🙏 For an issue like this, it might include a setup step that bootstraps a table with a lot of data, if that's what's needed. It's hard for me to investigate in detail without a way to reproduce the issue on my side.

The only thing that comes to mind right now is some kind of configured statement timeout or max socket lifetime in the environment you're running in... but it's hard to say for certain.

I'll try and get a setup running with this issue and share here over the weekend.

cool - standing by for this!
