Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the number of generated SQL statements for Updates/Inserts #5043

Closed
benawad opened this issue Apr 14, 2020 · 20 comments · Fixed by prisma/prisma-engines#4084
Closed
Assignees
Labels
kind/improvement An improvement to existing feature and code. team/client Issue for team Client. topic: performance/queries topic: performance
Milestone

Comments

@benawad
Copy link

benawad commented Apr 14, 2020

I'm not sure if there is a good reason for how it currently works, but it looks like you could reduce the number of SQL statements generated for update.

Problem

The following code:

  const updatedPost = await prisma.post.update({
    where: {
      id: 1,
    },
    data: {
      published: true,
    },
  });

Generates this SQL:

prisma:query BEGIN
prisma:query SELECT `dev`.`Post`.`id` FROM `dev`.`Post` WHERE `dev`.`Post`.`id` = ?
prisma:query UPDATE `dev`.`Post` SET `published` = ? WHERE `dev`.`Post`.`id` IN (?)
prisma:query SELECT `dev`.`Post`.`id`, `dev`.`Post`.`title`, `dev`.`Post`.`content`, `dev`.`Post`.`published`, `dev`.`Post`.`authorId` FROM `dev`.`Post` WHERE `dev`.`Post`.`id` = ? LIMIT ? OFFSET ?
prisma:query COMMIT

Solution

In PostgreSQL, I would normally write this as:

update `dev`.`Post` set published = ? where id = ? returning *

Additional context

You could use returning on inserts too.

@Sytten
Copy link
Contributor

Sytten commented Apr 14, 2020

This is a recurring demand from the community, but as the devs said many times the first goal of the beta is to get things working correctly then optimize. There are many open issues for optimization. This one seems a duplicate of #2157.

@pantharshit00
Copy link
Contributor

pantharshit00 commented Apr 16, 2020

Closing as duplicate of #2157

Keep opening issues though so that we can confirm in the future that these are properly addressed.

Actually we discussed this a bit and calls are different in the issues so we will keep this open

@SelfDevTV

This comment has been minimized.

@pantharshit00

This comment has been minimized.

@pantharshit00 pantharshit00 transferred this issue from prisma/prisma-client-js Jan 13, 2021
@pantharshit00 pantharshit00 added kind/improvement An improvement to existing feature and code. topic: performance labels Jan 13, 2021
@timsuchanek timsuchanek added process/candidate team/client Issue for team Client. labels Jan 14, 2021
@pimeys
Copy link
Contributor

pimeys commented Jan 14, 2021

Actually, PostgreSQL and SQL Server support RETURNING or OUTPUT in inserts, updates and deletes.

Also, this would help reducing errors coming from an overly saturated pool in #4280

@Sytten
Copy link
Contributor

Sytten commented Jan 14, 2021

Lets note that the returning doesn't do anything if the update doesn't update a row. So it's debatable is the first select should be left in place or not. It also won't work if other tables are requested in the select/include. I am not even sure it will actually improve things significantly considering the data is likely still in memory when it was just updated. Sometimes the query optimizer is intelligent enough to catch the select and rewrite that as a returning.

@benawad
Copy link
Author

benawad commented Jan 16, 2021

maybe have it as an option to where you can set returning to true

@pimeys
Copy link
Contributor

pimeys commented Mar 1, 2021

If we'd be able to cut a simple insert into one query utilizing returning or output, we could then skip the transaction from that query. This would mean that instead of four queries:

  • BEGIN to start a transaction
  • INSERT INTO to insert your data
  • SELECT FROM to query the data and
  • COMMIT to end the transaction

we would just do INSERT INTO ... RETURNING. If you have any roundtrip between the application server and the database, this would cut the response times tremendously, reducing the need for more connections in certain scenarios.

@jan4984
Copy link

jan4984 commented May 31, 2021

any update? I think performance is important for database middlewares like Prisma.

@alienself
Copy link

Any updates on this issue? This looks like a major problem that should be fixed asap.

@chrbala
Copy link

chrbala commented Mar 28, 2022

I would like to see this fixed as well. I am fairly certain that this is adding a huge amount of latency to my queries. I was hoping to provide some more detailed numbers from the Observability epic #9601, but that development seems to have stalled.

From a high level, I've tried a number of setups to get a sense of the latency:

  • Local testing with the app server and MySQL instance on my mac. This was fast. Latency is not noticeable as I add joins.
  • Prod testing with PlanetScale + Prisma running in AWS lambda in the same us-east-1 region via Vercel serverless functions. This was slow when I added joins.
  • Prod testing with PlanetScale + Prisma data proxy in the same us-east-1 region. This was slow as I added joins.
  • Local testing against prod database. I am on the west coast, which meant that my queries were even slower than the prod testing.

When I say "slow", above, these queries took multiple seconds to run when they worked almost instantly when running entirely on my mac.

My understanding of the problem is that joins will often waterfall, so the deeper the joins, the more round trips the data has to take to complete all the queries. This is basically broken from what I can tell, unless the latency is extremely low between app and database servers. Even having the app and database server in the same region was not enough. I do not exactly understand the networking aspect, but my rough understanding/assumption is that if the servers are not configured to do otherwise (e.g. by using the same VPC), they will round trip to the internet for each trip. This will very quickly get too slow to be usable.

@janpio
Copy link
Member

janpio commented Mar 29, 2022

When I say "slow", above, these queries took multiple seconds to run when they worked almost instantly when running entirely on my mac.

My understanding of the problem is that joins will often waterfall, so the deeper the joins, the more round trips the data has to take to complete all the queries. This is basically broken from what I can tell, unless the latency is extremely low between app and database servers. Even having the app and database server in the same region was not enough. I do not exactly understand the networking aspect, but my rough understanding/assumption is that if the servers are not configured to do otherwise (e.g. by using the same VPC), they will round trip to the internet for each trip. This will very quickly get too slow to be usable.

This sounds like a concrete problem we would love to have an issue about with a reproduction project we can try out ourselves. As you are on PlanetScale, there could be some cross effect of the referentialIntegrity = prisma mode for example - so we really want to be able to look into what is going on. Please open that standalone issue and provide the information we need to be able to reproduce and understand. Thanks.

@chrbala
Copy link

chrbala commented Mar 30, 2022

Please open that standalone issue and provide the information we need to be able to reproduce and understand. Thanks.

Done! See: #12582

@aradalvand
Copy link

aradalvand commented May 3, 2022

No updates on this one yet? It's now been more than 2 years... :(

@rohanrajpal
Copy link

Would love to have this. Whats the best way to do an update without selects right now?

@ChristianIvicevic
Copy link

@rohanrajpal If I am not mistaken updateMany does update without returning, even though it's debatable whether it's a good solution spreading this in your codebase since the semantics of that function are different and might cause other readers of your code to question why you are not using update in the first place.

@rohanrajpal
Copy link

@rohanrajpal If I am not mistaken updateMany does update without returning, even though it's debatable whether it's a good solution spreading this in your codebase since the semantics of that function are different and might cause other readers of your code to question why you are not using update in the first place.

Yeah right, didn't know updateMany does update without returning, tho yes wouldn't prefer to give up readability for it.

@oscarcoding
Copy link

Please be aware that this does not work in MS SQL Server if your table has triggers.

See https://stackoverflow.com/questions/13198476/cannot-use-update-with-output-clause-when-a-trigger-is-on-the-table for details.

So if this is added, there must also be an option to turn it off via some configuration setting.

I ran into this problem on a previous project using Knex/Bookshelf.js as the ORM and this option was not configurable, so we had to fork the whole project and make the change ourselves.

@adrianbienias
Copy link

I found that upsert uses only one query instead of three when making an update, and also it uses RETURNING.

@Weakky
Copy link
Member

Weakky commented Aug 1, 2023

Hey folks, this should be now available for Postgres & CockroachDB on the lastest Prisma 5.1. Have a look at the release notes for more information! Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/improvement An improvement to existing feature and code. team/client Issue for team Client. topic: performance/queries topic: performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.