Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SQLite foreign key constraints when altering a table #4189

Merged
merged 4 commits into from Dec 13, 2021

Conversation

nickrum
Copy link
Contributor

@nickrum nickrum commented Dec 31, 2020

Some notes/questions:

  • Is it safe to execute raw queries directly on the client?
  • What was the reasoning behind doing disableProcessing() / enableProcessing() in getTableSql()? It seems like this does not work when using the client directly instead of through a transaction.
  • Unless I am missing something, with this change reinsertMapped() should be treated like a critical section, so another invocation of the function can't interfere with the procedure of querying, disabling and reenabling foreign key constraint enforcement.
  • When using indices, the manual foreign key check will likely fail as they are not recreated for the new table. (According to the SQLite Docs for ALTER TABLE this also applies to triggers and views, but I'm not sure how relevant that is for Knex)
  • With this change would it be possible to enable foreign key constraint enforcement for SQLite by default? (Maybe with an option to disable it)

Fixes #4155

@kibertoad
Copy link
Collaborator

Is it safe to execute raw queries directly on the client?

It's preferable to avoid doing that (although we have cases of this, but we intend to refactor that out). The reason for that - knex aims to be a querybuilder, and certain projects use it that way, simply generating SQL statements that they execute on their own. Ideally that should be preserved.

What was the reasoning behind doing disableProcessing?

See #3033

When using indices, the manual foreign key check will likely fail as they are not recreated for the new table

Why not? That sounds confusing for the user.

With this change would it be possible to enable foreign key constraint enforcement for SQLite by default?

Why should it be default? Isn't it something that user should opt-in on their own? I'm open to exposing this config as an option param, though.

@kibertoad
Copy link
Collaborator

@nickrum A heads-up - I'm currently reworking foreign key creation to output proper SQL, you should be able to reuse same solution after it is done.

@kibertoad
Copy link
Collaborator

But in order to unblock you for now, since changes might take some time, feel free to execute changes directly on client for now if that makes things easier for you.

@kibertoad
Copy link
Collaborator

@nickrum After #4194 is merged, you can use the mechanism used there as a reference.

@nickrum
Copy link
Contributor Author

nickrum commented Jan 3, 2021

@kibertoad Thank you for the info and the heads-up. In that case I will wait until that PR is merged as I don't think it would be possible to output proper SQL for the foreign key check only without also refactoring the whole DDL generation.

I did find a way to essentially do what this PR currently does with generated SQL statements, but there is one catch: As, to my knowledge at least, it is not possible to enable/disable the foreign key check based on a previous query, the SQL statements have to be generated based on a query of "PRAGMA foreign_keys at "compile-time" or based on a config option.

See #3033

I see, thanks. Is there a way to disable processing without a transaction? The function doesn't seem to exist on the client directly.

Why not? That sounds confusing for the user.

Because I didn't find the time to implement this yet. It is a bit more complicated as it requires modifying the CREATE INDEX query used to create the index if a column has been renamed or dropped.

Why should it be default? Isn't it something that user should opt-in on their own? I'm open to exposing this config as an option param, though.

Mainly because of #453 (comment). I am not sure if users expect the check to be turned on by default because of the way other database vendors do it or the opposite because it is the default in SQLite. In any case, a config option sounds like a good idea.

@kibertoad
Copy link
Collaborator

@nickrum PR was just merged!

the SQL statements have to be generated based on a query of "PRAGMA foreign_keys at "compile-time

That's not a problem. Check how that PR is implemented, there is an async function involved that generates query SQL based on results of previous queries, same approach should work for you.

Is there a way to disable processing without a transaction?

Where do you need it? You'll have an instance of transaction inside ddl.

@nickrum nickrum force-pushed the sqlite-foreign-key-check-fix branch 2 times, most recently from c48ac42 to e5e4216 Compare January 3, 2021 21:19
@kibertoad
Copy link
Collaborator

@nickrum Awesome! Do you need help writing a test?

@nickrum
Copy link
Contributor Author

nickrum commented Jan 3, 2021

Where do you need it? You'll have an instance of transaction inside ddl.

Inside getTableSql(). I had to remove the transaction because enabling/disabling the foreign keys check inside a transaction is not allowed.

Do you need help writing a test?

I would appreciate any help regarding tests. I still have to port the foreign keys check to the new generateReinsertCommands() though. Currently it only works with the old method.

Also, one of the tests is failing because renameColumn() doesn't rename references inside other tables. Fixed by #4200

@kibertoad
Copy link
Collaborator

@nickrum New style of integration tests are actually quite easy to run. Since you are only interested in SQLite, what you can do is mark your test suite with describe.only and then run npm run test:sqlite. Best place for new tests would be here: test/integration2/schema/foreign-keys.spec.js
You can adjust existing ones slightly to replicate the case that you need.

@kibertoad
Copy link
Collaborator

. I still have to port the foreign keys check to the new generateReinsertCommands() though. Currently it only works with the old method.

In order for the new method to work, make sure to update lib/dialects/sqlite3/schema/sqlite-tablecompiler.js to use statementsProducer instead of output, then array of string commands will be executed automatically. See

statementsProducer(pragma, connection) {
as an example.

@nickrum nickrum force-pushed the sqlite-foreign-key-check-fix branch from e5e4216 to d8b7f31 Compare January 3, 2021 22:04
@nickrum
Copy link
Contributor Author

nickrum commented Jan 3, 2021

@kibertoad Thanks, I think I will be able to figure this out.

@kibertoad
Copy link
Collaborator

@nickrum If you'll get stuck with writing tests, please let me know, would be happy to help.

@kibertoad
Copy link
Collaborator

Friendly ping :). Happy to pick up any pieces of remaining work if you need help with anything.

@nickrum
Copy link
Contributor Author

nickrum commented Jan 14, 2021

@kibertoad sorry for the delay :) The beginning of a year is always a little busy. I also had a few issues implementing the foreign keys check for generated SQL statements, but I think I got this now. If you could pick up writing some tests, that would help a lot.

@nickrum nickrum force-pushed the sqlite-foreign-key-check-fix branch from d8b7f31 to 1b5b2a5 Compare January 27, 2021 23:14
@nickrum
Copy link
Contributor Author

nickrum commented Jan 28, 2021

@kibertoad I finally found the time to implement my idea regarding foreign keys check with generated SQL statements.

Unfortunately it is not exactly intuitive. I had to create temporary tables and triggers to dynamically rollback the changes if a foreign key error occurs.

There are two independent checks because the manual foreign keys check can report violations in two different ways. Either the check returns a list of violations for every row that violates a foreign key constraint, or the check command simply fails if a whole column referenced by some other column can not be found.

The change to queryArray() is needed to execute all queries before throwing an error to make sure the temporary tables and triggers are removed and the foreign keys check is reenabled (this is essentially what the finally does in reinsertMapped()).

@kibertoad
Copy link
Collaborator

@nickrum Does the new logic only work when executed on client directly? Is it not possible to generate SQL for all of the operations that users could execute themselves, so that they could use knex purely as a query builder?

@nickrum
Copy link
Contributor Author

nickrum commented Jan 28, 2021

It should work without the client as long as every generated SQL is executed regardless of any errors. This is definitely not ideal but, to my knowledge at least, there is no easy way around this as the foreign key constraint enforcement definitely has to be reenabled after the transaction.

I can only think of two alternatives. One would be to basically execute the whole procedure in reinsertMapped() at build time of the query, rollback the changes and based on the result of the foreign keys check throw an error at build time or generate the SQL. The other would be to manually validate the constraints at build time of the query. Both approaches have the disadvantage that they depend on the state of the database at build time which would make the usage of the generated queries less flexible.

@nickrum
Copy link
Contributor Author

nickrum commented Feb 3, 2021

@kibertoad Let me know which approach you prefer or if you can think of another way to go about this.

@kibertoad
Copy link
Collaborator

@nickrum I would say that with the query builder approach it is not possible to have a completely safe solution, so I would argue that if user goes for query generation and executes it manually, they should be responsible for wrapping it into transaction and rolling back properly should any part of it (such as executing the foreign check) fails. When we are executing it directly (using runner.js -> queryArray()), we should be the ones ensuring that there is an atomic transaction for the whole array. Not sure why we might need any additional safety mechanisms beyond that.

@nickrum
Copy link
Contributor Author

nickrum commented Feb 3, 2021

@kibertoad Good point. What about the problem that PRAGMA foreign_keys does not work inside a transaction? A user who goes for query generation has to somehow know that those must not be executed inside a transaction. The same thing also applies to us when we are executing the queries directly in queryArray().

@kibertoad
Copy link
Collaborator

@nickrum Yeah, that is definitely a problem. I don't think that a good solution for that exists; considering that we don't know when and how user will execute generated SQL, we can't really provide him with an easy way to rollback their changes, so I would say that we should simply throw an error and let the user handle situation in this case.

@kibertoad
Copy link
Collaborator

@nickrum Do you think that there is anything left, other than fixing tests?

@nickrum
Copy link
Contributor Author

nickrum commented Feb 4, 2021

@kibertoad Sounds good to me. Regarding executing the queries directly inside queryArray(), I am not sure how to start a transaction inside that function and how to exclude the PRAGMA statements from the transaction in a clean way. Any ideas?

After finding a solution for that, fixing the tests and writing new tests to verify the added functionality, this PR should be good to go.

@kibertoad
Copy link
Collaborator

kibertoad commented Feb 27, 2021

@nickrum Any chance you could create a preliminary PR changing signature of generateDdlCommands to return an array, so that we could implement the rest of this functionality incrementally without introducing breaking changes within 0.95.0?

@nickrum
Copy link
Contributor Author

nickrum commented Feb 28, 2021

@kibertoad Sure! I feel like it is not clear which queries should be executed inside a transaction and which queries must not if generateDdlCommands returns an array of arrays, though. What do you think of returning an object like:

{
  pre: [],
  sql: [],
  post: []
}

This way, if foreign key constraint enforcement is disabled, the user could simply ignore the pre and post properties and use .generateDdlCommands().sql without having to worry about at which position in the array of arrays the actual queries to alter the schema are.

@kibertoad
Copy link
Collaborator

@nickrum Sounds like a great idea!

@OlivierCavadenti
Copy link
Collaborator

What is the status of this PR ? @nickrum

@nickrum nickrum marked this pull request as ready for review December 13, 2021 14:29
@nickrum
Copy link
Contributor Author

nickrum commented Dec 13, 2021

@kibertoad @OlivierCavadenti Sorry this took so long! I made several attempts throughout the year at finishing this PR but there were a lot of show-stoppers preventing me from doing so.

Initially I thought I could use the SQLite's PRAGMA defer_foreign_keys to defer the foreign keys check until the alter table commands have been executed. This wasn't possible due to several issues. That's why I had to add an additional check property to the object returned by generateDdlCommands to execute the check explicitly.

Copy link
Collaborator

@kibertoad kibertoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nickrum
Copy link
Contributor Author

nickrum commented Dec 13, 2021

In just pushed a small commit that wraps the statementsProducer transaction and the post query in a try-finally-block to align with the way the alter() function is implemented.

@OlivierCavadenti
Copy link
Collaborator

OlivierCavadenti commented Dec 13, 2021

Sorry this took so long! I made several attempts throughout the year at finishing this PR but there were a lot of show-stoppers preventing me from doing so.

No problem, if I ask, it's to know if we can finish the work or not, I understand that everyone is busy !
Anyway, thanks for this hard and amazing work, I can merge.

@OlivierCavadenti OlivierCavadenti merged commit 157e27d into knex:master Dec 13, 2021
@nickrum nickrum deleted the sqlite-foreign-key-check-fix branch December 14, 2021 13:18
nickrum added a commit to nickrum/knex that referenced this pull request Dec 15, 2021
…ex#4336)"

Reverts knex#4336. Now that knex#4189 has been merged, we can savely revert this PR.
nickrum added a commit to nickrum/knex that referenced this pull request Dec 15, 2021
…ex#4336)"

Reverts knex#4336.
Now that knex#4189 has been merged, we can safely revert this PR.
@rijkvanzanten
Copy link
Collaborator

@kibertoad hate to be that guy, but do you think we could sneak in a 0.95.15 before the big 1.0 release? 😬 Between this and #4923, a couple mission critical bugfixes for SQLite use have made it in to the project, but aren't available for use yet. Totally understand if v1 is the exclusive focus rn tho! ✌🏻

@kibertoad
Copy link
Collaborator

@rijkvanzanten Sure. can you give a laundry list of prs you want in it?

@rijkvanzanten
Copy link
Collaborator

@kibertoad This one is by far the most important (without, doing any table alterations drops foreign key columns completely, resulting in critical data loss). Other than that, #4923 would be nice to have, as it's a bug without a workaround. Anything else would be cherry on the cake 🙂

@kibertoad
Copy link
Collaborator

Released in 1.0.1.

@rijkvanzanten Do you still need 0.95 backport, or 1.x is sufficient?

@AlphaMoonbaseBerlin
Copy link

@kibertoad
Just FIY: This issue brings in some issues with Directus, which uses knex as its ORM.
Because of the switch to sqlite/vscode, they did not update knex to version 1. Some Discussion here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Foreign keys prevent table altering migrations on SQLite when foreign keys pragma is on
5 participants