Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrupted Databases #1274

Closed
jakemumu opened this issue May 2, 2024 · 10 comments
Closed

Corrupted Databases #1274

jakemumu opened this issue May 2, 2024 · 10 comments
Labels

Comments

@jakemumu
Copy link

jakemumu commented May 2, 2024

Hey!

I'm dealing with and debugging an issue with some corrupted databases while working with this lib in a multi threaded app -- I've been using the open_forever() approach, and using locks on my application layer anytime any thread interacts with the database object.

Unfortunately I haven't been able to reproduce the issue locally, but something is wrong with this approach and I'm seeing a few instances of corrupted databases which fail integrity checks. It appears we're fighting multi threading race conditions on two different levels, the sqlite_orm library itself, as well as sqlite itself, but the way they're handling them is opposite.

AFAIK a connection should be a single process, single threaded, opened & then quickly closed transaction -- they shouldn't be used across threads -- at the same time, sqlite_orm appears to struggle to manage or maintain connections between threads due to a lack of locking in the library itself.

I'm curious if the desire to not use locking in this library makes sense? -- I can't see a point where database code needs to be so performant that locks would make a difference, performance concerns here would come down to the queries and data itself, not the connections or accessing of said database.

Has anyone else seen randomly corrupted databases in the wild while using this lib? Happy to provide more details about the corrupted dbs themselves, but I'm guessing it has to do with the use of connections across threads.

Thanks!

@trueqbit
Copy link
Collaborator

trueqbit commented May 4, 2024

Hi @jakemumu

It's not correct that a connection should be only single-threaded. It all depends on the requirements, and there are multiple of them as you can imagine. Your view point is perfectly valid, but for me repeatedly opening and closing a connection is detrimental to fast execution and resource management.

This being said, sqlite_orm perfectly works in a multi-threaded application when using the open_forever() approach. Then there's SQLite3's different levels of support for multi-threading. If you are using the multi-thread aware version of SQLite3 then there's concurrent access control by default.

There have already been several questions about multithreading, some of which were solved in sqlite_orm, and some of which revealed that SQLite was simply being used in the wrong way.

This combination is absolutely thread-safe with regard to the database connection:

  • Using the open_forever() approach
  • Using a concurrency-aware build of SQLite3
  • Leaving the default concurrency settings of SQLite3

Can you describe in more detail how you use sqlite_orm?
Maybe you share a prepared SQL statement between threads?

@fnc12 fnc12 added the question label May 5, 2024
@jakemumu
Copy link
Author

jakemumu commented May 8, 2024

Thanks @trueqbit -- I'm spending more time investigating this today. So all of those are how we're using the library, I'm using open_forever in a multithreaded app, any usage of that connection is locked on the application level between threads, and I believe that part is working fine. -- today I'm going to investigate the sync_schema, but yeah I'm struggling to find why the corruption is happening.

The context of the library usage is inside of an audio plugin -- in this context, when an application opens -- it's possible for some 20 different "applications" to launch which all make their own sqlite_orm instances & open_forever connections to the database. These can all potentially boot at the same time meaning that they could all potentially try to sync schema etc at once as well.

I still haven't been able to reproduce the corruption but I've seen it happen both during instantiation and during runtime. The database itself is privately wrapped inside a class, and then that class has another execution wrapper which ensures locks are always present on the individual application level -- so today I'm going to investigate if there are potentially any concerns about multiple different applications connecting to the same DB at once.

Let me know if any of those would create concerns I should think about, thanks so much for your time!

@jakemumu
Copy link
Author

jakemumu commented May 8, 2024

I guess the question / concern is how does sqlite_orm perform when multiple multi-threaded applications are connected to the database -- in my mind it shouldn't make a difference but it appears to somehow?

@trueqbit
Copy link
Collaborator

trueqbit commented May 9, 2024

Well, and honestly, your question doesn't have so much to do with sqlite_orm, but more to do with the design of your library/application and the use of SQLite3. It's hard to get to the bottom of your problem without seeing some concrete design/code.

Anyway, sqlite_orm itself doesn't do anything special once you have a persistent database connection open. It just has a model of the database schema and provides an interface to the database, but otherwise it doesn't hold any data or a pool of anything that needs to be locked.

  • In any case, I would never synchronize a database schema from several processes. When you think about it, that seems like a very bad idea. Regardless of the database system and programming framework, I always update a schema in a separate step before running the application, e.g. as part of the installation.
  • Again the question: Do you share prepared statements between multiple threads? Actually, even such sharing should be fine if you do not update a prepared statement.
  • Why should you lock access to sqlite_orm's "storage" object? It's not necessary and just adds a thick layer of indirection and raises a lot of questions.

Below you will find some considerations and frequently asked questions on the subject of multi-threading and simultaneous database access:

@jakemumu
Copy link
Author

jakemumu commented May 9, 2024

Hmm, yes this all makes sense -- I'm not trying to blame sqlite_orm, but I have used other sqlite wrappers in the past and never saw these issues so I'm just trying to get to the bottom of it. The docs say to call sync_schema post storage construction so that's been my approach, but I'm realizing this might not be safe is multiple processes are calling it at the same time.

Regarding the locks around the connection, I was under the impression doing multiple writes into a single connection could cause errors -- atleast this is also something I saw locally which Is why I implemented locking around the database calls.

I'm going to do some more investigation and write some isolated test cases today to try to discern where it is the error is occurring and will report back! Perhaps it's nothing to do with the API.

In either case thank you for your responses and consideration!

@fnc12
Copy link
Owner

fnc12 commented May 10, 2024

@jakemumu if you have sync_schema calls in several processes try to put sync_schema inside a transaction. It will make SQLite know about other process making a set of queries

@fnc12
Copy link
Owner

fnc12 commented May 11, 2024

is there a way to specify those things when opening a connection?

How do you expect this to be handled in theory?

@jakemumu
Copy link
Author

jakemumu commented May 11, 2024

@fnc12 -- sorry I deleted the comment didn't realize you had responded -- I've got a very strenuous test of the library / sqlite which performs a series of parallel storage creations & insert operations -- on occasions I see database is locked issues when setting the journal & busy timeout settings -- for example when creating 10 storage objects in parallel. -- to be honest this is probably the hardest I've looked at DB objects before since I usually use Postgres which "just works" -- I'm going to do a bit more research on it today and report back.

but -- switching to WAL journal mode solved a lot of the concurrency issues -- I did see multiple storage objects trying to sync_schema at once which may have been my problem but both changing the journal mode and using exlcusive transactions solved it, just want to be very sure of my results before I post back -- it does appear to all be normal sqlite issues

@fnc12
Copy link
Owner

fnc12 commented May 12, 2024

@jakemumu that is really good!

@jakemumu
Copy link
Author

So I think what ended up happening is that we weren't doing retries properly, I didn't realize that even with a busy timeout set, you could still get busy errors from sqlite -- that among many other lifecycles and strange gotchas from sqlite I wasn't aware of.

Sorry!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants