-
-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Corrupted Databases #1274
Comments
Hi @jakemumu It's not correct that a connection should be only single-threaded. It all depends on the requirements, and there are multiple of them as you can imagine. Your view point is perfectly valid, but for me repeatedly opening and closing a connection is detrimental to fast execution and resource management. This being said, sqlite_orm perfectly works in a multi-threaded application when using the There have already been several questions about multithreading, some of which were solved in sqlite_orm, and some of which revealed that SQLite was simply being used in the wrong way. This combination is absolutely thread-safe with regard to the database connection:
Can you describe in more detail how you use sqlite_orm? |
Thanks @trueqbit -- I'm spending more time investigating this today. So all of those are how we're using the library, I'm using open_forever in a multithreaded app, any usage of that connection is locked on the application level between threads, and I believe that part is working fine. -- today I'm going to investigate the sync_schema, but yeah I'm struggling to find why the corruption is happening. The context of the library usage is inside of an audio plugin -- in this context, when an application opens -- it's possible for some 20 different "applications" to launch which all make their own sqlite_orm instances & open_forever connections to the database. These can all potentially boot at the same time meaning that they could all potentially try to sync schema etc at once as well. I still haven't been able to reproduce the corruption but I've seen it happen both during instantiation and during runtime. The database itself is privately wrapped inside a class, and then that class has another execution wrapper which ensures locks are always present on the individual application level -- so today I'm going to investigate if there are potentially any concerns about multiple different applications connecting to the same DB at once. Let me know if any of those would create concerns I should think about, thanks so much for your time! |
I guess the question / concern is how does sqlite_orm perform when multiple multi-threaded applications are connected to the database -- in my mind it shouldn't make a difference but it appears to somehow? |
Well, and honestly, your question doesn't have so much to do with sqlite_orm, but more to do with the design of your library/application and the use of SQLite3. It's hard to get to the bottom of your problem without seeing some concrete design/code. Anyway, sqlite_orm itself doesn't do anything special once you have a persistent database connection open. It just has a model of the database schema and provides an interface to the database, but otherwise it doesn't hold any data or a pool of anything that needs to be locked.
Below you will find some considerations and frequently asked questions on the subject of multi-threading and simultaneous database access:
|
Hmm, yes this all makes sense -- I'm not trying to blame sqlite_orm, but I have used other sqlite wrappers in the past and never saw these issues so I'm just trying to get to the bottom of it. The docs say to call sync_schema post storage construction so that's been my approach, but I'm realizing this might not be safe is multiple processes are calling it at the same time. Regarding the locks around the connection, I was under the impression doing multiple writes into a single connection could cause errors -- atleast this is also something I saw locally which Is why I implemented locking around the database calls. I'm going to do some more investigation and write some isolated test cases today to try to discern where it is the error is occurring and will report back! Perhaps it's nothing to do with the API. In either case thank you for your responses and consideration! |
@jakemumu if you have |
How do you expect this to be handled in theory? |
@fnc12 -- sorry I deleted the comment didn't realize you had responded -- I've got a very strenuous test of the library / sqlite which performs a series of parallel storage creations & insert operations -- on occasions I see database is locked issues when setting the journal & busy timeout settings -- for example when creating 10 storage objects in parallel. -- to be honest this is probably the hardest I've looked at DB objects before since I usually use Postgres which "just works" -- I'm going to do a bit more research on it today and report back. but -- switching to WAL journal mode solved a lot of the concurrency issues -- I did see multiple storage objects trying to sync_schema at once which may have been my problem but both changing the journal mode and using exlcusive transactions solved it, just want to be very sure of my results before I post back -- it does appear to all be normal sqlite issues |
@jakemumu that is really good! |
So I think what ended up happening is that we weren't doing retries properly, I didn't realize that even with a busy timeout set, you could still get busy errors from sqlite -- that among many other lifecycles and strange gotchas from sqlite I wasn't aware of. Sorry! |
Hey!
I'm dealing with and debugging an issue with some corrupted databases while working with this lib in a multi threaded app -- I've been using the open_forever() approach, and using locks on my application layer anytime any thread interacts with the database object.
Unfortunately I haven't been able to reproduce the issue locally, but something is wrong with this approach and I'm seeing a few instances of corrupted databases which fail integrity checks. It appears we're fighting multi threading race conditions on two different levels, the sqlite_orm library itself, as well as sqlite itself, but the way they're handling them is opposite.
AFAIK a connection should be a single process, single threaded, opened & then quickly closed transaction -- they shouldn't be used across threads -- at the same time, sqlite_orm appears to struggle to manage or maintain connections between threads due to a lack of locking in the library itself.
I'm curious if the desire to not use locking in this library makes sense? -- I can't see a point where database code needs to be so performant that locks would make a difference, performance concerns here would come down to the queries and data itself, not the connections or accessing of said database.
Has anyone else seen randomly corrupted databases in the wild while using this lib? Happy to provide more details about the corrupted dbs themselves, but I'm guessing it has to do with the use of connections across threads.
Thanks!
The text was updated successfully, but these errors were encountered: