
Txn Commit can result in transaction guarantee violation #6

Open · aarondwi opened this issue Jun 22, 2021 · 11 comments

@aarondwi commented Jun 22, 2021

Hi

Looking at the txn source, it seems that Commit does not hold the table lock for the entire apply duration, but instead acquires it per delete/update/insert. This means that a concurrent transaction might read a deleted value (because the delete is applied first) and yet not see the corresponding update. Even worse, the first transaction might delete a record while a second one tries to update it, resulting in a lost update. I haven't thought deeply about the per-column locks, but they probably have the same issues.

For disclosure, I have not tried to reproduce this, but a full lock is needed to ensure that only one transaction is updating until it finishes. This is what SQLite does. Databases with concurrent writers (e.g. Postgres/MySQL) use other protocols to handle this (one example is the xmin/xmax check in Postgres, or range/gap locks held until commit in MySQL).

The simplest solution is to just wrap the entire Commit call with a lock, at the cost of reduced concurrency; a minimal sketch of that is below.
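
For illustration only, here is roughly what that stop-gap could look like. Collection, Txn, commitMu and the apply* helpers are hypothetical stand-ins, not the actual kelindar/column types:

```go
package sketch

import "sync"

// Collection and Txn are illustrative stand-ins for the real types.
type Collection struct {
	commitMu sync.Mutex // serializes the entire apply phase of a commit
}

type Txn struct {
	owner *Collection
}

// Commit holds a single lock for the whole apply, so no concurrent
// transaction can observe a half-applied commit (e.g. the delete
// visible but the matching update not yet).
func (txn *Txn) Commit() {
	txn.owner.commitMu.Lock()
	defer txn.owner.commitMu.Unlock()

	txn.applyDeletes()
	txn.applyUpdates()
	txn.applyInserts()
}

// Placeholders for the real per-operation apply steps.
func (txn *Txn) applyDeletes() {}
func (txn *Txn) applyUpdates() {}
func (txn *Txn) applyInserts() {}
```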

What do you think?

@kelindar (Owner)

Thanks for the review, really appreciated. Yeah, I'm planning to do this, first by merging delete+update+insert together, acquiring the locks up front and releasing them at the end. Do you have any references on the protocols you mentioned? I was also thinking about sharding the locks: currently we have RWLocks per column, but it may be worth sharding them per row range (say, a lock per column for every 100K rows). Not quite sure yet what the best approach is.

@aarondwi (Author)

Some references from around the net:

  1. Postgres uses snapshot isolation semantics (pending transaction, committed, etc.), checking them against the xmin/xmax values of each row; a good blog post is here
  2. For serializability, Postgres needs yet another protocol on top of snapshots: Serializable Snapshot Isolation
  3. SQLite's semantics are snapshot isolation, but because only one writer is allowed, it has the same effect as serializable isolation. Explanation here
  4. The rest are from academic papers, such as Silo and MOCC. There are also some techniques for columnar databases, such as this one

About the sharding, it seems you could use ClickHouse's MergeTree, although ClickHouse doesn't guarantee any isolation at all. Still, this seems like a good tradeoff, optimizing for reads.

@aarondwi (Author)

It seems this behavior also affects the per-column lock. While reading a value, the code locks only the column, but the txn code, especially for delete and insert, doesn't do that.

@kelindar (Owner)

Great, thanks, I'll read through all of this over the weekend. Indeed, I'll also write some unit tests that exercise the different transaction isolation levels.

However, I wonder if we should drop column-level locks and instead create a sharded lock. I'm leaning towards a simple []sync.RWMutex slice for its simplicity, but with some padding to prevent false sharing. This way, we could acquire all of the necessary locks during the transaction and completely lock out those pages of rows when committing it. A sketch of such a padded, sharded lock is below.
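
A minimal sketch of that idea, assuming 64-byte cache lines; the names are illustrative, not actual kelindar/column code:

```go
package sketch

import "sync"

// paddedRWMutex pads each lock out to a 64-byte cache line so that
// locking one shard does not false-share with its neighbours
// (sync.RWMutex itself is 24 bytes on 64-bit platforms).
type paddedRWMutex struct {
	mu sync.RWMutex
	_  [64 - 24]byte
}

// shardedLock is a fixed, pre-allocated slice of padded RWMutexes.
type shardedLock struct {
	shards []paddedRWMutex
}

func newShardedLock(n int) *shardedLock {
	return &shardedLock{shards: make([]paddedRWMutex, n)}
}

func (s *shardedLock) RLock(i int)   { s.shards[i].mu.RLock() }
func (s *shardedLock) RUnlock(i int) { s.shards[i].mu.RUnlock() }
func (s *shardedLock) Lock(i int)    { s.shards[i].mu.Lock() }
func (s *shardedLock) Unlock(i int)  { s.shards[i].mu.Unlock() }
```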

I think we'll easily be able to guarantee read-committed isolation, but you would still have fuzzy and phantom reads: if we lock rows page by page, one transaction could commit a batch while another gets different values before and after its read. It's probably not a deal breaker, though. If you're doing read-then-write during a Range(), for example, we could optimistically lock pages; instead of collection.Query() we could have collection.View(), which initiates a transaction that does not allow updates, and collection.Update(), which does. View would only acquire read locks per page, while Update would acquire read-write locks instead.
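
A sketch of that split, reusing shardedLock from the sketch above; in reality the page set would be derived from the transaction's read/write set rather than passed in:

```go
package sketch

// LockedCollection is an illustrative stand-in for the collection.
type LockedCollection struct {
	locks *shardedLock // from the previous sketch
}

// View runs fn while holding only read locks on the touched pages.
// Pages must be acquired in a consistent order to avoid deadlock.
func (c *LockedCollection) View(pages []int, fn func() error) error {
	for _, p := range pages {
		c.locks.RLock(p)
	}
	defer func() {
		for _, p := range pages {
			c.locks.RUnlock(p)
		}
	}()
	return fn()
}

// Update runs fn while holding exclusive locks on the touched pages.
func (c *LockedCollection) Update(pages []int, fn func() error) error {
	for _, p := range pages {
		c.locks.Lock(p)
	}
	defer func() {
		for _, p := range pages {
			c.locks.Unlock(p)
		}
	}()
	return fn()
}
```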

@aarondwi (Author)

Separating read-only and read-write transactions would definitely make the API clearer about its guarantees; I totally support this.

As for the per-page (row shard?) locks, how would they handle inserts, since an insert might create a new page? But this would definitely help with the update/delete case.

@aarondwi (Author) commented Jun 22, 2021

On isolation levels: wouldn't typical read-focused databases benefit from snapshot isolation, since long-running and repeated queries would then read the same values? That would require something like per-row creation/deletion markers (à la Postgres xmin/xmax). But it would probably negate a lot of this repo's style of using fast bitmap checks, since those creation/deletion markers would undoubtedly have to be integers.
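
For reference, a minimal sketch of the Postgres-style visibility check being alluded to; real MVCC also has to check each transaction's commit status, which is omitted here:

```go
package sketch

// visible reports whether a row version should be seen by a
// transaction whose snapshot was taken at snapshotTxn. xmin is the
// id of the transaction that created the row, xmax the id of the
// one that deleted it (0 if the row is still live).
func visible(xmin, xmax, snapshotTxn uint64) bool {
	if xmin > snapshotTxn {
		return false // created after our snapshot was taken
	}
	if xmax != 0 && xmax <= snapshotTxn {
		return false // deleted before our snapshot was taken
	}
	return true
}
```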

@kelindar (Owner)

> As for the per-page (row shard?) locks, how would they handle inserts, since an insert might create a new page? But this would definitely help with the update/delete case.

It could be of fixed size: something like a pre-allocated slice, then use range reduction and modulo to figure out the shard index. For example, if you have a slice of 100 locks and want a new lock every 1K (1024) items, we can do index >> 10 % 100 to retrieve the lock id and lock the appropriate lock. For example:

index=500 shard=0
index=750 shard=0
index=1000 shard=0
index=1250 shard=1
index=1500 shard=1
...
index=102000 shard=99
index=102250 shard=99
index=102500 shard=0
index=102750 shard=0
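
In Go, that arithmetic is just (a sketch matching the table above):

```go
// shardFor maps a row index to a lock shard: a new lock every
// 1024 rows (index >> 10), wrapped around a fixed pool of 100
// locks (% 100).
func shardFor(index uint32) int {
	return int((index >> 10) % 100)
}
```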

> On isolation levels: wouldn't typical read-focused databases benefit from snapshot isolation, since long-running and repeated queries would then read the same values? That would require something like per-row creation/deletion markers (à la Postgres xmin/xmax). But it would probably negate a lot of this repo's style of using fast bitmap checks, since those creation/deletion markers would undoubtedly have to be integers.

Yeah, but first things first: implementing read committed properly, then moving on to snapshot and potentially serializable. Technically, we could let users choose their isolation level and simply provide a global RWLock that is acquired during View() and Update(), at the cost of concurrency, of course. I think MongoDB used to have a database-wide lock for a long time. In reality, users won't need to have a single collection and could still manually shard collections.

@aarondwi (Author)

> It could be of fixed size: something like a pre-allocated slice, then use range reduction and modulo to figure out the shard index

Ah okay, with a predefined maximum length, that many locks will be enough.

> Yeah, but first things first: implementing read committed properly, then moving on to snapshot and potentially serializable.

I take it this means the repo may support multiple isolation levels at once? Some databases choose to offer only one isolation level, because not everyone understands isolation levels well enough.

> In reality, users won't need to have a single collection and could still manually shard collections.

Definitely true

@kelindar (Owner)

Some work in progress #7

@xiegeo commented Dec 13, 2022

@kelindar Can this work be summarized in the readme to describe the current implementation? From the readme, the suggestion to use merge instead of reading and writing in a single transaction seems to imply that transactions are not atomic. But otherwise, the behaviour and performance of concurrent writes leave me guessing.

@kelindar (Owner)

I'll add this to the readme. TL;DR is that transactions are atomic, but the isolation level currently implemented is essentially read committed, which means that you have dirty/phantom reads in the system that you have to deal with. In order to update a particular row without reading it, and hence without being exposed to dirty/phantom reads, merge was introduced.
