-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: cli,events: speed up backfill with temporary index #11953
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions that are probably not feasible or worth the hassle given that this is a shed command.
Unfortunately I'm web3 native so SQL is all magic to me but I trust that what you're doing is fine.
@@ -32,6 +32,8 @@ var pragmas = []string{ | |||
"PRAGMA read_uncommitted = ON", | |||
} | |||
|
|||
// Any changes to this schema should be matched for the `lotus-shed indexes backfill-events` command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we share this schema with lotus-shed somehow or at least add a test that will break if this doesn't happen? I don't trust that this comment will do a very good job enforcing consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly the queries I'm concerned about; see #11723 where this becomes a problem and is why I thought to add this comment in there - the schema will change, then the queries will break in lotus-shed. We could put all the queries in index.go, but then we'd have to make them public to make them useful, and do we want to be stuck with additional public APIs just for this? We could, it just seems noisy for a command that basically nobody uses (I was running this on calibnet and it was painfully slow, yet nobody's complained since it was introduced as far as I know).
@@ -92,8 +103,12 @@ var backfillEventsCmd = &cli.Command{ | |||
return err | |||
} | |||
|
|||
log.Infof( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this should be log.Warnf
Long shot -- does the API expose the state of event collection? It would be nice if we could just error here if the node is currently collecting events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, and we'd end up with a race anyway, unless it was something like "give me a lock". It would be nice to do this all via RPC so we don't have this problem but I'm not sure an entirely new API just for this makes any sense.
252b6bf
to
27402a8
Compare
const ( | ||
// same as in chain/events/index.go | ||
eventExists = `SELECT MAX(id) FROM event WHERE height=? AND tipset_key=? AND tipset_key_cid=? AND emitter_addr=? AND event_index=? AND message_cid=? AND message_index=?` | ||
insertEvent = `INSERT OR IGNORE INTO event(height, tipset_key, tipset_key_cid, emitter_addr, event_index, message_cid, message_index, reverted) VALUES(?, ?, ?, ?, ?, ?, ?, ?)` | ||
insertEntry = `INSERT OR IGNORE INTO event_entry(event_id, indexed, flags, key, codec, value) VALUES(?, ?, ?, ?, ?, ?)` | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this PR to pull these in, they're identical to chain/events/index.go, so it does suggest that maybe we should consider making them public over there. It would help keep things in sync, but would likely just result in this command failing the next time someone tries it and the parameters don't match. But we mostly have that problem already anyway.
27402a8
to
6e571ab
Compare
From experience running this:
IMMEDIATE
transactions so we know as soon as we start a transaction whether we can get an exclusive lock for a write or not so we can error more easily.