Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flat storage tool #11013

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft

Conversation

posvyatokum
Copy link
Member

This PR is for demonstration purposes only, and should not be merged as is.

I plan to use flat-storage init tool to recreate a flat storage from a historical height. But this tool uses final height right now.
Finding and changing every usage of BlockMisc in flat storage initialisation seemed like too much. So before init I "overwrite" BlockMisc column.
This is done through MixedDB. This struct is also used to write all the changes in the separate db.
MixedDB has read-only database, and read-write database. All writes go to the read-write db, but reading is done sequentially. Read priority is decided during db creation.
So, to "overwrite" base db, I need to combine it with read-only db that contains amendments, and choose read priority as ReadDBFirst.
To write something to another db, I need to use it as a read-write db, and choose read priority as WriteDBFirst (as i assume that we should give priority to a fresher data).

I have some questions still:

  • For which height should we create FlatStorage? And should we call init tool for height + 1 or height?
  • Should we add parallel init of all shards? Or there is no profit over calling this tool sequentially?
  • Should I replace the db in the same way in the verify command and run it after the FlatStorage creation? Will it help us to find any bugs?

Copy link

codecov bot commented Apr 9, 2024

Codecov Report

Attention: Patch coverage is 0% with 206 lines in your changes are missing coverage. Please review.

Project coverage is 71.21%. Comparing base (246f7cc) to head (b3cfa5e).
Report is 2 commits behind head on master.

Files Patch % Lines
tools/flat-storage/src/commands.rs 0.00% 104 Missing ⚠️
core/store/src/db/mixeddb.rs 0.00% 102 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11013      +/-   ##
==========================================
- Coverage   71.33%   71.21%   -0.12%     
==========================================
  Files         760      761       +1     
  Lines      152301   152530     +229     
  Branches   152301   152530     +229     
==========================================
- Hits       108637   108631       -6     
- Misses      39197    39426     +229     
- Partials     4467     4473       +6     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.43% <0.00%> (-0.01%) ⬇️
integration-tests 36.88% <0.00%> (-0.11%) ⬇️
linux 69.68% <0.00%> (-0.12%) ⬇️
linux-nightly 70.68% <0.00%> (-0.12%) ⬇️
macos 54.21% <0.00%> (-0.08%) ⬇️
pytests 1.66% <0.00%> (-0.01%) ⬇️
sanity-checks 1.44% <0.00%> (-0.01%) ⬇️
unittests 66.87% <0.00%> (-0.12%) ⬇️
upgradability 0.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


// This DB can only write to write_db
// When reading, it first reads from wirte_db, then amended_db, then base_db
let mixed_db = MixedDB::new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirming

  • we are creating a mixed db where we are writing to the custom recovery database we are trying to create with the flat storage?
  • we are reading from the split storage db but with one tiny change where we are manually setting the final height to our custom value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True and True

@shreyan-gupta
Copy link
Contributor

shreyan-gupta commented Apr 11, 2024

Confirming the flat head is inclusive of the data of the block height for which we have flat storage created.

// Here we iterate over all the child shards and initialize flat storage for them by calling set_flat_storage_state
// Note that this function is called on the current_block which is the first block the next epoch.
// We set the flat_head as the prev_block as after resharding, the state written to flat storage corresponds to the
// state as of prev_block, and that's the convention that we follow.

@shreyan-gupta
Copy link
Contributor

Taking a look at the implementation of FlatStorageShardCreator::update_status, sounds like we build the flat storage as of the tip of the blockchain got from chain_store.final_head().last_block_hash.

Considering resharding we would like the final_head to be the last block of the epoch right before resharding begins.

.set_ser(DBCol::FlatStorageStatus, &shard_uid.to_bytes(), &FlatStorageStatus::Empty)
.expect("Unable to set FS status");
store_update.commit().expect("Unable to change FS status");

let mut creator =
FlatStorageShardCreator::new(shard_uid, tip.height - 1, epoch_manager, rw_hot_runtime);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some weird logic checking if specific deltas exist for the start_height parameter upto final_head. I would probably try to set this to tip.height instead of tip.height - 1 to begin with, although @Longarithm or @pugachAG might have a better idea why this has a -1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe for more safety... It is easier to move head forward later than backwards (see MoveFlatHeadMode::Back). tip.height also works.

@posvyatokum
Copy link
Member Author

This tool was used to build FS for testnet and mainnet with final height set to the last block before the switch to 5 shards.
For testnet the height is 156901187
For mainnet the height is 114580307
The tool builds FS only for one shard.
Testnet build succeeded for all shards. Results can be found at ubuntu@testnet-db-recovery:~/.near/recovery-data/156901187/data.
Mainnet build for shard 2 consistently fails with

thread '<unnamed>' panicked at chain/chain/src/flat_storage_creator.rs:110:68:
called `Result::unwrap()` on an `Err` value: MissingTrieValue(TrieStorage, 6v9kMHtTMsfoSEJNUvJhVT5tCxTjjDBEJXMAU5uFUiLv)
Neard command

This is the neard command that fails with this binary

/home/ubuntu/neard_for_fs --unsafe-fast-startup --home /home/ubuntu/.near flat-storage init --height 114580307 --shard-id 2 --target-db-path /home/ubuntu/.near/recovery-data/114580307/data-2 16

This is a problem for us, as the current approach requires us to build FS pre-resharding for faster performance.
But more than that, even if we use the State column directly, I am afraid that we are going to experience the same error, as it is not a FS problem, but a Trie consistency problem.

We also need a way to verify FS for optional height (current verify tool works only for final height). We maybe can do the same kind of modification as my draft PR, but I am not sure that this is a valid solution.
We can always just check the final product (node after FS creation and re-resharding) by querying historical requests. But earlier checks are preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants