Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add database analyse-gas-usage command #10240

Merged
merged 24 commits into from Nov 30, 2023

Conversation

jancionear
Copy link
Contributor

@jancionear jancionear commented Nov 23, 2023

Add a command which allows to analyse gas usage on some part of the blockchain.
The command goes over the specified blocks and gathers information about how much
gas was used by each block, shard and account.
Then it displays an analysis of all the data.

The analysis contains:

  • Total gas used
  • Gas used on each shard
  • Top 10 accounts by gas usage
  • The optimal account by which a shard could be split into two halves with similar gas usage

Help page:

$ ./neard database analyse-gas-usage --help
Analyse gas usage in a chosen sequnce of blocks

Usage: neard database analyse-gas-usage [OPTIONS]

Options:
      --last-blocks <LAST_BLOCKS>
          Analyse the last N blocks in the blockchain
      --from-block-height <FROM_BLOCK_HEIGHT>
          Analyse blocks from the given block height, inclusive
      --to-block-height <TO_BLOCK_HEIGHT>
          Analyse blocks up to the given block height, inclusive
  -h, --help
          Print help

Example usage:

# Analyse gas usage in the last 1000 blocks
cargo run --bin neard -- database analyse-gas-usage --last-blocks 1000

# Analyse gas usage in blocks with heights between 200 and 300 (inclusive)
./neard database analyse-gas-usage --from-block-height 200 --to-block-height 300

Add a command which will allow to analyse gas usage on the blockchain.
The command can be executed by running:
```bash
./neard database analyse-gas-usage
```
The command will look at gas used in all of the blocks,
shards and accounts, and print out an analysis report.

This commit just adds the command, the implementation will
be added in the following commits.
Add a flag that can be used to specify how many latest blocks
should be analysed using the analyse-gas-usage command.

Now running:
```bash
./neard database analyse-gas-usage --last-blocks 1000
```

Will go over the last 1000 blocks in the blockchain
and analyse them.
For each block we want to gather information about how much gas was
used in every shard and account.

Gas is used when transactions and receipts are executed, and every
such execution produces an ExecutionOutcome, which is saved in the
database.
We can read all ExecutionOutcomes originating from a given shard
from the database, and get all of the needed information from there.
Every ExecutionOutcome contains the AccountId of the executor
and the amount of gas that was burned during the execution.
It's everything that is needed for analysing gas usage.

Usage from all blocks is merged into a single instace of GasUsageStats
and the interesting pieces of information are displayed for the user.
Add a flag that can be used to specify a range of blocks
to be analysed.

Now running:
```bash
./neard database analyse-gas-usage --from-block-height 120 --to-block-height 130
```

Will analyse 11 blocks with heights in range of [120, 130]

The logic of converting command line arguments to block iterators
has been extracted to the block_iterators module. In the future
it'll be possible to reuse it for another command that wants to
analyse some subset of blocks.
For each shard, calculate the optimal account that the shard could
be split at, so that gas usage on both sides of the split is similar.
An exact split isn't always possible, because it's possible for one account
to consume 40% of all gas, but the function tries its best to make it fair.

Information about optimal splits is displayed for the user.
Find the accounts that consume the most gas and display
them in the analysis. This information is interesting,
because it might turn out that one account consumes 50%
of all gas.
@jancionear jancionear requested a review from a team as a code owner November 23, 2023 16:50
@jancionear
Copy link
Contributor Author

jancionear commented Nov 23, 2023

Notes:

  • I'm not entirely sure that the technique used in get_gas_usage_in_block is the correct way to measure used gas. Gas is a tricky subject, I would appreciate if someone with more knowledge about the subject would chime in.
  • There's a lot of unwraps in the code. Some people dislike that, but in my opinion unwrap() is a good choice for fatal errors in user facing applications. An unwrap() prints the whole error and the exact line where the error has occured, which is great for finding out what went wrong. With Result I often ended up with messages like: Failed: Error: NotFound, which isn't very useful.
  • The tool isn't very fast - analysing a whole epoch (43200 blocks) takes about 8 minutes. This is most likely caused by the fact that it reads one piece of information from the database at a time. If needed it could probably be sped up by running multiple get_gas_usage_in_block in parallel.
  • Everything fits in the memory. The tool only keeps one u128 per account. Analysing a whole epoch of blocks required only ~300MB of RAM.

@jancionear
Copy link
Contributor Author

jancionear commented Nov 23, 2023

Example output:

$ ./neard database analyse-gas-usage --last-blocks 43200
2023-11-29T18:39:21.516429Z  INFO neard: version="trunk" build="1.1.0-4614-g645a5a8ea-modified" latest_protocol=64
2023-11-29T18:39:21.516636Z  INFO config: Validating Config, extracted from config.json...
2023-11-29T18:39:21.518495Z  WARN genesis: Skipped genesis validation
2023-11-29T18:39:21.518525Z  INFO config: Validating Genesis config and records. This could take a few minutes...
2023-11-29T18:39:21.518836Z  INFO config: All validations have passed!
2023-11-29T18:39:21.518878Z  INFO db_opener: Opening NodeStorage path="/home/ubuntu/.near/data" cold_path="none"
2023-11-29T18:39:21.518898Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-29T18:39:21.537906Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-29T18:39:21.537928Z  INFO db_opener: The database exists. path=/home/ubuntu/.near/data
2023-11-29T18:39:21.537940Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-29T18:39:21.697480Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-29T18:39:21.697511Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-29T18:39:21.714940Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-29T18:39:21.714959Z  INFO db: Opened a new RocksDB instance. num_instances=1

Analysed 43200 blocks between:
Block: height = 106240517, hash = HaqJ1P71gAgHDdMjju72CkZ43uuAS4W3qRVU1LAta4Cq
Block: height = 106196896, hash = AWVbrPJUgy645ZB6eLh3fgZohQ5dQK66RKsh1TrhWBQB

Total gas used: 11203674.25 TGas

Shard: s0.v1
  Gas usage: 2417726.52 TGas (21.6% of total)
  Number of accounts: 197450
  Biggest account: asset-manager.orderly-network.near
  Biggest account gas: 226419.58 TGas (9.4% of shard)
  Optimal split:
    boundary_account: 6f3976e0f6e108b8d621ae935705dcfc62871afb806b753d79c0ee2929529b2f
    gas(boundary_account): 6.60 TGas
    Gas distribution (left, boundary_acc, right): (50.0%, 0.0%, 50.0%)
    Left (account < boundary_account):
      Gas: 1208862.19 TGas (50.0% of shard)
      Accounts: 122329
      Top 3 accounts:
        #1: 105dedf45c79957f708a85b04aed2de23a79782ffc2840a735fbbb58dddde2cf
            Used gas: 17243.59 TGas (0.7% of shard)
        #2: 51c0b243a75ace637d883afaae4e912d8d5842bfae11e8adfc9afc269fb0b70f
            Used gas: 13429.25 TGas (0.6% of shard)
        #3: 5c33c6218d47e00ef229f60da78d0897e1ee9665312550b8afd5f9c7bc6957d2
            Used gas: 9646.04 TGas (0.4% of shard)
    Right (account >= boundary_account):
      Gas: 1208864.33 TGas (50.0% of shard)
      Accounts: 75121
      Top 3 accounts:
        #1: asset-manager.orderly-network.near
            Used gas: 226419.58 TGas (9.4% of shard)
        #2: app.nearcrowd.near
            Used gas: 207878.41 TGas (8.6% of shard)
        #3: a-z0-9.near
            Used gas: 30547.30 TGas (1.3% of shard)

Shard: s1.v1
  Gas usage: 217558.66 TGas (1.9% of total)
  Number of accounts: 1
  Biggest account: aurora
  Biggest account gas: 217558.66 TGas (100.0% of shard)
  No optimal split for this shard

Shard: s2.v1
  Gas usage: 3492199.44 TGas (31.2% of total)
  Number of accounts: 118485
  Biggest account: earn.kaiching
  Biggest account gas: 2130683.13 TGas (61.0% of shard)
  Optimal split:
    boundary_account: earn.kaiching
    gas(boundary_account): 2130683.13 TGas
    Gas distribution (left, boundary_acc, right): (22.0%, 61.0%, 17.0%)
    Left (account < boundary_account):
      Gas: 767584.96 TGas (22.0% of shard)
      Accounts: 68065
      Top 3 accounts:
        #1: citric.near
            Used gas: 32482.59 TGas (0.9% of shard)
        #2: distributions.grow.sweat
            Used gas: 22890.11 TGas (0.7% of shard)
        #3: client-eth2.bridge.near
            Used gas: 16520.35 TGas (0.5% of shard)
    Right (account >= boundary_account):
      Gas: 2724614.48 TGas (78.0% of shard)
      Accounts: 50420
      Top 3 accounts:
        #1: earn.kaiching
            Used gas: 2130683.13 TGas (61.0% of shard)
        #2: jars.sweat
            Used gas: 128436.78 TGas (3.7% of shard)
        #3: hotwallet.kaiching
            Used gas: 51499.23 TGas (1.5% of shard)

Shard: s3.v1
  Gas usage: 5076189.62 TGas (45.3% of total)
  Number of accounts: 81093
  Biggest account: token.sweat
  Biggest account gas: 2299587.29 TGas (45.3% of shard)
  Optimal split:
    boundary_account: token.sweat
    gas(boundary_account): 2299587.29 TGas
    Gas distribution (left, boundary_acc, right): (27.5%, 45.3%, 27.2%)
    Left (account < boundary_account):
      Gas: 1395152.95 TGas (27.5% of shard)
      Accounts: 49350
      Top 3 accounts:
        #1: sweat_welcome.near
            Used gas: 470514.16 TGas (9.3% of shard)
        #2: spin.sweat
            Used gas: 174138.78 TGas (3.4% of shard)
        #3: tge-lockup.sweat
            Used gas: 92929.73 TGas (1.8% of shard)
    Right (account >= boundary_account):
      Gas: 3681036.67 TGas (72.5% of shard)
      Accounts: 31743
      Top 3 accounts:
        #1: token.sweat
            Used gas: 2299587.29 TGas (45.3% of shard)
        #2: wallet.kaiching
            Used gas: 612465.09 TGas (12.1% of shard)
        #3: v2.ref-finance.near
            Used gas: 244591.22 TGas (4.8% of shard)

10 biggest accounts by gas usage:
#1: token.sweat
    Used gas: 2299587.29 TGas (20.5% of total)
#2: earn.kaiching
    Used gas: 2130683.13 TGas (19.0% of total)
#3: wallet.kaiching
    Used gas: 612465.09 TGas (5.5% of total)
#4: sweat_welcome.near
    Used gas: 470514.16 TGas (4.2% of total)
#5: v2.ref-finance.near
    Used gas: 244591.22 TGas (2.2% of total)
#6: asset-manager.orderly-network.near
    Used gas: 226419.58 TGas (2.0% of total)
#7: aurora
    Used gas: 217558.66 TGas (1.9% of total)
#8: app.nearcrowd.near
    Used gas: 207878.41 TGas (1.9% of total)
#9: spin.sweat
    Used gas: 174138.78 TGas (1.6% of total)
#10: jars.sweat
    Used gas: 128436.78 TGas (1.1% of total)
2023-11-29T18:45:55.533594Z  INFO db: Closed a RocksDB instance. num_instances=0
Old (v2) output
$ ./neard database analyse-gas-usage --last-blocks 43200
2023-11-24T18:45:42.494418Z  INFO neard: version="trunk" build="1.1.0-4609-g81b328a0f" latest_protocol=64
2023-11-24T18:45:42.495634Z  INFO config: Validating Config, extracted from config.json...
2023-11-24T18:45:42.520064Z  WARN genesis: Skipped genesis validation
2023-11-24T18:45:42.520153Z  INFO config: Validating Genesis config and records. This could take a few minutes...                                                                                                                                                             
2023-11-24T18:45:42.520770Z  INFO config: All validations have passed!
2023-11-24T18:45:42.521106Z  INFO db_opener: Opening NodeStorage path="/home/ubuntu/.near/data" cold_path="none"
2023-11-24T18:45:42.521190Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-24T18:45:42.549932Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-24T18:45:42.549996Z  INFO db_opener: The database exists. path=/home/ubuntu/.near/data
2023-11-24T18:45:42.550039Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-24T18:45:42.720600Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-24T18:45:42.720670Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-24T18:45:42.738937Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-24T18:45:42.738976Z  INFO db: Opened a new RocksDB instance. num_instances=1

Analysed 43200 blocks between:
Block: height = 106240517, hash = HaqJ1P71gAgHDdMjju72CkZ43uuAS4W3qRVU1LAta4Cq
Block: height = 106196896, hash = AWVbrPJUgy645ZB6eLh3fgZohQ5dQK66RKsh1TrhWBQB

Total gas used: 11203674.25 TGas

Shard: s0.v1
  Gas usage: 2417726.52 TGas (21.6% of total)
  Number of accounts: 197450
  Biggest account: asset-manager.orderly-network.near
  Biggest account gas: 226419.58 TGas (9.4% of shard total)
  Optimal split:
    split_account: 6f38z8mm6wsg.users.kaiching
    gas(split_account): 3.10 TGas
    Left (account <= split_account):
      Gas: 1208862.19 TGas (50.0% of shard total)
      Accounts: 122329
      Top 3 accounts:
        #1: 105dedf45c79957f708a85b04aed2de23a79782ffc2840a735fbbb58dddde2cf
            Used gas: 17243.59 TGas (1.4% of shard split half)
        #2: 51c0b243a75ace637d883afaae4e912d8d5842bfae11e8adfc9afc269fb0b70f
            Used gas: 13429.25 TGas (1.1% of shard split half)
        #3: 5c33c6218d47e00ef229f60da78d0897e1ee9665312550b8afd5f9c7bc6957d2
            Used gas: 9646.04 TGas (0.8% of shard split half)
    Right (account > split_account):
      Gas: 1208864.33 TGas (50.0% of shard total)
      Accounts: 75121
      Top 3 accounts:
        #1: asset-manager.orderly-network.near
            Used gas: 226419.58 TGas (18.7% of shard split half)
        #2: app.nearcrowd.near
            Used gas: 207878.41 TGas (17.2% of shard split half)
        #3: a-z0-9.near
            Used gas: 30547.30 TGas (2.5% of shard split half)

Shard: s1.v1
  Gas usage: 217558.66 TGas (1.9% of total)
  Number of accounts: 1
  Biggest account: aurora
  Biggest account gas: 217558.66 TGas (100.0% of shard total)
  No optimal split for this shard

Shard: s2.v1
  Gas usage: 3492199.44 TGas (31.2% of total)
  Number of accounts: 118485
  Biggest account: earn.kaiching
  Biggest account gas: 2130683.13 TGas (61.0% of shard total)
  Optimal split:
    split_account: earlysunrise4919227330.u.arkana.near
    gas(split_account): 9.43 TGas
    Left (account <= split_account):
      Gas: 767584.96 TGas (22.0% of shard total)
      Accounts: 68065
      Top 3 accounts:
        #1: citric.near
            Used gas: 32482.59 TGas (4.2% of shard split half)
        #2: distributions.grow.sweat
            Used gas: 22890.11 TGas (3.0% of shard split half)
        #3: client-eth2.bridge.near
            Used gas: 16520.35 TGas (2.2% of shard split half)
    Right (account > split_account):
      Gas: 2724614.48 TGas (78.0% of shard total)
      Accounts: 50420
      Top 3 accounts:
        #1: earn.kaiching
            Used gas: 2130683.13 TGas (78.2% of shard split half)
        #2: jars.sweat
            Used gas: 128436.78 TGas (4.7% of shard split half)
        #3: hotwallet.kaiching
            Used gas: 51499.23 TGas (1.9% of shard split half)

Shard: s3.v1
  Gas usage: 5076189.62 TGas (45.3% of total)
  Number of accounts: 81093
  Biggest account: token.sweat
  Biggest account gas: 2299587.29 TGas (45.3% of shard total)
  Optimal split:
    split_account: token.stlb.near
    gas(split_account): 13.25 TGas
    Left (account <= split_account):
      Gas: 1395152.95 TGas (27.5% of shard total)
      Accounts: 49350
      Top 3 accounts:
        #1: sweat_welcome.near
            Used gas: 470514.16 TGas (33.7% of shard split half)
        #2: spin.sweat
            Used gas: 174138.78 TGas (12.5% of shard split half)
        #3: tge-lockup.sweat
            Used gas: 92929.73 TGas (6.7% of shard split half)
    Right (account > split_account):
      Gas: 3681036.67 TGas (72.5% of shard total)
      Accounts: 31743
      Top 3 accounts:
        #1: token.sweat
            Used gas: 2299587.29 TGas (62.5% of shard split half)
        #2: wallet.kaiching
            Used gas: 612465.09 TGas (16.6% of shard split half)
        #3: v2.ref-finance.near
            Used gas: 244591.22 TGas (6.6% of shard split half)

10 biggest accounts by gas usage:
#1: token.sweat
    Used gas: 2299587.29 TGas (20.5% of total)
#2: earn.kaiching
    Used gas: 2130683.13 TGas (19.0% of total)
#3: wallet.kaiching
    Used gas: 612465.09 TGas (5.5% of total)
#4: sweat_welcome.near
    Used gas: 470514.16 TGas (4.2% of total)
#5: v2.ref-finance.near
    Used gas: 244591.22 TGas (2.2% of total)
#6: asset-manager.orderly-network.near
    Used gas: 226419.58 TGas (2.0% of total)
#7: aurora
    Used gas: 217558.66 TGas (1.9% of total)
#8: app.nearcrowd.near
    Used gas: 207878.41 TGas (1.9% of total)
#9: spin.sweat
    Used gas: 174138.78 TGas (1.6% of total)
#10: jars.sweat
    Used gas: 128436.78 TGas (1.1% of total)
2023-11-24T18:58:32.810985Z  INFO db: Closed a RocksDB instance. num_instances=0
Old (buggy) output
$ time ./target/release/neard database analyse-gas-usage --last-blocks 43200
2023-11-23T16:39:01.093282Z  INFO neard: version="trunk" build="1.1.0-4597-gb7d470924" latest_protocol=64
2023-11-23T16:39:01.093504Z  INFO config: Validating Config, extracted from config.json...
2023-11-23T16:39:01.095173Z  WARN genesis: Skipped genesis validation
2023-11-23T16:39:01.095199Z  INFO config: Validating Genesis config and records. This could take a few minutes...
2023-11-23T16:39:01.095523Z  INFO config: All validations have passed!
2023-11-23T16:39:01.095555Z  INFO db_opener: Opening NodeStorage path="/home/ubuntu/.near/data" cold_path="none"
2023-11-23T16:39:01.095589Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.114325Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.114350Z  INFO db_opener: The database exists. path=/home/ubuntu/.near/data
2023-11-23T16:39:01.114361Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.273929Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.273965Z  INFO db: Opened a new RocksDB instance. num_instances=1
2023-11-23T16:39:01.291481Z  INFO db: Closed a RocksDB instance. num_instances=0
2023-11-23T16:39:01.291506Z  INFO db: Opened a new RocksDB instance. num_instances=1
Analysed 43200 blocks between:
Block: height = 106240517, hash = HaqJ1P71gAgHDdMjju72CkZ43uuAS4W3qRVU1LAta4Cq
Block: height = 106196896, hash = AWVbrPJUgy645ZB6eLh3fgZohQ5dQK66RKsh1TrhWBQB

Total gas used: 22407270193906837544

Shard: s0.v1
  Gas usage: 4835423212212028560 (21.6% of total)
  Number of accounts: 197450
  Optimal split:
    split_account: auqurbjawabo.users.kaiching
    gas(account < split_account): 2417713906472325694 (50.0% of shard)
    gas(account >= split_account): 2417709305739702866 (50.0% of shard)

Shard: s1.v1
  Gas usage: 435103165008707214 (1.9% of total)
  Number of accounts: 1
  No optimal split for this shard

Shard: s2.v1
  Gas usage: 6984397849429433430 (31.2% of total)
  Number of accounts: 118485
  Optimal split:
    split_account: kkute9tmsit8.users.kaiching
    gas(account < split_account): 3492199439799673004 (50.0% of shard)
    gas(account >= split_account): 3492198409629760426 (50.0% of shard)

Shard: s3.v1
  Gas usage: 10152345967256668340 (45.3% of total)
  Number of accounts: 81093
  Optimal split:
    split_account: zzzdclzbf6li.users.kaiching
    gas(account < split_account): 5076173908397008194 (50.0% of shard)
    gas(account >= split_account): 5076172058859660146 (50.0% of shard)

10 biggest accounts by gas usage:
#1: token.sweat
    Used gas: 2299587291568145490 (10.3% of total)
#2: earn.kaiching
    Used gas: 2130683133139892087 (9.5% of total)
#3: wallet.kaiching
    Used gas: 612465086458210767 (2.7% of total)
#4: sweat_welcome.near
    Used gas: 470514156151885936 (2.1% of total)
#5: v2.ref-finance.near
    Used gas: 244591219782634243 (1.1% of total)
#6: asset-manager.orderly-network.near
    Used gas: 226419584256414264 (1.0% of total)
#7: aurora
    Used gas: 217558664457482245 (1.0% of total)
#8: app.nearcrowd.near
    Used gas: 207878414604109841 (0.9% of total)
#9: spin.sweat
    Used gas: 174138779272472700 (0.8% of total)
#10: jars.sweat
    Used gas: 128436775874128340 (0.6% of total)
2023-11-23T16:46:26.809965Z  INFO db: Closed a RocksDB instance. num_instances=0

real    7m25.741s
user    5m22.621s
sys     1m4.785s

Copy link

codecov bot commented Nov 23, 2023

Codecov Report

Attention: 348 lines in your changes are missing coverage. Please review.

Comparison is base (2b24bc8) 71.87% compared to head (95e43b9) 71.77%.
Report is 29 commits behind head on master.

Files Patch % Lines
tools/database/src/analyse_gas_usage.rs 40.28% 250 Missing and 2 partials ⚠️
tools/database/src/block_iterators/height_range.rs 0.00% 45 Missing ⚠️
tools/database/src/block_iterators/mod.rs 0.00% 29 Missing ⚠️
tools/database/src/block_iterators/last_blocks.rs 0.00% 21 Missing ⚠️
tools/database/src/commands.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10240      +/-   ##
==========================================
- Coverage   71.87%   71.77%   -0.10%     
==========================================
  Files         707      711       +4     
  Lines      141791   142942    +1151     
  Branches   141791   142942    +1151     
==========================================
+ Hits       101907   102596     +689     
- Misses      35171    35613     +442     
- Partials     4713     4733      +20     
Flag Coverage Δ
backward-compatibility 0.08% <0.00%> (-0.01%) ⬇️
db-migration 0.08% <0.00%> (-0.01%) ⬇️
genesis-check 1.26% <0.00%> (+0.02%) ⬆️
integration-tests 36.11% <0.00%> (-0.06%) ⬇️
linux 71.64% <32.81%> (-0.10%) ⬇️
linux-nightly 71.36% <32.81%> (-0.29%) ⬇️
macos 55.03% <32.81%> (-0.72%) ⬇️
pytests 1.48% <0.00%> (+0.02%) ⬆️
sanity-checks 1.28% <0.00%> (+0.01%) ⬆️
unittests 68.13% <32.81%> (-0.09%) ⬇️
upgradability 0.13% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jancionear
Copy link
Contributor Author

Attention: 315 lines in your changes are missing coverage. Please review.

It'd be nice to test this tool, but for that I would have to mock ChainStore and EpochManager with some example blockchain data. I will look around to see if there is an easy way to do it.

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

There's a lot of unwraps in the code. Some people dislike that, but in my opinion unwrap() is a good choice for fatal errors in user facing applications. An unwrap() prints the whole error and the exact line where the error has occured, which is great for finding out what went wrong. With Result I often ended up with messages like: Failed: Error: NotFound, which isn't very useful.

That's fine for tooling in a subcommand. In the usual neard flow we avoid unwraps however and propagate errors up. Personally I also like to log the error at the place where it happens to make it clear what's the place where it occured.

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

The tool isn't very fast - analysing a whole epoch (43200 blocks) takes about 8 minutes. This is most likely caused by the fact that it reads one piece of information from the database at a time. If needed it could probably be sped up by running multiple get_gas_usage_in_block in parallel.

That's fine, we can run it offline.

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

example output

Can you display gas in some nicer format e.g. tera gas?

Funny how all the split accounts are belong to kaiching.

Shard: s3.v1
split_account: zzzdclzbf6li.users.kaiching

That's a bit surprising, it's really close to the end of the range of accounts - tripple 'z' really gets there. It would be good to do a follow up PR to check how well will the current candidate split the shard. Let me review this one first though.

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

Codecov Report

Don't worry about testing it, we don't usually write tests for the tools unless we intend them to be used in production. This is just a one off analysis. We can add tests if and when we upgrade it to production code.

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

10 biggest accounts by gas usage:

Can you make that state per shard? 1% global usage may be way more significant when narrowed down to a shard. Perhaps just print the top 5 per shard.

Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks, looks good, but can you double check the results please? The 'zzz...' split account in shard 3 is sus.

type BigGas = u128;

#[derive(Parser)]
pub(crate) struct AnalyseGasUsageCommand {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mini nit: I think there is a neat way to make the command an enum or maybe just have an enum field inside of it. It's fine as is but if you have some spare time you can check it out and see if it would make sense here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked around and I couldn't find a clean way to do this using enums.
There is an open issue in clap for creating mutually exclusive sets of arguments, but it isn't ready yet: clap-rs/clap#2621

It'd be possible to make two subcommands: last-blocks and block-range, but that doesn't feel right to me. I want a command analyse-gas along with arguments that specify which blocks it should analyse.

I don't see a better way to do this, I think we can leave it as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found an example in our codebase and it is indeed using a subcommand. If you're curious you can have a look at the StateRootSelector. I don't remember how does it end up working in the cli. Not a big deal, feel free to ignore.

Comment on lines 46 to 48
let mut near_config =
nearcore::config::load_config(home, near_chain_configs::GenesisValidationMode::Full)
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would import those, not reason to suffer the long names

tools/database/src/block_iterators/mod.rs Show resolved Hide resolved
tools/database/src/analyse_gas_usage.rs Show resolved Hide resolved
tools/database/src/analyse_gas_usage.rs Outdated Show resolved Hide resolved
// The outcome of each transaction and receipt executed in this chunk is saved in the database as an ExecutionOutcome.
// Go through all ExecutionOutcomes from this chunk and record the gas usage.
let outcome_ids: Vec<CryptoHash> =
chain_store.get_outcomes_by_block_hash_and_shard_id(block.hash(), shard_id).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems alright but I'll double check that as I'm also not sure

tools/database/src/analyse_gas_usage.rs Show resolved Hide resolved
tools/database/src/analyse_gas_usage.rs Outdated Show resolved Hide resolved
tools/database/src/analyse_gas_usage.rs Outdated Show resolved Hide resolved
tools/database/src/block_iterators/height_range.rs Outdated Show resolved Hide resolved
@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

@Ekleog & @nagisa how would you feel about excluding the tools directory from code coverage or setting the target to 0 (for this directory only)? It's looks awfully red for no good reason here ;)

@nagisa
Copy link
Collaborator

nagisa commented Nov 24, 2023

Setting the target to 0% seems reasonable enough to me, but I'm not sure I have codecov permissions to make any changes here. Can you please file an issue for this?

@wacban
Copy link
Contributor

wacban commented Nov 24, 2023

Setting the target to 0% seems reasonable enough to me, but I'm not sure I have codecov permissions to make any changes here. Can you please file an issue for this?

#10245

@jancionear
Copy link
Contributor Author

but can you double check the results please? The 'zzz...' split account in shard 3 is sus.

Nice catch! Yeah I screwed something up. I added code that manually counts gas on both sides of the split, and it doesn't look right at all:

Shard: s3.v1
  Gas usage: 79848.02 TGas (46.4% of total)
  Number of accounts: 180
  Optimal split:
    split_account: zomland.near
    gas(account < split_account): 39940.65 TGas (50.0% of shard)
    gas(account >= split_account): 39907.38 TGas (50.0% of shard)
    gas_left2: 39859.69 TGas
    gas_right2: 80.96 TGas
    accounts(account < split_account): 179
    accounts(account >= split_account): 1

Debugging....

GasUsageInShard::merge() merges results from two instances of GasUsageInShard into one.
The merge is performed by adding information about each account from the other shard.
Then used_gas_total is updated as well. However this is the wrong thing to do.
Calling `add_used_gas` already increases `used_gas_total`, so doing it again
doubles the total gas, which is incorrect.

Fix the bug by removing the line that doubles the gas.
The field GasUsageInShard::used_gas_total keeps the total amount of
gas used in a shard. The problem is that keeping it in sync with the
rest of the struct requires special care and is prone to bugs.
A mistake in this logic has already caused one bug in calculating
shard splits. It's safer to get rid of the field and just calculate
the total amount of used gas when needed. It doesn't take that long
to calculate this value from scratch.
The previous implementation of calculate_split() worked okayish,
but it had some faults.

The first fault was that it split the shard into two halves where the left
one consistend of accounts smaller than the boundary account. This doesn't match
the logic that NEAR uses for boundary accounts. In NEAR the left half includes
the boundary account, i.e it's <= boundary_acc, not < boundary_acc.
The new function uses the same division as NEAR: (left <= split_account), (right > split_account).

The second fault was that the old function looked for a split where gas_left >= gas_right.
It isn't easy to prove that this is the optimal one - why is the left half supposed to be bigger?
Let's implement it in a way that is easier to understand. The new calculate_split()
looks for a split that minimizes the difference between the two halves. This makes sense,
we want the halves to be as similar as possible.

The commit also adds unit tests to make sure that calculate_split() works correctly.
@jancionear
Copy link
Contributor Author

jancionear commented Nov 24, 2023

Debugging....

The bug was in GasUsageInShard::merge(). By mistake I increased the used_gas_total twice - once when adding accounts from the other instance, and then once again at the end. Because of this the total amount of gas was 2x bigger than it should be, and the logic of shard splitting went crazy.

Fixed in 2919011. Eventually I decided to remove used_gas_total altogether. Its presence makes it easier to create bugs, so let's just get rid of it. We can just calculate this information from scratch when needed, it doesn't take that long.

@jancionear
Copy link
Contributor Author

jancionear commented Nov 24, 2023

v2:

  • Fixed the issues mentioned in the review, except for two:
    • One modification to use the entry() api didn't look good to me, I decided not to change it
    • The command arguments are not an enum, I will read up about it I couldn't find a way to specify them as an enum
  • Fixed the bug that caused calculate_split to go crazy.
  • Added more information to the analysis report.
  • Reimplemented calculate_split() - it now looks for a split that has the smallest difference between the two halves, and the left half is now inclusive (see commit message for details: f01c80e)
  • Added unit tests for calculate_split()

I updated the example output to show how the changes look like: #10240 (comment)

@wacban
Copy link
Contributor

wacban commented Nov 27, 2023

I won't be able to review today, sorry.
@shreyan-gupta Can you have a look in my place?

@wacban
Copy link
Contributor

wacban commented Nov 29, 2023

Reimplemented calculate_split() - it now looks for a split that has the smallest difference between the two halves, and the left half is now inclusive (see commit message for details: f01c80e)

&

Left (account <= split_account): 

I don't think this is how it works but I'll double check. Looking at the current shard layout here shard 1 should contains the single account "aurora" which means that the boundary is the first account of the shard, not the last. For a new boundary aka the split account we want accounts on the left < split account <= accounts on the right.

@wacban
Copy link
Contributor

wacban commented Nov 29, 2023

% of shard split half

Can you make that % of the original shard - without split? The goal is to split the shard as evenly as possible so we want to measure everything relative to the original shard.

@wacban
Copy link
Contributor

wacban commented Nov 29, 2023

example output to show how the changes look like

Thanks, it looks much better now!

Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and big thanks!
Just address the final comments before merging please.

@jancionear
Copy link
Contributor Author

For a new boundary aka the split account we want accounts on the left < split account <= accounts on the right.

Damn, you're right.
I looked at account_id_to_shard_id and I mistakenly read boundary_account.cmp(account_id) == Greater as account_id > boundary_account, but that was wrong, it means account_id < boundary_account:

ShardLayout::V1(ShardLayoutV1 { boundary_accounts, .. }) => {
            // Note: As we scale up the number of shards we can consider
            // changing this method to do a binary search rather than linear
            // scan. For the time being, with only 4 shards, this is perfectly fine.
            let mut shard_id: ShardId = 0;
            for boundary_account in boundary_accounts {
                if boundary_account.cmp(account_id) == Greater {
                    break;
                }
                shard_id += 1;
            }
            shard_id
        }

It would be much more readable if it was written using the standard < operator :c

let mut shard_id: ShardId = 0;
for boundary_account in boundary_accounts {
    if account_id < boundary_account {
        break;
    }
    shard_id += 1;
}

Thanks! I will fix it.

When splitting a shard into two halves the shard is divided
into two on the split_account (boundary_account).
The split should look like this:
left < boundary_account
right >= boundary_account

Previously I misunderstood the semantics and implemented the split
as if it was left <= boundary and right > boundary. Fix it.
Previously gas usage of some accounts was displayed as percentage
of shard split half, but this was hard to reason about.
Let's change it to a percentage of total gas usage on a shard.
@jancionear
Copy link
Contributor Author

v3:

  • Split the shards in a way where the left half is less than split account instead of less or equal.
  • Display account gas usage as percent of shard, not shard split
  • Show how the gas is distributed in a split (left, split_account, right)

Updated the example output to show how the latest version looks like.

@jancionear
Copy link
Contributor Author

jancionear commented Nov 29, 2023

v4:

  • fix the ranges used for left_accounts and right_accounts
  • Rename split_account to boundary_account. boundary is the term used in all of NEAR code, so it's better than split.

jancionear added a commit to jancionear/nearcore that referenced this pull request Nov 29, 2023
The following code is currently used for comparing `boundary_account` and `account_id`:
```rust
if boundary_account.cmp(account_id) == Greater {
    ...
}
```

IMO it's a bit confusing - it isn't immediately obvious whether this means `boundary_account` > `account_id`
or `account_id` > `boundary_account`. I misread this line and because of that I made a mistake in near#10240.

Let's change it to something that is easier to read:
```rust
if account_id < boundary_account {
    ...
}
```
jancionear added a commit to jancionear/nearcore that referenced this pull request Nov 29, 2023
The following code is currently used for comparing `boundary_account` and `account_id`:
```rust
if boundary_account.cmp(account_id) == Greater {
    ...
}
```

IMO it's a bit confusing - it isn't immediately obvious whether this means `boundary_account` > `account_id`
or `account_id` > `boundary_account`. I misread this line and because of that I made a mistake in near#10240.

Let's change it to something that is easier to read:
```rust
if account_id < boundary_account {
    ...
}
```
I guess VSCode froze and didn't run format on save :C
github-merge-queue bot pushed a commit that referenced this pull request Nov 30, 2023
…#10268)

The following code is currently used for comparing `boundary_account`
and `account_id`:
```rust
if boundary_account.cmp(account_id) == Greater {
    ...
}
```

IMO it's a bit confusing - it isn't immediately obvious whether this
means `boundary_account` > `account_id` or `account_id` >
`boundary_account`. I misread this line and because of that I made a
mistake in #10240.

Let's change it to something that is easier to read:
```rust
if account_id < boundary_account {
    ...
}
```
Copy link
Contributor

@wacban wacban left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! The numbers are really interesting!

@jancionear jancionear added this pull request to the merge queue Nov 30, 2023
Merged via the queue into near:master with commit 935cc98 Nov 30, 2023
17 of 19 checks passed
@jancionear jancionear deleted the gas-stats-v3 branch November 30, 2023 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants