
3651 prefetch entries in soroban ops #4178

Open
ThomasBrady wants to merge 1 commit into master from 3651-prefetch-entries-in-soroban-ops

Conversation

@ThomasBrady (Contributor) commented Jan 31, 2024

Description

Adds a LedgerKeyMeter to track each transaction's read-byte allowance when prefetching entries for Soroban ops.

Resolves #3651

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

src/bucket/Bucket.cpp (outdated, resolved)
@SirTyson (Contributor) left a comment:

Overall looks like a good start! Most of my comments are pretty minor nitpicky things, although there are a couple of more significant issues that will require a little redesign. I'm not quite sure why the test is failing for range-based indexes at the moment, but I did flag a few issues that may be contributing factors.

Additionally, there are one or two DoS concerns I have, especially regarding individual indexes. I think you should focus on correctness for now though, and we can chat about that later offline (I'd rather not explain attack vectors in depth publicly...)

@@ -212,6 +212,10 @@ void
ExtendFootprintTTLOpFrame::insertLedgerKeysToPrefetch(
UnorderedSet<LedgerKey>& keys) const
{
for (auto const& lk : mParentTx.sorobanResources().footprint.readOnly)
{
keys.emplace(lk);
Contributor:
Need to prefetch TTL keys as well.

auto const& footprint = mParentTx.sorobanResources().footprint;
for (auto const& lk : footprint.readOnly)
{
keys.emplace(lk);
Contributor:
Need to prefetch TTL keys.
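
A minimal sketch of what that could look like, assuming a getTTLKey helper that maps a Soroban key to the key of its TTL entry (an assumed name, in the spirit of src/ledger/LedgerTypeUtils.h):

    auto const& footprint = mParentTx.sorobanResources().footprint;
    for (auto const& lk : footprint.readOnly)
    {
        keys.emplace(lk);
        // Every Soroban entry has an associated TTL entry; prefetch its
        // key alongside the entry's own key.
        keys.emplace(getTTLKey(lk));
    }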

Bucket::loadKeys(std::set<LedgerKey, LedgerEntryIdCmp>& keys,
std::vector<LedgerEntry>& result)
{
std::unordered_map<LedgerKey, std::unordered_set<Hash>> lkToTx;
Contributor:
We should almost never use std::unordered_map and std::unordered_set. We want to use our own Hasher, where we can deterministically set the hashing seed. This helps us get determinism across unit test runs. Use UnorderedMap and UnorderedSet instead.
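
In other words (a sketch, using the seeded-hasher aliases from util/UnorderedMap.h and util/UnorderedSet.h):

    #include "util/UnorderedMap.h"
    #include "util/UnorderedSet.h"

    UnorderedMap<LedgerKey, UnorderedSet<Hash>> lkToTx;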

@@ -795,6 +800,11 @@ class LedgerTxn : public AbstractLedgerTxn

double getPrefetchHitRate() const override;
uint32_t prefetch(UnorderedSet<LedgerKey> const& keys) override;

uint32_t
prefetch(UnorderedSet<LedgerKey> const& keys,
Contributor:
I think we should rename this instead of overloading, something like prefetchWithLimits.

}
else
{
tx->getResources(/* useByteLimitInClassic */ false);
Contributor:
Why is getResources called here?

return true;
}
// Ensure at least one transaction has read quota for this key.
for (auto txn : lkToTx[key])
Contributor:
auto const&

}
return false;
};
auto updateReadQuotaForLK = [&](auto key, auto entry) {
Contributor:
Params should both be auto const&

}
// Update read quota for all txns waiting on this key.
auto size = xdr::xdr_size(entry);
for (auto txn : lkToTx[key])
Contributor:
auto const&

@@ -207,6 +264,9 @@ Bucket::loadKeys(std::set<LedgerKey, LedgerEntryIdCmp>& keys,
{
if (entryOp->type() != DEADENTRY)
{
updateReadQuotaForLK(*currKeyIt, entryOp.value());
Contributor:
I think we might also have to account for read quota if a DEADENTRY is returned. I'd have to double check the fee model, but I think we charge for the size of the key on reads that return null. Again, I'm not sure though and will need to double check.

Contributor (author):
Where can I find the fee model?

Contributor:
> but I think we charge for the size of the key on reads that return null

We don't; we only charge the read bytes for live entries read. Dead entries/non-existent entry lookups are free.

src/bucket/Bucket.cpp (outdated, resolved)
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch 5 times, most recently from d9f765d to 883987f on March 6, 2024
@SirTyson (Contributor) left a comment:

Moving in the right direction! I think structurally this is looking good and your approach is correct, but there are still some significant bugs and gaps in testing that need to be addressed, as well as DoS exploits. I'm also not quite sure what the test cases are doing, but I'm looking forward to chatting IRL so you can walk me through them and improve my understanding!


auto currKeyIt = keys.begin();
auto prevKeyIt = currKeyIt;
auto maxReadQuota = maxReadQuotaForKey(*currKeyIt);
Contributor:
Null dereference: keys may be empty, such that currKeyIt == keys.end() here.

auto maxReadQuotaForKey = [&](auto const& key) {
uint32_t maxReadQuota = 0;
if (lkToTx.empty() || txReadBytes.empty() ||
/* classic */ lkToTx.count(key) == 0)
Contributor:
Instead of using count here, if you do something like auto iter = lkToTx.find(key); iter != lkToTx.end(), you can reuse the iter in your for loop and only do one map lookup instead of two. You can actually simplify this if block to just the iter != lkToTx.end() check; the empty checks are redundant.
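
Roughly (a sketch):

    auto iter = lkToTx.find(key);
    if (iter != lkToTx.end())
    {
        // Reuse the iterator: one lookup instead of count() plus operator[].
        for (auto const& txn : iter->second)
        {
            // ... accumulate the max read quota across iter->second
        }
    }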

auto const& index = getIndex();
auto indexIter = index.begin();
while (currKeyIt != keys.end() && indexIter != index.end())
{
if (prevKeyIt != currKeyIt)
Contributor:
I see what you're doing here with the optimization, but I think it would be safer to just call auto maxReadQuota = maxReadQuotaForKey(*currKeyIt); at the start of every loop iteration instead of keeping track of the previous iterator. This avoids issues like the null dereference and is simpler. Worst case we do an extra lookup in a relatively small, in-memory hash map, so I'd take the minor perf hit for the simplicity here.

if (maxReadQuota == 0)
{
notLoaded.insert(*currKeyIt);
// TODO (Consider adding a metric tracking failed loads due to
Contributor:
Good thought! But I don't think this metric is necessary here, as it's already covered by the soroban apply time metrics. Prefetch may "fail" in so far as some keys are not loaded, but we never fail a TX at prefetch time, so we always make it to the apply time metrics anyway.

std::vector<LedgerEntry>& result)
Bucket::loadKeysWithLimits(std::set<LedgerKey, LedgerEntryIdCmp>& keys,
std::vector<LedgerEntry>& result,
UnorderedMap<LedgerKey, UnorderedSet<Hash>>& lkToTx,
Contributor:
I think it would be useful to rename lkToTx to something like meteredLedgerKeyToTx. This indicates that this field only contains metered keys and that classic keys should not be part of this set. With this naming, I'd initially expect all keys to be in this set.

{
auto e =
LedgerTestUtils::generateValidLedgerEntryWithTypes({CONTRACT_DATA});
SCVal cd_value(SCV_BYTES);
Contributor:
nit: lowerCamelCase

SCVal cd_value(SCV_BYTES);
xdr::opaque_vec<xdr::XDR_MAX_LEN> data_bytes;
size_t data_bytes_size = 0;
do
Contributor:
This is unnecessary; just call cd_value.bytes().resize(size).
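
i.e. (a sketch reusing the names from the quoted test code):

    SCVal cd_value(SCV_BYTES);
    cd_value.bytes().resize(data_bytes_size); // zero-fills to the target size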

void
testLoadContractKeySpanningPages()
{
size_t page_size = 16384;
Contributor:
Instead of using a literal, you should use cfg.EXPERIMENTAL_BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT
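
e.g. (a sketch, assuming the index page size is two raised to the configured exponent):

    size_t pageSize =
        1UL << cfg.EXPERIMENTAL_BUCKETLIST_DB_INDEX_PAGE_SIZE_EXPONENT;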

testLoadContractKeySpanningPages()
{
size_t page_size = 16384;
for (auto const& bucketHash : getBM().getBucketListReferencedBuckets())
Contributor:
I'm not sure what this for loop is doing; all Buckets either have a page size determined by the cfg or a page size of 0 if they are individually indexed. You can remove it.

}

void
testLoadContractKeysWithInsufficientReadBytes()
Contributor:
This test is pretty hard to follow; I think it would be easier if we chatted and you walked me through it.

@@ -1622,6 +1622,19 @@ TransactionFrame::insertKeysForTxApply(UnorderedSet<LedgerKey>& keys) const
}
}

void
TransactionFrame::insertKeysAndLimitsForTxApply(
Contributor:
What does this function have to do with the limits (as the name suggests)?

{
return std::numeric_limits<std::uint32_t>::max();
}
for (auto const& txn : lkToTx[key])
Contributor:
nit: Using .find would be more optimal here (no need to call count above); also, there's no need to check if lkToTx is empty.

return;
}
// Update read quota for all txns waiting on this key.
auto size = xdr::xdr_size(entry) + kBucketEntrySizeEncodingBytes;
Contributor:
It's not correct to add anything to the entry size as the read limit only accounts for the entry sizes.

if (!hasTxnWithReadQuota(*currKeyIt))
{
// If no txn has read quota for this key, skip it.
currKeyIt = keys.erase(currKeyIt);
Contributor:
Can we add a test for this? I don't think we need to involve classic, it should be sufficient to access the source account of some tx from the footprint, right?

std::vector<LedgerEntry>& result)
Bucket::loadKeysWithLimits(std::set<LedgerKey, LedgerEntryIdCmp>& keys,
std::vector<LedgerEntry>& result,
UnorderedMap<LedgerKey, UnorderedSet<Hash>>& lkToTx,
Contributor:
Can this be const?

insertLedgerKeysWithMappingsToPrefetch(keys, lkToTx);
}
void
InvokeHostFunctionOpFrame::insertLedgerKeysWithMappingsToPrefetch(
Contributor:
This logic should be common for all Soroban ops. Moreover, it really belongs to the transaction layer, as resources are defined at transaction level (not operation level).

{
return;
}
auto size = xdr::xdr_size(key);
Contributor:
That's not correct; only entry sizes are accounted for.

return;
}
// Update read quota for all txns waiting on this key.
auto size = xdr::xdr_size(entry) + kBucketEntrySizeEncodingBytes;
Contributor:
TTL entries are not counted towards the transaction read byte limit (we generally don't want users to care about them). So we should not count them towards the limits here as well.
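
i.e. an early return along these lines before any quota is consumed (a sketch):

    if (key.type() == TTL)
    {
        // TTL entries are not charged against the tx read-byte limit.
        return;
    }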

}
else
{
txReadBytes[tx] -= size;
Contributor:
We should skip TTL entry size here (see the similar comment in Bucket.cpp).

}

uint32_t
LedgerTxnRoot::Impl::prefetchWithLimits(
Contributor:
I don't fully understand the prefetch flows; is there any way to avoid logic duplication between this and Bucket.cpp?

@anupsdf (Contributor) commented Mar 21, 2024

While discussing pre-fetching work with Marta and Nico, a suggestion came up to have separation of phases (classic phase and Soroban phase) when pre-fetching entries. This will help us have better metrics on IO and also help in future parallelization work.
(Already discussed with Tom)

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from efaffc9 to 4e98d50 on March 27, 2024
@SirTyson (Contributor) left a comment:
Looking good overall! The structure is a lot simpler and easier to follow; nice job! I think there are a few areas where you can further simplify the metering interface and hide some of the internal details a bit more.

Staging-wise, I think we can punt on the partial deserialization in this PR and follow up on it later. I think we're pretty close! No significant issues as far as I can tell. The biggest thing missing is an end-to-end test at the LedgerTxn level, but after that I think we'll be good to go!

@@ -167,6 +168,69 @@ LedgerEntryPtr::isDeleted() const
{
return mState == EntryPtrState::DELETED;
}
LedgerKeyMeter::LedgerKeyMeter() : meteredLedgerKeyToTx{}, txReadBytes{}
Contributor:
Nit: No need for a constructor; just default-construct these in the header file.

#include "overlay/StellarXDR.h"
#include "util/UnorderedMap.h"
Contributor:
Unused includes:

#include "util/UnorderedMap.h"
#include "util/UnorderedSet.h"

@@ -6,7 +6,10 @@

#include "bucket/FutureBucket.h"
#include "bucket/LedgerCmp.h"
#include "ledger/LedgerTxn.h"
Contributor:
You don't need to include this here, just forward declare LedgerKeyMeter.
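
A forward declaration here would look like (sketch):

    namespace stellar
    {
    class LedgerKeyMeter;
    }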

@@ -5,6 +5,7 @@
// of this distribution or at http://www.apache.org/licenses/LICENSE-2.0

#include "bucket/Bucket.h"
#include "ledger/LedgerTxn.h"
Contributor:
We don't need to include this here, forward declare LedgerKeyMeter instead.

src/ledger/InMemoryLedgerTxnRoot.cpp (resolved)

for (size_t i = 0; i < entries.size(); ++i)
{
auto entry = entries[i];
Contributor:
auto const& entry = entries.at(i);

// The value returned by maxReadQuota should be equal to the
// entry's size.
REQUIRE(lkMeter.maxReadQuotaForKey(key) == entrySize);
// Adding another txn with less readQuota should not affect the
Contributor:
Additional edge case here: add a new TX with a greater read quota than tx 1.

auto tx_2 = i + entries.size();
lkMeter.addTxn(tx_2, 0, ks);
REQUIRE(lkMeter.maxReadQuotaForKey(key) == entrySize);
// Consume size(entry) of the read quota of each transaction which
Contributor:
Additional edge case: consume more than the read quota for a given key.

LedgerEntryKey(const_cast<const LedgerEntry&>(entries[0]))) ==
std::numeric_limits<uint32_t>::max());

for (size_t i = 0; i < entries.size(); ++i)
Contributor:
I don't think we need to call this in a loop; I think you can just run all these tests once, since the inner body is just dealing with a single TX and key at a time.

@@ -2804,6 +2807,68 @@ TEST_CASE("Erase performance benchmark", "[!hide][erasebench]")
#endif
}

TEST_CASE("LedgerKeyMeter tests")
Contributor:
These tests are off to a good start and are much easier to follow! I think we should add one more end-to-end test case in this file; you probably want to test the following sections inside of it:

  1. Prefetch a mix of Soroban and classic keys, where all keys have enough quota
  2. Prefetch a mix of keys, where some keys don't have quota
  3. Prefetch a mix of keys, where keys have quota but some individual TXs don't

Ideally, these tests would be as high level as possible.

@@ -1622,6 +1622,14 @@ TransactionFrame::insertKeysForTxApply(UnorderedSet<LedgerKey>& keys) const
}
op->insertLedgerKeysToPrefetch(keys);
}
if (isSoroban())
Contributor:
Soroban prefetching should only be supported with BucketListDB, so we need to change this to if (isSoroban() && isUsingBucketsDB()). We might need to throw in a few more additional BucketsDB checks (and a few asserts might not hurt either).

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from db5dc6f to 09a89d0 on April 9, 2024
@ThomasBrady marked this pull request as ready for review on April 9, 2024
@ThomasBrady changed the title from "WIP 3651 prefetch entries in soroban ops" to "3651 prefetch entries in soroban ops" on April 9, 2024
@SirTyson (Contributor) left a comment:
Looking very close! I think the implementation is mostly correct, just one bug and a few nitpicks. The tests also look good (I like the higher-level tests much better!), but I still think there are a few areas where we need some more coverage.

@@ -66,7 +66,8 @@ class SearchableBucketListSnapshot : public NonMovableOrCopyable
void loopAllBuckets(std::function<bool(BucketSnapshot const&)> f) const;

std::vector<LedgerEntry>
loadKeysInternal(std::set<LedgerKey, LedgerEntryIdCmp> const& inKeys);
loadKeysInternal(std::set<LedgerKey, LedgerEntryIdCmp> const& inKeys,
std::shared_ptr<LedgerKeyMeter> lkMeter = nullptr);
Contributor:
Nit: In general, I like to avoid default arguments; they can lead to footguns and misunderstandings down the line.

uint32_t prefetchClassic(UnorderedSet<LedgerKey> const& keys) override;
uint32_t
prefetchSoroban(UnorderedSet<LedgerKey> const& keys,
std::shared_ptr<LedgerKeyMeter> lkMeter = nullptr) override;
Contributor:
Why default to null? If anything I think prefetchSoroban should assert that lkMeter is not null. Is there ever a condition we would call prefetchSoroban and not want metering?
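
If metering is always expected here, a simple guard could make that explicit (a sketch, assuming the releaseAssert helper from util/GlobalChecks.h):

    releaseAssert(lkMeter); // Soroban prefetch should always be metered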

{
UnorderedSet<LedgerKey> txKeys;
tx->insertKeysForTxApply(txKeys, lkMeter);
lkMeter->addTxn(
Contributor:
Why not have tx call addTxn inside TransactionFrame::insertKeysForTxApply? I think this would clean up the interface a bit and you wouldn't need the extra copy UnorderedSet<LedgerKey> txKeys;, since TransactionFrame could populate directly into lkMeter and sorobanKeys at the same time.

Contributor (author):
That works. I was hesitant to change the interface for TransactionFrame, but that is much cleaner.

// key.
bool canLoad(LedgerKey const& key, size_t entrySizeBytes) const;
// Returns true if a key was not loaded due to insufficient read quota.
bool isNotLoaded(LedgerKey const& key) const;
Contributor:
Nit: maybe reword this function name slightly to make it clearer that the key could not be loaded due to insufficient resources, as isNotLoaded to me implies that the key may be loaded later, could not be loaded because it does not exist, etc. Maybe something like loadFailed.

// Returns the maximum read quota across all transactions with this key.
uint32_t maxReadQuotaForKey(LedgerKey const& key) const;

std::vector<uint32_t> txReadBytes;
Contributor:
Nit: We use mMemberName for classes and memberName for small structs. This is a bit of a grey area, but I think this class is complex enough to warrant mMemberName.

entrySet.emplace(e);
entryMap.emplace(LedgerEntryKey(e), e);
}
if (cfg.EXPERIMENTAL_BUCKETLIST_DB)
Contributor:
We can remove this if check since cfg.EXPERIMENTAL_BUCKETLIST_DB == true always for this test.

ltx.createWithoutLoading(e);
keysToPrefetch.emplace(LedgerEntryKey(e));
UnorderedSet<LedgerKey> ks{};
ks.emplace(LedgerEntryKey(e));
Contributor:
A comment explaining this offset would be very helpful.

LedgerKeySet expectedSuccessKeys2;
LedgerKeySet expectedFailedKeys2;

for (auto const& k : keysToPrefetch)
Contributor:
Why are we doing this again for lkMeter2? Can't we just use lkMeter?

Contributor (author):
I wanted to be sure there was no residual quota left in the meter. In the non-test use case we never re-use the meter; it is constructed for each prefetch separately, as the quotas should not persist across calls.

}
}
auto loadSize = root.prefetchSoroban(smallSet, lkMeter2);
for (auto const& k : expectedFailedKeys2)
Contributor:
This check is redundant with the if (expectedFailedKeys2.find(k) != expectedFailedKeys2.end()) check below.

}
}

TEST_CASE("LedgerKeyMeter tests")
Contributor:
If I understand correctly, it looks like "Soroban Prefetch" tests the following case:
All txs have 1 unique Soroban metered key (as in no TTL entry). Some txs have enough resources, some do not. Check that they are loaded or not loaded properly. No TXs have overlapping keys.

I see you have some additional edge cases covered in the LedgerKeyMeter tests, which is good, but I'd like to see these edge cases tested more end-to-end at the ledgertxn level. I think adding the following cases to "Soroban Prefetch" would be helpful (even if some of these are redundant with "LedgerKeyMeter tests"):

  1. Mix of metered Soroban and classic keys
  2. Multiple TXs referencing the same key, some of which have enough resources and some that do not. I think it would be interesting if there was a TX that read 3 keys and only ran out of budget on the last one. Then maybe another TX that only reads the failed key from the previous TX but has enough budget. Just interesting permutations of overlapping TXs that have a few keys instead of just one.
  3. Live cached entries are metered
  4. Dead cached entries (i.e. null entries) are not metered (this is currently not working)
  5. TTL keys are not metered

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from 192b102 to 32a2b61 on April 11, 2024
@SirTyson (Contributor) left a comment:
Looks great! A few minor nitpicks and one implementation bug. I think everything looks like it's working! The LkMeter interface in particular has improved a lot and is easy to follow; thanks for all the iterations. I'd like to see a little refactoring and cleanup of the test cases so they're a little easier to follow and have less redundant code, but I think we should be able to merge after that :)


auto entries =
LedgerTestUtils::generateValidUniqueLedgerEntriesWithExclusions(
{TTL}, 11);
Contributor:
Exclude CONFIG_SETTING here too.

for (auto const& tx : txs)
{
tx->insertKeysForTxApply(keys);
if (tx->isSoroban() && mApp.getConfig().isUsingBucketListDB())
Contributor:
This check isn't quite right; it should be

if (tx->isSoroban()) {
    if (isUsingBucketListDB()) {
    ...

Currently, if we have a Soroban TX but the node isn't running BucketListDB, we will still insert any classic keys from the footprint into classicKeys, and these reads will be unmetered. We should not call insertKeysForTxApply on Soroban TXs when BucketListDB is disabled.

class LedgerKeyMeter
{
public:
LedgerKeyMeter() : mTxReadBytes{}, mLedgerKeyToTxs{}, mNotLoadedKeys{} {};
Contributor:
Nit: This constructor can be removed; just default-initialize these values in the member declaration.

{
public:
LedgerKeyMeter() : mTxReadBytes{}, mLedgerKeyToTxs{}, mNotLoadedKeys{} {};
LedgerKeyMeter(LedgerKeyMeter const& other) = delete;
Contributor:
What are you trying to accomplish with this deletion? If you want to remove copying and/or moving, you should use util/NonCopyable.h.

Contributor (author):
I don't want this class to be accidentally copied. I can't see a case where that would be desired (but several cases where we might accidentally copy it and then update the quotas across several meters).

Contributor (author):
I've made this class inherit from NonMovableOrCopyable, should I also add enable_shared_from_this<LedgerKeyMeter> ?

Contributor:
I don't think we need to create a shared_ptr from within any of the LedgerKeyMeter functions, so enable_shared_from_this<LedgerKeyMeter> shouldn't be necessary.
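
Putting the two suggestions together, the class skeleton might look like this (a sketch; member types are taken from elsewhere in this review, and the type of mNotLoadedKeys is an assumption):

    class LedgerKeyMeter : public NonMovableOrCopyable
    {
      public:
        // No user-declared constructor needed: the default member
        // initializers below cover it.

      private:
        std::vector<uint32_t> mTxReadBytes{};
        UnorderedMap<LedgerKey, std::vector<size_t>> mLedgerKeyToTxs{};
        UnorderedSet<LedgerKey> mNotLoadedKeys{};
    };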

// Stores the read quota for each transaction.
std::vector<uint32_t> mTxReadBytes;
// Stores the keys that each transaction will read. i.e.
// mLedgerKeyToTxs[key] is the set of indeces of mTxReadBytes corresponding
Contributor:
Typo: indices

mTxReadBytes.push_back(resources.readBytes);
auto txId = mTxReadBytes.size() - 1;
auto addKeyToTxnMap = [&](auto const& key) {
if (mLedgerKeyToTxs.find(key) == mLedgerKeyToTxs.end())
Contributor:
Nit: You can avoid 2 lookups here if you do:

auto iter = mLedgerKeyToTxs.find(key);
if (iter == mLedgerKeyToTxs.end()) { /* emplace a new entry, grab its iterator */ }

iter->second.insert(txId);

Alternatively, because we explicitly want to create a new entry in mLedgerKeyToTxs, this is the one scenario where the [] operator may be appropriate. Still, [] is footgunny, so I don't mind the iterator approach here either.

Contributor:
Needs to be reverted.

if (lkMeter)
{
auto const& resources = sorobanResources();
keys.insert(resources.footprint.readOnly.begin(),
Contributor:
Each Soroban type key in the footprint also has a TTL key implicitly included in the footprint that we also need to prefetch. For each key in the footprint, if it's a Soroban key, we need to add its TTL key to the prefetch set.

@@ -2803,6 +2818,455 @@ TEST_CASE("Erase performance benchmark", "[!hide][erasebench]")
}
#endif
}
TEST_CASE("LedgerTxnRoot prefetch soroban entries edge cases", "[ledgertxn]")
Contributor:
Happy to see some great test coverage here! I think you have good cases covered, but it's a little hard to follow (and will be hard to debug if something breaks) given how every test case is condensed and tested in a single run. I think if you break this into pieces where each interesting edge case is its own SECTION, it will be easier to follow, easier to set up, and we'll know what we need to debug if something fails. Right now if we break something, it will be difficult to determine what edge case we're hitting. I think if we do it like this, we can also condense "LedgerTxnRoot prefetch soroban entries edge cases" with "LedgerTxnRoot prefetch soroban entries" to prevent copy-pasting lambdas and other setup code.

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from 0355da5 to 970a7b3 on April 17, 2024
@SirTyson (Contributor) left a comment:
Pretty close! I think we need to do another pass on one test case. I think we can simplify it more, and we need to tighten some of the checks. For example, we don't actually check that keys that have insufficient quota are not loaded. I think if we simplify this test a bit, we'll be able to tighten some of the checks and make this possible.

}

void
LedgerKeyMeter::addTxn(const stellar::SorobanResources& resources)
Contributor:
Nit: SorobanResources const&

{
public:
// Adds a transaction with a read quota and the keys it will read.
void addTxn(stellar::SorobanResources const& resources);
Contributor:
Nit: SorobanResources const&

Contributor:
Curr XDR changes also need to be reverted.

resources.footprint.readWrite.end());

// TTL keys must be prefetched for soroban entries.
for (auto const& k : resources.footprint.readOnly)
Contributor:
Also need to insert TTL keys for readWrite footprint entries.
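
A sketch of that addition (isSorobanEntry and getTTLKey are assumed helper names, in the spirit of src/ledger/LedgerTypeUtils.h):

    for (auto const& k : resources.footprint.readWrite)
    {
        // Classic keys have no TTL; only Soroban entries carry one.
        if (isSorobanEntry(k))
        {
            keys.emplace(getTTLKey(k));
        }
    }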

{CONFIG_SETTING}, 1);
auto ttlEntry = LedgerTestUtils::generateValidLedgerEntryOfType(TTL);
entries.push_back(ttlEntry);
for (auto e : entries)
Contributor:
Should be auto &

ensureLedgerKeyMeterCanLoad(
lkMeterCold, entries, expectedSuccessKeys, expectedFailedKeys);
};
auto checkPrefetch = [&](std::vector<LedgerEntry>& allEntries) {
Contributor:
This function should take a set of expectedSuccess keys.

LedgerTxn ltx2(root);
auto numLoadedCold =
root.prefetchSoroban(keysToPrefetch, lkMeterCold);
REQUIRE(numLoadedCold >= expectedSuccessKeys.size());
Contributor:
Once we pass in an explicit expectedSuccess instead of generating one in addTxn, we can enforce this more tightly with REQUIRE(numLoadedCold == expectedSuccessKeys.size());

{
auto txle = ltx2.load(k);
REQUIRE(txle);
REQUIRE(entrySet.find(txle.current()) != entrySet.end());
Contributor:
This check is redundant; REQUIRE(txle); is sufficient.

REQUIRE(txle);
REQUIRE(entrySet.find(txle.current()) != entrySet.end());
}
// Ensure that the meter is exhausted after loading.
Contributor:
After the load loop, we need to check prefetch hit rate to make sure it's 100% to ensure that the keys we loaded in the loop were actually in the cache.
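
e.g. (a sketch):

    // All loads after a successful prefetch should hit the cache.
    REQUIRE(fabs(ltx2.getPrefetchHitRate() - 1.0f) < 0.0001f);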

// cache size > number of entries
runTest(1000);
}
SECTION("prefetch with small cache")
Contributor:
I don't think we actually need to check this here. It will simplify the test case, and we already have coverage in the classic prefetch tests. The soroban specific prefetch logic just uses the same cache eviction library as the classic prefetch.

auto tx2Entries =
std::vector<LedgerEntry>{entries.at(0), entries.at(1)};
addTxn(0, tx1Entries);
addTxn(-1, tx2Entries, deadKeys);
Contributor:
I don't think this is quite right. We want to make sure that deadKeys don't count towards quota, so this should just be a single call to addTxn(0, tx1Entries, deadKeys)

{
auto tx1Entries =
std::vector<LedgerEntry>{entries.at(0), entries.at(1)};
addTxn(-1, tx1Entries);
@SirTyson (Contributor) commented Apr 17, 2024:
For this test, the key we expect to succeed is whichever key is lexicographically smaller. Note that you can use < to compare LedgerKey types.

@SirTyson (Contributor) left a comment:
LGTM! A few small cleanups in the test then I think we're done :)


auto preLoadPrefetchHitRate = root.getPrefetchHitRate();
REQUIRE(preLoadPrefetchHitRate == 0);
auto checkPrefetch = [&](std::vector<LedgerEntry> const& allEntries,
Contributor:
Remove allEntries and expectedFailedKeys, as they are not used.

std::set<LedgerKey> expectedFailedKeys;
expectedSuccessKeys.emplace(LedgerEntryKey(TTLEntry));
// Whichever entry is loaded first will succeed. The other will fail.
if (classicEntry < contractDataEntry)
Contributor:
Technically classic entries are always strictly less than Soroban entry types, so you can just leave a comment that says the smallest entry will get loaded first successfully and point at the classicEntry instead of doing this if check.

if (offset + resources.readBytes < 0)
keysToPrefetch.emplace(k);
// Randomly add the key to either the read or write set.
if (stellar::rand_flip())
Contributor:
While TTL keys should be in keysToPrefetch, they should not be in the footprint.

std::vector<LedgerEntry> ledgerVect{entrySet.begin(), entrySet.end()};
std::vector<LedgerKey> deadKeyVect{deadKeySet.begin(),
deadKeySet.end()};
auto deadKeys =
Contributor:
Instead of a set of size 1, just make the dead key a single CONTRACT_DATA key. This will guarantee uniqueness against the other generated keys and will make our footprint valid.

// cache size > number of entries
runTest(1000);
auto tx1Entries =
std::vector<LedgerEntry>{classicEntry, contractDataEntry};
Contributor:
The TTL entry should be included in all tests (everywhere that contractDataEntry is present).

addTxn(-1, tx1Entries);
checkPrefetch(tx1Entries);
// Ensure that the correct entries were loaded.
for (auto const& k : expectedSuccessKeys)
Contributor:
Sorry I missed this last iteration, but we don't actually need this loop at all. The hotCache assert with 100% hit rate is sufficient to show that the first prefetch loaded all entries correctly.

Contributor (author):
If I don't call ltx2.load(k), then the hit rate will be zero, as nothing has been loaded. I guess I don't need the assertion though.

Contributor:
Ah, I thought the prefetch itself contributed to the hit rate, but I guess it's just loads after the prefetch, thanks!

@SirTyson (Contributor) left a comment:
LGTM! Just one final cleanup. The commit history has gotten a little long, so you should also squash and rebase into one or two commits (we don't want to keep all the review back and forth, but we still want to maintain different commits for discrete logical changes). We also need to investigate why the CI is currently failing.

REQUIRE(fabs(ltx2.getPrefetchHitRate() - 1.0f) < 0.0001f);
};

std::vector<LedgerEntry> allEntries;
Contributor:
allEntries is unused and can be removed (you can just use tx1Entries to populate expectedSuccessKeys).

src/bucket/readme.md (outdated, resolved)
{
// Consume key size bytes from the quotas of transactions
// containing the key (i.e. set their remaining quota to 0).
lkMeter->updateReadQuotasForKey(*currKeyIt, keySize);
Contributor:
Does this mean the key size is only taken into account if we can't load the entry?

Contributor (author):
This is an optimization. At this point we know that the size of the key exceeds the remaining quota for all transactions which contain it, so we do not read the entry. We consume keySize (i.e. set the remaining quota to zero) for all transactions containing this key, so that the next time canLoad is called with keys in the same transaction(s) it will return false. (Note it's possible some of those keys are included in other within-quota transactions, in which case canLoad would return true.)

Contributor:
Ah ok so you use the keySize in canLoad because the key is guaranteed to be smaller than the entry, and then you clear the quotas (you're using keySize because you know keySize is enough to clear every quota, but max int would work as well). Maybe clarify this in the comments?

Contributor (author):
I updated the comment to say:

                // If the transactions containing this key have a remaining
                // quota less than the size of the key, we cannot load the
                // entry, as xdr_size(key) <= xdr_size(entry). Here we consume
                // keySize bytes from the quotas of transactions containing the
                // key so that they will have zero remaining quota and
                // additional entries belonging to only those same transactions
                // will not be loaded even if they would fit in the remaining
                // quota before this update.                

Is that sufficient?

src/bucket/BucketSnapshot.cpp (outdated, resolved)
src/bucket/BucketSnapshot.cpp (resolved)
src/bucket/BucketSnapshot.cpp (outdated, resolved)
src/ledger/LedgerTxn.cpp (outdated, resolved)
src/ledger/LedgerTxn.cpp (outdated, resolved)
src/ledger/LedgerTxn.cpp (outdated, resolved)
@@ -667,7 +667,8 @@ LedgerTxnRoot::Impl::bulkLoadAccounts(UnorderedSet<LedgerKey> const& keys) const
{
BulkLoadAccountsOperation op(mApp.getDatabase(), keys);
return populateLoadedEntries(
Contributor:
I feel like this function would benefit from an overload or default parameter, so as to avoid the confusing nullptr being passed around.

Contributor:
I don't see any changes here

Contributor (author):
Sorry! Done now 😄

src/ledger/test/LedgerTxnTests.cpp (outdated, resolved)
src/ledger/test/LedgerTxnTests.cpp (outdated, resolved)
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from 581fc0e to b874972 on April 19, 2024
UnorderedSet<LedgerKey> keys;
UnorderedSet<LedgerKey> sorobanKeys;
auto lkMeter = make_unique<LedgerKeyMeter>();
UnorderedSet<LedgerKey> classicKeys;
Contributor:
Curious to see perf results on various catchup segments 👀

Contributor (author):
Here's a doc with some tracy screenshots comparing catchup over 1000 ledgers on master, on this branch, and on master prefetching soroban ops with no metering. https://docs.google.com/document/d/1sWkABl7yimTWx99YW5qUi-8PiINO6_VBPwbXs1w9B9Q/edit

@@ -2,6 +2,7 @@
// under the Apache License, Version 2.0. See the COPYING file at the root
// of this distribution or at http://www.apache.org/licenses/LICENSE-2.0

#include "bucket/BucketManager.h"
Contributor:
looks like CI is failing, any idea what's going on there?

Contributor:
Could you also squash commits please?

// mLedgerKeyToTxs[key] is a vector containing the indices of
// mTxReadBytes corresponding to the transactions which have
// the key in their footprint.
UnorderedMap<LedgerKey, std::vector<size_t>> mLedgerKeyToTxs{};
Contributor:
this is quite an error-prone way to store key <> tx mappings, as you have to update two data structures, and a tx is implicitly tied to the vector index (so if the index changes, it breaks the map). could we have a single map of keys mapped to {txHash, byte limit}?

Contributor (author):
With a single map, in order to update the byte limit after reading key A I would have to iterate over all other keys, and all transactions associated with each key and update the limits separately. That's less efficient and more error prone imo

Contributor (author):
Are you envisioning something like this: map<LedgerKey, vector<pair<Hash, size_t>>>?

Contributor:
I mean, if perf is an issue with a single data structure, you can always re-arrange it into two like tx hash -> byte limit and key -> vector<tx hash>. The gist of my comment was that relying on indexes to implicitly represent transactions is error-prone and quite confusing; there doesn't seem to be any harm in using explicit IDs for transactions, right?
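
i.e. something like (a sketch):

    // Explicit tx identity instead of implicit vector indices:
    UnorderedMap<Hash, uint32_t> mTxReadBytes;                  // tx hash -> remaining quota
    UnorderedMap<LedgerKey, std::vector<Hash>> mLedgerKeyToTxs; // key -> txs reading it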

Contributor (author):
I had originally used hash, but @SirTyson suggested using indices #4178 (comment)

Contributor (author):
This saves space, but more importantly is more straightforward, as there are some inconsistencies between using getFullHash and getContentsHash. This also lets you convert UnorderedMap<Hash, uint32_t> txReadBytes to std::vector<uint32_t>, where the index into the vector corresponds to the txID.

Contributor:
Hah :) I was thinking using hashes is a lot more straightforward, since you know exactly what you're mapping. Not sure saving space matters here; we're talking a few extra bytes, right? I mean, if y'all feel strongly, lmk; it just feels like there's more opportunity for bugs this way.

Contributor (author):
It's both a space saving and an (albeit small) optimization, as we have a constant-time index into the vector<size_t> txReadBytes instead of a worst-case linear lookup in the UnorderedMap<Hash, uint32_t>. I don't feel very strongly either way, but seeing as this is an internal data structure that is only modified indirectly by the public functions of LedgerKeyMeter, I don't think the current approach is that error-prone. But I'm happy to change it back to Hashes 🤷

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch 3 times, most recently from a2c1476 to 243f2f0 on April 23, 2024
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch 2 times, most recently from 3d9e2c1 to fc2433b on May 2, 2024
@dmkozh (Contributor) left a comment:
LGTM, but there is one minor comment that hasn't been addressed.

@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from fc2433b to ddfa45a on May 2, 2024
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch 5 times, most recently from 82eb651 to 0001da5 on May 20, 2024
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from 0001da5 to 6d66995 on June 3, 2024
@ThomasBrady force-pushed the 3651-prefetch-entries-in-soroban-ops branch from 6d66995 to 79907fc on June 4, 2024