Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nomination prototype #3856

Closed
wants to merge 54 commits into from
Closed

Nomination prototype #3856

wants to merge 54 commits into from

Conversation

JuI3s
Copy link
Contributor

@JuI3s JuI3s commented Jul 26, 2023

Draft working off Hidenori's branch on the nomination protocol prototype. Replace transaction sets with their hashes.

Description

Resolves #X

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@JuI3s JuI3s force-pushed the nomination-prototype branch 3 times, most recently from 7b73bb9 to ccf54de Compare August 2, 2023 04:07
@marta-lokhova marta-lokhova self-requested a review August 8, 2023 22:06
Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're off to a good start! I'm not super knowledgeable on specific SCP issues so this is more of a high-level overview for now, but I'll come back through and validate the actual logic a little more once the unit tests are passing. It might seem like a bunch of comments, but most of the stuff is more code style and git related issues which is always a bit tricky to get right the first couple times :).

lib/asio Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed in a few of your PR's that you've been adding changes to non-source files and changes to src files that shouldn't be included like white space changes. No big deal, I ran into this issue when submitting my first couple of PR's too because I used git add . to add my changes. I'm not sure exactly what git frontend you're using, but it looks like you might be picking up some unwanted changes if you're doing something similar. This change shouldn't update any of the submodule versions (which is what all the lib/* changes are) and it makes it a little difficult for reviews to help you debug because it leads to build errors on other people's machines and rebase conflicts with master. If you use VSCode, I'd recommend using the source control tab to view changes and add files to a commit individually after reviewing the diffs, or doing something similar with whatever git flavor you like.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easiest way to undo this (or at least what I do) is probably to run git reset --soft HEAD^, then unstage the lib changes. git reset --soft ~commit-hash~ resets the commit history, but leaves all the changes from the commit staged. HEAD^ is just shorthand for the last commit hash. I use this a lot when I need to fix a change I've already pushed when I don't want to send another commit. After you've reset the commit history, you can call git restore --staged file/path on individual files to remove them from the PR. Once you've removed the files you can redo the commit with git commit --amend --no-edit and push the edited commit with git push -f.

src/herder/HerderImpl.cpp Outdated Show resolved Hide resolved
src/herder/HerderImpl.cpp Outdated Show resolved Hide resolved
src/herder/PendingEnvelopes.cpp Outdated Show resolved Hide resolved
// Overwrite this with TX_SET_BACKOFF_DELAY_MS
// For the prototype, I made this a regular variable (not a const)
// but I'm not sure if it's a good idea.
static std::chrono::milliseconds MS_TO_WAIT_FOR_FETCH_REPLY;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For constants like this, we often have the value as a static constexpr, but have a get function. For testing purposes, we have a BUILD_TEST ifdef inside the get function that can return a variable value that you can set for testing. This is helpful because you can change the value in test environments but can ensure it's a proper static constexpr in prod. See SorobanNetworkConfig::getBucketListSizeSnapshotPeriod() and BUCKETLIST_SIZE_SNAPSHOT_PERIOD for an example.

@@ -110,6 +119,12 @@ class LoopbackPeer : public Peer
double getReorderProbability() const;
void setReorderProbability(double d);

void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is only for testing, you should wrap this in #ifdef BUILD_TESTS for safety purposes. Same with the bool mSuspendTxFlooding{false}; above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this should be be only used for testing purposes (Hide made this change)? Is there any benefit in the real environment we might want to suspend flooding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LoopbackPeer is only used for testing, so no need for BUILD_TESTS here.

src/protocol-next/xdr Outdated Show resolved Hide resolved
@@ -1256,6 +1257,12 @@ OverlayManagerImpl::getMaxAdvertSize() const
return res;
}

UnorderedMap<Hash, std::vector<std::weak_ptr<Peer>>>
OverlayManagerImpl::getPendingTxSetRequests()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: OverlayManagerImpl::getPendingTxSetRequests() const

@@ -1256,6 +1257,12 @@ OverlayManagerImpl::getMaxAdvertSize() const
return res;
}

UnorderedMap<Hash, std::vector<std::weak_ptr<Peer>>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a const&

auto pendingTxSetRequests =
mApp.getOverlayManager().getPendingTxSetRequests();

for (auto& weakPeer : pendingTxSetRequests[frame->getContentsHash()])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% percent sure, but I think this unchecked map access may open up a DOS angle. Probably want to check with @marta-lokhova, but it looks like this assumes the incoming message asks for a pending TxSet hash that we currently have in pendingTxSetRequests. I don't think this assumption is valid given byzantine nodes or dropped messages from a well intentioned node. You should check that the hash exists in the map and not do an unchecked access.

@JuI3s JuI3s force-pushed the nomination-prototype branch 3 times, most recently from 7d7c1d7 to 1c38c02 Compare August 28, 2023 14:47
@JuI3s JuI3s force-pushed the nomination-prototype branch 2 times, most recently from 6b71e20 to 2e375bb Compare September 5, 2023 14:51
@JuI3s JuI3s force-pushed the nomination-prototype branch 5 times, most recently from bb0277b to 2701942 Compare September 18, 2023 04:46

if (slotIndex == 0)
{
CLOG_INFO(Overlay,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just send DONT_HAVE immediately here, since we're not tracking this tx set.

PR fix No.1
@stellar stellar deleted a comment from JuI3s Sep 28, 2023
@JuI3s JuI3s force-pushed the nomination-prototype branch 3 times, most recently from 2745014 to 5e5edd2 Compare September 28, 2023 20:52
// │ ┌────────┴───────┐ │ ▲
// Local node │Local node recvs│ │ │
// receives │ GetTxnSet │◀────┘ │
// new txn │ request. │ Y│
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "Retry after timeout" mean here? It would be the remote node retrying, right?

@marta-lokhova
Copy link
Contributor

Closing this as stale, we'll probably open a new PR picking up this work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants