Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix problems in Class Scheduling example for #76, and update to use run method for #59 #77

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kaiavintr
Copy link

The following two changes were required to allow spawning tasks using Tokio:

  • Call rand::thread_rng() each time the random number generator is needed, instead of storing the rng or passing it as a parameter. The thread-local random number generator cannot be passed between threads, so this was preventing the futures returned by some of the async functions from being Send.
  • Use transact_boxed instead of transact_boxed_local, so that the returned future is Send

After making the above changes, tokio::task::spawn can be used instead of thread::spawn (with no need to call block_on).
The return value of tokio::task::spawn is a different kind of JoinHandle, and the code needs to .await it instead of calling join().

Remove some comments that are not meaningful to the reader of the example code.

Add a note explaining how the database will be modified by running the code (especially since it does not use a directory).

Make the parameters of the simulation (number of students and number of operations to attempt per student) const values for clarity.

Instead of using all 1,620 class names, and initializing the seat counts to 100, use const values for number of classes and initial number of seats, choosing a random selection of class names from the list of possible combinations.

Use subspaces consistently throughout the code.
When getting the key range for querying the attends keys for the student, use the .subspace method to get a Subspace struct for the prefix. Add a comment explaining this because it seems confusing/unintuitive (but I don't think the API provides a better way of doing it).

Pass false for the snapshot parameter to the get method, and add comments explaining why this is important (since this could be a major gotcha for someone new to the API).

Make ditch_trx and signup_trx both return bool values indicating whether the operation was performed. If the user was not in the class, or was already in the class, return false. Obviously it could return an Err here to cancel the transaction, but returning success values other than () adds variety to the example code.

Make switch_classes() handle the cases where the student was not in the old class, or was already in the new class, canceling the transaction in each case. Define two new error codes for the two cases.

In perform_op, check the return values of ditch and signup, and update my_classes only if the operation succeeds. A similar change is not required for the switch case, because that will return an error result.

Log the operations that were successfully performed, to make the simulation more interesting. (The code already has commented-out printlns inside the transactions, but that logs operations even if they failed, or logs them multiple times if the transaction is repeated, which produces very cluttered and confusing output.)

Add a comment explaining why get_available_classes is called when it is, since that is one of the more confusing parts of the example.

Prevent panics when trying to choose values from my_classes or available_classes and the vectors are empty.

Use get_ranges_keyvalues and get_ranges instead of get_range, since get_range is not recommended for use by application code and has a confusing parameter (see #71 / #72).
In get_available_classes and signup_trx, use get_ranges_keyvalues the easy way, using .try_collect() to first get all results as a vector.
In run_sim, provide a commented-out example of iterating through the Stream returned by get_ranges_keyvalues, but use get_ranges here instead.

Copy link
Member

@PierreZ PierreZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @kaiavintr 👏
Clippy seems to be not passing, and I left a comment. Otherwise, LGTM 👍

async fn signup(db: &Database, student: String, class: String) -> Result<()> {
db.transact_boxed_local(
async fn signup(db: &Database, student: String, class: String) -> Result<bool> {
db.transact_boxed(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the transact_boxed and use db.run, to make it more like the java bindings?

async fn ditch(db: &Database, student: String, class: String) -> Result<()> {
db.transact_boxed_local(
async fn ditch(db: &Database, student: String, class: String) -> Result<bool> {
db.transact_boxed(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the transact_boxed and use db.run, to make it more like the java bindings?

@coveralls
Copy link

coveralls commented Nov 21, 2022

Pull Request Test Coverage Report for Build 3534187468

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 17 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.3%) to 74.512%

Files with Coverage Reduction New Missed Lines %
foundationdb/src/tuple/pack.rs 1 92.38%
foundationdb/src/directory/mod.rs 1 94.85%
foundationdb/src/directory/directory_layer.rs 5 93.33%
foundationdb/src/directory/directory_partition.rs 10 81.34%
Totals Coverage Status
Change from base Build 3147490591: 0.3%
Covered Lines: 4315
Relevant Lines: 5791

💛 - Coveralls

@kaiavintr
Copy link
Author

I have now fixed the Clippy issues.

I started trying to use the run method, but immediately ran into difficulties.
I was hoping it would only require minor changes to the transact_boxed calls, but it looks like some other code will need to be modified. Custom errors need to be handled differently, since they are boxed and wrapped in FdbBindingError::CustomError.

The class-scheduling.rs file is the example code for people using 0.7.0 (the documentation links to the examples folder in github), so there still needs to be a version of it that will compile with 0.7.0.
I will make a copy called "class-scheduling-0_7_0.rs" (dots are not allowed in the name) with the changes I have committed so far, and then try changing the original file to use run. Hopefully that name will be obvious enough for people to find it.

@kaiavintr
Copy link
Author

Sorry, I didn't realize that some of the errors are differences with the output of "cargo fmt". I will fix those formatting issues now.

@kaiavintr
Copy link
Author

kaiavintr commented Nov 22, 2022

I noticed there is an infinite recursion in Debug::fmt and Display::fmt for FdbBindingError, e.g. these will overflow the stack:

    println!("{}", FdbBindingError::ReferenceToTransactionKept);
    println!("{:?}", FdbBindingError::ReferenceToTransactionKept);

@kaiavintr kaiavintr changed the title Fix various problems in Class Scheduling example for issue #76 Fix problems in Class Scheduling example for #76, and update to use run method for #59 Nov 22, 2022
@kaiavintr
Copy link
Author

I realized that it is probably still too confusing if someone tries to use the example with 0.7.0 and does not notice the new class-scheduling-0_7_0.rs file.

I will work on adding descriptions of the examples to the README.md file, and also add a comment at the top of class-scheduling.rs (which could be removed after the next release).

@codecov-commenter
Copy link

Codecov Report

Base: 82.28% // Head: 82.55% // Increases project coverage by +0.26% 🎉

Coverage data is based on head (afbd2fc) compared to base (7bdc7ec).
Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
+ Coverage   82.28%   82.55%   +0.26%     
==========================================
  Files          26       26              
  Lines        5042     5039       -3     
==========================================
+ Hits         4149     4160      +11     
+ Misses        893      879      -14     
Impacted Files Coverage Δ
foundationdb/src/directory/directory_layer.rs 93.33% <0.00%> (-1.05%) ⬇️
foundationdb/src/tuple/pack.rs 92.37% <0.00%> (-0.19%) ⬇️
foundationdb-bindingtester/src/main.rs 89.73% <0.00%> (+0.06%) ⬆️
foundationdb/src/directory/mod.rs 94.85% <0.00%> (+1.47%) ⬆️
foundationdb/src/directory/directory_partition.rs 81.34% <0.00%> (+4.47%) ⬆️
foundationdb/src/database.rs 86.59% <0.00%> (+11.34%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@PierreZ
Copy link
Member

PierreZ commented Nov 28, 2022

Sorry for the long delay, nice work 👏

I agree with you, keeping two class-scheduling.rs is a bit too confusing. I think we can keep only one class-scheduling, and having comments about transact_boxed methods.

I have some minors comments, but I think we can merge your work once we have one class-scheduling, if you are okay with this.

EDIT:

I noticed there is an infinite recursion in Debug::fmt and Display::fmt for FdbBindingError, e.g. these will overflow the stack:

    println!("{}", FdbBindingError::ReferenceToTransactionKept);
    println!("{:?}", FdbBindingError::ReferenceToTransactionKept);

This does look like a nasty bug, feel free to open an issue, otherwise I will open-it.

@kaiavintr
Copy link
Author

Unfortunately it looks like the extra file is necessary, because this code still serves as part of the documentation for 0.7.0. (There are links from the 0.7.0 documentation to his folder.) If anyone tries to use the new example code with the 0.7.0 crate and sees compilation errors, they will probably just give up (because changing the code so it works with 0.7.0 is not trivial).

I think it's clearer now, with the updates to README.md and the additional comments in the source, but let me know if you'd like me to change anything, or if there's a better solution to the compatibility problem.

If this needs more discussion, I can add details to #59.

@PierreZ
Copy link
Member

PierreZ commented Nov 29, 2022

Unfortunately it looks like the extra file is necessary, because this code still serves as part of the documentation for 0.7.0. (There are links from the 0.7.0 documentation to his folder.) If anyone tries to use the new example code with the 0.7.0 crate and sees compilation errors, they will probably just give up (because changing the code so it works with 0.7.0 is not trivial).

I think it's clearer now, with the updates to README.md and the additional comments in the source, but let me know if you'd like me to change anything, or if there's a better solution to the compatibility problem.

If this needs more discussion, I can add details to #59.

My intention is to publish a 0.8.0 pretty soon, once current PRs are merged. This will fix the link in the documentation.

@kaiavintr
Copy link
Author

Ok, in that case I will remove class-scheduling-0_7_0.rs from the PR. Hopefully it will not be long before the 0.8.0 release.

@PierreZ
Copy link
Member

PierreZ commented Dec 11, 2022

Something is wrong on the Code coverage check, I will try to bump the version.

@PierreZ
Copy link
Member

PierreZ commented Dec 11, 2022

@kaiavintr, can you rebase your work? I pushed a commit on main that should fix the CI

@kaiavintr
Copy link
Author

Would you prefer a squash commit, or should I leave the 7 individual commits when rebasing?

@PierreZ
Copy link
Member

PierreZ commented Dec 12, 2022

Would you prefer a squash commit, or should I leave the 7 individual commits when rebasing?

If you don't mind, a squash commit is prefered 😄

…db-rs#76

Changes required to allow spawning the tasks using Tokio:
- Call rand::thread_rng() each time the random number generator is
  needed, instead of storing the rng or passing it as a parameter. The
  thread-local random number generator cannot be passed between
  threads, so this was preventing the futures returned by some of the
  async functions from being Send.
- Use transact_boxed instead of transact_boxed_local, so that the
  returned future is Send

After making the above changes, tokio::task::spawn can be used instead
of thread::spawn (with no need to call block_on).
The return value of tokio::task::spawn is a different kind of
JoinHandle, and the code needs to .await it instead of calling join().

Remove some comments that are not meaningful to the reader of the
example code.

Add a note explaining how the database will be modified by running the
code (especially since it does not use a directory).

Make the parameters of the simulation (number of students and number of
operations to attempt per student) const values for clarity.

Instead of using all 1,620 class names, and initializing the seat counts
to 100, use const values for number of classes and initial number of
seats, choosing a random selection of class names from the list of
possible combinations.

Use subspaces consistently throughout the code.
When getting the key range for querying the attends keys for the
student, use the .subspace method to get a Subspace struct for the
prefix. Add a comment explaining this because it seems
confusing/unintuitive (but I don't think the API provides a better way
of doing it).

Pass false for the snapshot parameter to the .get() method, and add
comments explaining why this is important (since this could be a major
gotcha for someone new to the API).

Make ditch_trx and signup_trx both return bool values indicating
whether the operation was performed. If the user was not in the class,
or was already in the class, return false. Obviously it could return an
Err here to cancel the transaction, but returning success values other
than () adds variety to the example code.

Make switch_classes() handle the cases where the student was not in the
old class, or was already in the new class, canceling the transaction
in each case. Define two new error codes for the two cases.

In perform_op(), check the return values of ditch() and signup(), and
update my_classes only if the operation succeeds. A similar change is
not required for the switch case, because that will return an error
result.

Log the operations that were successfully performed, to make the
simulation more interesting. (The code already has commented-out
printlns inside the transactions, but that logs operations even if
they failed, or logs them multiple times if the transaction is
repeated, which produces very cluttered and confusing output.)

Add a comment explaining why get_available_classes() is called when it
is, since that is one of the more confusing parts of the example.

Prevent panics when trying to choose values from my_classes or
available_classes and the vectors are empty.

Use get_ranges_keyvalues() and get_ranges() instead of get_range(),
since get_range() is not recommended for use by application code and
has a confusing parameter (see foundationdb-rs#71 / foundationdb-rs#72).
In get_available_classes() and signup_trx(), use get_ranges_keyvalues()
the easy way, using .try_collect() to first get all results as a
vector.
In run_sim(), provide a commented-out example of iterating through the
Stream returned by get_ranges_keyvalues(), but use get_ranges() here
instead.
…iondb-rs#59, and improve README.md for the examples folder

Change the `Result<T>` type (returned by various functions) to use
`FdbBindingError` instead of the custom `Error` type.
`Error` no longer needs to wrap `FdbError`, so the `Internal` variant
can be removed.

Implement the `std::error::Error` trait for the custom Error type.
This is needed because `FdbBindingError` can only wrap custom errors
if they implement `std::error::Error`.
Requires implementations of `Display` and `Debug`.

Add a `wrap_custom_error` function, because wrapping an error is a
bit complicated, and it's needed in four places.

Change the type of the `txn` parameter to `RetryableTransaction` for
`ditch_trx`, `signup_trx`, and `switch_classes_body`.
Change the string parameters to type `String` to avoid lifetime
errors with `&str` references due to the combination of closures and
`async` functions.

Handle the `FdbBindingError` returned by perform_op.
The error handingcode now differentiates between custom errors (which
require calling `get_available_classes` to update the cache) and
other errors, which are fatal.

Add comment explaining that class-scheduling.rs is not compatible with v0.7.0

Also add descriptions of the examples to README.md
@kaiavintr
Copy link
Author

Ok, it's rebased and squashed to two commits now (I think keeping the .run() changes in a separate commit makes it more clear).

@PierreZ PierreZ added correctness run hundred of iterations on the bindingTester during CI and removed correctness run hundred of iterations on the bindingTester during CI labels Dec 13, 2022
@PierreZ
Copy link
Member

PierreZ commented Dec 14, 2022

Thanks @kaiavintr

Something is wrong, there is no CI associated to this PR 🤔

the PR correcness has been triggered by me, but there should be others
image

@PierreZ
Copy link
Member

PierreZ commented Jan 8, 2023

@kaiavintr, could you trigger the CI by pushing/altering a commit? Something seems wrong

@kaiavintr
Copy link
Author

Do you still need me to do this? I was trying to figure out how to change the pull request without actually changing anything. Does it need to be a meaningful change in a .rs file in order to trigger the CI?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants