Switch from RAII to OBRM Terms, expand and improve parts (#322) #323

Draft. Wants to merge 12 commits into main.

@9SMTM6 commented Oct 16, 2022

This attempts to solve #322.

Since what should finally be done was still up for discussion, I've mostly followed my own opinion for now and have not accounted for that of others.

This obviously isn't entirely finished and still needs polishing, but it should serve as a basis for discussion.

@9SMTM6 (Author) commented Oct 16, 2022

markdownlint doesn't seem to have an autofix for the line-length lint.

As long as we're still discussing the changes I won't put in the work to manually fix these lints. Keeping them fixed by hand through subsequent changes is frankly not a use of my time that I'm willing to accept.

@9SMTM6 (Author) commented Oct 16, 2022

Also, I should note that I don't have a terrible lot of experience with RAII in C++, so these sections are more at risk of being slightly wrong or not as complete as one could wish.

Comment on lines +3 to +6
<!-- I'm not sure this is idiomatic Rust; usually one would want to handle that in the types themselves. -->
<!-- Doesn't draw comparisons to `defer` in e.g. Go or Zig. -->
<!-- Is unnecessarily verbose. If one wanted to do that, one could simply define a `Defer` type that holds a closure or a function pointer that gets executed on drop. -->
<!-- There are also crates that aim to implement this via macros; I expect they use something like the above or below. -->
Collaborator

I think this should go into a separate issue.

Author

Yeah, I just noticed that while I was there, and I neither had the time nor was sure how to approach it. But a separate issue for that is totally fine by me.
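For reference, the `Defer` idea mentioned in the commented lines above could look roughly like the following. This is only a minimal sketch: the names `Defer`, `action` and `run_with_defer` are made up for illustration and are not taken from the book or from any crate.

```rust
use std::cell::RefCell;

/// Hypothetical `Defer` guard: runs the stored closure once, when it is dropped.
struct Defer<F: FnOnce()> {
    action: Option<F>,
}

impl<F: FnOnce()> Defer<F> {
    fn new(action: F) -> Self {
        Defer { action: Some(action) }
    }
}

impl<F: FnOnce()> Drop for Defer<F> {
    fn drop(&mut self) {
        // `take` moves the closure out so that the `FnOnce` can be called by value.
        if let Some(action) = self.action.take() {
            action();
        }
    }
}

/// Records the order in which the body and the deferred action run.
fn run_with_defer() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _cleanup = Defer::new(|| log.borrow_mut().push("deferred"));
        log.borrow_mut().push("body");
    } // `_cleanup` is dropped here, so the closure runs after the body.
    log.into_inner()
}

fn main() {
    println!("{:?}", run_with_defer());
}
```

The closure is stored in an `Option` only so that `drop`, which receives `&mut self`, can move it out before calling it.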

9SMTM6 and others added 3 commits October 16, 2022 21:31
Co-authored-by: simonsan <14062932+simonsan@users.noreply.github.com>
Co-authored-by: simonsan <14062932+simonsan@users.noreply.github.com>
@neithernut (Contributor) left a comment

Some comments from my side. I actually only have some minor nits about the content. Most of my ramblings are more motivated by @9SMTM6 expressing not being too familiar with C++.

"Ownership Based Resource Management" (OBRM) - also known as ["Resource Acquisition is Initialisation" (RAII)][wikipedia] - is an idiom meant to make handling resources easier and less error-prone.

In essence, it means that an object serves as a proxy for a resource: to create the object you have to acquire the resource, and once that object isn't used anymore (determined by it being unreachable) the resource is released.
It is said that the object guards access to the resource.
Contributor

I'd highlight the "O" in "OBRM" a bit more: the object "owns" the resource and thus is responsible for releasing it. But that may be personal preference.
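To make the ownership aspect concrete, here is a small sketch of an object that owns a resource and releases it when it goes away. The `Connection` type and its shared counter are hypothetical stand-ins for a real resource such as a socket or a file handle.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource handle; the counter tracks how many are currently held.
struct Connection {
    open_count: Rc<Cell<u32>>,
}

impl Connection {
    /// Creating the object acquires the resource.
    fn open(open_count: Rc<Cell<u32>>) -> Self {
        open_count.set(open_count.get() + 1);
        Connection { open_count }
    }
}

impl Drop for Connection {
    /// The owner is responsible for releasing the resource.
    fn drop(&mut self) {
        self.open_count.set(self.open_count.get() - 1);
    }
}

/// Returns the number of open connections (inside the scope, after the scope).
fn open_counts() -> (u32, u32) {
    let counter = Rc::new(Cell::new(0));
    let inside = {
        let _conn = Connection::open(Rc::clone(&counter));
        counter.get()
    }; // `_conn` goes out of scope here and releases its resource.
    (inside, counter.get())
}

fn main() {
    println!("{:?}", open_counts());
}
```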


The core aim of the borrow checker is to ensure that references to data do not
outlive that data. The RAII guard pattern works because the guard object
contains a reference to the underlying resource and only exposes such
outlive that data. The OBRM guard pattern works because the guard object
Contributor

In the sections above you say "OBRM-Object". I'd stick to that one.

Author

It was an attempt to avoid having to rewrite the sections below, and also to introduce the guard terminology, which is also used in Rust's stdlib (e.g. with the mutex guard).
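The mutex guard from the standard library shows the guard terminology in practice. The following is a small sketch; the helper name `increment_and_check` is made up for illustration, while `Mutex::lock`, `Mutex::try_lock` and `MutexGuard` are the real standard library API.

```rust
use std::sync::Mutex;

/// Increments the counter while holding the lock, then reports whether the
/// lock is free again once the guard has gone out of scope.
fn increment_and_check(counter: &Mutex<i32>) -> (i32, bool) {
    {
        // `lock` returns a `MutexGuard`; the data is only reachable through it.
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    } // The guard is dropped here, which unlocks the mutex.

    // `try_lock` succeeds only if nobody holds the lock any more.
    let lock_is_free = counter.try_lock().is_ok();
    let value = *counter.lock().unwrap();
    (value, lock_is_free)
}

fn main() {
    let counter = Mutex::new(0);
    println!("{:?}", increment_and_check(&counter));
}
```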

C++ has complex rules for copying and moving of values, which Rust managed to simplify while keeping most advantages.
In C++, behavior on a "move" (which is semantically meant to signify passing held resources to the moved-to value) is customizable through the move constructor and the move assignment operator.
But after a variable has been "moved out of", it must still be accessible in C++.
In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
Contributor

AFAIU you don't reassign a value to a variable moved away. You bind something new to a name which now happens to be unused. One consequence is that the new thing doesn't need to have the same type.

Also, I'd split this sentence into two, e.g.

Suggested change
In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
In Rust, a moved-out-of variable can not be used (this is referred to as "destructive move"), though the name of that variable can be re-used.
The behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.

@9SMTM6 (Author), Oct 17, 2022

> AFAIU you don't reassign a value to a variable moved away. You bind something new to a name which now happens to be unused. One consequence is that the new thing doesn't need to have the same type.

I'm pretty sure you can reuse a variable that was moved out of, and I have done so. In that case the new value has to have the same type, and it'll use the same memory, which is the purpose of that rule (and probably the reason C++ requires that an object still be usable after being moved out of). Rust found a somewhat more elegant way.

If you only want to reuse the name, then yeah, you can create a new let binding shadowing the variable, but that's not what I'm referring to.

But perhaps that's a sign that this should be made more explicit?

Otherwise splitting the sentence is a good suggestion.

Contributor

I think we're talking past each other. Are you referring to variables implementing Copy? Then I agree with you. But OBRM-Objects do, virtually by definition, implement Drop and thus not Copy.

I thought you meant creating a new binding using the same name. In this case you may end up reusing the memory previously occupied by the moved value, but that being specified in the language would be news to me.

@9SMTM6 (Author), Oct 18, 2022

Yes, we probably are. Here's some code to ensure we understand each other now:

struct MoveSemantics {
    field: String,
}

struct WhateverElse(u32);

fn main() {
    let mut to_move_out_of = MoveSemantics {
        field: String::from("Whatever"),
    };
    // Move the value out of `to_move_out_of`;
    // afterwards it is illegal to access it.
    let moved_to = to_move_out_of;
    // Compile error (E0382, use of moved value) if uncommented:
    // let attempt_access = to_move_out_of.field;
    // Reassignment to the variable is legal and, AFAIK, reuses the same memory
    // (not for the string, which is a separate allocation; but even if
    // `to_move_out_of` were on the heap, that memory would be reused).
    // This might be beneficial in some situations.
    // The new value needs to be of the same type, of course.
    to_move_out_of = MoveSemantics {
        field: String::new(),
    };
    // Shadowing the variable name creates a new variable; it doesn't use the same memory.
    let to_move_out_of = WhateverElse(2);
}

I think I used the right terminology here: the variable is the name and refers to the memory, the value is what's written in the memory and, abstractly, also the connected resources. You can also define a new variable which has the same name as an old one, which would be what you've described above, as far as I understood it.

@9SMTM6 (Author), Oct 18, 2022

I was referring to the reassignment of the variable.

The reason I felt the need to mention that possibility is that it's the one "usage" of the variable that is allowed; AFAIK all others are forbidden. So for completeness it's required, unless we find another formulation that covers only "usages" of a variable other than reassignment. Also, some C++ programmers might be looking for something like that, and might otherwise wrongly think you can't do it.

We could say that's not worth the effort and strike it, or we expand on it, or we keep it as-is and hope people understand it correctly.

Contributor

Just as I thought, I misinterpreted what you wrote. I think we're on the same page now.
The only case where you can reuse a variable after passing it by value is if its type implements Copy, so maybe the following would work?

Suggested change
In Rust, a moved-out-of variable can not be used, only reassigned a new value (this is referred to as "destructive move"), and the behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.
In Rust, a moved-out-of variable can not be used (this is referred to as "destructive move") unless its type implements [Copy].
The behavior on a move is not customizable, instead a move simply copies the bytes of the moved-out value into the moved-into variable, and ensures the semantics of a destructive move.

The hint at Copy should be sufficient imo. C++ people new to Rust will probably look it up.
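The difference between a `Copy` type and a moved-from variable that gets a fresh value can be sketched as follows. `Point`, `Label` and `copy_and_move` are hypothetical names chosen only for this illustration.

```rust
/// Plain data: opting into `Copy` keeps the source usable after assignment.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Owns a heap allocation through `String`, so it cannot be `Copy`.
#[derive(Debug, PartialEq)]
struct Label {
    text: String,
}

fn copy_and_move() -> (Point, Label, Label) {
    let p = Point { x: 1, y: 2 };
    let q = p; // bitwise copy; `p` stays valid
    let sum = Point { x: p.x + q.x, y: p.y + q.y };

    let mut a = Label { text: String::from("first") };
    let b = a; // move; reading `a` at this point would be a compile error
    a = Label { text: String::from("second") }; // assigning a fresh value is allowed
    (sum, a, b)
}

fn main() {
    println!("{:?}", copy_and_move());
}
```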

@simonsan (Collaborator), Oct 18, 2022

> C++ people new to Rust will probably look it up.

Please keep in mind that this book is not aimed only at C++ developers or experienced programmers. We want to keep it as inclusive as possible, also for newcomers from other languages. I say that because if there is a term that C++ developers need to look up, it's probably good to at least try to explain it a bit, or to use a less complex explanation and go into less detail.

I haven't had much time over the last few days to review more of this article, but it's good that others did and that you keep working on it together. I will probably manage to read over it (in terms of reviewing) at the beginning of next week.


C++ has complex rules for copying and moving of values, which Rust managed to simplify while keeping most advantages.
In C++, behavior on a "move" (which is semantically meant to signify passing held resources to the moved-to value) is customizable through the move constructor and the move assignment operator.
But after a variable has been "moved out of", it must still be accessible in C++.
Contributor

This paragraph is not exactly wrong, but an imo important thing to understand when working with C++ is that the rules in the language are actually not that incredibly complex when it comes to object lifetimes. But they are from another era, and the choices back then had some unfortunate consequences, some of which you detail further down.
Also, the whole "move" in C++ is really more a convention than a concept that exists in the language. And so you end up with zombie objects which are left in an "undefined-but-recoverable" state (simplified for educational purposes) after their contents were moved to another instance.

Author

You are correct, the rules are not really that complex. What is complex is executing it.

As you say, because moves are more of a convention in C++, which more or less just added a move operation and let developers implement it, you get undefined states for many objects even in the standard library, with a few exceptions where the behavior is defined in the standard, like unique_ptr AFAIK.

Contributor

Not exactly sure but I think even for unique_ptr the standard doesn't mandate that the object moved from is in a defined state but only that a call to std::unique_ptr::reset() will transition it to a defined state. But I imagine any sane implementation will leave a moved-from std::unique_ptr in a reset-state, already.

@9SMTM6 (Author), Oct 18, 2022

We are getting a bit sidetracked ;-P.

I could rewrite that section to say something along the lines of "copy and move in C++ are just operations with connected conventions, which must be upheld across the respective copy constructor and copy assignment operator as well as the move constructor and move assignment operator, and might also need adjustments to the destructor". AFAIK that would be the correct description, but it seems fairly complex right now and, more importantly, AFAIK isn't really how people usually think about copy and move.

Other suggestions, or do you think that suggestion is fine?

Contributor

No, I think your change is perfectly fine.
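For comparison with the C++ convention discussed in this thread: in Rust, a move that leaves the source in a valid, empty state only happens when asked for explicitly, for example via `std::mem::take`. The sketch below illustrates that; the function name `take_leaves_default` is made up, and drawing the parallel to a C++ moved-from object is an editorial assumption rather than something stated in the PR.

```rust
use std::mem;

/// Moves the contents out of `source` and returns (what is left, what was taken).
fn take_leaves_default() -> (String, String) {
    let mut source = String::from("payload");
    // `mem::take` moves the value out and leaves `String::default()`,
    // an empty string, behind, so `source` remains usable.
    let destination = mem::take(&mut source);
    (source, destination)
}

fn main() {
    println!("{:?}", take_leaves_default());
}
```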


<!-- TODO this should be improved. I find it difficult to separate the creation and management of RAII objects in the **-constructor (so at declaration time) from that when using the RAII object. Feedback welcome. -->
This massively simplifies creation and management of OBRM objects compared to C++. There, one often has to do a lot more manual management of RAII classes (defining the `destructor`, the `copy constructor`, the `copy assignment operator`, the `move constructor` and the `move assignment operator` all at once), which is very error-prone. RAII objects in C++ also have to have a legal moved-out state, which often makes usage of these classes more problematic.
For example, `unique_ptr`, the C++ equivalent to `Box`, can contain `nullptr`.
Contributor

... and thus is more comparable to an Option<Box>, but less safe.

Author

In terms of how it's working under the hood if used correctly, yes. But many people work under the assumption that unique_ptr will always, or at least mostly, not contain nullptr.

I'm not certain how to express that without overcomplicating things, and in the end unique_ptr was introduced with similar aims to Box, so I kept it simple at the expense of being perhaps not entirely correct.

Author

We could fix the correctness though, by not saying unique_ptr is equivalent, but instead something like that it was introduced for the same purpose.

Contributor

Yes, that would also be fine.

Author

Rewrote it, but perhaps you can think of a more fluent way to express it. Otherwise feel free to resolve this.
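The `Option<Box<T>>` comparison from this thread can be sketched as follows. A `Box` is never null, so the "empty" state has to be spelled out in the type, and `Option::take` plays the role of moving the pointer out while leaving an empty slot behind. The helper name `take_box` is made up for illustration.

```rust
use std::mem::size_of;

/// Moves the box out of the slot, leaving `None` behind, and reports what was left.
fn take_box(slot: &mut Option<Box<i32>>) -> (Option<i32>, bool) {
    let taken = slot.take(); // the slot now holds `None`
    (taken.map(|boxed| *boxed), slot.is_none())
}

fn main() {
    // `None` is represented as the null pointer, so the option adds no size.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());

    let mut slot = Some(Box::new(7));
    println!("{:?}", take_box(&mut slot));
}
```

Unlike a `unique_ptr`, the possibly-empty state is visible in the type, so the compiler forces callers to handle `None` before dereferencing.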


Rust also moves values by default, which can be opted out of by explicitly calling `Clone::clone` on each assignment, or on a type level by implementing `Copy`.
It is currently forbidden, and that is expected to continue, to implement `Copy` on a type that implements `Drop` or contains a type that implements `Drop`.
This means that resource acquisition in Rust is a lot more explicit than in C++, as it cannot happen during a simple assignment as it can in C++.
Contributor

You more or less captured this already:
In C++, objects are expected to stay where they are (in memory), but you end up copying objects every now and then (and knowing when it's acceptable is important for a C++ dev). Forbidding that is often an explicit choice, and moving is also often more explicit.
In Rust, values tend to move around, and if they are simple enough (like a Plain Old Datatype in C/C++), they can impl Copy. If they are not, you have to impl Clone in order to duplicate an instance, and cloning is always explicit (you have to call Clone::clone).
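A small sketch of the point that acquisition never hides behind a plain assignment in Rust: moving a handle acquires nothing, while duplicating it requires an explicit `clone` call. The `Handle` type and its acquisition counter are hypothetical and exist only for this illustration.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical handle that counts every acquisition of its resource.
struct Handle {
    acquisitions: Rc<Cell<u32>>,
}

impl Handle {
    fn acquire(acquisitions: Rc<Cell<u32>>) -> Self {
        acquisitions.set(acquisitions.get() + 1);
        Handle { acquisitions }
    }
}

impl Clone for Handle {
    /// Duplicating the handle acquires a second resource, and only on request.
    fn clone(&self) -> Self {
        Handle::acquire(Rc::clone(&self.acquisitions))
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Release would go here. Because of this impl, `Handle` cannot also be `Copy`.
    }
}

/// Returns the acquisition count after a move and after an explicit clone.
fn count_acquisitions() -> (u32, u32) {
    let counter = Rc::new(Cell::new(0));
    let first = Handle::acquire(Rc::clone(&counter));
    let moved = first; // plain assignment moves: no new acquisition
    let after_move = counter.get();
    let _second = moved.clone(); // explicit clone: one more acquisition
    (after_move, counter.get())
}

fn main() {
    println!("{:?}", count_acquisitions());
}
```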

Author

I believe if we address the points above we might integrate this with the section above about (currently) "complex C++ rules", but as it is this section still holds new information as far as I can see.

I'd defer this until we've cleared up the parts above.

@simonsan added the labels C-stale (Category: An issue/a PR that hasn't been worked on in a longer time, usually 90 days) and C-amendment (Category: Amendments to existing content) on Apr 6, 2023
@simonsan added this to In progress in Content via automation on Apr 6, 2023
@simonsan added the C-zombie label (Category: A PR/an Issue that might be still useful but was closed) on Dec 23, 2023
@simonsan (Collaborator)

Any updates on this? 🚀

4 participants