Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: zeroize Point and BigInt on drop, and other zeroize changes #154

Closed
wants to merge 9 commits into from
Closed

Fix: zeroize Point and BigInt on drop, and other zeroize changes #154

wants to merge 9 commits into from

Conversation

nskybytskyi
Copy link

@nskybytskyi nskybytskyi commented Nov 15, 2021

  • Impl Zeroize for Point<E>;
  • Zeroizing<Point<E>> for points that need to be zeroized;
  • zeroize num-bigint properly and on drop;
  • delete implementation of Zeroize for BigInts.

@nskybytskyi
Copy link
Author

Is everything okay with your Travis CI? 🙃

@survived
Copy link
Contributor

survived commented Nov 16, 2021

@Sky-Nik we ran out of credits on Travis CI 🙂 I'll run tests locally

(suggested by @survvived and @elichai)
Note: actual contents of gmp bigint are zeroized on drop within gmp, so it does not make sense to zeroize them by hand.
src/arithmetic/mod.rs Outdated Show resolved Hide resolved
Note: I had to tweak the implementation of zeroize a bit once again, to get past borrow checker. I believe it still does the same thing, and it still passes my tests, but feel free to double-check.
@DmytroTym
Copy link

Potentially closes ZenGo-X/multi-party-ecdsa#148 and #150.

@nskybytskyi
Copy link
Author

@survived one more thing: earlier we were concerned with Vec reallocations, but then we reviewed the code and realized that the current version of curv is mostly iterator-based. Dima and I are not huge Rust experts, so we are not sure if such iterators leave intermediate values in memory or if they construct the resulting object in place. Could you please clarify this?

@survived
Copy link
Contributor

the current version of curv is mostly iterator-based

What do you mean? Could you give an example?

earlier we were concerned with Vec reallocations

Actually, I'm not sure whether we should address reallocation concern in num-bigint backend. It slows down the performance which might be not desirable. Maybe we should address reallocations zeroizing in another backend?

@DmytroTym
Copy link

DmytroTym commented Nov 19, 2021

What do you mean? Could you give an example?

One random example might be Polynomial's sample_exact_with_fixed_const_term. Its predecessor in curv 0.7, Feldman VSS's sample_polynomial method, had a problem: it first created a vector of coefficients, consisting of one element, and then extended it with a vector of random_coefficients. Depending on the size of the latter, coefficients vector might have been reallocated in memory after this extension. Now, even if elements of this vector implemented Zeroize on drop, they would not get zeroed out from their old location in memory. We got this impression looking at the code of Vec reallocation and verified it experimentally. I guess in situations like this, it's better to allocate memory in advance using Vec's with_capacity?
That said, the new code is very different: it uses iterators and we do not understand if in this specific case and in other similar places in curv any reallocations happen (we do not consider num-bigint here, only higher-level curv methods). Do you happen to know if/when iterators reallocate and are there places in curv where it might happen? We can look into that ourselves if you don't know the answer straight away.

@survived
Copy link
Contributor

survived commented Nov 22, 2021

I don't think that in this particular case vector gets reallocated. Since it's constructed from Iterator, it can use Iterator::size_hint in order to allocate sufficient space to avoid reallocations. Here's a small program that traces every allocation, note that number of allocations is the same regardless of the tail size (try 10/100/1000/10000).

Though relocation might happen somewhere else in the library. The entire library needs to be carefully reviewed in order to make sure that it doesn't happen.

@survived
Copy link
Contributor

Moreover, (but I might be mistaken) every time we pass Scalar by value as an argument to some function, it gets copied and old version persist on stack without zeroing out. I mean:

let scalar = Scalar::random();
some_function(scalar);

Here scalar is supposed to be moved to some_function which means that its bits are going to be copied, and old value persist on stack. Possible solution is to clone it:

let scalar = Scalar::random();
some_function(scalar.clone());

In this case, scalar is not moved and is dropped properly. So there are a lot of things we need to keep in mind in order to design zeroing properly.

@survived survived linked an issue Nov 22, 2021 that may be closed by this pull request
@nskybytskyi
Copy link
Author

  1. Iterators: it is good news that no reallocations happen!
  2. Pass by value: it appears to me that most curv methods do not take Scalars by value. We will definitely take care of such concerns in our project, but I don't know if we can improve zeroize design without performance regression and need to modify some client-side code 😞

Note: as pointed out by @survived, passing Scalars by value will create an extra copy on the stack. Several methods and functions were updated to take a reference instead.
@nskybytskyi nskybytskyi changed the title Fix: impl zeroize for point and zeroize points that need to be zeroized Fix: zeroize Point and BigInt on drop, and other zeroize changes Nov 22, 2021
@DmytroTym
Copy link

DmytroTym commented Nov 22, 2021

As for scalars getting implicitly copied - in some functions, they are passed by reference and I think it is, if possible, the way to go, right? Any code that calls such function won't be able to accidentally create a copy. If scalars are passed by value, all callers should carefully clone scalars and so we transfer our problems onto the client code.
However, this copying happens not only with functions, but also when creating structures, right?
Anyway, do you think that we can 1) completely eliminate leaks on the stack 2) should take it as seriously as secrets persisting on the heap? For example, in GMP manual on secure methods, they guarantee no implicit heap allocations, but don't do the same for the stack:

Note however that compilers may choose to spill scalar values used within these functions to their stack frame and that such scalars may contain sensitive data.

@survived
Copy link
Contributor

survived commented Nov 22, 2021

@Sky-Nik,

most curv methods do not take Scalars by value

Indeed, we didn't take into account that aspect while were designing the library. Good opportunity to improve it. I don't think that bunch of .clone() will affect performance in any noticeable way, since all scalars are actually Copyable

@DmytroTym,

in some functions, they are passed by reference and I think it is, if possible, the way to go, right?

Yes, accepting secret by reference is the best way to avoid accident moving. (secrets like Scalar<E>; on other hand for BigInt it doesn't matter)

this copying happens not only with functions, but also when creating structures, right?

My intuition says that this code leaves a copy on the stack (unless compiler optimizes out extra stack variables; but we always can write more complex function that cannot be optimized):

fn keypair() -> Keypair {
    let sk = Scalar::random();
    let pk = Point::generator() * &sk;
    Keypair{ sk, pk }
}

And it needs to be rewritten:

fn keypair() -> Keypair {
    let sk = Scalar::random();
    let pk = Point::generator() * &sk;
    Keypair{ sk: sl.clone(), pk }
}

(but clippy will be complaining that .clone() is redundant)

do you think that we can 1) completely eliminate leaks on the stack 2) should take it as seriously as secrets persisting on the heap

I believe that artefacts on the stack is not a really bad thing, as they much likely are going to be rewritten by next function call. But we ideally should minimize these leaks. I feel like it's more important to vanish secrets on heap, but it's doubtable

@elichai
Copy link
Contributor

elichai commented Nov 22, 2021

@survived Your code still returns KeyPair by value, meaning it will copy it into the caller stack.
if you really want to do a best effort your best bet is to probably avoid the stack altogether and return a Box (even then, there still will be stack variables all over the place underneath these operations)

Note: as pointed out by @DmytroTym and @survived, moving Scalars into structs will create an extra copy on the stack. Several struct constructions were updated to clone instead.
@nskybytskyi
Copy link
Author

@survived I added two commits that try to pass Scalars by reference and clone them when constructing structs, but I must say that I agree with elichai. It sounds extremely difficult to wipe out the stack completely, as values are copied around all the time, including the underlying gmp implementation.

The performance regression I was talking about earlier actually refers to the "Box everything" solution suggested by elichai.

@survived
Copy link
Contributor

@elichai,

Your code still returns KeyPair by value, meaning it will copy it into the caller stack.

Doesn't it writes directly to caller stack? It writes to own stack first and then just copies the value to caller stack?

Anyway, zeroing stack values very tricky, right

@DmytroTym
Copy link

DmytroTym commented Nov 22, 2021

Anyway, zeroing stack values very tricky, right

It seems so :(
Do you think that, as a last resort, something like this can be used to just clear the stack after the computations are completed? Not by curv itself, but by its callers.

@survived
Copy link
Contributor

Can't say anything about that crate, I haven't used it. But looks like it erases stack, yeah, but you need to choose accurate amount of pages to be cleared.

@DmytroTym
Copy link

DmytroTym commented Nov 23, 2021

@survived would it be OK if we:

  1. Check if any reallocations happen (maybe there's a stray vector getting resized somewhere or smth?). If I understand correctly, this is the last option for values persisting on the heap (well, except for those potentially allocated by BigInt backends, which is a whole another story).
  2. Make a reasonable effort of zeroing out stack values, keeping in mind that we probably won't be able to do it completely.

@nskybytskyi
Copy link
Author

@survived to clarify:

  1. I don't plan on doing much more in this PR, mostly because I doubt that I'll achieve a better solution than the one we already have.
  2. @DmytroTym wanted to look through the code to make sure that I haven't missed something we already discussed, and call it a day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Zeroize for Point<E>
4 participants