src: mark generated snapshot_data as const #45786

Merged
merged 1 commit into from Dec 10, 2022
2 changes: 0 additions & 2 deletions src/node_snapshot_builder.h
@@ -32,8 +32,6 @@ class NODE_EXTERN_PRIVATE SnapshotBuilder {
v8::Isolate::CreateParams* params);

private:
-// Used to synchronize access to the snapshot data
-static Mutex snapshot_data_mutex_;
static const std::vector<intptr_t>& CollectExternalReferences();

static std::unique_ptr<ExternalReferenceRegistry> registry_;
7 changes: 2 additions & 5 deletions src/node_snapshotable.cc
@@ -978,7 +978,7 @@ static void WriteCodeCacheInitializer(std::ostream* ss, const std::string& id) {
*ss << " },\n";
}

-void FormatBlob(std::ostream& ss, SnapshotData* data) {
+void FormatBlob(std::ostream& ss, const SnapshotData* data) {
ss << R"(#include <cstddef>
#include "env.h"
#include "node_snapshot_builder.h"
@@ -1004,7 +1004,7 @@ static const int v8_snapshot_blob_size = )"
WriteStaticCodeCacheData(&ss, item);
}

-ss << R"(SnapshotData snapshot_data {
+ss << R"(const SnapshotData snapshot_data {
// -- data_ownership begins --
SnapshotData::DataOwnership::kNotOwned,
// -- data_ownership ends --
@@ -1036,7 +1036,6 @@ static const int v8_snapshot_blob_size = )"
};
const SnapshotData* SnapshotBuilder::GetEmbeddedSnapshotData() {
-Mutex::ScopedLock lock(snapshot_data_mutex_);
return &snapshot_data;
}
} // namespace node
@@ -1053,8 +1052,6 @@ static void ResetContextSettingsBeforeSnapshot(Local<Context> context) {
context->AllowCodeGenerationFromStrings(true);
}

-Mutex SnapshotBuilder::snapshot_data_mutex_;
Member

We actually have a const_cast at L1064, though I think V8 doesn't actually need params->snapshot_blob to be non-const. Can you change v8::Isolate::CreateParams::snapshot_blob to a const pointer too and see if it works? If so, we should send a patch upstream first to get rid of the const_cast, so that they don't start actually mutating the blob without us knowing about it.

Member

Uh-oh, it appears V8 is const-casting too... 😅

: SerializedData(const_cast<byte*>(snapshot.begin()), snapshot.length()) {

Member Author

Hm yeah… our const_cast is technically invalid then as well, right? But yeah, agreed that the right thing to do here is to send a patch upstream. I’ll try to put one together and link it here.

I think what you’re bringing up is actually an argument to make this change either way, though. V8 really can’t mutate the blob, since it needs to be re-usable across Isolates (both concurrently and sequentially). If V8 were to mutate the blob, we would need to learn about that; and an easy way to do that is to make the actual data const so that attempting to mutate it crashes.
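
To make the const_cast concern concrete, here is a minimal sketch of the language rule as I understand it: casting away const and only reading is well-defined, while writing through the resulting pointer is undefined behavior if the underlying object was defined const. `SumBytes`, `kBlob`, and the other names are hypothetical and are not taken from Node.js or V8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an API that takes a non-const pointer
// but only ever reads through it.
inline int SumBytes(char* data, std::size_t size) {
  int sum = 0;
  for (std::size_t i = 0; i < size; ++i) sum += data[i];
  return sum;
}

// A blob that is defined const, like the generated snapshot data.
static const char kBlob[] = {1, 2, 3, 4};

inline int SumConstBlob() {
  // Well-defined: const is cast away, but the data is only read.
  // Writing through this pointer would be undefined behavior,
  // because kBlob itself is a const object.
  return SumBytes(const_cast<char*>(kBlob), sizeof(kBlob));
}

inline int MutateNonConstBuffer() {
  char buffer[] = {1, 2, 3, 4};
  const char* view = buffer;
  // Well-defined: the underlying object (buffer) is not const,
  // so writing after const_cast is allowed.
  const_cast<char*>(view)[0] = 10;
  return SumBytes(buffer, sizeof(buffer));
}

inline void RunConstCastExample() {
  assert(SumConstBlob() == 10);
  assert(MutateNonConstBuffer() == 19);
}
```

Under this reading, the existing const_cast is only a problem if something downstream actually writes to the blob.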

Member Author (@addaleax, Dec 8, 2022)

Actually, V8’s StartupData is just a const char* data + int raw_size. That can’t really be mutated by V8 anyway.

Member

Yeah, it looks like V8 isn't actually mutating the data, just reusing the data structure that is also used for serialization. Perhaps some additional field in SerializedData that can be used to DCHECK mutation in non-const methods would be enough.
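
A rough sketch of what such a debug check could look like, assuming a read-only flag is stored next to the data pointer. `SerializedDataSketch`, `read_only_`, and `SetByte` are hypothetical names and do not reflect V8's actual SerializedData class; plain `assert` stands in for DCHECK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a SerializedData-like class.
class SerializedDataSketch {
 public:
  SerializedDataSketch(std::uint8_t* data, std::size_t size, bool read_only)
      : data_(data), size_(size), read_only_(read_only) {}

  // Const accessors are always allowed.
  const std::uint8_t* data() const { return data_; }
  std::size_t size() const { return size_; }
  bool read_only() const { return read_only_; }

  // Non-const method: checks (in debug builds) that the buffer
  // was not wrapped from read-only input before writing to it.
  void SetByte(std::size_t index, std::uint8_t value) {
    assert(!read_only_ && "attempt to mutate a read-only snapshot view");
    assert(index < size_);
    data_[index] = value;
  }

 private:
  std::uint8_t* data_;
  std::size_t size_;
  bool read_only_;
};

inline std::uint8_t RunSerializedDataExample() {
  std::uint8_t buffer[] = {1, 2, 3};
  // Serialization path: the buffer is writable, so mutation is fine.
  SerializedDataSketch writable(buffer, sizeof(buffer), /*read_only=*/false);
  writable.SetByte(0, 42);
  return writable.data()[0];
}
```

A deserialization path would construct the object with `read_only` set to true, so any accidental write trips the check in debug builds instead of silently corrupting a shared blob.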

Member

> If V8 were to mutate the blob, we would need to learn about that; and an easy way to do that is to make the actual data const so that attempting to mutate it crashes.

Isn't mutating const-casted data undefined behavior, so it's not certain that it would crash (or it would crash reliably)? And that's my main concern, because we might be seeing crashes that can't be easily linked to the mutation, depending on how the mutation happens. It'd probably be safer to have some DCHECK against the mutation, either in V8, or we DCHECK a verification of the snapshot checksum when we initialize a new isolate with the snapshot data.
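
The checksum idea could be sketched roughly as follows. This uses 32-bit FNV-1a purely for brevity; it is not the checksum V8 uses, and `BlobWithChecksum` and `VerifyBlob` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a hash, chosen here only because it is short.
inline std::uint32_t Fnv1a(const char* data, std::size_t size) {
  std::uint32_t hash = 2166136261u;  // FNV offset basis
  for (std::size_t i = 0; i < size; ++i) {
    hash ^= static_cast<unsigned char>(data[i]);
    hash *= 16777619u;  // FNV prime
  }
  return hash;
}

// Hypothetical pairing of a blob with the checksum recorded at build time.
struct BlobWithChecksum {
  const char* data;
  std::size_t size;
  std::uint32_t expected_checksum;
};

// Returns true if the blob still matches its recorded checksum.
// A debug-only check would run this before each isolate is initialized.
inline bool VerifyBlob(const BlobWithChecksum& blob) {
  return Fnv1a(blob.data, blob.size) == blob.expected_checksum;
}

inline bool RunChecksumExample() {
  static const char kBlob[] = {'s', 'n', 'a', 'p'};
  BlobWithChecksum good{kBlob, sizeof(kBlob), Fnv1a(kBlob, sizeof(kBlob))};
  assert(VerifyBlob(good));

  // Simulate a mutated blob by checking different bytes against the
  // checksum recorded for the original.
  char tampered[] = {'s', 'n', 'a', 'q'};
  BlobWithChecksum bad{tampered, sizeof(tampered), good.expected_checksum};
  return VerifyBlob(good) && !VerifyBlob(bad);
}
```

Unlike relying on a crash from writing to const data, a failed check like this points directly at the mutation.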

Member

> V8 really can’t mutate the blob, since it needs to be re-usable across Isolates (both concurrently and sequentially).

I think you are talking about the default snapshot, not the one passed in via Isolate::CreateParams? V8 also has a mutex to guard its default snapshot read from disk:

base::MutexGuard lock_guard(external_startup_data_mutex.Pointer());
(not entirely sure if that's strictly necessary for V8 either, but I think that's why we also have a mutex in our own code)

Member

On second thought, why is the mutex used in an accessor (both in our code and in V8)? It probably should have been placed around Isolate::Initialize() instead to serve that purpose.

Member

Oh I see, the V8 mutex guards against refreshing the snapshot while it's being read for isolate initialization, which isn't a concern for us because we don't refresh our embedded snapshot blob in any way; the one we read from disk lives in a separate location. So we really don't need this mutex anyway (and if we really wanted to avoid a race caused by V8 mutating the blob, we should put a mutex around Isolate::Initialize() instead). But with the V8 CL, if V8 mutates the const-pointed snapshot in Isolate::Initialize(), it's their bug, not ours, so we should just do this anyway.
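
The point about lock placement can be sketched as follows. A lock taken inside an accessor is released as soon as the pointer is returned, so it cannot protect whatever the caller does with the data afterwards; a lock meant to serialize use has to span the use itself. All names here (`SnapshotDataSketch`, `GetEmbeddedSketch`, `InitializeFromSnapshot`) are hypothetical and only mimic the shape of the real code.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for SnapshotData.
struct SnapshotDataSketch {
  std::vector<int> payload;
};

// Immutable after static initialization, like the generated snapshot_data.
static const SnapshotDataSketch kEmbedded{{1, 2, 3}};

// No lock needed: nothing ever writes to kEmbedded, and a lock held only
// for the duration of this function would not cover the caller's reads.
inline const SnapshotDataSketch* GetEmbeddedSketch() {
  return &kEmbedded;
}

// If a consumer could mutate the data while using it, the lock would have
// to cover the whole period of use, as done here.
static std::mutex init_mutex;

inline int InitializeFromSnapshot(const SnapshotDataSketch* data) {
  std::lock_guard<std::mutex> lock(init_mutex);
  int sum = 0;
  for (int value : data->payload) sum += value;
  return sum;
}

inline int RunInitExample() {
  // Two sequential "isolate initializations" from the same shared blob.
  int first = InitializeFromSnapshot(GetEmbeddedSketch());
  int second = InitializeFromSnapshot(GetEmbeddedSketch());
  assert(first == second);
  return first + second;
}
```

With the data const and never refreshed, even the lock around the use is only a safeguard against a misbehaving consumer.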

Member Author

> Isn't mutating const-casted data undefined behavior, so it's not certain that it would crash (or it would crash reliably)?

From a language perspective, yeah, it's UB. I was honestly expecting that marking this as const would put it in the .rodata section of the binary; looking at the compiled result, it seems like it's a bit too complex for that.

In any case, the V8 CL is merged now; if you'd like me to cherry-pick the Isolate::CreateParams change and remove the const_cast, I'm happy to do that as well.

> DCHECK a verification of the snapshot checksum

Just to avoid ambiguity, this PR does not affect the const-ness of the actual snapshot data; that's already a const char* anyway (which also does actually end up in .rodata). If V8 does const_cast on that, then yeah, as you point out, it's on them to make sure they do it properly.


const std::vector<intptr_t>& SnapshotBuilder::CollectExternalReferences() {
static auto registry = std::make_unique<ExternalReferenceRegistry>();
return registry->external_references();