-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Safer CPP Guidelines
In the following guidelines, the definition of a trivial member function is a function that is inline and cannot cause
this
to get destroyed. It usually applies to simple getters / setters.
When calling a non-trivial member function on an object, hold a smart pointer to the object on the stack
Reasoning:
This makes sure the member function cannot make a use-after-free use of this
during its execution. Use a Ref
/ RefPtr
on the stack if the object is ref-counted, a CheckedRef
/ CheckedPtr
otherwise.
Similarly, we should be using RetainPtr
for Objective C objects, OSObjectPtr
for Darwin OS objects, GRefPtr
for various GLib types, and CachedResourceHandle
for CachedResource objects.
Note that it is important for the smart pointer to be a stack variable. Calling a function on a data member that has a smart pointer type is not truly safe, because this data member could get reassigned while the function is running.
Right:
if (RefPtr element = m_element) { // Even if m_element is already a RefPtr.
element->focus();
// Use protectedFoo() functions that return a Ref / RefPtr
// to avoid local variables.
element->protectedDocument()->didFocus();
if (CheckedPtr renderer = element->renderer())
renderer->didFocus();
// Use checkedFoo() functions that return a CheckedRef / CheckedPtr
// to avoid local variables.
element->checkedFocusContoller().didFocus();
}
if (RefPtr document = ownerDocument())
document->foo();
Wrong:
if (m_element) {
m_element->focus();
m_element->document()->didFocus();
if (auto* renderer = m_element->renderer())
renderer->didFocus();
m_element->focusContoller().didFocus();
}
// Calling protectedOwnerDocument() here is redundant, calling ownerDocument()
// is sufficient. protectedFoo() functions are only useful to avoid local
// smart pointer variables.
if (RefPtr document = protectedOwnerDocument())
document->foo();
Reasoning:
This makes sure that the object passed to the function cannot be used-after-free during the execution of the function. Use a Ref
/ RefPtr
on the stack if the object is ref-counted, a CheckedRef
/ CheckedPtr
otherwise.
Similarly, we should be using RetainPtr
for Objective C objects, OSObjectPtr
for Darwin OS objects, GRefPtr
for various GLib types, and CachedResourceHandle
for CachedResource
objects.
Right:
Ref element = task.element();
adjustDirectionality(element);
// Use protectedFoo() / checkedFoo() functions to avoid local variables.
registerWithDocument(element->protectedDocument());
Wrong:
auto& element = task.element();
adjustDirectionality(element);
registerWithDocument(element->document());
Reasoning:
This makes sure we don’t use-after-free data members, by enforcing that pointers cannot become stale. Use Ref
/ RefPtr
for ref-counted objects that you wish to keep alive. Use WeakRef
/ WeakPtr
for other pointers, or when you need to avoid reference cycles.
Right:
class Foo {
private:
Ref<Document> m_document;
RefPtr<Node> m_node;
WeakPtr<RenderObject> m_renderer;
WeakRef<RenderObject> m_rootRenderer;
WeakHashSet<Listener> m_listeners;
};
Wrong:
class Foo {
private:
Document& m_document;
Node* m_node;
RenderObject* m_renderer;
RenderObject& m_rootRenderer;
HashSet<Listener*> m_listeners;
};
Notes:
We do not recommend using CheckedRef
/ CheckedPtr
for data members. The reason for this is that crashes caused by CheckedRef
/ CheckedPtr
(to prevent use-after-free) are currently extremely hard to debug without a reproduction case if the CheckedRef
/ CheckedPtr
is not on the stack.
Reasoning:
This avoids leaks, double frees and complexity related to manual resource management.
A few examples of this are:
- Use
Ref
/RefPtr
instead of explicitref()
/deref()
calls. - Use
WTF::UniqueRef
/std::unique_ptr
to avoid explicitnew
/delete
calls. - Use
Locker
to avoid explicit calls toLock::lock()
/Lock:unlock()
.
In general, this applies to any 2 operations / function calls that need to be balanced in order to avoid a bug / leak. It is too easy for calls to get unbalanced, particularly due to early returns. RAII objects / handles avoids this class of bugs.
Right:
Locker locker { m_valueLock };
m_value = 0;
auto resource = makeUnique<Resource>();
RunLoop::main().dispatch([resource = WTFMove(resource)] {
// Use resource.
});
Wrong:
m_valueLock.lock();
m_value = 0;
m_valueLock.unlock();
auto* resource = new Resource;
RunLoop::main().dispatch([resource] {
// Use resource.
delete resource;
});
The GTK/WPE ports also have some types to make it easier to interact with GLib style C APIs.
Right:
GUniqueOutPtr<GError> error;
GUniquePtr<char> address(g_dbus_address_get_for_bus_sync(G_BUS_TYPE_SESSION, nullptr, &error.outPtr()));
if (error)
g_warning("Unable to get session D-Bus address: %s", error->message);
else
// Use address.
Wrong:
GError* error = nullptr;
char* address = g_dbus_address_get_for_bus_sync(G_BUS_TYPE_SESSION, nullptr, &error);
if (error) {
g_warning("Unable to get session D-Bus address: %s", error->message);
g_error_free(error);
} else {
// Use address.
g_free(address);
}
Reasoning:
downcast<>()
validates the type at runtime in release builds so that bad casts no longer result in type confusion security bugs.
Right:
Document* document() { return downcast<Document>(scriptExecutionContext()); }
Wrong:
Document* document() { return static_cast<Document*>(scriptExecutionContext()); }
Note:
JavaScriptCore uses jsCast<>()
instead of downcast<>()
for JSValue
.
Reasoning:
is<>()
+ downcast<>()
ends up checking the type twice, which may be inefficient. It is also more error prone than a single call to dynamicDowncast<>()
. dynamicDowncast<>()
often results in more concise code too.
Right:
return dynamicDowncast<Element>(node);
Wrong:
return is<Element>(node) ? &downcast<Element>(node) : nullptr;
Note:
JavaScriptCore uses jsDynamicCast<>()
instead of dynamicDowncast<>()
for JSValue
.
Reasoning:
Using weakly-typed identifiers such as uint64_t
leads to confusion between identifier types and bugs.
Right:
enum class NodeIdentifierType { };
using NodeIdentifier = ObjectIdentifier<NodeIdentifierType>;
class Node : public Identified<NodeIdentifier> { };
// Node::identifier() returns a NodeIdentifier type.
enum class TaskIdentifierType { };
using TaskIdentifier = AtomicObjectIdentifier<TaskIdentifierType>;
TaskIdentifier generateTask()
{
auto taskIdentifier = TaskIdentifier::generate();
Locker locker { m_tasksLock };
m_tasks.add(taskIdentifier, Task::create());
return taskIdentifier;
}
Wrong:
uint64_t m_identifier;
Reasoning:
Avoids type confusing, easier to pass over WebKit IPC.
Right:
enum class Border : uint8_t { Left, Right, Top, Bottom };
Wrong:
enum Border { BorderLeft, BorderRight, BorderTop, BorderBottom };
Reasoning:
Unions are prone to type confusion bugs. std::variant
fixes that.
Right:
std::variant<int, double> m_value;
Wrong:
union {
int m_intValue;
double m_doubleValue;
};
bool m_isDouble;
Notes:
A std::variant<>
may be larger than an equivalent union and may thus be unsuited for classes where the size matters for performance.
Reasoning:
Using ASCIILiteral
makes it clear that the string we’re dealing with is ASCII-only and that the string is immortal.
This allows code to leverage these facts and be more efficient. It also documents in code what the lifetime of the string is.
Finally, ASCIILiteral
goes bounds checking on indexed access, thus avoiding out-of-bounds access bugs.
Right:
static constexpr auto foo = "foo"_s; // `_s` suffix gives you an ASCIILiteral.
static ASCIILiteral toString(EnumType value)
{
switch (value) {
case EnumType::A:
return "A"_s;
case EnumType::B:
return "B"_s;
default:
return "Others"_s;
}
}
Wrong:
static const char* foo = "foo";
static const char* toString(EnumType value)
{
switch (value) {
case EnumType::A:
return "A";
case EnumType::B:
return "B";
default:
return "Others";
}
}
Reasoning:
Using a variable after it’s been "moved from" is bad practice in C++ and may result in unexpected behavior depending on how the move constructor is implemented. std::optional move constructor, for example, doesn’t reset the "moved-from" value to std::nullopt
.
Right:
if (RefPtr data = std::exchange(m_data, nullptr))
data->detach();
Wrong:
if (RefPtr data = WTFMove(m_data))
data->detach();
Reasoning:
A public create()
factory function should be exposed, to create instances of such classes to make sure that we adopt the object after construction and store it in a Ref
/ RefPtr
. Mixing explicit memory management and ref-counting can result in use-after-free bugs.
Right:
class Foo : public RefCounted<Foo> {
public:
static Ref<Foo> create();
private:
Foo();
}
Wrong:
class Foo : public RefCounted<Foo> {
public:
Foo();
}
Subclasses of RefCounted
/ ThreadSafeRefCounted
should have a virtual destructor if they’re not final
Reasoning:
RefCounted<T>
/ ThreadSafeRefCounted<T>
call delete static_cast<T*>(this)
. If the subclass T
is not final (has subclasses), then those subclasses’s destructors will not run unless T
’s destructor is marked as virtual
.
Right:
class Foo : public RefCounted<Foo> {
public:
virtual ~Foo(); // Needs to be virtual since Foo is not final.
};
class Bar : public Foo { };
class FinalBaz final : public RefCounted<FinalBaz> {
public:
~FinalBaz(); // Doesn't need to be virtual since the class is final.
}
Wrong:
class Foo : public RefCounted<Foo> {
public:
~Foo();
};
class Bar : public Foo { };
Reasoning:
std::span
does bounds checking, less risk of size and buffer getting out of sync. std::span also provides API such as subspan()
, first()
and last()
which can be used to avoid doing unsafe pointer arithmetics.
Right:
void didReceiveData(std::span<const uint8_t>);
Wrong:
void didReceiveData(const uint8_t* data, size_t size);
Reasoning:
std::array
does bounds checking, whereas C-style arrays do not.
Right:
std::array values { 1, 2, 3 };
Wrong:
int values[] = { 1, 2, 3 };
Reasoning:
Bounds checking in release builds plays a critical part of achieving memory safety.
Right:
class StringView {
public:
UChar operator[](size_t index) const
{
RELEASE_ASSERT(index < length());
if (is8Bit())
return characters8()[index];
return characters16()[index];
}
private:
const LChar* characters8() const;
const UChar* characters16() const;
}
Wrong:
class StringView {
public:
UChar operator[](size_t index) const
{
if (is8Bit())
return characters8()[index];
return characters16()[index];
}
private:
const LChar* characters8() const;
const UChar* characters16() const;
}
Note:
The view / container type may wrap another container / view which already does bounds checking, in which case you do not need to duplicate the check:
class NodeList {
public:
Node* operator[](size_t index) const
{
// No need for a RELEASE_ASSERT() here.
return m_nodes[index];
}
private:
Vector<Node> m_nodes; // Vector::operator[] already validates bounds.
}
Reasoning:
Poor data encapsulation is a frequent source of bugs.
Right:
class Parent {
protected:
const String& value() const { return m_value; }
void setValue(String&& value) { m_value = WTFMove(value); }
private:
String m_value;
};
class Child : public Parent {
// Child uses value() / setValue().
};
Wrong:
class Parent {
protected:
String m_value;
};
class Child : public Parent {
// Child uses m_value.
};
Reasoning:
Many WebKit types require copying to be safely passed to another thread. In the past, we’ve sometimes determined that it was sufficient to WTFMove()
some types in some cases. There is no longer any performance win associated with this so we should now do the safe thing and call crossThreadCopy()
unconditionally.
Right:
void MyClass::didReceiveData(String&& data)
{
m_workQueue->dispatch([data = crossThreadCopy(WTFMove(data))] {
// Do something.
});
}
Wrong:
void MyClass::didReceiveData(String&& data)
{
m_workQueue->dispatch([data = WTFMove(data)] {
// Do something.
});
}
Notes:
Many WebKit types (such as String
) have isolatedCopy()
overloads which may result in better performance when calling crossThreadCopy()
on a r-value reference.
Reasoning:
Multi-threading is a common source of (security) bugs and these macros can help catch a lot of the bugs at build time. They also help document the code.
Right:
class Foo {
protected:
void initializeValuesWhileLocked() WTF_REQUIRES_LOCK(m_valuesLock);
private:
Lock m_valuesLock;
std::array<int, 10> m_values WTF_GUARDED_BY_LOCK(m_valuesLock);
};
Wrong:
class Foo {
protected:
void initializeValuesWhileLocked();
private:
Lock m_valuesLock;
std::array<int, 10> m_values;
};
Reasoning:
A child process can become compromised and we should therefore not trust any data it sends other processes via IPC. Failure to properly validate received data can lead to sandbox escapes or denial of service attacks (by crashing the recipient process). In general, IPC from WebContent processes have the highest risk. However, other child processes can in theory get compromised too and send bad IPC.
It is never acceptable for bad IPC from a compromised process to cause the recipient to crash, even in a non-exploitable way (such as a RELEASE_ASSERT()
). Also, instead of ignoring bad IPC, it is better practice to kill the compromised process so it cannot keep trying to find an exploit.
Validation can happen at two levels:
- Ideally during IPC deserialization, in generated code (using
[Validator=X]
in*.serialization.in
files) - Otherwise, after deserialization using a
MESSAGE_CHECK()
Both will cause the termination of the sender.
Right:
class WebCore::ShareableBitmapConfiguration {
[Validator='m_size->width() >= 0 && m_size->height() >= 0'] WebCore::IntSize m_size;
};
Wrong:
class WebCore::ShareableBitmapConfiguration {
WebCore::IntSize m_size; // Then assume m_size contains positive values.
};
Right:
void Foo::setCookiesFromDOM(const URL& firstParty, const String& cookies)
{
MESSAGE_CHECK(allowsFirstPartyForCookies(m_processIdentifier, firstParty));
// Set cookies...
}
Wrong:
void Foo::setCookiesFromDOM(const URL& firstParty, const String& cookies)
{
// Assume this WebContent process is allowed to set cookies for this
// |firstParty| and set cookies...
}
Reasoning:
All IPC endpoints are potential attack vectors. As a result, if certain features are disabled (which is often the case for experimental ones), its related IPC endpoints (or messages) should also be disabled to reduce the surface of attack.
IPC endpoints can be conditionally enabled using [EnabledIf='isFooEnabled()']
in the *.messages.in
file.
Right:
messages -> GPUConnectionToWebProcess {
[EnabledIf='isWebGPUEnabled()'] void CreateRemoteGPU(WebKit::WebGPUIdentifier identifier)
[EnabledIf='isWebGPUEnabled()'] void ReleaseRemoteGPU(WebKit::WebGPUIdentifier identifier)
}
Wrong:
messages -> GPUConnectionToWebProcess {
// These messages are only sent when the WebGPU feature is turned on.
void CreateRemoteGPU(WebKit::WebGPUIdentifier identifier)
void ReleaseRemoteGPU(WebKit::WebGPUIdentifier identifier)
}
Reasoning:
Manually writing encode()
or decode()
for a class / struct is no longer acceptable in new code. Most of our IPC serializers / deserializers get generated from an IDL format in *.serialization.in
files (e.g. WebCoreArgumentCoders.serialization.in
), and any remaining classes with manual encode()
/ decode()
methods will be transitioned soon.
Generating serialization methods reduces the risk of bugs, automatically includes validation logic when encoding and decoding, and supports automatic testing of serialized types.
Right:
// In WebCoreArgumentCoders.in:
struct WebCore::FloatPoint {
float m_x;
float m_y;
}
Wrong:
// In FloatPoint.h:
namespace WebCore {
struct FloatPoint {
float m_x;
float m_y;
template<typename Encoder>
void encode(Encoder& encoder) const
{
encoder << m_x << m_y;
}
template<typename Decoder>
static std::optional<FloatPoint> decode(Decoder& decoder)
{
std::optional<float> x;
decoder >> x;
if (!x)
return std::nullopt;
std::optional<float> y;
decoder >> y;
if (!y)
return std::nullopt;
return FloatPoint { *x, *y };
}
}
} // namespace WebCore