Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not give StringVectors to users #54452

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jakobnissen
Copy link
Contributor

A StringVector v is a Vector{UInt8} which wraps memory allocated as a string. This allows String(v) to be zero-copy which is good for performance. However, if users are handed a StringVector directly, they can use that to unknowingly mutate strings, which is illegal.
For example, suppose a user calls a function f which returns a StringVector v. They can then do the following to mutate a string s:

v = fill!(Base.StringVector(3), 0x00);
v2 = reshape(v, 3, 1)
s = String(v) # all zeros
v2[1] = 0x01
s

The solution in this PR is to not ever let StringVectors escape out to the user, by instead returning normal Vector{UInt8} from the functions that would otherwise do so.

For reviewers

Potential solution to #54434
Alternative to #54424

Still needs to be rebased on top of #54372 before merging, as these two PRs overlap quite a bit

A `StringVector` v is a `Vector{UInt8}` which wraps memory allocated as a string.
This allows `String(v)` to be zero-copy which is good for performance.
However, if users are handed a `StringVector` directly, they can use that to
unknowingly mutate strings, which is illegal.
For example, suppose a user calls a function `f` which returns a `StringVector`
`v`. They can then do the following to mutate a string `s`:

```julia
v = fill!(Base.StringVector(3), 0x00);
v2 = reshape(v, 3, 1)
s = String(v) # all zeros
v2[1] = 0x01
s
```

The solution in this PR is to not ever let `StringVector`s escape out to the
user, by instead returning normal `Vector{UInt8}` from the functions that would
otherwise do so.
@KristofferC KristofferC requested a review from vtjnash May 13, 2024 13:28
@KristofferC
Copy link
Sponsor Member

This seems a bit strange since StringVectors seem to be particularly made to be used in IOBuffer:

julia/base/iobuffer.jl

Lines 43 to 45 in bbae417

# allocate Vector{UInt8}s for IOBuffer storage that can efficiently become Strings
StringMemory(n::Integer) = unsafe_wrap(Memory{UInt8}, _string_n(n))
StringVector(n::Integer) = view(StringMemory(n), 1:n)::Vector{UInt8}

@oscardssmith oscardssmith added the domain:strings "Strings!" label May 13, 2024
Copy link
Sponsor Member

@KristofferC KristofferC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say same thing as applied 7 years ago applies now. #24388 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:strings "Strings!"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants