-
Notifications
You must be signed in to change notification settings - Fork 15.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UBSAN "index out of bounds" for generated C++ code #2073
Comments
In protobuf we defined the storage of a repeated field as: struct Rep {
Arena* arena;
Element elements[1];
} but actually used Rep* to point to a larger space and may access rep->elements[4]. I guess this triggers the UBSAN check. No idea how to make UBSAN happy with that but the protobuf code is working as intended. |
It seems UBSAN is being overzealous here. As far as I know, there is no section in the standard that explicitly disallows writing out of bounds if there is sufficient storage. |
@seishun I guess that's triggered by |
Hello there. I have the same issue with an R wrapper around the sentencepiece C++ library. https://github.com/google/sentencepiece
@xfxyjwf / @seishun Is there going to be a fix for these address sanitizer issues? |
@TeBoring |
@jwijffels I think you are right that this is undefined behavior. It is called a "flexible array member" and is technically valid in C but not C++. However, I found from researching this that it appears to be a pretty common pattern and is supported by GCC, Clang, and MSVC. I'm not sure that we can justify spending the time to fix this just to be technically compliant with the standard, even though all major compilers already behave the expected way. The feature may end up in a future C++ standard anyway. Would the CRAN owners perhaps be receptive to making an exception if they knew more of the context around this? |
@acozzette the cran maintainers don't accept that. They specifically enforce that all code follows the standards, because it should also run on scientific supercomputers with very unusual architectures and compilers, such as intel icc, solaris studio, AIX, and so on. Therefore undefined behavior is not allowed, even if it still works in gcc/clang. So if you believe this is valid behavior, it should be fixed in gcc-ubsan. If you believe fixing this would negatively affect performance, you could condition it in a macro, such that we can compile with |
@acozzette I already explained cran about the expanding arrays but it was a clear 'no can't do any more'. |
We now have a fix for this but it is just waiting to be synced from our internal repo to GitHub. |
🥳 |
@acozzette can you point us to the commit, and update this issue if it has been released? |
@acozzette has this landed yet? |
Thanks. Will this only be part of the 4.0 release, or also get fixed in 3.x ? |
We were originally planning to release 4.0 but decided instead to just go with 3.13 for now (which we released at the end of last week). I'm not sure if our next release will be 3.14 or 4.0, but either way it will include this fix. |
UBSAN shows index out of bounds errors for generated C++ code when setting the repeated fields. Is this a bug or am I using it incorrectly? An example
test.proto
:Compile to C++ using
protoc test.proto --cpp_out=.
Thenmain.cc
looks like this:Compile everything with sanitizer flags:
g++ -fsanitize=address,undefined,bounds-strict main.cc test.pb.cc \ $(pkg-config --cflags --libs protobuf)
And then when I run it:
I am using stock gcc and protobuf from Debian Testing:
The text was updated successfully, but these errors were encountered: