-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
llext: Proper support for Thumb32 relocations #72832
Comments
Taking the Linux kernel's linker as a reference point - relevant parts being:
The main difference with Zephyr, as far as I can tell - and ignoring unrelated details - seems to be that Linux completely ignores the Lines 840 to 852 in 8c8e2f6
@edersondisouza could you try if the following patch fixes the issue on your side? diff --git a/subsys/llext/llext.c b/subsys/llext/llext.c
index c92f49abd60..16748d941b4 100644
--- a/subsys/llext/llext.c
+++ b/subsys/llext/llext.c
@@ -835,18 +835,15 @@ static int llext_link(struct llext_loader *ldr, struct llext *ext, bool do_local
} else {
LOG_INF("found symbol %s at 0x%lx", name, link_addr);
}
- } else if (ELF_ST_TYPE(sym.st_info) == STT_SECTION ||
- ELF_ST_TYPE(sym.st_info) == STT_FUNC ||
- ELF_ST_TYPE(sym.st_info) == STT_OBJECT) {
+ } else if (IN_RANGE(sym.st_shndx, 0xFF00, 0xFFFF)) {
+ //0xFF00 => SHN_LORESERVE, 0xFFFF => SHN_HIRESERVE
+ LOG_DBG("section index in RESERVED range - not relocating");
+ continue;
+ } else {
/* Link address is relative to the start of the section */
link_addr = (uintptr_t)ext->mem[ldr->sect_map[sym.st_shndx]]
+ sym.st_value;
LOG_INF("found section symbol %s addr 0x%lx", name, link_addr);
- } else {
- /* Nothing to relocate here */
- LOG_DBG("not relocated");
- continue;
}
LOG_INF("writing relocation symbol %s type %zd sym %zd at addr 0x%lx " Edit: re-reading the comment thread in EDK PR, I have seen the issue appeared on
|
@mathieuchopstm I've tested it here with the sample and can confirm that your patch does work!!! Will you submit a PR, please? |
In the current implementation, the LLEXT linker will only apply relocations for a symbol if it has a specfic symbol type. This is overzealous and causes issues on some platforms as legitimate relocations are not performed because they are marked with a "bad" type. Ignore the symbol type when performing relocation to solve this problem, but exclude symbols in special (reserved) sections from being relocated. This commit also updates an outdated comment. Fixes zephyrproject-rtos#72832. Signed-off-by: Mathieu Choplain <mathieu.choplain@st.com>
@edersondisouza PR with fix submitted |
In the current implementation, the LLEXT linker will only apply relocations targeting a given symbol if it has a specfic symbol type. This is overzealous and causes issues on some platforms, as some symbols that need to be relocated are skipped due to being of a "bad" type. Ignore the symbol type when performing relocation to solve this problem, but also add checks to ensure we don't attempt to relocate symbols with an invalid section index. If such a relocation is found, error out instead of ignoring the relocation entry to ensure that it is impossible to execute code from a (partially) unrelocated LLEXT. Also remove all hacks added to circumvent this issue: * qemu_cortex_r5 exclusion from test cases * unnecessary exclusion of some flags when building with LLEXT EDK Fixes zephyrproject-rtos#72832. Signed-off-by: Mathieu Choplain <mathieu.choplain@st.com>
In the current implementation, the LLEXT linker will only apply relocations targeting a given symbol if it has a specfic symbol type. This is overzealous and causes issues on some platforms, as some symbols that need to be relocated are skipped due to being of a "bad" type. Ignore the symbol type when performing relocation to solve this problem, but also add checks to ensure we don't attempt to relocate symbols with an invalid section index. If such a relocation is found, return an error instead of ignoring the relocation entry to ensure that it is impossible to execute code from a (partially) unrelocated LLEXT. Also remove all hacks added to circumvent this issue: * qemu_cortex_r5 exclusion from test cases * unnecessary exclusion of some flags when building with LLEXT EDK Fixes zephyrproject-rtos#72832. Signed-off-by: Mathieu Choplain <mathieu.choplain@st.com>
Is your enhancement proposal related to a problem? Please describe.
LLEXT doesn't seem to support certain Thumb32 relocation instructions on ARM targets. One can check that by building the
samples/subsys/llext/edk
from #69831 (follow instructions at the README.rst there). If one removes the first non-comment line from samples/subsys/llext/edk/app/CMakeLists.txt, and build the EDK, extensions and app, everything will be fine until one runs the application.Without that line, there'll be a fault when the app tries to run the first extension:
Looking at the generated assembly for the offending instruction, one sees:
Last line is the offending instruction, it tries to write to memory pointed to by
r3
, butr3
is0
. Looking at the instructions above (at502
and506
),r3
is set to be the address ofmy_sem
object, which should be obtained viaR_ARM_THM_MOVW_ABS_NC
andR_ARM_THM_MOVT_ABS
relocation instructions. The fact thatr3
is zero indicates that the relocation is somehow not working.Looking at the debug log from LLEXT, one sees several interesting messages related to
my_sem
symbol:The
not relocated
message is the finalelse
branch for theif
starting at https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/llext/llext.c#L825. It seems that LLEXT doesn't consider to be a needed relocation there and moves on. Interestingly enough, the arch relocation handler seems to be aware of Thumb32 relocation instructions, but that code is never called due to failing theif
above.Describe the solution you'd like
Support for the Thumb32 relocation instructions. It seems there's support on the arch to honour those relocation instructions, but they seem to not be recognised as such.
Describe alternatives you've considered
One workaround is to somehow build the extension without using those instructions, as it is currently done at #69831. However, that is done by an ugly hack that seems totally unrelated to the problem (dropping compiler flags
-mcpu=cortex-r5
and-mfloat-abi=hard
. This seems (judging from theobjdump
description) to change the architecture toarmv4t
(as opposed toarmv7
) and avoid those instructions. By doing that, the sample at #69831 just works, but it's far from ideal.Additional context
While my analysis here could be way off (like, the relocation not being the root cause of the issue), there still the issue of the sample failing with the seemingly unrelated compiler flags.
The text was updated successfully, but these errors were encountered: