Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debug why Rsdp::search_for_on_bios is optimized to ud2 #425

Open
phil-opp opened this issue Feb 16, 2024 · 8 comments
Open

Debug why Rsdp::search_for_on_bios is optimized to ud2 #425

phil-opp opened this issue Feb 16, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@phil-opp
Copy link
Member

See #420

We worked around the issue for now by preventing inlining of the map_physical_region function, but we should figure out the root cause of this.

@phil-opp phil-opp added the bug Something isn't working label Feb 16, 2024
@Freax13
Copy link
Contributor

Freax13 commented Feb 16, 2024

This only occurs when using rsdp 2.0.0 and seems to be fixed by rsdp 2.0.1. I've tracked the fix down to this commit: rust-osdev/acpi@a0db984, but I'm not entirely sure why this fixes the bug.

At first glance, I thought this might be because the added iterator complexity slows down the inlining a bit: search_for_on_bios doesn't get fully inlined and its implementation is split into two parts, one setting up the addresses and one doing the work; perhaps the lack of inlining was preventing LLVM from seeing UB. To work around that, I removed the .step(16) call, which resulted in all functions being fully inlined. This didn't break the code though, it wasn't optimized down to ud2.

AFAICT rust-osdev/acpi@a0db984 doesn't fix anything that LLVM should care about. It prevents some out-of-bounds accesses, but for memory that's not managed by LLVM, so LLVM shouldn't be able to optimize out the accesses. Even if LLVM felt that optimizing out the later out-of-bounds accesses was legal, the rsdp could still be in the first address slot and LLVM should never allow that to be optimized out.

In conclusion, even though rust-osdev/acpi@a0db984 fixes this bug, I'm not yet convinced that this didn't just happen by accident.

@phil-opp
Copy link
Member Author

phil-opp commented Mar 7, 2024

We should retry to reproduce this with the latest nightlies. There was another LLVM update in rust-lang/rust#121395, which fixed some miscompilations.

@Freax13
Copy link
Contributor

Freax13 commented Mar 7, 2024

I can still reproduce this on 1.78.0-nightly (tested with 090f030).

@toothbrush7777777
Copy link
Contributor

Shouldn't this raw pointer dereference in rdsp::search_for_on_bios be a volatile read, so the compiler doesn't optimise it out?

https://github.com/rust-osdev/acpi/blob/35f9fba0aed2b00cfd16d0560a991c705666f704/rsdp/src/lib.rs#L106

@Freax13
Copy link
Contributor

Freax13 commented Apr 1, 2024

Can you elaborate on why the compiler would be allowed to optimize out the non-volatile read?

@toothbrush7777777
Copy link
Contributor

toothbrush7777777 commented Apr 1, 2024

Can you elaborate on why the compiler would be allowed to optimize out the non-volatile read?

Yes, I'm not sure that constructing a slice using slice::from_raw_parts and then reading from a raw pointer constructed from the slice's address with a non-volatile read without side-effects guarantees that the optimiser will keep that read.

Optimising compilers, and particularly LLVM, try to reduce the number of unnecessary memory accesses to improve performance. They will often remove memory accesses that they consider to have no side-effects. Using read_volatile() guarantees that the optimiser will not remove the memory access, even if it considers that there is no side-effect to reading that memory.

@Freax13
Copy link
Contributor

Freax13 commented Apr 1, 2024

I'd argue that side effects from the reads are irrelevant here. What matters are the values that are read. Unless LLVM can somehow statically determine the value at the addresses without doing the reads, I don't see why it would be allowed to optimize out the reads. AFAICT the memory accesses are necessary here because the values are not known until runtime.

@toothbrush7777777
Copy link
Contributor

True, but LLVM 18 appears to be (mis)optimising this function away, including the memory accesses. I was wondering if that would fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants