Suspend/Resume on Kaby Lake : \o/ #596

Open
elthariel opened this issue Oct 25, 2020 · 11 comments

@elthariel

TL;DR: I propose using this issue to track available information and progress on suspend/resume on the Kaby Lake platform.

Hi,

I recently acquired an HP Chromebook 15 (aka Syndra), and when resuming from suspend under Linux I get the 'Corrupt OS' message and have to reboot into CrOS to fix the issue.

I'm trying to investigate, but I wasn't able to find any open issue tracking this or any reference to someone working on it. If you read this and have even a tiny lead on where to start looking, please share! (wink @MrChromebox)

@elthariel
Author

If you happen to run into the 'Chrome OS missing or corrupted' screen, please press TAB and share the recovery_reason and active firmware id :)
You can then reboot, go into Chrome OS, start a shell and run sudo crossystem dev_boot_legacy=1 to be able to boot into GNU/Linux again.

I've tried it with the default galliumos kernel (4.16.18-galliumos) and the stock ubuntu kernel (5.4.0-52-lowlatency) and I get:
recovery_reason : 0x2b / 0x2b Secure NVRAM (TPM) initialization error
active firmware id: Google_Nami.10775.101.0

@elthariel
Author

AFAICT, the failure path in the Chromium OS code is the following:

  • coreboot/src/security/vboot/vboot_logic.c -> verstage_main() (called on boot and when resuming from S3)
  • platform/vboot_reference/firmware/2lib/2api.c -> vb2api_fw_phase1()
  • platform/vboot_reference/firmware/2lib/2secdata_firmware.c -> vb2_secdata_firmware_init()
  • same file -> vb2api_secdata_firmware_check()

Here, either the CRC check fails or the version isn't right, and that triggers recovery mode.
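
The two checks in that last function can be sketched roughly as follows. This is only an illustration in Python, not the vboot code: the 10-byte layout, the minimum struct version of 2 and the CRC-8 polynomial 0x07 are assumptions based on my reading of vboot_reference, and the function names are made up. Check them against the tree that matches your firmware.

```python
import struct

# Assumed layout of the firmware secdata space (10 bytes, little-endian):
# struct_version (u8), flags (u8), fw_versions (u32), reserved (3 bytes), crc8 (u8).
SECDATA_FORMAT = "<BBI3sB"
MIN_STRUCT_VERSION = 2  # assumption, not taken from this thread


def crc8(data: bytes) -> int:
    """CRC-8 with polynomial x^8 + x^2 + x + 1 (0x07) and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_secdata(struct_version: int, flags: int, fw_versions: int) -> bytes:
    """Build a blob whose last byte is the CRC-8 of the preceding nine bytes."""
    body = struct.pack("<BBI3s", struct_version, flags, fw_versions, b"\x00" * 3)
    return body + bytes([crc8(body)])


def check_secdata(blob: bytes) -> str:
    """Mimic the order of the checks: size, then CRC, then struct version."""
    if len(blob) != struct.calcsize(SECDATA_FORMAT):
        return "bad size"
    if crc8(blob[:-1]) != blob[-1]:
        return "crc error"
    if blob[0] < MIN_STRUCT_VERSION:
        return "version error"
    return "ok"
```

Either non-"ok" result corresponds to a branch that would send the firmware to recovery in the path listed above.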

@elthariel
Author

I was able to suspend/resume using a kernel built from the chromium os tree using a mix of their nami board kernel version/config and the stock gallium kernel config.

The kernel tree is here: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/chromeos-4.4

I've been suspending using echo mem | sudo tee /sys/power/state, which is what is used by the powerd_suspend script.
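
Note that writing mem to /sys/power/state does not always mean S3. On kernels that expose /sys/power/mem_sleep, the bracketed entry in that file selects what mem maps to (s2idle, shallow or deep). A small sketch for checking this before suspending, assuming the standard sysfs format; the helper name is made up for illustration:

```python
from pathlib import Path
from typing import Optional


def active_mem_sleep(text: str) -> Optional[str]:
    """Return the bracketed entry from /sys/power/mem_sleep, e.g. 'deep'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    path = Path("/sys/power/mem_sleep")
    if path.exists():
        print("mem currently maps to:", active_mem_sleep(path.read_text()))
    else:
        # Older kernels do not have this file at all.
        print("this kernel does not expose /sys/power/mem_sleep")
```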

ATM, closing the lid triggers a different kind of suspend which can never be resumed: the keyboard backlight comes back but the screen stays black (but I don't get the corrupt OS error).

A semi-educated guess, based on reading the kernel source while it was building, is that the Chromium OS kernel, when compiled with the right options, uses the firmware to suspend instead of ACPI signals/messages/interrupts/whatever.

@elthariel
Author

elthariel commented Oct 25, 2020

The lid issue can be fixed (mostly?) by creating the file /etc/systemd/sleep.conf with the following content:

[Sleep]
SuspendState=mem
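
systemd reads SuspendState= from the [Sleep] section of sleep.conf, and an assignment outside any section is ignored. A quick way to sanity-check a file before relying on it, as a rough sketch: the function name is hypothetical, and Python's configparser only approximates systemd's own parser.

```python
import configparser
from typing import Optional


def suspend_state(conf_text: str) -> Optional[str]:
    """Return SuspendState= from the [Sleep] section, or None if it is not there."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written; systemd keys are case-sensitive
    try:
        parser.read_string(conf_text)
    except configparser.MissingSectionHeaderError:
        return None  # a bare "SuspendState=mem" with no section header
    return parser.get("Sleep", "SuspendState", fallback=None)
```

Calling it with the contents of /etc/systemd/sleep.conf should return "mem" for the configuration described here.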

@MrChromebox

this is a known issue with CR50 devices running stock firmware.

On resume from S3/suspend, Google's verified boot code is looking to the TPM to confirm the previous boot was successful, which requires the OS to set a flag in the TPM. If the flag isn't set, vboot assumes the previous boot failed, and bails to recovery mode and clears the crossystem flags.

Up until recently (kernel 5.6?), there has been no driver for the CR50 TPM in the mainline kernel, and even now it's not selected by default IIRC. To mitigate this, one needs either a recent kernel with the CR50 TPM driver enabled, or UEFI firmware, which doesn't implement Google's verified boot idiocy for non-ChromeOS booting.
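
To see whether a given kernel was built with a CR50 driver, one can scan its config file. A rough sketch: the two option names are what I understand the mainline symbols to be (SPI and I2C variants) and should be treated as assumptions to verify against your kernel tree; the function name is made up.

```python
from typing import Dict

# Option names as I understand them in mainline; treat them as assumptions.
CR50_OPTIONS = ("CONFIG_TCG_TIS_SPI_CR50", "CONFIG_TCG_TIS_I2C_CR50")


def cr50_options(config_text: str) -> Dict[str, str]:
    """Map each CR50 option to 'y', 'm' or 'n' based on the text of a kernel .config."""
    found = {name: "n" for name in CR50_OPTIONS}
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines and blanks count as disabled.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key in found and value in ("y", "m"):
            found[key] = value
    return found
```

Feed it the text of the running kernel's config, for example the /boot/config-<version> file that Ubuntu-based distributions install.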

@elthariel
Author

@MrChromebox Thanks for answering so quickly 💌 .

Does your answer imply that with your coreboot build there is no such issue?

Also, I don't understand how this TPM previous-boot confirmation fits with the CRC check failure / version mismatch I've seen in the code. See the comment here

I'll give the latest mainline kernel a try, but in the meantime, my current solution of using the chromium os 4.4 kernel tree looks very promising (it doesn't work using the lid trigger, maybe the fw is interfering here or it's just configuration, and wifi isn't working when I resume)

@MrChromebox

@elthariel not implying, definitively stating. CR50 devices running my UEFI firmware do not have this issue.

the TPM boot status is part of the vb2 security data, IIRC. I've not looked at this recently, but had discussed the issue with Google engineers back when it first surfaced. Either the legacy-booted OS needs to set the boot state in the TPM, or the firmware needs to not check it when resuming from suspend on the legacy boot path. The latter was originally how Google was going to handle dual booting Windows but then that got scrapped.

@elthariel
Author

@MrChromebox Very nice to know. I'll update my firmware to your build once the famous SuzyQ cable becomes available.

In the meantime I'll check for the cr50 support in the mainline kernel, or keep using the chrome platform kernel where the cr50 tpm is supported.

That being said, I'm still a tiny bit skeptical about the explanation given by the Google engineer. I'll try to dig a bit deeper to see if I can get a clearer picture, assuming the code in platform/vboot_reference of their repo is the one actually used on my machine :)

@elthariel
Author

elthariel commented Nov 5, 2020

So, suspend/resume (and wifi) is working nicely on my Syndra board, I'll update this issue with the details:

I've been using the chromium os kernel tree:

The kernel config can be found here: https://gist.github.com/elthariel/d9f8dd2528cf36627c63555c4b7a3275#file-config-5-4-73-lta-6

To make this kernel work properly, you might need to fetch a few firmware files from the Chrome OS partition or from the upstream repos.

@elthariel
Author

Resume sometimes doesn't work correctly when the suspend was triggered by closing the lid. When you open it again, the screen is sometimes considered disconnected. This small systemd-sleep hack fixes it:

https://gist.github.com/elthariel/d9f8dd2528cf36627c63555c4b7a3275#file-00_wake_up_screen-sh

@elthariel changed the title from "Suspend/Resume on Kaby Lake" to "Suspend/Resume on Kaby Lake : \o/" on Nov 5, 2020

@me11203sci

me11203sci commented Jul 7, 2021

@elthariel Did you end up installing MrChromebox's firmware, or were you able to find a workaround with the other kernel? I have a Kaby Lake machine (Lenovo C340-15) with similar hang-ups to the ones you described and was wondering if you could give more detail as to what you did to fix it on your machine. I am pretty new to computing in general and would like to better understand how exactly the kernel fix works.
