Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v5.1.4][S3][Flash Encryption] Toggling VCC / GPIO-0 during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) #13783

Open
3 tasks done
chipweinberger opened this issue May 11, 2024 · 21 comments
Labels
Status: Opened Issue is new Type: Bug bugs in IDF

Comments

@chipweinberger
Copy link
Contributor

chipweinberger commented May 11, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.1.4

Espressif SoC revision.

ESP32-s3

Operating System used.

Windows

How did you build your project?

Command line with Make

If you are using Windows, please specify command line type.

None

Development Kit.

custom pcb

Power Supply used.

USB

What is the expected behavior?

Device can survive power loss without bricking itself

What is the actual behavior?

the bootloader becomes corrupted, (invalid header: 0xce5209fc)

edit: it appears this bug caused SPI_BOOT_CRYPT_CNT to somehow get reset back to 0, causing flash encryption to become disabled. see my comment here: #13783 (comment)

Steps to reproduce.

Note: I am using flash encryption, this seems to be required to repro
Note: I have a custom bootloader that waits 1 second before booting into app (probably unrelated)

  1. I unplug and replug the device power many times (~6 times)
  2. eventually the device does not boot anymore (invalid header: 0xce5209fc)
  3. edit: I also think I need to toggle GPIO 0 in some way

Debug Logs.

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x7 (TG0WDT_SYS_RST),boot:0x2b (SPI_FAST_FLASH_BOOT)
Saved PC:0x40048839
0x40048839: uart_tx_one_char_uart in ROM

invalid header: 0xce5209fc
invalid header: 0xce5209fc
invalid header: 0xce5209fc
invalid header: 0xce5209fc
invalid header: 0xce5209fc
invalid header: 0xce5209fc

More Information.

No response

@chipweinberger chipweinberger added the Type: Bug bugs in IDF label May 11, 2024
@chipweinberger
Copy link
Contributor Author

This is a super serious issue!

I can't ship a product that bricks itself!

@chipweinberger
Copy link
Contributor Author

sdkconfig.txt

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 11, 2024

Screenshot 2024-05-11 at 2 51 42 PM

power on reset circuit.

@espressif-bot espressif-bot added the Status: Opened Issue is new label May 11, 2024
@github-actions github-actions bot changed the title [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! (IDFGH-12807) May 11, 2024
@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 11, 2024

note: I can still recover the device using GPIO 0 to put it into DOWNLOAD mode, and reflashing firmware.

edit: nevermind. I spoke too soon. I can reflash it using DOWNLOAD mode, but the device fails to boot with the same INVALID HEADER message. This device was previously working fine! Until I unplugged & replugged it!

Flashing using USB Serial / JTAG:

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                                Flash Firmware                              
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/Volumes/User/MBP-Google-Drive/jamcorder/firmware/.espressif/python_env/idf5.1_py3.9_env/bin/python /Volumes/User/MBP-Google-Drive/jamcorder/firmware/esp-idf/components/esptool_py/esptool/esptool.py --port /dev/cu.usbmodem2101 --baud 921600 --before default_reset --after no_reset --chip esp32s3 write_flash --flash_mode dio --flash_freq 80m --flash_size 8MB --force 0x0 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/bootloader/bootloader.bin.enc 0xf000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/partition_table/partition-table.bin.enc 0x10000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/ota_data_initial.bin.enc 0x300000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/jJfwmApp.bin.enc 0xe0000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfu/build/jJfuApp.bin.enc 0x25000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfu/build/jOtaAesKey.bin.enc


esptool.py v4.7.0
Serial port /dev/cu.usbmodem2101
Connecting...
Chip is ESP32-S3 in Secure Download Mode
WARNING: Stub loader is not supported in Secure Download Mode, setting --no-stub
Changing baud rate to 921600
Changed.
Enabling default SPI flash mode...
Configuring flash size...
Flash will be erased from 0x00000000 to 0x00006fff...
Flash will be erased from 0x0000f000 to 0x0000ffff...
Flash will be erased from 0x00010000 to 0x00011fff...
Flash will be erased from 0x00300000 to 0x0059cfff...
Flash will be erased from 0x000e0000 to 0x0025ffff...
Flash will be erased from 0x00025000 to 0x00025fff...
Erasing flash...
Warning: Image file at 0x0 doesn't look like an image file, so not changing any flash settings.
Took 0.38s to erase flash block
Wrote 28672 bytes at 0x00000000 in 1.0 seconds (226.1 kbit/s)...
Erasing flash...
Took 0.08s to erase flash block
Wrote 3072 bytes at 0x0000f000 in 0.1 seconds (231.7 kbit/s)...
Erasing flash...
Took 0.15s to erase flash block
Wrote 8192 bytes at 0x00010000 in 0.3 seconds (217.5 kbit/s)...
Erasing flash...
Took 9.68s to erase flash block
Wrote 2738176 bytes at 0x00300000 in 95.5 seconds (229.3 kbit/s)...
Erasing flash...
Took 6.30s to erase flash block
Wrote 1572864 bytes at 0x000e0000 in 54.3 seconds (231.9 kbit/s)...
Erasing flash...
Took 0.08s to erase flash block
Wrote 1024 bytes at 0x00025000 in 0.0 seconds (242.0 kbit/s)...

Leaving...
Staying in bootloader.
 ___ _   _  ___ ___ ___ ___ ___  
/ __| | | |/ __/ __| __/ __/ __| 
\__ \ |_| | (_| (__| _|\__ \__ \ 
|___/\___/ \___\___|___|___/___/ 


^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                               Reboot                                       
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please unplug the device now...
Please plug the device back in...
 ___ _   _  ___ ___ ___ ___ ___  
/ __| | | |/ __/ __| __/ __/ __| 
\__ \ |_| | (_| (__| _|\__ \__ \ 
|___/\___/ \___\___|___|___/___/ 

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                               Console                                      
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Waiting on /dev/cu.usbmodem2101
found device!
Waiting for Console...
invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc
      > invalid header: 0xce5209fc

When flashing using UART, it is mostly the same but but the console just logs 'xxxxxxxxx' and nothing else:

Flashing using UART:

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                                Flash Firmware                              
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/Volumes/User/MBP-Google-Drive/jamcorder/firmware/.espressif/python_env/idf5.1_py3.9_env/bin/python /Volumes/User/MBP-Google-Drive/jamcorder/firmware/esp-idf/components/esptool_py/esptool/esptool.py --port /dev/cu.SLAB_USBtoUART --baud 921600 --before default_reset --after no_reset --chip esp32s3 write_flash --flash_mode dio --flash_freq 80m --flash_size 8MB --force 0x0 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/bootloader/bootloader.bin.enc 0xf000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/partition_table/partition-table.bin.enc 0x10000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/ota_data_initial.bin.enc 0x300000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfwm/build/jJfwmApp.bin.enc 0xe0000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfu/build/jJfuApp.bin.enc 0x25000 /Volumes/User/MBP-Google-Drive/jamcorder/firmware/jamcorder-firmware/jfu/build/jOtaAesKey.bin.enc


esptool.py v4.7.0
Serial port /dev/cu.SLAB_USBtoUART
Connecting....
Chip is ESP32-S3 in Secure Download Mode
WARNING: Stub loader is not supported in Secure Download Mode, setting --no-stub
Changing baud rate to 921600
Changed.
Enabling default SPI flash mode...
Configuring flash size...
Flash will be erased from 0x00000000 to 0x00006fff...
Flash will be erased from 0x0000f000 to 0x0000ffff...
Flash will be erased from 0x00010000 to 0x00011fff...
Flash will be erased from 0x00300000 to 0x0059cfff...
Flash will be erased from 0x000e0000 to 0x0025ffff...
Flash will be erased from 0x00025000 to 0x00025fff...
Erasing flash...
Warning: Image file at 0x0 doesn't look like an image file, so not changing any flash settings.
Took 0.39s to erase flash block
Wrote 28672 bytes at 0x00000000 in 2.2 seconds (105.4 kbit/s)...
Erasing flash...
Took 0.11s to erase flash block
Wrote 3072 bytes at 0x0000f000 in 0.2 seconds (106.4 kbit/s)...
Erasing flash...
Took 0.19s to erase flash block
Wrote 8192 bytes at 0x00010000 in 0.6 seconds (106.4 kbit/s)...
Erasing flash...
Took 9.70s to erase flash block
Wrote 2738176 bytes at 0x00300000 in 207.3 seconds (105.7 kbit/s)...
Erasing flash...
Took 6.30s to erase flash block
Wrote 1572864 bytes at 0x000e0000 in 119.4 seconds (105.4 kbit/s)...
Erasing flash...
Took 0.11s to erase flash block
Wrote 1024 bytes at 0x00025000 in 0.1 seconds (109.3 kbit/s)...

Leaving...
Staying in bootloader.
 ___ _   _  ___ ___ ___ ___ ___  
/ __| | | |/ __/ __| __/ __/ __| 
\__ \ |_| | (_| (__| _|\__ \__ \ 
|___/\___/ \___\___|___|___/___/ 


^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                               Reboot                                       
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/Volumes/User/MBP-Google-Drive/jamcorder/firmware/.espressif/python_env/idf5.1_py3.9_env/bin/python /Volumes/User/MBP-Google-Drive/jamcorder/firmware/esp-idf/components/esptool_py/esptool/esptool.py --port /dev/cu.SLAB_USBtoUART --chip esp32s3 --no-stub --after hard_reset get_security_info

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                                                            
                               Console                                      
                                                                            
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Waiting on /dev/cu.SLAB_USBtoUART
found device!
Waiting for Console...
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

@chipweinberger
Copy link
Contributor Author

@igrr , appreciate your thoughts here.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 11, 2024

full schematic. But I don't think my pcb is the problem. just for reference.

when the device got bricked, I was plugging & unplugging USBB1 many times. About 6 times.

Screenshot 2024-05-11 at 3 26 14 PM

@chipweinberger chipweinberger changed the title [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! (IDFGH-12807) [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! invalid header: 0xce5209fc (IDFGH-12807) May 11, 2024
@chipweinberger chipweinberger changed the title [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! invalid header: 0xce5209fc (IDFGH-12807) [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) May 11, 2024
@chipweinberger chipweinberger changed the title [v5.1.4][S3] Toggling VCC during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) [v5.1.4][S3][Flash Encryption] Toggling VCC during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) May 11, 2024
@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 11, 2024

I want to be clear, invalid header: 0xce5209fc is triggered solely by unplugging the device from power at some unfortunate time, and then replugging it in. The device was previously working for weeks straight.

I use host-generated flash encryption keys, so I can normally re-flash the encrypted device and I thought I could recover it, but when the device gets in this state, I can not re-flash it to recover. invalid header: 0xce5209fc is permanent.

Its as if the flash encryption keys on the device got corrupted.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 12, 2024

So I've been trying to reproduce this for the past 45 minutes.

Original repro scenario (~10 attempts):

  • pcb: custom pcb
  • power supply: AC usb-c charger
  • power cable: usb-c to usb-b cable
  • uart: unconnected
  • enclosure: custom 3d printed enclosure

did not repro (~150 attempts)

  • pcb: dev-kit-c
  • power supply: computer usb port
  • power cable: usb-c to micro usb
  • uart: connected as typical
  • enclosure: none

did not repro (~150 attempts)

  • pcb: custom pcb
  • power supply: computer usb port
  • power cable: connected via modified dev-kit-c
  • uart: connected via modified dev-kit-c
  • enclosure: none

This leaves a couple suspects:

  1. maybe the AC usb-c charger is bad?
  2. maybe the 3d printed enclosure is causing a problem?

USB-C to USB-B Cable

My cable appears to be poorly made, so as you plug it in, VCC gets turned on and off a few times. Jiggling the cable causes the device to reset

3d printed enclosure

This may actually be related to the problem.

  • my device was not always turning on because the force required to plug in USB-B would cause the enclosure to sometimes randomly hold down a button on the device, which I have tied to GPIO0

GPIO0

My current theory is that some combination of toggling both VCC & GPIO 0 is causing the problem.

Next Steps

I will try the original repro scenario, but with a better enclosure that never holds down GPIO 0

@chipweinberger
Copy link
Contributor Author

@mahavirj

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 12, 2024

Wow no way!!!!

I wanted to check if somehow flash encryption got disabled, so I flashed a plaintext binary, AND IT BOOTED!!

I know this sounds incredible, but I am 100% sure flash encryption was enabled on this device. And just unplugging & replugging my device / GPIO 0 caused flash encryption to become disabled. Crazy.

This is why my device wont boot anymore & logs invalid header: 0xce5209fc.

How am I sure flash encryption was enabled?

  1. I have records & logs. The affected device has OPTIONAL_UNIQUE_ID a3aba04cdf8693c296ed73c5426f423e. It was flashed with my production script on April 28, 2024. My production script always enables flash encryption and will throw an error otherwise.
  2. All of the fuses are still burned. secure download mode, the key purposes, etc. These fuses are all burned at the same time by my script.
  3. this is my main test device. it has been working for weeks. I've used it every day. It was working, then i re-plugged it and it suddenly wont boot anymore. I grab the serial logs and see invalid header: 0xce5209fc. This error suggests that the firmware is encrypted, and it was working 10 seconds prior, so flash encryption had to have been enabled!
  4. I personally remember 100% flashing this device with flash encryption

So, somehow, flash encryption is no longer active. How could re-plugging the device cause that to happen?

This is the result of get_security_info on the affected device.

chipweinberger@Chips-MacBook-Pro jflash % /Volumes/User/MBP-Google-Drive/jamcorder/firmware/.espressif/python_env/idf5.1_py3.9_env/bin/python /Volumes/User/MBP-Google-Drive/jamcorder/firmware/esp-idf/components/esptool_py/esptool/esptool.py --port /dev/cu.usbmodem2101 --chip esp32s3 --no-stub get_security_info
esptool.py v4.7.0
Serial port /dev/cu.usbmodem2101
Connecting...
Chip is ESP32-S3 in Secure Download Mode
Enabling default SPI flash mode...

Security Information:
=====================
Flags: 0x00000284 (0b1010000100)
Key Purposes: (0, 2, 3, 0, 0, 0, 12)
Chip ID: 9
API Version: 0
Secure Boot: Disabled
Flash Encryption: Disabled
SPI Boot Crypt Count (SPI_BOOT_CRYPT_CNT): 0x0
Dcache in UART download mode: Disabled
JTAG: Permenantly Disabled
Hard resetting via RTS pin...

My device has a command to print the fuses, here are all the fuses currently set:

notably, SPI_BOOT_CRYPT_CNT is not set anymore and reports 0x0 as its value.

Fuses:

	"set":	[{
			"label":	"WR_DIS",
			"bits":	32,
			"offset":	"blk0:0",
			"value":	"0x0"
		}, {
			"label":	"WR_DIS_KEY_PURPOSE_1",
			"bits":	1,
			"offset":	"blk0:9",
			"value":	"0x1"
		}, {
			"label":	"WR_DIS_BLOCK_KEY1",
			"bits":	1,
			"offset":	"blk0:24",
			"value":	"0x1"
		}, {
			"label":	"WR_DIS_BLOCK_KEY2",
			"bits":	1,
			"offset":	"blk0:25",
			"value":	"0x1"
		}, {
			"label":	"RD_DIS",
			"bits":	7,
			"offset":	"blk0:32",
			"value":	"0x2"
		}, {
			"label":	"RD_DIS_BLOCK_KEY1",
			"bits":	1,
			"offset":	"blk0:33",
			"value":	"0x1"
		}, {
			"label":	"DIS_DOWNLOAD_DCACHE",
			"bits":	1,
			"offset":	"blk0:43",
			"value":	"0x1"
		}, {
			"label":	"DIS_PAD_JTAG",
			"bits":	1,
			"offset":	"blk0:51",
			"value":	"0x1"
		}, {
			"label":	"DIS_DOWNLOAD_MANUAL_ENCRYPT",
			"bits":	1,
			"offset":	"blk0:52",
			"value":	"0x1"
		}, {
			"label":	"VDD_SPI_XPD",
			"bits":	1,
			"offset":	"blk0:68",
			"value":	"0x1"
		}, {
			"label":	"VDD_SPI_TIEH",
			"bits":	1,
			"offset":	"blk0:69",
			"value":	"0x1"
		}, {
			"label":	"VDD_SPI_FORCE",
			"bits":	1,
			"offset":	"blk0:70",
			"value":	"0x1"
		}, {
			"label":	"KEY_PURPOSE_1",
			"bits":	4,
			"offset":	"blk0:92",
			"value":	"0x2"
		}, {
			"label":	"KEY_PURPOSE_2",
			"bits":	4,
			"offset":	"blk0:96",
			"value":	"0x3"
		}, {
			"label":	"DIS_USB_JTAG",
			"bits":	1,
			"offset":	"blk0:118",
			"value":	"0x1"
		}, {
			"label":	"DIS_DIRECT_BOOT",
			"bits":	1,
			"offset":	"blk0:129",
			"value":	"0x1"
		}, {
			"label":	"ENABLE_SECURITY_DOWNLOAD",
			"bits":	1,
			"offset":	"blk0:133",
			"value":	"0x1"
		}, {
			"label":	"MAC",
			"bits":	48,
			"offset":	"blk1:40",
			"value":	"0x68b6b3472ae0"
		}, {
			"label":	"WAFER_VERSION_MINOR_LO",
			"bits":	3,
			"offset":	"blk1:114",
			"value":	"0x1"
		}, {
			"label":	"BLK_VERSION_MINOR",
			"bits":	3,
			"offset":	"blk1:120",
			"value":	"0x2"
		}, {
			"label":	"K_RTC_LDO",
			"bits":	7,
			"offset":	"blk1:141",
			"value":	"0x53"
		}, {
			"label":	"K_DIG_LDO",
			"bits":	7,
			"offset":	"blk1:148",
			"value":	"0x4a"
		}, {
			"label":	"V_RTC_DBIAS20",
			"bits":	8,
			"offset":	"blk1:155",
			"value":	"0x99"
		}, {
			"label":	"V_DIG_DBIAS20",
			"bits":	8,
			"offset":	"blk1:163",
			"value":	"0x8e"
		}, {
			"label":	"DIG_DBIAS_HVT",
			"bits":	5,
			"offset":	"blk1:171",
			"value":	"0x1a"
		}, {
			"label":	"ADC2_CAL_VOL_ATTEN3",
			"bits":	6,
			"offset":	"blk1:186",
			"value":	"0x23"
		}, {
			"label":	"OPTIONAL_UNIQUE_ID",
			"bits":	128,
			"offset":	"blk2:0",
			"value":	"0xa3aba04cdf8693c296ed73c5426f423e"
		}, {
			"label":	"BLK_VERSION_MAJOR",
			"bits":	2,
			"offset":	"blk2:128",
			"value":	"0x1"
		}, {
			"label":	"TEMP_CALIB",
			"bits":	9,
			"offset":	"blk2:132",
			"value":	"0x171"
		}, {
			"label":	"OCODE",
			"bits":	8,
			"offset":	"blk2:141",
			"value":	"0x52"
		}, {
			"label":	"ADC1_INIT_CODE_ATTEN0",
			"bits":	8,
			"offset":	"blk2:149",
			"value":	"0x74"
		}, {
			"label":	"ADC1_INIT_CODE_ATTEN1",
			"bits":	6,
			"offset":	"blk2:157",
			"value":	"0x2b"
		}, {
			"label":	"ADC1_INIT_CODE_ATTEN2",
			"bits":	6,
			"offset":	"blk2:163",
			"value":	"0x1e"
		}, {
			"label":	"ADC1_INIT_CODE_ATTEN3",
			"bits":	6,
			"offset":	"blk2:169",
			"value":	"0x22"
		}, {
			"label":	"ADC2_INIT_CODE_ATTEN0",
			"bits":	8,
			"offset":	"blk2:175",
			"value":	"0x90"
		}, {
			"label":	"ADC2_INIT_CODE_ATTEN1",
			"bits":	6,
			"offset":	"blk2:183",
			"value":	"0x2d"
		}, {
			"label":	"ADC2_INIT_CODE_ATTEN2",
			"bits":	6,
			"offset":	"blk2:189",
			"value":	"0x12"
		}, {
			"label":	"ADC2_INIT_CODE_ATTEN3",
			"bits":	6,
			"offset":	"blk2:195",
			"value":	"0x20"
		}, {
			"label":	"ADC1_CAL_VOL_ATTEN0",
			"bits":	8,
			"offset":	"blk2:201",
			"value":	"0x80"
		}, {
			"label":	"ADC1_CAL_VOL_ATTEN1",
			"bits":	8,
			"offset":	"blk2:209",
			"value":	"0x7f"
		}, {
			"label":	"ADC1_CAL_VOL_ATTEN2",
			"bits":	8,
			"offset":	"blk2:217",
			"value":	"0x73"
		}, {
			"label":	"ADC1_CAL_VOL_ATTEN3",
			"bits":	8,
			"offset":	"blk2:225",
			"value":	"0x8d"
		}, {
			"label":	"ADC2_CAL_VOL_ATTEN0",
			"bits":	8,
			"offset":	"blk2:233",
			"value":	"0x78"
		}, {
			"label":	"ADC2_CAL_VOL_ATTEN1",
			"bits":	7,
			"offset":	"blk2:241",
			"value":	"0x3c"
		}, {
			"label":	"ADC2_CAL_VOL_ATTEN2",
			"bits":	7,
			"offset":	"blk2:248",
			"value":	"0x41"
		}, {
			"label":	"KEY2",
			"bits":	256,
			"offset":	"blk6:0",
			"value":	"0x45e59"
		}],

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 12, 2024

on a different proper functioning device I can see SPI_BOOT_CRYPT_CNT is set as expected.

		}, {
			"label":	"SPI_BOOT_CRYPT_CNT",
			"bits":	3,
			"offset":	"blk0:82",
			"value":	"0x1"
		}, {

All these fuses are burned at the same time in my production script

# burn fuses
    burnXtsAesKey1(st)
    burnXtsAesKey2(st)
    burnSpiBootCryptCount(st)
    burnDisableDirectBoot(st)
    burnDisableDownloadManualEncrypt(st)
    burnDisableDownloadDCache(st)
    burnDisableDownloadICache(st)
    burnHardDisableJtag(st)
    burnDisableUsbJtag(st)
    burnEnableSecureDownloadMode(st)

so really, on the affected device, all evidence points to SPI_BOOT_CRYPT_CNT was 0x1, and after the issue happened now it is 0x0. And this caused the device to fail to boot.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 12, 2024

also note, my devices does not use CONFIG_SECURE_FLASH_ENC_ENABLED. Instead I burn the fuses myself.

@chipweinberger chipweinberger changed the title [v5.1.4][S3][Flash Encryption] Toggling VCC during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) [v5.1.4][S3][Flash Encryption] Toggling VCC / GPIO-0 during boot BRICKS THE DEVICE! (invalid header: 0xce5209fc) (IDFGH-12807) May 13, 2024
@mahavirj
Copy link
Member

@chipweinberger

Few questions:

  1. For the device where you suspect that SPI_BOOT_CRYPT_CNT is reset to 0, are you able to again write it to enable the flash enc?
  2. Are you able to recreate this issue reliably on multiple devices?

@chipweinberger
Copy link
Contributor Author

For the device where you suspect that SPI_BOOT_CRYPT_CNT is reset to 0, are you able to again write it to enable the flash enc?

I was waiting for your recommendation before I tried it.

I have not yet reproduced it, but i have not tried the exact repro scenario yet. I only have a few devices and was not sure if I would permanently break them.

I will try the exact repro scenario again now that I think i can reverse the damage. will try soon

do you have any guesses what is happening?

@mahavirj
Copy link
Member

it appears SPI_BOOT_CRYPT_CNT somehow got reset back to 0, causing flash encryption to become disabled

So the sequence goes something like this:

  1. First the device was running the encrypted artifacts
  2. Power cycle device multiple times
  3. Device started running into boot failure
  4. Flash plaintext artifacts and the device could boot successfully

Is this correct understanding? CC @KonstantinKondrashov

do you have any guesses what is happening?

No. The issue looks entirely different and not something that we heard/observed previously.

(So far) Most of the failures in the security features workflow can be attributed to errors in the efuse configuration:

  • Device being abruptly power cycled whilst it was enabling the flash encryption workflow
  • Programming errors in the efuse write operation. We had added some protection measures for this in the software and they are part of all latest branches. (Reference)

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 13, 2024

Is this correct understanding?

yes exactly. Power cycle + gpio 0 toggling. I will try to repro it again.

I have done enough testing that I think I can rule out power cycles alone. I think GPIO 0 is related somehow.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 14, 2024

For the device where you suspect that SPI_BOOT_CRYPT_CNT is reset to 0, are you able to again write it to enable the flash enc?

I have results.

  1. I had my affected device in SPI_BOOT_CRYPT_CNT 0, as I described
  2. because Secure Download Mode is enabled, I wrote simple plaintext firmware to set SPI_BOOT_CRYPT_CNT 1
  3. I flashed the plaintext firmware
  4. I booted (Note: I'm note 100% sure if this step actually happened or not. I should have been more careful), which should have set SPI_BOOT_CRYPT_CNT to 1
  5. I booted again, so flash encryption should be enabled now. But of course firmware is still plaintext.

After step 5, I checked get_security_info, SPI_BOOT_CRYPT_CNT is now 1

chipweinberger@Chips-MacBook-Pro jfwm % /Volumes/User/MBP-Google-Drive/jamcorder/firmware/.espressif/python_env/idf5.1_py3.9_env/bin/python /Volumes/User/MBP-Google-Drive/jamcorder/firmware/esp-idf/components/esptool_py/esptool/esptool.py --port /dev/cu.usbmodem1101 --chip esp32s3 --no-stub get_security_info
esptool.py v4.7.0
Serial port /dev/cu.usbmodem1101
Connecting...
Chip is ESP32-S3 in Secure Download Mode
Enabling default SPI flash mode...

Security Information:
=====================
Flags: 0x00000684 (0b11010000100)
Key Purposes: (4, 2, 3, 0, 0, 0, 12)
Chip ID: 9
API Version: 0
Secure Boot: Disabled
Flash Encryption: Enabled
SPI Boot Crypt Count (SPI_BOOT_CRYPT_CNT): 0x1
Dcache in UART download mode: Disabled
Icache in UART download mode: Disabled
JTAG: Permenantly Disabled
Hard resetting via RTS pin...
chipweinberger@Chips-MacBook-Pro jfwm % 

These are the logs for step 5

--- esp-idf-monitor 1.4.0 on /dev/cu.usbmodem1101 460800 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x15 (USB_UART_CHIP_RESET),boot:0xb (SPI_FAST_FLASH_BOOT)
Saved PC:0x400490e9
0x400490e9: usb_uart_device_rx_one_char in ROM

SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3c58,len:0x37bc
load:0x403c9700,len:0x4
load:0x403c9704,len:0x1c94
load:0x403cc700,len:0x4550
entry 0x403c9a8c
I (32) boot: ESP-IDF v5.1.1-1884-g04ab2fe17e 2nd stage bootloader
I (32) boot: compile time May 13 2024 18:14:28
I (32) boot: Multicore bootloader
I (33) boot: chip revision: v0.1
I (34) boot.esp32s3: Boot SPI Speed : 80MHz
I (35) boot.esp32s3: SPI Mode       : DIO
I (36) boot.esp32s3: SPI Flash Size : 8MB
I (37) boot: Enabling RNG early entropy source...
I (39) boot: Partition Table:
I (40) boot: ## Label            Usage          Type ST Offset   Length
I (41) boot:  0 jOtaData         OTA data         01 00 00010000 00002000
I (43) boot:  1 jCoredump        Unknown data     01 03 00012000 00010000
I (45) boot:  2 nvsKeys          NVS keys         01 04 00022000 00001000
I (47) boot:  3 jPhyInit         RF data          01 01 00023000 00001000
I (49) boot:  4 efuse_em         efuse            01 05 00024000 00001000
I (51) boot:  5 jOtaAesKey       Unknown data     01 06 00025000 00001000
I (53) boot:  6 jFabInfo         Unknown data     01 06 00026000 00013000
I (54) boot:  7 jStSafe          Unknown data     01 06 00039000 00004000
I (56) boot:  8 jJamcorderUuid   Unknown data     01 06 0003d000 00001000
I (58) boot:  9 nvs              WiFi data        01 02 0003e000 00033000
I (60) boot: 10 jSafeBoot        Unknown data     01 06 00071000 00006000
I (62) boot: 11 jDevCmds         Unknown data     01 06 00077000 00004000
I (64) boot: 12 jLifeStats       Unknown data     01 06 0007b000 00042000
I (66) boot: 13 jImmortal        Unknown data     01 06 000bd000 00008000
I (67) boot: 14 jJfuArgs         Unknown data     01 06 000c5000 00008000
I (69) boot: 15 jJfwmArgs        Unknown data     01 06 000cd000 00008000
I (71) boot: 16 jExtra1          Unknown data     01 06 000d5000 00004000
I (73) boot: 17 jExtra2          Unknown data     01 06 000d9000 00004000
I (75) boot: 18 jExtra3          Unknown data     01 06 000dd000 00002000
I (77) boot: 19 jExtra4          Unknown data     01 06 000df000 00001000
I (79) boot: 20 jJfuApp          OTA app          00 11 000e0000 00220000
I (81) boot: 21 jJfwmApp         OTA app          00 10 00300000 00500000
I (82) boot: End of partition table
I (84) boot: No factory image, trying OTA 0
I (85) jboot: bootloader_utility_get_selected_boot_partition: 0
I (86) jboot: Clearing ota_data Partition...
I (88) boot_comm: ## Label            Usage Offset   Length   Cleaned
I (220) boot_comm:  0 jOtaData         data  00010000 00002000 [yes]
I (220) boot_comm:  1 jCoredump        data  00012000 00010000 [no]
I (221) boot_comm:  2 nvsKeys          data  00022000 00001000 [no]
I (222) boot_comm:  3 jPhyInit         data  00023000 00001000 [no]
I (224) boot_comm:  4 efuse_em         data  00024000 00001000 [no]
I (226) boot_comm:  5 jOtaAesKey       data  00025000 00001000 [no]
I (228) boot_comm:  6 jFabInfo         data  00026000 00013000 [no]
I (229) boot_comm:  7 jStSafe          data  00039000 00004000 [no]
I (231) boot_comm:  8 jJamcorderUuid   data  0003d000 00001000 [no]
I (233) boot_comm:  9 nvs              data  0003e000 00033000 [no]
I (235) boot_comm: 10 jSafeBoot        data  00071000 00006000 [no]
I (236) boot_comm: 11 jDevCmds         data  00077000 00004000 [no]
I (238) boot_comm: 12 jLifeStats       data  0007b000 00042000 [no]
I (240) boot_comm: 13 jImmortal        data  000bd000 00008000 [no]
I (241) boot_comm: 14 jJfuArgs         data  000c5000 00008000 [no]
I (243) boot_comm: 15 jJfwmArgs        data  000cd000 00008000 [no]
I (245) boot_comm: 16 jExtra1          data  000d5000 00004000 [no]
I (247) boot_comm: 17 jExtra2          data  000d9000 00004000 [no]
I (248) boot_comm: 18 jExtra3          data  000dd000 00002000 [no]
I (250) boot_comm: 19 jExtra4          data  000df000 00001000 [no]
I (252) jboot: Note: Hold User Button down during boot to see boot menu!
I (254) jboot: Waiting 1.5 seconds for user to press button...
I (1754) jboot: GPIO not held
I (1754) jboot: GPIO hold time: 0
I (1754) jboot: No command chosen. Entering Default App Mode
I (1754) esp_image: segment 0: paddr=00300020 vaddr=3c190020 size=edbd0h (973776) map
I (1931) esp_image: segment 1: paddr=003edbf8 vaddr=3fca2400 size=02420h (  9248) load
I (1933) esp_image: segment 2: paddr=003f0020 vaddr=42000020 size=18bbb4h (1620916) map
I (2225) esp_image: segment 3: paddr=0057bbdc vaddr=3fca4820 size=04064h ( 16484) load
I (2229) esp_image: segment 4: paddr=0057fc48 vaddr=40374000 size=1e308h (123656) load
I (2269) boot: Loaded app from partition at offset 0x300000
I (2341) boot: Set actual ota_seq=1 in otadata[0]
I (2341) boot: Checking flash encryption...
I (2341) efuse: Batch mode of writing fields is enabled
I (2342) flash_encrypt: Generating new flash encryption key...
I (2345) efuse: Writing EFUSE_BLK_KEY0 with purpose 4
W (2346) flash_encrypt: Not disabling UART bootloader encryption
I (2347) flash_encrypt: Disable UART bootloader cache...
I (2348) flash_encrypt: Disable JTAG...
I (2350) efuse: BURN BLOCK4
W (2353) efuse: BLOCK4 got a coding error
E (2353) efuse: BURN BLOCK4 - ERROR (written bits != read bits)
W (2354) efuse: BLOCK4: next retry to fix an error [1/3]...
I (2355) efuse: BURN BLOCK4
W (2358) efuse: BLOCK4 got a coding error
E (2358) efuse: BURN BLOCK4 - ERROR (written bits != read bits)
W (2359) efuse: BLOCK4: next retry to fix an error [2/3]...
I (2360) efuse: BURN BLOCK4
I (2364) efuse: BURN BLOCK4 - OK (write block == read block)
W (2364) efuse: BLOCK0 already has a coding error
I (2365) efuse: BURN BLOCK0
E (2366) efuse: BURN BLOCK0 - ERROR (written bits != read bits)
W (2367) efuse: BLOCK0: next retry to fix an error [1/3]...
I (2368) efuse: BURN BLOCK0
E (2370) efuse: BURN BLOCK0 - ERROR (written bits != read bits)
W (2371) efuse: BLOCK0: next retry to fix an error [2/3]...
I (2372) efuse: BURN BLOCK0
I (2375) efuse: BURN BLOCK0 - OK (all write block bits are set)
I (2375) efuse: Batch mode. Prepared fields are committed
E (2376) esp_image: image at 0x0 has invalid magic byte (nothing flashed here?)
W (2378) flash_encrypt: no valid bootloader was found
E (2380) boot: Encryption flash contents failed (261)
E (2381) boot: OTA app partition slot 0 is not bootable
E (2383) esp_image: image at 0xe0000 has invalid magic byte (nothing flashed here?)
E (2385) boot: OTA app partition slot 1 is not bootable
E (2386) boot: No bootable app partitions in the partition table
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x3 (RTC_SW_SYS_RST),boot:0xb (SPI_FAST_FLASH_BOOT)
Saved PC:0x403cde6d
invalid header: 0xc9f45dd9
invalid header: 0xc9f45dd9
invalid header: 0xc9f45dd9
invalid header: 0xc9f45dd9
invalid header: 0xc9f45dd9
invalid header: 0xc9f45dd9

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 14, 2024

did more stuff..

step 6: I flashed encrypted firmware
step 7: reboot. Device boots as normal and works

So in the end, I was able to recover the device into encrypted mode.

However, this is of course very unacceptable. A consumer of my product will not be able to recover.

@mahavirj
Copy link
Member

@chipweinberger

Key Purposes: (4, 2, 3, 0, 0, 0, 12)

This suggests that device has following 3 key purposes:

  • XTS_AES_128_KEY
  • XTS_AES_256_KEY_1
  • XTS_AES_256_KEY_2

In this case the key used will still be 512 bit and derived from key purpose 2/3. Please refer to section 23.4.2 in the S3 TRM for more details.

Q: you wrote this new key purpose block (4) intentionally? Which key you are using to encrypt the artifacts here?

I (2365) efuse: BURN BLOCK0
E (2366) efuse: BURN BLOCK0 - ERROR (written bits != read bits)
W (2367) efuse: BLOCK0: next retry to fix an error [1/3]...

Are these errors common for your hardware design? Or they are occurring only on this specific device?

I (49) boot: 4 efuse_em efuse 01 05 00024000 00001000

Any specific purpose for this partition? This is used for the virtual efuse scenario but not something that is recommended outside of IDF test environment.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented May 16, 2024

Q: you wrote this new key purpose block (4) intentionally?

No. I have no idea why it is being written.

Are these errors common for your hardware design?

I've never seen them before, AFAIK.

Any specific purpose for this partition?

I used it during development, and never got rid of it. I think it is okay to keep? sdkconfig has virtual fuses disabled.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented Jun 7, 2024

I have a video showing how to reproduce strange inconsistent behavior with GPIO 0.

I think the exact GPIO 0 sequence is not too exact. In the video I think I did it different ways.

I do wonder if I'm hitting some type of power glitch. My board does have a good amount of capacitance. (schematic above)

I'm not too concerned about the "RED LED" issue in the video below. The only issue that really matters was the corrupt firmware issue I originally reported. I think they might be related though.

I'm not exactly sure if this "RED LED" situation is what caused the original flash corruption issue. It's just a lead worth investigating. I have not been able to reproduce the flash corruption issue again.

Ultimately, it would just be nice to understand what is going on that causes GPIO1 / GPIO8, etc, to go into this weird state.

Notes:

  • the USB-A port is plugged into my mac using a "illegal" USB-A to USB-A cable for USB-Serial-JTAG
  • the button is on GPIO 0 (not GPIO1), red LED GPIO 1, buzzer on GPIO 8
  • this device still works totally fine. It's not broken in any way. This red LED state is temporary.
  • the "invalid header" sticker refers to the original corruption issue I had on this device, described in greater detail above, which bricked the device until I restored it.
IMG_0337.mp4

@igrr @mahavirj

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

3 participants