Esp32 bare metal bootloader

Question

After getting the LEDC module and PWM working without the IDF drivers, I am trying to build a very simple bootloader for my ESP32 DevKit V1. I am doing it to keep learning low-level coding, so I don't want to use the IDF framework or an RTOS.

The issue I have: my bootloader is correctly loaded when the chip turns on, but it is unable to start the main program. Here is the bootloader.c code:

#include <stdint.h>
//#define MAIN_APP_ADDR 0x10000
#define MAIN_APP_ADDR (0x3F400000 + 0x10000)
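#define GPIO_OUT_W1TS_REG (*(volatile uint32_t *)0x3FF44008)
#define GPIO_OUT_W1TC_REG (*(volatile uint32_t *)0x3FF4400C)
#define GPIO_ENABLE_W1TS_REG (*(volatile uint32_t *)0x3FF44024)
#define GPIO_ENABLE_W1TC_REG (*(volatile uint32_t *)0x3FF44028)
#define TIMG0_WDCONFIG0_REG (*(volatile uint32_t *)0x3FF5F048)
#define RTC_CNTL_WDTCONFIG0_REG (*(volatile uint32_t *)0x3FF4808C)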

static void jump_to_app(void) {
  typedef void (*void_fn)(void);

  volatile uint32_t* rst_entry = (uint32_t*)(MAIN_APP_ADDR + 4);
  uint32_t* rst = (uint32_t*)(*rst_entry);
  /* led is never turned off */ 
  GPIO_OUT_W1TC_REG |= (1<<2);
  void_fn app_addr = (void_fn)rst;
  app_addr();
}

int main(void) {

  // disable watchdog for the boot process
  TIMG0_WDCONFIG0_REG &= ~(1<<14);
  RTC_CNTL_WDTCONFIG0_REG &= ~(1 << 10);

  /* enable the pin as an output; GPIO2 -> integrated LED */
  GPIO_ENABLE_W1TS_REG |= (1 << 2);

  /* turns on the LED */
  GPIO_OUT_W1TS_REG |= (1 << 2);
  jump_to_app();

  return 0;
}

The main logic is taken from this YouTube video (min 29:14).

I added the GPIO control to understand where the program breaks. I have no debugger interface, so I do not know the exact reason it does not work. The LED turns on, so I assume the bootloader starts correctly; the main program should then have turned the LED off, but that never happened.

So I tried to debug the code using the LED to understand where it breaks, and the guilty line seems to be uint32_t* rst = (uint32_t*)(*rst_entry); because when I place the GPIO line one line before it, the LED is turned off. This line also triggers a warning in the clang LSP: Cast to 'uint32_t *' (aka 'unsigned int *') from smaller integer type 'uint32_t' (aka 'unsigned int') [-Wint-to-pointer-cast]. I tried (uintptr_t) too, but it did not help.

I think the ESP32 is restarting, because the LED blinks slightly and is not 'always' on.

Here is my bootloader.ld, which is (strongly) inspired by the one found in this post. I followed the example in the post, and building a single bin (flashed at 0x1000) with an 'integrated' bootloader worked fine.

ENTRY( main );

MEMORY
{
  iram_seg ( RX )       : ORIGIN = 0x40080000, len = 0xFC00
  dram_seg ( RW )       : ORIGIN = 0x3FFF0000, len = 0x1000
}

/* Define output sections */
SECTIONS {
  /* The program code and other data goes into Instruction RAM */
  .iram.text :
  {
    . = ALIGN(16);
    KEEP(*(.entry.text))
    *(.text)
    *(.text*)
    KEEP (*(.init))
    KEEP (*(.fini))
    *(.rodata)
    *(.rodata*)

    . = ALIGN(4);
    _etext = .;
  } >iram_seg

  /* Initialized data goes into Data RAM */
  _sidata = .;
  .data : AT(_sidata)
  {
    . = ALIGN(4);
    _sdata = .;
    *(.data)
    *(.data*)

    . = ALIGN(4);
    _edata = .;
  } >dram_seg

  /* Uninitialized data also goes into Data RAM */
  .bss :
  {
    . = ALIGN(4);
    _sbss = .;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;
  } >dram_seg

  . = ALIGN(4);
  PROVIDE ( end = . );
  PROVIDE ( _end = . );
}

I am surely missing something because I am new to C and low-level coding. I am asking because I am struggling to find a solution online, and I can't tell whether the issue is in the bootloader code, in its linker script, or in main.ld. (main.c should be fine, since it worked in the single-bin example from the post and has not been edited.)

I am flashing the bootloader to 0x1000 and the main app to 0x10000.

I don't know if it helps, but here is main.ld as well:

ENTRY( main );

MEMORY
{
  ext_ram (RW) : ORIGIN = 0x3F800000, len = 4M
  ext_instruction_flash (RX) : ORIGIN = 0x400C2000, len = 11K
}

/* Define output sections */
SECTIONS {
  /* The program code and other data goes into Instruction RAM */
    .text : {
    *(.text)
    *(.text*)
  } > ext_instruction_flash

  /* Initialized data goes into Data RAM */
  .data : {
    . = ALIGN(4);
    /*_sdata = .;*/
    *(.data)
    *(.data*)

    . = ALIGN(4);
    /*_edata = .;*/
  } >ext_ram

  /* Uninitialized data also goes into Data RAM */
  .bss :
  {
    . = ALIGN(4);
    /*_sbss = .;*/
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    /*_ebss = .;*/
  } >ext_ram

  . = ALIGN(4);
  /*PROVIDE ( end = . );*/
  /*PROVIDE ( _end = . );*/
}

I am using the external memory of the ESP32 (page 30 of the Technical Reference Manual).

UPDATE:

I updated the code as Craig suggested: I used instruction addresses to store the code and disabled the watchdogs in the bootloader to avoid the chip reset. Now the chip turns on correctly (I see no errors when using screen to watch the log on /dev/ttyUSB0 at 115200 baud), but it cannot start the main app. I guess that is because the 0x10000 offset is not related to the external data flash address, but at the moment I have no clue where main.bin ends up when I flash it with esptool.py at address 0x10000. If someone has any idea, please let me know.

UPDATE 2:

Thank you Craig. Yes, I started with app_main and now I want to push further. I'm learning a lot (but also getting stressed, to be honest :D), and I'm enjoying it. I started with Wokwi and then bought an ESP32 with some LEDs and resistors just to try something 'real'. Wokwi seems to use the IDF framework, and I'd like to learn some low-level coding rather than blindly use APIs or libraries. I have been reading the IDF framework files to understand what I'm missing, but it is not an easy task; it is also hard to find a straight explanation in the Technical Reference Manual (I always need to jump from one chapter to another). But let's get down to business...

Here you can see that IDF flashes the main app at 0x10000. For my project I am using this command:

esptool.py --chip esp32 --port /dev/ttyUSB0 --baud 115200 --before default_reset --after hard_reset write_flash -e -z --flash_mode dio --flash_freq 40m --flash_size detect 0x1000 ./bootloader.bin 0x10000 ./main.bin

so I am also flashing the main bin at 0x10000. The partition table is missing, but my bootloader does not need it.

Creating the screen session let me spot an issue: the chip was resetting because of the watchdogs. Clearing them makes the bootloader run fine, but the main app still does not start. I think 0x1000 and 0x10000 are offsets from the external flash starting address. I think so because if I use #define MAIN_APP_ADDR (0x3F400000 + 0x1000) (starting address of external flash + bootloader offset), the bootloader is able to call itself! Or at least it seems so: I do not see any error in the screen session, and with different delays the LED blinks accordingly. To call itself, the bootloader has to read MAIN_APP_ADDR + 4. Without the + 4 nothing happens (the LED is turned off and never turned back on). The strange thing is that the bootloader can read its own address to call the function but is unable to read a different offset (or maybe it can read it but not run it). Here is an example of the bootloader calling itself; maybe I'm getting it wrong, so every comment is appreciated.

// bootloader.c
#include <stdint.h>

#define MAIN_APP_ADDR (0x3F400000 + 0x1000)
#define GPIO_OUT_W1TS_REG (*(volatile uint32_t *)0x3FF44008)
#define GPIO_OUT_W1TC_REG (*(volatile uint32_t *)0x3FF4400C)
#define GPIO_ENABLE_W1TS_REG (*(volatile uint32_t *)0x3FF44024)
#define GPIO_ENABLE_W1TC_REG (*(volatile uint32_t *)0x3FF44028)
#define TIMG0_WDCONFIG0_REG (*(volatile uint32_t *)0x3FF5F048)
#define RTC_CNTL_WDTCONFIG0_REG (*(volatile uint32_t *)0x3FF4808C)

typedef void (*void_fn)(void);

void jump_to_app(void) {

  void_fn app_addr = *(void_fn *)(MAIN_APP_ADDR + 4);

  // wait, then turn the LED off
  int count = 8000000;
  while (count--) {
    __asm__("nop");
  }
  GPIO_OUT_W1TC_REG |= (1 << 2);

  app_addr();
}

int main(void) {

  // disable watchdog
  TIMG0_WDCONFIG0_REG &= ~(1 << 14);
  RTC_CNTL_WDTCONFIG0_REG &= ~(1 << 10);

  GPIO_ENABLE_W1TS_REG = (1 << 2);

  // wait, then turn the LED on
  int count = 8000000;
  while (count--) {
    __asm__("nop");
  }
  GPIO_OUT_W1TS_REG |= (1 << 2);
  jump_to_app();

  // Not reached: the LED never blinks at this faster (half-delay) rate
  while (1) {
    int count = 4000000;
    while (count--) {
      __asm__("nop");
    }
    GPIO_OUT_W1TS_REG = (1 << 2);
    count = 4000000;
    while (count--) {
      __asm__("nop");
    }
    GPIO_OUT_W1TC_REG = (1 << 2);
  }

  return 0;
}

Craig Estey · Accepted Answer

Caveat: This is all based on a cursory examination of the docs, but I'll hazard a guess or two ...

The post your main.ld came from talks about hooking up a JTAG cable and using openocd so you can use gdb. This is highly recommended, as you could single-step your bootloader and your main program. IMO, hooking up JTAG and the debugger is well worth the extra time/effort.

Now, on to your problem ...

My first clue was from page 30, the table in 1.3.3 External Memory:

Table 1-4. External Memory Address Mapping (ESP32 TRM, Version 5.3; Low/High are boundary addresses)

Bus Type     Low Address  High Address  Size      Target          Comment
Data         0x3F40_0000  0x3F7F_FFFF   4 MB      External Flash  Read
Data         0x3F80_0000  0x3FBF_FFFF   4 MB      External SRAM   Read and Write
Instruction  0x400C_2000  0x40BF_FFFF   11512 KB  External Flash

This means that external data memory is in the 0x3F400000-0x3F7FFFFF (flash) and 0x3F800000-0x3FBFFFFF (SRAM) ranges. But, instruction memory is in the 0x400C2000-0x40BFFFFF range.

Since I looked at this first, it is where I surmised that the ESP32 has separate instruction and data buses at two different address ranges (Harvard architecture). My guess is that this allows instructions and data to be fetched simultaneously (in parallel).

But, then, I looked at section 1.3.1 Address Mapping. From that, we have:

Addresses below 0x4000_0000 are serviced using the data bus. Addresses in the range 0x4000_0000 ~ 0x4FFF_FFFF are serviced using the instruction bus. Finally, addresses over and including 0x5000_0000 are shared by the data and instruction bus.

This states it more plainly and seems to confirm my original guess.
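A minimal sketch of that rule in C, handy for checking by hand which bus an address from a linker script or a crash log belongs to. The names bus_t and bus_for_address are made up for illustration; the three ranges are the ones quoted from section 1.3.1.

```c
#include <stdint.h>

/* Which bus services an address, per the TRM text quoted above. */
typedef enum { BUS_DATA, BUS_INSTRUCTION, BUS_SHARED } bus_t;

/* Hypothetical helper: classify a 32-bit ESP32 address by bus. */
bus_t bus_for_address(uint32_t addr)
{
    if (addr < 0x40000000u)
        return BUS_DATA;          /* below 0x4000_0000 */
    if (addr <= 0x4FFFFFFFu)
        return BUS_INSTRUCTION;   /* 0x4000_0000 ~ 0x4FFF_FFFF */
    return BUS_SHARED;            /* 0x5000_0000 and above */
}
```

By this rule, 0x3F400000 (external flash, data view) lands on the data bus, while 0x400C2000 and 0x40080000 land on the instruction bus.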

The blog shows the output of the linker:

400794c4 T bootloader_utility_get_selected_boot_partition
400795dc T bootloader_utility_load_boot_image
4007935c T bootloader_utility_load_partition_table
3fff0018 A _bss_end
3fff0000 A _bss_start
40078658 T __bswapsi2
400095e0 A cache_flash_mmu_set_rom
40009a14 A Cache_Flush_rom
40009ab8 A Cache_Read_Disable_rom
40009a84 A Cache_Read_Enable_rom
40080764 T call_start_cpu0
         U call_user_start_cpu0
4005cfec A crc32_le
3ff96350 A __ctype_ptr__
3fff0018 d current_read_mapping
3fff001c A _data_end
3fff0018 A _data_start
40079b64 t debug_log_hash

Notice that all the T (text/code) symbols are at 0x400xxxxx addresses.
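To check a whole listing like this mechanically, something along these lines could work. It is only a sketch: nm_line_is_code_on_ibus is a hypothetical helper, it assumes the three-column "address type name" layout shown above, and it treats both T and t as text symbols.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if the line describes a text symbol ('T' or 't') whose
 * address lies in the instruction-bus range 0x4000_0000..0x4FFF_FFFF.
 * Returns 0 for any other line, including lines without an address. */
int nm_line_is_code_on_ibus(const char *line)
{
    unsigned long addr;
    char type;
    char name[128];

    /* Expect: <hex address> <type letter> <symbol name> */
    if (sscanf(line, "%lx %c %127s", &addr, &type, name) != 3)
        return 0;
    if (type != 'T' && type != 't')
        return 0;
    return addr >= 0x40000000ul && addr <= 0x4FFFFFFFul;
}
```

Lines with no address, such as the U (undefined) entry, fail to parse and are rejected, and A/d symbols are rejected by the type check.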

The blog's .ld file is:

/*
 * GNU linker script for Espressif ESP32
 */

/* Default entry point */
ENTRY( call_start_cpu0 );

/* Specify main memory areas */
MEMORY
{
  /* Use values from the ESP-IDF 'bootloader' component. */
  /* TODO: Use human-readable lengths */
  /* TODO: Use the full memory map - this is just a test */
  iram_seg ( RX )       : ORIGIN = 0x40080400, len = 0xFC00
  dram_seg ( RW )       : ORIGIN = 0x3FFF0000, len = 0x1000
}

/* Define output sections */
SECTIONS {
  /* The program code and other data goes into Instruction RAM */
  .iram.text :
  {
    . = ALIGN(16);
    KEEP(*(.entry.text))
    *(.text)
    *(.text*)
    KEEP (*(.init))
    KEEP (*(.fini))
    *(.rodata)
    *(.rodata*)

    . = ALIGN(4);
    _etext = .;
  } >iram_seg

  /* Initialized data goes into Data RAM */
  _sidata = .;
  .data : AT(_sidata)
  {
    . = ALIGN(4);
    _sdata = .;
    *(.data)
    *(.data*)

    . = ALIGN(4);
    _edata = .;
  } >dram_seg

  /* Uninitialized data also goes into Data RAM */
  .bss :
  {
    . = ALIGN(4);
    _sbss = .;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;
  } >dram_seg

  . = ALIGN(4);
  PROVIDE ( end = . );
  PROVIDE ( _end = . );
}

Notice that it specifies two memory regions: iram_seg and dram_seg. And, the code is placed in iram_seg.

Now, for your main.ld:


MEMORY
{
  ext_flash (R) : ORIGIN = 0x3F400000, len = 0x400000
  ext_ram (RWX) : ORIGIN = 0x3F800000, len = 0x400000
}

/* Define output sections */
SECTIONS {
  /* The program code and other data goes into Instruction RAM */
    .text : {
    *(.text)
    *(.text*)
  } > ext_flash

You are putting the code at 0x3f400000 with ext_flash.

AFAICT, the 0x3Fxxxxxx range is for data and not code. So, I think you need to adjust your file to use 0x4xxxxxxx as specified by the datasheet and the blog pages.

After rereading your question, I think you already noticed a bit of this.

As to your compiler warnings, I think your code can be simplified.

The main issue is using uint32_t. On a 32-bit machine it is the same size as a pointer (e.g. void *), but it is an integer type, so the code is [effectively] doing extra casts through non-address types. Because the cell at MAIN_APP_ADDR is constant/fixed, I think we can drop the volatile.

Side note: I prefer typedef directive(s) to be file scoped. There's little benefit to function scoping these (as they can only be used in the one function).

#include <stdint.h>

#define MAIN_APP_ADDR 0x10000

// NOTE: this defines a _pointer_ to the function, so:
//   void_fn * is an _indirect_ pointer to a function pointer
typedef void (*void_fn)(void);

static void
jump_to_app(void)
{

    void_fn *rst_entry = (void_fn *) (MAIN_APP_ADDR + 4);
    void_fn app_addr = *rst_entry;

    app_addr();
}

static void
jump_to_app_shorter(void)
{

    void_fn app_addr = *(void_fn *) (MAIN_APP_ADDR + 4);

    app_addr();
}
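A possible reason the + 4 matters: in the image format that esptool.py's elf2image produces, the file starts with a magic byte 0xE9 and the 32-bit entry address sits at byte offset 4. Whether main.bin was built that way is not stated in the question, so take this as an assumption. The helper name image_entry_addr is hypothetical, and the sketch runs on a host against a byte buffer rather than against flash.

```c
#include <stddef.h>
#include <stdint.h>

#define ESP_IMAGE_MAGIC 0xE9u  /* first byte of an ESP image header */

/* Reads the little-endian 32-bit entry address stored at byte offset 4.
 * Returns 0 on success, -1 if the buffer is shorter than 8 bytes or the
 * magic byte does not match. */
int image_entry_addr(const uint8_t *img, size_t len, uint32_t *entry)
{
    if (len < 8 || img[0] != ESP_IMAGE_MAGIC)
        return -1;

    *entry = (uint32_t)img[4]
           | ((uint32_t)img[5] << 8)
           | ((uint32_t)img[6] << 16)
           | ((uint32_t)img[7] << 24);
    return 0;
}
```

Checking the magic byte first would at least tell a bootloader whether the word at + 4 is an entry address at all. An entry address is also only callable once the code it points to has actually been loaded or mapped at that address.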

UPDATE:

"Your advice is like gold. Thank you. I was able to create a screen session pointing to the esp32 to read the error."

You're welcome. The good news is that your bootloader is running. But 0x10000 is not a valid address.

Side note: For future reference, for error messages, it's much better to add them as a code block at the bottom of your question. It preserves the newlines. I've tried to reconstruct it:

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x40080000,len:84 entry 0x40080010
Fatal exception (28): LoadProhibited epc1=0x4008004a, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00010000, depc=0x00000000

"So I guess the 0x10000 is just an offset and not a pure address, or I am messing up the initial configuration."

Yes, excvaddr shows this. But, your bootloader was loaded starting at 0x40080000 with entry point at 0x40080010. The address of the instruction that borked was: 0x4008004a, so this lines up with your [adjusted] boot code.
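The "lines up" check can be written down as a small range test. This is a sketch only: addr_in_segment is a made-up helper, and reading len:84 in the log as a decimal byte count is my assumption.

```c
#include <stdint.h>

/* Returns 1 if addr falls inside [base, base + len), else 0. */
int addr_in_segment(uint32_t base, uint32_t len, uint32_t addr)
{
    /* The first test guards the unsigned subtraction in the second. */
    return addr >= base && (addr - base) < len;
}
```

With base 0x40080000 and len 84, both the entry point 0x40080010 and the faulting epc1 0x4008004a are inside the loaded segment, while excvaddr 0x00010000 is nowhere near it.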

But, you may not need a custom bootloader at all. See the ESP32 doc: Bootloader

I think you can use Espressif's standard bootloader just fine. Or, in the doc it refers to examples of how to customize it (or replace it) in subdirs of their github repo.

This would be the usual first step. Because what you really want to do [initially, at least] is get your app_main to run. When it's started, you could [then] choose to use ESP "helper" APIs (HAL, IDF, etc.). Or, just do full baremetal at that point. Later, you can write your own loader. Or, just follow the standard bootloader's code exactly.

From the link I just cited, there is a link: Application Startup Flow

It has a wealth of info about the entire startup process, including what each stage of the bootloader does (has to do).

From that, it appears that the ESP32 has at least two CPUs. It talks about a "PRO CPU" and "APP CPU". Apparently, the bootloader runs in PRO and starts up the second core:

When running system initialization, the code on PRO CPU sets the entry point for APP CPU, de-asserts APP CPU reset, and waits for a global flag to be set by the code running on APP CPU, indicating that it has started.

I'm not sure where you got 0x10000. The boot docs talk about 0x1000 or 0x8000. But, to start with, when building your app_main, I'd find a way to get the linker to spit out a loadmap. Essentially, it's similar to what nm would output.

The ESP-IDF GitHub repo is: https://github.com/espressif/esp-idf/tree/v5.4.1
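If 0x10000 is the write_flash offset, as the esptool.py command in the question suggests, then it is a position inside the flash chip and not a CPU address. The ESP32 maps external flash into the address space in 64 KB pages, so a sketch of the arithmetic looks like this. The function names are hypothetical, and whether a given page is actually mapped at a given CPU address depends on the MMU/cache setup, which nothing in this thread establishes.

```c
#include <stdint.h>

#define FLASH_MMU_PAGE_SIZE 0x10000u  /* 64 KB flash MMU page */

/* Index of the 64 KB flash page that contains the given flash offset. */
uint32_t flash_page_index(uint32_t flash_offset)
{
    return flash_offset / FLASH_MMU_PAGE_SIZE;
}

/* Byte position of the given flash offset inside its page. */
uint32_t flash_page_offset(uint32_t flash_offset)
{
    return flash_offset % FLASH_MMU_PAGE_SIZE;
}
```

By this arithmetic, 0x1000 and 0x8000 both sit in page 0, while 0x10000 is the first byte of page 1. So reading the bootloader's own image and reading the app image would need two different pages to be mapped.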

It's cited in the "flow" doc, and has example app_main, etc. The example does use the HAL layer. But, the key point is that it gives you a framework to work from. (e.g.) It shows the order in which devices should be initialized. You just write your own code instead of calling a HAL* API function. Either hack it up or do your own from scratch, using the example as a loose guide.

General advice: For projects like these, one of the most common errors is an "error of omission". That is, not knowing that one needs to do a certain thing (e.g. configure the interrupt controller first). So, I'd read both docs cover to cover at least once (I usually do twice). With this, we're looking for "breadth" of knowledge more so than "depth".

An example of this: I learned about the PRO vs APP cpu. At present, that's all I know. Later, as I'm proceeding step-by-step through the details, I have the full roadmap in my head [and written notes ;-)]. So, if I wrote app_main from scratch and it started up, I'd know that somehow I need to set a bit to tell the code in PRO cpu that I've at least started. That's the sort of "hidden" information that presents a "gotcha" that can take a lot of time to debug on the running H/W if we don't know about it.

Another thing: What about simulation? https://wokwi.com/esp32 It also has a VS Code plugin that allows use of your real binaries: https://marketplace.visualstudio.com/items?itemName=Wokwi.wokwi-vscode Debugging [untested] code under a [good] simulator can speed things up. It can often catch gross errors more quickly than trying to debug directly on real H/W. Then, when the code runs cleanly under simulation, the amount of work to get it up on physical H/W is often [greatly] reduced.

Tags: c, embedded, bare-metal, esp32

Asked by Michele Del Grosso · 1 answer, by Craig Estey
