Can someone please explain how the Arduino bootloader works? I'm not looking for a high level answer here, I've read the code and I get the gist of it.
There's a bunch of protocol interaction that happens between the Arduino IDE and the bootloader code, ultimately resulting in a number of inline assembly instructions that self-program the flash with the program being transmitted over the serial interface.
What I'm not clear on is on line 270:
void (*app_start)(void) = 0x0000;  ...which I recognize as the declaration, and initialization to NULL, of a function pointer. There are subsequent calls to app_start in places where the bootloader is intended to delegate to execution of the user-loaded code.
Surely, somehow app_start needs to get a non-NULL value at some point for this to all come together. I'm not seeing that in the bootloader code... is it magically linked by the program that gets loaded by the bootloader? I presume that main of the bootloader is the entry point into software after a reset of the chip. 
Wrapped up in the 70 or so lines of assembly must be the secret decoder ring that tells the main program where app_start really is? Or perhaps it's some implicit knowlege being taken advantage of by the Arduino IDE? All I know is that if someone doesn't change app_start to point somewhere other than 0, the bootloader code would just spin on itself forever... so what's the trick?
Edit
I'm interested in trying to port the bootloader to an Tiny AVR that doesn't have separate memory space for boot loader code. As it becomes apparent to me that the bootloader code relies on certain fuse settings and chip support, I guess what I'm really interested in knowing is what does it take to port the bootloader to a chip that doesn't have those fuses and hardware support (but still has self-programming capability)?
The bootloader is the little program that runs when you turn the Arduino on, or press the reset button. Its main function is to wait for the Arduino software on your computer to send it a new program for the Arduino, which it then writes to the memory on the Arduino.
So while it's entirely possible to program many Arduino boards without using a bootloader, you would have to tie up one USB port on your computer for the programmer and another for serial communication (if you are using serial communication). You would also need separate hardware for both of these.
Arduino uses avr microcontrollers for their platforms which has program memory sections as shown in above figure. Boot Loader section is placed at the bottom of flash memory. The bootloader program is written in the bootloader section, and the application program is written in the application section.
Address 0 does not a null pointer make. A "null pointer" is something more abstract: a special value that applicable functions should recognize as being invalid. C says the special value is 0, and while the language says dereferencing it is "undefined behavior", in the simple world of microcontrollers it usually has a very well-defined effect.
Normally, on reset, the AVR's program counter (PC) is initialized to 0, thus the microcontroller begins executing code at address 0.
However, if the Boot Reset Fuse ("BOOTRST") is set, the program counter is instead initialized to an address of a block at the upper end of the memory (where that is depends on how the fuses are set, see a datasheet (PDF, 7 MB) for specifics). The code that begins there can do anything—if you really wanted you could put your own program there if you use an ICSP (bootloaders generally can't overwrite themselves).
Often though, it's a special program—a bootloader—that is able to read data from an external source (often via UART, I2C, CAN, etc.) to rewrite program code (stored in internal or external memory, depending on the micro). The bootloader will typically look for a "special event" which can literally be anything, but for development is most conveniently something on the data bus it will pull the new code from. (For production it might be a special logic level on a pin as it can be checked nearly-instantly.) If the bootloader sees the special event, it can enter bootloading-mode, where it will reflash the program memory, otherwise it passes control off to user code.
As an aside, the point of the bootloader fuse and upper memory block is to allow the use of a bootloader with no modifications to the original software (so long as it doesn't extend all the way up into the bootloader's address). Instead of flashing with just the original HEX and desired fuses, one can flash the original HEX, bootloader, and modified fuses, and presto, bootloader added.
Anyways, in the case of the Arduino, which I believe uses the protocol from the STK500, it attempts to communicate over the UART, and if it gets either no response in the allotted time:
uint32_t count = 0; while(!(UCSRA & _BV(RXC))) { // loops until a byte received     count++;     if (count > MAX_TIME_COUNT) // 4 seconds or whatever         app_start(); } or if it errors too much by getting an unexpected response:
if (++error_count == MAX_ERROR_COUNT)     app_start(); It passes control back to the main program, located at 0.  In the Arduino source seen above, this is done by calling app_start();, defined as void (*app_start)(void) = 0x0000;.
Because it's couched as a C function call, before the PC hops over to 0, it will push the current PC value onto the stack which also contains other variables used in the bootloader (e.g. count and error_count from above). Does this steal RAM from your program?  Well, after the PC is set to 0, the operations that are executed blatantly "violate" what a proper C function (that would eventually return) should do.  Among other initialization steps, it resets the stack pointer (effectively obliterating the call stack and all local variables), reclaiming RAM.  Global/static variables are initialized to 0, the address of which can freely overlap with whatever the bootloader was using because the bootloader and user programs were compiled independently.
The only lasting effects from the bootloader are modifications to hardware (peripheral) registers, which a good bootloader won't leave in a detrimental state (turning on peripherals that might waste power when you try to sleep). It's generally good practice to also fully initialize peripherals you will use, so even if the bootloader did something strange you'll set it how you want.
On ATtinys, as you mentioned, there is no luxury of the bootloader fuses or memory, so your code will always start at address 0.  You might be able to put your bootloader into some higher pages of memory and point your RESET vector at it, then whenever you receive a new hex file to flash with, take the command that's at address 0:1, replace it with the bootloader address, then store the replaced address somewhere else to call for normal execution.  (If it's an RJMP ("relative jump") the value will obviously need to be recalculated)
Edit
I'm interested in trying to port the bootloader to an Tiny AVR that doesn't have separate memory space for boot loader code. As it becomes apparent to me that the bootloader code relies on certain fuse settings and chip support, I guess what I'm really interested in knowing is what does it take to port the bootloader to a chip that doesn't have those fuses and hardware support (but still has self-programming capability)?
Depending on your ultimate goal it may be easier to just create your own bootloader rather than try to port one. You really only need to learn a few items for that part.
1) uart tx
2) uart rx
3) self-flash programming
Which can be learned separately and then combined into a bootloader. You will want a part that you can use spi or whatever to write the flash, so that if your bootloader doesnt work or whatever the part came with gets messed up you can still continue development.
Whether you port or roll your own you will still need to understand those three basic things with respect to that part.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With