I am using a STM32L476RG board and HAL SPI functions:
HAL_SPI_Transmit(&hspi2, &ReadAddr, 1, HAL_MAX_DELAY);
HAL_SPI_Receive(&hspi2, pBuffer, 4, HAL_MAX_DELAY);
I need to receive data from accelerometer's buffer with maximum speed and I have a problem with delay in these functions. As you can see on the oscilloscope screenshots, there are several microseconds during which nothing happens. I have no idea how to minimize the transmission gap.

I tried using HAL_SPI_Receive_DMA function and this delay was even bigger. Do you have any idea how to solve this problem using HAL functions or any pointers on how I could write my SPI function without these delays?
TL;DR Don't use HAL, write your transfer functions using the Reference Manual.
HAL is hopelessly overcomplicated for time-critical tasks (among others). Just look at the HAL_SPI_Transmit() function, it's over 60 lines of code till it gets to actually touching the Data Register. HAL will first mark the port access structure as busy even when there is no multitasking OS in sight, validates the function parameters, stores them in the hspi structure for no apparent reason, then goes on figuring out what mode SPI is in, etc. It's not necessary to check timeouts in SPI master mode either, because master controls all bus timings, if it can't get out a byte in a finite amount of time, then the port initialization is wrong, period.
Without HAL, it's a lot simpler. First, figure out what should go into the control registers, set CR1 and CR2 accordingly.
void SPIx_Init() {
    /* full duplex master, 8 bit transfer, default phase and polarity */
    SPIx->CR1 = SPI_CR1_MSTR | SPI_CR1_SPE | SPI_CR1_SSM | SPI_CR1_SSI;
    /* Disable receive FIFO, it'd complicate things when there is an odd number of bytes to transfer */
    SPIx->CR2 = SPI_CR2_FRXTH;
}
This initialization assumes that Slave Select (NSS or CS#) is handled by separate GPIO pins. If you want CS# managed by the SPI peripheral, then look up Slave select (NSS) pin management in the Reference Manual.
Note that a full duplex SPI connection can not just transmit or receive, it always does both simultaneously. If the slave expects one command byte, and answers with four bytes of data, that's a 5-byte transfer, the slave will ignore the last 4 bytes, the master should ignore the first one.
A very simple transfer function would be
void SPIx_Transfer(uint8_t *outp, uint8_t *inp, int count) {
    while(count--) {
        while(!(SPIx->SR & SPI_SR_TXE))
            ;
        *(volatile uint8_t *)&SPIx->DR = *outp++;
        while(!(SPIx->SR & SPI_SR_RXNE))
            ;
        *inp++ = *(volatile uint8_t *)&SPIx->DR;
    }
}
It can be further optimized when needed, by making use of the SPI fifo, interleaving writes and reads so that the transmitter is always kept busy.
If speed is critical, don't use generalized functions, or make sure they can be inlined when you do. Use a compiler with link-time optimization enabled, and optimize for speed (quite obviously).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With