Porting Optimizations

Note

In this section, we provide general guidelines that can be followed to optimize MCU performance. The examples shown are specific to the STM32G474 MCU series from STMicroelectronics but could be adapted to other MCU families as well. The reader should be familiar with the concepts of linker script, startup script and memory layout of the target MCU.

1 - Introduction

Running an MCU at high clock speeds can come with drawbacks depending on where the code is executed from. Code located in flash memory takes longer to execute because flash has a slower access time and therefore requires more wait states. The same code can be executed from a faster memory, such as RAM, with fewer or no wait states. It is therefore necessary to partition the code so that performance is optimized for the application's requirements. The SPARK Wireless Core is interrupt driven and should execute as fast as possible to leave the application as much processing time as possible. This section describes how to configure the linker script and the startup script to perform such optimizations.

2 - Memory Layout

Many MCUs include two banks of RAM: the standard SRAM and a bank of core-coupled memory (a.k.a. CCM RAM for STM devices or Fast RAM for others). The CCM RAM is faster than the standard SRAM and is usually smaller. Code execution time can be reduced by moving code from flash memory to the CCM RAM. To do so, the linker script file must be edited manually to assign code to these specific memory regions.

Memory layout

Figure 56: Example memory layout (taken from the STM32G474 reference manual).

3 - Modifying the Linker Script

The MEMORY portion of the linker script tells the linker about the memory spaces available to it within the device. Two new entries (ISRCCMRAM, CCMRAM) have been added to define the CCMRAM region. The purpose of the first entry is to map the interrupt vector table at the beginning of the CCMRAM area. The second entry is used to map user functions.

Note

For clarity, only the relevant code sections are shown.

MEMORY
{
   ISRCCMRAM (xrw): ORIGIN = 0x10000000, LENGTH = 0x1D8 /* 472 */
   CCMRAM (xrw)   : ORIGIN = 0x100001D8, LENGTH = 0x7E28 /* 32296 */
   RAM    (xrw)   : ORIGIN = 0x20000000, LENGTH = 0x18000 /* 96K */
   FLASH  (rx)    : ORIGIN = 0x08000000, LENGTH = 0x80000 /* 512K */
}
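
The two CCMRAM region sizes above can be cross-checked with a little arithmetic. The sketch below is a host-side C check, not part of the port. It assumes that the vector table holds 16 core exception slots plus 102 device interrupt vectors (which is what the 472-byte ISRCCMRAM length suggests) and that the CCM SRAM bank is 32 KB in total. All macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the MEMORY block of the linker script. */
#define ISRCCMRAM_ORIGIN 0x10000000u
#define ISRCCMRAM_LENGTH 0x1D8u
#define CCMRAM_ORIGIN    0x100001D8u
#define CCMRAM_LENGTH    0x7E28u

/* Assumptions (not stated in this section): 16 core exception slots,
 * 102 device interrupt vectors and a 32 KB CCM SRAM bank. */
#define CORE_VECTORS     16u
#define DEVICE_IRQS      102u
#define CCM_TOTAL_SIZE   (32u * 1024u)

/* Size in bytes of a vector table made of 32-bit entries. */
uint32_t vector_table_size(uint32_t core_vectors, uint32_t device_irqs)
{
    return (core_vectors + device_irqs) * (uint32_t)sizeof(uint32_t);
}

/* Returns 1 when the vector table fits ISRCCMRAM exactly, the two
 * regions are contiguous and together they fill the whole bank. */
int ccm_regions_are_consistent(void)
{
    int table_fits = vector_table_size(CORE_VECTORS, DEVICE_IRQS) == ISRCCMRAM_LENGTH;
    int contiguous = (ISRCCMRAM_ORIGIN + ISRCCMRAM_LENGTH) == CCMRAM_ORIGIN;
    int fills_bank = (ISRCCMRAM_LENGTH + CCMRAM_LENGTH) == CCM_TOTAL_SIZE;

    return table_fits && contiguous && fills_bank;
}

/* Aborts if the layout is inconsistent; handy as a unit-test hook. */
void check_ccm_layout(void)
{
    assert(ccm_regions_are_consistent());
}
```

When porting to another MCU, the same check can be repeated with that device's interrupt count and fast RAM size before editing the ORIGIN and LENGTH values.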

References to the start and end of these sections are needed because the code must be copied from flash memory to CCMRAM on boot. The linker script defines the following symbols for this purpose:

  • _isr_vector_start

  • _isr_vector_end

  • _ccmram_start

  • _ccmram_end

These are defined inside the SECTIONS command of the linker script file.

SECTIONS
{
   ...
   /* The vector table runs from ISRCCMRAM; its load image is stored in FLASH */
   .isr_vector :
   {
      . = ALIGN(4);
      _isr_vector_start = .;
      KEEP(*(.isr_vector)) /* Interrupt vector table */
      . = ALIGN(4);
      _isr_vector_end = .;
   } >ISRCCMRAM AT> FLASH

   .ccmram :
   {
      . = ALIGN(4);
      _ccmram_start = .;            /* create a global symbol at CCMRAM start */
      INCLUDE ccmram_sections.ld    /* user defined ccmram sections */
      . = ALIGN(4);
      _ccmram_end = .;              /* define a global symbol at the end of the used CCMRAM section */
   } >CCMRAM AT> FLASH
   ...
}

Note that the line “INCLUDE ccmram_sections.ld” refers to a file where the user can list the functions to be placed in CCMRAM. See Adding User Functions to CCMRAM.

The code is initially stored in flash memory and must be copied to CCMRAM, where it will be executed. The copy routine therefore also needs the flash memory (load) addresses of these sections. The linker script provides them through the following symbols:

  • _ccmram_loadaddr

  • _isr_vector_loadaddr

They are defined inside the SECTIONS command of the linker script.

SECTIONS
{
   ...
   /* Used by the startup script to copy user-defined content from the "ccmram" section load address,
    * which is located in flash, into the "_ccmram_start" relocation address which is located in the
    * actual CCMRAM.
    */
   _ccmram_loadaddr = LOADADDR(.ccmram);
   /* Used by the C program to copy the vector table from the "isr_vector" section load address,
    * which is located in flash, into the relocation address which is located in the actual CCMRAM.
    * This is optional as the relocation won't take effect until the SCB->VTOR register is adjusted.
    */
   _isr_vector_loadaddr = LOADADDR(.isr_vector);
   ...
}

4 - Copying from Flash to CCMRAM

On boot, the reset handler copies the data segment initializers (the initial values of global and static variables) from flash to SRAM. It then copies the user functions listed in “ccmram_sections.ld” from flash to CCMRAM and zero-fills the bss segment.

.section     .text.Reset_Handler
.weak        Reset_Handler
.type        Reset_Handler, %function
Reset_Handler:
ldr   r0, =_estack
mov   sp, r0          /* set stack pointer */

/* Copy the data segment initializers from flash to SRAM */
ldr r0, =_sdata
ldr r1, =_edata
ldr r2, =_sidata
movs r3, #0
b    LoopCopyDataInit

CopyDataInit:
ldr r4, [r2, r3]
str r4, [r0, r3]
adds r3, r3, #4

LoopCopyDataInit:
adds r4, r0, r3
cmp r4, r1
bcc CopyDataInit

/* Copy the user functions listed in ccmram_sections.ld from flash to CCMRAM */
ldr r0, =_ccmram_start
ldr r1, =_ccmram_end
ldr r2, =_ccmram_loadaddr
movs r3, #0
b    LoopCopyCCMRAMfunctions

CopyCCMRAMfunctions:
ldr r4, [r2, r3]
str r4, [r0, r3]
adds r3, r3, #4

LoopCopyCCMRAMfunctions:
adds r4, r0, r3
cmp r4, r1
bcc CopyCCMRAMfunctions

/* Zero fill the bss segment. */
ldr r2, =_sbss
ldr r4, =_ebss
movs r3, #0
b LoopFillZerobss

FillZerobss:
str  r3, [r2]
adds r2, r2, #4

LoopFillZerobss:
cmp r2, r4
bcc FillZerobss

/* Call the clock system initialization function.*/
   bl  SystemInit
/* Call static constructors */
   bl __libc_init_array
/* Call the application's entry point.*/
   bl        main
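
For readers less familiar with Thumb assembly, the copy and zero-fill loops above behave like the following C functions. This is a minimal host-side sketch for illustration only: the function names and the “fake_flash” and “fake_ccmram” arrays are hypothetical stand-ins for the linker symbols, and the actual startup code stays in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy 32-bit words from a load address into the range [dst, dst_end),
 * mirroring the CopyDataInit and CopyCCMRAMfunctions loops. */
void copy_words(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Zero the range [dst, dst_end), mirroring the FillZerobss loop. */
void zero_words(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}

/* Hypothetical stand-ins: on the target these would be the memory at
 * _ccmram_loadaddr (flash) and at _ccmram_start.._ccmram_end (CCMRAM). */
static const uint32_t fake_flash[4] = {
    0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u
};
static uint32_t fake_ccmram[4];

/* Runs the copy on the stand-in arrays and returns how many words match. */
size_t demo_copy(void)
{
    size_t matches = 0;

    copy_words(fake_ccmram, fake_ccmram + 4, fake_flash);
    for (size_t i = 0; i < 4; i++) {
        if (fake_ccmram[i] == fake_flash[i]) {
            matches++;
        }
    }
    assert(matches <= 4);
    return matches;
}
```

Both loops work on whole 32-bit words, which is why the linker script aligns the start and end of each section with “. = ALIGN(4)”.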

The reset handler calls a system initialization function before running the main program (see previous code snippet). This function, SystemInit, is a C function that relocates the interrupt vector table using the symbols defined in the linker script. This is shown below.

/* Symbols defined in the linker script */
extern uint32_t _isr_vector_start;
extern uint32_t _isr_vector_end;
extern uint32_t _isr_vector_loadaddr;

void SystemInit(void)
{
   uint32_t *ptr_isr_vector_start = &_isr_vector_start;
   uint32_t *ptr_isr_vector_loadaddr = &_isr_vector_loadaddr;
   /* FPU settings ------------------------------------------------------------*/
#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
   SCB->CPACR |= ((3UL << (10 * 2)) | (3UL << (11 * 2)));  /* set CP10 and CP11 Full Access */
#endif
   /* Configure the Vector Table location add offset address -----------------*/
#if defined(VECT_TAB_SRAM)
   SCB->VTOR = SRAM_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal SRAM */
#elif defined(VECT_TAB_FLASH)
   SCB->VTOR = FLASH_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal FLASH */
#else
   /* Move the isr vector to CCMRAM-------------------------------------------*/
   while (ptr_isr_vector_start < &_isr_vector_end) {
      *ptr_isr_vector_start++ = *ptr_isr_vector_loadaddr++;
   }
   SCB->VTOR = CCMSRAM_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in CCM SRAM */
#endif
}
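
When VECT_TAB_OFFSET or the ISRCCMRAM origin is changed, the new table address must still satisfy the Cortex-M alignment rule for SCB->VTOR: the table must be aligned to its size in bytes rounded up to the next power of two, with a minimum of 128 bytes. The sketch below is a host-side C check with hypothetical function names. The 118-entry count used in the usage note is derived from the 472-byte ISRCCMRAM length (472 / 4) and should be confirmed against the reference manual of the target MCU.

```c
#include <assert.h>
#include <stdint.h>

/* Required vector table alignment in bytes for a table of `num_entries`
 * 32-bit vectors: the table size rounded up to a power of two, with a
 * minimum of 128 bytes (32 words). */
uint32_t vtor_alignment(uint32_t num_entries)
{
    uint32_t align = 128u;

    while (align < num_entries * 4u) {
        align <<= 1;
    }
    return align;
}

/* Returns 1 when `address` is suitably aligned to hold the table. */
int vtor_address_is_valid(uint32_t address, uint32_t num_entries)
{
    assert(num_entries > 0u);
    return (address & (vtor_alignment(num_entries) - 1u)) == 0u;
}
```

With 118 entries the required alignment is 512 bytes, so the start of the CCM SRAM (0x10000000) is a valid table address, whereas an address such as 0x100001D8 would not be.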

5 - Adding User Functions to CCMRAM

When a file is included in the linker script, the linker must be told where to find it by adding the linker flag “-L” followed by the path of the directory that contains the file. For more information, see the INCLUDE directive in the GNU LD Documentation’s Options Commands section.

Functions are relocated to CCMRAM by specifying the input file, the section and/or the function names, as described in the GNU LD Documentation’s Section Placement section.

Here are some examples:

  • Relocate the content of the “text” section from files in the WPS library (libwps.a), excluding the wps.c file (wps.c.obj).

libwps.a:(EXCLUDE_FILE(wps.c.obj) .text)

  • Relocate, from all files, any “text” section whose name contains the word “radio” (note the wildcard “*” at the beginning).

*(.text.*radio*)
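
To get a feel for which sections a pattern such as “.text.*radio*” selects, the sketch below uses the POSIX fnmatch function as an approximation of the linker's shell-style wildcard matching. It is a host-side illustration only, not how the linker is implemented. The function name “section_matches” and the section names in the usage note are hypothetical, and the example assumes the code is compiled with -ffunction-sections so that each function is placed in its own “.text.<function name>” section.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 when the section name matches the shell-style pattern.
 * fnmatch() is used here as a stand-in for the linker's own matcher. */
int section_matches(const char *pattern, const char *section)
{
    assert(pattern != NULL && section != NULL);
    return fnmatch(pattern, section, 0) == 0;
}
```

For example, section_matches(".text.*radio*", ".text.wps_radio_irq") returns 1, whereas a section named ".text.main" does not match and would stay in flash. The generated map file remains the authoritative way to confirm where each function was finally placed.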