ARM Trusted Firmware Design
===========================

Contents :

1.  [Introduction](#1--introduction)
2.  [Cold boot](#2--cold-boot)
3.  [EL3 runtime services framework](#3--el3-runtime-services-framework)
4.  [Power State Coordination Interface](#4--power-state-coordination-interface)
5.  [Secure-EL1 Payloads and Dispatchers](#5--secure-el1-payloads-and-dispatchers)
6.  [Crash Reporting in BL3-1](#6--crash-reporting-in-bl3-1)
7.  [Guidelines for Reset Handlers](#7--guidelines-for-reset-handlers)
8.  [CPU specific operations framework](#8--cpu-specific-operations-framework)
9.  [Memory layout of BL images](#9-memory-layout-of-bl-images)
10. [Firmware Image Package (FIP)](#10--firmware-image-package-fip)
11. [Use of coherent memory in Trusted Firmware](#11--use-of-coherent-memory-in-trusted-firmware)
12. [Code Structure](#12--code-structure)
13. [References](#13--references)


1.  Introduction
----------------

The ARM Trusted Firmware implements a subset of the Trusted Board Boot
Requirements (TBBR) Platform Design Document (PDD) [1] for ARM reference
platforms. The TBB sequence starts when the platform is powered on and runs up
to the stage where it hands off control to firmware running in the normal
world in DRAM. This is the cold boot path.

The ARM Trusted Firmware also implements the Power State Coordination Interface
([PSCI]) PDD [2] as a runtime service. PSCI is the interface from normal world
software to firmware implementing power management use-cases (for example,
secondary CPU boot, hotplug and idle). Normal world software can access ARM
Trusted Firmware runtime services via the ARM SMC (Secure Monitor Call)
instruction. The SMC instruction must be used as mandated by the [SMC Calling
Convention PDD][SMCCC] [3].

The ARM Trusted Firmware implements a framework for configuring and managing
interrupts generated in either security state.
The details of the interrupt
management framework and its design can be found in [ARM Trusted
Firmware Interrupt Management Design guide][INTRG] [4].

2.  Cold boot
-------------

The cold boot path starts when the platform is physically turned on. One of
the CPUs released from reset is chosen as the primary CPU, and the remaining
CPUs are considered secondary CPUs. The primary CPU is chosen through
platform-specific means. The cold boot path is mainly executed by the primary
CPU, other than essential CPU initialization executed by all CPUs. The
secondary CPUs are kept in a safe platform-specific state until the primary
CPU has performed enough initialization to boot them.

The cold boot path in this implementation of the ARM Trusted Firmware is divided
into five steps (in order of execution):

*   Boot Loader stage 1 (BL1) _AP Trusted ROM_
*   Boot Loader stage 2 (BL2) _Trusted Boot Firmware_
*   Boot Loader stage 3-1 (BL3-1) _EL3 Runtime Firmware_
*   Boot Loader stage 3-2 (BL3-2) _Secure-EL1 Payload_ (optional)
*   Boot Loader stage 3-3 (BL3-3) _Non-trusted Firmware_

ARM development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a
combination of the following types of memory regions. Each bootloader stage uses
one or more of these memory regions.

*   Regions accessible from both non-secure and secure states. For example,
    non-trusted SRAM, ROM and DRAM.
*   Regions accessible from only the secure state. For example, trusted SRAM and
    ROM. The FVPs also implement the trusted DRAM which is statically
    configured. Additionally, the Base FVPs and Juno development platform
    configure the TrustZone Controller (TZC) to create a region in the DRAM
    which is accessible only from the secure state.


The sections below provide the following details:

*   initialization and execution of the first three stages during cold boot
*   specification of the BL3-1 entrypoint requirements for use by alternative
    Trusted Boot Firmware in place of the provided BL1 and BL2
*   changes in BL3-1 behavior when using the `RESET_TO_BL31` option which
    allows BL3-1 to run without BL1 and BL2


### BL1

This stage begins execution from the platform's reset vector at EL3. The reset
address is platform dependent but it is usually located in a Trusted ROM area.
The BL1 data section is copied to trusted SRAM at runtime.

On the ARM FVP port, BL1 code starts execution from the reset vector at address
`0x00000000` (trusted ROM). The BL1 data section is copied to the start of
trusted SRAM at address `0x04000000`.

On the Juno ARM development platform port, BL1 code starts execution at
`0x0BEC0000` (FLASH). The BL1 data section is copied to trusted SRAM at address
`0x04001000`.

The functionality implemented by this stage is as follows.

#### Determination of boot path

Whenever a CPU is released from reset, BL1 needs to distinguish between a warm
boot and a cold boot. This is done using platform-specific mechanisms (see the
`platform_get_entrypoint()` function in the [Porting Guide]). In the case of a
warm boot, a CPU is expected to continue execution from a separate
entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe
platform-specific state (see the `plat_secondary_cold_boot_setup()` function in
the [Porting Guide]) while the primary CPU executes the remaining cold boot path
as described in the following sections.

#### Architectural initialization

BL1 performs minimal architectural initialization as follows.

*   Exception vectors

    BL1 sets up simple exception vectors for both synchronous and asynchronous
    exceptions.
    The default behavior upon receiving an exception is to populate
    a status code in the general purpose register `X0` and call the
    `plat_report_exception()` function (see the [Porting Guide]). The status
    code is one of:

        0x0 : Synchronous exception from Current EL with SP_EL0
        0x1 : IRQ exception from Current EL with SP_EL0
        0x2 : FIQ exception from Current EL with SP_EL0
        0x3 : System Error exception from Current EL with SP_EL0
        0x4 : Synchronous exception from Current EL with SP_ELx
        0x5 : IRQ exception from Current EL with SP_ELx
        0x6 : FIQ exception from Current EL with SP_ELx
        0x7 : System Error exception from Current EL with SP_ELx
        0x8 : Synchronous exception from Lower EL using aarch64
        0x9 : IRQ exception from Lower EL using aarch64
        0xa : FIQ exception from Lower EL using aarch64
        0xb : System Error exception from Lower EL using aarch64
        0xc : Synchronous exception from Lower EL using aarch32
        0xd : IRQ exception from Lower EL using aarch32
        0xe : FIQ exception from Lower EL using aarch32
        0xf : System Error exception from Lower EL using aarch32

    The `plat_report_exception()` implementation on the ARM FVP port programs
    the Versatile Express System LED register in the following format to
    indicate the occurrence of an unexpected exception:

        SYS_LED[0]   - Security state (Secure=0/Non-Secure=1)
        SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0)
        SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value
                       of the status code

    A write to the LED register reflects in the System LEDs (S6LED0..7) in the
    CLCD window of the FVP.

    BL1 does not expect to receive any exceptions other than the SMC exception.
    For the latter, BL1 installs a simple stub. The stub expects to receive
    only a single type of SMC (determined by its function ID in the general
    purpose register `X0`).
    This SMC is raised by BL2 to make BL1 pass control
    to BL3-1 (loaded by BL2) at EL3. Any other SMC leads to an assertion
    failure.

*   CPU initialization

    BL1 calls the `reset_handler()` function which in turn calls the CPU
    specific reset handler function (see the section: "CPU specific operations
    framework").

*   MMU setup

    BL1 sets up EL3 memory translation by creating page tables to cover the
    first 4GB of physical address space. This covers all the memories and
    peripherals needed by BL1.

*   Control register setup
    -   `SCTLR_EL3`. Instruction cache is enabled by setting the `SCTLR_EL3.I`
        bit. Alignment and stack alignment checking is enabled by setting the
        `SCTLR_EL3.A` and `SCTLR_EL3.SA` bits. Exception endianness is set to
        little-endian by clearing the `SCTLR_EL3.EE` bit.

    -   `SCR_EL3`. The register width of the next lower exception level is set
        to AArch64 by setting the `SCR_EL3.RW` bit.

    -   `CPTR_EL3`. Accesses to the `CPACR_EL1` register from EL1 or EL2, or the
        `CPTR_EL2` register from EL2 are configured to not trap to EL3 by
        clearing the `CPTR_EL3.TCPAC` bit. Access to the trace functionality is
        configured not to trap to EL3 by clearing the `CPTR_EL3.TTA` bit.
        Instructions that access the registers associated with Floating Point
        and Advanced SIMD execution are configured to not trap to EL3 by
        clearing the `CPTR_EL3.TFP` bit.

#### Platform initialization

BL1 enables issuing of snoop and DVM (Distributed Virtual Memory) requests from
the CCI-400 slave interface corresponding to the cluster that includes the
primary CPU. BL1 also initializes UART0 (PL011 console), which enables access to
the `printf` family of functions in BL1.

#### BL2 image load and execution

BL1 execution continues as follows:

1.  BL1 determines the amount of free trusted SRAM memory available by
    calculating the extent of its own data section, which also resides in
    trusted SRAM. BL1 loads a BL2 raw binary image from platform storage, at a
    platform-specific base address. If the BL2 image file is not present or if
    there is not enough free trusted SRAM the following error message is
    printed:

        "Failed to load boot loader stage 2 (BL2) firmware."

    If the load is successful, BL1 updates the limits of the remaining free
    trusted SRAM. It also populates information about the amount of trusted
    SRAM used by the BL2 image. The exact load location of the image is
    provided as a base address in the platform header. Further description of
    the memory layout can be found later in this document.

2.  BL1 prints the following string from the primary CPU to indicate successful
    execution of the BL1 stage:

        "Booting trusted firmware boot loader stage 1"

3.  BL1 passes control to the BL2 image at Secure EL1, starting from its load
    address.

4.  BL1 also passes information about the amount of trusted SRAM used and
    available for use. This information is populated at a platform-specific
    memory address.


### BL2

BL1 loads and passes control to BL2 at Secure-EL1. BL2 is linked against and
loaded at a platform-specific base address (more information can be found later
in this document). The functionality implemented by BL2 is as follows.

#### Architectural initialization

BL2 performs minimal architectural initialization required for subsequent
stages of the ARM Trusted Firmware and normal world software. It sets up
Secure EL1 memory translation by creating page tables to address the first 4GB
of the physical address space in a similar way to BL1. EL1 and EL0 are given
access to Floating Point & Advanced SIMD registers by setting the `CPACR.FPEN`
bits.

#### Platform initialization

BL2 copies the information regarding the trusted SRAM populated by BL1 using a
platform-specific mechanism. It calculates the limits of DRAM (main memory)
to determine whether there is enough space to load the BL3-3 image. A platform
defined base address is used to specify the load address for the BL3-1 image.
It also defines the extents of memory available for use by the BL3-2 image.
BL2 also initializes UART0 (PL011 console), which enables access to the
`printf` family of functions in BL2. Platform security is initialized to allow
access to controlled components. The storage abstraction layer, which is used
to load further bootloader images, is also initialized.

#### BL3-0 (System Control Processor Firmware) image load

Some systems have a separate System Control Processor (SCP) for power, clock,
reset and system control. BL2 loads the optional BL3-0 image from platform
storage into a platform-specific region of secure memory. The subsequent
handling of BL3-0 is platform specific. For example, on the Juno ARM development
platform port the image is transferred into SCP memory using the SCPI protocol
after being loaded in the trusted SRAM memory at address `0x04009000`. The SCP
executes BL3-0 and signals to the Application Processor (AP) for BL2 execution
to continue.

#### BL3-1 (EL3 Runtime Firmware) image load

BL2 loads the BL3-1 image from platform storage into a platform-specific address
in trusted SRAM. If there is not enough memory to load the image, or the image
is missing, this leads to an assertion failure. If the BL3-1 image loads
successfully, BL2 updates the amount of trusted SRAM used and available for use
by BL3-1. This information is populated at a platform-specific memory address.

#### BL3-2 (Secure-EL1 Payload) image load

BL2 loads the optional BL3-2 image from platform storage into a
platform-specific region of secure memory. The image executes in the secure
world. BL2 relies on BL3-1 to pass control to the BL3-2 image, if present.
Hence, BL2 populates a platform-specific area of memory with the
entrypoint/load-address of the BL3-2 image. The value of the Saved Program
Status Register (`SPSR`) for entry into BL3-2 is not determined by BL2; it is
initialized by the Secure-EL1 Payload Dispatcher (see later) within BL3-1, which
is responsible for managing interaction with BL3-2. This information is passed
to BL3-1.

#### BL3-3 (Non-trusted Firmware) image load

BL2 loads the BL3-3 image (e.g. UEFI or other test or boot software) from
platform storage into non-secure memory as defined by the platform.

BL2 relies on BL3-1 to pass control to BL3-3 once secure state initialization is
complete. Hence, BL2 populates a platform-specific area of memory with the
entrypoint and Saved Program Status Register (`SPSR`) of the normal world
software image. The entrypoint is the load address of the BL3-3 image. The
`SPSR` is determined as specified in Section 5.13 of the [PSCI PDD] [PSCI]. This
information is passed to BL3-1.

#### BL3-1 (EL3 Runtime Firmware) execution

BL2 execution continues as follows:

1.  BL2 passes control back to BL1 by raising an SMC, providing BL1 with the
    BL3-1 entrypoint. The exception is handled by the SMC exception handler
    installed by BL1.

2.  BL1 turns off the MMU and flushes the caches. It clears the
    `SCTLR_EL3.M/I/C` bits, flushes the data cache to the point of coherency
    and invalidates the TLBs.

3.  BL1 passes control to BL3-1 at the specified entrypoint at EL3.


### BL3-1

The image for this stage is loaded by BL2 and BL1 passes control to BL3-1 at
EL3.
BL3-1 executes solely in trusted SRAM. BL3-1 is linked against and
loaded at a platform-specific base address (more information can be found later
in this document). The functionality implemented by BL3-1 is as follows.

#### Architectural initialization

Currently, BL3-1 performs a similar architectural initialization to BL1 as
far as system register settings are concerned. Since BL1 code resides in ROM,
architectural initialization in BL3-1 allows override of any previous
initialization done by BL1. BL3-1 creates page tables to address the first
4GB of physical address space and initializes the MMU accordingly. It initializes
a buffer of frequently used pointers, called per-CPU pointer cache, in memory for
faster access. Currently the per-CPU pointer cache contains only the pointer
to the crash stack. It then replaces the exception vectors populated by BL1 with
its own. BL3-1 exception vectors implement more elaborate support for
handling SMCs since this is the only mechanism to access the runtime services
implemented by BL3-1 (PSCI for example). BL3-1 checks each SMC for validity as
specified by the [SMC calling convention PDD][SMCCC] before passing control to
the required SMC handler routine. BL3-1 programs the `CNTFRQ_EL0` register with
the clock frequency of the system counter, which is provided by the platform.

#### Platform initialization

BL3-1 performs detailed platform initialization, which enables normal world
software to function correctly. It also retrieves entrypoint information for
the BL3-3 image loaded by BL2 from the platform defined memory address populated
by BL2. BL3-1 also initializes UART0 (PL011 console), which enables
access to the `printf` family of functions in BL3-1. It enables the system
level implementation of the generic timer through the memory mapped interface.

*   GICv2 initialization:

    -   Enable group0 interrupts in the GIC CPU interface.
    -   Configure group0 interrupts to be asserted as FIQs.
    -   Disable the legacy interrupt bypass mechanism.
    -   Configure the priority mask register to allow interrupts of all
        priorities to be signaled to the CPU interface.
    -   Mark SGIs 8-15, the secure physical timer interrupt (#29) and the
        trusted watchdog interrupt (#56) as group0 (secure).
    -   Target the trusted watchdog interrupt to CPU0.
    -   Enable these group0 interrupts in the GIC distributor.
    -   Configure all other interrupts as group1 (non-secure).
    -   Enable signaling of group0 interrupts in the GIC distributor.

*   GICv3 initialization:

    If a GICv3 implementation is available in the platform, BL3-1 initializes
    the GICv3 in GICv2 emulation mode with settings as described for GICv2
    above.

*   Power management initialization:

    BL3-1 implements a state machine to track CPU and cluster state. The state
    can be one of `OFF`, `ON_PENDING`, `SUSPEND` or `ON`. All secondary CPUs are
    initially in the `OFF` state. The cluster that the primary CPU belongs to is
    `ON`; any other cluster is `OFF`. BL3-1 initializes the data structures that
    implement the state machine, including the locks that protect them. BL3-1
    accesses the state of a CPU or cluster immediately after reset and before
    the data cache is enabled in the warm boot path. In that situation it is
    not currently possible to use 'exclusive' based spinlocks, so BL3-1 uses
    locks based on Lamport's Bakery algorithm instead. BL3-1 allocates these
    locks in device memory by default.

*   Runtime services initialization:

    The runtime service framework and its initialization is described in the
    "EL3 runtime services framework" section below.

    Details about the PSCI service are provided in the "Power State Coordination
    Interface" section below.

*   BL3-2 (Secure-EL1 Payload) image initialization

    If a BL3-2 image is present then there must be a matching Secure-EL1 Payload
    Dispatcher (SPD) service (see later for details). During initialization
    that service must register a function to carry out initialization of BL3-2
    once the runtime services are fully initialized. BL3-1 invokes such a
    registered function to initialize BL3-2 before running BL3-3.

    Details on BL3-2 initialization and the SPD's role are described in the
    "Secure-EL1 Payloads and Dispatchers" section below.

*   BL3-3 (Non-trusted Firmware) execution

    BL3-1 initializes the EL2 or EL1 processor context for normal-world cold
    boot, ensuring that no secure state information finds its way into the
    non-secure execution state. BL3-1 uses the entrypoint information provided
    by BL2 to jump to the Non-trusted firmware image (BL3-3) at the highest
    available Exception Level (EL2 if available, otherwise EL1).


### Using alternative Trusted Boot Firmware in place of BL1 and BL2

Some platforms have existing implementations of Trusted Boot Firmware that
would like to use ARM Trusted Firmware BL3-1 for the EL3 Runtime Firmware. To
enable this firmware architecture it is important to provide a fully documented
and stable interface between the Trusted Boot Firmware and BL3-1.

Future changes to the BL3-1 interface will be done in a backwards compatible
way, and this enables these firmware components to be independently enhanced/
updated to develop and exploit new functionality.

#### Required CPU state when calling `bl31_entrypoint()` during cold boot

This function must only be called by the primary CPU. If it is called by any
other CPU the firmware will abort.

On entry to this function the calling primary CPU must be executing in AArch64
EL3, little-endian data access, and all interrupt sources masked:

    PSTATE.EL = 3
    PSTATE.RW = 1
    PSTATE.DAIF = 0xf
    SCTLR_EL3.EE = 0

X0 and X1 can be used to pass information from the Trusted Boot Firmware to the
platform code in BL3-1:

    X0 : Reserved for common Trusted Firmware information
    X1 : Platform specific information

BL3-1 zero-init sections (e.g. `.bss`) should not contain valid data on entry,
because these will be zero filled prior to invoking platform setup code.

##### Use of the X0 and X1 parameters

The parameters are platform specific and passed from `bl31_entrypoint()` to
`bl31_early_platform_setup()`. The value of these parameters is never directly
used by the common BL3-1 code.

The convention is that `X0` conveys information regarding the BL3-1, BL3-2 and
BL3-3 images from the Trusted Boot firmware and `X1` can be used for other
platform specific purposes. This convention allows platforms which use ARM
Trusted Firmware's BL1 and BL2 images to transfer additional platform specific
information from Secure Boot without conflicting with future evolution of the
Trusted Firmware using `X0` to pass a `bl31_params` structure.

BL3-1 common and SPD initialization code depends on image and entrypoint
information about BL3-3 and BL3-2, which is provided via BL3-1 platform APIs.
This information is required until the start of execution of BL3-3. This
information can be provided in a platform defined manner, e.g. compiled into
the platform code in BL3-1, or provided in a platform defined memory location
by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the
Cold boot Initialization parameters.
This data may need to be cleaned out of
the CPU caches if it is provided by an earlier boot stage and then accessed by
BL3-1 platform code before the caches are enabled.

ARM Trusted Firmware's BL2 implementation passes a `bl31_params` structure in
`X0` and the FVP port interprets this in the BL3-1 platform code.

##### MMU, Data caches & Coherency

BL3-1 does not depend on the enabled state of the MMU, data caches or
interconnect coherency on entry to `bl31_entrypoint()`. If these are disabled
on entry, they should be enabled during `bl31_plat_arch_setup()`.

##### Data structures used in the BL3-1 cold boot interface

These structures are designed to support compatibility and independent
evolution of the structures and the firmware images. For example, a version of
BL3-1 that can interpret the BL3-x image information from different versions of
BL2, a platform that uses an extended entry_point_info structure to convey
additional register information to BL3-1, or an ELF image loader that can convey
more details about the firmware images.

To support these scenarios the structures are versioned and sized, which enables
BL3-1 to detect which information is present and respond appropriately. The
`param_header` is defined to capture this information:

    typedef struct param_header {
        uint8_t type;       /* type of the structure */
        uint8_t version;    /* version of this structure */
        uint16_t size;      /* size of this structure in bytes */
        uint32_t attr;      /* attributes: unused bits SBZ */
    } param_header_t;

The structures using this format are `entry_point_info`, `image_info` and
`bl31_params`. The code that allocates and populates these structures must set
the header fields appropriately, and the `SET_PARA_HEAD()` macro is defined
to simplify this action.
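
As an illustration of how a producer might fill in such a header and how a
consumer might check it before reading the rest of a structure, here is a
minimal sketch in C. Only the `param_header_t` layout comes from the text
above. The `EX_PARAM_TYPE` and `EX_PARAM_VERSION` values, the
`example_params_t` structure and the two helper functions are hypothetical
names invented for this example; they are not the Trusted Firmware macro or its
signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header layout as given in this document. */
typedef struct param_header {
    uint8_t type;       /* type of the structure */
    uint8_t version;    /* version of this structure */
    uint16_t size;      /* size of this structure in bytes */
    uint32_t attr;      /* attributes: unused bits SBZ */
} param_header_t;

/* Hypothetical type and version values, for illustration only. */
#define EX_PARAM_TYPE    0x7fu
#define EX_PARAM_VERSION 0x01u

/* Hypothetical versioned structure that starts with the common header. */
typedef struct example_params {
    param_header_t h;
    uint64_t payload;
} example_params_t;

/* Producer side: record type, version, size and attributes in the header. */
static void example_set_header(param_header_t *h, uint8_t type,
                               uint8_t version, size_t size, uint32_t attr)
{
    assert(size <= UINT16_MAX);     /* the size field is only 16 bits wide */
    h->type = type;
    h->version = version;
    h->size = (uint16_t)size;
    h->attr = attr;
}

/*
 * Consumer side: accept the structure only if it has the expected type, is
 * at least the version the consumer understands and is large enough to hold
 * the fields the consumer intends to read. Returns 1 if usable, 0 otherwise.
 */
static int example_header_ok(const param_header_t *h, uint8_t type,
                             uint8_t min_version, size_t min_size)
{
    return h->type == type &&
           h->version >= min_version &&
           h->size >= min_size;
}
```

A consumer built against a newer version of the structure can use the same
check to fall back to the older fields when it is handed a smaller or older
structure.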

#### Required CPU state for BL3-1 Warm boot initialization

When requesting a CPU power-on, or suspending a running CPU, ARM Trusted
Firmware provides the platform power management code with a Warm boot
initialization entry-point, to be invoked by the CPU immediately after the
reset handler. On entry to the Warm boot initialization function the calling
CPU must be in AArch64 EL3, little-endian data access and all interrupt sources
masked:

    PSTATE.EL = 3
    PSTATE.RW = 1
    PSTATE.DAIF = 0xf
    SCTLR_EL3.EE = 0

The PSCI implementation will initialize the processor state and ensure that the
platform power management code is then invoked as required to initialize all
necessary system, cluster and CPU resources.


### Using BL3-1 as the CPU reset vector

On some platforms the runtime firmware (BL3-x images) for the application
processors is loaded by trusted firmware running on a secure system processor
on the SoC, rather than by BL1 and BL2 running on the primary application
processor. For this type of SoC it is desirable for the application processor
to always reset to BL3-1, which eliminates the need for BL1 and BL2.

ARM Trusted Firmware provides a build-time option `RESET_TO_BL31` that includes
some additional logic in the BL3-1 entrypoint to support this use case.

In this configuration, the platform's Trusted Boot Firmware must ensure that
BL3-1 is loaded to its runtime address, which must match the CPU's RVBAR reset
vector address, before the application processor is powered on. Additionally,
platform software is responsible for loading the other BL3-x images required and
providing entry point information for them to BL3-1. Loading these images might
be done by the Trusted Boot Firmware or by platform code in BL3-1.

The ARM FVP port supports the `RESET_TO_BL31` configuration, in which case the
`bl31.bin` image must be loaded to its run address in Trusted SRAM and all CPU
reset vectors be changed from the default `0x0` to this run address. See the
[User Guide] for details of running the FVP models in this way.

This configuration requires some additions and changes in the BL3-1
functionality:

#### Determination of boot path

In this configuration, BL3-1 uses the same reset framework and code as the one
described for BL1 above. On a warm boot a CPU is directed to the PSCI
implementation via a platform defined mechanism. On a cold boot, the platform
must place any secondary CPUs into a safe state while the primary CPU executes
a modified BL3-1 initialization, as described below.

#### Architectural initialization

As the first image to execute in this configuration BL3-1 must ensure that
interconnect coherency is enabled (if required) before enabling the MMU.

#### Platform initialization

In this configuration, when the CPU resets to BL3-1 there are no parameters
that can be passed in registers by previous boot stages. Instead, the platform
code in BL3-1 needs to know, or be able to determine, the location of the BL3-2
(if required) and BL3-3 images and provide this information in response to the
`bl31_plat_get_next_image_ep_info()` function.

As the first image to execute in this configuration BL3-1 must also ensure that
any security initialisation, for example programming a TrustZone address space
controller, is carried out during early platform initialisation.


3.  EL3 runtime services framework
----------------------------------

Software executing in the non-secure state and in the secure state at exception
levels lower than EL3 will request runtime services using the Secure Monitor
Call (SMC) instruction.
These requests will follow the convention described in
the SMC Calling Convention PDD ([SMCCC]). The [SMCCC] assigns function
identifiers to each SMC request and describes how arguments are passed and
returned.

The EL3 runtime services framework enables the development of services by
different providers that can be easily integrated into final product firmware.
The following sections describe the framework which facilitates the
registration, initialization and use of runtime services in EL3 Runtime
Firmware (BL3-1).

The design of the runtime services depends heavily on the concepts and
definitions described in the [SMCCC], in particular SMC Function IDs, Owning
Entity Numbers (OEN), Fast and Standard calls, and the SMC32 and SMC64 calling
conventions. Please refer to that document for a more detailed explanation of
these terms.

The following runtime services are expected to be implemented first. They have
not all been instantiated in the current implementation.

1.  Standard service calls

    This service is for management of the entire system. The Power State
    Coordination Interface ([PSCI]) is the first set of standard service calls
    defined by ARM (see PSCI section later).

    NOTE: Currently this service is called PSCI since there are no other
    defined standard service calls.

2.  Secure-EL1 Payload Dispatcher service

    If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then
    it also requires a _Secure Monitor_ at EL3 to switch the EL1 processor
    context between the normal world (EL1/EL2) and trusted world (Secure-EL1).
    The Secure Monitor will make these world switches in response to SMCs. The
    [SMCCC] provides for such SMCs with the Trusted OS Call and Trusted
    Application Call OEN ranges.

    The interface between the EL3 Runtime Firmware and the Secure-EL1 Payload is
    not defined by the [SMCCC] or any other standard.
    As a result, each
    Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime
    service - within ARM Trusted Firmware this service is referred to as the
    Secure-EL1 Payload Dispatcher (SPD).

    ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and its
    associated Dispatcher (TSPD). Details of SPD design and TSP/TSPD operation
    are described in the "Secure-EL1 Payloads and Dispatchers" section below.

3.  CPU implementation service

    This service will provide an interface to CPU implementation specific
    services for a given platform e.g. access to processor errata workarounds.
    This service is currently unimplemented.

Additional services for ARM Architecture, SiP and OEM calls can be implemented.
Each implemented service handles a range of SMC function identifiers as
described in the [SMCCC].


### Registration

A runtime service is registered using the `DECLARE_RT_SVC()` macro, specifying
the name of the service, the range of OENs covered, the type of service and
initialization and call handler functions. This macro instantiates a `const
struct rt_svc_desc` for the service with these details (see `runtime_svc.h`).
This structure is allocated in a special ELF section `rt_svc_descs`, enabling
the framework to find all service descriptors included in BL3-1.

The specific service for a SMC Function is selected based on the OEN and call
type of the Function ID, and the framework uses that information in the service
descriptor to identify the handler for the SMC Call.

The service descriptors do not include information to identify the precise set
of SMC function identifiers supported by this service implementation, the
security state from which such calls are valid nor the capability to support
64-bit and/or 32-bit callers (using SMC32 or SMC64).
Responding appropriately
to these aspects of a SMC call is the responsibility of the service
implementation; the framework is focused on integration of services from
different providers and minimizing the time taken by the framework before the
service handler is invoked.

Details of the parameters, requirements and behavior of the initialization and
call handling functions are provided in the following sections.


### Initialization

`runtime_svc_init()` in `runtime_svc.c` initializes the runtime services
framework running on the primary CPU during cold boot as part of the BL3-1
initialization. This happens prior to initializing a Trusted OS and running
Normal world boot firmware that might in turn use these services.
Initialization involves validating each of the declared runtime service
descriptors, calling the service initialization function and populating the
index used for runtime lookup of the service.

The BL3-1 linker script collects all of the declared service descriptors into a
single array and defines symbols that allow the framework to locate and traverse
the array, and determine its size.

The framework does basic validation of each descriptor to halt firmware
initialization if service declaration errors are detected. The framework does
not check descriptors for the following error conditions, and may behave in an
unpredictable manner under such scenarios:

1.  Overlapping OEN ranges
2.  Multiple descriptors for the same range of OENs and `call_type`
3.  Incorrect range of owning entity numbers for a given `call_type`

Once validated, the service `init()` callback is invoked. This function carries
out any essential EL3 initialization before servicing requests. The `init()`
function is only invoked on the primary CPU during cold boot.
If the service
uses per-CPU data this must either be initialized for all CPUs during this call,
or be done lazily when a CPU first issues an SMC call to that service. If
`init()` returns anything other than `0`, this is treated as an initialization
error and the service is ignored: this does not cause the firmware to halt.

The OEN and call type fields present in the SMC Function ID cover a total of
128 distinct services, but in practice a single descriptor can cover a range of
OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a
service handler, the framework uses an array of 128 indices that map every
distinct OEN/call-type combination either to one of the declared services or to
indicate the service is not handled. This `rt_svc_descs_indices[]` array is
populated for all of the OENs covered by a service after the service `init()`
function has reported success. So a service that fails to initialize will never
have its `handle()` function invoked.

The following figure shows how the `rt_svc_descs_indices[]` index maps the SMC
Function ID call type and OEN onto a specific service handler in the
`rt_svc_descs[]` array.

![Image 1](diagrams/rt-svc-descs-layout.png?raw=true)


### Handling an SMC

When the EL3 runtime services framework receives a Secure Monitor Call, the SMC
Function ID is passed in W0 from the lower exception level (as per the
[SMCCC]). If the calling register width is AArch32, it is invalid to invoke an
SMC Function which indicates the SMC64 calling convention: such calls are
ignored and return the Unknown SMC Function Identifier result code `0xFFFFFFFF`
in R0/X0.

Bit[31] (fast/standard call) and bits[29:24] (owning entity number) of the SMC
Function ID are combined to index into the `rt_svc_descs_indices[]` array.
The
resulting value might indicate a service that has no handler; in this case the
framework will also report an Unknown SMC Function ID. Otherwise, the value is
used as a further index into the `rt_svc_descs[]` array to locate the required
service and handler.

The service's `handle()` callback is provided with five of the SMC parameters
directly; the others are saved into memory for retrieval (if needed) by the
handler. The handler is also provided with an opaque `handle` for use with the
supporting library for parameter retrieval, setting return values and context
manipulation; and with `flags` indicating the security state of the caller. The
framework finally sets up the execution stack for the handler, and invokes the
service's `handle()` function.

On return from the handler the result registers are populated in X0-X3 before
restoring the stack and CPU state and returning from the original SMC.


4.  Power State Coordination Interface
--------------------------------------

TODO: Provide design walkthrough of PSCI implementation.

The PSCI v1.0 specification categorizes APIs as optional and mandatory. All the
mandatory APIs in PSCI v1.0 and all the APIs in PSCI v0.2 draft specification
[Power State Coordination Interface PDD] [PSCI] are implemented. The table lists
the PSCI v1.0 APIs and their support in generic code.

An API implementation might have a dependency on platform code e.g. CPU_SUSPEND
requires the platform to export a part of the implementation. Hence the level
of support of the mandatory APIs depends upon the support exported by the
platform port as well. The Juno and FVP (all variants) platforms export all the
required support.
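
The dependency on platform code can be pictured as a table of hooks that the
platform port registers with the generic code. The following C sketch is
illustrative only: `plat_pm_hooks_sketch_t`, `register_pm_hooks_sketch()` and
`psci_cpu_suspend_sketch()` are hypothetical names chosen for this example and
are not the Trusted Firmware API. Only the `SUCCESS` (0) and `NOT_SUPPORTED`
(-1) return values are taken from the PSCI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return codes defined by the PSCI specification. */
#define PSCI_E_SUCCESS        0
#define PSCI_E_NOT_SUPPORTED  (-1)

/* Hypothetical hook table exported by a platform port. */
typedef struct plat_pm_hooks_sketch {
    void (*cpu_suspend)(uint32_t power_state);
    void (*cpu_off)(void);
} plat_pm_hooks_sketch_t;

static const plat_pm_hooks_sketch_t *pm_hooks;  /* NULL until registered */
static uint32_t last_power_state;               /* recorded by the example hook */

/* Hypothetical registration call made by the platform port. */
void register_pm_hooks_sketch(const plat_pm_hooks_sketch_t *hooks)
{
    assert(hooks != NULL);
    pm_hooks = hooks;
}

/* Generic code: the API is usable only if the platform exported its part. */
int32_t psci_cpu_suspend_sketch(uint32_t power_state)
{
    if (pm_hooks == NULL || pm_hooks->cpu_suspend == NULL)
        return PSCI_E_NOT_SUPPORTED;

    pm_hooks->cpu_suspend(power_state);
    return PSCI_E_SUCCESS;
}

/* Example platform hook; a real port would program its power controller. */
static void example_cpu_suspend(uint32_t power_state)
{
    last_power_state = power_state;
}

/* Example port that exports CPU_SUSPEND support but no CPU_OFF support. */
const plat_pm_hooks_sketch_t example_hooks = {
    .cpu_suspend = example_cpu_suspend,
    .cpu_off     = NULL,
};
```

In this sketch, a call made before `register_pm_hooks_sketch(&example_hooks)`
reports `NOT_SUPPORTED`, which mirrors how the level of support depends on what
the platform port exports.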

| PSCI v1.0 API         |Supported| Comments                                  |
|:----------------------|:--------|:------------------------------------------|
|`PSCI_VERSION`         | Yes     | The version returned is 1.0               |
|`CPU_SUSPEND`          | Yes*    | The original `power_state` format is used |
|`CPU_OFF`              | Yes*    |                                           |
|`CPU_ON`               | Yes*    |                                           |
|`AFFINITY_INFO`        | Yes     |                                           |
|`MIGRATE`              | Yes**   |                                           |
|`MIGRATE_INFO_TYPE`    | Yes**   |                                           |
|`MIGRATE_INFO_CPU`     | Yes**   |                                           |
|`SYSTEM_OFF`           | Yes*    |                                           |
|`SYSTEM_RESET`         | Yes*    |                                           |
|`PSCI_FEATURES`        | Yes     |                                           |
|`CPU_FREEZE`           | No      |                                           |
|`CPU_DEFAULT_SUSPEND`  | No      |                                           |
|`CPU_HW_STATE`         | No      |                                           |
|`SYSTEM_SUSPEND`       | Yes*    |                                           |
|`PSCI_SET_SUSPEND_MODE`| No      |                                           |
|`PSCI_STAT_RESIDENCY`  | No      |                                           |
|`PSCI_STAT_COUNT`      | No      |                                           |

*Note : These PSCI APIs require platform power management hooks to be
registered with the generic PSCI code to be supported.

**Note : These PSCI APIs require appropriate Secure Payload Dispatcher
hooks to be registered with the generic PSCI code to be supported.


5.  Secure-EL1 Payloads and Dispatchers
---------------------------------------

On a production system that includes a Trusted OS running in Secure-EL1/EL0,
the Trusted OS is coupled with a companion runtime service in the BL3-1
firmware. This service is responsible for the initialisation of the Trusted
OS and all communications with it. The Trusted OS is the BL3-2 stage of the
boot flow in ARM Trusted Firmware. The firmware will attempt to locate, load
and execute a BL3-2 image.

ARM Trusted Firmware uses a more general term for the BL3-2 software that runs
at Secure-EL1 - the _Secure-EL1 Payload_ - as it is not always a Trusted OS.

The ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and a Test
Secure-EL1 Payload Dispatcher (TSPD) service as an example of how a Trusted OS
is supported on a production system using the Runtime Services Framework.
On
such a system, the Test BL3-2 image and service are replaced by the Trusted OS
and its dispatcher service. The ARM Trusted Firmware build system expects that
the dispatcher will define the build flag `NEED_BL32` to enable it to include
the BL3-2 in the build either as a binary or to compile from source depending
on whether the `BL32` build option is specified or not.

The TSP runs in Secure-EL1. It is designed to demonstrate synchronous
communication with the normal-world software running in EL1/EL2. Communication
is initiated by the normal-world software

*   either directly through a Fast SMC (as defined in the [SMCCC])

*   or indirectly through a [PSCI] SMC. The [PSCI] implementation in turn
    informs the TSPD about the requested power management operation. This allows
    the TSP to prepare for or respond to the power state change

The TSPD service is responsible for:

*   Initializing the TSP

*   Routing requests and responses between the secure and the non-secure
    states during the two types of communications just described

### Initializing a BL3-2 Image

The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing
the BL3-2 image. It needs access to the information passed by BL2 to BL3-1 to do
so. This is provided by:

    entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t);

which returns a reference to the `entry_point_info` structure corresponding to
the image which will be run in the specified security state. The SPD uses this
API to get entry point information for the SECURE image, BL3-2.

In the absence of a BL3-2 image, BL3-1 passes control to the normal world
bootloader image (BL3-3). When the BL3-2 image is present, it is typical
that the SPD wants control to be passed to BL3-2 first and then later to BL3-3.
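
As a rough illustration of the check described above, the following C sketch
shows an SPD setup routine asking for the SECURE entry point and giving up when
no BL3-2 image is present. It is a stand-alone mock under stated assumptions:
the one-field `entry_point_info_t`, the `SECURE`/`NON_SECURE` values, the
convention that a `pc` of 0 means "no image", the mock body of
`bl31_plat_get_next_image_ep_info()` and the name `spd_setup_sketch()` are all
stand-ins for this example rather than the definitions used by the firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for the security state argument (assumed for this sketch). */
#define SECURE      0u
#define NON_SECURE  1u

/* Stand-in for the real structure: only the entry address is modelled. */
typedef struct entry_point_info {
    uintptr_t pc;   /* entry address; 0 is used here to mean "no image" */
} entry_point_info_t;

static entry_point_info_t mock_bl32_ep;                  /* pc == 0: absent */
static entry_point_info_t mock_bl33_ep = { 0x88000000u };

/* Mock of the platform API quoted above; returns NULL if nothing was loaded. */
entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t type)
{
    entry_point_info_t *ep = (type == SECURE) ? &mock_bl32_ep : &mock_bl33_ep;

    return (ep->pc != 0) ? ep : NULL;
}

/* Hypothetical SPD setup hook: 0 on success, non-zero if there is no BL3-2. */
int32_t spd_setup_sketch(void)
{
    entry_point_info_t *bl32_ep = bl31_plat_get_next_image_ep_info(SECURE);

    if (bl32_ep == NULL)
        return 1;   /* no Secure-EL1 Payload: BL3-1 goes straight to BL3-3 */

    assert(bl32_ep->pc != 0);
    /* A real SPD would now build the Secure-EL1 context from *bl32_ep. */
    return 0;
}
```

Returning a non-zero value from the setup routine matches the runtime services
framework behaviour described earlier, where a service whose `init()` fails is
ignored without halting the firmware.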

To do this the SPD has to register a BL3-2 initialization function during
initialization of the SPD service. The BL3-2 initialization function has this
prototype:

    int32_t init();

and is registered using the `bl31_register_bl32_init()` function.

Trusted Firmware supports two approaches for the SPD to pass control to BL3-2
before returning through EL3 and running the non-trusted firmware (BL3-3):

1.  In the BL3-2 setup function, use `bl31_set_next_image_type()` to
    request that the exit from `bl31_main()` is to the BL3-2 entrypoint in
    Secure-EL1. BL3-1 will exit to BL3-2 using the asynchronous method by
    calling `bl31_prepare_next_image_entry()` and `el3_exit()`.

    When the BL3-2 has completed initialization at Secure-EL1, it returns to
    BL3-1 by issuing an SMC, using a Function ID allocated to the SPD. On
    receipt of this SMC, the SPD service handler should switch the CPU context
    from trusted to normal world and use the `bl31_set_next_image_type()` and
    `bl31_prepare_next_image_entry()` functions to set up the initial return to
    the normal world firmware BL3-3. On return from the handler the framework
    will exit to EL2 and run BL3-3.

2.  The BL3-2 setup function registers an initialization function using
    `bl31_register_bl32_init()` which provides a SPD-defined mechanism to
    invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL3-2
    entrypoint.
    NOTE: The Test SPD service included with the Trusted Firmware provides one
    implementation of such a mechanism.

    On completion BL3-2 returns control to BL3-1 via a SMC, and on receipt the
    SPD service handler invokes the synchronous call return mechanism to return
    to the BL3-2 initialization function. On return from this function,
    `bl31_main()` will set up the return to the normal world firmware BL3-3 and
    continue the boot process in the normal world.


6.  Crash Reporting in BL3-1
----------------------------

The BL3-1 implements a scheme for reporting the processor state when an unhandled
exception is encountered. The reporting mechanism attempts to preserve all the
register contents and report them via the default serial output. The general
purpose registers, EL3, Secure EL1 and some EL2 state registers are reported.

A dedicated per-CPU crash stack is maintained by BL3-1 and this is retrieved via
the per-CPU pointer cache. The implementation attempts to minimise the memory
required for this feature. The file `crash_reporting.S` contains the
implementation for crash reporting.

A sample crash output is shown below.

    x0              :0x000000004F00007C
    x1              :0x0000000007FFFFFF
    x2              :0x0000000004014D50
    x3              :0x0000000000000000
    x4              :0x0000000088007998
    x5              :0x00000000001343AC
    x6              :0x0000000000000016
    x7              :0x00000000000B8A38
    x8              :0x00000000001343AC
    x9              :0x00000000000101A8
    x10             :0x0000000000000002
    x11             :0x000000000000011C
    x12             :0x00000000FEFDC644
    x13             :0x00000000FED93FFC
    x14             :0x0000000000247950
    x15             :0x00000000000007A2
    x16             :0x00000000000007A4
    x17             :0x0000000000247950
    x18             :0x0000000000000000
    x19             :0x00000000FFFFFFFF
    x20             :0x0000000004014D50
    x21             :0x000000000400A38C
    x22             :0x0000000000247950
    x23             :0x0000000000000010
    x24             :0x0000000000000024
    x25             :0x00000000FEFDC868
    x26             :0x00000000FEFDC86A
    x27             :0x00000000019EDEDC
    x28             :0x000000000A7CFDAA
    x29             :0x0000000004010780
    x30             :0x000000000400F004
    scr_el3         :0x0000000000000D3D
    sctlr_el3       :0x0000000000C8181F
    cptr_el3        :0x0000000000000000
    tcr_el3         :0x0000000080803520
    daif            :0x00000000000003C0
    mair_el3        :0x00000000000004FF
    spsr_el3        :0x00000000800003CC
    elr_el3         :0x000000000400C0CC
    ttbr0_el3       :0x00000000040172A0
    esr_el3         :0x0000000096000210
    sp_el3          :0x0000000004014D50
    far_el3         :0x000000004F00007C
    spsr_el1        :0x0000000000000000
    elr_el1         :0x0000000000000000
    spsr_abt        :0x0000000000000000
    spsr_und        :0x0000000000000000
    spsr_irq        :0x0000000000000000
    spsr_fiq        :0x0000000000000000
    sctlr_el1       :0x0000000030C81807
    actlr_el1       :0x0000000000000000
    cpacr_el1       :0x0000000000300000
    csselr_el1      :0x0000000000000002
    sp_el1          :0x0000000004028800
    esr_el1         :0x0000000000000000
    ttbr0_el1       :0x000000000402C200
    ttbr1_el1       :0x0000000000000000
    mair_el1        :0x00000000000004FF
    amair_el1       :0x0000000000000000
    tcr_el1         :0x0000000000003520
    tpidr_el1       :0x0000000000000000
    tpidr_el0       :0x0000000000000000
    tpidrro_el0     :0x0000000000000000
    dacr32_el2      :0x0000000000000000
    ifsr32_el2      :0x0000000000000000
    par_el1         :0x0000000000000000
    far_el1         :0x0000000000000000
    afsr0_el1       :0x0000000000000000
    afsr1_el1       :0x0000000000000000
    contextidr_el1  :0x0000000000000000
    vbar_el1        :0x0000000004027000
    cntp_ctl_el0    :0x0000000000000000
    cntp_cval_el0   :0x0000000000000000
    cntv_ctl_el0    :0x0000000000000000
    cntv_cval_el0   :0x0000000000000000
    cntkctl_el1     :0x0000000000000000
    fpexc32_el2     :0x0000000004000700
    sp_el0          :0x0000000004010780

7.  Guidelines for Reset Handlers
---------------------------------

Trusted Firmware implements a framework that allows CPU and platform ports to
perform actions immediately after a CPU is released from reset in both the cold
and warm boot paths. This is done by calling the `reset_handler()` function in
both the BL1 and BL3-1 images. It in turn calls the platform and CPU specific
reset handling functions.

Details for implementing a CPU specific reset handler can be found in
Section 8. Details for implementing a platform specific reset handler can be
found in the [Porting Guide] (see the `plat_reset_handler()` function).

When adding functionality to a reset handler, the following points should be
kept in mind.

1.  The first reset handler in the system exists either in a ROM image
    (e.g. BL1), or BL3-1 if `RESET_TO_BL31` is true. This may be detected at
    compile time using the constant `FIRST_RESET_HANDLER_CALL`.

2.  When considering ROM images, it's important to consider non TF-based ROMs
    and ROMs based on previous versions of the TF code.

3.  If the functionality should be applied to a ROM and there is no possibility
    of a ROM being used that does not apply the functionality (or equivalent),
    then the functionality should be applied within a `#if
    FIRST_RESET_HANDLER_CALL` block.

4.  If the functionality should execute in BL3-1 in order to override or
    supplement a ROM version of the functionality, then the functionality
    should be applied in the `#else` part of a `#if FIRST_RESET_HANDLER_CALL`
    block.

5.  If the functionality should be applied to a ROM but there is a possibility
    of ROMs being used that do not apply the functionality, then the
    functionality should be applied outside of a `FIRST_RESET_HANDLER_CALL`
    block, so that BL3-1 has an opportunity to apply the functionality instead.
    In this case, additional code may be needed to cope with different ROMs
    that do or do not apply the functionality.


8.  CPU specific operations framework
-----------------------------

Certain aspects of the ARMv8 architecture are implementation defined,
that is, certain behaviours are not architecturally defined, but must be defined
and documented by individual processor implementations. The ARM Trusted
Firmware implements a framework which categorises the common implementation
defined behaviours and allows a processor to export its implementation of that
behaviour. The categories are:

1.  Processor specific reset sequence.

2.  Processor specific power down sequences.

3.  Processor specific register dumping as a part of crash reporting.

Each of the above categories fulfils a different requirement.

1.  allows any processor specific initialization before the caches and MMU
    are turned on, like implementation of errata workarounds, entry into
    the intra-cluster coherency domain etc.

2.  allows each processor to implement the power down sequence mandated in
    its Technical Reference Manual (TRM).

3.  allows a processor to provide additional information to the developer
    in the event of a crash, for example Cortex-A53 has registers which
    can expose the data cache contents.

Please note that only 2. is mandated by the TRM.

The CPU specific operations framework scales to accommodate a large number of
different CPUs during power down and reset handling. The platform can specify
any CPU optimization it wants to enable for each CPU. It can also specify
the CPU errata workarounds to be applied for each CPU type during reset
handling by defining CPU errata compile time macros. Details on these macros
can be found in the [cpu-specific-build-macros.md][CPUBM] file.

The CPU specific operations framework depends on the `cpu_ops` structure which
needs to be exported for each type of CPU in the platform. It is defined in
`include/lib/cpus/aarch64/cpu_macros.S` and has the following fields : `midr`,
`reset_func()`, `core_pwr_dwn()`, `cluster_pwr_dwn()` and `cpu_reg_dump()`.

The CPU specific files in `lib/cpus` export a `cpu_ops` data structure with
suitable handlers for that CPU. For example, `lib/cpus/cortex_a53.S` exports
the `cpu_ops` for Cortex-A53 CPU. According to the platform configuration,
these CPU specific files must be included in the build by the platform
makefile. The generic CPU specific operations framework code exists in
`lib/cpus/aarch64/cpu_helpers.S`.
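
To make the role of the `cpu_ops` entries more concrete, the following C sketch
models a table of entries and a lookup keyed on the implementer and part number
fields of the MIDR. It is an assumption-laden illustration: the firmware lays
the structure out with assembler macros, so the `cpu_ops_sketch_t` type, the
`cpu_ops_table[]` array, `find_cpu_ops_sketch()` and `run_reset_sketch()` are
hypothetical names, and the handlers are empty placeholders. The MIDR field
positions (implementer in bits [31:24], part number in bits [15:4]) follow the
ARMv8 architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR fields: implementer in bits [31:24], part number in bits [15:4]. */
#define MIDR_IMPL_PN_MASK  0xFF00FFF0u

/* C rendering of the fields named above; the type itself is an assumption. */
typedef struct cpu_ops_sketch {
    uint32_t midr;
    void (*reset_func)(void);
    void (*core_pwr_dwn)(void);
    void (*cluster_pwr_dwn)(void);
    void (*cpu_reg_dump)(void);
} cpu_ops_sketch_t;

static void nop_handler(void) { }   /* placeholder used for every handler */

/* One entry per CPU type: Cortex-A53 and Cortex-A57 MIDR values (r0p0). */
const cpu_ops_sketch_t cpu_ops_table[] = {
    { 0x410FD030u, nop_handler, nop_handler, nop_handler, nop_handler },
    { 0x410FD070u, nop_handler, nop_handler, nop_handler, nop_handler },
};

/* Return the entry whose implementer and part number match `midr`, or NULL. */
const cpu_ops_sketch_t *find_cpu_ops_sketch(uint32_t midr)
{
    size_t count = sizeof(cpu_ops_table) / sizeof(cpu_ops_table[0]);

    for (size_t i = 0; i < count; i++) {
        /* Variant, architecture and revision bits are masked out. */
        if (((cpu_ops_table[i].midr ^ midr) & MIDR_IMPL_PN_MASK) == 0u)
            return &cpu_ops_table[i];
    }
    return NULL;
}

/* Invoke the reset handler for the CPU identified by `midr`; 0 on success. */
int run_reset_sketch(uint32_t midr)
{
    const cpu_ops_sketch_t *ops = find_cpu_ops_sketch(midr);

    if (ops == NULL)
        return -1;      /* no cpu_ops exported for this CPU type */

    assert(ops->reset_func != NULL);
    ops->reset_func();
    return 0;
}
```

Because the revision and variant bits are ignored, a single entry serves every
revision of a given CPU type in this sketch.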

### CPU specific Reset Handling

After a reset, the state of the CPU when it calls the generic reset handler is:
MMU turned off, both instruction and data caches turned off and not part
of any coherency domain.

The BL entrypoint code first invokes the `plat_reset_handler()` to allow
the platform to perform any system initialization required and any system
errata workarounds that need to be applied. The `get_cpu_ops_ptr()` reads
the current CPU midr, finds the matching `cpu_ops` entry in the `cpu_ops`
array and returns it. Note that only the part number and implementer fields
in midr are used to find the matching `cpu_ops` entry. The `reset_func()` in
the returned `cpu_ops` is then invoked which executes the required reset
handling for that CPU and also any errata workarounds enabled by the platform.
This function must preserve the values of general purpose registers x20 to x29.

Refer to Section "Guidelines for Reset Handlers" for general guidelines
regarding placement of code in a reset handler.

### CPU specific power down sequence

During the BL3-1 initialization sequence, the pointer to the matching `cpu_ops`
entry is stored in per-CPU data by `init_cpu_ops()` so that it can be quickly
retrieved during power down sequences.

The PSCI service, upon receiving a power down request, determines the highest
affinity level at which to execute power down sequence for a particular CPU and
invokes the corresponding 'prepare' power down handler in the CPU specific
operations framework. For example, when a CPU executes a power down for affinity
level 0, the `prepare_core_pwr_dwn()` retrieves the `cpu_ops` pointer from the
per-CPU data and the corresponding `core_pwr_dwn()` is invoked.
Similarly when
a CPU executes power down at affinity level 1, the `prepare_cluster_pwr_dwn()`
retrieves the `cpu_ops` pointer and the corresponding `cluster_pwr_dwn()` is
invoked.

At runtime the platform hooks for power down are invoked by the PSCI service to
perform platform specific operations during a power down sequence, for example
turning off CCI coherency during a cluster power down.

### CPU specific register reporting during crash

If the crash reporting is enabled in BL3-1, when a crash occurs, the crash
reporting framework calls `do_cpu_reg_dump` which retrieves the matching
`cpu_ops` using the `get_cpu_ops_ptr()` function. The `cpu_reg_dump()` in
`cpu_ops` is invoked, which then returns the CPU specific register values to
be reported and a pointer to the ASCII list of register names in a format
expected by the crash reporting framework.


9. Memory layout of BL images
-----------------------------

Each bootloader image can be divided in 2 parts:

*   the static contents of the image. These are data actually stored in the
    binary on the disk. In the ELF terminology, they are called `PROGBITS`
    sections;

*   the run-time contents of the image. These are data that don't occupy any
    space in the binary on the disk. The ELF binary just contains some
    metadata indicating where these data will be stored at run-time and the
    corresponding sections need to be allocated and initialized at run-time.
    In the ELF terminology, they are called `NOBITS` sections.

All PROGBITS sections are grouped together at the beginning of the image,
followed by all NOBITS sections. This is true for all Trusted Firmware images
and it is governed by the linker scripts. This ensures that the raw binary
images are as small as possible.
If a NOBITS section were placed between
PROGBITS sections then the resulting binary file would contain a run of zero
bytes at the location of this NOBITS section, making the image unnecessarily
bigger. Smaller images allow faster loading from the FIP to the main memory.

### Linker scripts and symbols

Each bootloader stage image layout is described by its own linker script. The
linker scripts export some symbols into the program symbol table. Their values
correspond to particular addresses. The trusted firmware code can refer to these
symbols to figure out the image memory layout.

Linker symbols follow the following naming convention in the trusted firmware.

*   `__<SECTION>_START__`

    Start address of a given section named `<SECTION>`.

*   `__<SECTION>_END__`

    End address of a given section named `<SECTION>`. If there is an alignment
    constraint on the section's end address then `__<SECTION>_END__` corresponds
    to the end address of the section's actual contents, rounded up to the right
    boundary. Refer to the value of `__<SECTION>_UNALIGNED_END__` to know the
    actual end address of the section's contents.

*   `__<SECTION>_UNALIGNED_END__`

    End address of a given section named `<SECTION>` without any padding or
    rounding up due to some alignment constraint.

*   `__<SECTION>_SIZE__`

    Size (in bytes) of a given section named `<SECTION>`. If there is an
    alignment constraint on the section's end address then `__<SECTION>_SIZE__`
    corresponds to the size of the section's actual contents, rounded up to the
    right boundary. In other words, `__<SECTION>_SIZE__ = __<SECTION>_END__ -
    __<SECTION>_START__`. Refer to the value of `__<SECTION>_UNALIGNED_SIZE__`
    to know the actual size of the section's contents.

*   `__<SECTION>_UNALIGNED_SIZE__`

    Size (in bytes) of a given section named `<SECTION>` without any padding or
    rounding up due to some alignment constraint. In other words,
    `__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ -
    __<SECTION>_START__`.

Some of the linker symbols are mandatory as the trusted firmware code relies on
them to be defined. They are listed in the following subsections. Some of them
must be provided for each bootloader stage and some are specific to a given
bootloader stage.

The linker scripts define some extra, optional symbols. They are not actually
used by any code but they help in understanding the bootloader images' memory
layout as they are easy to spot in the link map files.

#### Common linker symbols

Early setup code needs to know the extents of the BSS section to zero-initialise
it before executing any C code. The following linker symbols are defined for
this purpose:

*   `__BSS_START__` This address must be aligned on a 16-byte boundary.
*   `__BSS_SIZE__`

Similarly, the coherent memory section (if enabled) must be zero-initialised.
Also, the MMU setup code needs to know the extents of this section to set the
right memory attributes for it. The following linker symbols are defined for
this purpose:

*   `__COHERENT_RAM_START__` This address must be aligned on a page-size boundary.
*   `__COHERENT_RAM_END__` This address must be aligned on a page-size boundary.
*   `__COHERENT_RAM_UNALIGNED_SIZE__`

#### BL1's linker symbols

BL1's early setup code needs to know the extents of the .data section to
relocate it from ROM to RAM before executing any C code. The following linker
symbols are defined for this purpose:

*   `__DATA_ROM_START__` This address must be aligned on a 16-byte boundary.
*   `__DATA_RAM_START__` This address must be aligned on a 16-byte boundary.
*   `__DATA_SIZE__`

BL1's platform setup code needs to know the extents of its read-write data
region to figure out its memory layout. The following linker symbols are defined
for this purpose:

*   `__BL1_RAM_START__` This is the start address of BL1 RW data.
*   `__BL1_RAM_END__` This is the end address of BL1 RW data.

#### BL2's, BL3-1's and TSP's linker symbols

BL2, BL3-1 and TSP need to know the extents of their read-only section to set
the right memory attributes for this memory region in their MMU setup code. The
following linker symbols are defined for this purpose:

*   `__RO_START__`
*   `__RO_END__`

### How to choose the right base addresses for each bootloader stage image

There is currently no support for dynamic image loading in the Trusted Firmware.
This means that all bootloader images need to be linked against their ultimate
runtime locations and the base addresses of each image must be chosen carefully
such that images don't overlap each other in an undesired way. As the code
grows, the base addresses might need adjustments to cope with the new memory
layout.

The memory layout is completely specific to the platform and so there is no
general recipe for choosing the right base addresses for each bootloader image.
However, there are tools to aid in understanding the memory layout. These are
the link map files: `build/<platform>/<build-type>/bl<x>/bl<x>.map`, with `<x>`
being the stage bootloader. They provide a detailed view of the memory usage of
each image. Among other useful information, they provide the end address of
each image.

*   `bl1.map` link map file provides `__BL1_RAM_END__` address.
*   `bl2.map` link map file provides `__BL2_END__` address.
*   `bl31.map` link map file provides `__BL31_END__` address.
*   `bl32.map` link map file provides `__BL32_END__` address.

For each bootloader image, the platform code must provide its start address
as well as a limit address that it must not overstep. The latter is used in the
linker scripts to check that the image doesn't grow past that address. If that
happens, the linker will issue a message similar to the following:

    aarch64-none-elf-ld: BLx has exceeded its limit.

Additionally, if the platform memory layout implies some image overlaying like
on FVP, BL3-1 and TSP need to know the limit address that their PROGBITS
sections must not overstep. The platform code must provide those.


#### Memory layout on ARM FVPs

The following list describes the memory layout on the FVP:

*   A 4KB page of shared memory is used to store the entrypoint mailboxes
    and the parameters passed between bootloaders. The shared memory is located
    at the base of the Trusted SRAM. The amount of Trusted SRAM available to
    load the bootloader images will be reduced by the size of the shared memory.

*   BL1 initially resides in the Trusted ROM at address `0x0`. Its
    read-write data are relocated at the top of the Trusted SRAM at runtime.

*   BL3-1 is loaded at the top of the Trusted SRAM, such that its NOBITS
    sections will overwrite BL1 R/W data.

*   BL2 is loaded below BL3-1.

*   BL3-2 can be loaded in one of the following locations:

    *   Trusted SRAM
    *   Trusted DRAM
    *   Secure region of DRAM (top 16MB of DRAM configured by the TrustZone
        controller)

When BL3-2 is loaded into Trusted SRAM, its NOBITS sections are allowed to
overlay BL2. This memory layout is designed to give the BL3-2 image as much
memory as possible when it is loaded into Trusted SRAM.

The location of the BL3-2 image will result in different memory maps.
This is
illustrated in the following diagrams using the TSP as an example.

**TSP in Trusted SRAM (default option):**

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL3-1 NOBITS  |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL3-1 PROGBITS |
               |----------|                 ------------------
               |   BL2    |  <<<<<<<<<<<<<  |  BL3-2 NOBITS  |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL3-2 PROGBITS |
    0x04001000 +----------+                 ------------------
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+


**TSP in Trusted DRAM:**

               Trusted DRAM
    0x08000000 +----------+
               |  BL3-2   |
    0x06000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL3-1 NOBITS  |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL3-1 PROGBITS |
               |----------|                 ------------------
               |   BL2    |
               |----------|
               |          |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
    0x00000000 +----------+

**TSP in the TZC-Secured DRAM:**

                   DRAM
    0xffffffff +----------+
               |  BL3-2   |  (secure)
    0xff000000 +----------+
               |          |
               :          :  (non-secure)
               |          |
    0x80000000 +----------+

               Trusted SRAM
    0x04040000 +----------+  loaded by BL2  ------------------
               | BL1 (rw) |  <<<<<<<<<<<<<  |  BL3-1 NOBITS  |
               |----------|  <<<<<<<<<<<<<  |----------------|
               |          |  <<<<<<<<<<<<<  | BL3-1 PROGBITS |
               |----------|                 ------------------
               |   BL2    |
               |----------|
               |          |
    0x04001000 +----------+
               |  Shared  |
    0x04000000 +----------+

               Trusted ROM
    0x04000000 +----------+
               | BL1 (ro) |
0x00000000 +----------+ 1361 1362Moving the TSP image out of the Trusted SRAM doesn't change the memory layout 1363of the other boot loader images in Trusted SRAM. 1364 1365 1366#### Memory layout on Juno ARM development platform 1367 1368The following list describes the memory layout on Juno: 1369 1370* Trusted SRAM at 0x04000000 contains the MHU page, BL1 r/w section, BL2 1371 image, BL3-1 image and, optionally, the BL3-2 image. 1372 1373* The MHU 4 KB page is used as communication channel between SCP and AP. It 1374 also contains the entrypoint mailboxes for the AP. Mailboxes are stored in 1375 the first 128 bytes of the MHU page. 1376 1377* BL1 resides in flash memory at address `0x0BEC0000`. Its read-write data 1378 section is relocated to the top of the Trusted SRAM at runtime. 1379 1380* BL3-1 is loaded at the top of the Trusted SRAM, such that its NOBITS 1381 sections will overwrite BL1 R/W data. This implies that BL1 global variables 1382 will remain valid only until execution reaches the BL3-1 entry point during 1383 a cold boot. 1384 1385* BL2 is loaded below BL3-1. 1386 1387* BL3-0 is loaded temporarily into the BL3-1 memory region and transfered to 1388 the SCP before being overwritten by BL3-1. 1389 1390* The BL3-2 image is optional and can be loaded into one of these two 1391 locations: Trusted SRAM (right after the MHU page) or DRAM (14 MB starting 1392 at 0xFF000000 and secured by the TrustZone controller). When loaded into 1393 Trusted SRAM, its NOBITS sections are allowed to overlap BL2. 1394 1395Depending on the location of the BL3-2 image, it will result in different memory 1396maps, illustrated by the following diagrams. 

**BL3-2 in Trusted SRAM (default option):**

                  Flash0
    0x0C000000  +----------+
                :          :
    0x0BED0000  |----------|
                | BL1 (ro) |
    0x0BEC0000  |----------|
                :          :
    0x08000000  +----------+                  BL3-1 is loaded
                                              after BL3-0 has
                Trusted SRAM                  been sent to SCP
    0x04040000  +----------+  loaded by BL2  ------------------
                | BL1 (rw) |  <<<<<<<<<<<<<  |  BL3-1 NOBITS  |
                |----------|  <<<<<<<<<<<<<  |----------------|
                |  BL3-0   |  <<<<<<<<<<<<<  | BL3-1 PROGBITS |
                |----------|                 ------------------
                |   BL2    |  <<<<<<<<<<<<<  |  BL3-2 NOBITS  |
                |----------|  <<<<<<<<<<<<<  |----------------|
                |          |  <<<<<<<<<<<<<  | BL3-2 PROGBITS |
    0x04001000  +----------+                 ------------------
                |   MHU    |
    0x04000000  +----------+


**BL3-2 in the secure region of DRAM:**

                    DRAM
    0xFFE00000  +----------+
                |  BL3-2   |  (secure)
    0xFF000000  |----------|
                |          |
                :          :  (non-secure)
                |          |
    0x80000000  +----------+

                  Flash0
    0x0C000000  +----------+
                :          :
    0x0BED0000  |----------|
                | BL1 (ro) |
    0x0BEC0000  |----------|
                :          :
    0x08000000  +----------+                  BL3-1 is loaded
                                              after BL3-0 has
                Trusted SRAM                  been sent to SCP
    0x04040000  +----------+  loaded by BL2  ------------------
                | BL1 (rw) |  <<<<<<<<<<<<<  |  BL3-1 NOBITS  |
                |----------|  <<<<<<<<<<<<<  |----------------|
                |  BL3-0   |  <<<<<<<<<<<<<  | BL3-1 PROGBITS |
                |----------|                 ------------------
                |   BL2    |
                |----------|
                |          |
    0x04001000  +----------+
                |   MHU    |
    0x04000000  +----------+

Loading the BL3-2 image in DRAM doesn't change the memory layout of the other
images in Trusted SRAM.


10.  Firmware Image Package (FIP)
---------------------------------

Using a Firmware Image Package (FIP) allows for packing bootloader images (and
potentially other payloads) into a single archive that can be loaded by the ARM
Trusted Firmware from non-volatile platform storage. A driver to load images
from a FIP has been added to the storage layer and allows a package to be read
from supported platform storage. A tool to create Firmware Image Packages is
also provided and described below.

### Firmware Image Package layout

The FIP layout consists of a table of contents (ToC) followed by payload data.
The ToC itself has a header followed by one or more table entries. The ToC is
terminated by an end marker entry. All ToC entries describe some payload data
that has been appended to the end of the binary package. With the information
provided in the ToC entry the corresponding payload data can be retrieved.

    ------------------
    | ToC Header     |
    |----------------|
    | ToC Entry 0    |
    |----------------|
    | ToC Entry 1    |
    |----------------|
    | ToC End Marker |
    |----------------|
    |                |
    |     Data 0     |
    |                |
    |----------------|
    |                |
    |     Data 1     |
    |                |
    ------------------

The ToC header and entry formats are described in the header file
`include/firmware_image_package.h`. This file is used by both the tool and the
ARM Trusted Firmware.

The ToC header has the following fields:
    `name`: The name of the ToC. This is currently used to validate the header.
    `serial_number`: A non-zero number provided by the creation tool.
    `flags`: Flags associated with this data. None are yet defined.

A ToC entry has the following fields:
    `uuid`: All files are referred to by a pre-defined Universally Unique
        IDentifier [UUID]. The UUIDs are defined in
        `include/firmware_image_package.h`.
        The platform translates the requested
        image name into the corresponding UUID when accessing the package.
    `offset_address`: The offset address at which the corresponding payload data
        can be found. The offset is calculated from the ToC base address.
    `size`: The size of the corresponding payload data in bytes.
    `flags`: Flags associated with this entry. None are yet defined.

### Firmware Image Package creation tool

The FIP creation tool can be used to pack specified images into a binary package
that can be loaded by the ARM Trusted Firmware from platform storage. The tool
currently only supports packing bootloader images. Additional image definitions
can be added to the tool as required.

The tool can be found in `tools/fip_create`.

### Loading from a Firmware Image Package (FIP)

The Firmware Image Package (FIP) driver can load images from a binary package on
non-volatile platform storage. For the FVPs this is currently NOR FLASH.

Bootloader images are loaded according to the platform policy as specified in
`plat/<platform>/plat_io_storage.c`. For the FVPs this means the platform will
attempt to load images from a Firmware Image Package located at the start of NOR
FLASH0.

Currently the FVP's policy only allows loading of a known set of images. The
platform policy can be modified to allow additional images.


11.  Use of coherent memory in Trusted Firmware
----------------------------------------------

There might be loss of coherency when physical memory with mismatched
shareability, cacheability and memory attributes is accessed by multiple CPUs
(refer to section B2.9 of [ARM ARM] for more details). This possibility occurs
in Trusted Firmware during power up/down sequences when coherency, MMU and
caches are turned on/off incrementally.

Trusted Firmware defines coherent memory as a region of memory with Device
nGnRE attributes in the translation tables. The translation granule size in
Trusted Firmware is 4KB. This is the smallest possible size of the coherent
memory region.

By default, all data structures which are susceptible to accesses with
mismatched attributes from various CPUs are allocated in a coherent memory
region (refer to section 2.1 of [Porting Guide]). The coherent memory region
accesses are Outer Shareable, non-cacheable and they can be accessed
with the Device nGnRE attributes when the MMU is turned on. Hence, at the
expense of at least an extra page of memory, Trusted Firmware is able to work
around coherency issues due to mismatched memory attributes.

The alternative to the above approach is to allocate the susceptible data
structures in Normal WriteBack WriteAllocate Inner shareable memory. This
approach requires the data structures to be designed so that it is possible to
work around the issue of mismatched memory attributes by performing software
cache maintenance on them.

### Disabling the use of coherent memory in Trusted Firmware

It might be desirable to avoid the cost of allocating coherent memory on
platforms which are memory constrained. Trusted Firmware enables inclusion of
coherent memory in firmware images through the build flag `USE_COHERENT_MEM`.
This flag is enabled by default. It can be disabled to choose the second
approach described above.

The sections below analyze the data structures allocated in the coherent memory
region and the changes required to allocate them in normal memory.

### PSCI Affinity map nodes

The `psci_aff_map` data structure stores the hierarchical node information for
each affinity level in the system including the PSCI states associated with them.
By default, this data structure is allocated in the coherent memory region in
the Trusted Firmware because it can be accessed by multiple CPUs, either with
their caches enabled or disabled.

    typedef struct aff_map_node {
        unsigned long mpidr;
        unsigned char ref_count;
        unsigned char state;
        unsigned char level;
    #if USE_COHERENT_MEM
        bakery_lock_t lock;
    #else
        unsigned char aff_map_index;
    #endif
    } aff_map_node_t;

In order to move this data structure to normal memory, the use of each of its
fields must be analyzed. Fields like `mpidr` and `level` are only written once
during cold boot. Hence removing them from coherent memory involves only doing
a clean and invalidate of the cache lines after these fields are written.

The fields `state` and `ref_count` can be concurrently accessed by multiple
CPUs in different cache states. A Lamport's Bakery lock is used to ensure
mutually exclusive access to these fields. As a result, it is possible to move
these fields out of coherent memory by performing software cache maintenance on
them. The field `lock` is the bakery lock data structure when
`USE_COHERENT_MEM` is enabled. The `aff_map_index` is used to identify the
bakery lock when `USE_COHERENT_MEM` is disabled.

### Bakery lock data

The bakery lock data structure `bakery_lock_t` is allocated in coherent memory
and is accessed by multiple CPUs with mismatched attributes. `bakery_lock_t` is
defined as follows:

    typedef struct bakery_lock {
        int owner;
        volatile char entering[BAKERY_LOCK_MAX_CPUS];
        volatile unsigned number[BAKERY_LOCK_MAX_CPUS];
    } bakery_lock_t;

It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU
fields can be read by all CPUs but only written to by the owning CPU.
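
To make the "read by all, written only by the owner" property concrete, the following is a minimal sketch of Lamport's Bakery algorithm over the `bakery_lock_t` layout shown above. It is not the firmware's implementation: the function names, the `NO_OWNER` constant, the value chosen for `BAKERY_LOCK_MAX_CPUS` and the use of C11 fences in place of architecture-specific barriers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

#define BAKERY_LOCK_MAX_CPUS 4      /* assumed value for this sketch */
#define NO_OWNER             (-1)   /* assumed "unlocked" marker */

typedef struct bakery_lock {
    int owner;
    volatile char entering[BAKERY_LOCK_MAX_CPUS];
    volatile unsigned number[BAKERY_LOCK_MAX_CPUS];
} bakery_lock_t;

/* Hypothetical helper: put the lock into its unlocked state. */
static void sketch_bakery_init(bakery_lock_t *l)
{
    l->owner = NO_OWNER;
    for (int i = 0; i < BAKERY_LOCK_MAX_CPUS; i++) {
        l->entering[i] = 0;
        l->number[i] = 0;
    }
}

/* Hypothetical helper: acquire the lock on behalf of CPU index 'me'. */
static void sketch_bakery_acquire(bakery_lock_t *l, int me)
{
    unsigned max = 0;

    assert(me >= 0 && me < BAKERY_LOCK_MAX_CPUS);

    /* Take a ticket. Only this CPU's own slots are written. */
    l->entering[me] = 1;
    atomic_thread_fence(memory_order_seq_cst);
    for (int i = 0; i < BAKERY_LOCK_MAX_CPUS; i++)
        if (l->number[i] > max)
            max = l->number[i];
    l->number[me] = max + 1;
    atomic_thread_fence(memory_order_seq_cst);
    l->entering[me] = 0;
    atomic_thread_fence(memory_order_seq_cst);

    /* Wait for every contender holding a smaller (ticket, CPU index) pair. */
    for (int they = 0; they < BAKERY_LOCK_MAX_CPUS; they++) {
        if (they == me)
            continue;
        while (l->entering[they])
            ;   /* the other CPU is still choosing its ticket */
        while (l->number[they] != 0 &&
               (l->number[they] < l->number[me] ||
                (l->number[they] == l->number[me] && they < me)))
            ;   /* the other CPU goes first */
    }
    atomic_thread_fence(memory_order_seq_cst);
    l->owner = me;
}

/* Hypothetical helper: release the lock held by CPU index 'me'. */
static void sketch_bakery_release(bakery_lock_t *l, int me)
{
    assert(l->owner == me);
    l->owner = NO_OWNER;
    atomic_thread_fence(memory_order_seq_cst);
    l->number[me] = 0;
}
```

Note that `sketch_bakery_acquire()` only ever writes `entering[me]` and `number[me]`, while it reads the slots of every other CPU. This is the property that the rest of this section relies on.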

Depending upon the data cache line size, the per-CPU fields of the
`bakery_lock_t` structure for multiple CPUs may exist on a single cache line.
These per-CPU fields can be read and written during lock contention by multiple
CPUs with mismatched memory attributes. Since these fields are a part of the
lock implementation, they do not have access to any other locking primitive to
safeguard against the resulting coherency issues. As a result, simple software
cache maintenance is not enough to allocate them outside coherent memory.
Consider the following example.

CPU0 updates its per-CPU field with data cache enabled. This write updates a
local cache line which contains a copy of the fields for other CPUs as well. Now
CPU1 updates its per-CPU field of the `bakery_lock_t` structure with data cache
disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of
its field in any other cache line in the system. This operation will invalidate
the update made by CPU0 as well.

To use bakery locks when `USE_COHERENT_MEM` is disabled, the lock data structure
has been redesigned. The changes utilise the characteristic of Lamport's Bakery
algorithm mentioned earlier. The per-CPU fields of the new lock structure are
aligned such that they are allocated on separate cache lines. The per-CPU data
framework in Trusted Firmware is used to achieve this. This enables software to
perform software cache maintenance on the lock data structure without running
into coherency issues associated with mismatched attributes.

The per-CPU data framework enables consolidation of data structures on the
fewest cache lines possible. This saves memory as compared to the scenario where
each data structure is separately aligned to the cache line boundary to achieve
the same effect.
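
The effect of this consolidation can be illustrated with a small sketch. It assumes a 64-byte cache line and uses a hypothetical `sketch_cpu_data_t` container holding one 16-bit slot per lock; none of these names or sizes come from the firmware. All slots belonging to one CPU share a cache line, while slots belonging to different CPUs never do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64   /* assumed cache line size in bytes */
#define SKETCH_NUM_CPUS   2    /* assumed CPU count */
#define SKETCH_NUM_LOCKS  3    /* assumed number of bakery locks */

/*
 * Hypothetical per-CPU container. Aligning the first member to the cache
 * line size aligns the whole structure and pads its size to a multiple of
 * the cache line size, so two CPUs never share a line.
 */
typedef struct sketch_cpu_data {
    _Alignas(SKETCH_CACHE_LINE) uint16_t lock_slot[SKETCH_NUM_LOCKS];
} sketch_cpu_data_t;

static sketch_cpu_data_t sketch_percpu[SKETCH_NUM_CPUS];

/* Index of the cache line that contains the given address. */
static uintptr_t sketch_line_of(const void *p)
{
    return (uintptr_t)p / SKETCH_CACHE_LINE;
}
```

With this layout, one cache line per CPU is enough for all of that CPU's lock slots. Aligning each slot separately would instead cost one cache line per lock per CPU.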

The bakery lock data structure `bakery_info_t` is defined for use when
`USE_COHERENT_MEM` is disabled as follows:

    typedef struct bakery_info {
        /*
         * The lock_data is a bit-field of 2 members:
         * Bit[0]       : choosing. This field is set when the CPU is
         *                choosing its bakery number.
         * Bits[1 - 15] : number. This is the bakery number allocated.
         */
         volatile uint16_t lock_data;
    } bakery_info_t;

The `bakery_info_t` represents a single per-CPU field of one lock and
the combination of corresponding `bakery_info_t` structures for all CPUs in the
system represents the complete bakery lock. It is embedded in the per-CPU
data framework `cpu_data` as shown below:

      CPU0 cpu_data
    ------------------
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU0
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU0
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU0
    ------------------


      CPU1 cpu_data
    ------------------
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_0 per-CPU field
    |    Lock_0      |     for CPU1
    |----------------|
    | `bakery_info_t`| <-- Lock_1 per-CPU field
    |    Lock_1      |     for CPU1
    |----------------|
    | ....           |
    |----------------|
    | `bakery_info_t`| <-- Lock_N per-CPU field
    |    Lock_N      |     for CPU1
    ------------------

Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an
operation on Lock_N, the corresponding `bakery_info_t` in both CPU0 and CPU1
`cpu_data` need to be fetched and appropriate cache operations need to be
performed for each access.
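
The bit layout of `lock_data` and the per-lock walk across CPUs described above can be sketched as follows. The helper names, the array `sketch_cpu_locks` and the CPU and lock counts are hypothetical and only serve the illustration. The cache maintenance that a real implementation would perform around each access is reduced to a comment.

```c
#include <assert.h>
#include <stdint.h>

typedef struct bakery_info {
    volatile uint16_t lock_data;   /* Bit[0]: choosing, Bits[1-15]: number */
} bakery_info_t;

#define SKETCH_CHOOSING_BIT 0x1u
#define SKETCH_NUM_CPUS     2      /* assumed CPU count */
#define SKETCH_NUM_LOCKS    3      /* assumed number of bakery locks */

/* Stand-in for the bakery_info_t arrays held in each CPU's cpu_data. */
static bakery_info_t sketch_cpu_locks[SKETCH_NUM_CPUS][SKETCH_NUM_LOCKS];

/* Pack the choosing flag and the bakery number into one 16-bit value. */
static uint16_t sketch_make_lock_data(unsigned choosing, unsigned number)
{
    assert(number <= 0x7FFFu);     /* the number must fit in 15 bits */
    return (uint16_t)((number << 1) | (choosing & SKETCH_CHOOSING_BIT));
}

static unsigned sketch_is_choosing(const bakery_info_t *info)
{
    return info->lock_data & SKETCH_CHOOSING_BIT;
}

static unsigned sketch_ticket(const bakery_info_t *info)
{
    return info->lock_data >> 1;
}

/* Highest bakery number currently held by any CPU for the given lock id. */
static unsigned sketch_max_ticket(unsigned lock_id)
{
    unsigned max = 0;

    assert(lock_id < SKETCH_NUM_LOCKS);
    for (unsigned cpu = 0; cpu < SKETCH_NUM_CPUS; cpu++) {
        /* A real implementation would do cache maintenance before reading. */
        unsigned ticket = sketch_ticket(&sketch_cpu_locks[cpu][lock_id]);
        if (ticket > max)
            max = ticket;
    }
    return max;
}
```

Packing both fields into a single 16-bit value lets a CPU publish its choosing flag and its number with one store to memory that only it writes.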

For multiple bakery locks, an array of `bakery_info_t` is declared in `cpu_data`
and each lock is given an `id` to identify it in the array.

### Non Functional Impact of removing coherent memory

Removal of the coherent memory region leads to the additional software overhead
of performing cache maintenance for the affected data structures. However, since
the memory where the data structures are allocated is cacheable, the overhead is
mostly mitigated by an increase in performance.

There is however a performance impact for bakery locks, due to:

*   Additional cache maintenance operations, and
*   Multiple cache line reads for each lock operation, since the bakery locks
    for each CPU are distributed across different cache lines.

The implementation has been optimized to minimize this additional overhead.
Measurements indicate that when bakery locks are allocated in Normal memory, the
minimum latency of acquiring a lock is on average 3-4 microseconds, whereas
in Device memory it is 2 microseconds. The measurements were done on the
Juno ARM development platform.

As mentioned earlier, almost a page of memory can be saved by disabling
`USE_COHERENT_MEM`. Each platform needs to consider these trade-offs to decide
whether coherent memory should be used. If a platform disables
`USE_COHERENT_MEM` and needs to use bakery locks in the porting layer, it should
reserve memory in `cpu_data` by defining the macro `PLAT_PCPU_DATA_SIZE` (see
the [Porting Guide]). Refer to the reference platform code for examples.


12.  Code Structure
-------------------

Trusted Firmware code is logically divided between the three boot loader
stages mentioned in the previous sections.
The code is also divided into the
following categories (present as directories in the source code):

*   **Architecture specific.** This could be AArch32 or AArch64.
*   **Platform specific.** Choice of architecture specific code depends upon
    the platform.
*   **Common code.** This is platform and architecture agnostic code.
*   **Library code.** This code consists of functionality commonly used by all
    other code.
*   **Stage specific.** Code specific to a boot stage.
*   **Drivers.**
*   **Services.** EL3 runtime services, e.g. PSCI or SPD. Specific SPD services
    reside in the `services/spd` directory (e.g. `services/spd/tspd`).

Each boot loader stage uses code from one or more of the above mentioned
categories. Based upon the above, the code layout looks like this:

    Directory    Used by BL1?    Used by BL2?    Used by BL3-1?
    bl1          Yes             No              No
    bl2          No              Yes             No
    bl31         No              No              Yes
    arch         Yes             Yes             Yes
    plat         Yes             Yes             Yes
    drivers      Yes             No              Yes
    common       Yes             Yes             Yes
    lib          Yes             Yes             Yes
    services     No              No              Yes

The build system provides a non-configurable build option `IMAGE_BLx` for each
boot loader stage (where x = BL stage). For example, for BL1, `IMAGE_BL1` will
be defined by the build system. This enables the Trusted Firmware to compile
certain code only for specific boot loader stages.

All assembler files have the `.S` extension. The linker source files for each
boot stage have the extension `.ld.S`. These are processed by GCC to create the
linker scripts which have the extension `.ld`.

FDTs provide a description of the hardware platform and are used by the Linux
kernel at boot time. These can be found in the `fdts` directory.


13.  References
---------------

1.  Trusted Board Boot Requirements CLIENT PDD (ARM DEN 0006B-5). Available
    under NDA through your ARM account representative.

2.  [Power State Coordination Interface PDD (ARM DEN 0022B.b)][PSCI].

3.  [SMC Calling Convention PDD (ARM DEN 0028A)][SMCCC].

4.  [ARM Trusted Firmware Interrupt Management Design guide][INTRG].

- - - - - - - - - - - - - - - - - - - - - - - - - -

_Copyright (c) 2013-2014, ARM Limited and Contributors. All rights reserved._

[ARM ARM]:          http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html "ARMv8-A Reference Manual (ARM DDI0487A.E)"
[PSCI]:             http://infocenter.arm.com/help/topic/com.arm.doc.den0022c/DEN0022C_Power_State_Coordination_Interface.pdf "Power State Coordination Interface PDD (ARM DEN 0022C)"
[SMCCC]:            http://infocenter.arm.com/help/topic/com.arm.doc.den0028a/index.html "SMC Calling Convention PDD (ARM DEN 0028A)"
[UUID]:             https://tools.ietf.org/rfc/rfc4122.txt "A Universally Unique IDentifier (UUID) URN Namespace"
[User Guide]:       ./user-guide.md
[Porting Guide]:    ./porting-guide.md
[INTRG]:            ./interrupt-framework-design.md
[CPUBM]:            ./cpu-specific-build-macros.md