= GIET_VM / Boot Procedure =

[[PageOutline]]

The boot procedure is done in three phases:
 * The generic ''reset'' code (hard-coded in the external ROM) is executed by processor P(0,0,0), and loads the GIET_VM boot-loader code, stored in the ''boot.elf'' file, from the external disk to the physical memory.
 * The GIET_VM boot-loader is executed in parallel by all processors P(x,y,0): one processor per cluster. The boot-loader loads the ''map.bin'' file, builds the page tables, initializes the schedulers as specified in the mapping, initializes the peripherals, and loads the kernel code, as well as the user application(s) code, into memory.
 * Finally, the GIET_VM ''kernel_init()'' function is executed by all processors P(x,y,p), and completes the kernel initialization.

== Phase 1 : Reset Initialization ==

After a hard reset, all processors execute the same ''reset'' code (also called ''preloader'' code) stored in the external ROM. The work done depends on the processor global index:
 * Processor P(0,0,0) loads the GIET_VM boot-loader code from the external disk (or another bootable peripheral) to the physical memory bank in cluster(0,0): segments seg_boot_code and seg_boot_data.
 * All other processors initialize their private interrupt controller, to be able to receive an inter-processor interrupt (WTI), and enter ''wait_state'' in low-power mode.

This ''reset'' code is generic, and can be used to boot any operating system.

== Phase 2 : Boot Initialisation ==

The GIET_VM boot-loader is defined in the [source:soft/giet_vm/giet_boot/boot.c boot.c] and [source:soft/giet_vm/giet_boot/boot_entry.S boot_entry.S] files. Step 1 is executed by processor P(0,0,0) only, but the other steps are executed in parallel by all processors P(x,y,0): one processor per cluster.

=== step 0 ===

The ''boot_entry.S'' file defines the entry point in the GIET_VM boot-loader. This assembly code is executed by all processors.
It allocates a stack from the seg_boot_stack segment (defined by the SEG_BOOT_STACK_BASE and SEG_BOOT_STACK_SIZE parameters in the ''hard_config.h'' file), and initialises the CP0_SP (stack pointer) register.
 * The stack size for P(x,y,0) is 1.25 Kbytes (0x500 bytes).
 * The stack size for the other processors is 0.25 Kbytes (0x100 bytes).
 * SEG_BOOT_STACK_SIZE cannot be smaller than: (0x100 * (NB_PROCS_MAX-1) + 0x500) * X_SIZE * Y_SIZE

=== step 1 ===

Processor P(0,0,0) initializes the FAT, initializes the TTY0 lock, initializes the synchronization barrier, and loads the ''map.bin'' file to the physical memory bank in cluster(0,0), in segment ''seg_boot_mapping''. Then processor P(0,0,0) uses inter-processor interrupts (WTI) to start the parallel execution, and activates one processor per cluster (processor P(x,y,0)) in all clusters containing processors.

=== step 2 ===

In each cluster(x,y), processor P(x,y,0) initializes the physical memory allocators (function '''boot_pmem_init()'''). The GIET_VM uses two types of pages: BPP (Big Physical Page, 2 Mbytes), and SPP (Small Physical Page, 4 Kbytes). There is one SPP allocator and one BPP allocator per cluster containing a physical memory bank. All physical memory allocation must be done by the boot-loader in the boot phase, and these memory allocators should not be used by the kernel in the execution phase.

=== step 3 ===

In each cluster(x,y), processor P(x,y,0) initializes the local page tables (function '''boot_ptabs_init()''') as specified in the mapping. There is one page table per user application (vspace) defined in the mapping, and it is replicated in all clusters containing processors. In each cluster, all page tables are packed in one segment (seg_ptab) occupying one single big page (2 Mbytes). Global vsegs are mapped in all vspaces. Any vseg (except the peripherals) can be mapped on any physical segment.
As the kernel read-only segments (seg_kcode and seg_kinit) are replicated in all clusters to avoid contention, the content of the page tables depends on the cluster coordinates: for the kernel code, a given virtual address is mapped to different physical addresses, depending on the cluster coordinates.

=== step 4 ===

In each cluster(x,y), processor P(x,y,0) initializes the schedulers of all processors in the cluster (function '''boot_schedulers_init()''') as specified in the mapping:
 * There is one scheduler per processor.
 * The HWI, PTI, and WTI interrupt vectors are initialised.
 * The local IRQ routing (defined by the XCU masks) is statically defined.
 * Any task defined in any application can be allocated to any processor.
 * The allocation of tasks to processors is fully static (no task migration).
 * One single processor cannot schedule more than 14 tasks.
 * One scheduler occupies 8 Kbytes, and contains the contexts of all tasks allocated to the processor (256 bytes per task).

=== step 5 ===

Processor P(0,0,0) initializes the peripherals (function '''boot_peripherals_init()''') and the coprocessors, and loads into memory the kernel code (''kernel.elf'' file) and the user code for all applications specified in the mapping (function '''boot_elf_load()''').

=== step 6 ===

Finally, in each cluster(x,y), processor P(x,y,0) starts all other processors in the cluster, using an inter-processor interrupt (WTI). Each processor initializes its own CP0_SCHED register, its own CP2_MODE register to activate its MMU, and its own CP0_SR register to use the GIET_VM exception handler, then jumps to the ''kernel_init()'' function.

== Phase 3 : Kernel Initialisation ==

This code is executed in parallel by all processors P(x,y,p). All processors enter the same [source:soft/giet_vm/giet_kernel/kernel_init.c kernel_init.c] code and execute the following steps, separated by synchronization barriers.
Step 0 is done by processor P(0,0,0) only; the other steps are done by all processors in parallel.

=== step 0 ===

Processor P(0,0,0) initializes the kernel_heap[x][y] array, the TTY0 lock, and the kernel barrier.

=== step 1 ===

Each processor P(x,y,p) gets its scheduler virtual address from the CP0_SCHED register and contributes to the initialization of the _schedulers[x][y][p] array.

=== step 2 ===

Each processor P(x,y,p) loops on all its allocated tasks to build the _ptabs_vaddr[vspace] and _ptabs_ptprs[vspace] arrays from the task contexts, and to complete the initialization of the task contexts (CTX_RA and CTX_EPC slots).

=== step 3 ===

Each processor P(x,y,p) completes its private idle_task context, and starts its TICK timer if it has at least one task allocated. Processor P(0,0,0) initializes the kernel FAT.

=== step 4 ===

Each processor P(x,y,p) sets the SP, SR, PTPR, and EPC registers with the values corresponding to its first allocated task, and jumps to the user code.
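
As a worked check of the boot stack sizing rule given in Phase 2, step 0, the lower bound on SEG_BOOT_STACK_SIZE can be computed as in the minimal sketch below. The two helper functions and the macro names are hypothetical (they are not part of the GIET_VM sources); only the constants 0x500 and 0x100 and the formula itself come from this page, and the sketch assumes at least one processor per cluster.

```c
#include <assert.h>

/* Per-processor boot stack sizes quoted in Phase 2, step 0. */
#define BOOT_STACK_SIZE_P0    0x500u  /* 1.25 Kbytes for P(x,y,0)            */
#define BOOT_STACK_SIZE_OTHER 0x100u  /* 0.25 Kbytes for each other processor */

/* Hypothetical helper (not a GIET_VM function): smallest legal value of
 * SEG_BOOT_STACK_SIZE, in bytes, for an x_size * y_size mesh with
 * nb_procs_max processors per cluster. */
unsigned int boot_stack_min_size(unsigned int x_size,
                                 unsigned int y_size,
                                 unsigned int nb_procs_max)
{
    /* Assumption of this sketch: every cluster has at least P(x,y,0). */
    assert(nb_procs_max >= 1);

    /* One large stack for P(x,y,0), one small stack per other processor. */
    unsigned int per_cluster =
        BOOT_STACK_SIZE_OTHER * (nb_procs_max - 1) + BOOT_STACK_SIZE_P0;

    return per_cluster * x_size * y_size;
}

/* Hypothetical helper: returns 1 if the configured segment size satisfies
 * the lower bound, 0 otherwise. */
int boot_stack_size_ok(unsigned int seg_boot_stack_size,
                       unsigned int x_size,
                       unsigned int y_size,
                       unsigned int nb_procs_max)
{
    return seg_boot_stack_size >=
           boot_stack_min_size(x_size, y_size, nb_procs_max);
}
```

For example, with X_SIZE = 2, Y_SIZE = 2 and NB_PROCS_MAX = 4, each cluster needs 0x100 * 3 + 0x500 = 0x800 bytes, so the whole segment must be at least 0x800 * 4 = 0x2000 bytes (8 Kbytes).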