TSAR virtual memory
The TSAR MMU (Memory Management Unit) is a hardware component implemented in the L1 cache controller. This cache controller is a generic component that can be used with any single-issue, 32-bit processor. Like any MMU, the generic TSAR MMU is in charge of virtual to physical address translation, and performs access right verification. It implements a paged virtual memory, supporting two page sizes: 4 Kbyte pages and 2 Mbyte pages.
As the processor core can issue an instruction request and a data request simultaneously, there are actually two separate instruction and data caches, sharing the same physical access to the VCI/OCP interconnect. These L1 caches use physical addresses. Similarly, the L1 cache controller contains two separate hardware MMUs for instructions and data. To remain independent of the processor core choice, TLB misses are handled by a hardwired Finite State Machine (called a Table Walk), without any software action.
1. Page Table Organisation
The TSAR architecture defines two page sizes: 4 Kbyte pages and 2 Mbyte pages. The virtual address space size is 4 Gbytes (32-bit virtual addresses). The physical address space is limited to 1 Tbyte (40-bit physical addresses). The page tables are built by the operating system, and are stored in main memory.
1.1 Two-level Page Table structure
As described below, the Page Table has a hierarchical two-level structure:
- All page tables (first and second level) must be aligned: the page table base address must be a multiple of 8 Kbytes for a first level page table, and a multiple of 4 Kbytes for a second level page table.
- The page tables can be placed anywhere in the physical address space.
- The PTPR register is located in the generic MMU, and is re-initialised by the OS at each context switch. It contains the 27 MSB bits of the first level page table base address, and is extended to 40 bits (left-shifted by 13 bits, the table being aligned on 8 Kbytes) by the Table Walk FSM in case of a TLB miss.
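The address arithmetic implied by these alignment rules can be sketched in C. This is only an illustrative sketch, not the hardware implementation: the function names are hypothetical, and the first level index is assumed to be VA[31:21], which follows from an 8 Kbyte table of 32-bit entries (2048 entries, one per 2 Mbyte slice of the 4 Gbyte virtual space).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: physical base address of the first level table.
 * PTPR holds the 27 MSBs; the 13 LSBs are zero (8 Kbyte alignment). */
uint64_t pt1_base(uint32_t ptpr)
{
    assert((ptpr >> 27) == 0);            /* PTPR is a 27-bit value */
    return (uint64_t)ptpr << 13;          /* 27 + 13 = 40 bits */
}

/* Hypothetical helper: physical address of the first level entry
 * selected by a virtual address (assumed index: VA[31:21]). */
uint64_t pt1_entry_addr(uint32_t ptpr, uint32_t vaddr)
{
    uint32_t ix1 = vaddr >> 21;           /* 11-bit index, 0..2047 */
    return pt1_base(ptpr) + ((uint64_t)ix1 << 2);   /* 4 bytes per entry */
}
```

For example, a PTPR value of 1 designates a table at physical address 0x2000, and the virtual address 0x00200000 selects the entry at offset 4 in that table.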
1.2 First Level Page Table Entry Format
Each entry in a first level page table can contain either a 2 Mbyte page descriptor (called PTE1), or a second level page table descriptor (called PTD1). It is implemented as a single 32-bit word:
- PTE1 :
V | T | L | R | C | W | X | U | G | D | reserved (3 bits) | PPN1 (19 bits) |
- PTD1
V | T | reserved (2 bits) | PTBA (28 bits) |
The various fields are defined as follows :
V | Valid bit | Valid entry when 1 (set by the OS) |
T | Type bit | PTD1 when 1 (set by the OS) |
L | Local access bit | Used by the OS for page replacement (set by the hardware) |
R | Remote access bit | Used by the OS for page replacement (set by the hardware) |
C | Cachable bit | The page is cachable in the L1 cache when 1 (set by the OS) |
W | Writable bit | The page is writable when 1 (set by the OS) |
X | eXecutable bit | The page can contain instructions when 1 (set by the OS) |
U | User bit | The page is accessible in user mode when 1 (set by the OS) |
G | Global bit | Entry not invalidated in TLB flush when 1 (set by the OS) |
D | Dirty bit | The page has been modified when 1 (set by the hardware) |
PPN1 | Physical Page Number | Concatenated with the page offset to build the physical address |
PTBA | Page Table Base Address | Second level page table base address |
The L, R, D bits are used by the operating system to implement the page replacement policy.
- The D bit is set by the hardware, when a page is written and when it is not already set, using an atomic access (LL/SC).
- The L bit is set by the hardware, when the page is accessed by a local processor or coprocessor, after a TLB miss, and when it is not already set.
- The R bit is set by the hardware, when the page is accessed by a remote processor or coprocessor, after a TLB miss, and when it is not already set.
These page table updates use atomic access (LL/SC).
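The "set only if not already set" update can be modelled in software as follows. This is a sketch under stated assumptions: a C11 compare-and-swap loop stands in for the LL/SC pair used by the hardware, the function name is hypothetical, and the D bit is assumed to be bit 22 (the fields of the entry being listed MSB first).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PTE_D_MASK 0x00400000u   /* assumption: D is bit 22 (MSB-first layout) */

/* Set `mask` in a page table entry only if it is not already set.
 * Returns 1 if this call performed the update, 0 if the bit was already set. */
int pte_set_flag(_Atomic uint32_t *pte, uint32_t mask)
{
    assert(mask != 0);
    uint32_t old = atomic_load(pte);
    while ((old & mask) == 0) {
        /* On failure, `old` is refreshed with the current value
         * and the "already set" test is repeated. */
        if (atomic_compare_exchange_weak(pte, &old, old | mask))
            return 1;
    }
    return 0;
}
```

Testing the bit before writing avoids a memory write when the flag is already set, which is the common case once a page has been accessed.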
If the entry is a PTE1, the PPN1 value (19 bits) must be concatenated with the page offset (21 bits) to build the 40-bit physical address.
If the entry is a PTD1, the PTBA value (28 bits) must be left-shifted by 12 bits to define the base address of the second level page table. As the page table is aligned in memory, the 12 LSB bits of this base address are 0.
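Decoding a first level entry can be sketched as below. The mask values assume that the fields are listed MSB first (V in bit 31, T in bit 30, PPN1 and PTBA in the LSBs); the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: fields are listed MSB first, so V is bit 31 and T is bit 30. */
#define PTE_V_MASK 0x80000000u
#define PTE_T_MASK 0x40000000u
#define PPN1_MASK  0x0007FFFFu   /* 19 LSBs of a PTE1 */
#define PTBA_MASK  0x0FFFFFFFu   /* 28 LSBs of a PTD1 */

/* 40-bit physical address mapped by a valid PTE1 (2 Mbyte page). */
uint64_t pte1_paddr(uint32_t pte1, uint32_t vaddr)
{
    assert((pte1 & PTE_V_MASK) && !(pte1 & PTE_T_MASK));
    uint64_t ppn1 = pte1 & PPN1_MASK;
    return (ppn1 << 21) | (vaddr & 0x001FFFFFu);   /* PPN1 . 21-bit offset */
}

/* 40-bit base address of the second level table named by a valid PTD1. */
uint64_t ptd1_pt2_base(uint32_t ptd1)
{
    assert((ptd1 & PTE_V_MASK) && (ptd1 & PTE_T_MASK));
    return (uint64_t)(ptd1 & PTBA_MASK) << 12;     /* 12 LSBs are zero */
}
```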
1.3 Second Level Page Table Entry Format
Each entry in a second level page table contains a 4 Kbyte page descriptor (called PTE2). It is implemented as two 32-bit words: the first word contains the flags; the second word contains the 28-bit physical page number (PPN2).
- PTE2 first word :
V | T | L | R | C | W | X | U | G | D | reserved (22 bits) |
- PTE2 second word :
reserved (4 bits) | PPN2 (28 bits) |
The various fields are defined as follows :
V | Valid bit | Valid entry when 1 (set by the OS) |
T | Type bit | Must be 0 for a PTE2 (set by the OS) |
L | Local access bit | Used by the OS for page replacement (set by the hardware) |
R | Remote access bit | Used by the OS for page replacement (set by the hardware) |
C | Cachable bit | The page is cachable in the L1 cache when 1 (set by the OS) |
W | Writable bit | The page is writable when 1 (set by the OS) |
X | eXecutable bit | The page can contain instructions when 1 (set by the OS) |
U | User bit | The page is accessible in user mode when 1 (set by the OS) |
G | Global bit | Entry not invalidated in TLB flush when 1 (set by the OS) |
D | Dirty bit | The page has been modified when 1 (set by the hardware) |
PPN2 | Physical Page Number | Concatenated with the page offset to build the 40-bit address |
The L, R, D bits are used by the operating system to implement the page replacement policy.
- The D bit is set by the hardware, when a page is written and when it is not already set, using an atomic access (LL/SC).
- The L bit is set by the hardware, when the page is accessed by a local processor or coprocessor, after a TLB miss, and when it is not already set.
- The R bit is set by the hardware, when the page is accessed by a remote processor or coprocessor, after a TLB miss, and when it is not already set.
These page table updates use atomic access (LL/SC).
The PPN2 value (28 bits) must be concatenated with the page offset (12 bits) to build the 40-bit physical address.
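The second level lookup can be sketched in the same way. The struct and function names are hypothetical, and the second level index is assumed to be VA[20:12], which follows from a 4 Kbyte table holding 512 entries of 8 bytes (one per 4 Kbyte page of a 2 Mbyte region).

```c
#include <assert.h>
#include <stdint.h>

/* A PTE2 is two consecutive 32-bit words: flags first, then PPN2. */
typedef struct {
    uint32_t flags;   /* V, T, L, R, C, W, X, U, G, D + 22 reserved bits */
    uint32_t ppn2;    /* 4 reserved bits + 28-bit PPN2 */
} pte2_t;

/* Physical address of the PTE2 for `vaddr` inside a second level table.
 * Assumed index: VA[20:12]. */
uint64_t pte2_addr(uint64_t pt2_base, uint32_t vaddr)
{
    assert((pt2_base & 0xFFFu) == 0);        /* 4 Kbyte alignment */
    uint32_t ix2 = (vaddr >> 12) & 0x1FFu;   /* 9-bit index, 0..511 */
    return pt2_base + ((uint64_t)ix2 << 3);  /* 8 bytes per entry */
}

/* 40-bit physical address: PPN2 concatenated with the 12-bit offset. */
uint64_t pte2_paddr(pte2_t pte, uint32_t vaddr)
{
    uint64_t ppn2 = pte.ppn2 & 0x0FFFFFFFu;
    return (ppn2 << 12) | (vaddr & 0xFFFu);
}
```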
2. Generic MMU
The generic MMU is implemented as a hardware component in the L1 cache controller. As the processor core can issue an instruction request and a data request simultaneously, there are actually two separate instruction and data caches, sharing the same physical access to the VCI/OCP interconnect. These caches are set-associative, and have a total capacity of 16 Kbytes (64 bytes x 64 sets x 4 ways):
- cache line width = 64 bytes
- number of associative sets = 64 sets
- number of associative ways = 4 ways
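With this geometry, a 40-bit physical address splits into a 6-bit byte offset, a 6-bit set index and a 28-bit tag. The sketch below assumes the conventional split (offset in the LSBs, then the set index, then the tag); the exact bit assignment is not stated here, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { LINE_BYTES = 64, NB_SETS = 64, NB_WAYS = 4 };

typedef struct {
    uint32_t tag;      /* 28 bits */
    uint32_t set;      /*  6 bits */
    uint32_t offset;   /*  6 bits */
} cache_addr_t;

/* Split a 40-bit physical address into tag / set index / byte offset. */
cache_addr_t cache_split(uint64_t paddr)
{
    assert((paddr >> 40) == 0);
    cache_addr_t a;
    a.offset = (uint32_t)(paddr % LINE_BYTES);
    a.set    = (uint32_t)((paddr / LINE_BYTES) % NB_SETS);
    a.tag    = (uint32_t)(paddr / (LINE_BYTES * NB_SETS));
    return a;
}
```

Under this assumption the offset and set index together cover exactly the 12 LSBs of the address, that is, the offset inside a 4 Kbyte page.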
Similarly, the L1 cache controller contains two separate hardware MMUs for instructions and data. Each MMU contains a 64-entry TLB (Translation Look-aside Buffer). These TLBs are implemented as set-associative caches (16 sets of 4 ways). Each entry in these TLBs can contain either a 4 Kbyte page descriptor or a 2 Mbyte page descriptor. The figure below illustrates the general structure of the TSAR L1 caches.
For both data and instructions, the TSAR L1 caches use physical addresses: the cache directories are indexed by physical addresses, and the tags contained in the directories are obtained from physical addresses. As the access to the L1 cache is a critical path, the TSAR MMU uses a speculative approach to avoid serializing the TLB access and the L1 cache access:
- After each TLB hit, the input VPN and the resulting PPN values are saved in two registers, VPN_save and PPN_save.
- During access (n), the PPN_save value, corresponding to access (n-1), is used to access the cache. Simultaneously, the cache controller checks that the VPN value is equal to the VPN_save value (no page change).
- In case of a TLB hit with a page change, the cache must be accessed twice, which costs a one-cycle penalty.
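The speculative scheme can be illustrated with a small software model. This is only a behavioural sketch with hypothetical names: it covers the TLB hit case only, and it ignores the initial state of the two registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the VPN_save / PPN_save registers. */
typedef struct {
    uint32_t vpn_save;
    uint64_t ppn_save;
} spec_regs_t;

/* Model one access that hits in the TLB; `ppn` is the TLB result for `vpn`.
 * Returns the number of cache accesses needed: 1 if the speculation was
 * right (same page as the previous access), 2 if the page changed. */
int spec_cache_accesses(spec_regs_t *r, uint32_t vpn, uint64_t ppn)
{
    assert(r != 0);
    int accesses = (vpn == r->vpn_save) ? 1 : 2;
    r->vpn_save = vpn;    /* both values are saved after each TLB hit */
    r->ppn_save = ppn;
    return accesses;
}
```

Consecutive accesses to the same page therefore cost a single cache access each, and only the first access after a page change pays the extra cycle.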
2.1 Generic MMU activation
After a general RESET, both the L1 caches and the generic MMU are deactivated. As long as the MMU is not activated, the 32-bit virtual address is simply extended to 40 bits, and directly used as a physical address. As long as the caches are not activated, all accesses are handled as uncached by the cache controller.
The instruction cache, the data cache, the instruction MMU and the data MMU can be separately activated by software, by writing to the MMU_MODE register, using the MMU driver.
2.2 Generic MMU exceptions
The hardware MMU can signal exceptions by raising the general instruction_bus_error and data_bus_error signals (for an instruction or data access respectively). The error type is written in the INS_ERROR_TYPE and DATA_ERROR_TYPE registers, as described below:
Exception type | code | cause | severity |
MMU_PT1_UNMAPPED | | Page fault on Table 1 (invalid PTE) | non fatal error |
MMU_PT2_UNMAPPED | | Page fault on Table 2 (invalid PTE) | non fatal error |
MMU_PRIVILEGE_VIOLATION | | Protected access in user mode | user error |
MMU_WRITE_VIOLATION | | Write access to a non writable page | user error |
MMU_EXEC_VIOLATION | | Exec access to a non executable page | user error |
MMU_UNDEFINED_XTN | | Undefined external access address | user error |
MMU_PT1_ILLEGAL_ACCESS | | Bus error during Table 1 access | kernel error |
MMU_PT2_ILLEGAL_ACCESS | | Bus error during Table 2 access | kernel error |
MMU_CACHE_ILLEGAL_ACCESS | | Bus error during the cache access | kernel error |
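The access right checks behind the UNMAPPED and VIOLATION exceptions can be sketched as follows. Several points are assumptions: the flag masks suppose an MSB-first field layout (V in bit 31, W in bit 26, X in bit 25, U in bit 24), the order in which the checks are made is arbitrary, and the enum values are symbolic names for this sketch, not the hardware exception codes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: fields are listed MSB first (V = bit 31 ... D = bit 22). */
#define PTE_V_MASK 0x80000000u
#define PTE_W_MASK 0x04000000u
#define PTE_X_MASK 0x02000000u
#define PTE_U_MASK 0x01000000u

/* Symbolic results only: the numeric values are NOT the hardware codes. */
typedef enum {
    CHECK_OK,
    CHECK_UNMAPPED,
    CHECK_PRIVILEGE_VIOLATION,
    CHECK_WRITE_VIOLATION,
    CHECK_EXEC_VIOLATION
} check_t;

typedef enum { ACC_READ, ACC_WRITE, ACC_EXEC } access_t;

/* Check the flags of a PTE1 or PTE2 against the requested access. */
check_t check_access(uint32_t flags, access_t acc, int user_mode)
{
    assert(acc == ACC_READ || acc == ACC_WRITE || acc == ACC_EXEC);
    if (!(flags & PTE_V_MASK))                     return CHECK_UNMAPPED;
    if (user_mode && !(flags & PTE_U_MASK))        return CHECK_PRIVILEGE_VIOLATION;
    if (acc == ACC_WRITE && !(flags & PTE_W_MASK)) return CHECK_WRITE_VIOLATION;
    if (acc == ACC_EXEC && !(flags & PTE_X_MASK))  return CHECK_EXEC_VIOLATION;
    return CHECK_OK;
}
```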
2.3 Generic MMU register mapping
The generic MMU contains a set of registers (or pseudo-registers) that can be accessed by the operating system, through a dedicated MMU driver. In the case of the MIPS processor, these registers are implemented in coprocessor 2, and are accessed using the mtc2 (write) and mfc2 (read) instructions.
These registers are described below :
Register name | index | description | mode |
MMU_PTPR | | Page Table Pointer Register | R/W |
MMU_MODE | | Data & Inst TLBs Mode Register | R/W |
MMU_ICACHE_FLUSH | | Instruction Cache flush | W |
MMU_DCACHE_FLUSH | | Data Cache flush | W |
MMU_ITLB_INVAL | | Instruction TLB line invalidation | W |
MMU_DTLB_INVAL | | Data TLB line invalidation | W |
MMU_ICACHE_INVAL | | Instruction Cache line invalidation | W |
MMU_DCACHE_INVAL | | Data Cache line invalidation | W |
MMU_IETR | | Instruction Exception Type Register | R/W |
MMU_IBVAR | | Instruction Bad Virtual Address Register | R/W |
MMU_DETR | | Data Exception Type Register | R/W |
MMU_DBVAR | | Data Bad Virtual Address Register | R/W |
3. I/O MMU
To be defined...