Changes between Version 15 and Version 16 of replication_distribution
Timestamp: Oct 13, 2016, 6:59:46 PM
replication_distribution
* The read-only segments (type CODE) are replicated in all clusters where they are used.
* The private segments (type STACK) are placed in the same cluster as the thread using them.
* The shared segments (types DATA, HEAP) are distributed on all clusters as regularly as possible, to avoid contention.
* The pinned segments (type MMAP) are placed in the specified cluster.

To actually control data placement on the physical memory banks, the kernel uses the paged virtual memory MMU to map a virtual segment to a given physical memory bank.

This replication / distribution policy is implemented by the Virtual Memory Manager (in the vmm.h / vmm.c files).

A '''vseg''' is a contiguous memory zone in the process virtual space, always covering an integer number of pages. Depending on its type, a '''vseg''' has specific attributes regarding access rights, replication policy, and distribution policy. The vseg descriptor is defined by the vseg_t structure in the vseg.h file.
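As an illustrative sketch (not the actual ALMOS-MK code), the four placement policies above could select the target cluster of a page as follows. All names, the cluster count, and the use of a modulo on the VPN are assumptions made for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the per-type page placement policies. */

typedef enum { VSEG_CODE, VSEG_STACK, VSEG_DATA, VSEG_HEAP, VSEG_MMAP } vseg_type_t;

#define NB_CLUSTERS 4   /* assumed number of clusters */

/* vpn    : virtual page number of the page to place
 * local  : cluster of the thread accessing the page
 * pinned : user-specified cluster (pinned types only)  */
static unsigned page_target_cluster(vseg_type_t type, uintptr_t vpn,
                                    unsigned local, unsigned pinned)
{
    switch (type) {
    case VSEG_CODE:  return local;              /* replicated: local copy       */
    case VSEG_STACK: return local;              /* private: thread's cluster    */
    case VSEG_DATA:
    case VSEG_HEAP:  return vpn % NB_CLUSTERS;  /* distributed: VPN LSBs as key */
    case VSEG_MMAP:  return pinned;             /* pinned: specified cluster    */
    }
    return local;
}
```

The distributed case spreads consecutive pages round-robin over the clusters, which is one simple way to obtain the "as regular as possible" distribution mentioned above.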
For each process P, the process descriptor is replicated in all clusters containing at least one thread of P (these clusters are called active clusters). In each active cluster K, the virtual memory manager VMM(P,K) is stored in the local process descriptor, and contains two main structures:
* the VSL(P,K), the list of all vsegs registered for process P in cluster K,
* the GPT(P,K), the generic page table, defining the actual physical mapping of those vsegs.

== 1) User segments types and attributes ==

* A vseg is '''public''' when it can be accessed by any thread T of the process, whatever the cluster running T. It is '''private''' when it can only be accessed by the threads running in the cluster containing the physical memory bank where the vseg is mapped. A '''private''' vseg is entirely mapped in one single cluster K. For a '''public''' vseg, ALMOS-MK implements a global mapping: in all clusters, a given virtual address is mapped to the same physical address.
For a '''private''' vseg, ALMOS-MK implements a local mapping: the same virtual address can be mapped to different physical addresses in different clusters.
* A '''public''' vseg can be '''localized''' (all vseg pages are mapped in the same cluster), or '''distributed''' (different pages are mapped on different clusters, using the least significant bits of the virtual page number (VPN) as distribution key). A '''private''' vseg is always '''localized'''.

ALMOS-MK defines seven vseg types:

…
|| REMOTE || public || localized || Read/Write || one per remote_mmap() ||

1. '''CODE''': This private vseg contains the user application code. ALMOS-MK creates one CODE vseg per cluster. For a process P, the CODE vseg is registered in the VSL(P,Z) when the process is created in reference cluster Z. In the other clusters X, the CODE vseg is registered in VSL(P,X) when a page fault is signaled by a thread of P running in cluster X. In each cluster X, the CODE vseg is physically mapped in cluster X.

1. '''DATA''': This vseg contains the user application global data. ALMOS-MK creates one single DATA vseg per process, registered in the reference VSL(P,Z) when the process P is created in reference cluster Z. In the other clusters X, the DATA vseg is registered in VSL(P,X) when a page fault is signaled by a thread of P running in cluster X. To avoid contention, this vseg is physically distributed on all clusters.
For each page, the physical mapping is decided by the reference cluster Z, but the page can be mapped on any cluster Y.

1. '''HEAP''': This vseg is actually used by the malloc() library. ALMOS-MK creates one single HEAP vseg per process, registered in the reference VSL(P,Z) when the process P is created in reference cluster Z. In the other clusters X, the HEAP vseg is registered in VSL(P,X) when a page fault is signaled by a thread of P running in cluster X. To avoid contention, this vseg is physically distributed on all clusters. For each page, the physical mapping is decided by the reference cluster Z, but the page can be mapped on any cluster Y.

1. '''STACK''': This private vseg contains the execution stack of a thread. For each thread T of process P running in cluster X, ALMOS-MK creates one STACK vseg. This vseg is registered in the VSL(P,X) when the thread descriptor is created in cluster X. To enforce locality, this vseg is physically mapped in cluster X.

1. '''MMAP''': This type of vseg is dynamically created by ALMOS-MK to serve an anonymous mmap() system call executed by a client thread running in a cluster X. The first vseg registration and the physical mapping are done by the reference cluster Z, but the vseg is mapped in the client cluster X.

…
1. '''REMOTE''': This type of vseg is dynamically created by ALMOS-MK to serve a remote_mmap() system call executed by a client thread running in a cluster X. The first vseg registration and the physical mapping are done by the reference cluster Z, but the vseg is mapped in the cluster Y specified by the user.

The replication of the VSL(P,K) and GPT(P,K) kernel structures creates a coherence problem for the non-private vsegs:

* A VSL(P,K) contains all private vsegs in cluster K, but contains only the public vsegs that have actually been accessed by a thread of P running in cluster K. Only the '''reference''' process descriptor, stored in the reference cluster Z, contains the complete list VSL(P,Z) of all public vsegs of process P.
* A GPT(P,K) contains all mapped entries corresponding to private vsegs. For public vsegs, it contains only the entries corresponding to pages that have been accessed by a thread running in cluster K. Only the reference cluster Z contains the complete GPT(P,Z) page table of all mapped entries for process P.

Therefore, the process descriptors - other than the reference one - are used as read-only caches.
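Schematically, this read-only-cache behaviour (a local GPT miss filled from the reference GPT, and removals applied in the reference cluster before being broadcast) can be modelled with a toy sketch. The flat arrays and all names are hypothetical simplifications; the real GPT is a hardware-defined page table:

```c
#include <assert.h>
#include <stdint.h>

#define NB_CLUSTERS 4
#define GPT_SIZE    16
#define REF_CLUSTER 0                       /* plays the reference cluster Z */

/* gpt[k][vpn] = ppn, with 0 meaning "unmapped" */
static uint32_t gpt[NB_CLUSTERS][GPT_SIZE];

/* Page-fault path for a public vseg: on a miss in the local GPT(P,K),
 * copy the entry from the reference GPT(P,Z) into the local cache. */
static uint32_t gpt_get(unsigned k, unsigned vpn)
{
    if (gpt[k][vpn] == 0)
        gpt[k][vpn] = gpt[REF_CLUSTER][vpn];
    return gpt[k][vpn];
}

/* Removal path: the entry is removed first in the reference cluster,
 * then the removal is broadcast to the other clusters' cached copies. */
static void gpt_global_unmap(unsigned vpn)
{
    gpt[REF_CLUSTER][vpn] = 0;              /* 1) reference GPT first   */
    for (unsigned k = 0; k < NB_CLUSTERS; k++)
        if (k != REF_CLUSTER)
            gpt[k][vpn] = 0;                /* 2) broadcast the removal */
}
```

Because local copies are only ever filled from the reference GPT and never modified independently, removing the reference entry first guarantees no cluster can re-cache a stale mapping during the broadcast.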
When a given vseg or a given entry in the page table must be removed by the kernel, this modification must be done first in the reference cluster, and broadcast to all other clusters for update.

== 2) User process virtual space organisation ==

The virtual space of a user process P in a given cluster K is split into 5 fixed-size zones, called ''vzones'', defined by configuration parameters. Each vzone contains one or several vsegs.
1. The '''utils''' vzone is located in the lower part of the virtual space. It contains the three vsegs ''kentry'', ''args'', and ''envs'', whose sizes are defined by configuration parameters. The ''kentry'' vseg has CODE type and contains the code that must be executed to enter the kernel from user space. The ''args'' vseg has DATA type and contains the arguments of the process main() thread. The ''envs'' vseg has DATA type and contains the process environment variables.

1. The '''elf''' vzone is located on top of the '''utils''' vzone. It is defined by the CONFIG_USER_ELF_BASE and CONFIG_USER_ELF_SIZE parameters. It contains the ''text'' vseg (CODE type) and the ''data'' vseg (DATA type), defining the process binary code and global data. The actual vseg sizes are defined in the .elf file and reported in the boot_info structure by the boot loader.

1. The '''heap''' vzone is located on top of the '''elf''' vzone. It is defined by the CONFIG_USER_HEAP_BASE and CONFIG_USER_HEAP_SIZE parameters. It contains one single ''heap'' vseg, used by the malloc() library.
1. The '''mmap''' vzone is located on top of the '''heap''' vzone. It is defined by the CONFIG_USER_MMAP_BASE and CONFIG_USER_MMAP_SIZE parameters. It contains all vsegs of type ANON, FILE, or REMOTE that are dynamically allocated / released by the user application. The VMM implements a specific MMAP allocator for this zone.

1. The '''stack''' vzone has a fixed size, defined by the configuration parameters as CONFIG_USER_STACK_SIZE * CONFIG_PTHREAD_MAX_NR. It is located in the upper part of the virtual space. It contains an array of fixed-size slots, and each slot contains one ''stack'' vseg. In each slot, the first page is not mapped, in order to detect stack overflow. As threads can be dynamically created and destroyed, the VMM implements a specific STACK allocator for this zone.
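The fixed, configuration-defined layout of these vzones can be sketched as follows. All CONFIG_* values below are invented placeholders (only the parameter names CONFIG_USER_ELF_*, CONFIG_USER_HEAP_*, CONFIG_USER_MMAP_* and CONFIG_USER_STACK_SIZE appear in this page; the utils and stack bases are assumptions added for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Invented placeholder values for the vzone configuration parameters. */
#define CONFIG_PAGE_SIZE        0x1000u
#define CONFIG_USER_UTILS_BASE  0x00000000u
#define CONFIG_USER_UTILS_SIZE  0x00100000u
#define CONFIG_USER_ELF_BASE    0x00100000u
#define CONFIG_USER_ELF_SIZE    0x10000000u
#define CONFIG_USER_HEAP_BASE   0x10100000u
#define CONFIG_USER_HEAP_SIZE   0x40000000u
#define CONFIG_USER_MMAP_BASE   0x50100000u
#define CONFIG_USER_MMAP_SIZE   0x40000000u
#define CONFIG_USER_STACK_SIZE  0x00010000u   /* one stack slot = 16 pages */
#define CONFIG_PTHREAD_MAX_NR   8             /* stack slots per cluster   */
#define USER_STACK_VZONE_BASE   0xF0000000u   /* upper part of the space   */

/* The utils / elf / heap / mmap vzones stack up back to back:
 * each one starts exactly where the previous one ends. */
static int vzones_are_contiguous(void)
{
    return CONFIG_USER_ELF_BASE  == CONFIG_USER_UTILS_BASE + CONFIG_USER_UTILS_SIZE
        && CONFIG_USER_HEAP_BASE == CONFIG_USER_ELF_BASE   + CONFIG_USER_ELF_SIZE
        && CONFIG_USER_MMAP_BASE == CONFIG_USER_HEAP_BASE  + CONFIG_USER_HEAP_SIZE;
}

/* Base of the fixed-size stack slot for local thread index ltid. */
static uint32_t stack_slot_base(unsigned ltid)
{
    return USER_STACK_VZONE_BASE + ltid * CONFIG_USER_STACK_SIZE;
}

/* The first page of each slot is left unmapped to catch stack
 * overflow, so the usable stack starts one page higher. */
static uint32_t stack_usable_base(unsigned ltid)
{
    return stack_slot_base(ltid) + CONFIG_PAGE_SIZE;
}
```

Because every slot has the same fixed size, the STACK allocator only has to hand out free slot indices; the slot arithmetic above maps an index to its virtual base address.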