Changes between Version 47 and Version 48 of replication_distribution


Timestamp:
Dec 8, 2019, 9:47:01 PM (5 years ago)
Author:
alain
Comment:

--

  • replication_distribution

    v47 v48  
    1212 * A vseg is '''public''' when it can be accessed by any thread T of the involved process, whatever the cluster running the thread T.  It is '''private''' when it can only be accessed by the threads running in the cluster containing the physical memory bank where this vseg is defined and mapped.
    1313 * For a '''public''' vseg, ALMOS-MKH implements a global mapping : In all clusters, a given virtual address is mapped to the same physical address. For a '''private''' vseg, ALMOS-MKH implements a local mapping : the same virtual address can be mapped to different physical addresses, in different clusters.
    14  * A '''public''' vseg can be '''localized''' (all vseg pages are mapped in the same cluster), or '''distributed''' (different pages are mapped on different clusters, using the virtual page number (VPN) least significant bits as distribution key). A '''private''' vseg is always '''localized'''.
     14 * A '''public''' vseg can be '''localized''' (all vseg pages are mapped in the same cluster), or '''distributed''' (different pages are mapped on different clusters). A '''private''' vseg is always '''localized'''.
    1515
    1616To avoid contention in parallel applications that define a large number of threads in a single process P, almos-mkh replicates the process descriptor in all clusters containing at least one thread of P; these clusters are called active clusters.
     
    1919 * The [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/mm/vmm.h GPT(P,K)] is the generic page table, defining the actual physical mapping of these vsegs.
    2020
    21 This replication is clearly a non-standard feature that creates several difficulties:
    22 
    2321For a given process P, all VMM(P,K) descriptors in different clusters can have different contents for several reasons :
    2422 1. A '''private''' vseg can be registered in only one VSL(P,K) in cluster K, and be totally undefined in the other VSL(P,K').
     
    2624 1. Similarly, the mapping of a given virtual page VPN of a given vseg (i.e. the allocation of a physical page PPN to a virtual page VPN, and the registration of this PPN in the GPT(P,K)) is ''on demand'': the page table entry is updated in the GPT(P,K) only when a thread of process P in cluster K tries to access this VPN.
    2725 
    28 Finally, we have the following rules:
    29  * The '''private''' vsegs, and  the corresponding entries in the page table are purely ''local'' : the VSL(P,K) and the GPT(P,K) are only shared by the threads of P running in cluster K, and these structures can be privately handled by the local kernel instance in cluster K.
     26The replication of the VSL(P,K) and GPT(P,K) kernel structures creates a coherence problem for the public vsegs:
     27 * A VSL(P,K) contains all private vsegs in cluster K, but contains only the public vsegs that have been actually accessed by a thread of P running in cluster K. Only the '''reference''' process descriptor stored in the reference cluster KREF contains the complete list VSL(P,KREF) of all public vsegs for the P process.
     28 * A GPT(P,K) contains all mapped entries corresponding to private vsegs but  for public vsegs, it contains only the entries corresponding to pages that have been accessed by a thread running in cluster K. Only the reference cluster KREF contains the complete GPT(P,KREF) of all mapped entries of public vsegs for process P.
    3029
    31 * For the '''public''' vsegs that are shared by all the threads of the process P, almos-mkh define a '''reference cluster KREF''', that is the cluster on which the main thread of the process has been created. Only the VSL(P,KREF) and the GPT(P,KREF) are complete: the VSL(P,KREF) contains all publics vsegs defined for the process P, and the GPT(P,KREF) contains all page table entries mapped for public vsegs in process P. Other VSL and GPT copies in other clusters are
    32 only ''caches'' where an existing vseg, or a mapped page table entry can be missing, bur available in the reference cluster.
     30Therefore, almos-mkh defines the following rules :
    3331
    34 For more details:
     32For the '''public''' vsegs, the VMM(P,K) structures - other than the reference one - can be considered as read-only caches.
     33When a given vseg or a given entry in the page table must be removed by the kernel, this modification must be done first in the reference cluster, and broadcast to all other clusters for update.
     34When a miss is detected in a non-reference cluster, the reference VMM(P,KREF) must be accessed first to check for a possible ''false segmentation fault'' or ''false page fault''.
     35
     36For the '''private''' vsegs, and  the corresponding entries in the page table,  the VSL(P,K) and the GPT(P,K) are only shared by the threads of P running in cluster K, and these structures can be privately handled by the local kernel instance in cluster K.
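
The miss-handling rule for public vsegs can be illustrated with a minimal sketch. It is not the ALMOS-MKH implementation: the `toy_gpt_t` type, the `toy_gpt_lookup()` function, the result codes, and the fixed table size are all hypothetical names chosen for this example, and the two tables are plain local arrays rather than structures living in different clusters. The sketch only shows the order of checks: local GPT(P,K) first, then the reference GPT(P,KREF), with the local copy filled in when the miss turns out to be a false page fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_GPT_SIZE 16u  /* hypothetical: tiny table, for illustration only */

/* Hypothetical stand-in for a GPT(P,K): one slot per virtual page number. */
typedef struct {
    bool     mapped[TOY_GPT_SIZE];
    uint32_t ppn[TOY_GPT_SIZE];
} toy_gpt_t;

typedef enum {
    TOY_HIT,               /* entry already present in the local copy           */
    TOY_FALSE_PAGE_FAULT,  /* missing locally, but mapped in the reference copy */
    TOY_TRUE_PAGE_FAULT    /* not mapped in the reference copy either           */
} toy_result_t;

/* Look up vpn in the local copy, then fall back to the reference copy.
 * On a false page fault, the local copy is updated from the reference. */
static toy_result_t toy_gpt_lookup(toy_gpt_t *local, const toy_gpt_t *ref,
                                   uint32_t vpn, uint32_t *ppn)
{
    assert(local != NULL && ref != NULL && ppn != NULL);

    if (vpn >= TOY_GPT_SIZE)
        return TOY_TRUE_PAGE_FAULT;

    if (local->mapped[vpn]) {
        *ppn = local->ppn[vpn];
        return TOY_HIT;
    }

    if (ref->mapped[vpn]) {
        /* The local copy behaves as a cache: fill it from the reference. */
        local->mapped[vpn] = true;
        local->ppn[vpn]    = ref->ppn[vpn];
        *ppn = ref->ppn[vpn];
        return TOY_FALSE_PAGE_FAULT;
    }

    return TOY_TRUE_PAGE_FAULT;
}
```

In this sketch, a true page fault is where a real kernel would allocate and map a page in the reference table before updating the local copy; that step is left out here.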
     37
     38For more details on implementation:
    3539
    3640The '''vseg''' API is defined in the [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/mm/vseg.h almos-mkh/kernel/mm/vseg.h] and [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/mm/vseg.c almos-mkh/kernel/mm/vseg.c] files.
     
    5054|| REMOTE ||  public   || localized   || Read Write || same mapping for all threads              || cluster defined by user             || dynamic (one heap allocator per process)  ||
    5155
    52  1. '''CODE''' : This private vseg contains the application code. ALMOS-MK creates one CODE vseg per active cluster. For a process P, the CODE vseg is registered in the VSL(P,Z) when the process is created in reference cluster Z. In the other clusters X, the CODE vseg is registered in VSL(P,X) when a page fault is signaled by a thread of P running in cluster X. In each active cluster X, the CODE vseg is localized, and physically mapped in cluster X.
    53  1. '''DATA''' : This vseg contains the user application global data. ALMOS-MK creates one single DATA vseg per process, that is registered in the reference VSL(P,Z) when the process P is created in reference cluster Z.  In the other clusters X, the DATA vseg is registered  in VSL(P,X) when a page fault is signaled by a thread of P running in cluster X. To avoid contention, this vseg is physically distributed on all clusters, with a page granularity. For each page, the physical mapping is defined by the LSB bits of the page VPN.
    54  1. '''STACK''' : This private vseg contains the execution stack of a thread. For each thread T of process P running in cluster X, ALMOS_MK creates one STACK vseg. This vseg is registered in the VSL(P,X) when the thread descriptor is created in cluster X. To enforce locality, this vseg is physically mapped in cluster X.
    55  1. '''ANON''' : This type of vseg is dynamically created by ALMOS-MK to serve an anonymous [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/syscalls/sys_mmap.c#L38 mmap()] system call executed by a client thread running in a cluster X. The first vseg registration and the physical mapping are done by the reference cluster Z, but the vseg is mapped in the client cluster X.
    56  1. '''FILE''' : This type of vseg is dynamically created by ALMOS-MK to serve a file based mmap() system call executed by a client thread running in a cluster X. The first vseg registration and the physical mapping are done by the reference cluster Z, but the vseg is mapped in cluster Y containing the file cache.
    57  1. '''REMOTE''' : This type of vseg is dynamically created by ALMOS-MK to serve a remote mmap() system call where a client thread running in a cluster X requests to create a new vseg mapped in another cluster Y. The first vseg registration and the physical mapping are done by the reference cluster Z, but the vseg is mapped in cluster Y specified by the user.
     56 1. '''CODE''' : This '''private''' vseg contains the application code. It is replicated in all clusters. ALMOS-MK creates one CODE vseg per active cluster. For a process P, the CODE vseg is registered in the VSL(P,KREF) when the process is created in reference cluster KREF. In the other clusters K, the CODE vseg is registered in VSL(P,K) when a page fault is signaled by a thread of P running in cluster K. In each active cluster K, the CODE vseg is localized, and physically mapped in cluster K.
     57 1. '''DATA''' : This '''public''' vseg contains the user application global data. ALMOS-MK creates one DATA vseg, that is registered in the reference VSL(P,KREF) when the process P is created in reference cluster KREF.  In the other clusters K, the DATA vseg is registered  in VSL(P,K) when a page fault is signaled by a thread of P running in cluster K. To avoid contention, this vseg is physically distributed on all clusters, with a page granularity. For each page, the physical mapping is defined by the LSB bits of the page VPN.
     58 1. '''STACK''' : This '''private''' vseg contains the execution stack of a thread. Almos-mkh creates one STACK vseg for each thread of P running in cluster K. This vseg is registered in the VSL(P,K) when the thread descriptor is created in cluster K. To enforce locality, this vseg is of course mapped in cluster K.
     59 1. '''ANON''' : This '''public''' vseg is dynamically created by ALMOS-MK to serve an anonymous [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/syscalls/sys_mmap.c mmap] system call executed by a client thread running in a cluster K. The first vseg registration and the physical mapping are done in the reference cluster KREF, but the vseg is mapped in the client cluster K.
     60 1. '''FILE''' : This '''public''' vseg is dynamically created by ALMOS-MK to serve a file based [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/syscalls/sys_mmap.c mmap] system call executed by a client thread running in a cluster K. The first vseg registration and the physical mapping are done in the reference cluster KREF, but the vseg is mapped in cluster Y containing the file cache.
     61 1. '''REMOTE''' : This '''public''' vseg is dynamically created by ALMOS-MK to serve a remote [https://www-soc.lip6.fr/trac/almos-mkh/browser/trunk/kernel/syscalls/sys_mmap.c mmap] system call, where a client thread running in a cluster X requests to create a new vseg mapped in another cluster Y. The first vseg registration and the physical mapping are done by the reference cluster KREF, but the vseg is mapped in cluster Y specified by the user.
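
The DATA entry states that, for a distributed vseg, the physical mapping of each page is defined by the LSB bits of the page VPN. A minimal sketch of such a distribution key follows. The function name `distributed_page_cluster()` and its `nclusters_log2` parameter are hypothetical, and the sketch assumes the number of clusters is a power of two and that the selected bits are used directly as a cluster index; the actual ALMOS-MKH encoding of cluster identifiers may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: select the cluster holding a given page of a
 * distributed vseg, using the low-order bits of the VPN as the key.
 * Assumes 2^nclusters_log2 clusters, indexed from 0. */
static inline uint32_t distributed_page_cluster(uint32_t vpn,
                                                uint32_t nclusters_log2)
{
    assert(nclusters_log2 < 32);
    uint32_t mask = (1u << nclusters_log2) - 1u;  /* keep the LSB bits only */
    return vpn & mask;
}
```

With four clusters (`nclusters_log2 = 2`), consecutive virtual pages are spread round-robin over clusters 0 to 3, which is what gives the page-granularity distribution described for the DATA vseg.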
    5862
    5963
    60 The replication of the VSL(P,K) and GPT(P,K) kernel structures creates a coherence problem for the non private vsegs.
    61 
    62  * A VSL(P,K) contains all private vsegs in cluster K, but contains only the public vsegs that have been actually accessed by a thread of P running in cluster K. Only the '''reference''' process descriptor stored in the reference cluster Z contains the complete list VSL(P,Z) of all public vsegs for the P process.
    63  * A GPT(P,K) contains all mapped entries corresponding to private vsegs. For public vsegs, it contains only the entries corresponding to pages that have been accessed by a thread running in cluster K. Only the reference cluster Z contains the complete  GPT(P,Z) page table of all mapped entries for process P.
    64 
    65 Therefore, the process descriptors - other than the reference one - can be considered as read-only caches.
    66 When a given vseg or a given entry in the page table must be removed by the kernel, this modification must be done first in the reference cluster, and broadcast to all other clusters for update.
    6764
    6865== __ 2. kernel segments types__==
    6966
    70 * The read-only segment containing the user code is replicated in all clusters where there is at least one thread using it.
     67For any process descriptor P in a cluster K, the VSL(P,K) and the GPT(P,K) contain not only the user vsegs, but also the kernel vsegs, because all user threads can make system calls that must access these kernel vsegs,
     68which requires address translation. This section describes the three types of kernel virtual segments defined by almos-mkh.
     69
     70 1. '''KCODE''' : This '''private''' vseg contains the kernel code. It is replicated in all clusters. ALMOS-MK creates one KCODE vseg per cluster. For a process P, the KCODE vseg is registered in the VSL(P,KREF) when the process is created in reference cluster KREF. In the other clusters K, the KCODE vseg is registered in VSL(P,K) when a page fault is signaled by a thread of P running in cluster K. In each active cluster K, the KCODE vseg is localized, and physically mapped in cluster K.
     71 1. '''KDATA''' : This '''public''' vseg contains the kernel global data. ALMOS-MK creates one KDATA vseg, that is registered in the reference VSL(P,KREF) when the process P is created in reference cluster KREF.  In the other clusters K, the KDATA vseg is registered  in VSL(P,K) when a page fault is signaled by a thread of P running in cluster K. To avoid contention, this vseg is physically distributed on all clusters, with a page granularity. For each page, the physical mapping is defined by the LSB bits of the page VPN.
        * The read-only segment containing the user code is replicated in all clusters where there is at least one thread using it.
    7172 * The private segment containing the stack for a given thread is placed in the same cluster as the thread using it.
    7273 * The shared segment containing the global data is distributed on all clusters as regularly as possible to avoid contention.