Changes between Version 68 and Version 69 of WikiStart

Timestamp: Nov 15, 2013, 11:07:16 AM
Author: almaless
Comment: --

WikiStart

v68	v69
14	14	In a cc-NUMA architecture, the locality of memory accesses directly affects both scalability and power consumption (energy per bit moved). The main challenge is to enforce the locality of memory accesses made by the threads of parallel applications as well as by kernel-level threads. Although enforcing locality requires fine-grained management of hardware resources (mainly cores and physical memory), ALMOS aims to hide the hardware topology and the management of its resources from user applications. This allows POSIX shared-memory parallel applications, as well as legacy applications, to benefit from the performance offered by a many-core architecture.
15	15
16		The ALMOS kernel follows a distributed design articulated around objects named cluster-managers. The goal is to reduce contention on resources and to increase the locality of the kernel's memory accesses to its own data structures. A cluster-manager manages the hardware resources of a cc-NUMA node (mainly cores and physical memory). It can locally supply its homed threads with all types of memory~~-objects, including physical pages and kernel-level data-~~objects. A cluster-manager contains one core-manager per core. Each core-manager has its own events-manager and a multi-policy scheduler server. In this distributed scheme, the kernel has no notion of a global resource state. Instead, it has a decentralized mechanism for discovering where to allocate the requested resources from, taking into account the locality of the requesting thread's memory accesses. This decentralized mechanism and its related policies are built on the DQDT (Distributed Quaternary Decision Tree). The DQDT is a distributed, wait-free infrastructure based on a set of resource-usage indicators; it integrates the locality awareness needed by all kernel decisions regarding task/thread placement, memory allocation, and core load balancing. It also constitutes a common framework for building strategies to manage the resources of other kernel subsystems, such as files and I/O.
	16	The ALMOS kernel follows a distributed design articulated around objects named cluster-managers. The goal is to reduce contention on resources and to increase the locality of the kernel's memory accesses to its own data structures. A cluster-manager manages the hardware resources of a cc-NUMA node (mainly cores and physical memory). It can locally supply its homed threads with all types of memory objects, including physical pages and kernel-level data objects. A cluster-manager contains one core-manager per core. Each core-manager has its own events-manager and a multi-policy scheduler server. In this distributed scheme, the kernel has no notion of a global resource state. Instead, it has a decentralized mechanism for discovering where to allocate the requested resources from, taking into account the locality of the requesting thread's memory accesses. This decentralized mechanism and its related policies are built on the DQDT (Distributed Quaternary Decision Tree). The DQDT is a distributed, wait-free infrastructure based on a set of resource-usage indicators; it integrates the locality awareness needed by all kernel decisions regarding task/thread placement, memory allocation, and core load balancing. It also constitutes a common framework for building strategies to manage the resources of other kernel subsystems, such as files and I/O.
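The locality-aware lookup described above can be pictured with a small model. The sketch below is only an illustration under stated assumptions: the `Node` class, the `free_pages` indicator (the largest free-page count of any cluster in a subtree), and the climb-then-descend search are hypothetical choices made for this example, not the ALMOS implementation. It is also single-threaded, so it does not model the wait-free property of the real DQDT.

```python
# Illustrative model only. The node layout, the "free_pages" indicator, and the
# climb-then-descend policy are assumptions, not the ALMOS implementation.
# It is single-threaded and does not model the wait-free property.
from typing import List, Optional


class Node:
    """One tree node: a leaf stands for a cluster, an inner node groups up to four children."""

    def __init__(self, free_pages: int, cluster_id: Optional[int] = None,
                 children: Optional[List["Node"]] = None) -> None:
        self.free_pages = free_pages      # leaf: free pages; inner: max over children
        self.cluster_id = cluster_id      # set on leaves only
        self.children = children or []
        self.parent: Optional["Node"] = None
        for child in self.children:
            child.parent = self


def build_tree(free_pages_per_cluster: List[int]) -> List[Node]:
    """Build the quaternary tree bottom-up and return the leaves, indexed by cluster id."""
    leaves = [Node(n, cluster_id=i) for i, n in enumerate(free_pages_per_cluster)]
    level = leaves
    while len(level) > 1:
        groups = [level[i:i + 4] for i in range(0, len(level), 4)]
        level = [Node(max(c.free_pages for c in g), children=g) for g in groups]
    return leaves


def find_cluster(leaf: Node, pages: int) -> Optional[int]:
    """Find a cluster that can supply `pages`, preferring the smallest enclosing subtree."""
    node = leaf
    # Climb until the indicator says some cluster in this subtree can serve the request.
    while node.free_pages < pages:
        if node.parent is None:
            return None                   # no cluster in the whole machine has enough
        node = node.parent
    # Descend toward the least-loaded child; the max-based indicator guarantees success.
    while node.children:
        node = max(node.children, key=lambda c: c.free_pages)
    return node.cluster_id


def allocate(leaves: List[Node], requester: int, pages: int) -> Optional[int]:
    """Reserve `pages` for a thread homed in cluster `requester`; return the serving cluster."""
    target = find_cluster(leaves[requester], pages)
    if target is None:
        return None
    node = leaves[target]
    node.free_pages -= pages
    # Refresh the indicators on the path from the serving leaf up to the root.
    node = node.parent
    while node is not None:
        node.free_pages = max(c.free_pages for c in node.children)
        node = node.parent
    return target
```

With 16 clusters built from `build_tree([2, 10, 5, 0] + [100] * 12)`, a one-page request from cluster 0 is served locally, an eight-page request is served by neighboring cluster 1 in the same group of four, and only a request that no nearby cluster can satisfy reaches the root and lands in a remote group.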
17	17
18	18	The ALMOS kernel replicates its code in each cluster by creating a per-cluster kernel process. The threads of a cluster-manager are attached to the kernel process of their own cluster. The ALMOS kernel has a hybrid notion of threads, based on a new organization of the process virtual address space. This organization enables the kernel to replicate both the page tables and the application's code, thereby keeping the memory accesses caused by per-core TLB and I-cache misses local. The ALMOS threading model is compatible with the PThreads standard, and it allows the kernel to provide native support for the PGAS (Partitioned Global Address Space) programming model found in HPC-oriented languages such as UPC (Unified Parallel C).
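To make the PGAS idea concrete, the sketch below computes which thread a shared array element has affinity to under UPC's block-cyclic layout, where a declaration such as `shared [B] T a[N]` deals blocks of `B` elements round-robin across threads. It illustrates the programming model only and says nothing about how ALMOS maps partitions to clusters; the function name `upc_affinity` and its return value are hypothetical.

```python
# Illustrative only: models UPC's block-cyclic layout for `shared [B] T a[N]`.
# The function name and the returned tuple are hypothetical, not an ALMOS or UPC API.
from typing import Tuple


def upc_affinity(index: int, block_size: int, threads: int) -> Tuple[int, int]:
    """Return (owning thread, index within that thread's local portion)."""
    if index < 0 or block_size <= 0 or threads <= 0:
        raise ValueError("index must be >= 0; block_size and threads must be > 0")
    block = index // block_size            # which block the element falls in
    owner = block % threads                # blocks are dealt round-robin to threads
    local_block = block // threads         # earlier blocks already held by this owner
    phase = index % block_size             # offset inside the block
    return owner, local_block * block_size + phase
```

For example, with a block size of 2 and 4 threads, elements 0 and 1 belong to thread 0, elements 2 and 3 to thread 1, and element 8 wraps around to thread 0 as its third local element.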