Version 55 (modified by alain, 5 years ago)

ALMOS-MKH Specification

This document describes the general principles of ALMOS-MKH, an operating system targeting manycore architectures with a CC-NUMA (Cache-Coherent, Non-Uniform Memory Access) shared address space, such as the TSAR architecture, which can support up to 1024 32-bit MIPS cores. ALMOS-MKH also targets Intel / AMD multi-core architectures using 64-bit x86 cores.

Targeted architectures are assumed to be clustered, with one or more cores and a physical memory bank per cluster. These architectures must support POSIX-compliant multi-threaded parallel applications.

ALMOS-MKH is derived from the ALMOS system, developed by Ghassan Almaless. The general principles of the ALMOS system are described in his thesis.

A first version of ALMOS-MKH, and in particular the distributed file system and the RPC-based communication mechanism, was developed by Mohamed Karaoui. The general principles of the proposed Multi-Kernel approach are described in his thesis. This first version was called ALMOS-MK, without the H.

ALMOS-MKH is based on the "Multi-Kernel" approach to ensure scalability and to support the distribution of system services. In this approach, each cluster of the architecture contains an instance of the kernel. Each instance controls the local resources (memory and computing cores). These multiple instances cooperate to give applications the image of a single system controlling all resources. They communicate with each other using both (i) the client / server model, sending a Remote Procedure Call to a remote cluster for a complex service, and (ii) the shared memory paradigm, directly accessing remote memory when required.

To reduce energy consumption, ALMOS-MKH supports architectures using 32-bit cores. In this case, each cluster has a 32-bit physical address space, addressed by the LPA (Local Physical Address). To access the physical address space of other clusters, ALMOS-MKH defines 64-bit global physical addresses (CXY,LPA). For example, the physical address space of the TSAR architecture has 40 bits: the 8 most significant bits define the target cluster identifier CXY, and the 32 least significant bits define the local physical address LPA. The ALMOS-MKH kernel thus explicitly distinguishes two types of access:

  • Local accesses (internal to a cluster) use standard pointers.
  • Remote accesses (to another cluster) use extended pointers.
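
The (CXY,LPA) layout given for TSAR can be sketched in C as follows. This is only an illustration of the bit arithmetic: the type name gpa_t and the helper functions are hypothetical and are not taken from the ALMOS-MKH sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TSAR-like layout: an 8-bit cluster identifier (CXY) placed
 * above a 32-bit local physical address (LPA), i.e. a 40-bit physical
 * address stored in a 64-bit integer. All names are hypothetical. */
#define LPA_BITS 32u
#define CXY_BITS 8u

typedef uint64_t gpa_t;   /* global physical address (CXY,LPA) */

/* Build a global physical address from its two fields. */
static inline gpa_t gpa_make(uint32_t cxy, uint32_t lpa)
{
    assert(cxy < (1u << CXY_BITS));
    return ((gpa_t)cxy << LPA_BITS) | (gpa_t)lpa;
}

/* Extract the target cluster identifier. */
static inline uint32_t gpa_cxy(gpa_t gpa)
{
    return (uint32_t)(gpa >> LPA_BITS);
}

/* Extract the local physical address. */
static inline uint32_t gpa_lpa(gpa_t gpa)
{
    return (uint32_t)(gpa & 0xFFFFFFFFu);
}

/* An access is local when the target cluster is the current one. */
static inline int gpa_is_local(gpa_t gpa, uint32_t local_cxy)
{
    return gpa_cxy(gpa) == local_cxy;
}
```

For instance, under these assumptions, cluster 0x12 and local address 0x80001000 combine into the 40-bit address 0x1280001000.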

On hardware platforms containing 32-bit cores, such as TSAR, the ALMOS-MKH kernel runs partially in physical addressing. To reduce contention, ALMOS-MKH replicates the kernel code in all clusters and uses, in each cluster, the Instruction MMU to map the local copy of the kernel code into the kernel virtual space. The Data MMU, however, is deactivated as soon as a core enters the kernel, and is reactivated when the core returns to user mode. To access local addressable resources (memory or devices), ALMOS-MKH uses standard 32-bit pointers in identity mapping. To access resources in another cluster, ALMOS-MKH uses the remote_read and remote_write primitives, which implement the 64-bit extended pointers (CXY / PTR): CXY is the target cluster identifier, and PTR is the local pointer in the target cluster. These remote access primitives are used to implement the RPC mechanism, and also provide fast access to the performance-critical distributed kernel data structures. For user applications, ALMOS-MKH, like any POSIX-compliant OS, uses the MMU for both instructions and data.
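
The behaviour of the remote access primitives can be modelled on an ordinary host, as in the sketch below. The physical memory banks are simulated by plain arrays, and the names sim_mem, xptr_make, sim_remote_read32 and sim_remote_write32 are hypothetical: this page does not give the actual signatures of remote_read and remote_write, and on real hardware the access would go through physical addressing rather than array indexing.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_CLUSTERS 4u     /* hypothetical number of clusters        */
#define SIM_WORDS    256u   /* hypothetical 32-bit words per cluster  */

/* One simulated physical memory bank per cluster. */
static uint32_t sim_mem[SIM_CLUSTERS][SIM_WORDS];

/* Extended pointer (CXY / PTR): cluster identifier in the upper
 * 32 bits, local pointer (here a byte offset) in the lower 32 bits. */
typedef uint64_t xptr_t;

static xptr_t xptr_make(uint32_t cxy, uint32_t ptr)
{
    return ((xptr_t)cxy << 32) | (xptr_t)ptr;
}

/* Read one 32-bit word from the cluster designated by xp. */
static uint32_t sim_remote_read32(xptr_t xp)
{
    uint32_t cxy = (uint32_t)(xp >> 32);
    uint32_t ptr = (uint32_t)xp;
    assert(cxy < SIM_CLUSTERS && ptr % 4 == 0 && ptr / 4 < SIM_WORDS);
    return sim_mem[cxy][ptr / 4];
}

/* Write one 32-bit word into the cluster designated by xp. */
static void sim_remote_write32(xptr_t xp, uint32_t value)
{
    uint32_t cxy = (uint32_t)(xp >> 32);
    uint32_t ptr = (uint32_t)xp;
    assert(cxy < SIM_CLUSTERS && ptr % 4 == 0 && ptr / 4 < SIM_WORDS);
    sim_mem[cxy][ptr / 4] = value;
}
```

The same local pointer value (here a byte offset) designates a different memory word in each cluster, which is why the cluster identifier must travel with the pointer.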

On a hardware platform containing 64-bit cores, such as Intel servers, the kernel no longer needs physical addressing to access data in remote clusters, since the whole physical space can be mapped into the 64-bit virtual space. Therefore, both the local accesses and the remote access primitives use the MMU to translate a 64-bit virtual address into a 64-bit physical address. However, to improve access locality while minimizing contention, the ALMOS-MKH communication model continues to explicitly distinguish local and remote accesses.

In both cases, communications between kernel instances are therefore implemented by a mix of RPCs (on the client / server model), and direct access to remote memory (when this is useful for performance). This hybrid approach is the main originality of ALMOS-MKH.

A) Hardware Platform Definition

This section describes the general assumptions made by ALMOS-MKH regarding the hardware architecture, and the mechanism to configure ALMOS-MKH for a given target architecture.

B) Process & threads creation/destruction

ALMOS-MKH supports the POSIX threads API. In order to avoid contention in massively multi-threaded applications, ALMOS-MKH replicates the user process descriptors in all clusters containing threads of this process. This section describes the mechanisms for process and thread creation / destruction.

C) Data replication & distribution policy

This section describes the general ALMOS-MKH policy for replicating and distributing information across the various physical memory banks. There are two main goals: to enforce memory access locality, and to avoid contention when several threads simultaneously access the same information. To control the placement and the replication of information in the physical memory banks, the kernel uses paged virtual memory.

D) GPT & VSL implementation

To avoid contention when several threads access the same page table to handle TLB misses, ALMOS-MKH replicates the process descriptors: for each multi-threaded process P, the Generic Page Table (GPT) and the Virtual Segments List (VSL) are replicated in each cluster K containing at least one thread of the application. According to the "on-demand paging" principle, these replicated structures GPT(K,P) and VSL(K,P) are dynamically updated when page faults are detected. This section describes this building mechanism and the coherence protocol required by these multiple copies.
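
The on-demand filling of a replicated page table can be illustrated by the toy model below. It assumes that one copy of the GPT (arbitrarily held by cluster 0) acts as a reference from which the other copies are filled; this page does not specify the actual building mechanism or coherence protocol, so the reference copy, the toy frame allocator and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_CLUSTERS 4u   /* hypothetical number of clusters               */
#define N_VPAGES   8u   /* hypothetical number of virtual pages          */
#define REF_CXY    0u   /* assumption: cluster 0 holds a reference copy  */

typedef struct {
    bool     mapped;    /* true once the entry has been filled */
    uint32_t ppn;       /* physical page number                */
} pte_t;

/* gpt[K] models GPT(K,P) for a single process P. */
static pte_t gpt[N_CLUSTERS][N_VPAGES];

/* Toy physical page allocator: hands out increasing page numbers. */
static uint32_t next_free_ppn = 100;

/* Handle a page fault on virtual page vpn, detected in cluster cxy.
 * Returns the physical page number now present in the local copy. */
static uint32_t gpt_handle_fault(uint32_t cxy, uint32_t vpn)
{
    assert(cxy < N_CLUSTERS && vpn < N_VPAGES);

    pte_t *local = &gpt[cxy][vpn];
    if (local->mapped)
        return local->ppn;             /* local copy already filled */

    pte_t *ref = &gpt[REF_CXY][vpn];
    if (!ref->mapped) {                /* first touch in the whole system */
        ref->ppn    = next_free_ppn++;
        ref->mapped = true;
    }

    *local = *ref;                     /* fill the local copy on demand */
    return local->ppn;
}
```

In this model a cluster only pays for the entries its own threads actually touch, and later TLB misses on the same page are served from the local copy without contacting another cluster.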

E) Trans-cluster lists of threads

ALMOS-MKH must handle dynamic sets of threads, such as the set of all threads waiting to access a given peripheral device. These sets of threads are implemented as circular doubly linked lists. As these threads can be running in any cluster, these linked lists are trans-cluster, and require specific techniques in a multi-kernel OS, where each kernel instance handles only the resources located in a single cluster.
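
A circular doubly linked list whose links are extended pointers can be sketched as follows. Clusters are simulated by arrays of list entries, and xderef stands in for the remote access primitives; the names xlist_entry_t, xp, xderef and the xlist_* functions are hypothetical, and locking is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define N_CLUSTERS 4u   /* hypothetical number of clusters     */
#define N_SLOTS    8u   /* hypothetical entries per cluster    */

/* Extended pointer: (cluster id << 32) | slot index in that cluster. */
typedef uint64_t xptr_t;

typedef struct {
    xptr_t next;        /* extended pointer to the next entry     */
    xptr_t pred;        /* extended pointer to the previous entry */
} xlist_entry_t;

/* Simulated per-cluster storage for list entries. */
static xlist_entry_t sim_nodes[N_CLUSTERS][N_SLOTS];

static xptr_t xp(uint32_t cxy, uint32_t slot)
{
    assert(cxy < N_CLUSTERS && slot < N_SLOTS);
    return ((xptr_t)cxy << 32) | (xptr_t)slot;
}

/* Stand-in for a remote access: a kernel would use remote reads
 * and writes here instead of dereferencing a host pointer. */
static xlist_entry_t *xderef(xptr_t x)
{
    return &sim_nodes[x >> 32][(uint32_t)x];
}

/* An empty list is a root that points to itself in both directions. */
static void xlist_root_init(xptr_t root)
{
    xderef(root)->next = root;
    xderef(root)->pred = root;
}

/* Insert entry just before the root, i.e. at the end of the list. */
static void xlist_add_last(xptr_t root, xptr_t entry)
{
    xptr_t last = xderef(root)->pred;
    xderef(entry)->next = root;
    xderef(entry)->pred = last;
    xderef(last)->next  = entry;
    xderef(root)->pred  = entry;
}

/* Remove entry from whatever list it belongs to. */
static void xlist_unlink(xptr_t entry)
{
    xptr_t next = xderef(entry)->next;
    xptr_t pred = xderef(entry)->pred;
    xderef(pred)->next = next;
    xderef(next)->pred = pred;
}

/* Count the entries by walking the list until the root is reached. */
static unsigned xlist_count(xptr_t root)
{
    unsigned n = 0;
    for (xptr_t it = xderef(root)->next; it != root; it = xderef(it)->next)
        n++;
    return n;
}
```

A root located in cluster 0 can thus chain entries located in clusters 1 and 3; each insertion or removal touches entries in up to three different clusters, which is what makes these lists trans-cluster.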

F) Remote Procedure Calls

To enforce locality for complex operations requiring a large number of remote memory accesses, the various kernel instances can communicate using RPCs (Remote Procedure Calls), following the client/server model. This section describes the RPC mechanism implemented by ALMOS-MKH.
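
The client/server structure of an RPC can be illustrated by the single-threaded model below, in which the client fills a request descriptor and the server cluster dispatches it through a table of service functions. The descriptor layout, the service index RPC_ALLOC_PAGES and all function names are hypothetical; a real implementation would post the request to the server cluster and wait for completion, whereas here the server function is simply called in place.

```c
#include <assert.h>
#include <stdint.h>

#define N_CLUSTERS 4u   /* hypothetical number of clusters */

/* Hypothetical request descriptor shared by client and server. */
typedef struct {
    uint32_t index;     /* requested service               */
    uint32_t arg;       /* input argument                  */
    uint32_t result;    /* output value set by the server  */
    int      done;      /* set by the server on completion */
} rpc_desc_t;

typedef void (*rpc_server_fn)(uint32_t server_cxy, rpc_desc_t *desc);

/* Toy per-cluster resource managed only by the local kernel instance. */
static uint32_t free_pages[N_CLUSTERS] = { 10, 10, 10, 10 };

/* Service: allocate desc->arg pages in the server cluster.
 * The result is the number of pages granted (all or nothing). */
static void rpc_alloc_pages_server(uint32_t cxy, rpc_desc_t *desc)
{
    if (desc->arg <= free_pages[cxy]) {
        free_pages[cxy] -= desc->arg;
        desc->result = desc->arg;
    } else {
        desc->result = 0;
    }
}

enum { RPC_ALLOC_PAGES = 0, RPC_MAX };

static const rpc_server_fn rpc_table[RPC_MAX] = { rpc_alloc_pages_server };

/* Server side: dispatch one request on behalf of cluster server_cxy. */
static void rpc_serve(uint32_t server_cxy, rpc_desc_t *desc)
{
    assert(server_cxy < N_CLUSTERS && desc->index < RPC_MAX);
    rpc_table[desc->index](server_cxy, desc);
    desc->done = 1;
}

/* Client side: build the descriptor, hand it to the server, get the result. */
static uint32_t rpc_call(uint32_t server_cxy, uint32_t index, uint32_t arg)
{
    rpc_desc_t desc = { .index = index, .arg = arg, .result = 0, .done = 0 };
    rpc_serve(server_cxy, &desc);   /* stands in for "post and wait" */
    assert(desc.done);
    return desc.result;
}
```

The point of the model is that the whole allocation runs against data local to the server cluster, so the client pays for one request and one response instead of a series of individual remote accesses.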

G) Input/Output Operations

H) Boot procedure

This section describes the ALMOS-MKH boot procedure.

I) Threads Scheduling

This section describes the ALMOS-MKH policy for thread scheduling.

J) Kernel level synchronisations

This section describes the synchronisation primitives used by ALMOS-MKH, namely the barriers used during the parallel kernel initialization, and the locks used to protect concurrent accesses to the shared kernel data structures.

K) User level synchronisations

This section describes the ALMOS-MKH implementation of the POSIX compliant, user-level synchronisation services: mutex, condvar, barrier and semaphore.
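
From the application side, these services are reached through the standard POSIX threads API. The sketch below shows a typical use of a mutex and a condition variable; it uses only standard pthread calls and says nothing about how ALMOS-MKH implements them. The names worker, run_demo and MAX_THREADS are illustrative.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  all_done = PTHREAD_COND_INITIALIZER;
static long counter;    /* shared counter, protected by lock        */
static int  finished;   /* number of workers that have completed    */

typedef struct { int iterations; } work_t;

/* Each worker increments the shared counter under the mutex,
 * then reports its completion through the condition variable. */
static void *worker(void *arg)
{
    const work_t *w = arg;
    for (int i = 0; i < w->iterations; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_lock(&lock);
    finished++;
    pthread_cond_signal(&all_done);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start nthreads workers, wait until all have finished, and return
 * the final value of the shared counter. */
static long run_demo(int nthreads, int iterations)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_t tid[MAX_THREADS];
    work_t w = { iterations };
    counter  = 0;
    finished = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &w);
        assert(rc == 0);
        (void)rc;
    }

    /* The predicate is re-checked in a loop, as required for
     * condition variables (spurious wakeups are allowed). */
    pthread_mutex_lock(&lock);
    while (finished < nthreads)
        pthread_cond_wait(&all_done, &lock);
    long total = counter;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return total;
}
```

Calling run_demo(4, 1000) starts four threads that each perform 1000 protected increments, so the returned total is 4000 regardless of how the threads are interleaved.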