This section describes the ALMOS-MKH implementation of four POSIX-compliant, user-level synchronization services: mutex, condvar, barrier and semaphore.
A) General Principles
- A mutex is declared by a given user process as a pthread_mutex_t global variable.
- A condvar is declared by a given user process as a pthread_cond_t global variable.
- A barrier is declared by a given user process as a pthread_barrier_t global variable.
- A semaphore is declared by a given user process as a sem_t global variable.
All these user types are implemented by ALMOS-MKH as unsigned long. The value stored in user space for this variable is NOT used by the kernel. ALMOS-MKH uses only the virtual address of this variable as an identifier for the synchronization variable.
As these synchronization variables are used by threads running in different clusters, all access functions use remote_read() / remote_write() primitives.
For each type of variable, ALMOS-MKH defines a specific internal data structure as described below. All these structures are protected by a kernel remote_busylock_t, as described in section J.
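The identification scheme described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the ALMOS-MKH code: the `sync_entry_t` table and the `sync_register()` / `sync_lookup()` helpers are hypothetical names, and a fixed-size array stands in for whatever list the kernel actually keeps. The point it shows is that the lookup key is the address of the user variable, never its content.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYNC_MAX 16   /* arbitrary capacity chosen for this sketch */

/* Hypothetical registry entry: user virtual address -> kernel object. */
typedef struct {
    uintptr_t vaddr;  /* address of the user variable (the identifier) */
    void     *kobj;   /* stand-in for a pointer to a kernel structure  */
} sync_entry_t;

static sync_entry_t sync_table[SYNC_MAX];
static size_t       sync_count = 0;

/* Register a kernel object under the address of a user variable.
 * Returns 0 on success, -1 if the address is already registered
 * or the table is full. */
int sync_register(const void *user_var, void *kobj)
{
    assert(user_var != NULL);
    uintptr_t key = (uintptr_t)user_var;
    for (size_t i = 0; i < sync_count; i++)
        if (sync_table[i].vaddr == key) return -1;
    if (sync_count == SYNC_MAX) return -1;
    sync_table[sync_count].vaddr = key;
    sync_table[sync_count].kobj  = kobj;
    sync_count++;
    return 0;
}

/* Find the kernel object from the address of the user variable.
 * The value stored in the user variable is never read. */
void *sync_lookup(const void *user_var)
{
    assert(user_var != NULL);
    uintptr_t key = (uintptr_t)user_var;
    for (size_t i = 0; i < sync_count; i++)
        if (sync_table[i].vaddr == key) return sync_table[i].kobj;
    return NULL;
}
```

In this model, overwriting the user variable does not affect the lookup, which matches the statement that the stored value is not used by the kernel.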
B) Mutex
The user level, POSIX compliant, mutex is defined in the pthread library, implemented by the pthread.h, shared_pthread.h, and pthread.c files.
It can be used by a multi-threaded user application to give a thread exclusive access to a shared user object.
The kernel implementation of a mutex is defined in the remote_mutex.h and remote_mutex.c files.
For each user mutex, ALMOS-MKH creates a kernel remote_mutex_t structure, dynamically allocated in the reference cluster (i.e. in the cluster containing the reference process descriptor).
- The remote_mutex_create() function allocates and initializes a mutex, using an RPC if the calling thread is not running in the reference cluster.
- The remote_mutex_destroy() function destroys a given mutex, using an RPC if the calling thread is not running in the reference cluster.
- The blocking remote_mutex_lock() function implements a descheduling policy when the mutex is already taken by another thread: the calling thread registers in the mutex waiting queue, and blocks on the THREAD_BLOCKED_USERSYNC condition.
- The remote_mutex_unlock() function, if the waiting queue is not empty, unblocks the first waiting thread in the queue without releasing the mutex.
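The lock and unlock policy described above can be illustrated by a simplified, single-threaded model. This is a sketch under assumptions, not the content of remote_mutex.c: the `model_mutex_t` type, its fields and the `model_mutex_*()` functions are hypothetical, thread identifiers are plain integers, and "blocking" is represented by a return value. It also assumes that "without releasing the mutex" means ownership passes directly to the first waiter, and that an unlock with an empty queue frees the mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8      /* arbitrary bound chosen for this sketch */
#define NO_THREAD (-1)   /* "no owner" / "nobody to unblock"       */

/* Hypothetical model of a mutex with a FIFO waiting queue. */
typedef struct {
    bool   taken;               /* true while some thread owns the mutex */
    int    owner;               /* identifier of the owning thread       */
    int    waiters[QUEUE_MAX];  /* FIFO of blocked thread identifiers    */
    size_t head, count;         /* ring-buffer state of the queue        */
} model_mutex_t;

void model_mutex_init(model_mutex_t *m)
{
    m->taken = false;
    m->owner = NO_THREAD;
    m->head  = 0;
    m->count = 0;
}

/* Returns true if the caller obtained the mutex, false if it was
 * queued (a real thread would block and be descheduled here). */
bool model_mutex_lock(model_mutex_t *m, int tid)
{
    if (!m->taken) {
        m->taken = true;
        m->owner = tid;
        return true;
    }
    assert(m->count < QUEUE_MAX);
    m->waiters[(m->head + m->count) % QUEUE_MAX] = tid;
    m->count++;
    return false;
}

/* Returns the identifier of the thread to unblock, or NO_THREAD. */
int model_mutex_unlock(model_mutex_t *m, int tid)
{
    assert(m->taken && m->owner == tid);
    if (m->count == 0) {           /* nobody is waiting: free the mutex */
        m->taken = false;
        m->owner = NO_THREAD;
        return NO_THREAD;
    }
    int next = m->waiters[m->head];
    m->head  = (m->head + 1) % QUEUE_MAX;
    m->count--;
    m->owner = next;               /* handed over: 'taken' stays true */
    return next;
}
```

With this hand-off, a thread that calls lock between the unlock and the wake-up of the first waiter still finds the mutex taken, so waiters are served in queue order.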
C) Condvar
The user level, POSIX compliant, condvar is defined in the pthread library, implemented by the pthread.h, shared_pthread.h, and pthread.c files.
It allows a given thread to efficiently wait for a change in a shared user object. A condvar must always be associated with a mutex.
The kernel implementation of a condvar is defined in the remote_condvar.h and remote_condvar.c files.
For each user condvar, ALMOS-MKH creates a kernel remote_condvar_t structure, dynamically allocated in the reference cluster (i.e. in the cluster containing the reference process descriptor).
- The remote_condvar_create() function allocates and initializes a condvar, using an RPC if the calling thread is not running in the reference cluster.
- The remote_condvar_destroy() function destroys a given condvar, using an RPC if the calling thread is not running in the reference cluster.
- The blocking remote_condvar_wait() function implements a descheduling policy: the calling thread registers in the condvar waiting queue, and blocks on the THREAD_BLOCKED_USERSYNC condition.
- The remote_condvar_signal() function allows another thread to unblock the first blocked thread waiting on a given condvar.
- The remote_condvar_broadcast() function allows another thread to unblock all threads waiting on a given condvar.
The three functions wait(), signal() and broadcast() must be called by a thread holding the mutex associated with the condvar.
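From the user side, the rule that wait() and signal() are called with the associated mutex held can be illustrated with the standard POSIX interface. The sketch below uses only standard pthread calls and has not been checked against the ALMOS-MKH pthread library, which may support only a subset of them. The names `data_lock`, `data_cond`, `data_ready`, `data_value`, `producer()` and `run_condvar_demo()` are invented for the example. The synchronization variables are globals initialized with the `_init()` functions, in line with the general principles above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Global synchronization variables and the shared user object. */
static pthread_mutex_t data_lock;
static pthread_cond_t  data_cond;
static bool            data_ready = false;
static int             data_value = 0;

/* Producer: modifies the shared object and signals the condvar
 * while holding the associated mutex. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&data_lock);
    data_value = 42;
    data_ready = true;
    pthread_cond_signal(&data_cond);
    pthread_mutex_unlock(&data_lock);
    return NULL;
}

/* Consumer: waits until the producer has published the value.
 * Returns the value read, or -1 if the setup failed. */
int run_condvar_demo(void)
{
    pthread_t tid;
    int       value;

    if (pthread_mutex_init(&data_lock, NULL) != 0) return -1;
    if (pthread_cond_init(&data_cond, NULL) != 0)  return -1;
    data_ready = false;
    if (pthread_create(&tid, NULL, producer, NULL) != 0) return -1;

    pthread_mutex_lock(&data_lock);
    while (!data_ready)            /* re-check the predicate after each wake-up */
        pthread_cond_wait(&data_cond, &data_lock);
    assert(data_ready);
    value = data_value;
    pthread_mutex_unlock(&data_lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&data_cond);
    pthread_mutex_destroy(&data_lock);
    return value;
}
```

The `while` loop around `pthread_cond_wait()` matters: if the producer runs first, the consumer never waits, and if the consumer is woken for any other reason, it re-checks the shared state before proceeding.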
D) Barrier
The user level, POSIX compliant, barrier is defined in the pthread library, implemented by the pthread.h, shared_pthread.h, and pthread.c files.
It can be used by a multi-threaded user application to implement a "rendez-vous" for a given number of threads running in different clusters. As the implementation uses a toggle variable, the same barrier can be safely used several times, as long as the number of expected threads does not change.
The kernel implementation of a barrier is defined in the remote_barrier.h and remote_barrier.c files.
For each user barrier, ALMOS-MKH creates a kernel remote_barrier_t structure, dynamically allocated in the reference cluster (i.e. in the cluster containing the reference process descriptor).
- The remote_barrier_create() function allocates and initializes a barrier, defining the number of expected threads, and uses an RPC if the calling thread is not running in the reference cluster.
- The remote_barrier_destroy() function destroys a given barrier, using an RPC if the calling thread is not running in the reference cluster.
- The blocking remote_barrier_wait() function returns only when all expected threads have reached the barrier. It implements a descheduling policy: a thread that is not the last expected thread registers in the barrier waiting queue and blocks on the THREAD_BLOCKED_USERSYNC condition. The last thread resets the barrier and unblocks all waiting threads.
N.B. In the current implementation, the barrier is implemented in the reference cluster, which creates contention when the number of threads is large. A distributed implementation exists, where the barrier is implemented as a quad-tree, entirely in user space (no system calls), but it assumes that the target architecture has a 2D mesh topology.
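The role of the toggle variable in making a barrier reusable can be illustrated by a simplified, single-threaded model. It is a generic sketch, not the content of remote_barrier.c: the text only states that a toggle variable is used, so the `model_barrier_t` type, its fields and the `model_barrier_*()` functions are hypothetical, and the way the kernel combines the toggle with its waiting queue is not described here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a reusable barrier with a toggle variable. */
typedef struct {
    unsigned expected;  /* number of threads expected at the rendez-vous  */
    unsigned arrived;   /* threads that have arrived in the current round */
    bool     toggle;    /* flipped each time a round completes            */
} model_barrier_t;

void model_barrier_init(model_barrier_t *b, unsigned expected)
{
    assert(expected > 0);
    b->expected = expected;
    b->arrived  = 0;
    b->toggle   = false;
}

/* Called by an arriving thread.
 * Returns true for the last expected thread, which resets the barrier
 * (and would unblock the others); returns false for a thread that would
 * block. *wait_for receives the toggle value marking the end of the round. */
bool model_barrier_arrive(model_barrier_t *b, bool *wait_for)
{
    *wait_for = !b->toggle;        /* value the toggle takes when the round ends */
    b->arrived++;
    if (b->arrived == b->expected) {
        b->arrived = 0;            /* reset the count for the next round */
        b->toggle  = !b->toggle;   /* flip: this round is complete       */
        return true;
    }
    return false;
}
```

Because each round waits for the opposite toggle value, a thread that re-enters the barrier immediately after a round completes cannot be confused with a late thread from the previous round, provided the expected count stays the same.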
E) Semaphore
The user level, POSIX compliant, semaphore is defined in the semaphore library, implemented by the semaphore.h, shared_semaphore.h, and semaphore.c files.
It can be used by a multi-threaded user application to synchronize user threads running in different clusters, through the wait and post primitives.
The kernel implementation of a semaphore is defined in the remote_sem.h and remote_sem.c files.
For each user semaphore, ALMOS-MKH creates a kernel remote_sem_t structure, dynamically allocated in the reference cluster (i.e. in the cluster containing the reference process descriptor).
- The remote_sem_create() function allocates and initializes a semaphore, using an RPC if the calling thread is not running in the reference cluster.
- The remote_sem_destroy() function destroys a given semaphore, using an RPC if the calling thread is not running in the reference cluster.
- The blocking remote_sem_wait() function returns only when the semaphore has a non-zero value, and has been atomically decremented. If the semaphore has a zero value, the calling thread registers in the semaphore waiting queue, and blocks on the THREAD_BLOCKED_USERSYNC condition.
- The remote_sem_post() function atomically increments the semaphore. If the waiting queue is not empty, it unblocks all waiting threads.
- The remote_sem_get_value() function returns the semaphore current value, without modifying the semaphore state.
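The wait / post behavior described above can be illustrated by a simplified, single-threaded model. It is a sketch under assumptions, not the content of remote_sem.c: the `model_sem_t` type and the `model_sem_*()` functions are hypothetical, the waiting queue is reduced to a counter, and "blocking" is represented by a return value. It further assumes that a thread unblocked by a post retries the wait, since the wait may return only after a successful decrement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a counting semaphore with a waiting queue. */
typedef struct {
    unsigned value;    /* current semaphore value                 */
    unsigned waiting;  /* number of threads in the waiting queue  */
} model_sem_t;

void model_sem_init(model_sem_t *s, unsigned value)
{
    assert(s != NULL);
    s->value   = value;
    s->waiting = 0;
}

/* One attempt at wait: returns true if the value was non-zero and has
 * been decremented, false if the caller registered in the waiting
 * queue (a real thread would block here and retry once unblocked). */
bool model_sem_wait(model_sem_t *s)
{
    if (s->value > 0) {
        s->value--;
        return true;
    }
    s->waiting++;
    return false;
}

/* Increments the value and empties the waiting queue.
 * Returns the number of threads unblocked. */
unsigned model_sem_post(model_sem_t *s)
{
    unsigned woken = s->waiting;
    s->value++;
    s->waiting = 0;
    return woken;
}

/* Returns the current value without modifying the semaphore. */
unsigned model_sem_get_value(const model_sem_t *s)
{
    return s->value;
}
```

In this model, a single post wakes every waiter but only one of them succeeds when it retries; the others find a zero value and register again in the waiting queue.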