Changes between Version 24 and Version 25 of AtomicOperations


Timestamp:
Dec 1, 2017, 6:39:22 PM (7 years ago)
Author:
alain
Comment:

--

  • AtomicOperations

    v24 v25  
    88 * The '''LL/SC''' (Linked Load / Store Conditional) operation is implemented as two specific VCI transactions. As the LL/SC instructions are implemented in the MIPS32 instruction set, these instructions can be used by both the kernel code and by the application code to read the data at address X, test and modify this data, and write the modified data at the same address X, with the guarantee that no other access to this address occurred between the read and the write access.
    99 
    10  * The '''CAS''' (Compare and Swap) operation is implemented as a specific VCI transaction. As there is no CAS instruction in the MIPS32 instruction set, this operation is only used by the L1 cache controller for some low-level, synchronisation primitives such as updating a page table entry. 
     10 * The '''CAS''' (Compare and Swap) operation is implemented as a specific VCI transaction. As there is no CAS instruction in the MIPS32 instruction set, this operation is only used by the L1 cache controller for some low-level synchronisation mechanisms, such as updating a page table entry in kernel memory. 
    1111
    1212For both types of operation, the address is assumed to be aligned on a 32-bit word boundary, and the data is assumed to be a 32-bit word.
     
    1919=== 2.1 General Principle ===
    2020
    21 From a conceptual point of view, the atomicity is handled on the memory controller side, that is actually the L2 cache controller in the TSAR architecture. Each L2 cache controller contains a list of all pending LL/SC atomic operations in an associative ''reservation table'', that contains 32 entries. This table must record the address X. It does not need to record the client identifier(s), but it must identify the LL/SC operation so that two LL/SC operations on the same address X do not get mixed up.
    22  * When a processor P executes the LL(X) instruction for an address X, this reservation request is sent to the L2 cache by the L1 cache. The L2 cache allocates a 32 bits authentication key for this reservation. It registers both the X address and the K key in the associative ''reservation table'', and returns both the value stored at address X and the K value to the L1. Both the X address and the K key are also registered in the L1 cache. If another processor P' requests a reservation for the same address X, it receives the saved K value from the L2 cache.
     21From a conceptual point of view, the atomicity is handled on the memory controller side, that is actually the L2 cache controller in the TSAR architecture. Each L2 cache controller contains a list of all pending LL/SC atomic operations in an associative ''reservation table'', that contains 32 entries. Each entry registers the X address, plus a 32-bit K key identifying a given LL/SC operation. It does not register the client identifier, but the K key prevents two successive LL/SC operations on the same address X from being mixed up.
     22 
     23 * When a processor P executes the LL(X) instruction for an address X, this reservation request is sent to the L2 cache by the L1 cache. The L2 cache allocates a 32-bit authentication key for this reservation. It registers both the X address and the K key in the associative ''reservation table'', and returns both the value stored at address X and the K value to the L1. Both the X address and the K key are also registered in the L1 cache. If another processor P' requests a reservation for the same address X, it receives the registered K value from the L2 cache.
    2324
    24  * When a processor P executes the SC(X,D) instruction to an address X, this conditional write is sent to the L2 cache by the L1 cache, and the command contains both the reservation key K and the data D to be written. The L2 cache makes an associative search in the ''reservation table''. If a reservation with the same address X and the same key K is found, the atomic operation is a success: the reservation is canceled in the ''reservation table'', the D value is written at address X, and a ''success'' value is returned to the L1 cache. If there is no match in the ''reservation table'', the atomic operation is a failure: the D value is not written at address X, the ''reservation table'' is not modified, and a ''failure'' value is returned to the L1 cache.
     25 * When a processor P executes the SC(X,D) instruction to an address X, this conditional write is sent to the L2 cache by the L1 cache, and the command contains both the reservation key K and the data D to be written. The L2 cache makes an associative search in the ''reservation table''. If a valid reservation with the same address X and the same key K is found, the atomic operation is a success: the reservation is canceled in the ''reservation table'', the D value is written at address X, and a ''success'' value is returned to the L1 cache. If there is no match in the ''reservation table'', the atomic operation is a failure: the D value is not written at address X, the ''reservation table'' is not modified, and a ''failure'' value is returned to the L1 cache.
    2526
    2627Clearly, in case of concurrent LL/SC access to the same address X by two or more L1 caches, the winner is defined by the first SC(X,D) instruction received by the L2 cache.
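
The LL/SC handshake described above can be sketched as a small behavioural model. This is only an illustration, not the TSAR hardware: the class name `ReservationTable`, the counter-based key allocation, and the eviction of the oldest entry when the table is full are assumptions that the text does not specify. The 0/1 return values of `sc` follow the L2 encoding (0 for success, 1 for failure).

```python
class ReservationTable:
    """Toy model of the L2 reservation table (hypothetical, not the TSAR RTL)."""

    def __init__(self, nb_entries=32):
        self.nb_entries = nb_entries   # 32 entries, as stated in the text
        self.entries = {}              # address X -> key K (valid reservations)
        self.memory = {}               # address X -> 32-bit word
        self.next_key = 0              # assumption: keys come from a counter

    def ll(self, x):
        """LL(X): return (value at X, key K), registering X if needed."""
        assert x % 4 == 0, "address must be aligned on a 32-bit word"
        if x not in self.entries:
            if len(self.entries) >= self.nb_entries:
                # assumption: evict the oldest reservation when the table is full
                del self.entries[next(iter(self.entries))]
            self.entries[x] = self.next_key
            self.next_key = (self.next_key + 1) & 0xFFFFFFFF
        # an existing reservation on X returns the already registered key
        return self.memory.get(x, 0), self.entries[x]

    def sc(self, x, d, k):
        """SC(X,D,K): return 0 on success, 1 on failure."""
        if self.entries.get(x) == k:
            del self.entries[x]                 # the reservation is canceled
            self.memory[x] = d & 0xFFFFFFFF     # D is written at address X
            return 0
        return 1                                # no match: nothing is modified


# Two processors P and P' race on the same address X = 0x100.
l2 = ReservationTable()
v0, k0 = l2.ll(0x100)            # P reserves X
v1, k1 = l2.ll(0x100)            # P' receives the same registered key
first = l2.sc(0x100, 1, k1)      # P' sends its SC first and wins
second = l2.sc(0x100, 2, k0)     # P fails: the reservation is already canceled
```

In this run both processors hold the same key, so the winner is decided purely by which SC reaches the L2 model first.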
     
    6869The actual encoding of the (success/failure) response for a SC(X,D) instruction depends on the processor core: for the MIPS32
    6970and ARM processors, a success is encoded as a non-zero value. For the PPC processor, a success is encoded as a zero value.
    70 In the TSAR architecture, the memory cache controller returns the value 0 for a success, and the value 1 for a failure to a SC(X,D,K) VCI command.
    71 If the architecture uses a MIPS or ARM processor, the SC value must be transcoded by the L1 cache controller before
     71In the TSAR architecture, the L2 cache controller returns the value 0 for a success, and the value 1 for a failure to a SC(X,D,K) VCI command.
     72If the architecture uses a MIPS or ARM processor, the value returned by the L2 cache must be transcoded by the L1 cache controller before
    7273being transmitted to the processor core.
    7374 
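
The transcoding step performed by the L1 cache controller can be illustrated with a short sketch. The function name `transcode_sc_response` and its `success_is_nonzero` parameter are hypothetical; the sketch assumes that a core expecting a non-zero success value (such as the MIPS32) receives 1 for a success and 0 for a failure.

```python
def transcode_sc_response(l2_value, success_is_nonzero):
    """Convert the L2 response (0 = success, 1 = failure) to the core encoding."""
    if l2_value not in (0, 1):
        raise ValueError("unexpected L2 response: %r" % (l2_value,))
    if success_is_nonzero:
        # the core expects a non-zero value on success: invert the L2 value
        return 1 - l2_value
    # the core expects zero on success: the L2 value is passed through
    return l2_value


# L2 reports a success (0); a MIPS32 core must see a non-zero value.
mips_success = transcode_sc_response(0, success_is_nonzero=True)
# L2 reports a failure (1); a MIPS32 core must see zero.
mips_failure = transcode_sc_response(1, success_is_nonzero=True)
```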
     
    8283                        _itmask                 # enter critical section
    8384# lock acquisition
    84 loop                    LL r1,   0(r4)          # r1 <= M[r4]
    85                         BNEZ r1, loop           # retry if lock already taken (r1 != 0)
    86                         ORI r1,  r0, 1          # r1 <= 1
    87                         SC  r1,  0(r4)          # if atomic (M[r4] <= 1 / r1 <= 1) else (r1 <= 0)
    88                         BEQZ r1, loop           # retry if not atomic (r1 == 0)
     85loop                    ll r1,   0(r4)          # r1 <= M[r4]
     86                        bnez r1, loop           # retry if lock already taken (r1 != 0)
     87                        ori r1,  r0, 1          # r1 <= 1
     88                        sc  r1,  0(r4)          # if atomic (M[r4] <= 1 / r1 <= 1) else (r1 <= 0)
     89                        beqz r1, loop           # retry if not atomic (r1 == 0)
    8990                        ...
    9091# lock release
    91                         ORI r1, r0, 0           # r1 <= 0
    92                         SW r1, 0(r4)            # M[r4] <= 0
     92                        ori r1, r0, 0           # r1 <= 0
     93                        sw r1, 0(r4)            # M[r4] <= 0
     94
    9395                        _itunmask               # exit critical section
    9496}}}
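
The control flow of the spin-lock above can be mirrored in a small executable model. This is a sketch under assumptions: `LockWord`, `try_acquire` and `release` are hypothetical names, the model holds a single word with a single reservation, `sc` uses the MIPS32 encoding (1 for success, 0 for failure), and a plain store is assumed to cancel any pending reservation on the word.

```python
class LockWord:
    """One memory word with a minimal LL/SC model (hypothetical helper)."""

    def __init__(self):
        self.value = 0            # 0 = lock free, 1 = lock taken
        self.reserved_key = None  # key of the pending reservation, if any
        self.next_key = 0         # assumption: keys come from a counter

    def ll(self):
        if self.reserved_key is None:
            self.reserved_key = self.next_key
            self.next_key += 1
        return self.value, self.reserved_key

    def sc(self, data, key):
        # MIPS32 encoding seen by the core: 1 on success, 0 on failure
        if self.reserved_key is None or key != self.reserved_key:
            return 0
        self.value = data
        self.reserved_key = None
        return 1

    def sw(self, data):
        self.value = data
        self.reserved_key = None  # assumption: a store cancels the reservation


def try_acquire(lock):
    """One pass through the acquisition loop; True if the lock was taken."""
    r1, key = lock.ll()           # ll   r1, 0(r4)
    if r1 != 0:                   # bnez r1, loop  (lock already taken)
        return False
    return lock.sc(1, key) == 1   # ori / sc / beqz (retry if not atomic)


def release(lock):
    lock.sw(0)                    # ori r1, r0, 0 ; sw r1, 0(r4)
```

A caller would spin with `while not try_acquire(lock): pass`, which corresponds to the two branches back to `loop` in the assembly.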