Changes between Version 7 and Version 8 of AtomicOperations


Timestamp:
Oct 5, 2009, 1:37:40 PM (15 years ago)
Author:
guthmull
Comment:

--

  • AtomicOperations

    v7 v8  
    3939== 3.  Cachable atomic operations ==
    4040
    41 In order to support cachable spin-locks, the memory cache controller, and the L1 cache controller must cooperate to implement the LL/SC mechanism.
     41In order to support cachable spin-locks and better scalability, the memory cache controller and the L1 cache controller must cooperate to implement the LL/SC mechanism. However, the previous semantics of the LL/SC mechanism must be modified.
    4242
    43 === 3.1 memory cache controller ===
     43Furthermore, the LL/SC mechanism is extended to support both 32-bit and 64-bit atomic accesses.
    4444
    45 The memory cache controller contains a dedicated storage that is used to register, for each cache line the set of  L1 caches that have copies. These sets of copies are implemented as linked lists of SRCIDs. To implement the Reservation Table, we just introduce, for each registered copy of a cache line, (i.e. each entry in this Reservation Table) one extra bit to register a pending LL/SC atomic operation.
    46 This approach is scalable, but creates the possibility of “false conflicts”, when several atomic access are done to the same cache line.
     45=== 3.1 new semantics ===
     46
     47The previous semantics are:
     48(1) The Store Conditional succeeds if there was no other Store Conditional at the same address since the last Linked Load.
     49
     50The new semantics are:
     51(2) The Store Conditional succeeds if the content of the memory has not changed since the last Linked Load.
     52
     53Clearly, (1) => (2). However, (2) => (1) holds only when the software does not use the LL/SC mechanism to monitor WRITEs. Such use of the LL/SC mechanism must be avoided.
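
The caveat about monitoring WRITEs can be illustrated with a small model. The sketch below is not part of the protocol specification: `ValueCompareMemory` and its methods are hypothetical names, and the memory is reduced to a Python dictionary. It shows that, under semantics (2), a Store Conditional still succeeds when plain writes changed the word and then restored its original value between the Linked Load and the Store Conditional.

```python
class ValueCompareMemory:
    """Toy memory applying semantics (2): SC succeeds if the value is unchanged."""

    def __init__(self):
        self.words = {}  # address -> value, unwritten words read as 0

    def ll(self, address):
        # A Linked Load is a plain read; the caller keeps the value it read.
        return self.words.get(address, 0)

    def write(self, address, value):
        # Plain WRITE, not tracked by the LL/SC mechanism.
        self.words[address] = value

    def sc(self, address, data_read, data_to_write):
        # Compare the value seen by the LL with the current content.
        if self.words.get(address, 0) != data_read:
            return False
        self.words[address] = data_to_write
        return True


mem = ValueCompareMemory()
seen = mem.ll(0x100)                 # reads 0
mem.write(0x100, 1)                  # another writer changes the word...
mem.write(0x100, 0)                  # ...and restores the original value
undetected = mem.sc(0x100, seen, 5)  # succeeds: the writes went unnoticed
stale = mem.sc(0x100, seen, 9)       # fails: the word now holds 5, not 0
```

Software that relies on the Store Conditional failing after any intervening write would be misled by the first `sc` call, which is why such use has to be avoided.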
     54
     55=== 3.2 new protocol ===
     56
     57In the new protocol, LL requests are no longer sent on the network:
     58
     59 * Linked Loads become simple Reads, and the data sent to the processor is recorded in a register of the L1 cache. When the processor issues a Store Conditional, the L1 cache sends an "SC" packet, in which the first flits contain the data previously read and the last flits contain the data to write to the memory. The new SC packet is therefore 2 flits long (32-bit accesses) or 4 flits long (64-bit accesses).
     60
     61 * The memory cache controller then compares the data read by the L1 cache to the data in the memory cache. If the two values are equal, the Store Conditional is issued to the memory and the response to the SC is TRUE; otherwise, the Store Conditional is not issued and the response is FALSE.
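
The two steps above can be sketched as follows. This is a minimal model under stated assumptions, not the hardware implementation: the function names are hypothetical, a flit is modelled as one 32-bit integer, the low word of a 64-bit value is assumed to travel first, and the memory cache is a plain dictionary.

```python
MASK32 = 0xFFFFFFFF


def split_words(value, long_access):
    # Split a value into 32-bit flits, low word first (assumed ordering).
    if long_access:
        return [value & MASK32, (value >> 32) & MASK32]
    return [value & MASK32]


def join_words(words):
    # Rebuild a value from its 32-bit flits, low word first.
    value = 0
    for index, word in enumerate(words):
        value |= word << (32 * index)
    return value


def build_sc_packet(data_read, data_to_write, long_access=False):
    # First flits: data previously read; last flits: data to write.
    return split_words(data_read, long_access) + split_words(data_to_write, long_access)


def handle_sc(memory, address, packet):
    # Memory cache side: compare, then write only if the values are equal.
    half = len(packet) // 2
    data_read = join_words(packet[:half])
    data_to_write = join_words(packet[half:])
    if memory.get(address, 0) == data_read:
        memory[address] = data_to_write
        return True
    return False


memory = {0x100: 7}
packet32 = build_sc_packet(7, 8)
packet64 = build_sc_packet(0x100000002, 3, long_access=True)
first = handle_sc(memory, 0x100, packet32)   # memory holds 7: SC is performed
second = handle_sc(memory, 0x100, packet32)  # memory now holds 8: SC is refused
```

Here `packet32` has 2 flits and `packet64` has 4, matching the packet lengths given above.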
     62
     63=== 3.3 memory cache controller ===
     64
     65The memory cache controller no longer has a linked access buffer. It only needs to record pending Store Conditionals in the table of transactions to the external RAM controller when a miss occurs in the memory cache.
    4766
    4867The actions done by the memory cache controller for the various commands are described below :
    4968
    50 '''LL(SRCID, X)'''
    51 {{{
    52    Scan all copies associated to the cache line containing the X address
    53   If ( a copy corresponding to SRCID.exists) {
    54       RESERVED = true
    55   } else {
    56       a new copy corresponding to SRCID is created in the linked list
    57       and marked RESERVED in the linked list
    58   }
    59  }}}
    6069
    6170'''SC(SRCID, X)'''
    6271{{{
    63   Scan all copies associated to the cache line containing the X address
    64   If ( a copy corresponding to SRCID.exists and RESERVED == true ) {
    65       - scan again the linked list of copies to send an UPDATE request
    66         to the other L1 caches, and invalidate all RESERVED bits
     72  Read the data in the memory cache.
     73  If ( data read by the L1 cache == data from the memory cache ) {
    6774      - write data in the memory cache
    68       - after all responses to UPDATE have been received, return true
     75      - send updates or invalidates to the owners of this cache line
     76      - after all responses to UPDATE or INVALIDATE have been received, return true
    6977        to the L1 cache.   
    7078  } else {
     
    7684{{{
    7785  - Scan the linked list of copies to send an UPDATE request
    78     to the L1 caches (other than SRCID), and invalidate all RESERVED bits
     86    to the L1 caches (other than SRCID)
    7987  - Write data in the memory cache
    8088  - after all responses to UPDATE have been received, acknowledge the
     
    8694  If ( cachable request ) {
    8795    - register the SRCID in the list of copies associated to the X address.
    88     - return the complete cache line
     96    - return the complete or partial cache line
    8997  } else {
    9098    - return a single word.
     
    92100}}}
    93101
    94 === 3.2  L1 cache controller ===
     102=== 3.4  L1 cache controller ===
    95103
    96 The L1 cache controller receiving a new LL(X) request from the processor must locally register this reservation on the X address to validate the use of the locally cached copy, and to check the address when it receives a SC(X) request from the processor. This requires an extra register to store the address, and a RESERVED flip-flop in the L1 cache controller.
     104The L1 cache controller receiving a new LL(X) request from the processor must locally register the X address of this request and the data sent to the processor. When it receives an SC(X) request from the processor, it checks the LL register and, if this register is valid, it sends a Store Conditional to the memory cache controller containing both the data read by the LL(X) and the data to write. This requires an extra register to store the address and the data, a VALID flip-flop, and a LONG flip-flop (for 64-bit LL/SC) in the L1 cache controller.
    97105
    98106The actions done by the L1 cache controller for the various commands are described below :
     
    100108'''LL(X) from processor'''
    101109{{{
    102   If (RESERVED = true & ADDRESS = X) {    // local spin-lock         
     110  If (VALID = true & ADDRESS = X) {    // local spin-lock         
    103111    return the read data to the processor
    104112  } else {                                          // first LL access                       
    105     RESERVED <= true
     113    VALID <= true
    106114    ADDRESS <= X
    107     send a LL(X) request to memory cache,
     115    DATA <= READ(X),
    108116    and return the read value to the processor
    109117  }
     
    112120'''SC(X) from processor'''
    113121{{{
    114   If (RESERVED = true & ADDRESS = X)  {    // possible success
    115     send a SC(X) request to the memory cache,
     122  If (VALID = true & ADDRESS = X)  {    // possible success
     123    send a SC(X) request to the memory cache containing the data stored in the LL register and the data to write,
    116124    and return the Boolean response to the processor
    117     RESERVED <= false
     125    VALID <= false
    118126  } else {                                         // failure 
    119127   return a false value to the processor
    120    RESERVED <= false
     128   VALID <= false
    121129  }
    122130}}}
     
    124132'''INVAL(L) or UPDATE(L) from memory controller'''
    125133{{{
    126   If (ADDRESS = L)  {                        // invalidate reservation
    127     RESERVED <= false
    128   }
    129   and the L1 cache is updated or invalidated.
     134  The LL register must not be invalidated.
     135  The L1 cache is updated or invalidated.
    130136}}}
    131137
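
The v8 behaviour of the L1 cache controller can be modelled in a few lines. The sketch below is only an illustration under simplifying assumptions: `L1Controller` is a hypothetical name, the memory cache is a shared dictionary, the value comparison normally done by the memory cache controller is folded into `sc`, and the LONG flip-flop and the coherence traffic are left out.

```python
class L1Controller:
    """Toy model of the LL register (VALID, ADDRESS, DATA) of one L1 cache."""

    def __init__(self, memory):
        self.memory = memory  # dict standing in for the memory cache
        self.valid = False
        self.address = None
        self.data = None

    def ll(self, x):
        if self.valid and self.address == x:  # local spin-lock
            return self.data
        self.valid = True                     # first LL access
        self.address = x
        self.data = self.memory.get(x, 0)     # DATA <= READ(X)
        return self.data

    def sc(self, x, value):
        hit = self.valid and self.address == x
        self.valid = False                    # cleared on both paths
        if not hit:
            return False                      # failure without any request
        # SC request: carries the data read by the LL and the data to write.
        if self.memory.get(x, 0) == self.data:
            self.memory[x] = value
            return True
        return False

    def inval_or_update(self, line):
        # The LL register must not be invalidated; only the cached line
        # (not modelled here) would be updated or invalidated.
        pass


lock = 0x40
memory = {lock: 0}
cache_a, cache_b = L1Controller(memory), L1Controller(memory)
cache_a.ll(lock)
cache_b.ll(lock)
a_takes_lock = cache_a.sc(lock, 1)   # memory still holds 0: success
cache_b.inval_or_update(lock)        # LL register of cache_b is kept
b_takes_lock = cache_b.sc(lock, 1)   # memory holds 1, cache_b read 0: failure
```

In this run only `cache_a` acquires the lock: the Store Conditional of `cache_b` is rejected by the value comparison, not by the loss of a reservation.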