The first and easiest one is to keep treating an initiator as fully locked until the real response is caught. In that case, the only thing to do when a null_response is caught is to send a null_command to the interconnect with the same time information as the null_response. The initiator does not even need to be woken up. This method is called Passive_Sync.

The second one is dedicated to modeling advanced multi-transactional components. There is a gap between the null_response time and the initiator's time. This gap may contain useful cycles to simulate, and those cycles can themselves initiate a transaction. To prevent such a request from being delayed, the cycles in the gap need to be simulated. When a null_response is caught, the initiator is woken up and is allowed to continue its processing until either a new transaction is sent or the null_response time is reached, in which case a null_command is sent. This method is called Active_Sync.
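
The difference between the two methods can be illustrated with a minimal C++ sketch. It is not taken from the SoCLib sources: `SyncMode`, `Message`, `StepFn` and `on_null_response` are hypothetical names, and the sketch assumes that the initiator's local time is lower than the null_response time and that one call to the step hook simulates exactly one cycle.

```cpp
#include <cstdint>
#include <functional>

// All names here are hypothetical; they only illustrate the two methods.
enum class SyncMode { Passive, Active };
enum class MsgKind { NullCommand, Transaction };

struct Message {
    MsgKind kind;
    std::uint64_t time;  // timestamp carried by the message
};

// Hypothetical hook: simulates one cycle of the initiator at time `t` and
// returns true if that cycle issues a new transaction.
using StepFn = std::function<bool(std::uint64_t t)>;

// Decide what the initiator sends after catching a null_response stamped
// `null_time` while its own local time is `local_time`.
Message on_null_response(SyncMode mode, std::uint64_t local_time,
                         std::uint64_t null_time, const StepFn& step) {
    if (mode == SyncMode::Active) {
        // Active_Sync: simulate the cycles in the gap and stop at the
        // first cycle that issues a transaction.
        for (std::uint64_t t = local_time; t < null_time; ++t) {
            if (step(t)) {
                return {MsgKind::Transaction, t};
            }
        }
    }
    // Passive_Sync, or Active_Sync with no transaction in the gap:
    // answer with a null_command carrying the null_response time.
    return {MsgKind::NullCommand, null_time};
}
```

In Passive_Sync the step hook is never called, which reflects the fact that the initiator is not woken up. In Active_Sync a request found inside the gap leaves with its own timestamp instead of being delayed to the null_response time.
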
In most cases, a multi-transactional component is composed of several threads in its CABA model. These threads model various behaviors, such as access control to a hardware resource or the use of that resource by a dataflow.

In a CABA simulation, a multi-threaded component works well because all threads advance in time together. The only issue that can occur is concurrent access to a single hardware resource. In a TLMDT simulation, the possible desynchronization between threads has to be taken into account. This requires strong synchronization between the threads to prevent accidental reordering of transactions, and the cost of this synchronization is not negligible.

In TLMDT, however, a component is modeled using a single thread. To represent the multi-threaded behavior, every CABA thread is modeled by a timer in the TLMDT model. The internal model of a component is then divided into two major sections. The first is a scheduler whose job is to determine which action can be computed; the second performs the processing of the elected action. Once processing has started, it is not interrupted until it completes or requests access to a shared hardware resource.
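
The scheduler/processing split described above can be sketched as follows. This is only an illustration under stated assumptions: `Timer`, `elect` and `treat` are hypothetical names, and the election policy (pick the ready timer with the smallest time, lowest index on ties) is an assumption, since the text does not specify how the scheduler chooses.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical names: one timer stands in for one CABA thread.
struct Timer {
    std::uint64_t time;  // local time of the modeled thread
    bool ready;          // false while the thread waits on a shared resource
};

// Scheduler section: elect the ready timer with the smallest time
// (assumed policy), so that actions are processed in timestamp order.
std::optional<std::size_t> elect(const std::vector<Timer>& timers) {
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < timers.size(); ++i) {
        if (!timers[i].ready) continue;
        if (!best || timers[i].time < timers[*best].time) best = i;
    }
    return best;
}

// Processing section: run the elected action to completion, which
// advances only the timer of the elected thread by `duration` cycles.
void treat(std::vector<Timer>& timers, std::size_t i, std::uint64_t duration) {
    timers[i].time += duration;
}
```

Because a single SystemC-level thread calls `elect` and then `treat` in a loop, no synchronization between the modeled threads is needed; their ordering is decided entirely by the timers.
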
The interconnects are the components most modified by the "TLMDT for tightly interdependent architectures with several levels of interconnections" specification. The Local Crossbar has to handle the new synchronization protocol. It has a time quantum (Δqlc), which determines the maximum allowed desynchronization for each target.

Pseudocode:

{{{#!c++
// Global synchronization
// T = time, Δ = quantum
If (T.global_input + Δqlc < T.local_crossbar)
    If (null_command or vci_transaction received from global crossbar)
        T.global_input = T.local_crossbar
Else If (arbitration ok : Req = handled request)
    T.local_crossbar = T.Req
    // Local synchronization
    For every local target
        If (T.local_target + Δqlc < T.local_crossbar)
            send a null_command to the target
            T.local_target = T.local_crossbar
    // Global synchronization
    If (T.global_input + Δqlc < T.local_crossbar)
        send a null_command to the global_crossbar
    // Initiators synchronization
    Else If (Req.type == Sync_request)
        send the response to this transaction
    // Cluster unlock
    Else If (Req.type == Null_command && input != global_input)
        send a Null_command to every local target and global crossbar
    // Routing
    Else If (Req.type == vci_transaction && input == global_input)
        T.global_input = T.local_crossbar
        Routing
    Else If (Req.type == vci_transaction && input != global_input)
        Routing
Else // arbitration failed
    send a null_response for any non-primary blocking or conditionally blocking request for which the interconnect has not sent one before
    wait for a new incoming transaction
}}}
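
As an illustration, the local synchronization step of the pseudocode above could look like the following C++ sketch. `sync_local_targets` is a hypothetical helper, not a SoCLib function; sending a null_command is represented here simply by recording the target index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the local synchronization step.
// For every local target whose time lags the crossbar time by more than
// the quantum, record a null_command and bring the target time up to date.
// Returns the indices of the targets that received a null_command.
std::vector<std::size_t> sync_local_targets(
        std::vector<std::uint64_t>& target_times,
        std::uint64_t crossbar_time,
        std::uint64_t quantum) {
    std::vector<std::size_t> notified;
    for (std::size_t i = 0; i < target_times.size(); ++i) {
        if (target_times[i] + quantum < crossbar_time) {
            notified.push_back(i);  // stands for: send a null_command
            target_times[i] = crossbar_time;
        }
    }
    return notified;
}
```

For example, with target times 100, 150 and 90, a crossbar time of 200 and a quantum of 50, the first and third targets are notified and moved to time 200, while the second one (exactly one quantum behind) is left alone.
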
The Global Crossbar is affected even more than the Local Crossbar, because synchronization is relaxed on it in order to break the strong global dependencies and to always allow every cluster to feed its targets with times. Its duty is to release all clusters that are not too far ahead (as determined by its time quantum, Δqgc) and to route transactions. Because a cluster can be released even if another one has a lower timer, and because a cluster does not need to synchronize with the Global Crossbar until its desynchronization reaches a certain value, some precision is lost; in exchange, the simulation becomes more parallel and the number of PDES transactions is reduced.

Pseudocode:

{{{#!c++
T.global_crossbar = min(all T inputs)
For every input
    If (T.input <= T.global_crossbar + Δqgc)
        // Synchronization
        If (Req.type == Null_command)
            send back another null_command with the same temporal information
        // Routing
        If (Req.type == vci_transaction)
            T.Req = T.target_port
            Routing
    Else
        // Initiator unlock
        If (Req.type == vci_transaction)
            send a null_response
}}}
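
The release decision of the Global Crossbar can be sketched in C++ as below. The types `Input`, `ReqType` and `Action` and the function `global_crossbar_step` are hypothetical and only classify each input; the actual routing and the update `T.Req = T.target_port` from the pseudocode are left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical types; they only name the cases of the pseudocode.
enum class ReqType { NullCommand, VciTransaction };
enum class Action { EchoNullCommand, Route, SendNullResponse, None };

struct Input {
    std::uint64_t time;  // time of the pending request on this input
    ReqType req;         // type of the pending request
};

// Classify every input against the crossbar time, i.e. the minimum
// input time, and the quantum (Δqgc in the text).
std::vector<Action> global_crossbar_step(const std::vector<Input>& inputs,
                                         std::uint64_t quantum) {
    std::vector<Action> actions;
    if (inputs.empty()) return actions;
    const auto slowest = std::min_element(
        inputs.begin(), inputs.end(),
        [](const Input& a, const Input& b) { return a.time < b.time; });
    const std::uint64_t t_gc = slowest->time;
    for (const Input& in : inputs) {
        if (in.time <= t_gc + quantum) {
            // Not too far ahead: release the cluster or route its request.
            actions.push_back(in.req == ReqType::NullCommand
                                  ? Action::EchoNullCommand
                                  : Action::Route);
        } else {
            // Too far ahead: a transaction only gets a null_response.
            actions.push_back(in.req == ReqType::VciTransaction
                                  ? Action::SendNullResponse
                                  : Action::None);
        }
    }
    return actions;
}
```

With a quantum of 50 and input times 100, 120, 300 and 400, the first two inputs are served (they are within one quantum of the minimum time 100), while the last two are considered too far ahead.
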