| 74 | || '''Task 2.d''' : Event database access/history primitives, Task manager : '''LIP6''', Partners : '''ALL'''. This task aims at defining the Application Programming Interface (API) of the AIM database, in interaction with the API defined in Task 1.d. An AIM is a collection of logged events corresponding to a single monitored phenomenon (exact locations and characteristics). In particular, this task defines all the means needed to manipulate and enhance a database of AIMs, classified by observed phenomenon (temperature audit of the SoC, accurate identification of faulty parts, communication contention points, etc.), as well as their history for backtracking purposes. |
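
To make the notion of an AIM database more concrete, here is a minimal sketch in Python. It is not the API that Task 2.d will define. All names (`Event`, `AIM`, `AIMDatabase`, `store`, `latest`, `history`) and the fields of an event are assumptions chosen for illustration. It only shows the two properties stated above: AIMs are classified by observed phenomenon, and earlier versions are kept for backtracking.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One logged event (field names are hypothetical)."""
    timestamp: int   # e.g. a cycle count
    location: str    # e.g. "tile(0,0)"
    value: float     # measured characteristic


@dataclass
class AIM:
    """Collection of events for a single monitored phenomenon."""
    phenomenon: str
    events: list = field(default_factory=list)


class AIMDatabase:
    """AIMs classified by phenomenon, with their history kept oldest first."""

    def __init__(self):
        self._history = defaultdict(list)

    def store(self, aim):
        # Append instead of overwriting so that older AIMs remain available.
        self._history[aim.phenomenon].append(aim)

    def latest(self, phenomenon):
        versions = self._history.get(phenomenon)
        return versions[-1] if versions else None

    def history(self, phenomenon):
        # Return a copy so callers cannot alter the stored history.
        return list(self._history.get(phenomenon, []))


db = AIMDatabase()
db.store(AIM("temperature", [Event(100, "tile(0,0)", 71.5)]))
db.store(AIM("temperature", [Event(200, "tile(0,0)", 74.0)]))
```

Keeping every stored AIM, rather than only the most recent one, is what makes backtracking possible in this sketch; a real implementation would probably need to bound the history size.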
| 77 | |
| 78 | == WP3 : On-line application remapping == |
| 79 | |
| 80 | [[Image(htdocs:images/wp3.png)]] |
| 81 | |
| 82 | Based on the online monitoring information that has been gathered by the appropriate monitoring resources (WP1), then diagnosed and classified in the Architecture Instant Map Database AIM-DB (WP2), a collection of strategies for improving the application mapping is considered. This self-adaptability scheme may, depending on the system policy, aim at: |
| 83 | * Decreasing the system power consumption |
| 84 | * Ensuring real-time performance or more generally improving application performance |
| 85 | * Guaranteeing functionality in the presence of faulty hardware resources |
| 86 | |
| 87 | These strategies operate on a per-task basis, as defined by the application task graph. Only the following on-line operations are considered: |
| 88 | |
| 89 | * Migration. Tasks are moved from one HW resource to another for: |
| 90 |   * Lowering communication cost or power consumption, or improving performance when an alternative processing resource offers more appropriate support for a particular task. Additionally, if some of the processing resources have time-sliced execution capabilities (e.g. a CPU running a multitasking operating system), migrating tasks away from them results in higher performance, since time is shared among fewer tasks. |
| 91 |   * Avoiding mapping to a faulty hardware resource when online diagnosis identifies an imminent problem (increased current leakage, rising temperature, etc.). |
| 92 | |
| 93 | * Replication. Tasks that are identified as critical (forming a bottleneck) can be replicated in order to improve performance, provided that the application description allows it. Replication is temporary: it occurs only while a task is critical. Later on, when the performance demand drops below a given threshold, the replicated tasks are killed, freeing the corresponding processing resources. |
| 94 | |
| 95 | * Router reconfiguration. Either to improve communication performance (avoiding contention) or to deal with a hardware defect on some units or physical links, the routing tables may be changed at run-time. |
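
The temporary nature of replication can be sketched as a small threshold rule. This is only an illustration under stated assumptions: the function name `update_replicas`, the normalized `demand` value, the `high_threshold` used to detect a critical task and the `max_replicas` bound are all hypothetical. The text above only states that replicas are killed once the demand drops below a given threshold.

```python
def update_replicas(replicas, demand, low_threshold, high_threshold, max_replicas):
    """Return the new number of running copies of one task.

    replicas counts all copies, including the original one.
    """
    if demand > high_threshold and replicas < max_replicas:
        # The task is considered critical (assumed criterion): add one copy.
        return replicas + 1
    if demand < low_threshold and replicas > 1:
        # Demand dropped: kill every replica and keep only the original task.
        return 1
    return replicas


count = 1
trace = []
for demand in [0.5, 0.9, 0.95, 0.7, 0.2]:
    count = update_replicas(count, demand, low_threshold=0.3,
                            high_threshold=0.8, max_replicas=3)
    trace.append(count)
print(trace)
```

Using two distinct thresholds keeps the replica count stable while the demand stays between them, which avoids creating and killing replicas repeatedly around a single limit.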
| 96 | |
| 97 | The first two classes of operations may take place in different ways: |
| 98 | * a fully centralized scenario where, once the decision to remap the application has been taken, a global remapping is issued. |
| 99 | * a fully decentralized scenario where processing resources are all equally endowed with decision capabilities. |
| 100 | |
| 101 | Depending on the chosen approach (i.e. centralized or distributed), there may be a tight coupling between the application remapping (current WP) and online diagnosis (WP2). A centralized mapping strategy can operate using the system-level information issued by the online diagnosis and drive the corresponding remapping operations, whereas a fully distributed strategy runs differently. |
| 102 | The fundamental difference between the two principles lies in the following: |
| 103 | * In a centralized system, a single unit holds the global system-level monitoring information and may therefore take decisions and issue remapping orders. |
| 104 | * In a fully distributed system, each unit takes decisions independently of the others (possibly based on local-only information), which may lead to conflicting solutions. |
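
The kind of conflict mentioned above can be illustrated with a toy example. Everything here is an assumption made for illustration: the load values, the unit names, the "pick the least-loaded target" rule and the functions `distributed_choices` and `centralized_choices` are not part of the project. Two overloaded units each want to migrate one task; deciding independently they both choose the same target, whereas a single decider updates its global view after each assignment.

```python
def distributed_choices(loads, neighbours):
    """Each source unit independently picks its least-loaded neighbour."""
    return {unit: min(candidates, key=lambda n: loads[n])
            for unit, candidates in neighbours.items()}


def centralized_choices(loads, sources, task_load):
    """One decider assigns targets in turn and updates its global view."""
    view = dict(loads)
    plan = {}
    for unit in sources:
        targets = [u for u in view if u not in sources]
        target = min(targets, key=lambda u: view[u])
        plan[unit] = target
        view[target] += task_load  # account for the task just placed
    return plan


loads = {"A": 6, "B": 6, "C": 1, "D": 2}
neighbours = {"A": ["C", "D"], "B": ["C", "D"]}

local_plan = distributed_choices(loads, neighbours)
global_plan = centralized_choices(loads, ["A", "B"], task_load=2)
print(local_plan)   # both sources select the same target
print(global_plan)  # the second source is redirected
```

In this sketch the distributed plan sends both tasks to unit C, while the centralized plan spreads them over C and D. A distributed scheme would need some additional arbitration to resolve such cases.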
| 105 | |
| 106 | The underlying motivation behind the evaluation of both strategies lies in the problem of scalability. A centralized scenario may intrinsically take better decisions since it operates on a global system view. Nevertheless, the support it requires for periodically retrieving the AIM information implies an overhead which grows with the number of processors in the system. The intended explorations will help to better define the best trade-off according to, among other criteria, the number of processors and the desired adaptability. |
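
The scalability argument can be made concrete with a deliberately simplistic model. It assumes, purely for illustration, that every unit sends one monitoring report per period, that the centralized collector receives a report from every other unit, and that a unit in the distributed scheme only receives reports from a fixed number of neighbours. The function name and its parameters are hypothetical, and no measured data is implied.

```python
def reports_per_period(num_processors, neighbours_per_unit):
    """Reports the busiest node receives in one monitoring period (toy model)."""
    return {
        # The single collector hears from every other unit.
        "centralized": num_processors - 1,
        # Each unit hears only from its direct neighbours.
        "distributed": neighbours_per_unit,
    }


for n in (16, 64, 256):
    print(n, reports_per_period(n, neighbours_per_unit=4))
```

Under these assumptions the load on the collector grows linearly with the number of processors, while the per-unit load in the distributed case stays constant. The price, as discussed above, is that decisions are taken on partial information.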
| 107 | |