Changes between Version 14 and Version 15 of projectstructure
Timestamp: Jun 16, 2008, 4:30:23 PM
Software and hardware monitoring for performance, power/voltage/temperature and fault detection are the first-level instrumentation tasks, studied in WP tasks 1.a, 1.b and 1.c. They deliver raw monitored digital information either periodically or continuously (online behaviour). This information is then modelled as classified events in task 1.d. The event model specifies the event format, which is fixed for the architecture. Formatted events are stored in the local memories of the architecture tiles, in fixed-size cyclic buffers designed to be easily accessible. Together, the distributed buffers form the “Distributed Raw Event Tables” (DRET), available at all times within the architecture. The monitoring capabilities are summarized in figure 4.

|| '''Task 1.a''' : Performance measurement, Task manager : '''LETI''' Partners : '''LETI''', '''LIRMM'''. In this task, the performance of the tile is monitored. The difficulty is to meet the minimum-perturbation requirement. We propose to develop two mechanisms.
The first one is SW-oriented and consists of measuring, periodically or on-line, the workloads of the processors and of their communications. The Network Interface of the NoC will provide a generic way to monitor in/out throughput on-line. The second mechanism is HW-oriented and consists of probing selected critical paths. The advantage of this kind of monitoring is that it is non-intrusive, but the difficulty is gaining access to the data paths or the control part. Both HW and SW solutions will be studied and compared in this task.
|| T0 → T0+18||
|| Status : '''not achieved yet'''||
[[BR]]
|| '''Task 1.b''' : PVT management, Task manager : '''LETI''' Partners : '''LETI''', '''LIP6'''. The objective of this task is to monitor physical information such as temperature, voltage and power consumption. This can be obtained by direct measurement, for example with on-site temperature sensors, or by indirect measurement, using SW load evaluation and equivalence tables. Because of parameter dispersion across the chip in nanotechnologies, on-site HW sensors will probably be necessary. Nevertheless, indirect measurement will add another dimension and help the diagnosis phase.
These two techniques will be studied and evaluated in this task. Some of the chosen techniques will also be implemented.
|| T0 → T0+24||
|| Status : '''not achieved yet'''||
[[BR]]
|| '''Task 1.c''' : HW fault detection, Task manager : '''LETI''' Partners : '''All'''. Nanotechnologies make it increasingly difficult to ensure correct behaviour of the tiles and of the interconnects between tiles over the chip lifetime. Fault detection is therefore becoming a mandatory feature of future architectures. The objective of this WP is to evaluate some of the possible HW and SW techniques, such as on-line or periodic testing, BIST, software CRC or software security survey tasks. As the field of research is very vast and the project does not aim at full protection against faults, only a few techniques will be implemented. The objective is to add this dimension to the event table, because of its importance.
|| T0 → T0+24||
|| Status : '''not achieved yet'''||
[[BR]]
|| '''Task 1.d''' : Event shaping and logging, Task manager : '''LETI''' Partners : '''All'''. The final objective of all the tasks in this WP is to obtain a raw event table covering all the measured features of the architecture.
The objective of this task is to determine the format of this table and how it can be accessed (how to write it, and how and when to read it). Both should be kept as simple as possible so that they can be exploited by different diagnosis systems, not only those provided in this project. In that way, these first-level (or preliminary) results can be used and disseminated independently.
|| T0 → T0+24||
|| Status : '''not achieved yet'''||

… …

[[BR]]
|| '''Task 2.a''' : Statistical analysis, Task manager : '''LIRMM''' Partners : '''LIRMM'''. Among the remapping strategies that will be presented later in WP3, some, or a sequence of them applied over time, may have hardly predictable effects on application performance.
In order to keep track of the mid- or long-term consequences of those remapping decisions, a statistical analysis of the DRET and AIM databases will be performed. This information may later help refine the decision-making policy when, for instance, a previous task migration order led to a worse global solution.
|| T0+12 → T0+24||
|| Status : '''not achieved yet'''||
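
The event storage described for WP1 (fixed-format events kept in fixed-size cyclic buffers on each tile) and the kind of statistical pass mentioned in Task 2.a can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the event fields, their widths, the names `CyclicEventBuffer` and `mean_value_by_class`, and the choice of a per-class mean as the statistic. The actual event format is the subject of Task 1.d and is not defined in this page.

```python
import struct
from collections import namedtuple

# Hypothetical fixed event format (not taken from the project):
# 32-bit timestamp, 8-bit event class, 8-bit tile id, 16-bit signed value.
EVENT_FORMAT = "<IBBh"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)  # 8 bytes, no padding with "<"

Event = namedtuple("Event", "timestamp event_class tile_id value")


class CyclicEventBuffer:
    """Fixed-size cyclic buffer of fixed-format events.

    When the buffer is full, the oldest event is overwritten.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._storage = bytearray(capacity * EVENT_SIZE)
        self._next = 0   # slot used by the next write
        self._count = 0  # number of valid events currently stored

    def write(self, event):
        """Pack one event into the next slot, wrapping around at the end."""
        offset = self._next * EVENT_SIZE
        struct.pack_into(EVENT_FORMAT, self._storage, offset, *event)
        self._next = (self._next + 1) % self.capacity
        self._count = min(self._count + 1, self.capacity)

    def read_all(self):
        """Return the stored events from oldest to newest."""
        start = (self._next - self._count) % self.capacity
        events = []
        for i in range(self._count):
            offset = ((start + i) % self.capacity) * EVENT_SIZE
            fields = struct.unpack_from(EVENT_FORMAT, self._storage, offset)
            events.append(Event(*fields))
        return events


def mean_value_by_class(buffers):
    """Average the 'value' field per event class over several tile buffers."""
    totals = {}
    for buf in buffers:
        for ev in buf.read_all():
            total, count = totals.get(ev.event_class, (0, 0))
            totals[ev.event_class] = (total + ev.value, count + 1)
    return {cls: total / count for cls, (total, count) in totals.items()}
```

In this sketch, one `CyclicEventBuffer` would stand for the table of a single tile, and passing a list of such buffers to `mean_value_by_class` stands in for reading the distributed tables as a whole. A fixed record size keeps both the write and the read path to simple offset arithmetic, which matches the stated goal of keeping the table format and access as simple as possible.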