% The scientific/technical objectives of the project.
The objective of the COACH project is to develop a complete design framework for
both HPC (accelerating existing software applications) and embedded
applications (implementing an application on a low-power standalone
device). The design steps are presented in figure~\ref{coach-flow}.
\begin{figure}[hbtp]\centering
  \includegraphics[width=.8\linewidth]{flow}
  \caption{\label{coach-flow} COACH design flow}
\end{figure}
\begin{description}
\item[HPC setup:] During this step, the user splits the application into two parts: the host application,
which remains on a PC, and the SoC application, which is mapped onto the FPGA.
COACH will automatically translate high-level language programs into FPGA configurations.
In addition, it will provide a SystemC simulation model of the whole system (PC + communication + FPGA-SoC),
which will allow the performance of the partitioning to be evaluated.
\item[SoC design:] In this phase, the user gives the COACH framework a SoC description and
obtains simulators for the SoC at different abstraction levels.
This description will consist of a process network corresponding to the application,
an OS, an instance of a generic hardware platform,
and a mapping of the processes onto the platform components. COACH will offer three targets for mapping a process:
software (the process runs on a SoC processor),
ASIP (the process runs on a SoC processor enhanced with dedicated instructions),
and hardware (the process runs in a coprocessor that is generated by HLS and plugged into the SoC bus).
\item[Application compilation:] Once the SoC description has been validated through performance analysis, COACH will automatically generate
an FPGA bitstream containing the hardware platform together with the SoC application software, and
an executable containing the host application. The user will be able to launch the application by
loading the bitstream onto an FPGA and running the executable on the PC.
\end{description}

% The expected scientific advance. State the originality and the
% ambitious nature of the project.
%FIXME == {NO, this is not a scientific contribution. To be rewritten.}

%The main scientific contribution of the project is to unify various synthesis techniques
%(same input and output formats) allowing the user to swap without engineering effort
%from one to another and even to chain them. For instance, it will be possible to run loop transformations before synthesis.
%Another advantage of this framework is to provide different abstraction levels from
%a single description.
%Finally, this description is device family independent and its hardware implementation
%is automatically generated.

% Detail the scientific and technical barriers to be lifted by carrying out the project.
System design is a very complex task, and this project aims to simplify it
as much as possible. To this end, the following scientific and technological barriers
have to be addressed.

\begin{description}
\item[Design Space Exploration:]
    The COACH environment will make it easy to map an application described using a process
    network Model of Computation (MoC) onto a shared-memory MPSoC architecture. COACH will
    support design space exploration by letting the system designer select and
    parameterize the target architecture and define the best hardware/software
    partitioning of the application.

\item[High-Level Synthesis:]
    COACH will automatically generate hardware accelerators, when required,
    by using High-Level Synthesis (HLS) tools.
    HLS will thus be fully integrated into a complete system-level design environment.
    Moreover, COACH will support both data-dominated and control-dominated applications.
    To avoid re-engineering by the designer, the HLS tools of COACH will support a common
    language and coding style.
    COACH will also provide a tool that automatically explores the micro-architectural
    design space of a coprocessor.

\item[High-level code transformation:]
    COACH will optimize memory usage and enhance parallelism through
    loop transformations and parallelization. The challenge is to identify
    coarse-grained parallelism and to generate,
    from a sequential algorithm, an application made of multiple communicating
    tasks. To this end, one may adapt techniques that were developed in the 1990s for
    the construction of distributed programs. However, in the context of HLS, there are
    still several original problems to be solved, mainly concerning the construction of
    FIFO communication channels and memory optimization.
    Additional source-level preprocessing transformations are thus
    required to improve the synthesis process,
    in particular parallelism exposure and efficient memory mapping.
    COACH will support code transformation by providing a source-to-source (C2C) tool.

\item[Platform based design:]
    COACH will define architectural templates that can be customized by adding
    dedicated coprocessors and ASIPs and by fixing template parameters such as
    the number of embedded processors, the number and sizes of embedded memory banks,
    or the embedded operating system.
    However, the specification of the application will be independent of both the
    architectural template and the target FPGA device.

\item[Hardware/Software communication middleware:]
    COACH will implement a homogeneous HW/SW communication infrastructure and
    communication APIs (Application Programming Interfaces) that will be used for
    communication between software tasks running on embedded processors and
    dedicated hardware coprocessors. This will make it possible to explore the design space by
    mapping the tasks of an application (described as a process network) onto a
    shared-memory MPSoC architecture.

\item[Processor customization:]
    The COACH project will address ASIP design. COACH will allow system designers to explore
    the various levels of interaction between the original CPU micro-architecture and its
    extension. It will also make it possible to retarget the instruction-selection pass of the compiler. Finally,
    COACH will integrate ASIP design into a complete system-level design framework.

\item[High-Performance Computing:] The main problem in HPC is the communication
between the PC and the SoC. This problem has two aspects. The first is run-time
efficiency. The second is engineering cost, especially if one wants to refine an
implementation at several abstraction levels.
COACH will

%\item The COACH design flow has a top-down approach. In such a case,
%the required performance of a coprocessor (clock frequency, maximum cycles for
%a given computation, power consumption, etc) are imposed by the other system
%components. The challenge is to allow the user to control accurately the synthesis
%process. For instance, the clock frequency must not be a result of the RTL synthesis
%but a strict synthesis constraint.

\end{description}
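
To make the coarse-grained parallelization barrier listed above more concrete, the
following minimal sketch shows a sequential loop and a possible split of the same
computation into two tasks that communicate through a bounded FIFO. It is only an
illustration under stated assumptions: it is written in Python for brevity rather
than in the C input that a C2C tool would process, the names \texttt{sequential},
\texttt{producer}, \texttt{consumer}, \texttt{pipelined} and \texttt{DEPTH} are
hypothetical, and the round-robin loop merely stands in for tasks that would run
concurrently on the SoC. It does not describe the actual COACH tools.

```python
from collections import deque

DEPTH = 4  # hypothetical FIFO capacity, chosen only for this sketch


def sequential(samples):
    """Original form: one loop performs both stages on each sample."""
    total = 0
    for x in samples:
        y = 3 * x + 1   # stage 1: per-sample computation
        total += y      # stage 2: accumulation
    return total


def producer(samples, fifo):
    """Task 1: runs stage 1 and pushes each result into the FIFO."""
    for x in samples:
        while len(fifo) >= DEPTH:
            yield       # channel full: stall until the consumer drains it
        fifo.append(3 * x + 1)
        yield


def consumer(count, fifo, result):
    """Task 2: pops `count` values from the FIFO and accumulates them."""
    total = 0
    for _ in range(count):
        while not fifo:
            yield       # channel empty: stall until the producer writes
        total += fifo.popleft()
        yield
    result.append(total)


def pipelined(samples):
    """Runs both tasks round-robin, standing in for concurrent execution."""
    fifo, result = deque(), []
    tasks = [producer(samples, fifo), consumer(len(samples), fifo, result)]
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)
    return result[0]
```

Under these assumptions, \texttt{pipelined} returns the same value as
\texttt{sequential} for any list of integers. The difficulty named in the text lies
in deriving such a split, and a suitable FIFO depth, automatically from the
sequential code.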

% Present the expected results, proposing where possible success and evaluation
% criteria suited to the type of project, so that the results can be assessed at
% the end of the project.
The main result is the framework itself. Concretely, it is composed of:
a communication middleware for HPC,
5 HAS tools (control-dominated HLS, data-dominated HLS, coarse-grained HLS,
memory optimisation HLS and ASIP),
3 architectural templates that are synthesizable and can be prototyped,
one design space exploration tool,
and 2 operating systems (DNA/OS and MUTEKH).
\\
The functionality of the framework will be demonstrated with the demonstrators
(see task-7 page~\pageref{task-7}) and the tutorial example (see task-8
page~\pageref{subtask-tutorial}).