% Scientific/technical objectives of the project.
The objective of the COACH project is to develop a complete framework for HPC
(accelerating existing software applications) and embedded
applications (implementing an application on a low-power standalone
device). The design steps are presented in figure~\ref{coach-flow}.
\begin{figure}[hbtp]\leavevmode\center
  \includegraphics[width=.8\linewidth]{flow}
  \caption{\label{coach-flow} COACH flow}
\end{figure}
\begin{description}
\item[HPC setup:] During this step, the user splits the application into two parts: the host application,
which remains on a PC, and the SoC application, which is mapped onto the FPGA.
The COACH framework provides a SystemC simulation model of the whole system (PC+communication+FPGA-SoC), which allows the performance of the partitioning to be evaluated.
\item[SoC design:] In this phase,
the user obtains simulators for the SoC at different abstraction levels by giving the COACH framework a SoC description.
This description consists of a process network corresponding to the SoC application,
an OS, an instance of a generic hardware platform,
and a mapping of the processes onto the platform components. The supported mappings are
software (the process runs on a SoC processor),
ASIP (the process runs on a SoC processor enhanced with dedicated instructions),
and hardware (the process runs in a coprocessor that is generated by HLS and plugged into the SoC bus).
\item[Application compilation:] Once the SoC description is validated, COACH automatically generates
an FPGA bitstream containing the hardware platform with the SoC application software and
an executable containing the host application. The user can launch the application by
loading the bitstream onto an FPGA and running the executable on the PC.
\end{description}
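The SoC description used in the second step can be pictured as a small data structure.
The following Python sketch is purely illustrative: the class, field and method names
(\texttt{SocDescription}, \texttt{Mapping}, \texttt{check}), the sample process names and the
platform name are assumptions made for this example and do not reflect the actual COACH input format.
It only shows the four ingredients named above (process network, OS, platform instance, mapping)
and a simple consistency check that could precede validation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mapping(Enum):
    """The three mapping kinds named in the text."""
    SOFTWARE = "software"   # process runs on a SoC processor
    ASIP = "asip"           # processor enhanced with dedicated instructions
    HARDWARE = "hardware"   # coprocessor generated by HLS, plugged into the SoC bus


@dataclass
class SocDescription:
    """Hypothetical container; the layout is an assumption, not COACH's format."""
    processes: list[str]
    channels: list[tuple[str, str]]   # (producer, consumer) pairs of the process network
    operating_system: str
    platform: str
    mapping: dict[str, Mapping] = field(default_factory=dict)

    def check(self) -> list[str]:
        """Return a list of consistency problems (empty if none are found)."""
        problems = []
        known = set(self.processes)
        for src, dst in self.channels:
            for end in (src, dst):
                if end not in known:
                    problems.append(f"channel endpoint '{end}' is not a declared process")
        for name in self.processes:
            if name not in self.mapping:
                problems.append(f"process '{name}' has no mapping")
        for name in self.mapping:
            if name not in known:
                problems.append(f"mapping refers to unknown process '{name}'")
        return problems


soc = SocDescription(
    processes=["reader", "filter", "writer"],
    channels=[("reader", "filter"), ("filter", "writer")],
    operating_system="DNA/OS",
    platform="generic-template",   # placeholder platform name
    mapping={"reader": Mapping.SOFTWARE, "filter": Mapping.HARDWARE},
)
problems = soc.check()   # the 'writer' process is deliberately left unmapped
```

In this sketch, moving a process from software to hardware amounts to changing one entry of the mapping,
which is the kind of low-effort exploration the flow aims at.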

% Expected scientific advance. State the originality and the
% ambitious nature of the project.
%FIXME == {NO, this is not a scientific contribution. To be rewritten}
The main scientific contribution of the project is to unify various synthesis techniques
under common input and output formats, allowing the user to switch from one to another
without engineering effort and even to chain them. For instance, it will be possible to run loop transformations before synthesis.
Another advantage of this framework is that it provides different abstraction levels from
a single description.
Finally, this description is independent of the device family, and its hardware implementation
is generated automatically.

% Detail the scientific and technical barriers that the project must overcome.
System design is a very complex task, and this project aims to simplify it
as much as possible. To this end, the following scientific
and technological barriers must be overcome.
\begin{itemize}
\item HLS tools are sensitive to the style in which the algorithm is written.
In addition, they are not integrated into an architecture and system
exploration tool.
Consequently, engineering work is required to switch from one tool to another,
to integrate the resulting simulation model into an architectural exploration tool
and to synthesize the generated RTL description.
%CA Additional preprocessing steps (source-level transformations) are thus
%CA required to improve the process.
%CA In particular, this includes parallelism exposure and efficient memory mapping.
\item Most HLS tools translate a sequential algorithm into a coprocessor
containing a single data-path and finite state machine (FSM). In this way,
only the fine-grained parallelism is exploited (instruction-level parallelism, ILP).
The challenge is to identify the coarse-grained parallelism and to generate,
from a sequential algorithm, a coprocessor containing multiple communicating
tasks (data-paths and FSMs). To this end, one may adapt techniques that
were developed in the 1990s for the construction of distributed programs.
However, in the context of HLS, there are still several original problems
to be solved, mainly concerning the construction of FIFO communication
channels and memory optimization.
\item The COACH design flow has a top-down approach. In such a case,
the required performance of a coprocessor (clock frequency, maximum number of cycles for
a given computation, power consumption, etc.) is imposed by the other system
components. The challenge is to allow the user to control the synthesis
process accurately. For instance, the clock frequency must not be a result of the RTL synthesis
but a strict synthesis constraint.
\item The main problem in HPC is the communication between the PC and the SoC.
This problem has two aspects. The first is run-time efficiency. The second is
its engineering cost, especially if one wants to refine an implementation
at several abstraction levels.

\end{itemize}
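The coarse-grained parallelism mentioned in the second item can be illustrated by splitting a
sequential loop into two tasks connected by a bounded FIFO. The Python sketch below is only a
behavioural model under stated assumptions: the function names, the end-of-stream marker
\texttt{DONE} and the FIFO depth are invented for this example, and Python threads stand in for
hardware tasks without modelling their timing. It shows that the two-task version computes the
same result as the original loop while each task only sees its own FIFO end.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (a convention chosen for this sketch)


def sequential(data):
    """Reference version: a single loop performing both computation stages."""
    return [(x * x) + 1 for x in data]


def stage_square(data, fifo_out):
    """First task: squares each input and pushes it into the FIFO."""
    for x in data:
        fifo_out.put(x * x)      # blocks while the FIFO is full
    fifo_out.put(DONE)


def stage_increment(fifo_in, results):
    """Second task: pops values from the FIFO and adds one."""
    while True:
        item = fifo_in.get()     # blocks while the FIFO is empty
        if item is DONE:
            break
        results.append(item + 1)


def pipelined(data, depth=2):
    """Run both tasks concurrently, linked by a FIFO holding at most `depth` items."""
    fifo = queue.Queue(maxsize=depth)
    results = []
    producer = threading.Thread(target=stage_square, args=(data, fifo))
    consumer = threading.Thread(target=stage_increment, args=(fifo, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

The FIFO depth is the parameter that a real flow would have to size: too small and the producer
stalls, too large and memory is wasted, which is one facet of the FIFO construction and memory
optimization problems listed above.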

% Present the expected results, proposing where possible success and evaluation
% criteria suited to the type of project, so that the results can be assessed at
% the end of the project.
The main result is the framework. Concretely, it is composed of
a communication middleware for HPC,
5 HAS tools (control-dominated HLS, data-dominated HLS, coarse-grained HLS,
memory optimisation HLS and ASIP),
3 architectural templates that are synthesizable and that can be prototyped,
one design space exploration tool,
and 2 operating systems (DNA/OS and MUTEKH).
\\
The functionality of the framework will be demonstrated with the demonstrators
(see task-7 page~\pageref{task-7}) and the tutorial example (see task-8
page~\pageref{subtask-tutorial}).