1 | \section{Project context} |
---|
2 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
% 1. CONTEXT AND POSITIONING OF THE PROJECT
% (1 page maximum) General presentation of the problem that the project proposes
% to address and of the working framework (fundamental research, industrial
% research or experimental development).
---|
7 | \end{verbatim} |
---|
8 | \end{scriptsize} |
---|
High Performance Computing (HPC) aims at accelerating applications.
This topic has been investigated since the 1980s using Application Specific Integrated Circuits (ASIC),
Digital Signal Processors (DSP) and parallel computing on multiprocessor machines or networks.
More recently, since the end of the 1990s, other technologies have appeared, such as
Very Long Instruction Word (VLIW) processors, Application Specific Instruction-set Processors (ASIP),
Systems on Chip (SoC) and Multi-Processor SoCs (MPSoC).
---|
15 | \\ |
---|
Over the last decades, HPC was reserved to major industrial companies targeting high-volume markets,
because of the design and fabrication costs.
Nowadays, Field Programmable Gate Arrays (FPGA), such as the Virtex-5 from Xilinx and the Stratix IV from Altera,
can implement a SoC with multiple processors and several coprocessors for less than 10K euros per device.
In addition, High Level Synthesis (HLS) is becoming more mature; it automates design
and drastically decreases its cost in terms of manpower. Thus, FPGA and HLS together tend to open
HPC to small companies targeting low-volume markets.
---|
23 | \par |
---|
To obtain a good acceleration ratio, the designer has to take the application characteristics into account
when choosing one of the above HPC technologies. This choice is not easy, and in most cases the designer
has to try different technologies in order to retain the best suited one.
The objective of the COACH project is to provide a framework to accelerate applications.
Typically, the targeted application is an existing one running on a PC. COACH helps the designer either
to migrate it to an embedded FPGA, or to accelerate it by migrating its critical parts to an FPGA
plugged on the PC bus.
The COACH framework allows the designer to explore various software/hardware partitions of the
target application, to run timing and functional simulations, and to generate automatically both
the software and the synthesizable description of the hardware.
---|
34 | The main topics of the project are: |
---|
35 | \begin{itemize} |
---|
36 | \item |
---|
PC/FPGA communication: COACH provides tools and communication schemes, together with their
implementation, that help the user to split the application into two parts (one running on the PC
and the other running on the FPGA) and to evaluate the efficiency of the split.
\item
Design space exploration: it consists in analysing the application running on the FPGA, and in
defining the target technology (SoC, MPSoC, ASIP, ...) and the hardware/software partitioning of
tasks, which depends on the technology choice. This exploration is driven mainly by throughput and
latency criteria. Power consumption can also be considered in the case of embedded systems.
\item
Micro-architectural exploration: when hardware components are required, the HLS tools of the framework
generate them automatically. At this stage, the framework provides various HLS tools allowing the
exploration of the micro-architectural design space. The exploration criteria are again throughput,
latency and power consumption.
\item
Performance measurement: for each point of the design space, metrics are available
such as throughput, latency, power consumption, area, memory allocation and data locality.
They are evaluated using simulation, estimation or analysis methods.
\item
Targeted hardware technology: COACH is independent of the FPGA family. Every point of the design
space can be implemented on any FPGA having the required resources. COACH supports both the
Altera and Xilinx FPGA families.
---|
58 | \end{itemize} |
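\par
As an illustration of the design space exploration topic above, the following minimal sketch
enumerates every software/hardware mapping of a small task set and keeps the lowest-latency
mapping that fits an area budget. It is only a toy model written in Python: the task names,
cycle counts, area figures and the function \texttt{explore} are hypothetical and are not part
of COACH.
\begin{scriptsize}\begin{verbatim}
from itertools import product

# Hypothetical task set: (name, software cycles, hardware cycles, hardware area).
TASKS = [
    ("filter", 900, 100, 40),
    ("fft", 1200, 150, 70),
    ("pack", 300, 250, 30),
]

def explore(tasks, area_budget):
    """Return (latency, area, mapping) of the fastest mapping within the budget.

    Tasks are assumed to run one after the other, so the latency of a
    mapping is the sum of the cycle counts of the chosen implementations.
    """
    best = None
    for choice in product(("sw", "hw"), repeat=len(tasks)):
        latency = sum(t[1] if c == "sw" else t[2] for t, c in zip(tasks, choice))
        area = sum(t[3] for t, c in zip(tasks, choice) if c == "hw")
        if area > area_budget:
            continue  # this point of the design space does not fit
        if best is None or latency < best[0]:
            mapping = {t[0]: c for t, c in zip(tasks, choice)}
            best = (latency, area, mapping)
    return best
\end{verbatim}
\end{scriptsize}
With these made-up figures and an area budget of 110 units, the sketch selects hardware for
\texttt{filter} and \texttt{fft} and keeps \texttt{pack} in software, for a latency of 550 cycles.
\par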
---|
COACH results from the will of several laboratories to unify their know-how and skills in the
following domains: operating systems and hardware communication (TIMA, SITI), SoC and MPSoC (LIP6 and TIMA),
ASIP (IRISA) and HLS (LIP6, Lab-STIC and LIP). The project objective is to integrate these
domains into a single free framework (licence ...) that hides, as much as possible, these domains
and their different tools from the user.
---|
64 | |
---|
65 | |
---|
66 | \subsection{Economical context and interest} |
---|
67 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
% 1.1. ECONOMIC AND SOCIETAL CONTEXT AND ISSUES
% (2 pages maximum)
% Describe the economic, social and regulatory context of the project,
% presenting an analysis of the social, economic, environmental and industrial
% issues. If possible, give quantified arguments, for example the relevance and
% scope of the project with respect to economic demand (market analysis, trend
% analysis), analysis of the competition, cost-reduction indicators, market
% prospects (fields of application, ...). Indicators of environmental gains, life
% cycle.
---|
77 | \end{verbatim} |
---|
78 | \end{scriptsize} |
---|
Microelectronics makes it possible to integrate complex functions into products, to increase their
commercial attractiveness and to improve their competitiveness. The multimedia and communication
sectors have taken advantage of microelectronics thanks to the development of
design methodologies and tools for real-time embedded systems. Many other sectors could
benefit from microelectronics if these methodologies and tools were adapted to their specific needs.
The Non-Recurring Engineering (NRE) costs involved in designing and manufacturing an ASIC are
very high: an IC factory costs several billion euros, and fabricating a specific circuit costs
several million. Consequently, it is generally infeasible to design and fabricate ASICs in
low volumes, and ICs must be designed to cover a broad spectrum of applications. This is achieved
by MPSoCs (Multi-Processor Systems on Chip) with several application-dedicated coprocessors.
---|
89 | \\ |
---|
Today, FPGAs are becoming important actors in the computational domain that was originally dominated
by microprocessors and ASICs. Just like microprocessors, FPGA-based systems can be reprogrammed
on a per-application basis. At the same time, FPGAs offer significant performance benefits over
microprocessor implementations for a number of applications. Although these benefits are still
generally an order of magnitude lower than those of equivalent ASIC implementations, the low cost
(500 euros to 10K euros), fast time to market and flexibility of FPGAs make them an attractive
choice for low-to-medium volume applications.
Since their introduction in the mid-eighties, FPGAs have evolved from a simple,
low-capacity gate array technology to devices (Altera Stratix III, Xilinx Virtex-5) that
provide a mix of coarse-grained data path units, memory blocks, microprocessor cores,
on-chip A/D conversion, and gate counts in the millions. This high logic capacity makes it possible
to implement complex systems such as multi-processor platforms with application-dedicated coprocessors.
Using FPGAs limits the NRE costs to the design cost. This boosts the development of methodologies
and tools that automate design and reduce its cost.
---|
104 | \par |
---|
Nowadays, neither commercial nor free tools cover the whole design process.
For instance, with SOPC Builder from Altera, users can select and parameterize IP components
from an extensive drop-down list of communication, digital signal processor (DSP), microprocessor
and bus interface cores, as well as incorporate their own IP. Designers can then generate
a synthesized netlist, a simulation test bench and a custom software library that reflect the
hardware configuration.
Nevertheless, SOPC Builder does not provide any facility to synthesize coprocessors or to
evaluate the platform at a high design level.
In addition, SOPC Builder is a closed world, since it is impossible to migrate a SOPC Builder
based design to other tools or device families.
PICO [CITATION] and CATAPULT [CITATION] synthesize coprocessors from a C++ description.
Nevertheless, they can only deal with data-dominated applications and they do not handle the
platform level.
The Xilinx System Generator for DSP [http://www.xilinx.com/tools/sysgen.htm] is a plug-in to
Simulink that enables designers to develop high-performance DSP systems for Xilinx FPGAs.
Designers can design and simulate a system using MATLAB and Simulink. The tool then
automatically generates synthesizable Hardware Description Language (HDL) code mapped to Xilinx
pre-optimized algorithms.
However, this tool targets only DSP-based algorithms.
---|
124 | \\ |
---|
Consequently, a designer developing an HPC application needs to master, for example,
the communication between the FPGA device and the PC,
SoCLib for design exploration,
SOPC Builder at the platform level,
PICO for synthesizing the data-dominated coprocessors
and Quartus for design implementation.
This requires a significant tool-interfacing effort and makes the design process very complex
and achievable only by designers skilled in various domains.
The COACH project integrates all these tools in the same framework and hides them from the user.
The objective is to allow \textbf{pure software} developers to realize HPC or embedded applications.
---|
135 | \par |
---|
The combination of a framework dedicated to software developers and of FPGA targets can open
new markets to small and even very small companies, by avoiding the huge hardware investment
required by ASIC-based solutions.

Among such markets are HPC (High Performance Computing) and embedded applications.
HPC consists in proposing accelerating solutions for standard software applications at acceptable
prices, for example DNA sequence recognition or DBMS acceleration.
An embedded application consists in implementing an application on a low-power standalone device,
for example distributed intelligent sensors.
---|
148 | \\ |
---|
This new market may boom as micro-computing did in the eighties. That success was due
to the low cost of the first micro-computers (compared to mainframes) and to the advent of high-level
programming languages, which allowed a large number of programmers to launch start-ups in software
engineering.
---|
153 | |
---|
154 | |
---|
155 | \subsection{Project position} |
---|
156 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
% 1.2. POSITIONING OF THE PROJECT
% (2 pages maximum)
% Specify:
% - the positioning of the project with respect to the context developed above:
%   with respect to competing, complementary or earlier projects and research,
%   patents and standards.
% - the positioning of the project with respect to the thematic axes of the call for proposals.
% - the positioning of the project at the European and international levels.
---|
165 | \end{verbatim} |
---|
166 | \end{scriptsize} |
---|
167 | The aim of this project is to propose an open-source framework for architecture synthesis |
---|
168 | targeting mainly field programmable gate array circuits (FPGA). |
---|
169 | To evaluate the different architectures, the project uses the prototyping platform |
---|
170 | of the SoCLIB ANR project (2006-2009). |
---|
171 | \begin{Large}\begin{verbatim} |
---|
-- COULD EACH OF YOU PLEASE ADD (IF POSSIBLE) ONE LINE
-- REFERRING TO AN ANR OR EUROPEAN PROJECT
  * LAB-STIC
  * LIP
  * IRISA
  * CITI
  * TIMA

-- European or ANR projects reused or continued
---|
181 | \end{verbatim} |
---|
182 | \end{Large} |
---|
For High Level Synthesis (HLS), the project builds on know-how acquired over 15 years
with the GAUT project developed in the Lab-STIC laboratory and the UGH project developed in the LIP6
and TIMA laboratories.
For architecture synthesis, the project builds on know-how acquired over 10 years
with the COSY European project (1998-2000) and the DISYDENT project developed at LIP6.
---|
188 | \begin{Large}\begin{verbatim} |
---|
-- TO BE COMPLETED (SHORT)
  * For polyhedral transformation and memory optimization ... LIP
  * For ASIP IRISA
  * For ... CITI
  * For ... TIMA

-- Skills
---|
196 | \end{verbatim} |
---|
197 | \end{Large} |
---|
198 | \par |
---|
The SoCLIB ANR platform was developed by 11 laboratories and 6 companies. It makes it possible to
describe hardware architectures with a shared memory space and to deploy software
applications on them in order to evaluate their performance.
The heart of this platform is a library containing simulation models (in SystemC)
of hardware IP cores such as processors, buses, networks, memories and I/O controllers.
The platform also provides embedded operating systems and software/hardware
communication components useful to implement applications quickly.
However, the synthesizable descriptions of the IPs have to be provided by the users.
---|
207 | \par |
---|
This project enhances SoCLib by providing synthesizable VHDL for standard IPs.
In addition, HLS tools such as UGH and GAUT automatically produce a synthesizable
description of an IP (coprocessor) from a sequential algorithm.
---|
211 | \begin{Large}\begin{verbatim} |
---|
-- TO BE COMPLETED (SHORT)
  * ASIP tool such as ... IRISA
  * ...

-- Explain why your contribution to COACH is of interest at the European level.
---|
217 | \end{verbatim} |
---|
218 | \end{Large} |
---|
219 | \par |
---|
The different points proposed in this project cover the priorities defined by the commission
experts in the field of Information Society Technologies (IST) for embedded
systems: ``Concepts, methods and tools for designing systems dealing with systems complexity
and allowing to apply efficiently applications and various products on embedded platforms,
considering resources constraints (delays, power, memory, etc.), security and quality
services''.
---|
226 | \\ |
---|
227 | Our team aims at covering all the steps of the design flow of architecture synthesis. |
---|
228 | Our project overcomes the complexity of using various synthesis tools and description |
---|
229 | languages required today to design architectures. |
---|
230 | |
---|
231 | \section{Scientific and Technical Description} |
---|
232 | \subsection{State of the art} |
---|
233 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
% 2. SCIENTIFIC AND TECHNICAL DESCRIPTION
% 2.1. STATE OF THE ART
% (3 pages maximum)
% Describe the scientific context and issues of the project by presenting
% a national and international state of the art summarizing current
% knowledge on the subject. Highlight any preliminary results.
% Include the necessary bibliographic references in annex 7.1.
---|
241 | \end{verbatim} |
---|
242 | \end{scriptsize} |
---|
243 | Our project covers several critical domains in system design in order |
---|
244 | to achieve high performance computing. Starting from a high level description we aim |
---|
245 | at generating automatically both hardware and software components of the system. |
---|
246 | \subsubsection{High Performance Computing} |
---|
Accelerating high-performance computing (HPC) applications with field-programmable
gate arrays (FPGAs) can potentially improve performance.
However, using FPGAs presents significant challenges [1].
First, the operating frequency of an FPGA is low compared to that of a high-end microprocessor.
Second, as a consequence of Amdahl's law, HPC/FPGA application performance is unusually sensitive
to the implementation quality [2].
Finally, high-performance computing programmers are a highly sophisticated but scarce
resource. Such programmers are expected to readily use new technology but lack the time
to learn a completely new skill such as logic design [3].
\\
HPC/FPGA hardware is only now emerging and is in its early commercial stages,
but the associated design techniques have not yet caught up.
Thus, much effort is required to develop design tools that translate high-level
language programs into FPGA configurations.
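\par
To make the second challenge above concrete, Amdahl's law can be written as follows, where $p$
is the fraction of the original execution time that is moved to the FPGA and $s$ is the speed-up
obtained on that fraction (both symbols are introduced here for illustration only):
\[
  S(p,s) = \frac{1}{(1-p) + \frac{p}{s}}
\]
For example, with $p = 0.9$, a coprocessor reaching $s = 10$ gives $S \approx 5.3$, whereas
$s = 20$ gives $S \approx 6.9$, and the overall speed-up can never exceed $1/(1-p) = 10$.
With $s = 20$, increasing $p$ from $0.9$ to $0.95$ raises $S$ from about $6.9$ to about $10.3$,
which suggests how strongly the overall result depends on the quality of the partitioning and of
the implementation.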
---|
261 | |
---|
262 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
[1] M.B. Gokhale et al., Promises and Pitfalls of Reconfigurable
    Supercomputing, Proc. 2006 Conf. Eng. of Reconfigurable
    Systems and Algorithms, CSREA Press, 2006, pp. 11-20;
    http://nis-www.lanl.gov/~maya/papers/ersa06_gokhale_paper.pdf
[2] D. Buell, Programming Reconfigurable Computers: Language
    Lessons Learned, keynote address, Reconfigurable Systems
    Summer Institute 2006, 12 July 2006;
    http://gladiator.ncsa.uiuc.edu/PDFs/rssi06/presentations/00_Duncan_Buell.pdf
[3] T. Van Court et al., Achieving High Performance
    with FPGA-Based Computing, Computer, vol. 40, no. 3,
    pp. 50-57, Mar. 2007, doi:10.1109/MC.2007.79
---|
275 | \end{verbatim} |
---|
276 | \end{scriptsize} |
---|
277 | |
---|
278 | \subsubsection{System Synthesis} |
---|
279 | Today, several solutions for system design are proposed and commercialized. The most common are |
---|
280 | those provided by Altera and Xilinx to promote their FPGA devices. |
---|
281 | \\ |
---|
282 | The Xilinx System Generator for DSP [http://www.xilinx.com/tools/sysgen.htm] is a plug-in to |
---|
283 | Simulink that enables designers to develop high-performance DSP systems for Xilinx FPGAs. |
---|
284 | Designers can design and simulate a system using MATLAB and Simulink. The tool will then |
---|
285 | automatically generate synthesizable Hardware Description Language (HDL) code mapped to Xilinx |
---|
286 | pre-optimized algorithms. |
---|
However, this tool targets only DSP-based algorithms and Xilinx FPGAs, and cannot handle a complete
SoC. Thus, it is not really a system synthesis tool.
\\
In contrast, SOPC Builder [CITATION] makes it possible to describe a system, to synthesize it,
to program it into a target FPGA and to upload a software application.
Nevertheless, SOPC Builder does not provide any facility to synthesize coprocessors.
Users have to provide the synthesizable description with a suitable bus interface.
\\
In addition, Xilinx System Generator and SOPC Builder are closed worlds, since each one imposes
its own IPs, which are not interchangeable.
We can conclude that the existing commercial or free tools do not cover the whole system
synthesis process in a fully automatic way. Moreover, they are bound to a particular device family
and to a particular IP library.
---|
300 | |
---|
301 | \subsubsection{High Level Synthesis} |
---|
High Level Synthesis translates a sequential algorithmic description and a set of constraints
(area, power, frequency, ...) into a micro-architecture at the Register Transfer Level (RTL).
Several academic and commercial tools are available today.
The most common ones are SPARK [HLS1], GAUT [HLS2] and UGH [HLS3] in the academic world,
and catapultC [HLS4], PICO [HLS5] and Cynthesizer [HLS6] in the commercial world.
Despite their maturity, their use is limited by the following:
---|
308 | \begin{itemize} |
---|
\item They do not accurately respect the frequency constraint when they target an FPGA device.
Their error is about 10 percent. This is a problem when the generated component is integrated
in a SoC, since it will slow down the whole system.
\item These tools take into account only one or a few constraints simultaneously, while realistic
designs are multi-constrained.
Moreover, a low power consumption constraint is mandatory for embedded systems,
yet it is not well handled by common synthesis tools.
\item The parallelism is extracted from the initial algorithm. To get more parallelism or to reduce
the amount of required memory, the user must rewrite it, although techniques such as polyhedral
transformations exist to increase the intrinsic parallelism.
\item Although they share the same input language (C/C++), they are sensitive to the style in
which the algorithm is written. Consequently, engineering work is required to switch from
one tool to another.
\item The HLS tools are not integrated into an architecture and system exploration tool.
Thus, a designer who needs to accelerate a software part of the system must adapt it manually
to the HLS input dialect and perform engineering work to exploit the synthesis result
at the system level.
---|
326 | \end{itemize} |
---|
Given these limitations, it is necessary to create a new generation of tools that reduces the gap
between the specification of a heterogeneous system and its hardware implementation.
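\par
The third limitation can be illustrated by a loop interchange, one of the simplest
transformations covered by the polyhedral model. The sketch below is written in Python purely as
executable pseudo-code (HLS tools take C/C++ as input), and the function names are hypothetical.
In the original loop nest, the outer loop carries a dependence between rows, so only the inner
loop is parallel; after interchange, each column becomes an independent outer iteration, which
exposes coarse-grained parallelism without changing the result.
\begin{scriptsize}\begin{verbatim}
def prefix_rows_original(a):
    """Original nest: the outer loop over rows carries the dependence."""
    n, m = len(a), len(a[0])
    out = [row[:] for row in a]
    for i in range(1, n):          # sequential: row i needs row i-1
        for j in range(m):         # parallel within one row only
            out[i][j] = out[i - 1][j] + a[i][j]
    return out

def prefix_rows_interchanged(a):
    """Interchanged nest: each column j is an independent task."""
    n, m = len(a), len(a[0])
    out = [row[:] for row in a]
    for j in range(m):             # iterations over j are independent
        for i in range(1, n):      # sequential inside one column
            out[i][j] = out[i - 1][j] + a[i][j]
    return out
\end{verbatim}
\end{scriptsize}
Both functions compute the same column-wise running sums; only the order of the iterations
differs, which is exactly the degree of freedom such transformations exploit.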
---|
329 | |
---|
330 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
[HLS1] SPARK, University of California, San Diego
[HLS2] GAUT, UBS/Lab-STIC
[HLS3] UGH
[HLS4] catapultC, Mentor
[HLS5] PICO, Synfora
[HLS6] Cynthesizer, Forte Design Systems
---|
337 | \end{verbatim} |
---|
338 | \end{scriptsize} |
---|
339 | |
---|
340 | \subsubsection{Application Specific Instruction Processors} |
---|
341 | \begin{Large}\begin{verbatim} |
---|
-- TO BE COMPLETED BY IRISA: state of the art
---|
343 | \end{verbatim} |
---|
344 | \end{Large} |
---|
345 | |
---|
346 | \subsubsection{Automatic Parallelization} |
---|
347 | \begin{Large}\begin{verbatim} |
---|
-- TO BE COMPLETED BY LIP: state of the art
---|
349 | \end{verbatim} |
---|
350 | \end{Large} |
---|
351 | |
---|
352 | \subsubsection{Interfaces} |
---|
353 | \begin{Large}\begin{verbatim} |
---|
-- TO BE COMPLETED BY INSA: state of the art
---|
355 | \end{verbatim} |
---|
356 | \end{Large} |
---|
357 | % |
---|
358 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
---|
359 | \subsection{Objectives and innovation aspects} |
---|
360 | \hspace{2cm}\begin{scriptsize}\begin{verbatim} |
---|
% 2.2. OBJECTIVES AND AMBITIOUS/INNOVATIVE NATURE OF THE PROJECT
% (2 pages maximum)
% Describe the scientific/technical objectives of the project.
% Present the expected scientific advance. State the originality and the
% ambitious nature of the project.
% Detail the scientific and technical barriers to be overcome by carrying out the project.
% Where applicable, describe the final product(s) developed at the end of the project
% that show its innovative nature.
% Present the expected results, proposing if possible success and evaluation
% criteria suited to the type of project, so that the results can be assessed at
% the end of the project.
% Where applicable (programmes requiring multidisciplinarity), show how the
% scientific disciplines fit together.
---|
374 | \end{verbatim} |
---|
375 | \end{scriptsize} |
---|
376 | |
---|
% scientific/technical objectives of the project.
The objective of the COACH project is to develop a complete framework for
HPC (accelerating solutions for existing software applications)
and for embedded applications (implementing an application on a low-power standalone device).
The design steps are presented in Figure~\ref{coach-flow}.
---|
382 | \begin{figure}[hbtp]\leavevmode\center |
---|
383 | \includegraphics[width=.8\linewidth]{anr-2010} |
---|
384 | \caption{\label{coach-flow} COACH flow.} |
---|
385 | \end{figure} |
---|
386 | \begin{description} |
---|
\item[HPC setup] Here the user splits the application into two parts: the host application,
which remains on the PC, and the SoC application, which migrates to the SoC.
The framework provides a simulation model for evaluating this partitioning.
\item[SoC design] In this phase,
the user obtains simulators of the SoC at different abstraction levels by giving
a SoC description to the COACH framework.
This description consists of a process network corresponding to the SoC application,
an OS, an instance of a generic hardware platform
and a mapping of the processes onto the platform components. The supported mappings are
software (the process runs on a SoC processor),
XXXpeci (the process runs on a SoC processor enhanced with dedicated instructions),
and hardware (the process runs in a coprocessor generated by HLS and plugged on the SoC bus).
\item[Application compilation] Once the SoC description is validated, COACH automatically generates
an FPGA bitstream containing the hardware platform with the SoC application software, and
an executable containing the host application. The user can launch the application by
loading the bitstream on the FPGA and running the executable on the PC.
---|
403 | \end{description} |
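\par
For illustration only, a SoC description of the kind outlined above could be captured by a data
structure such as the following. This is a hypothetical sketch in Python, not the COACH input
format: the class, field and component names are assumptions, and the mapping kind called
\texttt{asip} stands for the processor enhanced with dedicated instructions.
\begin{scriptsize}\begin{verbatim}
from dataclasses import dataclass, field

MAPPING_KINDS = ("software", "asip", "hardware")  # hypothetical labels

@dataclass
class SocDescription:
    processes: list                      # nodes of the process network
    channels: list                       # (producer, consumer) pairs
    os: str                              # operating system name
    platform: dict                       # component name -> component kind
    mapping: dict = field(default_factory=dict)  # process -> (kind, component)

    def check(self):
        """Return a list of error messages; an empty list means consistent."""
        errors = []
        for src, dst in self.channels:
            for p in (src, dst):
                if p not in self.processes:
                    errors.append(f"channel uses unknown process {p}")
        for p in self.processes:
            if p not in self.mapping:
                errors.append(f"process {p} is not mapped")
                continue
            kind, component = self.mapping[p]
            if kind not in MAPPING_KINDS:
                errors.append(f"process {p} has unknown mapping kind {kind}")
            if component not in self.platform:
                errors.append(f"process {p} is mapped on unknown component {component}")
        return errors
\end{verbatim}
\end{scriptsize}
A consistency check of this kind is one plausible first step before simulators are generated
from the description; the actual validation performed by the framework is not specified here.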
---|
404 | |
---|
% expected scientific advance. State the originality and the
% ambitious nature of the project.
The main scientific contribution of the project is to unify various synthesis techniques
(same input and output formats), allowing the user to switch from one to another without
engineering effort and even to chain them, for example to run a polyhedral transformation
before synthesis.
Another advantage of this framework is that it provides different abstraction levels from
a single description.
Finally, this description is independent of the device family and its hardware implementation
is generated automatically.
---|
415 | |
---|
% Detail the scientific and technical barriers to be overcome by carrying out the project.
---|
417 | System design is a very complicated task and in this project we try to simplify it |
---|
418 | as much as possible. For this purpose we have to deal with the following scientific |
---|
419 | and technological barriers. |
---|
420 | \begin{itemize} |
---|
\item The main problem in HPC is the communication between the PC and the SoC.
This problem has two aspects. The first is efficiency. The second is
eliminating the engineering effort needed to implement it at different abstraction levels.
\item The COACH design flow follows a top-down approach. In such a case,
the required performance of a coprocessor (clock frequency, maximum number of cycles for
a given computation, power consumption, etc.) is imposed by the other system
components. The challenge is to allow the user to control the synthesis
process accurately. For instance, the clock frequency must not be a result of the RTL synthesis
but a strict synthesis constraint.
\item HLS tools are sensitive to the style in which the algorithm is written.
In addition, they are not integrated into an architecture and system
exploration tool.
Consequently, engineering work is required to switch from one tool to another,
to integrate the resulting simulation model into an architectural exploration tool
and to synthesize the generated RTL description.
\item Most HLS tools translate a sequential algorithm into a coprocessor
containing a single data-path and finite state machine (FSM). In this way,
only fine-grained parallelism (instruction-level parallelism, ILP) is exploited.
The challenge is to identify the coarse-grained parallelism and to generate,
from a sequential algorithm, a coprocessor containing multiple communicating
tasks (data-paths and FSMs).
---|
442 | \end{itemize} |
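\par
The last barrier can be illustrated as follows. The sketch below, written in Python as executable
pseudo-code with hypothetical function names, shows a sequential algorithm and the same
computation restructured as two tasks that communicate through a FIFO. Here the two tasks are
simply run one after the other to check that the results match; in hardware, each task would
become its own data-path and FSM and both would run concurrently.
\begin{scriptsize}\begin{verbatim}
from collections import deque

def sequential(samples):
    """Reference algorithm: transform each sample, then accumulate."""
    total = 0
    result = []
    for x in samples:
        y = 3 * x + 1          # stage 1: per-sample computation
        total += y             # stage 2: running sum
        result.append(total)
    return result

def producer(samples, fifo):
    """Task 1: computes stage 1 and pushes its results into the FIFO."""
    for x in samples:
        fifo.append(3 * x + 1)

def consumer(fifo):
    """Task 2: pops values from the FIFO and computes stage 2."""
    total = 0
    result = []
    while fifo:
        total += fifo.popleft()
        result.append(total)
    return result

def two_tasks(samples):
    fifo = deque()             # models the hardware FIFO between the tasks
    producer(samples, fifo)
    return consumer(fifo)
\end{verbatim}
\end{scriptsize}
The split is possible because stage 2 only depends on stage 1 through the stream of values,
which is the kind of coarse-grained structure that a single data-path and FSM cannot exploit.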
---|
443 | |
---|
% Present the expected results, proposing if possible success and evaluation
% criteria suited to the type of project, so that the results can be assessed at
% the end of the project.
The main result is the framework. Concretely, it is composed of:
\begin{itemize}
\item 2 HPC communication schemes with their implementation,
\item 5 HLS tools (control-dominated HLS, data-dominated HLS, coarse-grained HLS,
memory optimisation HLS and XXXpeci),
\item a generic platform with a SystemC CABA model and synthesizable RTL descriptions,
\item a design space exploration tool configured for this platform,
\item one operating system (OS).
\end{itemize}
The functionality of the framework will be demonstrated on both HPC and embedded SoC
application examples.
\\
For the HPC application, we provide the following simulation levels:
the original application,
the split application (host/SoC), and the split application with the SoC
application as a process network.
\\
For both HPC and embedded SoC, we provide the following simulation levels:
process network simulation,
CABA simulation of the application with all the processes in software on
the SoC processor,
and CABA simulation with a task running in specific hardware for each HLS tool.
\\
Finally, the previously simulated descriptions are synthesized and the application
is run. This is done twice: once for Altera and once for Xilinx FPGAs.
---|
471 | |
---|
472 | %% \section{} |
---|
473 | %% %3. PROGRAMME SCIENTIFIQUE ET TECHNIQUE, ORGANISATION DU PROJET |
---|
474 | %% \subsection{} |
---|
475 | %% %3.1. PROGRAMME SCIENTIFIQUE ET STRUCTURATION DU PROJET |
---|
476 | %% %(2 pages maximum) |
---|
477 | %% %Présentez le programme scientifique et justifiez la décomposition en tâches du |
---|
478 | %% %programme de travail en cohérence avec les objectifs poursuivis. |
---|
479 | %% %Utilisez un diagramme pour présenter les liens entre les différentes tâches |
---|
480 | %% %(organigramme technique) |
---|
481 | %% %Les tâches représentent les grandes phases du projet. Elles sont en nombre limité. |
---|
482 | %% %N'oubliez pas les activités et actions correspondant à la dissémination et à la |
---|
483 | %% %valorisation. |
---|
484 | %% |
---|
485 | %% %METTRE UNE FIGURE ICI DECRIVANT LES TACHES ET LEURS INTERACTION (AVEC LE FLOT |
---|
486 | %% %EN FILIGRANE ? ) |
---|
487 | %% \subsection{} |
---|
488 | %% %3.2. MANAGEMENT DU PROJET |
---|
489 | %% %(2 pages maximum) |
---|
490 | %% %Préciser les aspects organisationnels du projet et les modalités de coordination |
---|
491 | %% %(si possible individualisation d'une tâche coordination : cf. tâche 0 du document |
---|
492 | %% %de soumission A). |
---|
493 | %% \subsection{} |
---|
494 | %% %3.3. DESCRIPTION DES TRAVAUX PAR TACHE |
---|
495 | %% %(idéalement 1 ou 2 pages par tâche) |
---|
496 | %% %Pour chaque tâche, décrire : |
---|
497 | %% %- les objectifs de la tâche et éventuels indicateurs de succès, |
---|
498 | %% %- le responsable de la tâche et les partenaires impliqués (possibilité de |
---|
499 | %% %l'indiquer sous forme graphique), |
---|
500 | %% %- le programme détaillé des travaux par tâche, |
---|
501 | %% %- les livrables de la tâche, |
---|
502 | %% %- les contributions des partenaires (le " qui fait quoi "), |
---|
503 | %% %- la description des méthodes et des choix techniques et de la manière dont |
---|
504 | %% %les solutions seront apportées, |
---|
505 | %% %- les risques de la tâche et les solutions de repli envisagées. |
---|
506 | |
---|
507 | |
---|
508 | |
---|
509 | |
---|
510 | |
---|
511 | |
---|