Changeset 30
- Timestamp: Jan 12, 2010, 3:27:48 PM
- Location: anr
- Files: 3 edited
anr/section-2.2.tex
r25 → r30:

   compilers\cite{FIXME:IRISA} since 2002).
   %%% EXPERTISE DANS DES DOMAINES: FIXME:LIP
 - \mustbecompleted{For polyedric transformations and memory optimization, SYNTOL, BEE, ... LIP (CA ou PF)}
 + %%%\mustbecompleted{For polyedric transformations and memory optimization, SYNTOL, BEE, ... LIP (CA ou PF)}
 +
 + Compsys was founded in 2002 by several senior researchers with experience in
 + high-performance computing and automatic parallelization. They have been
 + among the initiators of the polyhedral model, a theory that serves to
 + unify many parallelism detection and exploitation techniques for regular
 + programs. The techniques developed by Compsys for parallelism detection,
 + scheduling, process construction and memory management are expected to be
 + very useful as a first step for a high-level synthesis tool.
 +
   \par
   %%% DESCRIPTION DES PROJETS ANR UTILISES: SOCLIB OK
anr/section-2.tex
r25 → r30:

   HLS tools of the framework generate them automatically. At this stage the
   framework provides various HLS tools allowing the micro-architectural space
 - design exploration. The exploration criteria are alsothroughput, latency
 + design exploration. The exploration criteria are also throughput, latency
   and power consumption.
 - %FIXME:CA
 - %FIXME:CA At this stage, preliminary source-level transformations will be
 - %FIXME:CA required to improve the efficiency of the target component.
 - %FIXME:CA COACH will also provide such facilities, such as automatic parallelization
 - %FIXME:CA and memory optimisation.
 + At this stage, preliminary source-level transformations will be
 + required to improve the efficiency of the target component.
 + For instance, one may transform a loop nest to expose parallelism,
 + or shrink an array to promote it to a register or reduce the memory footprint.
 +
   \item
   Performance measurement: For each point of design space exploration,
anr/section-3.1.tex
r12 → r30:

   % Paul je ne suis pas sur que ce soit vraiment un etat de l'art
   % Christophe, ce que tu m'avais envoye se trouve dans obsolete/body.tex
 - \mustbecompleted{
 - Hardware is inherently parallel. On the other hand, high level languages,
 - like C or Fortran, are abstractions of the processors of the 1970s, and
 - hence are sequential. One of the aims of an HLS tool is therefore to
 - extract hidden parallelism from the source program, and to infer enough
 - hardaware operators for its efficient exploitation.
 - \\
 - Present day HLS tools search for parallelism in linear pieces of code
 - acting only on scalars -- the so-called basic blocs. On the other hand,
 - it is well known that most programs, especially in the fields of signal
 - processing and image processing, spend most of their time executing loops
 - acting on arrays. Efficient use of the large amount of hardware available
 - in the next generation of FPGA chips necessitates parallelism far beyond
 - what can be extracted from basic blocs only.
 - \\
 - The Compsys team of LIP has built an automatic parallelizer, Syntol, which
 - handle restricted C programs -- the well known polyhedral model --,
 - computes dependences and build a symbolic schedule. The schedule is
 - a specification for a parallel program. The parallelism itself can be
 - expressed in several ways: as a system of threads, or as data-parallel
 - operations, or as a pipeline. In the context of the COACH project, one
 - of the task will be to decide which form of parallelism is best suited
 - to hardware, and how to convey the results of Syntol to the actual
 - synthesis tools. One of the advantages of this approach is that the
 - resulting degree of parallelism can be easilly controlled, e.g. by
 - adjusting the number of threads, as a mean of exploring the
 - area / performance tradeoff of the resulting design.
 - \\
 - Another point is that potentially parallel programs necessarily involve
 - arrays: two operations which write to the same location must be executed
 - in sequence. In synthesis, arrays translate to memory. However, in FPGAs,
 - the amount of on-chip memory is limited, and access to an external memory
 - has a high time penalty. Hence the importance of reducing the size of
 - temporary arrays to the minimum necessary to support the requested degree
 - of parallelism. Compsys has developped a stand-alone tool, Bee, based
 - on research by A. Darte, F. Baray and C. Alias, which can be extended
 - into a memory optimizer for COACH.
 - }
 + %\mustbecompleted{
 + %Hardware is inherently parallel. On the other hand, high level languages,
 + %like C or Fortran, are abstractions of the processors of the 1970s, and
 + %hence are sequential. One of the aims of an HLS tool is therefore to
 + %extract hidden parallelism from the source program, and to infer enough
 + %hardware operators for its efficient exploitation.
 + %\\
 + %Present day HLS tools search for parallelism in linear pieces of code
 + %acting only on scalars -- the so-called basic blocs. On the other hand,
 + %it is well known that most programs, especially in the fields of signal
 + %processing and image processing, spend most of their time executing loops
 + %acting on arrays. Efficient use of the large amount of hardware available
 + %in the next generation of FPGA chips necessitates parallelism far beyond
 + %what can be extracted from basic blocs only.
 + \\
 + %The Compsys team of LIP has built an automatic parallelizer, Syntol, which
 + %handle restricted C programs -- the well known polyhedral model --,
 + %computes dependences and build a symbolic schedule. The schedule is
 + %a specification for a parallel program. The parallelism itself can be
 + %expressed in several ways: as a system of threads, or as data-parallel
 + %operations, or as a pipeline. In the context of the COACH project, one
 + %of the task will be to decide which form of parallelism is best suited
 + %to hardware, and how to convey the results of Syntol to the actual
 + %synthesis tools. One of the advantages of this approach is that the
 + %resulting degree of parallelism can be easilly controlled, e.g. by
 + %adjusting the number of threads, as a mean of exploring the
 + %area / performance tradeoff of the resulting design.
 + \\
 + %Another point is that potentially parallel programs necessarily involve
 + %arrays: two operations which write to the same location must be executed
 + %in sequence. In synthesis, arrays translate to memory. However, in FPGAs,
 + %the amount of on-chip memory is limited, and access to an external memory
 + %has a high time penalty. Hence the importance of reducing the size of
 + %temporary arrays to the minimum necessary to support the requested degree
 + %of parallelism. Compsys has developped a stand-alone tool, Bee, based
 + %on research by A. Darte, F. Baray and C. Alias, which can be extended
 + %into a memory optimizer for COACH.
 + %}
 +
 + The problem of compiling sequential programs for parallel computers
 + has been studied since the advent of the first parallel architectures
 + in the 1970s. The basic approach consists in applying program transformations
 + which exhibit or increase the potential parallelism, while guaranteeing
 + the preservation of the program semantics. Most of these transformations
 + just reorder the operations of the program; some of them modify its
 + data structures. Dependences (exact or conservative) are checked to
 + guarantee the legality of the transformation.
 +
 + This has led to the invention of many loop transformations (loop fusion,
 + loop splitting, loop skewing, loop interchange, loop unrolling, ...)
 + which interact in a complicated way. More recently, it has been noticed
 + that all of these are just changes of basis in the iteration domain of
 + the program. This has led to the invention of the polyhedral model, in
 + which the combination of two transformations is simply a matrix product.
 +
 + As a side effect, it has been observed that the polytope model is a useful
 + tool for many other optimizations, like memory reduction and locality
 + improvement. Another point is that the polyhedral model \emph{stricto sensu}
 + applies only to very regular programs. Its extension to more general
 + programs is an active research subject.

   \subsubsection{Interfaces}