1 | |
---|
2 | regression_test - perform regression test with the information given in |
---|
3 | arguments file |
---|
4 | _________________________________________________________________ |
---|
5 | |
---|
6 | regression_test [-s <script_file_name>] |
---|
7 | |
---|
8 | Invokes two set of execution of vis to compare the performance and to |
---|
9 | check sanity of each execution. Then make a set of tables summarizing |
---|
10 | results, such as CPU sec, memoty usage and fail/pass of model_check, |
---|
11 | etc.. This command will read a description of the regression test to |
---|
12 | be performed from the argument file, which is configuartion file. |
---|
13 | Besides the configuration file, one needs to prepare two template |
---|
14 | files and a design list. We now describe these in turn. As running |
---|
15 | example we shall use a regression test that compares check_invariant |
---|
16 | with and without guided search. |
---|
17 | |
---|
18 | The configuration file for our example is shown below, delimited by |
---|
19 | lines of dashes (which are not part of the file). The order in which |
---|
20 | the lines appear in the file is immaterial. A line beginning with '#' |
---|
21 | is a comment, and is ignored by regression_test. |
---|
22 | ---------------------------------------------------------------------- |
---|
23 | DesignDirectory : /projects/development/hsv/vis/Regression |
---|
24 | DesignList : ./DesignListInv |
---|
25 | ReferenceVis : /usr/local/vis/i686-o/vis-cu |
---|
26 | ReferenceVisNickName : BFS |
---|
27 | ReferenceOutputDir : ./result_inv_bfs |
---|
28 | RefCommandTemplate : ./TemplateRefInv |
---|
29 | bRefRun : FALSE |
---|
30 | NewVis : /usr/local/vis/i686-o/vis-cu |
---|
31 | NewVisNickName : GS |
---|
32 | NewOutputDir : ./result_inv_gs |
---|
33 | bNewRun : FALSE |
---|
34 | NewCommandTemplate : ./TemplateNewInv |
---|
35 | bCompareCounterExample : TRUE |
---|
36 | bDefaultInitCommand : FALSE |
---|
37 | ResultForPerformance : ./BFSvsGS.tex |
---|
38 | CompareItem : cpu : elapse: |
---|
39 | CompareItem : mem : Memory in use: |
---|
40 | CompareItem : bdd : Peak number of live nodes: |
---|
41 | ---------------------------------------------------------------------- |
---|
42 | |
---|
43 | Each line defines a parameter of the regression test. |
---|
44 | |
---|
45 | * DesignDirectory gives the path to the directory where the input |
---|
46 | files are stored. In this case, we are using the central repository |
---|
47 | under ~hsv/vis/Regression. It's typically a good idea to store the |
---|
48 | results of regression tests in a separate place from the input files. |
---|
49 | In our examples, the results are stored in the current directory. |
---|
50 | |
---|
51 | * DesignList gives the name of the file containing the list of test |
---|
52 | cases to run. Each line consists of several white space-separated |
---|
53 | fields describing the files making up the test case. The first field |
---|
54 | is a path to the files relative to DesignDirectory. The second field |
---|
55 | is the test case name. The remaining fields are file names. More on |
---|
56 | this later, when we talk about the design list file in detail. |
---|
57 | |
---|
58 | * ReferenceVis and NewVis give paths to the two versions of vis to be |
---|
59 | compared in the regression test. They can be the same, as in this case |
---|
60 | in which both are the "nightly build." |
---|
61 | |
---|
62 | * ReferenceVisNickName and NewVisNickName are two identifiers that |
---|
63 | are used in the LaTeX tables to refer to the two versions that are |
---|
64 | compared. Short mnemonics are preferable. Avoid using underscores, |
---|
65 | because they currently cause trouble with LaTeX. |
---|
66 | |
---|
67 | * ReferenceOutputDir and NewOutputDir give the names of the |
---|
68 | directories in which the results of the runs will be stored. In each |
---|
69 | of these directories, vis_reg will create a directory for each test |
---|
70 | case. The names of these directories are the test case names taken |
---|
71 | from the design list. Therefore, these names must be unique. |
---|
72 | |
---|
73 | * RefCommandTemplate and NewCommandTemplate are the templates from |
---|
74 | which vis_reg produces the scripts used to run the test cases. The |
---|
75 | mechanism used to derive the scripts from the templates and the design |
---|
76 | list is described in detail when we talk about the template files. The |
---|
77 | two templates can be the same file. |
---|
78 | |
---|
79 | * bRefRun and bNewRun are flags that tell vis_reg which of the two |
---|
80 | sets of experiments should be run (if any). If one set of results has |
---|
81 | been already computed, the corresponding flag should be set to FALSE. |
---|
82 | If both flags are FALSE, vis_reg simply regenerates the summary table. |
---|
83 | |
---|
84 | * bCompareCounterExample is a flag that instructs vis_reg to scan the |
---|
85 | output files for counterexamples and record their lengths. For each |
---|
86 | counterexample, stem and cycle length are found. If this flag is set |
---|
87 | to FALSE, counterexamples are not analyzed and the corresponding table |
---|
88 | is not produced. |
---|
89 | |
---|
90 | * bDefaultInitCommand is a flag that, if set to TRUE, instructs |
---|
91 | vis_reg to generate a default set of commands to initialize each vis |
---|
92 | run. This option allows one to save some time in preparing the |
---|
93 | templates, at the expense of some flexibility. |
---|
94 | |
---|
95 | * ResultForPerformance is the name of the LaTeX file to which vis_reg |
---|
96 | writes the summary of the experiments. Its format is described in |
---|
97 | detail later. |
---|
98 | |
---|
99 | * CompareItem specifies one item to be monitored in the experiments. |
---|
100 | Multiple CompareItem lines may appear in the configuration file. Each |
---|
101 | one consists of two items: an identifier, and a pattern. The |
---|
102 | identifier is used in the table to denote the item, whereas the |
---|
103 | pattern is used to extract from the output files the information to be |
---|
104 | put in the tables. |
---|
105 | |
---|
106 | In our example, we declare |
---|
107 | |
---|
108 | CompareItem : mem : Memory in use: |
---|
109 | |
---|
110 | We instruct regression_test to search for a line beginning with |
---|
111 | "Memory in use:" in each section of the output file. The token |
---|
112 | immediately following the occurrence of the pattern is the data item |
---|
113 | to be put in the table under "mem." so that the resulting LaTeX table |
---|
114 | contains data on the memmory usage. |
---|
115 | |
---|
116 | Since the output file is divided into sections delimited by strings |
---|
117 | "CommandBegin" and "CommandEnd." (See below.), each section is |
---|
118 | searched for the patterns specified by the CompareItem declarations. |
---|
119 | If the pattern is matched, the item appears in the table for that |
---|
120 | section. |
---|
121 | |
---|
122 | To run the regression_test we need configuration file, template file |
---|
123 | and design list. In case of template file, we need a two template |
---|
124 | files, since we want to compare two different run of vis. The syntax |
---|
125 | of the two templates is identical. We describe the template for "new," |
---|
126 | but all we say applies to "ref" as well. The "new" template for our |
---|
127 | example is shown below. |
---|
128 | ---------------------------------------------------------------------- |
---|
129 | |
---|
130 | echo CommandBegin init |
---|
131 | read_model |
---|
132 | flt |
---|
133 | order_vars |
---|
134 | part |
---|
135 | time |
---|
136 | print_bdd_stats |
---|
137 | echo CommandEnd init |
---|
138 | echo CommandBegin ci |
---|
139 | set hd_frontier_approx_threshold 3500 |
---|
140 | ci -t 3600 -g $HintFile $InvFile |
---|
141 | time |
---|
142 | print_bdd_stats |
---|
143 | echo CommandEnd ci |
---|
144 | echo CommandBegin ci_cex |
---|
145 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
146 | time |
---|
147 | print_bdd_stats |
---|
148 | echo CommandEnd ci_cex |
---|
149 | ---------------------------------------------------------------------- |
---|
150 | |
---|
151 | Before proceeding to its discussion, it is instructive to examine the |
---|
152 | script that regression_test produces from it for one test case. |
---|
153 | regression_test relies on vis's shell variables and aliases to |
---|
154 | customize the script. The script consists of a preamble in which |
---|
155 | variables and aliases are defined, followed by one or more sections, |
---|
156 | each corresponding to a major step in the experiment. Statistics will |
---|
157 | be collected for each step separately. Our example has three sections, |
---|
158 | whose names are "init," "ci," and "ci_cex." The names of the sections |
---|
159 | are arbitrary, except that the automatically generated initialization |
---|
160 | section is always called "init." |
---|
161 | ---------------------------------------------------------------------- |
---|
162 | |
---|
163 | set vis_stdout ./result_inv_gs/am2901/am2901.out |
---|
164 | set vis_stderr ./result_inv_gs/am2901/am2901.err |
---|
165 | set BlifFile /projects/development/hsv/vis/Regression/Am2901/am2901.mv |
---|
166 | set OrdFile /projects/development/hsv/vis/Regression/Am2901/am2901.ord |
---|
167 | set InvFile /projects/development/hsv/vis/Regression/Am2901/am2901.inv |
---|
168 | set HintFile |
---|
169 | /projects/development/hsv/vis/Regression/Am2901/am2901.hint |
---|
170 | alias read_model "rlmv $BlifFile" |
---|
171 | alias set_fairness " " |
---|
172 | alias order_vars "so -s input_and_latch $OrdFile" |
---|
173 | echo CommandBegin init |
---|
174 | read_model |
---|
175 | flt |
---|
176 | order_vars |
---|
177 | part |
---|
178 | time |
---|
179 | print_bdd_stats |
---|
180 | echo CommandEnd init |
---|
181 | echo CommandBegin ci |
---|
182 | set hd_frontier_approx_threshold 3500 |
---|
183 | ci -t 3600 -g $HintFile $InvFile |
---|
184 | time |
---|
185 | print_bdd_stats |
---|
186 | echo CommandEnd ci |
---|
187 | echo CommandBegin ci_cex |
---|
188 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
189 | time |
---|
190 | print_bdd_stats |
---|
191 | echo CommandEnd ci_cex |
---|
192 | quit |
---|
193 | ---------------------------------------------------------------------- |
---|
194 | |
---|
195 | The variables and the aliases are defined by vis_reg by examining the |
---|
196 | fields of the test case description in the design file. |
---|
197 | regression_test uses file extensions to identify file types. (E.g., |
---|
198 | .ord indicates an order file in "input_and_latch" format.) A full list |
---|
199 | of extensions is given below. |
---|
200 | |
---|
201 | regression_test defines variables for all the file types actually |
---|
202 | available for the experiment, and only for those. It also defines |
---|
203 | three aliases that provide some flexibility in mixing different |
---|
204 | experiments in the same batch. |
---|
205 | |
---|
206 | For instance, "read_model" is defined as either |
---|
207 | |
---|
208 | read_blif_mv $BlifFile |
---|
209 | |
---|
210 | or |
---|
211 | |
---|
212 | read_blif $BlifFile |
---|
213 | |
---|
214 | depending on whether the design list specifies a .mv file or a .blif |
---|
215 | file as model file. |
---|
216 | |
---|
217 | The definition of "order_vars" takes into account whether an order |
---|
218 | file is available or not. Finally, the definition of "set_fairness" |
---|
219 | depends on whether a .fair file is available. This allows one to mix |
---|
220 | model checking experiments for which no fairness condition is |
---|
221 | specified to experiments that do require fairness conditions. |
---|
222 | |
---|
223 | When bDefaultInitCommand is FALSE, the script is simply the preamble, |
---|
224 | followed by the unchanged template, follows by the "quit" command, |
---|
225 | which is always automatically added. |
---|
226 | |
---|
227 | When bDefaultInitCommand is TRUE, regression_test also adds an |
---|
228 | automatically generated "init" section between the preamble and the |
---|
229 | template. This default section is equivalent to |
---|
230 | |
---|
231 | echo CommandBegin init |
---|
232 | read_model |
---|
233 | flt |
---|
234 | order_vars |
---|
235 | part |
---|
236 | time |
---|
237 | print_bdd_stats |
---|
238 | echo CommandEnd init |
---|
239 | |
---|
240 | Notice that one has no control over dynamic reordering during the |
---|
241 | initialization phase if bDefaultInitCommand is TRUE. |
---|
242 | |
---|
243 | The template should refer to the defined aliases and variables to |
---|
244 | tailor itself to each test case. Notice, however, that one is not |
---|
245 | forced to use "order_vars." If one wants to always use vis's static |
---|
246 | ordering algorithm, (s)he can simply use "so" instead of "order_vars." |
---|
247 | Likewise, one can decide to always ignore fairness constraints by not |
---|
248 | including "set_fairness." |
---|
249 | |
---|
250 | On the other hand, it is usually a bad idea to explicitly include |
---|
251 | commands like |
---|
252 | |
---|
253 | read_blif_mv |
---|
254 | read_blif |
---|
255 | so -s input_and_latch |
---|
256 | read_fairness |
---|
257 | |
---|
258 | in the templates, because they limit the range of test cases to which |
---|
259 | they can be applied. |
---|
260 | |
---|
261 | As we said, the templates, and hence the scripts, are divided in |
---|
262 | sections. Each section begins and ends with an "echo" command. These |
---|
263 | commands write delimiters to the output file, which are used by |
---|
264 | vis_reg to define the scope of the pattern searches. It is important |
---|
265 | that the tag specified in corresponding pairs of CommandBegin and |
---|
266 | CommandEnd be the same. |
---|
267 | |
---|
268 | The calls to "print_bdd_stats" are typically used to add to the output |
---|
269 | files the information requested by the CompareItem declarations. It is |
---|
270 | a good idea to specify timeouts for potentially expensive steps. |
---|
271 | |
---|
272 | The design list file has one line for each test case. A master design |
---|
273 | list is in ~hsv/vis/Regression/DesignList. Specific design lists can |
---|
274 | be obtained by commenting out or deleting lines from this master list. |
---|
275 | One can also create an entirely new list according to the following |
---|
276 | rules. |
---|
277 | |
---|
278 | Lines starting with '#' are interpreted as comments and are skipped, |
---|
279 | as in all files read by vis_reg. |
---|
280 | |
---|
281 | A test case is a named collection of related files that can be used to |
---|
282 | run one or more experiments. As mentioned before, each line consists |
---|
283 | of white space-separated fields. The first is a (relative) path, while |
---|
284 | the second is the collection name. The remaining fields specify the |
---|
285 | files in the collection. |
---|
286 | |
---|
287 | The smallest collection will specify one .mv or one .blif file. This |
---|
288 | file is called the "model file," and should always appear in the third |
---|
289 | field of the line. If the model file is the only file specified, one |
---|
290 | can run reachability analysis, but not much more. Vis_reg understands |
---|
291 | the file extensions shown in the following table. For each extension, |
---|
292 | the variable bound to the file name and a short explanation are given. |
---|
293 | |
---|
294 | .mv $BlifFile model file to be read with rlmv |
---|
295 | .blif $BlifFile model file to be read with rl |
---|
296 | .ord $OrdFIle order file to be read with so -s input_and_latch |
---|
297 | .inv $InvFile invariant file to be read with ci |
---|
298 | .ctl $CtlFile property file to be read with mc |
---|
299 | .fair $FairFile fairness file to be read with rf |
---|
300 | .hint $HintFile hint file to be read with the -g option of mc, ci, and |
---|
301 | rch |
---|
302 | |
---|
303 | Only one file of each type can appear in the collection defining one |
---|
304 | test case. If multiple property files, or different order files are |
---|
305 | available for one model file, one has to create two test cases for |
---|
306 | that model. Different models, on the other hand, can share property |
---|
307 | files, order files, and so on. The master design list provides |
---|
308 | examples for both these situations. |
---|
309 | |
---|
310 | It is not possible to specify both a .mv file and a .blif file for one |
---|
311 | test case. There must be exactly one model file per test case. |
---|
312 | |
---|
313 | As was mentioned before, it is important that the name of a test case |
---|
314 | be unique in the entire batch used for a regression test. On the other |
---|
315 | hand, several test cases may share the same directory, and often do |
---|
316 | when they have some files in common. |
---|
317 | |
---|
318 | The result summary created by regression_test is a LaTeX file |
---|
319 | consisting of several tables. There is one table with performance |
---|
320 | stats for each section of the output file. In addition, there is an |
---|
321 | optional counterexample comparison table that summarizes PASS/FAIL |
---|
322 | information for the properties and the length of counterexamples. For |
---|
323 | each failing property, the length of the stem and the length of the |
---|
324 | cycle (if applicable) are reported. |
---|
325 | |
---|
326 | Each performance table consists of one column for the test case name, |
---|
327 | followed by as many groups of three columns as there are CompareItem |
---|
328 | declarations applicable to the section that the table summarizes. |
---|
329 | |
---|
330 | Each group of three columns gives the data for the reference run, the |
---|
331 | data for the new run, and the improvement. Vis_reg regards smaller as |
---|
332 | better. Hence, the improvement is positive if the CPU time, memory |
---|
333 | occupation, and so on, have decreased from the reference run to the |
---|
334 | new run. The improvement is computed as |
---|
335 | |
---|
336 | (value(ref) - value(new)) / value(ref) * 100. |
---|
337 | |
---|
338 | Notice that this formula tends to make "new" look bad. for instance, |
---|
339 | if new is twice as fast as ref, the improvement is 50%. If ref is |
---|
340 | twice as fast, however, the "improvement" is -100%. |
---|
341 | |
---|
342 | The results of each test case are stored in a separate directory, |
---|
343 | whose name is obtained by concatenating ReferenceOutputDir (or |
---|
344 | NewOutputDir) with the test case name. Each directory stores three |
---|
345 | files, whose names are given by the test case name concatenated with |
---|
346 | .src, .out, and .err. |
---|
347 | |
---|
348 | * test_case_name.src stores the script used to run vis. |
---|
349 | * test_case_name.out stores what vis writes to vis_stdout. |
---|
350 | * test_case_name.err stores what vis writes to vis_stderr. |
---|
351 | |
---|
352 | One has to be careful, because vis_reg will overwrite these results |
---|
353 | directories without warning if run twice with the same |
---|
354 | ReferenceOutputDir or NewOutputDir and bRefRun or bNewRun set to TRUE. |
---|
355 | |
---|
356 | With '-s' option, you can create c-shell script file to be able to run |
---|
357 | experiments in shell. We add this option to solve the problem that are |
---|
358 | happened when your template file contains child-process by system |
---|
359 | call. |
---|
360 | _________________________________________________________________ |
---|
361 | |
---|
362 | Last updated on 20050519 10h16 |
---|