source: vis_dev/vis-2.3/share/help/regression_testCmd.txt @ 101

Last change on this file since 101 was 14, checked in by cecile, 13 years ago

vis2.3

File size: 15.8 KB
Line 
1
2  regression_test - perform regression test with the information given in
3  arguments file
4     _________________________________________________________________
5
6   regression_test [-s <script_file_name>]
7
8   Invokes two set of execution of vis to compare the performance and to check
9   sanity of each execution. Then make a set of tables summarizing results,
10   such as CPU sec, memoty usage and fail/pass of model_check, etc.. This
11   command will read a description of the regression test to be performed from
12   the argument file, which is configuartion file. Besides the configuration
13   file, one needs to prepare two template files and a design list. We now
14   describe these in turn. As running example we shall use a regression test
15   that compares check_invariant with and without guided search.
16
17   The configuration file for our example is shown below, delimited by lines of
18   dashes (which are not part of the file). The order in which the lines appear
19   in the file is immaterial. A line beginning with '#' is a comment, and is
20   ignored by regression_test.
21   ----------------------------------------------------------------------
22   DesignDirectory : /projects/development/hsv/vis/Regression
23   DesignList : ./DesignListInv
24   ReferenceVis : /usr/local/vis/i686-o/vis-cu
25   ReferenceVisNickName : BFS
26   ReferenceOutputDir : ./result_inv_bfs
27   RefCommandTemplate : ./TemplateRefInv
28   bRefRun : FALSE
29   NewVis : /usr/local/vis/i686-o/vis-cu
30   NewVisNickName : GS
31   NewOutputDir : ./result_inv_gs
32   bNewRun : FALSE
33   NewCommandTemplate : ./TemplateNewInv
34   bCompareCounterExample : TRUE
35   bDefaultInitCommand : FALSE
36   ResultForPerformance : ./BFSvsGS.tex
37   CompareItem : cpu : elapse:
38   CompareItem : mem : Memory in use:
39   CompareItem : bdd : Peak number of live nodes:
40   ----------------------------------------------------------------------
41
42   Each line defines a parameter of the regression test.
43
44   * DesignDirectory gives the path to the directory where the input files are
45   stored.  In  this  case,  we  are  using  the central repository under
46   ~hsv/vis/Regression. It's typically a good idea to store the results of
47   regression tests in a separate place from the input files. In our examples,
48   the results are stored in the current directory.
49
50    * DesignList gives the name of the file containing the list of test cases
51   to  run.  Each  line  consists of several white space-separated fields
52   describing the files making up the test case. The first field is a path to
53   the files relative to DesignDirectory. The second field is the test case
54   name. The remaining fields are file names. More on this later, when we talk
55   about the design list file in detail.
56
57    * ReferenceVis and NewVis give paths to the two versions of vis to be
58   compared in the regression test. They can be the same, as in this case in
59   which both are the "nightly build."
60
61    * ReferenceVisNickName and NewVisNickName are two identifiers that are used
62   in the LaTeX tables to refer to the two versions that are compared. Short
63   mnemonics are preferable. Avoid using underscores, because they currently
64   cause trouble with LaTeX.
65
66    * ReferenceOutputDir and NewOutputDir give the names of the directories in
67   which the results of the runs will be stored. In each of these directories,
68   vis_reg will create a directory for each test case. The names of these
69   directories are the test case names taken from the design list. Therefore,
70   these names must be unique.
71
72    * RefCommandTemplate and NewCommandTemplate are the templates from which
73   vis_reg produces the scripts used to run the test cases. The mechanism used
74   to derive the scripts from the templates and the design list is described in
75   detail when we talk about the template files. The two templates can be the
76   same file.
77
78    * bRefRun and bNewRun are flags that tell vis_reg which of the two sets of
79   experiments should be run (if any). If one set of results has been already
80   computed, the corresponding flag should be set to FALSE. If both flags are
81   FALSE, vis_reg simply regenerates the summary table.
82
83    * bCompareCounterExample is a flag that instructs vis_reg to scan the
84   output  files  for  counterexamples and record their lengths. For each
85   counterexample, stem and cycle length are found. If this flag is set to
86   FALSE, counterexamples are not analyzed and the corresponding table is not
87   produced.
88
89    * bDefaultInitCommand is a flag that, if set to TRUE, instructs vis_reg to
90   generate a default set of commands to initialize each vis run. This option
91   allows one to save some time in preparing the templates, at the expense of
92   some flexibility.
93
94    * ResultForPerformance is the name of the LaTeX file to which vis_reg
95   writes the summary of the experiments. Its format is described in detail
96   later.
97
98    *  CompareItem specifies one item to be monitored in the experiments.
99   Multiple CompareItem lines may appear in the configuration file. Each one
100   consists of two items: an identifier, and a pattern. The identifier is used
101   in the table to denote the item, whereas the pattern is used to extract from
102   the output files the information to be put in the tables.
103
104    In our example, we declare
105
106    CompareItem : mem : Memory in use:
107
108   We instruct regression_test to search for a line beginning with "Memory in
109   use:" in each section of the output file. The token immediately following
110   the occurrence of the pattern is the data item to be put in the table under
111   "mem." so that the resulting LaTeX table contains data on the memmory usage.
112
113    Since  the  output file is divided into sections delimited by strings
114   "CommandBegin" and "CommandEnd." (See below.), each section is searched for
115   the patterns specified by the CompareItem declarations. If the pattern is
116   matched, the item appears in the table for that section.
117
118    To run the regression_test we need configuration file, template file and
119   design list. In case of template file, we need a two template files, since
120   we want to compare two different run of vis. The syntax of the two templates
121   is identical. We describe the template for "new," but all we say applies to
122   "ref" as well. The "new" template for our example is shown below.
123   ----------------------------------------------------------------------
124
125   echo CommandBegin init
126   read_model
127   flt
128   order_vars
129   part
130   time
131   print_bdd_stats
132   echo CommandEnd init
133   echo CommandBegin ci
134   set hd_frontier_approx_threshold 3500
135   ci -t 3600 -g $HintFile $InvFile
136   time
137   print_bdd_stats
138   echo CommandEnd ci
139   echo CommandBegin ci_cex
140   ci -d 1 -i -t 3600 -g $HintFile $InvFile
141   time
142   print_bdd_stats
143   echo CommandEnd ci_cex
144   ----------------------------------------------------------------------
145
146   Before proceeding to its discussion, it is instructive to examine the script
147   that regression_test produces from it for one test case. regression_test
148   relies on vis's shell variables and aliases to customize the script. The
149   script consists of a preamble in which variables and aliases are defined,
150   followed by one or more sections, each corresponding to a major step in the
151   experiment. Statistics will be collected for each step separately. Our
152   example has three sections, whose names are "init," "ci," and "ci_cex." The
153   names of the sections are arbitrary, except that the automatically generated
154   initialization section is always called "init."
155   ----------------------------------------------------------------------
156
157   set vis_stdout ./result_inv_gs/am2901/am2901.out
158   set vis_stderr ./result_inv_gs/am2901/am2901.err
159   set BlifFile /projects/development/hsv/vis/Regression/Am2901/am2901.mv
160   set OrdFile /projects/development/hsv/vis/Regression/Am2901/am2901.ord
161   set InvFile /projects/development/hsv/vis/Regression/Am2901/am2901.inv
162   set HintFile /projects/development/hsv/vis/Regression/Am2901/am2901.hint
163   alias read_model "rlmv $BlifFile"
164   alias set_fairness " "
165   alias order_vars "so -s input_and_latch $OrdFile"
166   echo CommandBegin init
167   read_model
168   flt
169   order_vars
170   part
171   time
172   print_bdd_stats
173   echo CommandEnd init
174   echo CommandBegin ci
175   set hd_frontier_approx_threshold 3500
176   ci -t 3600 -g $HintFile $InvFile
177   time
178   print_bdd_stats
179   echo CommandEnd ci
180   echo CommandBegin ci_cex
181   ci -d 1 -i -t 3600 -g $HintFile $InvFile
182   time
183   print_bdd_stats
184   echo CommandEnd ci_cex
185   quit
186   ----------------------------------------------------------------------
187
188   The variables and the aliases are defined by vis_reg by examining the fields
189   of the test case description in the design file. regression_test uses file
190   extensions to identify file types. (E.g., .ord indicates an order file in
191   "input_and_latch" format.) A full list of extensions is given below.
192
193   regression_test defines variables for all the file types actually available
194   for the experiment, and only for those. It also defines three aliases that
195   provide some flexibility in mixing different experiments in the same batch.
196
197   For instance, "read_model" is defined as either
198
199           read_blif_mv $BlifFile
200
201   or
202
203           read_blif $BlifFile
204
205   depending on whether the design list specifies a .mv file or a .blif file as
206   model file.
207
208   The definition of "order_vars" takes into account whether an order file is
209   available or not. Finally, the definition of "set_fairness" depends on
210   whether a .fair file is available. This allows one to mix model checking
211   experiments for which no fairness condition is specified to experiments that
212   do require fairness conditions.
213
214   When  bDefaultInitCommand is FALSE, the script is simply the preamble,
215   followed by the unchanged template, follows by the "quit" command, which is
216   always automatically added.
217
218   When bDefaultInitCommand is TRUE, regression_test also adds an automatically
219   generated "init" section between the preamble and the template. This default
220   section is equivalent to
221
222   echo CommandBegin init
223   read_model
224   flt
225   order_vars
226   part
227   time
228   print_bdd_stats
229   echo CommandEnd init
230
231   Notice  that  one  has  no  control over dynamic reordering during the
232   initialization phase if bDefaultInitCommand is TRUE.
233
234   The template should refer to the defined aliases and variables to tailor
235   itself to each test case. Notice, however, that one is not forced to use
236   "order_vars." If one wants to always use vis's static ordering algorithm,
237   (s)he can simply use "so" instead of "order_vars." Likewise, one can decide
238   to always ignore fairness constraints by not including "set_fairness."
239
240   On the other hand, it is usually a bad idea to explicitly include commands
241   like
242
243   read_blif_mv
244   read_blif
245   so -s input_and_latch
246   read_fairness
247
248   in the templates, because they limit the range of test cases to which they
249   can be applied.
250
251   As we said, the templates, and hence the scripts, are divided in sections.
252   Each section begins and ends with an "echo" command. These commands write
253   delimiters to the output file, which are used by vis_reg to define the scope
254   of  the  pattern  searches.  It is important that the tag specified in
255   corresponding pairs of CommandBegin and CommandEnd be the same.
256
257   The calls to "print_bdd_stats" are typically used to add to the output files
258   the information requested by the CompareItem declarations. It is a good idea
259   to specify timeouts for potentially expensive steps.
260
261   The design list file has one line for each test case. A master design list
262   is in ~hsv/vis/Regression/DesignList. Specific design lists can be obtained
263   by commenting out or deleting lines from this master list. One can also
264   create an entirely new list according to the following rules.
265
266   Lines starting with '#' are interpreted as comments and are skipped, as in
267   all files read by vis_reg.
268
269   A test case is a named collection of related files that can be used to run
270   one or more experiments. As mentioned before, each line consists of white
271   space-separated fields. The first is a (relative) path, while the second is
272   the  collection  name.  The  remaining fields specify the files in the
273   collection.
274
275   The smallest collection will specify one .mv or one .blif file. This file is
276   called the "model file," and should always appear in the third field of the
277   line. If the model file is the only file specified, one can run reachability
278   analysis, but not much more. Vis_reg understands the file extensions shown
279   in the following table. For each extension, the variable bound to the file
280   name and a short explanation are given.
281
282   .mv $BlifFile model file to be read with rlmv
283   .blif $BlifFile model file to be read with rl
284   .ord $OrdFIle order file to be read with so -s input_and_latch
285   .inv $InvFile invariant file to be read with ci
286   .ctl $CtlFile property file to be read with mc
287   .fair $FairFile fairness file to be read with rf
288   .hint $HintFile hint file to be read with the -g option of mc, ci, and rch
289
290   Only one file of each type can appear in the collection defining one test
291   case. If multiple property files, or different order files are available for
292   one model file, one has to create two test cases for that model. Different
293   models, on the other hand, can share property files, order files, and so on.
294   The master design list provides examples for both these situations.
295
296   It is not possible to specify both a .mv file and a .blif file for one test
297   case. There must be exactly one model file per test case.
298
299   As was mentioned before, it is important that the name of a test case be
300   unique in the entire batch used for a regression test. On the other hand,
301   several test cases may share the same directory, and often do when they have
302   some files in common.
303
304   The result summary created by regression_test is a LaTeX file consisting of
305   several tables. There is one table with performance stats for each section
306   of  the  output file. In addition, there is an optional counterexample
307   comparison table that summarizes PASS/FAIL information for the properties
308   and the length of counterexamples. For each failing property, the length of
309   the stem and the length of the cycle (if applicable) are reported.
310
311   Each  performance table consists of one column for the test case name,
312   followed  by  as many groups of three columns as there are CompareItem
313   declarations applicable to the section that the table summarizes.
314
315   Each group of three columns gives the data for the reference run, the data
316   for the new run, and the improvement. Vis_reg regards smaller as better.
317   Hence, the improvement is positive if the CPU time, memory occupation, and
318   so on, have decreased from the reference run to the new run. The improvement
319   is computed as
320
321   (value(ref) - value(new)) / value(ref) * 100.
322
323   Notice that this formula tends to make "new" look bad. for instance, if new
324   is twice as fast as ref, the improvement is 50%. If ref is twice as fast,
325   however, the "improvement" is -100%.
326
327   The results of each test case are stored in a separate directory, whose name
328   is obtained by concatenating ReferenceOutputDir (or NewOutputDir) with the
329   test case name. Each directory stores three files, whose names are given by
330   the test case name concatenated with .src, .out, and .err.
331
332   * test_case_name.src stores the script used to run vis.
333   * test_case_name.out stores what vis writes to vis_stdout.
334   * test_case_name.err stores what vis writes to vis_stderr.
335
336   One  has  to  be careful, because vis_reg will overwrite these results
337   directories without warning if run twice with the same ReferenceOutputDir or
338   NewOutputDir and bRefRun or bNewRun set to TRUE.
339
340   With '-s' option, you can create c-shell script file to be able to run
341   experiments in shell. We add this option to solve the problem that are
342   happened when your template file contains child-process by system call.
343     _________________________________________________________________
344
345   Last updated on 20100410 00h02
Note: See TracBrowser for help on using the repository browser.