[14] | 1 | |
---|
| 2 | regression_test - perform regression test with the information given in |
---|
| 3 | arguments file |
---|
| 4 | _________________________________________________________________ |
---|
| 5 | |
---|
| 6 | regression_test [-s <script_file_name>] |
---|
| 7 | |
---|
| 8 | Invokes two set of execution of vis to compare the performance and to check |
---|
| 9 | sanity of each execution. Then make a set of tables summarizing results, |
---|
| 10 | such as CPU sec, memoty usage and fail/pass of model_check, etc.. This |
---|
| 11 | command will read a description of the regression test to be performed from |
---|
| 12 | the argument file, which is configuartion file. Besides the configuration |
---|
| 13 | file, one needs to prepare two template files and a design list. We now |
---|
| 14 | describe these in turn. As running example we shall use a regression test |
---|
| 15 | that compares check_invariant with and without guided search. |
---|
| 16 | |
---|
| 17 | The configuration file for our example is shown below, delimited by lines of |
---|
| 18 | dashes (which are not part of the file). The order in which the lines appear |
---|
| 19 | in the file is immaterial. A line beginning with '#' is a comment, and is |
---|
| 20 | ignored by regression_test. |
---|
| 21 | ---------------------------------------------------------------------- |
---|
| 22 | DesignDirectory : /projects/development/hsv/vis/Regression |
---|
| 23 | DesignList : ./DesignListInv |
---|
| 24 | ReferenceVis : /usr/local/vis/i686-o/vis-cu |
---|
| 25 | ReferenceVisNickName : BFS |
---|
| 26 | ReferenceOutputDir : ./result_inv_bfs |
---|
| 27 | RefCommandTemplate : ./TemplateRefInv |
---|
| 28 | bRefRun : FALSE |
---|
| 29 | NewVis : /usr/local/vis/i686-o/vis-cu |
---|
| 30 | NewVisNickName : GS |
---|
| 31 | NewOutputDir : ./result_inv_gs |
---|
| 32 | bNewRun : FALSE |
---|
| 33 | NewCommandTemplate : ./TemplateNewInv |
---|
| 34 | bCompareCounterExample : TRUE |
---|
| 35 | bDefaultInitCommand : FALSE |
---|
| 36 | ResultForPerformance : ./BFSvsGS.tex |
---|
| 37 | CompareItem : cpu : elapse: |
---|
| 38 | CompareItem : mem : Memory in use: |
---|
| 39 | CompareItem : bdd : Peak number of live nodes: |
---|
| 40 | ---------------------------------------------------------------------- |
---|
| 41 | |
---|
| 42 | Each line defines a parameter of the regression test. |
---|
| 43 | |
---|
| 44 | * DesignDirectory gives the path to the directory where the input files are |
---|
| 45 | stored. In this case, we are using the central repository under |
---|
| 46 | ~hsv/vis/Regression. It's typically a good idea to store the results of |
---|
| 47 | regression tests in a separate place from the input files. In our examples, |
---|
| 48 | the results are stored in the current directory. |
---|
| 49 | |
---|
| 50 | * DesignList gives the name of the file containing the list of test cases |
---|
| 51 | to run. Each line consists of several white space-separated fields |
---|
| 52 | describing the files making up the test case. The first field is a path to |
---|
| 53 | the files relative to DesignDirectory. The second field is the test case |
---|
| 54 | name. The remaining fields are file names. More on this later, when we talk |
---|
| 55 | about the design list file in detail. |
---|
| 56 | |
---|
| 57 | * ReferenceVis and NewVis give paths to the two versions of vis to be |
---|
| 58 | compared in the regression test. They can be the same, as in this case in |
---|
| 59 | which both are the "nightly build." |
---|
| 60 | |
---|
| 61 | * ReferenceVisNickName and NewVisNickName are two identifiers that are used |
---|
| 62 | in the LaTeX tables to refer to the two versions that are compared. Short |
---|
| 63 | mnemonics are preferable. Avoid using underscores, because they currently |
---|
| 64 | cause trouble with LaTeX. |
---|
| 65 | |
---|
| 66 | * ReferenceOutputDir and NewOutputDir give the names of the directories in |
---|
| 67 | which the results of the runs will be stored. In each of these directories, |
---|
| 68 | vis_reg will create a directory for each test case. The names of these |
---|
| 69 | directories are the test case names taken from the design list. Therefore, |
---|
| 70 | these names must be unique. |
---|
| 71 | |
---|
| 72 | * RefCommandTemplate and NewCommandTemplate are the templates from which |
---|
| 73 | vis_reg produces the scripts used to run the test cases. The mechanism used |
---|
| 74 | to derive the scripts from the templates and the design list is described in |
---|
| 75 | detail when we talk about the template files. The two templates can be the |
---|
| 76 | same file. |
---|
| 77 | |
---|
| 78 | * bRefRun and bNewRun are flags that tell vis_reg which of the two sets of |
---|
| 79 | experiments should be run (if any). If one set of results has been already |
---|
| 80 | computed, the corresponding flag should be set to FALSE. If both flags are |
---|
| 81 | FALSE, vis_reg simply regenerates the summary table. |
---|
| 82 | |
---|
| 83 | * bCompareCounterExample is a flag that instructs vis_reg to scan the |
---|
| 84 | output files for counterexamples and record their lengths. For each |
---|
| 85 | counterexample, stem and cycle length are found. If this flag is set to |
---|
| 86 | FALSE, counterexamples are not analyzed and the corresponding table is not |
---|
| 87 | produced. |
---|
| 88 | |
---|
| 89 | * bDefaultInitCommand is a flag that, if set to TRUE, instructs vis_reg to |
---|
| 90 | generate a default set of commands to initialize each vis run. This option |
---|
| 91 | allows one to save some time in preparing the templates, at the expense of |
---|
| 92 | some flexibility. |
---|
| 93 | |
---|
| 94 | * ResultForPerformance is the name of the LaTeX file to which vis_reg |
---|
| 95 | writes the summary of the experiments. Its format is described in detail |
---|
| 96 | later. |
---|
| 97 | |
---|
| 98 | * CompareItem specifies one item to be monitored in the experiments. |
---|
| 99 | Multiple CompareItem lines may appear in the configuration file. Each one |
---|
| 100 | consists of two items: an identifier, and a pattern. The identifier is used |
---|
| 101 | in the table to denote the item, whereas the pattern is used to extract from |
---|
| 102 | the output files the information to be put in the tables. |
---|
| 103 | |
---|
| 104 | In our example, we declare |
---|
| 105 | |
---|
| 106 | CompareItem : mem : Memory in use: |
---|
| 107 | |
---|
| 108 | We instruct regression_test to search for a line beginning with "Memory in |
---|
| 109 | use:" in each section of the output file. The token immediately following |
---|
| 110 | the occurrence of the pattern is the data item to be put in the table under |
---|
| 111 | "mem." so that the resulting LaTeX table contains data on the memmory usage. |
---|
| 112 | |
---|
| 113 | Since the output file is divided into sections delimited by strings |
---|
| 114 | "CommandBegin" and "CommandEnd." (See below.), each section is searched for |
---|
| 115 | the patterns specified by the CompareItem declarations. If the pattern is |
---|
| 116 | matched, the item appears in the table for that section. |
---|
| 117 | |
---|
| 118 | To run the regression_test we need configuration file, template file and |
---|
| 119 | design list. In case of template file, we need a two template files, since |
---|
| 120 | we want to compare two different run of vis. The syntax of the two templates |
---|
| 121 | is identical. We describe the template for "new," but all we say applies to |
---|
| 122 | "ref" as well. The "new" template for our example is shown below. |
---|
| 123 | ---------------------------------------------------------------------- |
---|
| 124 | |
---|
| 125 | echo CommandBegin init |
---|
| 126 | read_model |
---|
| 127 | flt |
---|
| 128 | order_vars |
---|
| 129 | part |
---|
| 130 | time |
---|
| 131 | print_bdd_stats |
---|
| 132 | echo CommandEnd init |
---|
| 133 | echo CommandBegin ci |
---|
| 134 | set hd_frontier_approx_threshold 3500 |
---|
| 135 | ci -t 3600 -g $HintFile $InvFile |
---|
| 136 | time |
---|
| 137 | print_bdd_stats |
---|
| 138 | echo CommandEnd ci |
---|
| 139 | echo CommandBegin ci_cex |
---|
| 140 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
| 141 | time |
---|
| 142 | print_bdd_stats |
---|
| 143 | echo CommandEnd ci_cex |
---|
| 144 | ---------------------------------------------------------------------- |
---|
| 145 | |
---|
| 146 | Before proceeding to its discussion, it is instructive to examine the script |
---|
| 147 | that regression_test produces from it for one test case. regression_test |
---|
| 148 | relies on vis's shell variables and aliases to customize the script. The |
---|
| 149 | script consists of a preamble in which variables and aliases are defined, |
---|
| 150 | followed by one or more sections, each corresponding to a major step in the |
---|
| 151 | experiment. Statistics will be collected for each step separately. Our |
---|
| 152 | example has three sections, whose names are "init," "ci," and "ci_cex." The |
---|
| 153 | names of the sections are arbitrary, except that the automatically generated |
---|
| 154 | initialization section is always called "init." |
---|
| 155 | ---------------------------------------------------------------------- |
---|
| 156 | |
---|
| 157 | set vis_stdout ./result_inv_gs/am2901/am2901.out |
---|
| 158 | set vis_stderr ./result_inv_gs/am2901/am2901.err |
---|
| 159 | set BlifFile /projects/development/hsv/vis/Regression/Am2901/am2901.mv |
---|
| 160 | set OrdFile /projects/development/hsv/vis/Regression/Am2901/am2901.ord |
---|
| 161 | set InvFile /projects/development/hsv/vis/Regression/Am2901/am2901.inv |
---|
| 162 | set HintFile /projects/development/hsv/vis/Regression/Am2901/am2901.hint |
---|
| 163 | alias read_model "rlmv $BlifFile" |
---|
| 164 | alias set_fairness " " |
---|
| 165 | alias order_vars "so -s input_and_latch $OrdFile" |
---|
| 166 | echo CommandBegin init |
---|
| 167 | read_model |
---|
| 168 | flt |
---|
| 169 | order_vars |
---|
| 170 | part |
---|
| 171 | time |
---|
| 172 | print_bdd_stats |
---|
| 173 | echo CommandEnd init |
---|
| 174 | echo CommandBegin ci |
---|
| 175 | set hd_frontier_approx_threshold 3500 |
---|
| 176 | ci -t 3600 -g $HintFile $InvFile |
---|
| 177 | time |
---|
| 178 | print_bdd_stats |
---|
| 179 | echo CommandEnd ci |
---|
| 180 | echo CommandBegin ci_cex |
---|
| 181 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
| 182 | time |
---|
| 183 | print_bdd_stats |
---|
| 184 | echo CommandEnd ci_cex |
---|
| 185 | quit |
---|
| 186 | ---------------------------------------------------------------------- |
---|
| 187 | |
---|
| 188 | The variables and the aliases are defined by vis_reg by examining the fields |
---|
| 189 | of the test case description in the design file. regression_test uses file |
---|
| 190 | extensions to identify file types. (E.g., .ord indicates an order file in |
---|
| 191 | "input_and_latch" format.) A full list of extensions is given below. |
---|
| 192 | |
---|
| 193 | regression_test defines variables for all the file types actually available |
---|
| 194 | for the experiment, and only for those. It also defines three aliases that |
---|
| 195 | provide some flexibility in mixing different experiments in the same batch. |
---|
| 196 | |
---|
| 197 | For instance, "read_model" is defined as either |
---|
| 198 | |
---|
| 199 | read_blif_mv $BlifFile |
---|
| 200 | |
---|
| 201 | or |
---|
| 202 | |
---|
| 203 | read_blif $BlifFile |
---|
| 204 | |
---|
| 205 | depending on whether the design list specifies a .mv file or a .blif file as |
---|
| 206 | model file. |
---|
| 207 | |
---|
| 208 | The definition of "order_vars" takes into account whether an order file is |
---|
| 209 | available or not. Finally, the definition of "set_fairness" depends on |
---|
| 210 | whether a .fair file is available. This allows one to mix model checking |
---|
| 211 | experiments for which no fairness condition is specified to experiments that |
---|
| 212 | do require fairness conditions. |
---|
| 213 | |
---|
| 214 | When bDefaultInitCommand is FALSE, the script is simply the preamble, |
---|
| 215 | followed by the unchanged template, follows by the "quit" command, which is |
---|
| 216 | always automatically added. |
---|
| 217 | |
---|
| 218 | When bDefaultInitCommand is TRUE, regression_test also adds an automatically |
---|
| 219 | generated "init" section between the preamble and the template. This default |
---|
| 220 | section is equivalent to |
---|
| 221 | |
---|
| 222 | echo CommandBegin init |
---|
| 223 | read_model |
---|
| 224 | flt |
---|
| 225 | order_vars |
---|
| 226 | part |
---|
| 227 | time |
---|
| 228 | print_bdd_stats |
---|
| 229 | echo CommandEnd init |
---|
| 230 | |
---|
| 231 | Notice that one has no control over dynamic reordering during the |
---|
| 232 | initialization phase if bDefaultInitCommand is TRUE. |
---|
| 233 | |
---|
| 234 | The template should refer to the defined aliases and variables to tailor |
---|
| 235 | itself to each test case. Notice, however, that one is not forced to use |
---|
| 236 | "order_vars." If one wants to always use vis's static ordering algorithm, |
---|
| 237 | (s)he can simply use "so" instead of "order_vars." Likewise, one can decide |
---|
| 238 | to always ignore fairness constraints by not including "set_fairness." |
---|
| 239 | |
---|
| 240 | On the other hand, it is usually a bad idea to explicitly include commands |
---|
| 241 | like |
---|
| 242 | |
---|
| 243 | read_blif_mv |
---|
| 244 | read_blif |
---|
| 245 | so -s input_and_latch |
---|
| 246 | read_fairness |
---|
| 247 | |
---|
| 248 | in the templates, because they limit the range of test cases to which they |
---|
| 249 | can be applied. |
---|
| 250 | |
---|
| 251 | As we said, the templates, and hence the scripts, are divided in sections. |
---|
| 252 | Each section begins and ends with an "echo" command. These commands write |
---|
| 253 | delimiters to the output file, which are used by vis_reg to define the scope |
---|
| 254 | of the pattern searches. It is important that the tag specified in |
---|
| 255 | corresponding pairs of CommandBegin and CommandEnd be the same. |
---|
| 256 | |
---|
| 257 | The calls to "print_bdd_stats" are typically used to add to the output files |
---|
| 258 | the information requested by the CompareItem declarations. It is a good idea |
---|
| 259 | to specify timeouts for potentially expensive steps. |
---|
| 260 | |
---|
| 261 | The design list file has one line for each test case. A master design list |
---|
| 262 | is in ~hsv/vis/Regression/DesignList. Specific design lists can be obtained |
---|
| 263 | by commenting out or deleting lines from this master list. One can also |
---|
| 264 | create an entirely new list according to the following rules. |
---|
| 265 | |
---|
| 266 | Lines starting with '#' are interpreted as comments and are skipped, as in |
---|
| 267 | all files read by vis_reg. |
---|
| 268 | |
---|
| 269 | A test case is a named collection of related files that can be used to run |
---|
| 270 | one or more experiments. As mentioned before, each line consists of white |
---|
| 271 | space-separated fields. The first is a (relative) path, while the second is |
---|
| 272 | the collection name. The remaining fields specify the files in the |
---|
| 273 | collection. |
---|
| 274 | |
---|
| 275 | The smallest collection will specify one .mv or one .blif file. This file is |
---|
| 276 | called the "model file," and should always appear in the third field of the |
---|
| 277 | line. If the model file is the only file specified, one can run reachability |
---|
| 278 | analysis, but not much more. Vis_reg understands the file extensions shown |
---|
| 279 | in the following table. For each extension, the variable bound to the file |
---|
| 280 | name and a short explanation are given. |
---|
| 281 | |
---|
| 282 | .mv $BlifFile model file to be read with rlmv |
---|
| 283 | .blif $BlifFile model file to be read with rl |
---|
| 284 | .ord $OrdFIle order file to be read with so -s input_and_latch |
---|
| 285 | .inv $InvFile invariant file to be read with ci |
---|
| 286 | .ctl $CtlFile property file to be read with mc |
---|
| 287 | .fair $FairFile fairness file to be read with rf |
---|
| 288 | .hint $HintFile hint file to be read with the -g option of mc, ci, and rch |
---|
| 289 | |
---|
| 290 | Only one file of each type can appear in the collection defining one test |
---|
| 291 | case. If multiple property files, or different order files are available for |
---|
| 292 | one model file, one has to create two test cases for that model. Different |
---|
| 293 | models, on the other hand, can share property files, order files, and so on. |
---|
| 294 | The master design list provides examples for both these situations. |
---|
| 295 | |
---|
| 296 | It is not possible to specify both a .mv file and a .blif file for one test |
---|
| 297 | case. There must be exactly one model file per test case. |
---|
| 298 | |
---|
| 299 | As was mentioned before, it is important that the name of a test case be |
---|
| 300 | unique in the entire batch used for a regression test. On the other hand, |
---|
| 301 | several test cases may share the same directory, and often do when they have |
---|
| 302 | some files in common. |
---|
| 303 | |
---|
| 304 | The result summary created by regression_test is a LaTeX file consisting of |
---|
| 305 | several tables. There is one table with performance stats for each section |
---|
| 306 | of the output file. In addition, there is an optional counterexample |
---|
| 307 | comparison table that summarizes PASS/FAIL information for the properties |
---|
| 308 | and the length of counterexamples. For each failing property, the length of |
---|
| 309 | the stem and the length of the cycle (if applicable) are reported. |
---|
| 310 | |
---|
| 311 | Each performance table consists of one column for the test case name, |
---|
| 312 | followed by as many groups of three columns as there are CompareItem |
---|
| 313 | declarations applicable to the section that the table summarizes. |
---|
| 314 | |
---|
| 315 | Each group of three columns gives the data for the reference run, the data |
---|
| 316 | for the new run, and the improvement. Vis_reg regards smaller as better. |
---|
| 317 | Hence, the improvement is positive if the CPU time, memory occupation, and |
---|
| 318 | so on, have decreased from the reference run to the new run. The improvement |
---|
| 319 | is computed as |
---|
| 320 | |
---|
| 321 | (value(ref) - value(new)) / value(ref) * 100. |
---|
| 322 | |
---|
| 323 | Notice that this formula tends to make "new" look bad. for instance, if new |
---|
| 324 | is twice as fast as ref, the improvement is 50%. If ref is twice as fast, |
---|
| 325 | however, the "improvement" is -100%. |
---|
| 326 | |
---|
| 327 | The results of each test case are stored in a separate directory, whose name |
---|
| 328 | is obtained by concatenating ReferenceOutputDir (or NewOutputDir) with the |
---|
| 329 | test case name. Each directory stores three files, whose names are given by |
---|
| 330 | the test case name concatenated with .src, .out, and .err. |
---|
| 331 | |
---|
| 332 | * test_case_name.src stores the script used to run vis. |
---|
| 333 | * test_case_name.out stores what vis writes to vis_stdout. |
---|
| 334 | * test_case_name.err stores what vis writes to vis_stderr. |
---|
| 335 | |
---|
| 336 | One has to be careful, because vis_reg will overwrite these results |
---|
| 337 | directories without warning if run twice with the same ReferenceOutputDir or |
---|
| 338 | NewOutputDir and bRefRun or bNewRun set to TRUE. |
---|
| 339 | |
---|
| 340 | With '-s' option, you can create c-shell script file to be able to run |
---|
| 341 | experiments in shell. We add this option to solve the problem that are |
---|
| 342 | happened when your template file contains child-process by system call. |
---|
| 343 | _________________________________________________________________ |
---|
| 344 | |
---|
| 345 | Last updated on 20100410 00h02 |
---|