[11] | 1 | |
---|
| 2 | regression_test - perform regression test with the information given in |
---|
| 3 | arguments file |
---|
| 4 | _________________________________________________________________ |
---|
| 5 | |
---|
| 6 | regression_test [-s <script_file_name>] |
---|
| 7 | |
---|
| 8 | Invokes two set of execution of vis to compare the performance and to |
---|
| 9 | check sanity of each execution. Then make a set of tables summarizing |
---|
| 10 | results, such as CPU sec, memoty usage and fail/pass of model_check, |
---|
| 11 | etc.. This command will read a description of the regression test to |
---|
| 12 | be performed from the argument file, which is configuartion file. |
---|
| 13 | Besides the configuration file, one needs to prepare two template |
---|
| 14 | files and a design list. We now describe these in turn. As running |
---|
| 15 | example we shall use a regression test that compares check_invariant |
---|
| 16 | with and without guided search. |
---|
| 17 | |
---|
| 18 | The configuration file for our example is shown below, delimited by |
---|
| 19 | lines of dashes (which are not part of the file). The order in which |
---|
| 20 | the lines appear in the file is immaterial. A line beginning with '#' |
---|
| 21 | is a comment, and is ignored by regression_test. |
---|
| 22 | ---------------------------------------------------------------------- |
---|
| 23 | DesignDirectory : /projects/development/hsv/vis/Regression |
---|
| 24 | DesignList : ./DesignListInv |
---|
| 25 | ReferenceVis : /usr/local/vis/i686-o/vis-cu |
---|
| 26 | ReferenceVisNickName : BFS |
---|
| 27 | ReferenceOutputDir : ./result_inv_bfs |
---|
| 28 | RefCommandTemplate : ./TemplateRefInv |
---|
| 29 | bRefRun : FALSE |
---|
| 30 | NewVis : /usr/local/vis/i686-o/vis-cu |
---|
| 31 | NewVisNickName : GS |
---|
| 32 | NewOutputDir : ./result_inv_gs |
---|
| 33 | bNewRun : FALSE |
---|
| 34 | NewCommandTemplate : ./TemplateNewInv |
---|
| 35 | bCompareCounterExample : TRUE |
---|
| 36 | bDefaultInitCommand : FALSE |
---|
| 37 | ResultForPerformance : ./BFSvsGS.tex |
---|
| 38 | CompareItem : cpu : elapse: |
---|
| 39 | CompareItem : mem : Memory in use: |
---|
| 40 | CompareItem : bdd : Peak number of live nodes: |
---|
| 41 | ---------------------------------------------------------------------- |
---|
| 42 | |
---|
| 43 | Each line defines a parameter of the regression test. |
---|
| 44 | |
---|
| 45 | * DesignDirectory gives the path to the directory where the input |
---|
| 46 | files are stored. In this case, we are using the central repository |
---|
| 47 | under ~hsv/vis/Regression. It's typically a good idea to store the |
---|
| 48 | results of regression tests in a separate place from the input files. |
---|
| 49 | In our examples, the results are stored in the current directory. |
---|
| 50 | |
---|
| 51 | * DesignList gives the name of the file containing the list of test |
---|
| 52 | cases to run. Each line consists of several white space-separated |
---|
| 53 | fields describing the files making up the test case. The first field |
---|
| 54 | is a path to the files relative to DesignDirectory. The second field |
---|
| 55 | is the test case name. The remaining fields are file names. More on |
---|
| 56 | this later, when we talk about the design list file in detail. |
---|
| 57 | |
---|
| 58 | * ReferenceVis and NewVis give paths to the two versions of vis to be |
---|
| 59 | compared in the regression test. They can be the same, as in this case |
---|
| 60 | in which both are the "nightly build." |
---|
| 61 | |
---|
| 62 | * ReferenceVisNickName and NewVisNickName are two identifiers that |
---|
| 63 | are used in the LaTeX tables to refer to the two versions that are |
---|
| 64 | compared. Short mnemonics are preferable. Avoid using underscores, |
---|
| 65 | because they currently cause trouble with LaTeX. |
---|
| 66 | |
---|
| 67 | * ReferenceOutputDir and NewOutputDir give the names of the |
---|
| 68 | directories in which the results of the runs will be stored. In each |
---|
| 69 | of these directories, vis_reg will create a directory for each test |
---|
| 70 | case. The names of these directories are the test case names taken |
---|
| 71 | from the design list. Therefore, these names must be unique. |
---|
| 72 | |
---|
| 73 | * RefCommandTemplate and NewCommandTemplate are the templates from |
---|
| 74 | which vis_reg produces the scripts used to run the test cases. The |
---|
| 75 | mechanism used to derive the scripts from the templates and the design |
---|
| 76 | list is described in detail when we talk about the template files. The |
---|
| 77 | two templates can be the same file. |
---|
| 78 | |
---|
| 79 | * bRefRun and bNewRun are flags that tell vis_reg which of the two |
---|
| 80 | sets of experiments should be run (if any). If one set of results has |
---|
| 81 | been already computed, the corresponding flag should be set to FALSE. |
---|
| 82 | If both flags are FALSE, vis_reg simply regenerates the summary table. |
---|
| 83 | |
---|
| 84 | * bCompareCounterExample is a flag that instructs vis_reg to scan the |
---|
| 85 | output files for counterexamples and record their lengths. For each |
---|
| 86 | counterexample, stem and cycle length are found. If this flag is set |
---|
| 87 | to FALSE, counterexamples are not analyzed and the corresponding table |
---|
| 88 | is not produced. |
---|
| 89 | |
---|
| 90 | * bDefaultInitCommand is a flag that, if set to TRUE, instructs |
---|
| 91 | vis_reg to generate a default set of commands to initialize each vis |
---|
| 92 | run. This option allows one to save some time in preparing the |
---|
| 93 | templates, at the expense of some flexibility. |
---|
| 94 | |
---|
| 95 | * ResultForPerformance is the name of the LaTeX file to which vis_reg |
---|
| 96 | writes the summary of the experiments. Its format is described in |
---|
| 97 | detail later. |
---|
| 98 | |
---|
| 99 | * CompareItem specifies one item to be monitored in the experiments. |
---|
| 100 | Multiple CompareItem lines may appear in the configuration file. Each |
---|
| 101 | one consists of two items: an identifier, and a pattern. The |
---|
| 102 | identifier is used in the table to denote the item, whereas the |
---|
| 103 | pattern is used to extract from the output files the information to be |
---|
| 104 | put in the tables. |
---|
| 105 | |
---|
| 106 | In our example, we declare |
---|
| 107 | |
---|
| 108 | CompareItem : mem : Memory in use: |
---|
| 109 | |
---|
| 110 | We instruct regression_test to search for a line beginning with |
---|
| 111 | "Memory in use:" in each section of the output file. The token |
---|
| 112 | immediately following the occurrence of the pattern is the data item |
---|
| 113 | to be put in the table under "mem." so that the resulting LaTeX table |
---|
| 114 | contains data on the memmory usage. |
---|
| 115 | |
---|
| 116 | Since the output file is divided into sections delimited by strings |
---|
| 117 | "CommandBegin" and "CommandEnd." (See below.), each section is |
---|
| 118 | searched for the patterns specified by the CompareItem declarations. |
---|
| 119 | If the pattern is matched, the item appears in the table for that |
---|
| 120 | section. |
---|
| 121 | |
---|
| 122 | To run the regression_test we need configuration file, template file |
---|
| 123 | and design list. In case of template file, we need a two template |
---|
| 124 | files, since we want to compare two different run of vis. The syntax |
---|
| 125 | of the two templates is identical. We describe the template for "new," |
---|
| 126 | but all we say applies to "ref" as well. The "new" template for our |
---|
| 127 | example is shown below. |
---|
| 128 | ---------------------------------------------------------------------- |
---|
| 129 | |
---|
| 130 | echo CommandBegin init |
---|
| 131 | read_model |
---|
| 132 | flt |
---|
| 133 | order_vars |
---|
| 134 | part |
---|
| 135 | time |
---|
| 136 | print_bdd_stats |
---|
| 137 | echo CommandEnd init |
---|
| 138 | echo CommandBegin ci |
---|
| 139 | set hd_frontier_approx_threshold 3500 |
---|
| 140 | ci -t 3600 -g $HintFile $InvFile |
---|
| 141 | time |
---|
| 142 | print_bdd_stats |
---|
| 143 | echo CommandEnd ci |
---|
| 144 | echo CommandBegin ci_cex |
---|
| 145 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
| 146 | time |
---|
| 147 | print_bdd_stats |
---|
| 148 | echo CommandEnd ci_cex |
---|
| 149 | ---------------------------------------------------------------------- |
---|
| 150 | |
---|
| 151 | Before proceeding to its discussion, it is instructive to examine the |
---|
| 152 | script that regression_test produces from it for one test case. |
---|
| 153 | regression_test relies on vis's shell variables and aliases to |
---|
| 154 | customize the script. The script consists of a preamble in which |
---|
| 155 | variables and aliases are defined, followed by one or more sections, |
---|
| 156 | each corresponding to a major step in the experiment. Statistics will |
---|
| 157 | be collected for each step separately. Our example has three sections, |
---|
| 158 | whose names are "init," "ci," and "ci_cex." The names of the sections |
---|
| 159 | are arbitrary, except that the automatically generated initialization |
---|
| 160 | section is always called "init." |
---|
| 161 | ---------------------------------------------------------------------- |
---|
| 162 | |
---|
| 163 | set vis_stdout ./result_inv_gs/am2901/am2901.out |
---|
| 164 | set vis_stderr ./result_inv_gs/am2901/am2901.err |
---|
| 165 | set BlifFile /projects/development/hsv/vis/Regression/Am2901/am2901.mv |
---|
| 166 | set OrdFile /projects/development/hsv/vis/Regression/Am2901/am2901.ord |
---|
| 167 | set InvFile /projects/development/hsv/vis/Regression/Am2901/am2901.inv |
---|
| 168 | set HintFile |
---|
| 169 | /projects/development/hsv/vis/Regression/Am2901/am2901.hint |
---|
| 170 | alias read_model "rlmv $BlifFile" |
---|
| 171 | alias set_fairness " " |
---|
| 172 | alias order_vars "so -s input_and_latch $OrdFile" |
---|
| 173 | echo CommandBegin init |
---|
| 174 | read_model |
---|
| 175 | flt |
---|
| 176 | order_vars |
---|
| 177 | part |
---|
| 178 | time |
---|
| 179 | print_bdd_stats |
---|
| 180 | echo CommandEnd init |
---|
| 181 | echo CommandBegin ci |
---|
| 182 | set hd_frontier_approx_threshold 3500 |
---|
| 183 | ci -t 3600 -g $HintFile $InvFile |
---|
| 184 | time |
---|
| 185 | print_bdd_stats |
---|
| 186 | echo CommandEnd ci |
---|
| 187 | echo CommandBegin ci_cex |
---|
| 188 | ci -d 1 -i -t 3600 -g $HintFile $InvFile |
---|
| 189 | time |
---|
| 190 | print_bdd_stats |
---|
| 191 | echo CommandEnd ci_cex |
---|
| 192 | quit |
---|
| 193 | ---------------------------------------------------------------------- |
---|
| 194 | |
---|
| 195 | The variables and the aliases are defined by vis_reg by examining the |
---|
| 196 | fields of the test case description in the design file. |
---|
| 197 | regression_test uses file extensions to identify file types. (E.g., |
---|
| 198 | .ord indicates an order file in "input_and_latch" format.) A full list |
---|
| 199 | of extensions is given below. |
---|
| 200 | |
---|
| 201 | regression_test defines variables for all the file types actually |
---|
| 202 | available for the experiment, and only for those. It also defines |
---|
| 203 | three aliases that provide some flexibility in mixing different |
---|
| 204 | experiments in the same batch. |
---|
| 205 | |
---|
| 206 | For instance, "read_model" is defined as either |
---|
| 207 | |
---|
| 208 | read_blif_mv $BlifFile |
---|
| 209 | |
---|
| 210 | or |
---|
| 211 | |
---|
| 212 | read_blif $BlifFile |
---|
| 213 | |
---|
| 214 | depending on whether the design list specifies a .mv file or a .blif |
---|
| 215 | file as model file. |
---|
| 216 | |
---|
| 217 | The definition of "order_vars" takes into account whether an order |
---|
| 218 | file is available or not. Finally, the definition of "set_fairness" |
---|
| 219 | depends on whether a .fair file is available. This allows one to mix |
---|
| 220 | model checking experiments for which no fairness condition is |
---|
| 221 | specified to experiments that do require fairness conditions. |
---|
| 222 | |
---|
| 223 | When bDefaultInitCommand is FALSE, the script is simply the preamble, |
---|
| 224 | followed by the unchanged template, follows by the "quit" command, |
---|
| 225 | which is always automatically added. |
---|
| 226 | |
---|
| 227 | When bDefaultInitCommand is TRUE, regression_test also adds an |
---|
| 228 | automatically generated "init" section between the preamble and the |
---|
| 229 | template. This default section is equivalent to |
---|
| 230 | |
---|
| 231 | echo CommandBegin init |
---|
| 232 | read_model |
---|
| 233 | flt |
---|
| 234 | order_vars |
---|
| 235 | part |
---|
| 236 | time |
---|
| 237 | print_bdd_stats |
---|
| 238 | echo CommandEnd init |
---|
| 239 | |
---|
| 240 | Notice that one has no control over dynamic reordering during the |
---|
| 241 | initialization phase if bDefaultInitCommand is TRUE. |
---|
| 242 | |
---|
| 243 | The template should refer to the defined aliases and variables to |
---|
| 244 | tailor itself to each test case. Notice, however, that one is not |
---|
| 245 | forced to use "order_vars." If one wants to always use vis's static |
---|
| 246 | ordering algorithm, (s)he can simply use "so" instead of "order_vars." |
---|
| 247 | Likewise, one can decide to always ignore fairness constraints by not |
---|
| 248 | including "set_fairness." |
---|
| 249 | |
---|
| 250 | On the other hand, it is usually a bad idea to explicitly include |
---|
| 251 | commands like |
---|
| 252 | |
---|
| 253 | read_blif_mv |
---|
| 254 | read_blif |
---|
| 255 | so -s input_and_latch |
---|
| 256 | read_fairness |
---|
| 257 | |
---|
| 258 | in the templates, because they limit the range of test cases to which |
---|
| 259 | they can be applied. |
---|
| 260 | |
---|
| 261 | As we said, the templates, and hence the scripts, are divided in |
---|
| 262 | sections. Each section begins and ends with an "echo" command. These |
---|
| 263 | commands write delimiters to the output file, which are used by |
---|
| 264 | vis_reg to define the scope of the pattern searches. It is important |
---|
| 265 | that the tag specified in corresponding pairs of CommandBegin and |
---|
| 266 | CommandEnd be the same. |
---|
| 267 | |
---|
| 268 | The calls to "print_bdd_stats" are typically used to add to the output |
---|
| 269 | files the information requested by the CompareItem declarations. It is |
---|
| 270 | a good idea to specify timeouts for potentially expensive steps. |
---|
| 271 | |
---|
| 272 | The design list file has one line for each test case. A master design |
---|
| 273 | list is in ~hsv/vis/Regression/DesignList. Specific design lists can |
---|
| 274 | be obtained by commenting out or deleting lines from this master list. |
---|
| 275 | One can also create an entirely new list according to the following |
---|
| 276 | rules. |
---|
| 277 | |
---|
| 278 | Lines starting with '#' are interpreted as comments and are skipped, |
---|
| 279 | as in all files read by vis_reg. |
---|
| 280 | |
---|
| 281 | A test case is a named collection of related files that can be used to |
---|
| 282 | run one or more experiments. As mentioned before, each line consists |
---|
| 283 | of white space-separated fields. The first is a (relative) path, while |
---|
| 284 | the second is the collection name. The remaining fields specify the |
---|
| 285 | files in the collection. |
---|
| 286 | |
---|
| 287 | The smallest collection will specify one .mv or one .blif file. This |
---|
| 288 | file is called the "model file," and should always appear in the third |
---|
| 289 | field of the line. If the model file is the only file specified, one |
---|
| 290 | can run reachability analysis, but not much more. Vis_reg understands |
---|
| 291 | the file extensions shown in the following table. For each extension, |
---|
| 292 | the variable bound to the file name and a short explanation are given. |
---|
| 293 | |
---|
| 294 | .mv $BlifFile model file to be read with rlmv |
---|
| 295 | .blif $BlifFile model file to be read with rl |
---|
| 296 | .ord $OrdFIle order file to be read with so -s input_and_latch |
---|
| 297 | .inv $InvFile invariant file to be read with ci |
---|
| 298 | .ctl $CtlFile property file to be read with mc |
---|
| 299 | .fair $FairFile fairness file to be read with rf |
---|
| 300 | .hint $HintFile hint file to be read with the -g option of mc, ci, and |
---|
| 301 | rch |
---|
| 302 | |
---|
| 303 | Only one file of each type can appear in the collection defining one |
---|
| 304 | test case. If multiple property files, or different order files are |
---|
| 305 | available for one model file, one has to create two test cases for |
---|
| 306 | that model. Different models, on the other hand, can share property |
---|
| 307 | files, order files, and so on. The master design list provides |
---|
| 308 | examples for both these situations. |
---|
| 309 | |
---|
| 310 | It is not possible to specify both a .mv file and a .blif file for one |
---|
| 311 | test case. There must be exactly one model file per test case. |
---|
| 312 | |
---|
| 313 | As was mentioned before, it is important that the name of a test case |
---|
| 314 | be unique in the entire batch used for a regression test. On the other |
---|
| 315 | hand, several test cases may share the same directory, and often do |
---|
| 316 | when they have some files in common. |
---|
| 317 | |
---|
| 318 | The result summary created by regression_test is a LaTeX file |
---|
| 319 | consisting of several tables. There is one table with performance |
---|
| 320 | stats for each section of the output file. In addition, there is an |
---|
| 321 | optional counterexample comparison table that summarizes PASS/FAIL |
---|
| 322 | information for the properties and the length of counterexamples. For |
---|
| 323 | each failing property, the length of the stem and the length of the |
---|
| 324 | cycle (if applicable) are reported. |
---|
| 325 | |
---|
| 326 | Each performance table consists of one column for the test case name, |
---|
| 327 | followed by as many groups of three columns as there are CompareItem |
---|
| 328 | declarations applicable to the section that the table summarizes. |
---|
| 329 | |
---|
| 330 | Each group of three columns gives the data for the reference run, the |
---|
| 331 | data for the new run, and the improvement. Vis_reg regards smaller as |
---|
| 332 | better. Hence, the improvement is positive if the CPU time, memory |
---|
| 333 | occupation, and so on, have decreased from the reference run to the |
---|
| 334 | new run. The improvement is computed as |
---|
| 335 | |
---|
| 336 | (value(ref) - value(new)) / value(ref) * 100. |
---|
| 337 | |
---|
| 338 | Notice that this formula tends to make "new" look bad. for instance, |
---|
| 339 | if new is twice as fast as ref, the improvement is 50%. If ref is |
---|
| 340 | twice as fast, however, the "improvement" is -100%. |
---|
| 341 | |
---|
| 342 | The results of each test case are stored in a separate directory, |
---|
| 343 | whose name is obtained by concatenating ReferenceOutputDir (or |
---|
| 344 | NewOutputDir) with the test case name. Each directory stores three |
---|
| 345 | files, whose names are given by the test case name concatenated with |
---|
| 346 | .src, .out, and .err. |
---|
| 347 | |
---|
| 348 | * test_case_name.src stores the script used to run vis. |
---|
| 349 | * test_case_name.out stores what vis writes to vis_stdout. |
---|
| 350 | * test_case_name.err stores what vis writes to vis_stderr. |
---|
| 351 | |
---|
| 352 | One has to be careful, because vis_reg will overwrite these results |
---|
| 353 | directories without warning if run twice with the same |
---|
| 354 | ReferenceOutputDir or NewOutputDir and bRefRun or bNewRun set to TRUE. |
---|
| 355 | |
---|
| 356 | With '-s' option, you can create c-shell script file to be able to run |
---|
| 357 | experiments in shell. We add this option to solve the problem that are |
---|
| 358 | happened when your template file contains child-process by system |
---|
| 359 | call. |
---|
| 360 | _________________________________________________________________ |
---|
| 361 | |
---|
| 362 | Last updated on 20050519 10h16 |
---|