| 
  • If you are citizen of an European Union member nation, you may not use this service unless you are at least 16 years old.

  • You already know Dokkio is an AI-powered assistant to organize & manage your digital files & messages. Very soon, Dokkio will support Outlook as well as One Drive. Check it out today!

View
 

Boogie Homework

Page history last edited by tayfunelmas 12 years, 6 months ago Saved with comment

Introduction

 

In this assignment, you need to implement a static verifier for Boogie programs. The details follow.

Please submit you general questions about the homework to Piazza class forum.

You can also contact Tayfun Elmas (elmas@eecs.berkeley.edu) and Prof. Koushik Sen with any specific questions.

 

Related material

 

You are advised to look at the following paper before proceeding with the implementation:

 

The following papers are not mandatory to read but may be helpful to optimize your verification condition generation:

 

Slides:

 

Requirements

 

Operating system: For part of this assignment, you will be required to use a Windows machine (due to the use of the Boogie framework)

We provide you with a virtual machine with Windows 7 and all the required software preinstalled.

 

To get the virtual machine, download the following directory from login.eecs.berkeley.edu (e.g., using scp): /home/eecs/elmas/Boogie101/Windows7/

You can run the virtual machine by opening the file Windows7.vbox from VirtualBox (http://www.virtualbox.org/).

Username & password: boogie

(Please email to elmas@eecs if you encounter any problems about downloading/running the virtual machine.)

 

The rest of the assignment will refer to this virtual machine “Windows-VM”.

While coding and testing Boogie-independent parts of the assignment, you can use Windows, Linux, or Mac OS X.

 

If you want to use your own Windows machine instead of Windows-VM, we recommend that you install the following software:

     Optional: 

 

Programming language: For this assignment, you can use your preferred programming language (including scripting languages, e.g., Python). Whenever you need to interact with the Boogie framework (e.g., to parse a Boogie file), we will provide scripts/tools that you can call/run from the language you use.

 

Installation

 

Download: boogie101.zip

 

Small test programs (Boogie): [test.zip]

 

After downloading the zip file, follow the steps below:

  • Open the Visual Studio 2010 Command Prompt (do not use a standard shell).
  • Go (cd) into the directory in which you want to install the assignment files (no need to create a root directory; extracting the zip file will create one).
  • Extract the zip file boogie101.zip. From Windows-VM you can extract the zip using the following command:
    7z x boogie101.zip
    The files will be extrated into a new directory named boogie101, which will become the root of your assignment files.
  • Go (cd) into the boogie101 directory.
  • Set the following environment variables:

set BOOGIE101_HOME=%cd%

set HAVOC_INSTALL_DIR=%BOOGIE101_HOME%\havoc

set HAVOC_PDF_DIR=%HAVOC_INSTALL_DIR%\pfd

Some scripts we provide require these variables set whenever you start working with them. Thus, we suggest that you set BOOGIE101_HOME as a global environment variable (storing the full path to the boogie101 root directory in which the assignment files reside). Otherwise, whenever you start working in Windows-VM, you should first set the variables from the command line as given above. 

  • Run the batch script compile-boogie.bat. The script builds the Boogie framework (in boogie101/boogie), and runs some regression tests to check if the Boogie framework works correctly (which you may skip).
  • Then, you can start working with the assignment files.

 

Some directories in boogie101 to note:

bench: Boogie benchmarks selected from ones in boogie/Test.

bench/havoc: Bogie benchmarks translated from C files using HAVOC

bench101: Boogie101 benchmarks translated from programs in bench

bench101/havoc: Boogie101 benchmarks translated from programs in bench/havoc

boogie: Boogie framework. Customized for the assignment.

boogie/Binaries: Where Boogie.exe resides, if you want to call Boogie directly.

boogie/Test: Boogie regression tests, which you might find useful to debug your tool. (Use only the .bpl files)

havoc: HAVOC distribution using our customized Boogie framework.

stubs: Parser stubs for Python and C, to demonstrate how to write a parser for Boogie101.

work: Initially empty. Some scripts use it for storing intermediate files.

 

Overview

 

In this homework, you will write a (command line) tool that verifies programs written in the Boogie language. For clarity, we will use the term “tool” to refer to the program you will write for the assignment, and “program” to refer to a program your tool operates on. Some Boogie programs are translated from C programs using HAVOC. 

  • Running tool: Since this assignment is programming language-independent, we require that you provide a Windows batch script named run-boogie.bat as the launcher for your tool. This script should at least launch your actual tool (for example, running java with ta proper class path and your main Java class) and pass the arguments to the tool.
  • Input: run-boogie.bat will take a path to a Boogie program.
  • Output: Your tool is expected to print to the console a report on which procedures in the program are verified (satisfy their specifications).

          Your tool should print its output to the console in the following format.

     For each procedure in the program, either:

     ProcedureName:Verified

     or

     ProcedureName:Failed(Reason)

           Reason is one of the following: {Assertion violation, Timed out, Out of memory, Inconclusive} (see the SMT interface given below to obtain these reasons).
          Any other reasonable explanation is acceptable.

 

Although the input to your tool is a Boogie program, we do not want to expose you the details of parsing and manipulating the Boogie language (while Boogie is an intermediate language, its syntax and type system is getting rich). In order to achieve a separation between the Boogie language details and the verification process you need to implement, we have come up with a simpler language called Boogie101. We will provide you a script that takes a Boogie source file and creates a Boogie101 source file, which captures the semantics of the original Boogie program but in a simpler form. Your tool is expected to run on Boogie101 programs.

Note: You must use the .bpl extension for both Boogie and Boogie101 source file names.

 

In the rest of this document, we will be explaining the Boogie101 language, as your tool is only required to understand and work on Boogie101.

If you want to learn about the Boogie language, please refer to the site http://research.microsoft.com/en-us/projects/boogie/, especially and the following paper:

If you want to start trying the Boogie language quickly, see http://rise4fun.com/boogie

 

Assumptions about Boogie input programs:

  • The polymorphic type system of Boogie 2 is not supported. Thus, do not try to verify programs with type parameters (in the form <T,K,...>) in map type declarations, function declarations, axioms, and quantified expressions. (Type declarations of the form type Name T K ...; are allowed.)
  • Assume that the input Boogie program is structured and there are no irreducible loops in the Boogie101 program. Note that, although Boogie programs have no continue statements (but break), there may be multiple back edges for the same loop (because of continue statements in the C programs from which Boogie inputs are produced). 

 

Benchmarks

 We provide a set of benchmarks you can run your tools. The Boogie version of the benchmarks are in the boogie101/bench directory. The boogie101/bench/havoc directory contains Boogie programs translated from the standard HAVOC benchmarks (you can find the C programs in boogie101/havoc/examples/prefast directory. You can also find the Boogie101 translation of the Boogie benchmarks in the boogie/bench101 directory.

 

Observing internal Boogie transformations:

Here is an example Boogie program: example.bpl and the version of the input program after each transformation Boogie.exe performs: example_out.txt.

You can also obtain the same kind of information for another Boogie file, say file.bpl, as follows:

 

boogie101/boogie/Binaries/Boogie.exe /traceverify file.bpl > file_out.txt

 

Boogie will output the programs generated after each translation into file_out.txt.

You can also check for Boogie.exe command line options for other kinds of information:

 

boogie101/boogie/Binaries/Boogie.exe /help

 

 

Helper scripts provided in boogie101.zip

 

Since the following scripts run programs such as Boogie or C# compiler, which may write some output to the console, the inputs and outputs of the scripts are both files, not command line and console (resp.).

 

Translating Boogie to Boogie101: For this, you need to run a Python script named boogie101translator.py. The script takes two arguments, an existing Boogie source file path (input) and a Boogie101 file path (output):  

 

python boogie101translator.py boogie_file_path boogie101_file_path

 

Notes:

If boogie101_file_path already exists, it is overwritten.

The second argument can be skipped. If so, the Boogie101 output is written to file BOOGIE101_HOME/work/boogie101.bpl.


 

Translating Boogie101 to Boogie: For debugging purposes, we also provide a script named boogie101reversetranslator.py. The script translates an existing Boogie101 file back to a Boogie file. Since Boogie101 programs have unstructured control flow (and some other simplifications explained below), translating a Boogie program to Boogie101 and then back to Boogie will lose the original form (for example, structured statements, e.g., if, while) of the original Boogie program. 

 

python boogie101reversetranslator.py boogie101_file_path boogie_file_path

 

Note:

If boogie_file_path already exists, it is overwritten. 

The second argument can be skipped. If so, the Boogie output is written to file BOOGIE101_HOME/work/boogie.bpl. 

 


 

Translating Boogie101 expression to a Boogie expression: This is used to format the output of your verifier. The script named boogie101exprtranslator.py takes a file containing a Boogie101 expression and generates a file containing the Boogie form of the same expression.  

 

python boogie101exprtranslator.py boogie101_expr_file_path boogie_expr_file_path

 

Note: 

If boogie_expr_file_path already exists, it is overwritten. 

The second argument can be skipped. If so, the Boogie output is written to file BOOGIE101_HOME/work/boogie_expr.bpl.

Important: Since this is used to generate the output of the tool, do not expect this script to generate a completely-valid Boogie expression, but a human-readable one. Thus, do not attempt to parse the result of the script, but only print it to the console.


 

Checking a Boogie101 program: For debugging purposes, if you want to verify a Boogie101 directly (without translating to Boogie), you can use the script named boogie101checker.py. This script performs the verification without generating a Boogie file, thus you may want to combine this script with boogie101reversetranslator.py, if you also want the Boogie file. This script invokes Boogie.exe to do the verification, so the results will be given in Boogie’s standard output format.

 

python boogie101checker.py boogie101_file_path

 

Note:

The scripts boogie101reversetranslator.py and boogie101checker.py run by invoking a plugin in the Boogie framework that we wrote for this assignment. The plugin translates the Boogie101 input to an internal Boogie AST data structure, and then uses the standard verification mechanisms in the Boogie framework. As the plugin does not directly work on Boogie101 programs, you should not try to use this plugin to implement your homework (but use it for debugging the Boogie101 files you generate).


 

Translating C files to Boogie files: You can use the boogie101havoc.py script to translate a C file to a Boogie program. In addition to the path of the C file, the script takes a directory name, say DIR, and invokes HAVOC (located in boogie101/havoc). HAVOC generates for each function in the C file a separate Boogie file. The output files are placed in the boogie101/bench/havoc/DIR directory.

 

python boogie101havoc.py c_file_path bench_dir_name_for_output

 

 

 

Boogie101 Programs

 

In order to simplify the parsing of Boogie101 programs, we have come up with a function call-like syntax for Boogie101 programs. Roughly speaking, each element of the abstract syntax tree (AST) of the program will be created by a specific function call, say f, and the children of each AST element will be created by other function calls that will appear as the arguments of the call to f. Thus, a Boogie101 source file looks like a giant function call. The top function call is made to bpl_program, which is supposed to return an AST element representing the program. The arguments of bpl_program are other function calls that generate other AST elements, such as procedures, as well as lists of AST elements.

 

Here is an example contents for a Boogie101 source file that represents an empty program (since the children of the program AST are given as empty lists).

 

bpl_program(bpl_list_create(), bpl_list_create(), bpl_list_create(), bpl_list_create(), bpl_list_create())

 

Here, bpl_list_create() represents an empty list. A non-empty list is represented by first creating an empty list via bpl_list_create(), and then by applying a sequence of  bpl_list_append to append new elements at the end of the list. bpl_list_append(L, elt) represents a list containing elements of list L and element elt at the tail.

A list may contain either AST elements (of the same kind) or other lists. For some AST kinds, the ordering elements in the list are essential.

 

(As you have already noticed, every element in a Boogie101 source will start with bpl_)

 

In the following, we will explain the AST elements in a Boogie101 program. Note also that, wherever an AST element contains a list of another AST element, that list is represented by bpl_list_create and bpl_list_append as explained above.

 

Implementation note: We recommend that you implement a parser for Boogie101 by defining for each AST element a function with the same name and taking the same number of components. Then, parsing a program amounts to evaluating the source as a function (or procedure) call in the implementation language. (You may need to modify the Boogie101 source depending on the programing language you use.) In the following, we will not specify input and return types of the functions that create AST elements. Instead, we indicate what other functions will be called to create the arguments of these functions. Thus, you can determine the actual signature of the functions corresponding to the AST elements in your implementation. For example, supposing that you are using C, you can (naively) implement bpl_list_create and bpl_list_append as follows (this implementation will be much simpler in a scripting language with built-in list support):

 

typedef struct {

  void** ast_elements;

  int size;

} mylist;

 

mylist* bpl_list_create() {

  list = (mylist*) malloc(sizeof(mylist));

  list->ast_elements = NULL;

  list->size = 0;

}

 

// here we are a little bit cheating by returning the same list,

// since the list always expands and the old list is not reused

mylist* bpl_list_append(mylist* prefix, void* new_ast_element) {

  tmp = prefix->ast_elements;

  prefix->ast_elements = (void**) malloc(sizeof(void*) * (prefix->size + 1))

  // copy old elements from tmp to prefix->ast_elements

  ...

  // add new element

  prefix->ast_elements[prefix->size] = new_ast_element;

  prefix->size++;

  free(tmp); // dispose the old list

  return prefix;

}

 

Boogie101 Syntax

 

Here is the full Boogie101 syntax in the BNF form. In the following, list<T> represents a list whose elements are of only kind T.

 

program        ::=  bpl_program(list<typedecl>, list<global_var>, list<function>, list<axiom>, list<procedure>)

typedecl       ::=  bpl_typedecl(string, integer)

type           ::=  bpl_type(string, list<type>)

 

global_var     ::=  typedident

typedident     ::=  bpl_typedident(string, type)

 

function       ::=  bpl_function(string, list<type>, type)

axiom          ::=  expression

 

procedure      ::=  bpl_procedure(string, list<spec>, list<spec>, list<modifies>, list<input_param>, list<output_param>, list<local_var>, list<block>)

spec           ::=  bpl_spec(expression, boolean)

modifies       ::=  lhs

input_param    ::=  typedident

output_param   ::=  typedident

local_var      ::=  typedident

 

block          ::=  bpl_block(string, list<command>, goto)

 

command        ::=  assignment | assume | assert | havoc | call

assignment     ::=  bpl_assignment(lhs, rhs)

lhs            ::=  string

rhs            ::=  expression

assume         ::=  bpl_assume(expression)

assert         ::=  bpl_assert(expression)

havoc          ::=  bpl_havoc(string)

call           ::=  bpl_call(string, list<rhs>, list<lhs>)

 

goto           ::=  bpl_goto(list<string>)

 

expression     ::=  literal | ident | apply | quantified

literal        ::=  int_literal | bool_literal

int_literal    ::=  bpl_int(integer)

bool_literal   ::=  bpl_bool(boolean)

ident          ::=  bpl_ident(string)

apply          ::=  bpl_apply(op, list<expression>)

op             ::= "old" | unary | binary | ternary | mapops | typecoercion | functionname

unary          ::= "!"

binary         ::= "+" | "-" | "*" | "/" | "%" | "==" | "!=" | ">" | ">=" | "<" | "<=" | "&&" | "||" | "==>" | "<==>" | "<:"

ternary        ::= "if-then-else"

mapops         ::= "MapSelect" | "MapStore"

typecoercion   ::= ":"

functionname   ::= string

 

quantified     ::=  forall | exists

forall         ::=  bpl_forall(list<bound_var>, list<trigger>, expression)

exists         ::=  bpl_exists(list<bound_var>, list<trigger>, expression)

bound_var      ::=  typedident

trigger        ::=  list<expression>

 

list<T>        ::=  bpl_list_create() | bpl_list_append(list<T>, T)

 

string         ::=  “BoogieIdentifier“

boolean        ::=  true | false

integer        ::=  0 | 1 | 2 | ...

 

 

 

Notes on Specific Elements:

 

string

Every string identifier is delimited by “...”, and must conform the rules that apply for Boogie language identifiers.

 

boolean

Boolean literals are not delimited by “...”. Thus, while implementing the parser, you need to define true and false, for example, as constants in your programming language.

 

integer

In Boogie101, all numerical literals are scalars. All integers literals are positive; negative integers -N are represented by the expression 0-N (represented by bpl_apply). There is no floating-point literal.

 

typedecl, type

A type declaration is needed to be able to reconstruct a Boogie program out of Boogie101 program. In Boogie, every type must be declared at the top level.

Each type is declared with an arity (integer). The arity indicates how type arguments are needed when instatiating the declared type.

There are two types every Boogie101 program should always declare: int and bool. Both int and bool have arity 0.

There is also the “map” type, which is used without being declared. This is because a map type can be instantiated with an arbitrary number of type arguments.

A map type with N arguments refers to maps that take (N-1) index values and return 1 result value.

Example:

The following declares a type named “Set” with arity 1.

bpl_typedecl(“Set”, 1)

The type Set has arity 1, so it takes another type argument. The following instantiates an integer set:

bpl_type(“Set”, bpl_list_append(bpl_list_create(), bpl_type(“int”, bpl_list_create())

 

function

Each function is declared with only input types and (one) output type.

Example:

The following declares a function that takes an integer and returns the singleton set containing only that integer.

bpl_function(“Singleton”,

bpl_list_append(

bpl_type(“int”, bpl_list_create()),

bpl_bpl_list_create()),

bpl_type(“Set”, bpl_list_append(

bpl_list_create(),

bpl_type(“int”, bpl_list_create()))))

 

Although Boogie programs may contain constants, Boogie101 programs do not. Instead, we will represent constants with functions with input_types empty. For example, the following declares the constant representing the maximum integer:
bpl_function(“MAX_INT”, bpl_list_create(), bpl_type(“int”, bpl_list_create())

 

axiom

An axiom is an expression that cannot contain free variables. We will explain the form of expressions below. Axioms are used to give semantics to functions. For example, suppose that we define a function “abs” that returns the absolute value of an integer. The following axioms (in Boogie syntax) complete the definition of abs.
function abs (int): int;
axiom (forall x: int :: abs(x) >= 0)
axiom (forall x: int :: x >= 0 ==> abs(x) == x)axiom (forall x: int :: x < 0 ==> abs(x) == 0-x)

 

procedure 

The specification of each procedure has three parts: Pre-conditions, post-conditions, and modifies, each of which is given as a list.

Pre- and post-conditions are defined using spec constructs. Each modifies is a global variable name.

Input parameters, output parameters and locals together form the local variables of the procedure.

 

spec

A specification is either a pre-condition or a post-condition. Each specification consists of two parts: An expression that evaluates to a boolean value, and a boolean indicating whether the specification is free or not.

A free specification is used for modular verification without being checked. That is, a free pre-condition is not asserted at the call points (but still assumed at the beginning of the callee), and a free post-condition is not asserted at the end of the procedure (but still assumed after each call to the procedure).

 

typedident

A typed identifier is a (name, type) pair and used to define a variable with a type.

 

block

A block represents a sequence of Boogie commands ending with a transfer command. Each block is referred to using a string label.

 

command

There are five types of commands: assignment, assume, assert, havoc, and procedure call.

In assignment, the left-hand-side of the command is the (string) name of the variable being assigned. The right-hand-side of the command an expression that evaluates to a value of the same type as lhs.

Assume and assert commands take an expression that evaluates to a boolean value.

Havoc command takes the (string) name of a variable which is assigned a nondeterministic value.

In procedure call, each input argument is an expressions and each output argument is the (string) name of a variable to which the value of an output variables of the callee is written upon the end of the call.

 

goto

Each block contains a goto command. The goto command contains zero or more block labels, say target_labels.

  • If target_labels is empty, this indicates a procedure return (there is no other return command).
  •  If target_labels contains only a single label, the execution after the current block continues with indicated block.
  •  If target_labels contains multiple labels, the execution after the current block continues with one of the indicated blocks nondeterministically.

 

expression

There are four kinds of expressions: Literals (integer or boolean), identifier expressions (variable names), function/operator applications, and quantified expressions (forall or exist)

 

ident

An identifier expression refers to a local variable defined either in the containing procedure or a global variable.

 

trigger

Each quantified expression (forall or exists) contains a (possibly empty) list of triggers. Each trigger consists of a list of expressions. Each such expression refers to the variables visible to the quantified expression (including bound variables) For this assignment, you are not required to make an effective use of triggers. But, you are required to preserve the triggers along with the quantified expression they belong to, and apply transformations (such as substitution) to the triggers accordingly.

 

apply

Every nary function/operator application is expressed using the same format: bpl_apply(function_name, arguments), where function_name is the (string) name of the function or an n-ary operator being applied. Operators consists of integer operators (e.g., +, -, /), as well as boolean operators (e.g., &&, ||, ==>), and the unary operator "!". Arguments is a list of expressions on which the function/operation is applied.

Examples:

Sum of variable x and 2 (x + 2)
bpl_apply(“+”, bpl_list_append(bpl_list_append(bpl_list_create(), bpl_ident(“x”)), bpl_int(2))))

Implication (x && y) ==> z
bpl_apply(“==>”, bpl_list_append(bpl_list_append(bpl_list_create(),

bpl_apply(“&&”, bpl_list_append(bpl_list_append (bpl_list_create(), bpl_ident(“x”)), bpl_ident(“y”))),

bpl_ident(“z”))))

 

"old"

     Old function is used only in procedure post-conditions to refer to the value of a variable at the beginning of the procedure. Usually the argument is a identifier (e.g., old(x)), but it may also take map select expressions (e.g., old(M[i])). (Note that, map select expressions appear in Boogie101 as bpl_apply with "MapSelect" as the function name).

 

typecoercion

     Differently from other functions/operators, which take expression arguments, type coercion takes one type and one expression argument. Since you will not be doing transformations on types, you can just leave the type argument as is and perform other kinds of transformations on the expression argument.

 

 

SMT Solver Interface

 

You are not required to directly interact with Z3 or any other SMT solver. Instead, if you want to check the validity of a formula, say P, you should do the following:

Generate a Boogie101 program with the following content:

  • All type and function declarations referred to in P.
  • May contain global variables.
  • No axioms.
  • A procedure with the following content:
    • No pre- or post-conditions and no modifies.
    • No input or output parameters, but only local variables.
    • Only one block with only an assert command with condition P, and a return as the transfer command.

Save the program in a file.

Use the boogie101solver.py script as follows:

python boogie101solver.py boogie101_file_path model_file_path
 

The script writes its output into a (text) file (model_file_path). The output file contains the result of the SMT solving, i.e., whether the assertion in each procedure in the input program is valid or not. In the output model file, there is an entry for each procedure as follow:

ProcedureName:Outcome

 

Outcome is one of the following: {valid, invalid, timeout, outofmemory, inconclusive}

Outcome valid indicates that the assertion in the given procedure is valid, and outcome invalid indicates that that assertion is invalid. 

 

There is no restriction on the number of procedures in the input file to boogie101solver.py (as long as they obey the format above). Thus, you can check the validity of multiple formulas by a single call to the script by adding a separate procedure for each formula to the input Boogie101 program.

 

 

Bonus feature (extra credit)

 

Students that provide the following feature will get extra credit.

In the output of your verifier, if the verification of a procedure fails, report why the verification failed. The report should list all the specifications (assertions) violated including the kind of the specification (assertion, loop invariant, a procedure pre-condition violated at a call site, or a procedure post-condition violated at the return point). The report looks like the following:

     ProcedureName:Failed(Reason)

          Errors:

    • Assertion violated: Expression

    • Loop invariant violated: Expression

    • Pre-condition of CalleeName violated: Expression

    • Post-condition violated: Expression

 

Note: The expressions must be in a human readable syntax like the Boogie syntax. You can use the script boogie101exprtranslator.py (described above) to translate a Boogie101 expression to a Boogie-like expression, or you can translate the expression yourself.

 

In order to implement the bonus feature, you will need to parse the model output by Boogie whenever the verification of an assertion fails.

For this, we provide the following script.

 

python boogie101solver_with_model.py boogie101_file_path model_file_path

 

The format of the output file model_file_path is similar to the output of boogie101solver.py explained above, except that if the verification fails for a procedure, the entry for that procedure looks like the following:

ProcedureName:invalid

*** MODEL

constant*

function*

*** END_MODEL

 

Syntax of model elements:

element  ::= constant | function

constant ::= variable_name -> constant_value

function ::= function_name { 

                (element* -> element)*

               else -> element

             }

             

 

 

Due Date

 

 

October 14, 2011

 

 

How to submit?

 

 

You should email ksen@cs.berkeley.edu a link to the entire archive before the due date along with instructions to run it.  Ideally, I will run the run-boogie.bat script with a Boogie file as input and see if your implementation verifies (or detects assertion violations in) exactly the same set of procedures as the actual Boogie tool. Please come to my office on October 14, 2011 afternoon to give me a demo. 

 

Comments (0)

You don't have permission to comment on this page.