Revision 11 (Sergey Smolov, 12/06/2019 05:00 PM) → Revision 12/14 (Sergey Smolov, 12/06/2019 06:39 PM)
h1. Overview

{{toc}}

h2. Basic Concepts

Fortress provides a Java API for generating _pseudorandom values_ that satisfy certain _constraints_. At the logical level, a constraint is represented by a set of expressions that specify limitations on input values (_assertions_ that must hold for those values). If there are values satisfying all of the specified assertions, they are used as a solution for the constraint. If many values satisfy the constraint, specific values are selected at random from the range of possible solutions.

From an implementation point of view, the API is a wrapper around a freely distributed _SMT solver_ engine (the current version supports the following solvers: "Yices":https://github.com/SRI-CSL/yices2, "Z3":https://github.com/Z3Prover/z3, "CVC4":https://cvc4.github.io). It can be extended to support other solver engines and makes it possible to interact with different solver engines in a uniform way. It also facilitates creating task-specific custom solvers and extending the functionality of existing solver engines by adding custom operations (macros based on built-in operations).

h2. SMT-LIB

SMT solvers use a special functional language to specify constraints. The library components generate constructions in the "SMT-LIBv2":https://stp.readthedocs.io/en/latest/smt-input-language.html language and run a solver to process them and produce the results (find values of unknown input variables).

h2. Constraints and SMT

Constraints (the so-called _SMT model_) are represented by a set of assertions that must be satisfied. An SMT solver checks the satisfiability of the model and suggests a _solution_ (variable values) that satisfies the model. In the example below, we specify a model that should help us create a test that will cause a MIPS processor to generate an exception.
We want to find values of the _rs_ and _rt_ general-purpose registers that will cause the _ADD_ instruction to raise an integer overflow exception. They must be valid 32-bit signed integers that are not equal to each other. Here is the SMT-LIBv2 code:

<pre>
(define-sort Int_t () (_ BitVec 64))

(define-fun INT_ZERO () Int_t (_ bv0 64))
(define-fun INT_BASE_SIZE () Int_t (_ bv32 64))
(define-fun INT_SIGN_MASK () Int_t (bvshl (bvnot INT_ZERO) INT_BASE_SIZE))

(define-fun IsValidPos ((x!1 Int_t)) Bool (ite (= (bvand x!1 INT_SIGN_MASK) INT_ZERO) true false))
(define-fun IsValidNeg ((x!1 Int_t)) Bool (ite (= (bvand x!1 INT_SIGN_MASK) INT_SIGN_MASK) true false))
(define-fun IsValidSignedInt ((x!1 Int_t)) Bool (ite (or (IsValidPos x!1) (IsValidNeg x!1)) true false))

(declare-const rs Int_t)
(declare-const rt Int_t)

; rt and rs must contain valid sign-extended 32-bit values (bits 63..31 equal)
(assert (IsValidSignedInt rs))
(assert (IsValidSignedInt rt))

; the condition for an overflow: the summation result is not a valid sign-extended 32-bit value
(assert (not (IsValidSignedInt (bvadd rs rt))))

; just in case: rs and rt are not equal (to make the results more interesting)
(assert (not (= rs rt)))

(check-sat)

(echo "Values that lead to an overflow:")
(get-value (rs rt))
</pre>

h3. SMT Limitations

# *Recursion is not allowed* in SMT-LIB. At least, this applies to the Z3 implementation. In other words, code like the following is not valid:

<pre>
(define-fun fact ((x Int)) Int (ite (= x 0) 1 (fact (- x 1))))
(simplify (fact 10))
</pre>

h3. Constraints in XML

Constraints can also be described in XML. The library provides functionality to load and save constraints in XML. Here is an example of an XML document describing a simple constraint.
<pre><code class="xml">
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Constraint version="1.0">
    <Name>SimpleBitVector</Name>
    <Description>SimpleBitVector constraint</Description>
    <Solver id="Z3_TEXT"/>
    <Signature>
        <Variable length="3" name="a" type="BIT_VECTOR" value=""/>
        <Variable length="3" name="b" type="BIT_VECTOR" value=""/>
    </Signature>
    <Syntax>
        <Formula>
            <Expression>
                <Operation family="ru.ispras.fortress.expression.StandardOperation" id="NOT"/>
                <Expression>
                    <Operation family="ru.ispras.fortress.expression.StandardOperation" id="EQ"/>
                    <VariableRef name="a"/>
                    <VariableRef name="b"/>
                </Expression>
            </Expression>
        </Formula>
        <Formula>
            <Expression>
                <Operation family="ru.ispras.fortress.expression.StandardOperation" id="EQ"/>
                <Expression>
                    <Operation family="ru.ispras.fortress.expression.StandardOperation" id="BVOR"/>
                    <VariableRef name="a"/>
                    <VariableRef name="b"/>
                </Expression>
                <Value length="3" type="BIT_VECTOR" value="111"/>
            </Expression>
        </Formula>
        <Formula>
            <Expression>
                <Operation family="ru.ispras.fortress.expression.StandardOperation" id="EQ"/>
                <Expression>
                    <Operation family="ru.ispras.fortress.expression.StandardOperation" id="BVLSHL"/>
                    <VariableRef name="a"/>
                    <Value length="3" type="BIT_VECTOR" value="011"/>
                </Expression>
                <Expression>
                    <Operation family="ru.ispras.fortress.expression.StandardOperation" id="BVSMOD"/>
                    <VariableRef name="a"/>
                    <Value length="3" type="BIT_VECTOR" value="010"/>
                </Expression>
            </Expression>
        </Formula>
        <Formula>
            <Expression>
                <Operation family="ru.ispras.fortress.expression.StandardOperation" id="EQ"/>
                <Expression>
                    <Operation family="ru.ispras.fortress.expression.StandardOperation" id="BVAND"/>
                    <VariableRef name="a"/>
                    <VariableRef name="b"/>
                </Expression>
                <Value length="3" type="BIT_VECTOR" value="000"/>
            </Expression>
        </Formula>
    </Syntax>
</Constraint>
</code></pre>

The same constraint described in SMT-LIBv2 looks like this:

<pre>
(declare-const a (_ BitVec 3))
(declare-const b (_ BitVec 3))
(assert (not (= a b)))
(assert (= (bvor a b) #b111))
(assert (= (bvand a b) #b000))
(assert (= (bvshl a (_ bv3 3)) (bvsmod a (_ bv2 3))))
(check-sat)
(get-value (a b))
(exit)
</pre>

As can be seen, the XML description is more verbose. However, this format is independent of a particular solver engine and can be extended with additional information.

h2. Tree Representation

The Fortress library uses context-independent syntax trees to represent constraints. These trees are used to generate a representation that can be understood by a particular SMT solver. The syntax tree contains nodes of the following types:

# *Constraint*. This is the root node of the tree. It holds the list of unknown variables and the list of assertions (formulas) for these variables.
# *Formula*. Represents an assertion expression. Can be combined with other formulas to build a more complex expression (by applying logic _OR_, _AND_ or _NOT_ to it). The underlying expression must be a logic expression that evaluates to true or false.
# *Operation*. Represents an operation on some variables, values, parameters or other operations.
# *Variable*. Represents an input variable. It can have an assigned value and, in such a case, is treated as a value. Otherwise, it is an unknown variable. A variable includes a type as an attribute.
# *Value*. Specifies a known value of the specified type, which can be accessed as an attribute.

Note: Operation, Variable and Value are designed to be treated polymorphically. This allows combining them to build complex expressions.

h2. Constraint Solver Java Library Packages

The Constraint Solver subsystem of the Fortress library is implemented in Java. The source code files are located in the "microtesk++/constraint-solver" folder. The Java classes are organized in the following packages:

# ru.ispras.fortress.calculator - constant sub-expressions simplifier
# ru.ispras.fortress.data - data types (basic, bit vectors, arrays, etc.)
# ru.ispras.fortress.esexpr - Lisp-like S-expressions
# ru.ispras.fortress.expression - classes implementing syntax tree nodes and operation identifiers
# ru.ispras.fortress.jaxb - XML integration
# ru.ispras.fortress.logic - normal form orthogonalizer
# ru.ispras.fortress.randomizer - randomization engine
# ru.ispras.fortress.solver - SMT solver integration
# ru.ispras.fortress.transformer - syntax tree transformation rules and mechanisms
# ru.ispras.fortress.util - utility methods

h3. Core classes/interfaces

*Syntax Tree Implementation*

The syntax tree nodes are implemented in the following classes:

* Constraint. Parameterized by a collection of Variable objects and a collection of Formula objects.
* Formulas. Parameterized by a collection of syntax tree node objects.
* Operation. Implements SyntaxElement. Parameterized by operands and an operation type.
* Variable. Implements SyntaxElement. Parameterized by the variable name string, a data type object implementing DataType and a value.
* Value. Implements SyntaxElement. Parameterized by a data type object implementing DataType and a value.

The SyntaxElement interface provides the ability to combine different kinds of elements into expressions.

The current implementation supports operations with the following data types: (1) bit vectors, (2) Booleans, (3) integers, (4) real numbers, (5) arrays. Bit vectors and Booleans are implemented in the BitVector and LogicBoolean classes.
The BitVectorOperation and LogicBooleanOperation classes specify the operations supported for these types. For example, the LogicBooleanOperation class looks like this:

<pre><code class="java">
public final class LogicBooleanOperation extends OperationType {
  private LogicBooleanOperation() {}

  /** Operation: Logic - Equality */
  public static final OperationType EQ = new LogicBooleanOperation();
  /** Operation: Logic - AND */
  public static final OperationType AND = new LogicBooleanOperation();
  /** Operation: Logic - OR */
  public static final OperationType OR = new LogicBooleanOperation();
  /** Operation: Logic - NOT */
  public static final OperationType NOT = new LogicBooleanOperation();
  /** Operation: Logic - XOR */
  public static final OperationType XOR = new LogicBooleanOperation();
  /** Operation: Logic - Implication */
  public static final OperationType IMPL = new LogicBooleanOperation();
}
</code></pre>

The code below demonstrates how to build a syntax tree representation for the integer overflow constraint:

<pre><code class="java">
import java.util.ArrayList;
import java.util.List;

public class IntegerOverflowBitVectorTestCase extends GenericSolverTestBase {
  public IntegerOverflowBitVectorTestCase() {
    super(new IntegerOverflow());
  }

  public static class IntegerOverflow implements SampleConstraint {
    private static final int BIT_VECTOR_LENGTH = 64;
    private static final DataType BIT_VECTOR_TYPE = DataType.bitVector(BIT_VECTOR_LENGTH);

    private static final NodeValue INT_ZERO = new NodeValue(BIT_VECTOR_TYPE.valueOf("0", 10));
    private static final NodeValue INT_BASE_SIZE = new NodeValue(BIT_VECTOR_TYPE.valueOf("32", 10));

    // Mask of the upper 32 bits: (bvshl (bvnot INT_ZERO) INT_BASE_SIZE)
    private static final NodeOperation INT_SIGN_MASK =
        Nodes.bvlshl(Nodes.bvnot(INT_ZERO), INT_BASE_SIZE);

    @Override
    public Constraint getConstraint() {
      final ConstraintBuilder builder = new ConstraintBuilder();

      builder.setName("IntegerOverflow");
      builder.setKind(ConstraintKind.FORMULA_BASED);
      builder.setDescription("IntegerOverflow constraint");

      // Unknown variables
      final NodeVariable rs = new NodeVariable(builder.addVariable("rs", BIT_VECTOR_TYPE));
      final NodeVariable rt = new NodeVariable(builder.addVariable("rt", BIT_VECTOR_TYPE));

      final Formulas formulas = new Formulas();
      builder.setInnerRep(formulas);

      // rs and rt must be valid sign-extended values
      formulas.add(newIsValidSignedInt(rs));
      formulas.add(newIsValidSignedInt(rt));

      // Overflow condition: the sum is not a valid sign-extended value
      formulas.add(Nodes.not(newIsValidSignedInt(Nodes.bvadd(rs, rt))));

      // rs and rt are not equal
      formulas.add(Nodes.not(Nodes.eq(rs, rt)));

      return builder.build();
    }

    private NodeOperation newIsValidPos(final Node arg) {
      return Nodes.eq(Nodes.bvand(arg, INT_SIGN_MASK), INT_ZERO);
    }

    private NodeOperation newIsValidNeg(final Node arg) {
      return Nodes.eq(Nodes.bvand(arg, INT_SIGN_MASK), INT_SIGN_MASK);
    }

    private NodeOperation newIsValidSignedInt(final Node arg) {
      return Nodes.or(newIsValidPos(arg), newIsValidNeg(arg));
    }

    @Override
    public Iterable<Variable> getExpectedVariables() {
      final List<Variable> result = new ArrayList<>();

      result.add(new Variable("rs", BIT_VECTOR_TYPE.valueOf("000000009b91b193", 16)));
      result.add(new Variable("rt", BIT_VECTOR_TYPE.valueOf("000000009b91b1b3", 16)));

      return result;
    }
  }
}
</code></pre>

*Representation Translation*

The logic that translates a tree representation into an SMT representation is implemented as follows. Methods of the Translator class traverse the constraint syntax tree and use methods of the RepresentationBuilder interface to translate information about its nodes into a representation that can be understood by a particular solver. The RepresentationBuilder interface looks as follows:

<pre><code class="java">
public interface RepresentationBuilder {
  public void addVariableDeclaration(Variable variable);

  public void beginConstraint();
  public void endConstraint();

  public void beginFormula();
  public void endFormula();

  public void beginExpression();
  public void endExpression();

  public void appendValue(Value value);
  public void appendVariable(Variable variable);
  public void appendOperation(OperationType type);
}
</code></pre>

*Solver Implementation*

Solvers use the Translator class and a specific implementation of the RepresentationBuilder interface to generate an SMT representation of a constraint. Then they run a solver engine to solve the constraint and produce the results.
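The translation step described above can be illustrated with a minimal, self-contained sketch. It does not use the Fortress classes: the @SmtLibSketch@ class, the @Expr@ interface and the @var@, @bv@, @op@ and @translate@ helpers are hypothetical names introduced only for this illustration. The sketch walks a tiny expression tree and emits SMT-LIBv2 text for part of the SimpleBitVector constraint, which is roughly what a solver-specific builder is expected to produce.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these types belong to Fortress.
final class SmtLibSketch {
  /** A tree node that knows how to print itself as an SMT-LIBv2 term. */
  interface Expr {
    String toSmt();
  }

  /** Leaf node: a reference to a declared variable. */
  static Expr var(final String name) {
    return () -> name;
  }

  /** Leaf node: a bit-vector constant, printed as (_ bvN width). */
  static Expr bv(final long value, final int width) {
    return () -> "(_ bv" + value + " " + width + ")";
  }

  /** Inner node: an operation applied to its operands, printed in prefix form. */
  static Expr op(final String name, final Expr... args) {
    return () -> {
      final StringBuilder sb = new StringBuilder("(").append(name);
      for (final Expr arg : args) {
        sb.append(' ').append(arg.toSmt());
      }
      return sb.append(')').toString();
    };
  }

  /** Emits declarations, one assert per formula, and a final check-sat. */
  static String translate(
      final List<String> variables, final int width, final List<Expr> assertions) {
    final StringBuilder sb = new StringBuilder();
    for (final String name : variables) {
      sb.append("(declare-const ").append(name)
        .append(" (_ BitVec ").append(width).append("))\n");
    }
    for (final Expr assertion : assertions) {
      sb.append("(assert ").append(assertion.toSmt()).append(")\n");
    }
    return sb.append("(check-sat)\n").toString();
  }

  /** Builds three of the SimpleBitVector formulas and translates them. */
  static String example() {
    final Expr a = var("a");
    final Expr b = var("b");
    return translate(
        Arrays.asList("a", "b"),
        3,
        Arrays.asList(
            op("not", op("=", a, b)),
            op("=", op("bvor", a, b), bv(7, 3)),
            op("=", op("bvand", a, b), bv(0, 3))));
  }

  public static void main(final String[] args) {
    System.out.print(example());
  }
}
```

In this sketch each node prints itself, whereas the design described above keeps the tree solver-independent and delegates printing to a separate builder. That separation is what allows the same tree to be translated for different solver engines.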
Solvers implement a common interface called Solver that looks like this:

<pre><code class="java">
public interface Solver {
  String getName();
  String getDescription();

  boolean isSupported(ConstraintKind kind);
  boolean isGeneric();

  SolverResult solve(Constraint constraint);

  boolean addCustomOperation(Function function);
  boolean addCustomOperation(FunctionTemplate template);

  String getSolverPath();
  void setSolverPath(String value);
}
</code></pre>
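As a cross-check of the integer overflow example, the predicates of its SMT model can be evaluated directly on 64-bit Java @long@ values, without any solver. The sketch below is not part of Fortress; @OverflowModelCheck@ and its methods are hypothetical names. It mirrors @IsValidPos@, @IsValidNeg@ and @IsValidSignedInt@ from the model and applies them to the expected values of _rs_ and _rt_ listed in @getExpectedVariables@.

```java
// Hypothetical illustration only: a plain-Java restatement of the SMT model.
final class OverflowModelCheck {
  // Same value as INT_SIGN_MASK in the model: (bvshl (bvnot 0) 32),
  // that is, the upper 32 bits set.
  static final long INT_SIGN_MASK = ~0L << 32;

  static boolean isValidPos(final long x) {
    return (x & INT_SIGN_MASK) == 0L;
  }

  static boolean isValidNeg(final long x) {
    return (x & INT_SIGN_MASK) == INT_SIGN_MASK;
  }

  static boolean isValidSignedInt(final long x) {
    return isValidPos(x) || isValidNeg(x);
  }

  /** True if (rs, rt) satisfies all four assertions of the model. */
  static boolean satisfiesModel(final long rs, final long rt) {
    return isValidSignedInt(rs)
        && isValidSignedInt(rt)
        // long addition wraps modulo 2^64, like bvadd on 64-bit vectors
        && !isValidSignedInt(rs + rt)
        && rs != rt;
  }

  public static void main(final String[] args) {
    System.out.println(satisfiesModel(0x9b91b193L, 0x9b91b1b3L));
    System.out.println(satisfiesModel(1L, 2L));
  }
}
```

A check like this only confirms that a given pair of values satisfies the assertions as they are written in the model. Finding such values is still the job of the solver.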