Alexander Kamkin, 03/28/2012 02:44 PM

h1. Whitepaper

h2. Introduction

This document briefly describes the basic facilities of C++TESK, a C++ based toolkit for simulation-based hardware verification. The toolkit implements the model-based approach to verification of HDL models: all verification tasks, such as stimulus generation, reaction checking and coverage tracking, are carried out using a model of the design. The key feature of C++TESK is scalability: the toolkit can handle complex designs by using abstract models and/or by parallelizing verification on computer clusters.

Simulation-based verification is the main means of ensuring the functional correctness of hardware designs of industrial size and complexity. Many methods, tools and technologies have appeared to cope with ever-growing system complexity and to bring a higher level of automation to verification. C++TESK, developed at the "Institute for System Programming of the Russian Academy of Sciences":http://www.ispras.ru (ISPRAS), is one of them. Like other tools for creating verification environments (so-called testbenches), it provides facilities for defining stimulus generators, reaction checkers and coverage trackers.

C++TESK is a C++ based toolkit, which means that testbench components are developed in pure C++ (the toolkit’s core is an open-source C++ library). In this regard, it is similar to "SystemC":http://www.systemc.org, but it is simpler and has a more specialized application domain. C++ was chosen for several reasons.

* First, most engineers are familiar with it.
* Second, microprocessor instruction set simulators (ISS) are usually written in C/C++, so ISS components can be reused as reference models for verification.
* Third, many C++ programming tools (compilers, debuggers, profilers, etc.) and libraries ("STL":http://www.cplusplus.com/reference/stl, "Boost":http://www.boost.org, etc.) are freely available.

Besides the usability gained by employing C++, the toolkit has a number of advantages compared with existing solutions. These include hardware modeling at different abstraction levels (from a functional, untimed level to a cycle-accurate one), automated generation of stimulus sequences based on state graph exploration (state enumeration), support for user-defined coverage criteria including temporal coverage, and verification parallelization on computer clusters (using distributed graph exploration). The rest of the document describes the C++TESK facilities in a bit more detail.

h2. Hardware Modeling and Reaction Checking

The central part of C++TESK is a library of hardware modeling primitives. The library allows developing reference models of hardware designs at different abstraction levels (untimed, time-approximate and time-accurate models) and composing complex models from simple ones using data transmission channels (thus, C++TESK supports transaction-level modeling, TLM). A hardware component is modeled as a class that declares input and output interfaces and stimulus processing operations. An example is given below (C++TESK macros are written in capital letters).

<pre><code class="cpp">
MODEL(MyModel) {
public:
    DECLARE_INPUT   (in_iface);        // input interface(s)
    DECLARE_OUTPUT  (out_iface);       // output interface(s)
    DECLARE_STIMULUS(operation);       // operation(s)
    ...
};
</code></pre>

Stimulus processing operations are modeled as methods with a fixed signature (an input interface and a message). Within an operation, in addition to ordinary C++ statements, special constructs are used to model time and reaction dispatching.

<pre><code class="cpp">
DEFINE_STIMULUS(MyModel::operation) {
    START_STIMULUS();                  // starts an operation
    ...
    CYCLE();                           // emulates a one-cycle time delay
    SEND_REACTION(out_iface, out_msg); // produces a model reaction
    STOP_STIMULUS();                   // stops an operation
}
</code></pre>

A reference model is adapted for co-simulation with the design under verification (DUV) in a descendant class (a so-called model adapter) by defining input and output interface adapters. An input interface adapter (launched in @START_STIMULUS@) serializes the input message into a sequence of input signals distributed in time. An output interface adapter (launched in @SEND_REACTION@) waits until either the design reaction is detected or the time limit is reached, and then deserializes the output signals into the output message.
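
The serialization performed by interface adapters can be illustrated with a minimal sketch in plain C++. It does not use the C++TESK API: the names @SignalTrace@, @serialize@ and @deserialize@, the 8-bit signal width and the four-cycle, low-byte-first protocol are all assumptions made for illustration. A real adapter would also drive control signals and wait for the design reaction or a timeout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry per clock cycle: the value driven on a hypothetical 8-bit data signal.
using SignalTrace = std::vector<std::uint8_t>;

// Input adapter side: spreads a 32-bit message over four cycles, low byte first.
SignalTrace serialize(std::uint32_t message) {
    SignalTrace trace;
    for (int cycle = 0; cycle < 4; ++cycle) {
        trace.push_back(static_cast<std::uint8_t>((message >> (8 * cycle)) & 0xFFu));
    }
    return trace;
}

// Output adapter side: rebuilds the message from the four sampled cycles.
std::uint32_t deserialize(const SignalTrace& trace) {
    assert(trace.size() == 4);  // the assumed protocol always uses four cycles
    std::uint32_t message = 0;
    for (std::size_t cycle = 0; cycle < trace.size(); ++cycle) {
        message |= static_cast<std::uint32_t>(trace[cycle]) << (8 * cycle);
    }
    return message;
}
```

Under these assumptions, @deserialize(serialize(m))@ returns @m@ for any 32-bit message @m@.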

Roughly speaking, reaction checking is done as follows. Every time a model calls @SEND_REACTION@, it puts a model reaction into the reaction queue associated with the corresponding output interface and returns. When a design reaction is received, it should be associated with one of the model reactions stored in the reaction queue (if the model is accurate, the queue should contain exactly one model reaction; for abstract models there can be several). The model reaction that corresponds to a design reaction is chosen by the output interface’s reaction arbiter. As soon as the correspondence is found, the model and design reactions are compared. If they are not equal, a bug is reported.

Reaction arbitration is a powerful technique that makes it possible to use abstract, order-inaccurate reference models for simulation-based verification. C++TESK has a library of ready-to-use reaction arbiters covering various verification purposes. The simplest one is a FIFO arbiter, which chooses the first model reaction stored in the reaction queue.
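
The checking scheme described above can be sketched in plain C++ without the toolkit. The names below (@Reaction@, @ReactionQueue@, @fifo_arbiter@, @check_design_reaction@, @Verdict@) are hypothetical and are not part of the C++TESK API; the sketch only illustrates the idea of a reaction queue, a FIFO arbiter and the comparison of model and design reactions.

```cpp
#include <cassert>
#include <deque>

// Hypothetical reaction type: a plain payload standing in for an output message.
struct Reaction {
    int data;
    bool operator==(const Reaction& other) const { return data == other.data; }
};

// Model reactions waiting for a design reaction (one queue per output interface).
using ReactionQueue = std::deque<Reaction>;

enum class Verdict { Ok, Mismatch, Unexpected };

// FIFO arbiter: chooses the oldest model reaction; the queue must not be empty.
ReactionQueue::iterator fifo_arbiter(ReactionQueue& queue) {
    assert(!queue.empty());
    return queue.begin();
}

// Associates a design reaction with a model reaction and compares the two.
Verdict check_design_reaction(ReactionQueue& queue, const Reaction& design) {
    if (queue.empty()) {
        return Verdict::Unexpected;  // the model predicted no reaction at all
    }
    const ReactionQueue::iterator chosen = fifo_arbiter(queue);
    const Reaction model = *chosen;  // copy before the element is removed
    queue.erase(chosen);
    return model == design ? Verdict::Ok : Verdict::Mismatch;
}
```

An arbiter for an order-inaccurate model could instead search the queue for a reaction equal to the design reaction; in this sketch only @fifo_arbiter@ would have to change.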

h2. Scenario Description and Stimulus Generation

A verification scenario in C++TESK is specified as a state machine whose states correspond to abstract states of the DUV and whose transitions are stimuli. A special component, called the engine or traverser, interprets the scenario and generates a stimulus sequence by exploring the corresponding state graph (the goal is to try each transition in each state reachable from the initial one). A scenario is described in a separate class; transition specifications are grouped into so-called scenario methods.

<pre><code class="cpp">
SCENARIO(MyScenario) {
    MyStateType get_state();            // scenario state
    bool scenario_method(Context &ctx); // scenario method(s)
    ...
    MyModel &duv;                       // model adapter
};
</code></pre>

Each scenario method should be organized as a co-routine: it iterates over stimulus parameters and applies a stimulus. After each invocation, it should return control to the engine. An example of a scenario method is given below.

<pre><code class="cpp">
bool MyScenario::scenario_method(Context& ctx) {
    IBEGIN // enters an iteration section
    // IVAR(x) accesses iteration variable named x
    for(IVAR(x) = -1; IVAR(x) <= 1; IVAR(x)++) {
        IACTION {
            MyMessage in_msg(x);  // applies a stimulus
            START_STIMULUS_OF(duv, MyModel::operation, duv.in_iface, in_msg);
            YIELD(duv.verdict()); // returns a verdict
        }
    }
    IEND   // exits an iteration section
}
</code></pre>

Note that the C++TESK graph exploration engine supports non-deterministic state machines, which is especially important when abstract reference models are used (abstraction is a frequent cause of non-determinism). Besides the graph-based engine, C++TESK has an engine that constructs a stimulus sequence using randomization techniques.
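
The goal of the graph-based engine, trying each stimulus in each reachable state, can be sketched as a breadth-first exploration. This is a simplification under stated assumptions: the state machine is deterministic and finite, and the explorer may return to any known state at will, whereas a real engine has to reach states by applying stimuli to the DUV. The names @explore@, @Transition@ and @ExplorationResult@ are hypothetical and do not come from C++TESK.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <set>
#include <utility>
#include <vector>

using State = int;
using Stimulus = int;

// Hypothetical deterministic scenario: the state reached by applying a stimulus in a state.
using Transition = std::function<State(State, Stimulus)>;

struct ExplorationResult {
    std::set<State> states;                        // all states reached
    std::vector<std::pair<State, Stimulus>> arcs;  // every (state, stimulus) pair tried
};

// Tries every stimulus in every state reachable from the initial one.
// Terminates only if the set of reachable states is finite.
ExplorationResult explore(State initial, const std::vector<Stimulus>& stimuli,
                          const Transition& next) {
    assert(static_cast<bool>(next));  // the transition function must be set
    ExplorationResult result;
    std::queue<State> frontier;
    result.states.insert(initial);
    frontier.push(initial);
    while (!frontier.empty()) {
        const State current = frontier.front();
        frontier.pop();
        for (Stimulus stimulus : stimuli) {
            result.arcs.emplace_back(current, stimulus);
            const State target = next(current, stimulus);
            if (result.states.insert(target).second) {
                frontier.push(target);  // first visit: explore it later
            }
        }
    }
    return result;
}
```

For a saturating counter with states 0 to 2 and stimuli -1, 0 and 1, the sketch finds 3 states and tries 9 (state, stimulus) pairs.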

h2. Coverage Definition and Tracking

C++TESK supports user-defined test coverage, which is described in a reference model and tracked during simulation. The resulting coverage is summarized in verification reports. The coverage structure is specified using a set of macros. As an example, consider sign coverage with three elements (negative, zero and positive).

<pre><code class="cpp">
DEFINE_ENUMERATED_COVERAGE(SignCoverage, "Sign coverage", (
    (NEGATIVE, "Negative"), // coverage item: an identifier
    (ZERO,     "Zero"),     // and a human-readable name
    (POSITIVE, "Positive")  // used in coverage reports
));
</code></pre>

The toolkit implements several operations on coverage structures: aliasing, composition and partial composition. Aliasing constructs a coverage structure with a different element type but the same coverage elements (i.e. the identifiers and human-readable names are the same). Composition builds the Cartesian product of two coverage structures: the elements of the composed coverage are ordered pairs of elements of the operand structures. Unreachable elements can be excluded from the coverage using partial composition.
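
Composition and partial composition can be illustrated with a minimal sketch in plain C++. It does not reproduce the C++TESK coverage macros: the types @Element@, @Coverage@ and @PairCoverage@ and the functions @compose@ and @compose_partial@ are hypothetical, and modeling partial composition as a filter over the Cartesian product is an assumption about how unreachable pairs might be excluded.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical coverage element: an identifier and a human-readable name.
struct Element {
    std::string id;
    std::string name;
};

using Coverage = std::vector<Element>;
using PairCoverage = std::vector<std::pair<Element, Element>>;
using Reachable = std::function<bool(const Element&, const Element&)>;

// Partial composition: the Cartesian product without the pairs rejected by `reachable`.
PairCoverage compose_partial(const Coverage& a, const Coverage& b, const Reachable& reachable) {
    PairCoverage result;
    for (const Element& x : a) {
        for (const Element& y : b) {
            if (reachable(x, y)) {
                result.emplace_back(x, y);
            }
        }
    }
    return result;
}

// Full composition: every ordered pair of operand elements is kept.
PairCoverage compose(const Coverage& a, const Coverage& b) {
    PairCoverage result =
        compose_partial(a, b, [](const Element&, const Element&) { return true; });
    assert(result.size() == a.size() * b.size());
    return result;
}
```

Composing the three-element sign coverage with itself gives 9 pairs; a predicate that rejects a single pair leaves 8.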

C++TESK also supports defining and tracking temporal coverage, which is specified as a set of temporal sequences. Each sequence defines a pattern of interaction with the DUV (events, their order and the delays between them). If the pattern is recognized, the corresponding situation is considered covered. An example of a temporal sequence is given below.

<pre><code class="cpp">
// after stimulus S is applied, reactions R1, R2, R1 and R2 should
// appear one after the other with 1-2 cycles delay between them
if_then(S) << any_delay() <<
    (R1 << delay(1, 2) << R2 << delay(1, 2)).repeat(2)
</code></pre>

h2. Verification Parallelization

A useful facility of C++TESK is that each testbench can be executed in parallel on a computer cluster. This significantly speeds up verification, shortens bug detection time and accelerates the design process as a whole. Parallelization is done dynamically, without using static information about the verification scenario. From the engineer's perspective, parallelization is fully transparent: testbench development does not depend on how the testbench will be executed (on one computer, on several computers or on a computer cluster). Moreover, launching a testbench in a distributed environment is no more difficult than launching it on a single computer.

The key idea used for parallelization is distributed graph exploration. All testbench instances explore the same state graph and share information about the traversed parts of the graph. The engine remains the same, but there are several sources of traversed arcs. Each testbench instance has a built-in component, called the synchronizer, which is responsible for exchanging information with other instances. The synchronizers of all instances are interconnected into a virtual communication network, which allows an arc of the state graph traversed by one instance to become known to all other instances (thus, they do not traverse it themselves and waste no time duplicating work that has already been done).
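
The effect of sharing traversed arcs can be shown with a small sequential simulation. It is a sketch under strong assumptions: arcs are treated as an unordered work list (graph reachability is ignored), instances take turns in round-robin order, and a traversed arc becomes known to all instances immediately, without communication delay. The names @Instance@, @broadcast@ and @explore_distributed@ are hypothetical and do not describe the actual synchronizer implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Arc = std::pair<int, int>;  // (state, stimulus)

// Hypothetical testbench instance.
struct Instance {
    std::set<Arc> known;        // arcs traversed locally or reported by peers
    std::size_t traversed = 0;  // arcs this instance traversed itself
};

// Stand-in for the synchronizer network: reports an arc to every instance.
void broadcast(std::vector<Instance>& instances, const Arc& arc) {
    for (Instance& peer : instances) {
        peer.known.insert(arc);
    }
}

// Round-robin simulation: on its turn, an instance traverses one arc it does not know yet.
// Returns the total number of traversals performed by all instances.
std::size_t explore_distributed(std::vector<Instance>& instances, const std::vector<Arc>& arcs) {
    assert(!instances.empty());
    std::size_t total = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (Instance& self : instances) {
            for (const Arc& arc : arcs) {
                if (self.known.count(arc) != 0) {
                    continue;  // already traversed by someone
                }
                ++self.traversed;
                ++total;
                broadcast(instances, arc);
                progress = true;
                break;  // one arc per turn
            }
        }
    }
    return total;
}
```

With two instances and six distinct arcs, each instance traverses three arcs and no arc is traversed twice.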

Parallelization has been used for simulation-based verification of various hardware designs. Depending on the design complexity and verification purposes, the model graphs contained from thousands to millions of nodes and up to several million arcs. Testbenches were executed on 1 to 150 computers. We conducted a number of experiments and measured the parallelization efficiency _T_(1)/(_n_ ∙ _T_(_n_)), where _T_(_n_) is the testbench execution time on _n_ computers. In these experiments, when the communication topology was chosen correctly, the parallelization efficiency always exceeded 0.8 (we used a “ring” for 8 or fewer computers and a “two-dimensional torus” for 9 or more computers).
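
The efficiency metric is straightforward to compute. The function name @parallel_efficiency@ and the numbers in the note below are illustrative assumptions, not measurements from the experiments.

```cpp
#include <cassert>

// Parallelization efficiency T(1) / (n * T(n)).
// t1: execution time on one computer; tn: execution time on n computers (same units).
double parallel_efficiency(double t1, int n, double tn) {
    assert(n > 0 && t1 > 0.0 && tn > 0.0);
    return t1 / (static_cast<double>(n) * tn);
}
```

For instance, if a run took 100 minutes on one computer and 12.5 minutes on 10 computers, the efficiency would be 100 / (10 ∙ 12.5) = 0.8, which corresponds to an eightfold speedup.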