Project

General

Profile

NML Language Reference » History » Version 79

Andrei Tatarnikov, 10/09/2015 06:17 PM

1 64 Alexander Kamkin
h1. nML Language Reference
2 21 Andrei Tatarnikov
3
*UNDER CONSTRUCTION*
4 1 Alexander Kamkin
5 20 Andrei Tatarnikov
_~By Andrei Tatarnikov~_
6
7 64 Alexander Kamkin
This reference manual describes syntax and semantics of the nML language. It covers language facilities supported by MicroTESK and does not attempt to be complete. It may differ from documentation provided by other vendors as there are differences in language implementations.
8 22 Andrei Tatarnikov
9 27 Andrei Tatarnikov
{{toc}}
10 6 Andrei Tatarnikov
11 24 Andrei Tatarnikov
h2. Introduction
12 23 Andrei Tatarnikov
13 64 Alexander Kamkin
*nML_* is an architecture description language (ADL) used in MicroTESK to describe the architecture of a microprocessor under verification. It is a flexible and easy-to-use language based on attribute grammar. nML was designed to provide a retargetable way to specify microprocessor architecture for various microprocessor-related software tools including instruction-set simulators, assemblers, disassemblers, compiler back-ends etc. It works at the instruction-set level concentrating on behavioral properties and hiding implementation details. A nML specification represents a programmer''s model of the microprocessor which covers the following aspects:
14 23 Andrei Tatarnikov
15
* supported data types;
16
* registers and memory;
17
* addressing modes;
18
* syntax and semantics of instructions.
19
20 64 Alexander Kamkin
nML uses a hierarchical tree-like structure to describe an instruction set. Such a structure facilitates grouping related instructions and sharing their common parts. An instruction is described as a path in the tree from the root node to a leaf node. The set of all possible paths represents an instruction set. A node describes a primitive operation responsible for some task within an instruction. Each node has certain attributes that can be shared by its descendants. Actions performed by instructions are described as operations with registers and memory represented by bit vectors of arbitrary size.
21 23 Andrei Tatarnikov
22 64 Alexander Kamkin
A specification in nML starts with definitions of types and constants. For example, a type definition for a 32-bit word looks as follows:
23 23 Andrei Tatarnikov
24 35 Andrei Tatarnikov
<pre><code>let WORD_SIZE = 32
25 39 Andrei Tatarnikov
type word = card(WORD_SIZE)</code></pre>
26 23 Andrei Tatarnikov
27 64 Alexander Kamkin
Type definitions and constants can be used to describe registers and memory. In addition to registers and memory, it is also possible to define temporary variables that are internal abstractions provided by nML to store intermediate results of operations. They do not correspond to any data storage in real hardware and do not save their data across instruction calls. Also, there is often a need to specify some properties of the described model. For this purpose, special constants are used. For example, the code below defines general-purpose registers, memory and a temporary variable. Also, it includes a special constant to establish a correspondence between the general purpose register number 15 and the program counter (PC). Here is the code:
28 23 Andrei Tatarnikov
29 36 Andrei Tatarnikov
<pre><code>reg GPR[32, word]
30 1 Alexander Kamkin
mem M[2 ** 20, byte]
31 23 Andrei Tatarnikov
var carry[1, bit]
32 39 Andrei Tatarnikov
let PC = "GPR[15]"</code></pre>
33 23 Andrei Tatarnikov
34 1 Alexander Kamkin
As stated above, an instruction set is described as a tree of primitive operations. There two kinds of primitives: _operations_ and _addressing modes_. Operations describe parts of instructions responsible for specific tasks and can be used as leaf and root nodes. Addressing modes are aimed to customize operations (for example, they encapsulate rules for accessing microprocessor resources). They can only be used as leaf nodes. For example, here are simplified examples of operation and addressing mode specifications:
35
36 38 Andrei Tatarnikov
<pre><code>mode REG(i: nibble) = R[i]
37 1 Alexander Kamkin
syntax = format("R%d", i)
38 41 Andrei Tatarnikov
image  = format("01%4b", i)</code></pre>
39 39 Andrei Tatarnikov
40 41 Andrei Tatarnikov
<pre><code>op Add()
41 33 Andrei Tatarnikov
syntax = "add"
42 32 Andrei Tatarnikov
image  = "00"
43 39 Andrei Tatarnikov
action = { DEST = SRC1 + SRC2; }</code></pre>
44 32 Andrei Tatarnikov
45 46 Andrei Tatarnikov
Operations and addressing modes have three standard attributes: _syntax_, _image_ and _action_. The first two specify textual and binary syntax. The third describes semantics of the primitive. In addition, addressing modes have a return expression that enables them to be used as variables in various expressions. Attributes can be used by parent primitives referring to a given primitive to describe more complex abstractions.
46 43 Andrei Tatarnikov
47 48 Andrei Tatarnikov
Primitives are arranged into a tree using production rules. There are two kinds of production rules: _AND rules_ and _OR rules_. AND rules specify parent-child relationships where a child primitive is described as a parameter of its parent. Here is an example of an AND rule:
48 1 Alexander Kamkin
49 49 Andrei Tatarnikov
<pre><code>op arith_inst(act: Add, op1: OPRND, op2: OPRND)</code></pre>
50
51
This is the header of the "arith_inst" operation that states that the "arith_inst" operation node has three child nodes: the "act" operation and the "op1" and "op2" addressing modes. The syntax of an operation header is similar to a function where parameter types specify the primitives the rule refers to. Parameter can be, in turn, parameterized with other primitives (they will be encapsulated behind attributes). For this reason child nodes represent independent instances that are accessed from their parent node via parameters. OR rules specify alternatives. This means that a group of primitives is united under some alias so that each of them can used when this alias is specified in an AND rule. An OR rule looks as follows:
52
53 50 Andrei Tatarnikov
<pre><code>op Add_Sub_Mov = Add | Sub | Mov</code></pre>
54 47 Andrei Tatarnikov
55 64 Alexander Kamkin
Figure 1 displays a tree path describing the "mov" instruction from an imaginary instruction set. This instruction copies data from one register to another. The root operation of the instruction is called “instruction”. According to nML conventions, the root operation is always called "instruction" and it specifies the point from which all paths describing specific instructions start. The "instruction" operation can be defined either as AND or OR rule. In the latter case, there are several starting points. Usually root operations holds some common properties and perform common actions (such as increment of the program counter). In the given example, the root operation is linked to the “Arithm” operation with the help of an AND rule. This operation describes a group of arithmetic operations. It is parameterized with the "Add_Mov_Sub" and "OPRND" primitives. Both of them are specified as OR rules. The first one describes arithmetic operations that can be performed by the "Arithm" primitive while the second one specifies supported addressing modes. Dashed lines that connect OR-rules with their child primitives specify possible alternative paths. Instructions are identified by the terminal operation node of the path (in this example, it is the "Mov" node). An important note is that, to avoid ambiguity, nodes can have only one child operation.
56 51 Andrei Tatarnikov
57 52 Andrei Tatarnikov
p=. !Figure1.png!
58 1 Alexander Kamkin
59 53 Andrei Tatarnikov
p=. *Figure 1. Operation tree for the Mov instruction*
60 51 Andrei Tatarnikov
61 64 Alexander Kamkin
The syntax of nML resembles the syntax of the pseudocode used in microprocessor architecture manuals to describe instruction semantics. For example, here is the description of instruction ADD from the MIPS64 manual:
62 55 Andrei Tatarnikov
63 56 Andrei Tatarnikov
<pre><code>if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
64
    UNPREDICTABLE
65
endif
66
temp ← GPR[rs]31||GPR[rs]31..0) + (GPR[rt]31||GPR[rt]31..0)
67
if temp32 ≠ temp31 then
68
    SignalException(IntegerOverflow)
69
else
70
    GPR[rd] ← sign_extend(temp31..0)
71
endif</code></pre>
72
73 64 Alexander Kamkin
Such a description can be translated to nML with minimal effort. Providing that all needed data types, resources and operations describing common functionality of instructions have already been specified, the specification of the ADD instruction (or, to be more precise, the terminal operation that distinguishes it from other similar instructions) will look as follows:
74 57 Andrei Tatarnikov
75 58 Andrei Tatarnikov
<pre><code>op ADD(rd: GPR, rs: GPR, rt: GPR)
76
action = {
77
    if (NotWordValue(rs) || NotWordValue(rt)) then
78
        UNPREDICTABLE();
79
    endif;
80 61 Andrei Tatarnikov
    tmp_word = rs<31>::rs<31..0> + rs<31>::rt<31..0>;
81
    if(tmp_word<32> != tmp_word<31>) then
82 58 Andrei Tatarnikov
        SignalException("IntegerOverflow");
83
    else
84
        rd = sign_extend(tmp_word<31..0>);
85
    endif;
86
}</code></pre>
87
88
As we can see, describing an instruction based on an instruction set manual is a relatively easy task that can be performed by a verification engineer who does not have significant programming skills.
89 56 Andrei Tatarnikov
90 64 Alexander Kamkin
h2. Overview of nML
91 65 Andrei Tatarnikov
92 77 Andrei Tatarnikov
h2. Floating-Point Data Types
93
94 79 Andrei Tatarnikov
Floating-point types are defined using the <code>float(N, M)</code> construct, where <code>N</code> is fraction size and <code>M</code> is exponent size. The resulting size of the type is <code>N + M + 1</code>, where 1 is an implicitly added bit for store the sign. All types and operations with them are implemented according to the IEEE 754 standard. Supported floating-point formats include:
95 1 Alexander Kamkin
96 79 Andrei Tatarnikov
# 32-bit single-precision, defined as <code>float(23, 8)</code>
97
# 64-bit double-precision, defined as <code>float(52, 11</code>
98
# 80-bit double-extended-precision, defined as <code>float(64, 15)</code>
99
# 128-bit quadruple-precision, defined as <code>float(112, 15)</code>
100
101
102 77 Andrei Tatarnikov
103 66 Andrei Tatarnikov
h2. Converting Data Types
104 1 Alexander Kamkin
105 69 Andrei Tatarnikov
In nML, type conversion is performed explicitly using the following functions:
106
107 75 Andrei Tatarnikov
# *_sign_extend(type, value)_*. Sign-extends the value to the specified type. Applied to any types. Requires the new type to be larger or equal to the original type.
108
# *_zero_extend(type, value)_*. Zero-extends the value to the specified type. Applied to any types. Requires the new type to be larger or equal to the original type.
109
# *_coerce(type, value)_*. Converts an integer value to another integer type. For smaller types the value is truncated, for larger types it is extended. If the original type is a signed integer, sign extension is performed. Otherwise, zero-extension is performed.
110 73 Andrei Tatarnikov
# *_cast(type, value)_*. Reinterprets the value as the specified type. Applied to any types. The new type must be of the same size as the original type.
111 74 Andrei Tatarnikov
# *_int_to_float(type, value)_*. Converts 32 and 64-bit integers into 32, 64, 80 and 128-bit floating-point values (IEEE 754).
112
# *_float_to_int(type, value)_*. Converts 32, 64, 80 and 128-bit floating-point values (IEEE 754) into 32 and 64-bit integers.
113
# *_float_to_float(type, value)_*. Converts 32, 64, 80 and 128-bit floating-point values (IEEE 754) into each other.
114 76 Andrei Tatarnikov
115
Examples:
116
117
*TODO*