Revision 132 (Andrei Tatarnikov, 02/27/2015 03:42 PM) → Revision 133/139 (Andrei Tatarnikov, 02/27/2015 03:45 PM)
h1. Template Description Language

_~By Artemiy Utekhin and Andrei Tatarnikov~_

*UNDER CONSTRUCTION*

{{toc}}

h2. Introduction

MicroTESK generates test programs on the basis of _test templates_ that provide an abstract description of scenarios to be reproduced by the generated programs. Test templates are created using the _test template description language_. It is a _Ruby_-based domain-specific language that provides facilities to describe test cases using symbolic names (that refer to a set of data satisfying certain conditions) instead of concrete input data and to manage the structure of the generated test programs. The language is implemented as a library that includes functionality for describing test templates and for further processing these test templates to produce a test program.

MicroTESK uses the JRuby interpreter to process Ruby files. This allows Ruby libraries to interact with other components of MicroTESK written in Java.

h2. How It Works

A test template in Ruby describes a test program in terms of the model of the target microprocessor ISA. The structure of the test program is described using built-in features of Ruby (conditions, loops, etc.) and facilities provided by MicroTESK libraries (blocks that help organize instruction sequences). To provide access to elements of the model such as instructions and their addressing modes, corresponding Ruby methods are created at runtime on the basis of the meta-information provided by the model. The test template subsystem interacts with the model and the testing library of MicroTESK to create a symbolic test program, simulate it on the model and generate its textual representation.

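The runtime creation of methods mentioned above can be illustrated with a minimal sketch in plain Ruby. It is not the MicroTESK implementation: the @META@ hash, the @SketchTemplate@ class and the shape of the recorded calls are all assumptions made for illustration, and only positional arguments are handled.

```ruby
# Hypothetical stand-in for the meta-information a model might expose.
META = {
  instructions: [:add, :sub, :mov],
  modes: { reg: [:i], imm: [:i] }
}.freeze

# Illustrative class; NOT the MicroTESK Template class.
class SketchTemplate
  attr_reader :calls

  def initialize(meta)
    @calls = []

    # One method per addressing mode: pairs parameter names with values.
    meta[:modes].each do |mode, params|
      define_singleton_method(mode) do |*values|
        { mode: mode, args: params.zip(values).to_h }
      end
    end

    # One method per instruction: records the call instead of executing it.
    meta[:instructions].each do |name|
      define_singleton_method(name) do |*modes|
        @calls << { instruction: name, modes: modes }
      end
    end
  end
end

t = SketchTemplate.new(META)
t.mov t.reg(0), t.imm(0xFF)
puts t.calls.inspect
```

Because the methods are generated from data, a different model only needs different meta-information; the template-side code stays the same.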
Generally speaking, processing of a test template is performed in the following steps:

* The model of the microprocessor is loaded;
* Runtime methods to access architecture-specific elements are created on the basis of the model's meta-information;
* The code of the test template is executed to build the internal representation of the template described as a hierarchy of code blocks;
* Blocks are processed bottom-up to produce sequences of abstract instruction calls (at this step, their arguments can be described as a set of conditions instead of being assigned concrete values);
* A symbolic test program is built on the basis of the produced abstract instruction call sequences by applying corresponding algorithms to find values satisfying the specified conditions;
* The symbolic test program is simulated on the microprocessor model;
* The code of the test program is generated and saved to the output file.

h2. Configuration

Global settings for the test template subsystem are specified in the <code>config.rb</code> file. These settings are related to the package structure and dependencies of the subsystem. They are predefined and rarely need to be modified.

Also, there are local settings that control processing of individual test templates. They are specified as member variables of the <code>Template</code> class. Test templates can override them to customize the behavior of the subsystem. The settings will be discussed in more detail in the "Writing Test Templates" section.

h2. Running Test Program Generation

To start test program generation, a user needs to run the <code>generate.sh</code> script (Unix, Linux, OS X) or the <code>generate.bat</code> script (Windows) located in the <code>bin</code> folder. The script launches a Ruby program that processes the specified test template and produces a test program.

The command to run the script has the following format:

<pre>generate <model name> <template file.rb> [<output file.asm>]</pre>

There are three parameters:

# the name of the microprocessor model (generated by the [[Sim-nML Translator]] on the basis of Sim-nML specifications);
# the name of the test template file to be processed;
# the name of the test program file to be generated (optional; if it is omitted, the program is printed to the console).

For example, the following command processes the <code>example.rb</code> test template and saves the generated test program to the <code>test.asm</code> file:

<pre>sh bin/generate.sh cpu arch/demo/cpu/templates/example.rb test.asm</pre>

h2. Writing Test Templates

h3. Test Template Structure

A test template is implemented as a class inherited from the <code>Template</code> library class that provides access to all features of the library. Information on the location of the <code>Template</code> class is stored in the <code>TEMPLATE</code> environment variable. So, the definition of a test template class looks like this:

<pre><code class="ruby">
require ENV['TEMPLATE']

class MyTemplate < Template</code></pre>

Test template classes should contain implementations of the following methods:

# <code>initialize</code> (optional) - specifies settings for the given test template;
# <code>pre</code> (optional) - specifies the initialization code for the test program;
# <code>post</code> (optional) - specifies the finalization code for the test program;
# <code>run</code> - specifies the main code of the test program (test cases).

The definitions of optional methods can be skipped. In this case, the default implementations provided by the parent class will be used. The default implementation of the <code>initialize</code> method initializes the settings with default values. The default implementations of the <code>pre</code> and <code>post</code> methods do nothing.

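The fallback to default implementations can be pictured with a small self-contained sketch. The @SketchBase@ class, its @generate@ driver and the @log@ array are hypothetical stand-ins, not the library's @Template@ class; the calling order (@pre@, @run@, @post@) is an assumption derived from the roles described above.

```ruby
# Hypothetical stand-in for the library base class.
class SketchBase
  attr_reader :log

  def initialize
    @log = [] # records what each phase contributed
  end

  # Default implementations of the optional methods do nothing.
  def pre; end

  def post; end

  # The main code has no default and must be supplied by a subclass.
  def run
    raise NotImplementedError, "#{self.class} must define run"
  end

  # Assumed driver: initialization code, main code, finalization code.
  def generate
    pre
    run
    post
    @log
  end
end

# Defines only the mandatory method; pre and post fall back to the defaults.
class OnlyRun < SketchBase
  def run
    @log << 'main code'
  end
end

# Overrides all three phases.
class WithPrePost < SketchBase
  def pre
    @log << 'initialization'
  end

  def run
    @log << 'main code'
  end

  def post
    @log << 'finalization'
  end
end

puts OnlyRun.new.generate.inspect
puts WithPrePost.new.generate.inspect
```

A subclass therefore only has to state what differs from the defaults.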
The full interface of a test template looks as follows:

<pre><code class="ruby">
require ENV['TEMPLATE']

class MyTemplate < Template

  def initialize
    super
    # Initialize settings here
  end

  def pre
    # Place your initialization code here
  end

  def post
    # Place your finalization code here
  end

  def run
    # Place your test problem description here
  end

end</code></pre>

h3. Reusing Test Templates

It is possible to reuse code of existing test templates in other test templates. To do this, you need to subclass the template you want to reuse instead of the <code>Template</code> class. For example, the <code>MyTemplate</code> class below reuses code from the <code>MyPrepost</code> class that provides initialization and finalization code for similar test templates.

<pre><code class="ruby">
require ENV['TEMPLATE']
require_relative 'MyPrepost'

class MyTemplate < MyPrepost

  def run
    ...
  end

end</code></pre>

h3. Test Template Settings

Test templates use the following settings:

# Use the standard output to print the generated test program (in addition to the output file);
# Enable logging information on the simulated instruction calls;
# Starting characters for single-line comments in the test program;
# Starting characters for multi-line comments in the test program;
# Terminating characters for multi-line comments in the test program;
# Seed for the randomizer.

Here is how these settings are initialized with default values in the <code>Template</code> class:

<pre><code class="ruby">
@use_stdout = true
@log_execution = true
@sl_comment_starts_with = "// "
@ml_comment_starts_with = "/*"
@ml_comment_ends_with = "*/"
@random_seed = 0
</code></pre>

The settings can be overridden in the <code>initialize</code> method of a test template. For example:

<pre><code class="ruby">
class MyTemplate < Template

  def initialize
    super
    @sl_comment_starts_with = ";"
    @ml_comment_starts_with = "/="
    @ml_comment_ends_with = "=/"
  end
  ...
end</code></pre>

h3. Data Definitions

Describing data requires the use of assembler-specific directives. Information on these directives is not included in ISA specifications and should be provided in test templates. It includes the textual format of data directives and mappings between nML and assembler data types used by these directives. Configuration information on data directives is specified in the <code>data_config</code> block, which is usually placed in the <code>pre</code> method. Only one such block per template is allowed. Here is an example:

<pre><code class="ruby">
data_config(:text => '.data', :target => 'M', :addressableSize => 8) {
  define_type :id => :byte, :text => '.byte', :type => type('card', 8)
  define_type :id => :half, :text => '.half', :type => type('card', 16)
  define_type :id => :word, :text => '.word', :type => type('card', 32)

  define_space :id => :space, :text => '.space', :fillWith => 0

  define_ascii_string :id => :ascii, :text => '.ascii', :zeroTerm => false
  define_ascii_string :id => :asciiz, :text => '.asciiz', :zeroTerm => true
}
</code></pre>

The block takes the following parameters (compulsory):

# _text_ - specifies the keyword that marks the beginning of the data section of the generated test program;
# _target_ - specifies the memory array defined in the nML specification to which data will be placed during simulation;
# _addressableSize_ - specifies the size (in bits) of addressable memory locations.

To set up particular directives, the language provides special methods that must be called inside the block. All the methods share two common parameters: _id_ and _text_. The first specifies the keyword to be used in a test template to address the directive and the second specifies how it will be printed in the test program.

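The role of the two common parameters can be shown with a small sketch: _id_ is the key used to look a directive up, while _text_ is what ends up in the generated program. The @DirectiveRegistry@ class and its @emit@ method are hypothetical and only mimic the shape of the calls above; the @:type@ value is a plain array standing in for the result of @type('card', 32)@.

```ruby
# Hypothetical registry; NOT the MicroTESK implementation.
class DirectiveRegistry
  attr_reader :directives

  def initialize
    @directives = {}
  end

  # Each define_* method takes :id and :text plus one directive-specific key.
  def define_type(options)
    register(options, :type)
  end

  def define_space(options)
    register(options, :fillWith)
  end

  # Renders one line of the data section for the directive named by id.
  def emit(id, *values)
    text = @directives.fetch(id)[:text]
    values.empty? ? text : "#{text} #{values.join(', ')}"
  end

  private

  # Stores the printable text and the extra parameter under the directive id.
  def register(options, extra_key)
    @directives[options.fetch(:id)] = {
      text: options.fetch(:text),
      extra_key => options.fetch(extra_key)
    }
  end
end

registry = DirectiveRegistry.new
registry.instance_eval do
  define_type  :id => :word,  :text => '.word',  :type => ['card', 32]
  define_space :id => :space, :text => '.space', :fillWith => 0
end

puts registry.emit(:word, 1, 2, 3, 4)
puts registry.emit(:space, 8)
```

Keeping the template-side keyword separate from the printed text is what lets the same template target assemblers with different directive spellings.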
The current version of MicroTESK provides the following methods:

# _define_type_ - defines a directive to allocate memory for a data element of an nML data type specified by the _type_ parameter;
# _define_space_ - defines a directive to allocate memory (the specified number of addressable locations, one or more) filled with a default value specified by the _fillWith_ parameter;
# _define_ascii_string_ - defines a directive to allocate memory for an ASCII string terminated or not terminated with zero depending on the _zeroTerm_ parameter.

The above example defines the directives _byte_, _half_, _word_, _space_, _ascii_ (non-zero terminated string) and _asciiz_ (zero terminated string) that place data in the memory array _M_ (specified in nML using the <code>mem</code> keyword). The size of an addressable memory location is 8 bits (or 1 byte).

After all data directives are configured, data can be defined using the <code>data</code> block:

<pre><code class="ruby">
data {
  label :data1
  word 1, 2, 3, 4

  label :data2
  half 0xDEAD, 0xBEEF

  label :hello
  ascii 'Hello'

  label :world
  asciiz 'World'

  space 8
}
</code></pre>

TODO

-------------------

h3. Instruction Calls

The <code>pre</code>, <code>post</code> and <code>run</code> methods of a test template class contain specifications of instruction call sequences. Instruction calls are specified using the *_instruction_* and *_addressing mode_* abstractions. Instructions are self-explanatory: they simply represent target assembler instructions. Every instruction argument is a parameterized addressing mode that explains the meaning of the provided values. For example, an addressing mode can refer to a register, a memory location or hold an immediate value. In other words, an instruction call is an instruction that uses appropriate addressing modes initialized with appropriate values.

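A short sketch may help show what "explains the meaning of the provided values" amounts to: the same integer is interpreted differently depending on the addressing mode that wraps it. The @Mode@ struct, the @STATE@ hash and the @resolve@ function are hypothetical and exist only for this illustration.

```ruby
# Hypothetical description of an addressing mode: its kind and one value.
Mode = Struct.new(:kind, :value)

# Hypothetical machine state used only for this illustration.
STATE = { reg: [10, 20, 30], mem: { 2 => 99 } }.freeze

# Interprets the value according to the addressing mode kind.
def resolve(mode, state)
  case mode.kind
  when :reg then state[:reg].fetch(mode.value) # value is a register index
  when :mem then state[:mem].fetch(mode.value) # value is a memory address
  when :imm then mode.value                    # value is the operand itself
  else raise ArgumentError, "unknown addressing mode: #{mode.kind}"
  end
end

puts resolve(Mode.new(:reg, 2), STATE) # contents of register 2
puts resolve(Mode.new(:mem, 2), STATE) # contents of memory location 2
puts resolve(Mode.new(:imm, 2), STATE) # the literal 2
```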
The format of an instruction call description looks like this:

<pre><code class="ruby">instruction addr_mode1(:arg1_1 => value1_1, :arg1_2 => value1_2, ...), addr_mode2(:arg2_1 => value2_1, ...), ...</code></pre>

This format implies that addressing modes are parameterized with hash tables where the key is the name of the addressing mode parameter and the value is the value to be assigned to this parameter. Also, there is a shorter format based on methods with a variable number of arguments. In this case, values are expected to come in the same order as the corresponding parameter definitions. The shorter format looks like this:

<pre><code class="ruby">instruction addr_mode1(value1_1, value1_2, ...), addr_mode2(value2_1, ...), ...</code></pre>

The code below demonstrates both approaches:

<pre><code class="ruby">
mov reg(:i => 0), imm(:i => 0xFF) # The use of hash maps
mov reg(0), imm(0xFF)             # The use of variable numbers of arguments
</code></pre>

h3. Instruction Call Blocks

h2. *TODO: REWRITE*

h3. Basic features

The two core abstractions used by the MicroTESK parser/simulator and Ruby-TDL are an *instruction* and an *addressing mode*. An instruction is rather self-explanatory: it simply represents a target assembler instruction. Every argument of an instruction is a parametrized *addressing mode* that explains the meaning of the provided values to the simulator. The mode could point to the registers, for instance, or to a specific memory location. It can also denote an immediate value - e.g. a simple integer or a string. Thus, a basic template is effectively a sequence of instructions with parametrized addressing modes as their arguments.

Each template is a class that inherits a basic Template class that provides most of the core Ruby-TDL functionality.

So, to write a template you need to subclass Template first:

<pre><code class="ruby">
require_relative "_path-to-the-rubymt-library_/mtruby"

class MyTemplate < Template</code></pre>

While processing a template, Ruby-TDL calls its %pre%, %run% and %post% methods, loosely meaning the pre-conditions, the main body and the post-conditions. The %pre% method is mostly useful for setup common to many templates; the %post% method will be more important once sequential testing is introduced. Most of the template code is supposed to be in the %run% method. Thus, a template needs to override one or more of these methods, most commonly %run%.

The most common use of %pre% and %post% is to define a special non-executable class and then subclass it with the actual templates:

<pre><code class="ruby">
require_relative "_path-to-the-rubymt-library_/mtruby"

class MyPrepost < Template

  def initialize
    super
    @is_executable = false
  end

  def pre
    # Your 'startup' code goes here
  end

  def post
    # Your 'cleanup' code goes here
  end

end</code></pre>

<pre><code class="ruby">
require_relative "_path-to-the-rubymt-library_/mtruby"

class MyTemplate < MyPrepost

  def initialize
    super
    @is_executable = true
  end

  def run
    # Your template code goes here
  end

end</code></pre>

These methods essentially contain the instructions. The general instruction format is slightly more intimidating than the native assembler and looks like this:

<pre><code class="ruby">instruction_name addr_mode1(:arg1_1 => value, :arg1_2 => value, ...), addr_mode2(:arg2_1 => value, ...), ...</code></pre>

So, for instance, if the simulator has an ADD(MEM(i), MEM(i)|IMM(i)) instruction, it would look like:

<pre><code class="ruby">add mem(:i => 42), imm(:i => 128)</code></pre>

Thankfully, there are shortcuts. If there's only one argument expected in the addressing mode, you can simply write its value and never have to worry about the argument name.
And, by convention, the immediate values are always denoted in the simulator as the IMM addressing mode, so the template parser automatically accepts numbers and strings as such. Thus, in this case, the instruction can be simplified to:

<pre><code class="ruby">add mem(42), 128</code></pre>

As a matter of fact, if you're sure about the order of addressing mode arguments, you can omit the names altogether and simply provide the values:

<pre><code class="ruby">instruction_name addr_mode1(value1, value2, ...) ...</code></pre>

If the name of the instruction conflicts with an already existing Ruby method, the instruction will be available with an %op_% prefix before its name.

h3. Test situations

_This section is to be taken with a grain of salt because the logic and the interface behind the situations are not yet finalized and are mostly missing from the templates, so they shouldn't be used yet_

_Big TODO: define what is a test situation_

To denote a test situation, attach a Ruby block that describes the situation to an instruction. This will loosely look like this (likely similar to the way the addressing modes are denoted):

<pre><code class="ruby">
sub mem(42), mem(21) do
  overflow(:op1 => 123, :op2 => 456)
end</code></pre>

h3. Instruction blocks

Sometimes a certain test situation should influence more than just one instruction. In that case, you can pass the instructions in an atomic block that can optionally accept a Proc of situations as its argument (because Ruby doesn't want to be nice and allow multiple blocks for a method, and passing a Hash of Proc can hardly be called comfortable).

<pre><code class="ruby">
p = lambda { overflow(:op1 => 123, :op2 => 456) }

atomic(p) {
  mov mem(25), mem(26)
  add mem(27), 28
  sub mem(29), 30
}</code></pre>

h3. Groups and random selections

_(N.B. REMOVED in r1923. The implementation does not work in the current build and, therefore, was removed. The described features must be reviewed and reimplemented if required.)_

From source code comments:

<pre>
# VERY UNTESTED leftovers from the previous version ("V2", this is V3)
# Should work with the applied fixes but I'd be very careful to use these
# As things stand this is just a little discrete probability utility that
# may or may not find its way into the potential ruby part of the test engine
</pre>

There are certain ways to group together or randomize addressing modes and instructions.

To group several addressing modes together (this only works if they have similar arguments), create a mode group like this:

<pre><code class="ruby">mode_group "my_group", [:mem, :imm]</code></pre>

You can also assign a weight to each of the modes in the group like this:

<pre><code class="ruby">mode_group "my_group", {:mem => 1.5, :imm => 2.5}</code></pre>

The name of the group is converted into a method in the Template class. To select a random mode from a group, use %sample% on this generated method:

<pre><code class="ruby">add mem(42), my_group.sample(21)</code></pre>

_TODO: sampling already parametrized modes_

The first method of grouping instructions works in a similar manner, with the same restrictions on arguments:

<pre><code class="ruby">group "i_group", [:add, :sub]</code></pre>

<pre><code class="ruby">group "i_group", {:add => 0.3, :sub => 0.7}</code></pre>

<pre><code class="ruby">i_group.sample mem(42), 21</code></pre>

You can also run all of the instructions in a group at once by using the %all% method:

<pre><code class="ruby">i_group.all mem(42), 21</code></pre>

The second one allows you to create a normal block of instructions, setting their arguments separately.

<pre><code class="ruby">
block_group "b_group" do
  mov mem(25), mem(26)
  add mem(27), 28
  sub mem(29), 30
end</code></pre>

In this case, to set weights you should call the %prob% method before every instruction:

<pre><code class="ruby">
block_group "b_group" do
  prob 0.1
  mov mem(25), mem(26)
  prob 0.7
  add mem(27), 28
  prob 0.4
  sub mem(29), 30
end</code></pre>

The usage is almost identical, but without providing the arguments as they are already set:

<pre><code class="ruby">
b_group.sample
b_group.all</code></pre>

_Not sure how this works inside atomics when the group is defined outside; needs more consideration_

_TODO: Permutations_

Any normal Ruby code is allowed inside the blocks as well as the %run%-type methods, letting you write more complex or inter-dependent templates.

h3. TODO: Labels

To set a label, write:

<pre><code class="ruby">label :label_name</code></pre>

To use a label in an instruction that accepts one (under the hood it's just a simple immediate #IMM value - just not a pre-defined one until it's actually defined):

<pre><code class="ruby">b greaterThan, :label_name</code></pre>

h3. TODO: Debug

To get a value from registers use:

<pre><code class="ruby">get_reg_value("register_name", index)</code></pre>

Right now the pre-processing and the execution of instructions are separated due to ambiguous logic regarding labels and various blocks and atomics. This may be changed later, so these special debugging blocks might become unnecessary. By default, what's written in the template is run during pre-processing, so you have to use special blocks if you want to run some Ruby code during the execution stage, most likely for debugging.

To print debug information to the console during the execution of the instructions, use the exec_debug block:

<pre><code class="ruby">
exec_debug {
  puts "R0: " + get_reg_value("GPR", 0).to_s + ", R1: " + get_reg_value("GPR", 1).to_s # + ", label code: " + self.send("cycle" + ind.to_s).to_s
}</code></pre>

To save something that depends on the current state of the simulator to the resulting assembler code, use exec_output, whose block should return a string:

<pre><code class="ruby">
exec_output {
  "// The result should be " + self.get_reg_value("GPR", 0).to_s
}</code></pre>
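
The reason such blocks are needed can be seen in a minimal two-phase sketch. Everything in it is hypothetical (@TwoPhaseSketch@, @set_reg@, @reg_value@ and @deferred_output@ are illustrative names, not MicroTESK methods): "instructions" are only recorded while the template body runs, so a register read made directly in the template sees the initial state, while a deferred block sees the state at execution time.

```ruby
# Hypothetical two-phase processor; NOT the MicroTESK implementation.
class TwoPhaseSketch
  def initialize
    @steps  = []          # filled during pre-processing
    @regs   = Hash.new(0) # hypothetical register file, all zeros at start
    @output = []
  end

  # An "instruction": recorded now, executed later.
  def set_reg(index, value)
    @steps << -> { @regs[index] = value }
  end

  # A deferred block whose string result is appended to the output.
  def deferred_output(&block)
    @steps << -> { @output << instance_eval(&block) }
  end

  def reg_value(index)
    @regs[index]
  end

  # Execution stage: runs the recorded steps in order.
  def execute
    @steps.each(&:call)
    @output
  end
end

t = TwoPhaseSketch.new
t.set_reg(0, 42)
eager = "R0 = #{t.reg_value(0)}"                   # read during pre-processing
t.deferred_output { "// R0 = #{reg_value(0)}" }    # read during execution
result = t.execute

puts eager
puts result
```

The eagerly built string still reports the initial register value, while the deferred block reports the value written by the recorded instruction.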