5. Customizing and Extending KEDR

5.1. Using Code Generator to Create Custom Modules
5.2. Writing Custom Payloads for Call Monitoring
5.3. Writing Custom Payloads for Fault Simulation
5.4. trace.happensBefore Parameter for Call Monitoring and Fault Simulation Payloads
5.5. Writing Custom Scenarios for Fault Simulation
5.6. Implementing Custom Types of Analysis
5.6.1. Choosing the Counters and the Functions to Process
5.6.2. Creating the Payload Module
5.6.3. Building the Payload Module
5.6.4. Using the Payload Module

5.1. Using Code Generator to Create Custom Modules

To automate the creation of multiple modules with similar functionality, KEDR actively uses template-based generation of files. This approach facilitates code reuse, as it makes it possible to separate the parts common to all modules from the parts specific to each module. It also allows the developer of new modules to concentrate mostly on the logic of what (s)he wants to implement rather than on writing and debugging boilerplate code.

So, to develop a new module this way, it is only necessary to prepare a short definition of what this module is supposed to do in addition to the basic functionality. Creation of the source file(s) for this module will be performed automatically by the code generator.

Apart from usage within KEDR, this mechanism can also be used for creating custom specialized modules for different purposes: payload modules for call monitoring or fault simulation, fault simulation indicators, etc. This approach to development of custom modules has many advantages:

  • fast development - the implementation of a new payload module for call monitoring requires, for example, about 10 lines in the definition file for the header part (it contains the name of the module, the author, the license, etc.) and about 10 lines per replacement function (description of the arguments and the return value, etc.).

  • clear and readable definition files - all features of your module are described in one place, the so-called definition file, rather than scattered over different files or over one long file. Every line in a definition file is self-explanatory.

  • high level of abstraction - when writing a definition file, you do not need to care about what file(s) will be generated from it and how exactly any particular feature will be implemented.

  • less error-prone code - if the definition file is written correctly, correct code of the module will be generated from it. Most of the lines in that file simply define the names of some entities (variables, types, etc.) that will appear in the generated code. The rare inter-line dependencies as well as code chunk definitions can be easily debugged in the clear and short definition file.

  • easier maintenance - if the templates are updated to implement some new basic functionality, to fix errors or for any other reason, it is enough to run the generator again to update the code of the modules you have created. The enhancements and fixes will thus automatically propagate to all the modules generated using those templates.

Of course, using the generator is not a universal way to extend the functionality of the standard KEDR modules. If some functionality is not provided by the templates, it will not be available in the generated modules. You will probably need to implement it manually - or prepare templates of your own. Still, in many cases it can be very convenient to use the generator with the default templates to create modules for KEDR.

Let us now consider the common format of definition files.

The generator is based on the MiST Engine library from the Template2Code project and is very similar to the mist_gen example from that project. As a result, the format of definition files accepted by the generator is the same as the format of configuration files accepted by mist_gen. The format is fully described here. The only difference is that a definition file (unlike a configuration file for mist_gen) may contain [group] keywords that divide the file into blocks.

A definition file is treated as an array of records. The lines that contain only whitespace characters (spaces and tabs) are ignored, so are the lines where the first non-whitespace character is #:

# The next line is empty, so it will be ignored

    # This line will be ignored too.

Lines like

<parameter-name> = <value>

define a parameter with the name <parameter-name> and assign the string <value> to it. <parameter-name> may only contain Latin letters, digits, dots, hyphens and underscores. The names are case-sensitive. Whitespace characters surrounding <parameter-name> and <value> are ignored.

# Define parameter with name 'a' and value '135'
a = 135
# Define parameter with name 'b' and value 'some string'
b = some string
# Define parameter with name 'expression' and value '2 + 3 = 5'
expression = 2 + 3 = 5

There is a way to define a parameter with a long value:

# Define parameter with name 'long-string' and value 'string1 string2 string3'
# Note that leading whitespace characters are ignored.
long-string = string1 \
    string2 \
    string3

In addition, parameters with multiline values can be defined too:

multi-line-parameter =>>
    line1
    line2
    ...
    lineN
<<

The value of multi-line-parameter is precisely as follows:

    line1
    line2
    ...
    lineN

Note that a newline character should immediately follow the >> delimiter and, apart from the << delimiter, there should be no other non-whitespace characters on that line.

# Correct definition of a multiline parameter whose value contains '<<' and '>>'
multi-line-parameter =>>
    <<a>>
    <<b>>
    <<

The generator only extracts the set of parameters and their values from the definition file. The order in which these parameters are listed is not important. For example, the following definition files

a = 5
b = 10

and

b = 10
a = 5

actually have the same meaning.

However, when several definitions assign values to the same parameter, the parameter becomes multi-valued and the order of the assignments becomes important. Example:

a = 5
a = 10

This means a={'5','10'}, but

a = 10
a = 5

means a={'10','5'}.

Depending on the meaning of the parameter, the difference in the order of its values may be important (e.g. the order of the function parameters is critical), or it may not be (e.g. the order of the replacement functions in the file).
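
The parsing rules above can be modeled with a short Python sketch. This is a simplified illustration, not the actual MiST Engine implementation; it does not handle multiline '=>>' values or the [group] keyword:

```python
# Simplified model of definition-file parsing: blank lines and '#' comments
# are skipped, 'name = value' splits at the leftmost '=', backslash
# continuations are joined, and repeated names become multi-valued.
def parse_definitions(text):
    params = {}
    lines = iter(text.splitlines())
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        # Join backslash-continued lines into one logical line
        while stripped.endswith('\\'):
            stripped = stripped[:-1].rstrip() + ' ' + next(lines).strip()
        name, _, value = stripped.partition('=')
        params.setdefault(name.strip(), []).append(value.strip())
    return params

text = """
a = 5
a = 10
expression = 2 + 3 = 5
long-string = string1 \\
    string2 \\
    string3
"""
result = parse_definitions(text)
```

Running this yields a={'5', '10'} in that order, expression='2 + 3 = 5' (only the leftmost '=' is special) and long-string='string1 string2 string3'.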

As a rule, the relative order of the values of two multi-valued parameters is only significant if these parameters describe one-value attributes of the same object:

obj.name = object1
obj.description = This is object1
obj.name = object2
obj.description = This is object2

This defines obj.name as {'object1', 'object2'} and obj.description as {'This is object1', 'This is object2'}. This may mean there are two object instances: one with the attributes object1 and This is object1, the other with object2 and This is object2.

Let us consider the following definitions where the values of obj.description are given in reverse order.

obj.name = object1
obj.description = This is object2
obj.name = object2
obj.description = This is object1

This defines obj.name as {'object1', 'object2'} and obj.description as {'This is object2', 'This is object1'}. This may mean two object instances with the attributes object1, This is object2 and object2, This is object1, which is probably not what you want.

A simple way to avoid such confusion with ordering is to define all attributes for one instance first and only then define attributes for another one.

If some object has a non-constant set of attributes (e.g., one of its attributes may have multiple values or one of its attributes is optional), then you cannot define several instances of this object in one definition file. This is because the generator cannot determine which instance each particular value of an attribute belongs to. To address this problem, the [group] keyword was introduced in the format of definition files. This keyword denotes a new group of definitions that starts just after the keyword and ends before its next occurrence or at the end of the file.

module_name = Arrays
[group]
array.name = array1
array.values = val1
[group]
array.name = array2
array.values = val2
array.values = val3
[group]
array.name = array3

There are 3 groups in this file. The first one defines array.name='array1' and array.values='val1', the second - array.name='array2' and array.values={'val2', 'val3'}, the third - array.name='array3'. Each group can be interpreted as the definition of an array object. The object named array1 contains one element val1, the object named array2 contains two elements val2 and val3, and the object named array3 contains no elements.
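
The group semantics can also be modeled with a Python sketch (again a simplified illustration, not the actual generator code): parameters defined before the first [group] are global, and each [group] starts a new set of per-group parameters.

```python
# Simplified model of how [group] splits a definition file: assignments
# before the first [group] go to the global parameters; each [group]
# keyword starts a fresh dictionary of per-group parameters.
def split_groups(text):
    global_params, groups = {}, []
    current = global_params
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        if stripped == '[group]':
            current = {}
            groups.append(current)
            continue
        name, _, value = stripped.partition('=')
        current.setdefault(name.strip(), []).append(value.strip())
    return global_params, groups

text = """
module_name = Arrays
[group]
array.name = array1
array.values = val1
[group]
array.name = array2
array.values = val2
array.values = val3
[group]
array.name = array3
"""
global_params, groups = split_groups(text)
```

For the Arrays example above, this produces module_name='Arrays' at the global level and three groups, the second of which has array.values={'val2', 'val3'} and the third of which has no array.values at all.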

The [group] keyword does not prevent gathering of all parameter assignments. That is, the global meaning of this file is module_name='Arrays', array.name={'array1', 'array2', 'array3'} and array.values={'val1', 'val2', 'val3'}. This information will be processed by the generator using one set of templates. Besides that, the information from each group will also be processed using another set of templates. This processing results in a new multi-valued parameter whose values are the results of processing the groups. This parameter is referred to as block, and so is the set of templates used to generate it. This parameter can be used at the top level of processing, the set of templates for which is referred to as document.

As far as payload modules are concerned, document templates define the overall layout of the generated source and header files while block templates define the parts of the code related to a replacement function. That is, a [group] block corresponds to a replacement function in this case.

This section has given a brief overview of the template-based code generation mechanism used in KEDR. This should be enough, though, if you would like to write your own definition files for custom modules with the templates provided by KEDR. See the MiST Engine Reference Manual and the mist_gen example if you want to learn more about this way of template-based code generation.

5.2. Writing Custom Payloads for Call Monitoring

This section describes how to write a new payload module for call monitoring using a tool provided by KEDR to generate source files from the templates. Common abilities of this tool are described in detail in Section 5.1, “Using Code Generator to Create Custom Modules”.

Typical purposes of a custom payload module of this kind could be as follows:

  • support call monitoring for the functions for which it is not supported by KEDR out-of-the-box;

  • change the set of parameters output to the trace, in case you need something other than the arguments and the return value of the replacement function to be output.

The whole infrastructure necessary for building the payload module from the definition file is located in the custom_payload_callm subdirectory of the directory where the examples provided with KEDR are installed. Here are its contents:

payload.data
'definition' file to create the payload module
makefile
file for common build infrastructure for make utility
Kbuild
file for building kernel module from C sources
templates
directory containing the templates used for generating sources from the 'definition' file

To use all this in development of your payload module, copy the contents of that directory to a directory of your choice.

The first and the main step is to rewrite payload.data to reflect definitions of your payload module.

At the global level (i.e. before the first group begins), this file should contain definitions for the following parameters:

module.name
string, which will be used as module name inside its source files
module.author
author of the module
module.license
license for the module

In addition, the following parameters may be defined at the global level:

header
the code (may be multiline) to be inserted before the definition of the replacement functions. This code usually contains '#include' directives for the header files that define the target functions and the types of their parameters.

Example of global section of the payload.data file:

# This module processes the calls to module_put function.

module.name = payload_callm_module_put
module.author = Andrey Tsyvarev
module.license = GPL

header =>>
#include <linux/module.h>
<<

For each target function whose calls should be processed, a group should be prepared in the definition file. Each group should contain definitions for the following parameters:

function.name
name of the target function
returnType
return type of the target function if it is not void; otherwise it shouldn't be defined at all
arg.type
(multi-valued) types of the parameters of the target function, starting with the first one. If the function has no parameters, it shouldn't be assigned at all.
arg.name
(multi-valued) names of the parameters of the target function, starting with the first one. If the function has no parameters, it shouldn't be assigned at all. The parameters of the replacement function will be accessible via these names in the code.
trace.param.name
(multi-valued) names of the variables whose values will be output to the trace. These variables should be accessible in the replacement function (see below).
trace.param.type
(multi-valued) types of the values that will be output to the trace. These types will be used for casting the values of the corresponding variables before they are output (so, these types may differ from the real types of the variables).
trace.formatString
format string which is used for printf-like output of values from replacement function (see parameters trace.param.name and trace.param.type)

Important

Output to the trace is currently supported only for the variables of simple types (i.e. no strings, arrays, structures, etc.). Pointers can be output (e.g., using %p or a similar format) but not the data they point to. This is due to the limitations of kedr_gen. In the future versions, these limitations may be removed or at least relaxed.

Additionally, the following parameters can be defined at group level:

prologue
code (may be multiline) which will be inserted at the start of the replacement function (before the call to the target function). Usually, this code declares variables that will be used in the output and in the following code sections (see below).
middleCode
code (may be multiline) which will be inserted after the call to the target function and before the output of the values to the trace. Variables cannot be declared there. Usually, this code calculates the values of the variables, if necessary, before the output to the trace is done.
epilogue
code (may be multiline) which will be inserted at the end of the replacement function (after the values are output to the trace). If prologue or middleCode requests some resources from the kernel, this code can be used to release them.

Visibility of variables in different parts of code of the replacement function can be described in pseudocode as follows:

returnType replacement_function(arg.name...)
{
    prologue
    {
        returnType returnValue = target_function(arg.name...);
        middleCode
        output(trace.formatString, trace.param.name...);
    }
    epilogue
}

If the target function does not return void, the variable returnValue refers to the return value of the function. It may be used as the name of a variable to output (trace.param.name) and in the middleCode.
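
The control flow above can also be modeled in Python. This is a behavioral sketch only; the real generated code is kernel C, and the function and argument names here are just placeholders:

```python
# Behavioral model of a call-monitoring replacement function (illustration
# only): the target function is called first, then the arguments and the
# return value are output to the trace.
trace = []

def replacement_kmalloc(size, flags, target):
    return_value = target(size, flags)          # call the target first
    # output(trace.formatString, trace.param.name...)
    trace.append("arguments: (%d, %x), result: %r" % (size, flags, return_value))
    return return_value

# '<ptr>' stands in for the pointer the real allocator would return;
# 0xd0 is just a hypothetical flags value.
result = replacement_kmalloc(32, 0xd0, lambda size, flags: "<ptr>")
```

The key point the model captures is that the trace record is produced only after the target function has returned, which is also why returnValue can appear among the traced parameters.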

Example of the group section for module_put() target function:

[group]
    # Name and return type of the target function
    function.name = module_put

    # Names and types of the arguments of the target function
    arg.type = struct module*
    arg.name = m
    
    # The parameters to be output to the trace. 
    trace.param.type = void*
    trace.param.name = m

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%p)"

# End of the group of definitions for module_put().

Example of the group section for __kmalloc() target function (note the usage of returnValue variable as the value of trace.param.name parameter):

[group]
    # Name and return type of the target function
    function.name = __kmalloc
    returnType = void*

    # Names and types of the arguments of the target function
    arg.type = size_t
    arg.name = size

    arg.type = gfp_t
    arg.name = flags
    
    # The parameters to be output to the trace. 
    trace.param.type = size_t
    trace.param.name = size

    trace.param.type = unsigned int
    trace.param.name = flags

    trace.param.type = void*
    trace.param.name = returnValue

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%zu, %x), result: %p"

# End of the group of definitions for __kmalloc().

Example of the group section for kmem_cache_alloc() target function (note the contents of prologue parameter):

[group]
    # Name and return type of the target function
    function.name = kmem_cache_alloc
    returnType = void*

    # Names and types of the arguments of the target function
    arg.type = struct kmem_cache*
    arg.name = mc
    
    arg.type = gfp_t
    arg.name = flags
    
    prologue =>>
size_t size = kmem_cache_size(mc);
<<
    # The parameters to be output to the trace. 
    trace.param.type = size_t
    trace.param.name = size

    trace.param.type = unsigned int
    trace.param.name = flags

    trace.param.type = void*
    trace.param.name = returnValue

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%zu, %x), result: %p"

# End of the group of definitions for kmem_cache_alloc().

As you can see, the kmem_cache_alloc() function does not have a size argument. If we still want to output the size of the requested memory block to the trace, we need to determine it within the replacement function. This is exactly what the prologue code above does.

After writing the payload.data file, you should change the value of the module_name variable in the makefile and Kbuild to match the value of the module.name parameter. In the future, this step may be implemented in the makefile itself.

The last step is to run the make utility. This will invoke the code generator tool (kedr_gen) to create the sources for your payload module; then the module will be built.

5.3. Writing Custom Payloads for Fault Simulation

This section describes how to write a new payload module for fault simulation using a tool provided by KEDR to generate source files from the templates. Common abilities of this tool are described in detail in Section 5.1, “Using Code Generator to Create Custom Modules”.

Typical purposes of a custom payload module of this kind could be as follows:

  • support fault simulation for the functions for which it is not supported by KEDR out-of-the-box;

  • change the set of parameters output to the trace, in case you need something other than the arguments and the return value of the replacement function to be output;

  • provide a different set of variables to be passed to fault simulation indicator (see below) - this can be necessary if you would like to implement custom fault simulation scenarios.

Note

Note that the infrastructure provided by KEDR for fault simulation (points, indicators and the respective control facilities) could be used for other purposes as well. In general, it alters the behaviour of a call made by the target module if the indicator returns nonzero, or allows the target function to do its work normally if the indicator returns 0. The altered behaviour is controlled by the user; see the description of the fpoint.fault_code parameter below. So, instead of fault simulation, you could implement, say, timeout/sleep injection (i.e., delaying the return from the replacement function, which might help with concurrency analysis, etc.) or whatever else you want.

The files necessary to build the payload module from the definition file are located in the custom_payload_fsim subdirectory of the directory where the examples provided with KEDR are installed. Here are its contents:

payload.data
'definition' file to create the payload module
makefile
file for common build infrastructure for make utility
Kbuild
file for building kernel module from C sources
templates
directory containing the templates used for generating sources from the 'definition' file

To use all this in development of your payload module, copy the contents of that directory to a directory of your choice.

The first and the main step is to rewrite payload.data to reflect definitions of your payload module.

At the global level (i.e. before the first group begins), this file should contain definitions for the following parameters:

module.name
string, which will be used as module name inside its source files
module.author
author of the module
module.license
license for the module

In addition, the following parameters may be defined at the global level:

header
the code (may be multiline) to be inserted before the definition of the replacement functions. This code usually contains '#include' directives for the header files that define the target functions and the types of their parameters.

Example of global section of the payload.data file:

# This module processes the calls to kstrdup function.

module.name = payload_fsim_kstrdup
module.author = Andrey Tsyvarev
module.license = GPL

header =>>
#include <linux/string.h>
<<

For each target function whose calls should be processed, a group should be prepared in the definition file. Each group should contain definitions for the following parameters:

function.name
name of the target function
returnType
return type of the target function if it is not void; otherwise it shouldn't be defined at all
arg.type
(multi-valued) types of the parameters of the target function, starting with the first one. If the function has no parameters, it shouldn't be assigned at all.
arg.name
(multi-valued) names of the parameters of the target function, starting with the first one. If the function has no parameters, it shouldn't be assigned at all. The parameters of the replacement function will be accessible via these names in the code.
trace.param.name
(multi-valued) names of the variables whose values will be output to the trace. These variables should be accessible in the replacement function (see below).
trace.param.type
(multi-valued) types of the values that will be output to the trace. These types will be used for casting the values of the corresponding variables before they are output (so, these types may differ from the real types of the variables).
trace.formatString
format string which is used for printf-like output of values from the replacement function (see the trace.param.name and trace.param.type parameters)

Important

Output to the trace is currently supported only for the variables of simple types (i.e. no strings, arrays, structures, etc.). Pointers can be output (e.g., using %p or a similar format) but not the data they point to. This is due to the limitations of kedr_gen. In the future versions, these limitations may be removed or at least relaxed.

So far, only the parameters that are also used in the payload modules for call monitoring have been described (see Section 5.2, “Writing Custom Payloads for Call Monitoring”). These parameters have almost the same meaning for the fault simulation payload modules (these modules extend the functionality of the call monitoring payloads). Listed below are the parameters that are meaningful only for fault simulation:

fpoint.fault_code
code (may be multiline) which should be executed instead of the call to the target function to simulate a failure of the latter. Usually, this code simply sets the returnValue variable, which will be returned to the caller to indicate that a failure has occurred.
fpoint.param.name
(multi-valued) names of the variables whose values will be passed to the indicator function; there, they may be used to define the fault simulation scenario. The order of these variables is important because they will be passed to the indicator function sequentially. Usually, only the parameters of the target function are passed to the indicator.
fpoint.param.type
(multi-valued) types of the values that will be passed to the indicator function. These types will be used to properly cast the values before passing to the indicator (so these types may differ from the real types of the variables).

Additionally, the following parameters can be defined at the group level. They are similar to the ones used in the payload modules for call monitoring. The differences concerning fault simulation are also described below.

prologue
code (may be multiline) which will be inserted at the start of the replacement function (before the call to the indicator function that decides whether a failure should be simulated). Usually, this code declares variables that will be used in the output, passed to the indicator function or used by the following code sections (see below).
middleCode
code (may be multiline) which may calculate data for the output. Variables cannot be declared there. Note that this code is placed after the call to the indicator function and after the possible call to the target function (or the error path). Therefore, middleCode cannot affect the behaviour of these parts of the replacement function. Its intended usage is to prepare the data for the output to the trace: calculate the necessary values, etc.
epilogue
code (may be multiline) which will be inserted at the end of the replacement function (after the values are output to the trace). If prologue or middleCode requests some resources from the kernel, this code can be used to release them.

Visibility of variables in different parts of code of the replacement function may be described in pseudocode as follows:

returnType replacement_function(arg.name...)
{
    prologue
    {
        returnType returnValue;
        if(indicator_function(fpoint.param.name...) == 0)
        {
            returnValue = target_function(arg.name...);
        }
        else
        {
            fpoint.fault_code;
        }
        middleCode
        output(trace.formatString, trace.param.name...);
    }
    epilogue
}

If the target function does not return void, the variable returnValue can be used as the name of a variable to output (trace.param.name) and in the middleCode. Also, this variable should be assigned in the fpoint.fault_code (otherwise it will remain uninitialized when a failure of the target function is simulated).
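
The control flow above can be modeled in Python as well. This is a behavioral sketch only, with a hypothetical counter-based indicator scenario; the real generated code is kernel C and real indicators live in separate indicator modules:

```python
# Behavioral model of a fault-simulation replacement function: the
# indicator decides whether to call the target or to run the fault code.
def make_indicator(fail_on_call):
    """Hypothetical scenario: fail on the N-th call, succeed otherwise."""
    state = {'calls': 0}
    def indicator(size):
        state['calls'] += 1
        return 1 if state['calls'] == fail_on_call else 0
    return indicator

def replacement_kmalloc(size, indicator, target):
    if indicator(size) == 0:
        return_value = target(size)   # normal path: call the target
    else:
        return_value = None           # fpoint.fault_code: returnValue = NULL;
    return return_value

indicator = make_indicator(fail_on_call=2)
results = [replacement_kmalloc(16, indicator, lambda s: bytearray(s))
           for _ in range(3)]
```

With this scenario, the second call "fails" (returns None, standing in for NULL) while the first and third calls succeed, which mirrors how a shared indicator instance keeps private state across calls.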

Here is an example of the group section for __kmalloc target function. Note the definition of the fpoint.fault_code parameter. Its value is returnValue = NULL; because '=' characters after the leftmost one have no special meaning and are treated as part of the value.

[group]
    # Name and return type of the target function
    function.name = __kmalloc
    returnType = void*

    # Names and types of the arguments of the target function
    arg.type = size_t
    arg.name = size

    arg.type = gfp_t
    arg.name = flags
    
    # The parameters to be output to the trace. 
    trace.param.type = size_t
    trace.param.name = size

    trace.param.type = unsigned int
    trace.param.name = flags

    trace.param.type = void*
    trace.param.name = returnValue

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%zu, %x), result: %p"

    # Fault Simulation
    fpoint.param.type = size_t
    fpoint.param.name = size

    fpoint.param.type = gfp_t
    fpoint.param.name = flags

    fpoint.fault_code = returnValue = NULL;

# End of the group of definitions for __kmalloc().

Example of the group section for kstrdup() target function:

[group]
    # Name and return type of the target function
    function.name = kstrdup
    returnType = char*

    # Names and types of the arguments of the target function
    arg.type = const char*
    arg.name = str
    
    arg.type = gfp_t
    arg.name = flags

    # Calculate length of the string
    prologue = size_t len = strlen(str);

    # The parameters to be output to the trace.
    trace.param.type = size_t
    trace.param.name = len

    trace.param.type = unsigned int
    trace.param.name = flags

    trace.param.type = void*
    trace.param.name = returnValue

    # The format string to be used for trace output.
    trace.formatString = "arguments: (strlen=%zu, flags=%x), result: %p"

    # Fault Simulation
    fpoint.param.type = size_t
    fpoint.param.name = len

    fpoint.param.type = gfp_t
    fpoint.param.name = flags

    fpoint.fault_code = returnValue = NULL;

# End of the group of definitions for kstrdup().

Note the usage of len variable for fault simulation in the example above. This value is calculated in the prologue based on the target function parameter str and is then used as one of the parameters to be passed to the fault simulation indicator.

It is possible for different replacement functions to share the same indicator function (and, therefore, the scenario). This is more than simply using the same indicator function: it means using a single instance of an indicator. Indicator functions may use data private to each indicator instance; when an indicator is shared, this data is shared as well.

Sharing of the indicator functions can be useful, for example, for the target functions that are known to use a common mechanism internally (e.g. memory allocator), and you want to simulate a failure of this mechanism.

If, say, function g should share the fault simulation scenario with function f, you should define the parameter fpoint.reuse_point in the group for function g, with f as its value. In this case, the group for function f should precede the group for function g. Example of sharing a fault simulation scenario between __kmalloc and krealloc:

    ...
[group]
    # Name and return type of the target function
    function.name = __kmalloc
    returnType = void*

    # Names and types of the arguments of the target function
    arg.type = size_t
    arg.name = size

    arg.type = gfp_t
    arg.name = flags
    
    ...
    
    # Fault Simulation
    fpoint.param.type = size_t
    fpoint.param.name = size

    fpoint.param.type = gfp_t
    fpoint.param.name = flags

    fpoint.fault_code = returnValue = NULL;
    
# End of the group of definitions for __kmalloc().

[group]
    # Name and return type of the target function
    function.name = krealloc
    returnType = void*

    # Names and types of the arguments of the target function
    arg.type = const void*
    arg.name = p

    arg.type = size_t
    arg.name = size

    arg.type = gfp_t
    arg.name = flags
    
    ... 

    # Fault Simulation
    fpoint.reuse_point = __kmalloc

    fpoint.param.type = size_t
    fpoint.param.name = size

    fpoint.param.type = gfp_t
    fpoint.param.name = flags

    fpoint.fault_code = returnValue = NULL;

# End of the group of definitions for krealloc().

Note that in the group for the krealloc function, we use the same names and types for the variables to be passed to the indicator function as for __kmalloc.

After writing the payload.data file, you should change the value of the module_name variable in the makefile and Kbuild to match the value of the module.name parameter. In the future, this step may be implemented in the makefile itself.

The last step is to run the make utility. This will invoke the code generator tool (kedr_gen) to create the sources for your payload module; then the module will be built.

5.4. trace.happensBefore Parameter for Call Monitoring and Fault Simulation Payloads

There is a parameter that changes the control flow of the replacement function in call monitoring and fault simulation payloads:

trace.happensBefore
If this parameter is defined (its precise value does not matter), the trace will be output before the target function is called. When this parameter is defined, neither the middleCode parameter nor the returnValue variable should be used.
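
By analogy with the pseudocode shown in Section 5.2, “Writing Custom Payloads for Call Monitoring”, the control flow with trace.happensBefore defined can be sketched as follows (an illustration; the exact layout of the generated code may differ):

```
returnType replacement_function(arg.name...)
{
    prologue
    {
        output(trace.formatString, trace.param.name...);
        target_function(arg.name...);
    }
    epilogue
}
```

Since the output is performed before the call, the return value of the target function is not available for tracing, which is why returnValue must not be used here.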

The main purpose of the trace.happensBefore parameter is to collect a correct trace on SMP systems and the like. Suppose two threads of execution call the mutex_lock and mutex_unlock functions for the same mutex. One of the correct orders of these calls is:

[1]    mutex_lock
[1]    mutex_unlock
[2]    mutex_lock
[2]    mutex_unlock

([n] means that the operation is performed by the thread n).

So one may expect that the same order will be recorded in the trace:

1    called_mutex_lock
1    called_mutex_unlock
2    called_mutex_lock
2    called_mutex_unlock

As described in Section 5.2, “Writing Custom Payloads for Call Monitoring”, a replacement function calls the target function first and only then outputs its parameters to the trace. So the following order of operations is possible:

[1]    [call replacement function for mutex_lock]
[1]    mutex_lock
[1]    output("called_mutex_lock")
[1]    [replacement function for mutex_lock returns]
[1]    [call replacement function for mutex_unlock]
[1]    mutex_unlock
[2]    [call replacement function for mutex_lock]
[2]    mutex_lock
[2]    output("called_mutex_lock")
[2]    [replacement function for mutex_lock returns]
[1]    output("called_mutex_unlock")
[1]    [replacement function for mutex_unlock returns]
[2]    [call replacement function for mutex_unlock]
[2]    mutex_unlock
[2]    output("called_mutex_unlock")
[2]    [replacement function for mutex_unlock returns]

Even though the order of the calls to the target functions is correct, these operations produce a trace that reflects an impossible call order:

1    called_mutex_lock
2    called_mutex_lock
1    called_mutex_unlock
2    called_mutex_unlock

From the kernel's point of view, the calls to mutex_lock and mutex_unlock are not related to the trace output made by KEDR. So the trace output operations may be performed in any order, no matter in which order the target functions were called.

To get a correct trace, we need to use the trace.happensBefore parameter for the replacement function for mutex_unlock. At the abstract level, this parameter means: “whenever mutex_unlock is called before some other function and nobody enforces this order explicitly, the order should be preserved in the trace”.

Note that although mutex_lock must also be called before the corresponding mutex_unlock, this ordering is not affected by the trace.happensBefore parameter:

...
[1]    mutex_lock
...
[1]    mutex_unlock
...

The difference is that this order is enforced explicitly, that is, by the user of these functions and thus by the target module. If the target module calls, say, mutex_lock strictly before mutex_unlock, the replacement function for mutex_lock will return strictly before the one for mutex_unlock starts executing. This, in turn, automatically ensures that the corresponding trace records appear in the right order too.
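For instance, the definition group for mutex_unlock in a call monitoring payload could look as follows (a sketch; the argument and trace parameter types are assumptions modelled on the kfree group shown below):

```
[group]
    # Name of the target function (it returns void)
    function.name = mutex_unlock

    # Names and types of the arguments of the target function
    arg.type = struct mutex*
    arg.name = lock

    # The parameters to be output to the trace.
    trace.param.type = void*
    trace.param.name = lock

    # Output the trace record before the call to mutex_unlock
    trace.happensBefore = yes

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%p)"

# End of the group of definitions for mutex_unlock().
```

The group for mutex_lock needs no trace.happensBefore parameter: as explained above, the ordering of mutex_lock with respect to the corresponding mutex_unlock is enforced by the target module itself.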

If the trace.happensBefore parameter is defined, the control flow of a replacement function can be described in pseudocode as follows:

returnType replacement_function(arg.name...)
{
    prologue
    output(trace.formatString, trace.param.name...);
    target_function(arg.name...);
    epilogue
}

For a fault simulation replacement function, the control flow is as follows in this case:

returnType replacement_function(arg.name...)
{
    prologue
    output(trace.formatString, trace.param.name...);
    if(indicator_function(fpoint.param.name...) == 0)
    {
        target_function(arg.name...);
    }
    else
    {
        fpoint.fault_code;
    }
    epilogue
}

Another example of an internal happens-before relationship is the one between the kfree and __kmalloc functions. It reflects the fact that __kmalloc cannot return an address that was previously returned by another __kmalloc call and has not been processed by kfree since. Using the trace.happensBefore parameter, one can make sure the order of the trace records is correct:

 
[group]
    # Name and return type of the target function
    function.name = kfree

    # Names and types of the arguments of the target function
    arg.type = void*
    arg.name = p
    
    # The parameters to be output to the trace.
    trace.param.type = void*
    trace.param.name = p

    # Happens-before relationship with __kmalloc
    trace.happensBefore = yes

    # The format string to be used for trace output.
    trace.formatString = "arguments: (%p)"

# End of the group of definitions for kfree().

Note

For the trace records to reflect the fact that a function A has a happens-before ordering with a function B, one should define the trace.happensBefore parameter for function A and leave it undefined for function B. trace.happensBefore cannot be used to enforce two different orderings for the calls to a single function such as krealloc (which may be modelled as __kmalloc + kfree).

Usually, the functions for which the trace.happensBefore parameter makes sense return void and are not interesting for fault simulation.

5.5. Writing Custom Scenarios for Fault Simulation

The fault simulation scenarios described in Section 4.5, “Fault Simulation” are configurable and are probably enough in many cases. If they are not, a kernel module implementing a custom fault simulation indicator can be developed. This section describes how to do this using the tool provided by KEDR to generate source files from templates. The capabilities of this tool are described in detail in Section 5.1, “Using Code Generator to Create Custom Modules”.

The whole infrastructure needed to build the module from the definition file and the templates is located in the custom_indicator_fsim subdirectory of the directory where the examples provided with KEDR are installed. Here are its contents:

indicator.data
'definition' file to create the module that will implement the fault simulation indicator
makefile
file for common build infrastructure for make utility
Kbuild
file for building kernel module from C sources
templates
directory containing the templates used for generating sources from the 'definition' file
calculator.c, calculator.h, control_file.c, control_file.h
additional source and header files that implement some of the indicator's functionality. These files are used for building the module.

To use all this in development of your module, copy the contents of that directory to a directory of your choice.

The first and main step is to edit the indicator.data file to reflect the definitions of your indicator module.

Unlike payload modules for call monitoring or fault simulation, which can implement replacement functions for several target functions in a single module, each fault simulation indicator should be implemented in a separate module. So, groups are not used in the definition file for an indicator module; only the global set of parameters is taken into account.

The indicator module should define the following parameters:

module.author
author of the module
module.license
license for the module
indicator.name
name of the indicator, provided by the module. This is the very name that should be used when one applies the indicator to some target function (to be exact, to a fault simulation point).
indicator.parameter.type
(multi-valued) types of the values that the indicator function accepts. This is an important part of the indicator and will be described later in detail. This parameter may have no value at all, in which case the indicator function accepts no parameters.
indicator.parameter.name
(multi-valued) names of the values that the indicator function accepts.
expression.variable.name
(multi-valued) names of variables that can be used in an expression to set a particular scenario for the indicator (see also Section 4.5, “Fault Simulation”). The names themselves are in no way bound to the names of the variables used in the indicator. The order of the values is not important for this parameter. This parameter may even be left undefined, as there are other ways to declare expression variables.
expression.variable.value
(multi-valued) values of the corresponding expression variables that will be used during the evaluation of the expression (that is, when the indicator function is called). Typically, these values refer to the parameters of the indicator function.
expression.variable.pname
(multi-valued) names of the parameters of the indicator function that can be used in an expression to set a particular scenario for the indicator. expression.variable.pname = var_a is equivalent to expression.variable.name = var_a plus expression.variable.value = var_a. This parameter may even be left undefined, as there are other ways to declare expression variables.
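For example, the header part of a definition file for a trivial indicator that accepts no parameters might look like this (the author and indicator names are, of course, just placeholders):

```
module.author = Jane Developer
module.license = GPL

# The name to be used when applying the indicator
# to a fault simulation point
indicator.name = my_trivial_indicator

# No indicator.parameter.* definitions: the indicator function
# accepts no parameters, so its expressions can only use constants
# (and the 'pid' parameter common to all generated indicators).
```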

The main characteristic of a fault simulation indicator is the set of scenarios it can implement. Apart from the pid parameter, which is available for every generated indicator and simply restricts the scope of fault simulation, expression is the only parameter of the indicator that may affect the fault simulation scenario. An expression that uses only constant integers as arguments can implement only the “always simulate a failure” or “never simulate a failure” scenarios. But if the expression can use variables that may have different values each time the expression is evaluated, the set of supported scenarios grows dramatically.

One type of variable that can be used in the expression is a parameter of the target function. E.g., the expression (size > 100), where size corresponds to the target function's parameter, implements the scenario “simulate a failure when size is greater than 100”. The only way for the indicator to support such usage of a target function's parameter is to declare it as a parameter of the indicator function. The corresponding replacement function should then pass this parameter to the indicator function whenever the latter is called to decide whether a failure should be simulated. This behaviour of the indicator is achieved by the following definitions (assuming the size parameter of the target function has type size_t):

indicator.parameter.type = size_t
indicator.parameter.name = size

This fragment only states that the indicator function itself accepts the parameter size. To permit using this parameter in the expression, the following definition should be used as well:

expression.variable.pname = size

The parameters expression.variable.name and expression.variable.value may be useful in cases like these:

...
indicator.parameter.type = const char*
indicator.parameter.name = str
...
# Expression may use variables only with integer values, so we cannot use 
# a string parameter in it.
# But we can use the length of this string as parameter 'len'
expression.variable.name = len
expression.variable.value = strlen(str)
...

...
indicator.parameter.type = size_t
# Cannot use 'strlen' as name of the parameter, because strlen() is 
# the kernel function.
indicator.parameter.name = len
...
# But here 'strlen' is available - this is not a name of C variable.
expression.variable.name = strlen
# We only need to bind expression variable to its value.
expression.variable.value = len
...

However, if we declare that the indicator function accepts the parameter size of type size_t, we make this indicator inapplicable to target functions that do not accept a parameter of that type. To be more exact, the indicator cannot be used from replacement functions that do not provide a parameter of this type to the indicator function. This limitation holds even if the parameter is not actually used in the current scenario.

Although it is acceptable for the expression to use variables that are not derived from the indicator's parameters, like

expression.variable.name = prob50
expression.variable.value = random() % 2

it is not recommended, because there is a more efficient way to do this. Variables of this kind are evaluated every time the indicator function is called, whether or not they are actually used in the expression, and this evaluation may take a relatively long time in some cases. There is another type of variable applicable in such cases: runtime variables. They are declared in the following format:

expression.rvariable.name = prob50
expression.rvariable.code =>>
    return random() % 2;
<<

The expression.rvariable.code parameter provides the body of the function that will be called whenever the value of the variable is actually needed. The costs of this optimization are a function call instead of inlined code when the value of the variable is needed, and the inability to use local variables (and parameters) of the indicator function to prepare the value of the runtime variable.

To simplify writing the expressions and to make them more readable, named constants can be declared and then used in them. There are two ways to do this:

expression.constant.name = constant_100
expression.constant.value = 100

This makes the constant with name constant_100 and value 100 available for usage in the expressions.

expression.constant.cname = GFP_ATOMIC

This makes the constant named GFP_ATOMIC, which evaluates to the value of the GFP_ATOMIC macro, available for use in the expressions. The expression flags == GFP_ATOMIC is clearer and easier to read than flags == 32, isn't it?

Additionally, the following parameter can be defined:

global
code (may be multiline) that will be inserted at the global scope; its definitions will be visible everywhere in the source file of the indicator. Usually, this code contains #include directives for the header files that define the types of the indicator's parameters and declare the functions used to obtain the values of those parameters.

An example of the definition file for an indicator for kmalloc-like functions:

# This module implements indicator for kmalloc-like functions.

module.author = Andrey Tsyvarev
module.license = GPL

global =>>
#include <linux/gfp.h>      /* gfp_flags constants */
#include <linux/types.h>    /* size_t */
<<

indicator.name = kmalloc

indicator.parameter.type = size_t
indicator.parameter.name = size

indicator.parameter.type = gfp_t
indicator.parameter.name = flags

expression.constant.cname = GFP_NOWAIT
expression.constant.cname = GFP_KERNEL
expression.constant.cname = GFP_USER
expression.constant.cname = GFP_ATOMIC

expression.variable.pname = size
expression.variable.pname = flags

After writing the indicator.data file, change the value of the module_name variable in the makefile and in Kbuild so that it matches the value of the module.name parameter.

The last step is to run the make utility. This will invoke the code generator tool (kedr_gen) to create the sources of your indicator module, and then the module will be built.

5.6. Implementing Custom Types of Analysis

KEDR is not just a system for call monitoring and fault simulation of target kernel modules. It is a framework that allows one to implement various kinds of analysis based on the information about the function calls made by a target module.

In this section, we will show how to create a custom analysis system on top of KEDR. The system we are going to use as an example is rather simple: it maintains a set of counters accessible from user space that provide some information about the actions of the target module.

This analysis system will use neither call monitoring nor fault simulation facilities of KEDR. It will only rely on KEDR core and on the API it provides. Other types of analysis could be implemented in a way similar to this example.

In general, a custom analysis system based on KEDR can be created in the following steps.

  1. Determine which information about the actions of the target module should be processed by your analysis system. Decide whether it is enough to process (and maybe alter to some extent) the function calls to collect this information. If so, KEDR could be of help here.

  2. Determine the calls to which functions your system needs to intercept to collect the necessary data or to alter the behaviour of the target module in the required way. Note that only ordinary functions count here, not macros or inline functions.

  3. Prepare the source code of the payload module for KEDR that will process these intercepted functions. The examples we provide with KEDR as well as the skeleton of a payload module described in this manual can be helpful here.

  4. Build the payload module. This is done in almost the same way as for any other kernel module.

Once the above steps are completed, KEDR utilities can be used to load your payload module along with the KEDR core. You can now load the target module and your system will start analyzing it.

Note

The source code of the analysis system developed in this example is available in

<kedr_install_dir>/share/kedr/examples/counters/.

5.6.1. Choosing the Counters and the Functions to Process

Suppose the following counters are going to be supported by our analysis system:

  • total number of memory allocation attempts;

  • number of memory allocation attempts that have failed;

  • size of the largest memory block requested to be allocated;

  • total number of mutex lock operations;

  • mutex balance, i.e. the difference between the total numbers of lock and unlock operations.

To make the counters accessible from user space, we can, for example, provide a file for each of them in the kedr_counters_example directory in debugfs.
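Inside the payload module, each counter can be backed by a plain variable protected by a spinlock. The declarations could look like this (a sketch; the names of the allocation-related counters and spinlocks follow the replacement function shown later in this section, the rest is an assumption about the Counters example):

```c
/* The counters (static variables are zero-initialized) and
 * the spinlocks protecting them. */
static size_t cnt_alloc_total;
static DEFINE_SPINLOCK(spinlock_alloc_total);

static size_t cnt_alloc_failed;
static DEFINE_SPINLOCK(spinlock_alloc_failed);

static size_t cnt_alloc_max_size;
static DEFINE_SPINLOCK(spinlock_alloc_max_size);

static size_t cnt_mutex_locks;
static long   cnt_mutex_balance; /* locks minus unlocks */
static DEFINE_SPINLOCK(spinlock_mutex);
```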

Once we have decided which data concerning a target kernel module our system will be collecting and processing, we need to determine which function calls made by the module the system should intercept.

Consider the first three counters. All of them are related to memory allocation. To collect necessary data when the target module operates, we can use call interception facilities provided by KEDR. When the target module calls some function that allocates memory, the corresponding function provided by our analysis system will be called instead with the same arguments.

There are a number of memory allocation functions available to kernel modules. Assume for simplicity that we choose to process only the calls to the following ones:

  • void* __kmalloc(size_t size, gfp_t flags)

  • void* krealloc(const void* p, size_t size, gfp_t flags)

  • void* kmem_cache_alloc(struct kmem_cache* mc, gfp_t flags)

Note

It should not be very hard to extend this example to support other functions that allocate memory like vmalloc(), kstrdup(), etc.

To collect data necessary to provide the remaining two counters, our system needs to process the calls to the operations with mutexes:

  • void mutex_lock(struct mutex* lock)

  • int mutex_lock_interruptible(struct mutex* lock)

  • int mutex_lock_killable(struct mutex* lock)

  • int mutex_trylock(struct mutex* lock)

  • void mutex_unlock(struct mutex* lock)

Note

Note that the functions may be different for different variants and versions of the Linux kernel. There is no stable binary interface in the Linux kernel anyway. Please choose memory allocation operations and mutex-related functions appropriate for your kernel.

5.6.2. Creating the Payload Module

To implement our analysis system, we need to create a payload module for KEDR. As a starting point, we can use, for example, the skeleton of a module given in Section 6.1.7, “A Stub of a Payload Module”. The module should provide a replacement function for each function we have chosen above.

The instance of struct kedr_payload could be filled as follows (this structure should be used when registering and unregistering the payload module with KEDR core):

/* Names and addresses of the functions of interest */
static void* orig_addrs[] = {
    (void*)&__kmalloc,
    (void*)&krealloc,
    (void*)&kmem_cache_alloc,
    (void*)&mutex_lock,
    (void*)&mutex_lock_interruptible,
    (void*)&mutex_lock_killable,
    (void*)&mutex_trylock,
    (void*)&mutex_unlock
};

/* Addresses of the replacement functions - must go 
 * in the same order as for the original functions.
 */
static void* repl_addrs[] = {
    (void*)&repl___kmalloc,
    (void*)&repl_krealloc,
    (void*)&repl_kmem_cache_alloc,
    (void*)&repl_mutex_lock,
    (void*)&repl_mutex_lock_interruptible,
    (void*)&repl_mutex_lock_killable,
    (void*)&repl_mutex_trylock,
    (void*)&repl_mutex_unlock
};

static struct kedr_payload counters_payload = {
    .mod                    = THIS_MODULE,
    .repl_table.orig_addrs  = &orig_addrs[0],
    .repl_table.repl_addrs  = &repl_addrs[0],
    .repl_table.num_addrs   = ARRAY_SIZE(orig_addrs),
    .target_load_callback   = NULL,
    .target_unload_callback = NULL
};

The initial value of each counter is 0. The replacement functions update the counters, holding special spinlocks while doing so to avoid some of the concurrency issues. For example, the replacement function for __kmalloc() looks like this:

static void*
repl___kmalloc(size_t size, gfp_t flags)
{
    unsigned long irq_flags;
    void* returnValue;
    
    /* Call the target function */
    returnValue = __kmalloc(size, flags);
    
    spin_lock_irqsave(&spinlock_alloc_total, irq_flags);
    ++cnt_alloc_total;
    spin_unlock_irqrestore(&spinlock_alloc_total, irq_flags);
    
    spin_lock_irqsave(&spinlock_alloc_failed, irq_flags);
    if (returnValue == NULL) ++cnt_alloc_failed;
    spin_unlock_irqrestore(&spinlock_alloc_failed, irq_flags);
    
    spin_lock_irqsave(&spinlock_alloc_max_size, irq_flags);
    if (size > cnt_alloc_max_size) cnt_alloc_max_size = size;
    spin_unlock_irqrestore(&spinlock_alloc_max_size, irq_flags);

    return returnValue;
}

This replacement function calls __kmalloc() (the target function) and records its return value. After that, it updates the variables corresponding to the relevant counters: cnt_alloc_total, cnt_alloc_failed and cnt_alloc_max_size. Generally, it is not mandatory to call the target function there (see, for example, Section 4.5, “Fault Simulation”), but it is necessary for the kind of analysis we implement in this example.
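The replacement functions for the mutex-related target functions could follow the same pattern. Here is a sketch for mutex_lock and mutex_unlock (the counter and spinlock names are assumptions; see the Counters example for the actual code):

```c
static void
repl_mutex_lock(struct mutex* lock)
{
    unsigned long irq_flags;

    /* Call the target function */
    mutex_lock(lock);

    /* Update the mutex-related counters */
    spin_lock_irqsave(&spinlock_mutex, irq_flags);
    ++cnt_mutex_locks;
    ++cnt_mutex_balance;
    spin_unlock_irqrestore(&spinlock_mutex, irq_flags);
}

static void
repl_mutex_unlock(struct mutex* lock)
{
    unsigned long irq_flags;

    /* Call the target function */
    mutex_unlock(lock);

    spin_lock_irqsave(&spinlock_mutex, irq_flags);
    --cnt_mutex_balance;
    spin_unlock_irqrestore(&spinlock_mutex, irq_flags);
}
```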

The technical details concerning the creation of the files for the counters in debugfs are not described here. If you are interested in these details, see the source code of the Counters example.

Note

Note that it is not mandatory to implement all the counters in a single payload module. For example, we could provide a module that implements the counters related to memory allocation and another one for those dealing with mutexes. As long as the sets of the target functions do not intersect with each other, we may create a separate module for each set and use all these modules at the same time to analyse the target module.

For simplicity, we implement all the counters in a single payload module in this example.

5.6.3. Building the Payload Module

The payload module we have prepared can be built in much the same way as any other kernel module. Still, there are a couple of things to take into account.

First, the module uses header files provided by KEDR, so the top-level include directory of KEDR should be specified in the -I compiler option. This directory is usually <kedr_install_dir>/include/.

Second, each payload module uses functions exported by the KEDR core and therefore needs the appropriate .symvers file. Before building the module, copy the kedr_base.symvers file provided by KEDR to the directory of the payload module and rename it to Module.symvers. kedr_base.symvers is usually located in /lib/modules/`uname -r`/symvers/, or in <kedr_install_dir>/lib/modules/`uname -r`/symvers/ in case of a non-global installation of KEDR.

You can look at the Kbuild file and the makefile of the Counters example to see how the payload module is built there.

5.6.4. Using the Payload Module

Now that the payload module of our analysis system is built, we can use it to see how the values of the counters change as the target module operates. You can choose any kernel module as a target if you know how to load it properly and make it operate.

Important

It is not recommended to use payload modules implementing different types of analysis simultaneously. That is, it is better not to mix the payload module from the Counters example with those intended for call monitoring, fault simulation, etc. One of the problems that may arise here is conflicting sets of target functions. Currently, KEDR does not detect whether the sets of target functions processed by the payload modules intersect. If a target function is processed by more than one of the currently loaded payload modules, the behaviour of KEDR is undefined.

Our analysis system makes the counters available via files in debugfs. So if debugfs is not mounted (its directory is usually /sys/kernel/debug/), mount it first to a directory of your choice. For example,

mount -t debugfs debugfs some_dir/debugfs

Now it is time to load KEDR core and kedr_counters.ko payload module that we have built before. The easiest way is probably to create a configuration file, say, my.conf, with the following contents:

payload path_to_example_directory/kedr_counters.ko

and use kedr start with that file:

kedr start <name_of_target_module> my.conf

See Section 4.1, “Controlling KEDR” for detailed information about the configuration files, kedr start, etc.

Load the target module and do something with it. While it is working (and also after it is unloaded), you can check the counters shown in the files in the kedr_counters_example subdirectory in debugfs.

tester@lab-x86:> cd /sys/kernel/debug/kedr_counters_example/
tester@lab-x86:> ls
alloc_failed  alloc_max_size  alloc_total  mutex_balance  mutex_locks

tester@lab-x86:> cat alloc_max_size 
Maximum size of a memory chunk requested: 48

Note

Note that if you unload the target module and then load it again while the analysis system (KEDR core modules and kedr_counters.ko payload module) is loaded, the counters will not be reset. This is a known limitation of this example.