4. Using KEDR

4.1. Controlling KEDR
4.1.1. General
4.1.2. Usage
4.1.3. Options
4.1.4. Description
4.1.5. Caveats
4.1.6. Configuration file
4.1.7. Examples
4.2. Capturing the Trace
4.2.1. General
4.2.2. Usage
4.2.3. Description
4.2.4. Options
4.2.5. Examples
4.3. How KEDR Works
4.4. Call Monitoring (Call Tracing)
4.5. Fault Simulation
4.6. Detecting Memory Leaks
4.6.1. Typical Usage
4.6.2. Reports
4.6.3. Analyzing the Results
4.6.4. Stack Depth
4.6.5. Caveats
4.7. Analyzing the Trace
4.7.1. General
4.7.2. Locating the Calls in the Sources with GDB
4.7.3. Locating the Calls in the Sources with Objdump
4.7.4. Obtaining the Call Stack

4.1. Controlling KEDR

4.1.1. General

kedr - a service-like tool to control KEDR.

4.1.2. Usage

kedr start target_name [ -c conf_string | -f conf_file ...]

kedr stop

kedr status

kedr restart

4.1.3. Options

-c conf_string

conf_string string provides configuration parameters that KEDR will use when loading and unloading.

-f conf_file

conf_file file provides configuration parameters that KEDR will use when loading and unloading. For the files in the default configuration directory (/var/opt/kedr/configs or <install_prefix>/var/configs, depending on where KEDR is installed), the directory part of the path may be omitted. In the current version, the default configuration directory contains the following files:

callm.conf

Configuration profile for call monitoring (call tracing).

fsim.conf

Configuration profile for fault simulation.

leak_check.conf

Configuration profile for memory leak detection.

default.conf

The default configuration profile. It is also a profile for call monitoring (the same as callm.conf).

All conf_string and conf_file arguments of the command will actually be combined into one configuration file by KEDR control tool, in the same order as they are listed. The resulting profile will be used when loading and unloading KEDR.

If no conf_string and conf_file options are given, the command will use the default configuration file (/var/opt/kedr/configs/default.conf or <install_prefix>/var/configs/default.conf, depending on where KEDR is installed).

4.1.4. Description

4.1.4.1. kedr start

kedr start loads kedr_base module first (it is a part of KEDR core). Then it processes the configuration file in the on_load mode and executes all resulting strings. The configuration file is the default one or the one prepared based on the options of the command (see Section 4.1.3, “Options”). Finally, kedr_controller module is loaded (also part of the KEDR core) with target_name parameter equal to target_name. That is, the KEDR core is now configured to operate on the module named target_name.

If kedr_base module is already loaded, kedr start does nothing and returns 1.

If the module with name target_name is currently loaded, kedr start does nothing and returns 1.

If starting kedr_base or kedr_controller modules or processing some of the lines in the configuration file fails (the corresponding operation returns nonzero), a rollback is performed. That is, all lines in the configuration file, up to the failed line, are processed in on_unload mode, and kedr_base module is unloaded (if it has been started successfully before). Then 1 is returned.

4.1.4.2. kedr stop

kedr stop unloads kedr_controller module first. Then it processes the configuration file (the same file as was used at the last start) in on_unload mode and executes the resulting strings. Finally, kedr_base module is unloaded.

If kedr_base module is not loaded, kedr stop does nothing and returns 1.

If the module with name target_name is currently loaded, kedr stop does nothing and returns 1.

4.1.4.3. kedr status

kedr status outputs information about the current status of KEDR, that is, whether kedr_base and kedr_controller modules are loaded, which payload modules are currently loaded, and whether the target module (the module under analysis) is currently loaded.
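
For scripts that need a quick check of their own, the list of loaded modules can also be read from /proc/modules. The sketch below is not how kedr status is implemented. The function name is_module_loaded and its optional second argument (a file to read instead of /proc/modules) are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named module is listed as loaded.
# The optional second argument names a file in /proc/modules format.
is_module_loaded() {
    # The kernel reports module names with '-' replaced by '_'.
    name=$(printf '%s' "$1" | tr '-' '_')
    # Each line of /proc/modules starts with the module name and a space.
    grep -q "^$name " "${2:-/proc/modules}" 2>/dev/null
}

for m in kedr_base kedr_controller; do
    if is_module_loaded "$m"; then
        echo "$m: loaded"
    else
        echo "$m: not loaded"
    fi
done
```

Unlike kedr status, this check says nothing about which of the loaded modules are payload modules.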

4.1.4.4. kedr restart

kedr restart does effectively the same as

kedr stop && kedr start target_name conf_file

with target_name being the name of the current target and conf_file being the effective configuration file (the default one or the one constructed from options) with which KEDR was started last time.

If kedr_base module is not currently loaded, kedr restart does nothing and returns 1.

If module with name target_name is currently loaded, kedr restart does nothing and returns 1.

If kedr_base module is still loaded after stop operation has been executed, start operation will not run and kedr restart will return 1.

4.1.5. Caveats

In its start, stop and restart modes, kedr may break the usage of the trace. So, trace capturing mechanism (kedr_capture_trace) should not be running when kedr command is executed in these modes.

4.1.6. Configuration file

The configuration file is treated as an ordered list of lines each of which has one of the following forms:

on_load shell-command
on_unload shell-command
module|payload module-name|module-filename [parameters...]

Besides that, empty lines as well as the lines containing only spaces and lines starting with # character are allowed and ignored when the control tool processes the configuration file.

A line containing shell-command preceded by on_load is executed in on_load mode and is ignored otherwise.

A line containing shell-command preceded by on_unload is executed in on_unload mode and is ignored otherwise.

on_load and on_unload modes are described in Section 4.1.4, “Description”.

Line

module module-name [parameters...]

where module-name is the name of a module or its alias, is actually equivalent to

on_load modprobe module-name [parameters...]
on_unload modprobe -r module-name 

Line

module module-filename [parameters...]

where module-filename is the absolute path to the module, is actually equivalent to

on_load insmod module-filename [parameters...]
on_unload rmmod module-filename 

payload keyword at the start of the line has the same meaning as module but also marks module-name or module-filename as a payload module.

In on_load mode, all processed lines in the configuration file are executed from the first to the last. In on_unload mode, they are executed in reverse order (i.e. if module A is loaded after module B, then A is unloaded before B).
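
The rules above can be tried out with a small dry-run script. This is only a sketch of the described semantics, not the code of the KEDR control tool. The function names expand_lines and expand_profile are hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Illustration only: print the commands a configuration profile expands to.
# Usage: expand_profile on_load|on_unload < profile

expand_lines() {
    mode=$1
    while read -r keyword first rest; do
        case "$keyword" in
            ''|'#'*)            # empty line, spaces only, or comment
                continue ;;
            on_load|on_unload)  # shell command for one particular mode
                if [ "$keyword" = "$mode" ]; then
                    echo "$first${rest:+ $rest}"
                fi ;;
            module|payload)     # shorthand for a load/unload pair
                case "$first" in
                    /*) load='insmod';   unload='rmmod' ;;        # absolute path
                    *)  load='modprobe'; unload='modprobe -r' ;;  # name or alias
                esac
                if [ "$mode" = on_load ]; then
                    echo "$load $first${rest:+ $rest}"
                else
                    echo "$unload $first"
                fi ;;
        esac
    done
}

expand_profile() {
    if [ "$1" = on_unload ]; then
        # on_unload commands are executed in reverse order
        expand_lines on_unload |
            awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
    else
        expand_lines on_load
    fi
}

# Prints "modprobe -r payload1", then "modprobe -r module_aux".
expand_profile on_unload <<'EOF'
module module_aux
payload payload1 arg1 arg2
EOF
```

The sketch treats payload exactly like module, because the two differ only in that payload marks the module as a payload module.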

4.1.7. Examples

kedr start module1

This command will start KEDR with the default payloads for call monitoring. When the module named module1 is loaded, it will be processed by KEDR.

kedr start module1 -c 'payload payload1 arg1 arg2' -c 'payload payload2'

Same as above, but instead of loading the default payloads, it will load payload modules payload1 (with arguments arg1 arg2) and payload2.

If the payload module payload1 uses symbols from another module, say module_aux, then the configuration file should contain the following lines (the order is important!)

...
module module_aux
payload payload1
...

4.2. Capturing the Trace

4.2.1. General

kedr_capture_trace - a tool to capture the data output by payload modules to the trace.

4.2.2. Usage

kedr_capture_trace [OPTIONS]

4.2.3. Description

kedr_capture_trace captures the trace information output by payload modules until terminated by SIGINT signal or, when used with '-s' option, until all trace information from target module session has been received.

Each trace record can be written to the specified file(s) ('-f' option) and/or passed to user-specified application(s) as STDIN ('-p' option). If neither '-f' nor '-p' option is specified, all trace records are output to STDOUT.

4.2.4. Options

-d debugfs_mount_point

Specify the directory the debugfs filesystem is currently mounted to. This filesystem contains the trace file from which the trace will be captured. If this argument is not supplied, it is assumed that the mount point of debugfs filesystem is /sys/kernel/debug.

-f, --file file

Store every trace record in the given file. If the file doesn't exist, kedr_capture_trace creates it at the beginning, otherwise appends new data to the file.

-p, --program program

On start, kedr_capture_trace launches program. Then it pipes every trace record to the standard input of this program. When capturing is to be stopped, kedr_capture_trace closes its side of pipe and waits until the program terminates.

-s, --session

Read only those records from the trace that were collected since the target module had been loaded and until the target module had been unloaded (target session). Capturing stops after the last record from the target session has been processed.

Note

'-s' option should not be used if some trace records from the current target session have already been processed (and, consequently, removed from the trace).

4.2.5. Examples

kedr_capture_trace -f trace.txt
^C

store all records from the trace in the file trace.txt until 'Ctrl+C' is pressed.

kedr_capture_trace -p "grep called_kfree > frees.txt"
^C

store all records from the trace corresponding to kfree() calls in the file frees.txt, until 'Ctrl+C' is pressed.

kedr_capture_trace -p "bzip2 > trace.bz2"
^C

pack the trace records to trace.bz2 archive on the fly.

kedr_capture_trace \
-p "grep called_kfree > frees.txt" \
-p "grep called___kmalloc > allocs.txt" \
-f trace.txt
^C

store all records from the trace in trace.txt and the records corresponding to kfree() calls and to __kmalloc() calls in frees.txt and allocs.txt, respectively.

kedr start moduleA
kedr_capture_trace -s -f trace.txt &
pid=$!
/sbin/modprobe moduleA
...
/sbin/modprobe -r moduleA
wait $pid
kedr stop

store all records from the trace generated by moduleA in the file trace.txt.

See also Section 3.3, “Call Monitoring (Call Tracing)”.

4.3. How KEDR Works

KEDR has a plugin-based architecture similar to the architecture of Valgrind and other binary analysis frameworks for user-space applications and libraries.

Each data collection and analysis tool based on KEDR has at least the following parts:

  • KEDR core - the core components provided by the framework. Their main responsibility is to instrument the target module when it is loaded. The core also provides utilities to simplify commonly used operations, for example, output of the collected data.

  • One or more tool plugins (payload modules) that actually collect and, if necessary, analyze the data about the target module.

The interaction between the components involved in the analysis of Linux kernel modules with KEDR is outlined in the figure below. The logical components related to KEDR are colored in green.

KEDR: system architecture

KEDR core (Loading Detector and Call Instrumentation Facilities on the figure) detects when the module to be analyzed (target module) is loaded. The notification system provided by the Linux kernel is used to do this.

When the target has been loaded but before it begins its initialization, KEDR core takes control and instruments it, that is, it changes the memory image of that module in a special way. In particular, the calls to the functions that should be intercepted are now replaced with the calls to the functions with the same signatures provided by payload modules. A payload module should register itself with KEDR core for this to work.

After the instrumentation is done, the target module is allowed to begin its initialization.

The actual analysis of the target kernel module is performed by payload modules of different types.

If the target module tries to call a function which one of the payload modules is interested in (target function), that payload module takes control and executes the corresponding replacement function instead. This way, the payload module can get access to the arguments of the target function. In addition, the payload module can change the behaviour of the target module, for example make it look like the target function has failed, etc.

Payload modules can do various things:

  • Collect data about which function was called when, with what arguments, and what it returned (Call Monitor on the figure). The data is output to a trace (via Data Collector facilities) and can then be analyzed by user-space applications.

  • Simulate the situations when the kernel seems to fail to satisfy some of the requests made by the target module (Fault Simulator on the figure). That is, simulate low memory conditions, etc. Fault simulation is performed according to the scenarios selected by the user. It can be used to check if the module is still reliable in error conditions, if possible failures of the called functions are properly handled by the module, etc.

  • Check various requirements for the operation of the module (Base Checker on the figure): whether it uses virtual memory facilities in a right way, etc.

  • ...And much more (see Section 5.6, “Implementing Custom Types of Analysis”).

When loaded, payload modules register themselves with the KEDR core (kedr-base module, actually) - see Section 6.1, “API for Payload Modules”.

The user controls the analysis process via convenience API and tools (Kernel Module Analysis API + Tools on the figure) - either manually or via a user-space application.

Note

Note that the term API is used here in a rather broad sense. Currently, it is not a set of functions or classes provided for the applications written in some programming language. It rather consists of all the means that KEDR provides for the user-space applications to start the analysis process, to control it and to retrieve the results. These means include the parameters of the kernel modules from KEDR, the special files that allow to configure fault simulation scenarios, etc. All these facilities are described in the respective sections of this manual.

KEDR control tool makes sure each component of KEDR is loaded in proper order and with proper parameters. It does roughly the following (the more precise description is available in Section 4.1, “Controlling KEDR”):

  1. Loads the kedr-base module, which is responsible for keeping track of payload modules and for providing an API to them.

  2. Loads the payload modules listed in the configuration file. Each payload module should register itself with the KEDR core by calling kedr_payload_register (see Section 6.1, “API for Payload Modules”); this is usually done in its init function.

  3. Loads the kedr-controller module passing it the name of the target module as a parameter (target_name). From this moment, the KEDR core begins watching for the specified target module to load (see also Section 3, “Getting Started”).

When the analysis is done, the KEDR control tool unloads the modules mentioned above, in appropriate order.
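
The order of these operations can be sketched as a dry run that only prints what would happen. The function name kedr_sequence is hypothetical, and the use of modprobe for the core modules is an assumption of this sketch. The actual commands depend on the configuration file and on how KEDR is installed.

```shell
#!/bin/sh
# Dry run: print the order of operations; nothing is loaded or unloaded.
# Usage: kedr_sequence <target_name> [payload-module...]
kedr_sequence() {
    target=$1
    shift
    unload=""
    echo "modprobe kedr_base"
    for payload in "$@"; do
        echo "modprobe $payload"
        # prepend, so that payloads end up listed in reverse order
        unload="modprobe -r $payload
$unload"
    done
    echo "modprobe kedr_controller target_name=$target"
    echo "# ... the target module is loaded, used and unloaded here ..."
    echo "modprobe -r kedr_controller"
    printf '%s' "$unload"
    echo "modprobe -r kedr_base"
}

kedr_sequence kedr_sample_target kedr_cm_cmm kedr_cm_mutexes
```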

Note

Note that the KEDR core makes no assumptions about what exactly the currently registered payload modules do. This makes it possible to implement different types of analysis with the help of KEDR.

4.4. Call Monitoring (Call Tracing)

Call monitoring facilities provided by KEDR make it possible to collect data about the calls to the functions of interest (target functions) made by a kernel module. In this case, each replacement function calls the corresponding target function and outputs its arguments and return value to a trace.

This is similar to what strace utility does for user-space applications.

KEDR contains a set of payload modules intended to perform call monitoring. They collect information about which kernel functions were called by the target module and in what order, about the parameters passed to these functions and about the return values. This information is output to a trace and can then be used by user-space applications at runtime with the help of kedr_capture_trace tool.

The standard payload modules for call monitoring are built and installed with KEDR by default. If you would like to disable this, set CMake variable KEDR_STANDARD_CALLM_PAYLOADS to OFF when executing CMake before building KEDR:

cmake -DKEDR_STANDARD_CALLM_PAYLOADS=OFF <other_options> <path_to_kedr_sources>

The format of the output data is similar to the trace format of the ftrace system. Here is an example of such trace (the header line is shown only for the purpose of description):

  TASK-PID  CPU#  TIMESTAMP   FUNCTION
--------------------------------------------------------------------
insmod-6416 [001] 805.997320: target_session_begins: target module: "kedr_sample_target"
insmod-6416 [001] 805.997615: called___kmalloc: ([<ffffffffa00e70b9>] init+0xb9) 
    arguments: (320, d0), result: ffff8800165a8000
dd-6438     [000] 858.641942: called___kmalloc: ([<ffffffffa01d661e>] core+0x61e) 
    arguments: (4000, d0), result: ffff88001659e000
dd-6438     [000] 858.642074: called_copy_from_user: ([<ffffffffa01d642a>] core+0x42a) 
    arguments: (ffff88001659e000, 000000000137d000, 1), result: 0
...
rmmod-6441  [001] 869.438875: called_kfree: ([<ffffffffa01d60d8>] core+0xd8) 
    arguments: (ffff88001659e000)
rmmod-6441  [001] 869.438879: called_kfree: ([<ffffffffa01d60d8>] core+0xd8) 
    arguments: ((null))
rmmod-6441  [001] 869.438881: called_kfree: ([<ffffffffa01d6108>] core+0x108) 
    arguments: (ffff8800165a8000)
rmmod-6441  [001] 869.438885: target_session_ends: target module: "kedr_sample_target"

FUNCTION field has the following format for the records corresponding to the detected function calls:

called_<function-name>: (<call_address>) arguments: (<arguments-list>), result: <value-returned>

<call_address> specifies the address of the next instruction after the call to the target function. This field has the following format:

[<absolute_address>] <area>+<offset>

<absolute_address> is the absolute address of the instruction immediately following the call in the memory image of the target module. <area> can be init or core. It is the name of the area containing the executable code of the module (these terms are used by module loader in the Linux kernel). An area may contain one or more sections (ELF sections) of the module. <offset> is the offset of the instruction from the beginning of the area.

If you would like to find the lines in the source code of the target module corresponding to the addresses given in the trace records, see Section 4.7, “Analyzing the Trace” for details.

A different format is used for marker records indicating loading and unloading of the target module:

target_session_<begins|ends>: target module: "<target-module-name>"
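
Because each call record carries a called_<function-name>: field, a captured trace is easy to summarize with standard text tools. The following sketch counts the recorded calls per target function. The function name count_calls is hypothetical, and the sample records are abridged from the example trace above, with each record written on a single line.

```shell
#!/bin/sh
# Count the recorded calls per target function in a trace read from stdin.
count_calls() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^called_.+:$/) {
                # strip the "called_" prefix (7 characters) and the final ":"
                name = substr($i, 8, length($i) - 8)
                calls[name]++
            }
    }
    END { for (name in calls) print name, calls[name] }' | LC_ALL=C sort
}

# Prints "__kmalloc 1", then "kfree 1"; marker records are not counted.
count_calls <<'EOF'
insmod-6416 [001] 805.997320: target_session_begins: target module: "kedr_sample_target"
insmod-6416 [001] 805.997615: called___kmalloc: ([<ffffffffa00e70b9>] init+0xb9) arguments: (320, d0), result: ffff8800165a8000
rmmod-6441  [001] 869.438881: called_kfree: ([<ffffffffa01d6108>] core+0x108) arguments: (ffff8800165a8000)
rmmod-6441  [001] 869.438885: target_session_ends: target module: "kedr_sample_target"
EOF
```

The same function can be applied to a file written by kedr_capture_trace -f, for example count_calls < trace.txt.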

Only calls to a subset of all kernel functions are detected. This subset, however, can easily be extended by implementing your own modules (see Section 5.2, “Writing Custom Payloads for Call Monitoring”). Here is the full list of the payload modules that currently may be used for call monitoring, and the lists of the functions detected by each module. A function name in square brackets means that this function is detected only if it is exported by the kernel (it may or may not be exported in each particular system). Only one of the functions separated by a slash is detected, depending on which one of them is actually exported by the kernel.

  • kedr_cm_cmm.ko:
    __kmalloc
    krealloc
    __krealloc
    kfree
    kzfree
    kmem_cache_alloc
    [kmem_cache_alloc_notrace]
    [kmem_cache_alloc_trace]
    kmem_cache_free
    __get_free_pages
    get_zeroed_page
    free_pages
    [__kmalloc_node]
    [kmem_cache_alloc_node]
    [kmem_cache_alloc_node_notrace]
    [kmem_cache_alloc_node_trace]
    [__alloc_pages_nodemask]
    [alloc_pages_current]
    [__free_pages]
    [alloc_pages_exact]
    [free_pages_exact]
  • kedr_cm_user_space_access.ko:
    copy_to_user/_copy_to_user
    copy_from_user/_copy_from_user
    strndup_user
    memdup_user
  • kedr_cm_mutexes.ko:
    __mutex_init
    mutex_lock
    mutex_lock_interruptible
    mutex_lock_killable
    mutex_trylock
    mutex_unlock
  • kedr_cm_spinlocks.ko:
    _spin_lock_irqsave/_raw_spin_lock_irqsave
    _spin_unlock_irqrestore/_raw_spin_unlock_irqrestore
    additionally, if KEDR was configured with enable_full_spinlock option set,
    _spin_lock/_raw_spin_lock
    _spin_lock_irq/_raw_spin_lock_irq
    _spin_unlock/_raw_spin_unlock
    _spin_unlock_irq/_raw_spin_unlock_irq
  • kedr_cm_waitqueue.ko:
    __wake_up
    init_waitqueue_head/__init_waitqueue_head
    prepare_to_wait
    finish_wait
    remove_wait_queue
    add_wait_queue
    add_wait_queue_exclusive
  • kedr_cm_capable.ko:
    capable
  • kedr_cm_vmm.ko:
    vmalloc
    __vmalloc
    vmalloc_user
    vmalloc_node
    vmalloc_32
    vmalloc_32_user
    vfree
  • kedr_cm_schedule.ko:
    schedule
    [preempt_schedule]
    _cond_resched
    schedule_timeout
    schedule_timeout_uninterruptible
    schedule_timeout_interruptible
    io_schedule
    cond_resched_lock/__cond_resched_lock
  • kedr_cm_mem_util.ko:
    kstrdup
    kstrndup
    kmemdup

4.5. Fault Simulation

Fault simulation facilities provided by KEDR make it possible to put the target kernel module into conditions that seldom occur during its normal operation. For example, it is possible to simulate a situation when the system is short of memory or of another resource and consequently, at least some of the attempts to acquire the resource (e.g. allocate memory) fail. This makes it possible to check if the target module handles such situations correctly.

The standard payload modules for fault simulation are built and installed with KEDR by default. If you would like to disable this, set CMake variable KEDR_STANDARD_FSIM_PAYLOADS to OFF when executing CMake before building KEDR:

cmake -DKEDR_STANDARD_FSIM_PAYLOADS=OFF  <other_options> <path_to_kedr_sources>

The fault simulation scenarios (i.e. the instructions that define the calls to which functions to make fail and in what conditions) can be customized by the user.

Note

Note that only the target module is affected during fault simulation, the other parts of the kernel are not.

It is possible to restrict fault simulation even more, to handling of only those requests to the target module that are made by a particular user-space process (it can be a process created by a test application, for example).

During fault simulation, each replacement function serves as a fault simulation point among other things. That is, it calls a special indicator function and decides based on its return value whether to call the corresponding target function normally or simulate its failure. In the latter case the target is often not called at all, just the appropriate value is returned (as if it was returned by the target function).

Like in call monitoring, a trace of the calls to the target functions is recorded by the payload modules used for fault simulation. This allows user-space applications to analyze the behaviour of the target module further, for example, to find out whether it has released all the resources correctly even in case of failure, etc. The format of trace records is the same as for call monitoring (see Section 4.4, “Call Monitoring (Call Tracing)”), so are the tools and techniques for working with the trace.

Note that unlike call monitoring, the return values stored in the trace are the return values of the replacement functions rather than those of the target functions. This is because it is these very values that will be actually returned to the caller function in the target module. As for the calls for which no failure was simulated, the return value is the same for both the target and the corresponding replacement functions.

In KEDR, the replacement functions and fault simulation scenarios are independent of one another. A replacement function may even be developed by a different author than a scenario. The person who needs to use some fault simulation scenario for a function can simply assign the scenario to the replacement function at runtime.

The default fault simulation scenario is "never simulate failures", so by default all the payload modules work as if they were doing just call monitoring. To manage scenarios, each payload exports one or more directories in debugfs like /sys/kernel/debug/kedr_fault_simulation/points/<function-name>, where <function-name> is the name of a target kernel function (it is assumed here that debugfs filesystem is mounted to /sys/kernel/debug). That is, fault simulation can be controlled separately for each target function.

In each such directory, there is at least the file current_indicator, which contains the name of the fault simulation indicator currently used for the function. You can think of a fault simulation indicator as a function that is called whenever the payload module needs to decide whether to simulate a failure of the target function. If the indicator function returns nonzero, a failure will be simulated. You might say that an indicator implements a fault simulation scenario. Reading from the file current_indicator gives the name of the currently used indicator. Writing the name of an indicator to this file sets this indicator for the function.

Examples:

cat /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator

After the payload module for fault simulation processing __kmalloc has just loaded, the above command will print none. This is because no indicator is currently set for __kmalloc.

echo common > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator

This sets the indicator named common for __kmalloc. If you read current_indicator again, you will see that it contains that name now:

cat /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator

common will be printed as a result of the command above.

If you try to set a non-existent indicator, writing to current_indicator will return an error like bash: echo: write error: Operation not permitted. You can check this using the following command:

echo unknown_indicator_name > \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator
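
To see at a glance which indicator is set for each fault simulation point, the points directory can be walked in a loop. This is a minimal sketch. The variable KEDR_FSIM_ROOT and the function list_points are names introduced here for illustration only, and debugfs is assumed to be mounted to /sys/kernel/debug.

```shell
#!/bin/sh
# Root of the fault simulation control files. KEDR_FSIM_ROOT is a variable
# of this sketch only; KEDR itself does not read it.
KEDR_FSIM_ROOT=${KEDR_FSIM_ROOT:-/sys/kernel/debug/kedr_fault_simulation}

# Print "<function-name>: <indicator>" for every fault simulation point.
list_points() {
    for dir in "$KEDR_FSIM_ROOT"/points/*/; do
        # Skip the unexpanded pattern if there are no points at all.
        [ -f "${dir}current_indicator" ] || continue
        name=$(basename "$dir")
        echo "$name: $(cat "${dir}current_indicator")"
    done
}

list_points
```

Nothing is printed if no payload module for fault simulation is loaded or if the files are not readable by the current user.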

KEDR provides special kernel modules that implement indicators for different use cases. Each of these modules exports a directory in debugfs, /sys/kernel/debug/kedr_fault_simulation/indicators/<indicator-name> (<indicator-name> is the name that identifies the indicator). Actually, an indicator usually implements a parametrized family of fault simulation scenarios rather than a single scenario. The parameters of an indicator can be changed from user space as described below. This can be done either when assigning the indicator to a fault simulation point (by writing a string like <indicator-name> <indicator-params> to the control file current_indicator for that point) or at runtime.

Note

Each fault simulation point uses its own instance of an indicator. That is, changing parameters of the indicator (and hence of the fault simulation scenario) for a target function does not affect other target functions.

The indicator with name common is the common indicator that can be used for any target function. By default, the indicator function always returns 0 (never make the calls fail). Once the indicator has been set, it creates three control files in /sys/kernel/debug/kedr_fault_simulation/points/<function-name> directory: expression, times and pid.

expression file corresponds to the mathematical expression. The indicator function will return the resulting value of this expression when called from a fault simulation point. Reading from this file returns the expression currently used by the indicator function. If you would like to instruct the indicator to use another expression, write the expression to this file.

The expression may contain the following.

  • Signed or unsigned decimal integer numbers

  • Unsigned hexadecimal integer numbers with the format 0x[0-9a-f]+ (similar to 0x%x format for printf() function)

  • Any mathematical operation from the list:
    ! - logical not
    ~ - binary not
    + - unary plus
    - - unary minus
    * - multiplication
    / - integer division
    % - remainder of division
    + - binary plus
    - - binary minus
    >> - arithmetic right shift
    << - arithmetic left shift
    < - less
    > - greater
    <= - less or equal
    >= - greater or equal
    = - equal
    != - not equal
    & - binary and
    ^ - binary xor
    | - binary or
    && - logical and
    || - logical or
    c ? a : b - conditional operator
  • Variables:
    in_init
    evaluates to nonzero if the target module is currently executing its init function, evaluates to 0 otherwise
    rnd100
    evaluates to a random integer number from [0...99]
    rnd10000
    evaluates to a random integer number from [0...9999]
    times
    evaluates to the ordinal number of the call to the corresponding target function since indicator has been set for it or since calls counter has been reset (see below)
    caller_address
    evaluates to the address of the instruction following the call being processed (this can be used to simulate failures of, say, __kmalloc() only when this function is called from the particular places in the target module: from the particular functions, etc.)

Note

If the kernel is configured to support reliable stack trace information (CONFIG_STACKTRACE is set and CONFIG_FRAME_POINTER or CONFIG_STACK_UNWIND are set), support for caller_address variable is enabled by default. Otherwise, it is disabled. If you would like to disable it manually, set CMake variable KEDR_ENABLE_CALLER_ADDRESS to OFF when executing CMake before building KEDR.

times file corresponds to the counter of target function calls (see the description of the times variable, which may be used in the expression for the indicator). This counter is incremented every time the target function is called (while this fault simulation indicator is set for this function). Reading from the file returns the current value of the counter; any write to this file resets the counter to 0.

Examples:

echo common > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator

This will set the common indicator for __kmalloc function. The default scenario is "never simulate failures".

cat /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

The above command should print 0.

echo 1 > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

This will set the scenario to make each call fail for __kmalloc function.

echo '!in_init' > \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

This will set the scenario to make each call to the target function fail once the target module has been initialized.

echo '!in_init && (rnd100 < 20)'> \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

This will set the scenario to simulate failures of approximately 20% of all calls to the target function once the target module has been initialized.

echo '(caller_address < 0xfe2ab8d0 || caller_address > 0xfe2ab970) && (rnd100 < 20)'> \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

This will set the scenario to simulate failures of approximately 20% of the calls to the target function made from outside of the range [0xfe2ab8d0, 0xfe2ab970]; the calls from within the range should succeed. This helps restrict fault simulation to only particular areas of the target module.

echo 'times = 1'> \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression
echo '0'> \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/times

This will set the scenario to make only the first call to the target function fail (all other calls should succeed). The second command resets the call counter, i.e. forces KEDR to count calls only from this moment.

echo '(times % 3) = 0'> \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

This will set the scenario to make every third call to the target function fail (succeed, succeed, fail, succeed, succeed, fail, succeed, ...).
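
The resulting pattern can be checked without KEDR by emulating the indicator in plain shell arithmetic. The sketch below assumes, as in the 'times = 1' example, that times is 1 for the first call. Note that shell arithmetic spells equality as '==' where the indicator expression uses '='.

```shell
#!/bin/sh
# Emulate the indicator for the expression '(times % 3) = 0'.
times=0
pattern=""
while [ "$times" -lt 7 ]; do
    times=$((times + 1))            # ordinal number of the current call
    if [ $(( (times % 3) == 0 )) -ne 0 ]; then
        pattern="$pattern fail"     # nonzero result: simulate a failure
    else
        pattern="$pattern succeed"  # zero result: call the target normally
    fi
done
# Prints: calls 1..7: succeed succeed fail succeed succeed fail succeed
echo "calls 1..7:$pattern"
```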

The file named pid corresponds to the set of processes affected by the fault simulation for a given target function. That is, KEDR will not simulate failures when the requests to the target module come from the processes other than the listed ones. Initially, after the indicator has been set for the target function, pid is 0, which means that fault simulation is not restricted to particular processes. Whenever a call to a target function is intercepted by a payload module, the corresponding replacement function checks (according to the value of expression) whether it should simulate a failure or not. If you write a non-zero value to pid file, only the process with the given pid and its descendants (its children along with their children, etc.) will be affected by the fault simulation.

Note

Note that when a nonzero pid is specified, the calls to the target function will not increment times variable if they are made in the context of a process that is neither the process with that pid nor its descendant.

In the following example, the fault simulation scenario is to make a call to the target function fail only if it is made in the context of a process launched from the current shell or of its descendants.

echo common > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator
echo 1 > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression
echo $$ > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/pid

Other indicators provided with KEDR extend the abilities of common indicator. They support additional variables that can be used in the expression.

The indicator named kmalloc makes it possible to use size and flags variables, that is, the parameters of __kmalloc function. It also accepts several named constants of gfp_t type like GFP_KERNEL and GFP_ATOMIC. Example:

echo 'kmalloc (flags = GFP_ATOMIC) && (size > 100)' > \
    /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator

This will set the scenario for __kmalloc function to simulate failure of each memory allocation request with flags equal to GFP_ATOMIC and size greater than 100.

This convenience, however, comes at the cost of generality: this indicator can be set only for __kmalloc function and for other functions that provide size and flags parameters.

Similarly, the indicator named capable makes it possible to use cap variable, which is a parameter of capable function. It also accepts the named constants that may be used as the values of this parameter (CAP_SYS_ADMIN, etc.). Example:

echo 'capable cap = CAP_SYS_ADMIN' > \
    /sys/kernel/debug/kedr_fault_simulation/points/capable/current_indicator

This will set the scenario for capable function to make each request for the administrative capabilities fail.

Here is the list of KEDR modules that provide fault simulation indicators:

kedr_fsim_indicator_common.ko
implements common indicator
kedr_fsim_indicator_kmalloc.ko
implements kmalloc indicator
kedr_fsim_indicator_capable.ko
implements capable indicator

If you would like to extend common indicator to support more types of functions, see Section 5.5, “Writing Custom Scenarios for Fault Simulation”.

Here is the list of payload modules for fault simulation provided with KEDR:

kedr_fsim_capable.ko
implements fault simulation for function capable
kedr_fsim_user_space_access.ko
implements fault simulation for functions copy_to_user/_copy_to_user, copy_from_user/_copy_from_user, strndup_user, memdup_user
kedr_fsim_cmm.ko
implements fault simulation for functions __kmalloc, krealloc, __krealloc, kmem_cache_alloc, [kmem_cache_alloc_notrace], [kmem_cache_alloc_trace], __get_free_pages, get_zeroed_page, [__kmalloc_node], [kmem_cache_alloc_node], [kmem_cache_alloc_node_notrace], [kmem_cache_alloc_node_trace], [__alloc_pages_nodemask], [alloc_pages_current], [alloc_pages_exact]
kedr_fsim_mem_util.ko
implements fault simulation for functions kstrdup, kstrndup, kmemdup
kedr_fsim_vmm.ko
implements fault simulation for functions vmalloc, __vmalloc, vmalloc_user, vmalloc_node, vmalloc_32, vmalloc_32_user

Note

Although KEDR module kedr_fsim_cmm.ko currently provides fault simulation facilities for several functions, it exports only one directory (__kmalloc) to manage fault simulation scenarios. All those supported functions use the same scenario that is set for __kmalloc function. The same holds for kedr_fsim_vmm.ko module, where the scenario for vmalloc function is used for all other functions too.

If you would like to create a payload module to perform fault simulation for other functions, see Section 5.3, “Writing Custom Payloads for Fault Simulation”.

4.6. Detecting Memory Leaks

LeakCheck (implemented as payload module kedr_leak_check.ko) allows you, as its name implies, to check the target kernel module for memory leaks, that is, to reveal which memory blocks were allocated but not freed by that module.

LeakCheck is enabled in KEDR by default. If you would like to disable it, set CMake variable KEDR_LEAK_CHECK to OFF when executing CMake before building KEDR:

cmake -DKEDR_LEAK_CHECK=OFF <other_options> <path_to_kedr_sources>

Note

Note that LeakCheck may not detect all memory leaks in the target module. For the present, the tool monitors the usage of more than 30 kernel functions that allocate and deallocate memory (these are among the supported functions listed in Section 4.4, “Call Monitoring (Call Tracing)”). Still, the target module can use some other functions for this purpose and LeakCheck will not notice this. Nevertheless, it is possible to add support for more functions to LeakCheck.

Besides that, if the target module allocates a memory block but it is another module (or the kernel proper) that frees it, this will also be reported as a memory leak by LeakCheck. Therefore, possible leaks reported by LeakCheck should be further analyzed to find out whether they are actually leaks.

For the present, LeakCheck cannot be used simultaneously with other kinds of payload modules provided by KEDR. In particular, you cannot use LeakCheck and fault simulation for memory-related functions at the same time.

4.6.1. Typical Usage

LeakCheck can be used like any other payload module for KEDR. First you need to load KEDR core and the appropriate payload module:

kedr start <target_name> -f leak_check.conf

leak_check.conf is placed in the main directory for config files when KEDR is installed (default: /var/opt/kedr/configs), so you usually do not need to specify the full path to the file.

Load the target module and do something with it as usual, then unload the target. Do not stop KEDR yet.

Take a look at /sys/kernel/debug/kedr_leak_check directory. Here we assume that debugfs is mounted to /sys/kernel/debug. If it is not, you should mount it:

mount debugfs -t debugfs /sys/kernel/debug

There should be the following files in kedr_leak_check directory:

  • info:

    • information about the target module (its name, addresses of the init and core memory areas);

    • total number of memory allocations performed by the module;

    • number of possible memory leaks (allocations without matching frees);

    • number of free-like calls without matching allocation calls;

  • possible_leaks:

    • information about each detected memory leak: address and size of the memory block and a portion of the call stack of allocation (the allocations with the same call stack are grouped together, only the most recent one is shown);

  • unallocated_frees:

    • information about each free-like call without matching allocation call: address of the memory block and a portion of the call stack of that deallocation call (the deallocations with the same call stack are grouped together).

unallocated_frees file should normally be empty. If it is not empty in some of your analysis sessions, this could indicate a problem in LeakCheck itself (e.g., the target module used some allocation function that LeakCheck was unaware of), or the memory was probably allocated by some other module. If you suspect a problem in LeakCheck, please report it to the bug tracker.

4.6.2. Reports

Here are the examples of info and possible_leaks files from a real analysis session. The target is vboxsf module from VirtualBox Guest Additions 4.0.3. The memory leak caught there was fixed in VirtualBox 4.0.4.

info:

Target module: "vboxsf", 
  init area at 0xfe2b6000, core area at 0xfe2aa000
Memory allocations: 49
Possible leaks: 11
Unallocated frees: 0

possible_leaks:

Block at 0xf617e000, size: 4096; stack trace of the allocation:
[<fe2ab904>] sf_follow_link+0x34/0xa0 [vboxsf]
[<c0303caf>] link_path_walk+0x79f/0x910
[<c0303f19>] path_walk+0x49/0xb0
[<c0304089>] do_path_lookup+0x59/0x90
[<c03042bd>] user_path_at+0x3d/0x80
[<c02fd6d7>] vfs_fstatat+0x37/0x70
[<c02fd748>] vfs_stat+0x18/0x20
[<c02fd9af>] sys_stat64+0xf/0x30
[<c0203190>] sysenter_do_call+0x12/0x22
[<ffffe430>] 0xffffe430
[<ffffffff>] 0xffffffff
+8 more allocation(s) with the same call stack.
----------------------------------------
Block at 0xf659a000, size: 4096; stack trace of the allocation:
[<fe2ab904>] sf_follow_link+0x34/0xa0 [vboxsf]
[<c0303caf>] link_path_walk+0x79f/0x910
[<c0303f19>] path_walk+0x49/0xb0
[<c0304089>] do_path_lookup+0x59/0x90
[<c03042bd>] user_path_at+0x3d/0x80
[<c02f8825>] sys_chdir+0x25/0x90
[<c0203190>] sysenter_do_call+0x12/0x22
[<ffffe430>] 0xffffe430
[<ffffffff>] 0xffffffff
+1 more allocation(s) with the same call stack.
----------------------------------------

The format of stack traces is the same as the one used to output data about warnings and errors to the system log:

[<call_address>] <function_name>+<offset_in_func>/<size_of_func> [<module>]

To be exact, each address corresponds to the instruction following the relevant call.
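Because each entry carries both the offset within the function and the size of the function, the address range occupied by that function can be derived from a single entry. The sketch below shows the arithmetic; func_range is a hypothetical helper name, and the sketch assumes that the entry has exactly the format shown above (entries with raw addresses only are rejected).

```shell
# func_range 'STACK_ENTRY'
# Hypothetical helper: prints the start address of the function named in a
# stack trace entry and the address just past its end. Expected format:
#   [<call_address>] <function_name>+<offset_in_func>/<size_of_func> [<module>]
func_range() {
    # Extract call_address, offset_in_func and size_of_func.
    set -- $(printf '%s\n' "$1" | sed -n \
        's/^\[<\([0-9a-f]*\)>] [^+]*+\(0x[0-9a-f]*\)\/\(0x[0-9a-f]*\).*/0x\1 \2 \3/p')
    [ $# -eq 3 ] || return 1
    start=$(($1 - $2))
    end=$((start + $3))
    printf '0x%x 0x%x\n' "$start" "$end"
}

func_range '[<fe2ab904>] sf_follow_link+0x34/0xa0 [vboxsf]'
# prints "0xfe2ab8d0 0xfe2ab970"
```

The two values printed for sf_follow_link coincide with the bounds used in the caller_address example in Section 4.5, “Fault Simulation”, so a range obtained this way could serve to restrict a scenario to the calls made from one function.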

4.6.3. Analyzing the Results

GDB, Objdump or some other tools of this kind can be used to locate the places in the source code corresponding to the entries in the stack traces. The detailed description can be found, for example, in Section 4.7, “Analyzing the Trace”.

In the current version of LeakCheck, the names of the functions from init area (those marked with __init in the source file of the target module) cannot be resolved and the relevant stack trace entries contain only raw call addresses. This is because name resolution is done when "init" area has already been dropped from memory.

Using the start address of the init area shown in the info file and the technique described in Section 4.7, “Analyzing the Trace”, you can overcome this.

4.6.4. Stack Depth

The maximum number of stack frames displayed is controlled by stack_depth parameter of the module. That is, at most this many stack frames will be shown.

stack_depth parameter is an unsigned integer, not greater than 16. Default value: 12.

For example, to display at most 7 stack frames for each allocation/deallocation, create a configuration file for LeakCheck as follows (and use it instead of the default one when starting KEDR):

payload /usr/local/lib/modules/<kernel_version>/misc/kedr_leak_check.ko stack_depth=7

4.6.5. Caveats

When the target module is loaded, the output files are cleared and the results are reset. Please take this into account when loading and unloading the target module more than once while LeakCheck is loaded.

As usual with debugfs, the output files live only as long as kedr_leak_check.ko module is loaded. In particular, after unloading the target, please collect the results first and only after that reload the target or stop KEDR.
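One way to follow this advice is to copy the three output files to an ordinary directory right after the target has been unloaded. The following is a minimal sketch; save_leak_reports is a hypothetical helper name, and the default report directory assumes that debugfs is mounted to /sys/kernel/debug as described in Section 4.6.1.

```shell
# save_leak_reports DEST_DIR [REPORT_DIR]
# Hypothetical helper: copies the LeakCheck output files to DEST_DIR.
# REPORT_DIR defaults to the location used in this section and assumes
# that debugfs is mounted to /sys/kernel/debug.
save_leak_reports() {
    dest=$1
    reports=${2:-/sys/kernel/debug/kedr_leak_check}
    mkdir -p "$dest" || return 1
    for name in info possible_leaks unallocated_frees; do
        # Files that are missing or unreadable are skipped silently.
        if [ -r "$reports/$name" ]; then
            cat "$reports/$name" > "$dest/$name" || return 1
        fi
    done
}
```

For example, "save_leak_reports ./leak_results" could be executed after unloading the target and before "kedr stop" or before loading the target again.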

4.7. Analyzing the Trace

This section describes a couple of techniques that can be used when analyzing the traces output by payload modules for call monitoring or fault simulation. In particular, it shows how to find out which place in the source code of the target module each particular trace record corresponds to and how to obtain call stack for a call of interest.

Note

It is recommended that the kernel of your system be built with CONFIG_FRAME_POINTER or CONFIG_STACK_UNWIND parameter set to y. This is not the case on some systems by default. If neither of these parameters is set, reliable stack trace information may be unavailable. KEDR will still work in this case but, for example, its reports concerning memory leaks may be less detailed.

It is desirable to have the target module with debug information enabled. Note that it is only necessary if you would like to match the trace records to the appropriate fragments of the source code of the target module. KEDR itself does not require the modules under analysis to have debug information.

4.7.1. General

Let us consider the trace from the example described in Section 3, “Getting Started” (the records are numbered just for convenience):

[1] insmod-6416 [001] 805.997320: target_session_begins: target module: "kedr_sample_target"
[2] insmod-6416 [001] 805.997615: called___kmalloc: ([<ffffffffa00e70b9>] init+0xb9) 
    arguments: (320, 80d0), result: ffff8800165a8000
[3] dd-6438     [000] 858.641942: called___kmalloc: ([<ffffffffa01d661e>] core+0x61e) 
    arguments: (4000, 80d0), result: ffff88001659e000
[4] dd-6438     [000] 858.642074: called_copy_from_user: ([<ffffffffa01d642a>] core+0x42a) 
    arguments: (ffff88001659e000, 000000000137d000, 1), result: 0
    ...
[5] rmmod-6441 [001] 869.438875: called_kfree: ([<ffffffffa01d60d8>] core+0xd8) 
    arguments: (ffff88001659e000)
[6] rmmod-6441 [001] 869.438879: called_kfree: ([<ffffffffa01d60d8>] core+0xd8) 
    arguments: ((null))
[7] rmmod-6441 [001] 869.438881: called_kfree: ([<ffffffffa01d6108>] core+0x108) 
    arguments: (ffff8800165a8000)
[8] rmmod-6441 [001] 869.438885: target_session_ends: target module: "kedr_sample_target"

The marker records #1 and #8 denote the beginning and the end of the tracing session, i.e. the moments when the target module was loaded and unloaded, respectively. Let us analyze the remaining ones. For each trace record of interest, we need to perform at least the following steps.

  1. Determine the ELF section in the target module from which the reported call was made.
  2. Translate the offset of that call from the beginning of init or core area to the offset from the beginning of the section.
  3. Find the line corresponding to that instruction in the source code of the target module. This can be done, for example, with GDB or Objdump as described below.

In each of the trace records corresponding to the function calls, there is the address of the instruction immediately following the call instruction. Consider, for example, the address in the record #3: [<ffffffffa01d661e>] core+0x61e. This means that the absolute address of that location is 0xffffffffa01d661e. At the same time, that location is at the offset of 0x61e from the beginning of the core area of the memory image of the target. For now, it is technically easier for KEDR to obtain the offset of a location from the beginning of such an area than from the beginning of a section like .text, etc.

According to how the loader of kernel modules currently works, it seems that the sections of the target module containing the executable code are loaded to the beginning of two memory areas, init and core. It is init area that is usually dropped from the memory once the module has completed its initialization. This area often contains only one code section, .init.text, where the functions marked with __init in the source code are placed. .text, .exit.text and other code sections (if present) go to core area.

So, in many cases, the offset in init area is actually the offset in .init.text section and the offset in core area is the offset in .text. This is however not always the case. To find out which section the call in a trace record corresponds to, you can use different techniques. You can, for example, simply obtain the memory addresses of the sections of the target module while it is under analysis. They can be read from /sys/module/<module_name>/sections/<section_name>. Once you have these start addresses of the sections, you can use the absolute address of the call to find out which section it belongs to.

For example, the following command will print the memory address of .text section of module kedr_sample_target:

cat /sys/module/kedr_sample_target/sections/.text

Note that when the target module is unloaded and then loaded again, its sections may be located at some other addresses.
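To get an overview of all sections at once, the files in that directory can be listed together with their contents. The sketch below is only an illustration: list_sections is a hypothetical helper name, it assumes that each file contains a single address, and it relies on all addresses being printed with the same number of digits so that plain text sorting orders them correctly.

```shell
# list_sections SECTIONS_DIR
# Hypothetical helper: prints "<address> <section_name>" for each file in
# SECTIONS_DIR (for example, /sys/module/<module_name>/sections), sorted
# by address.
list_sections() {
    dir=$1
    # ls -A also lists the names that start with a dot (.text, .init.text).
    ls -A "$dir" | while read -r name; do
        [ -f "$dir/$name" ] || continue
        printf '%s %s\n' "$(cat "$dir/$name")" "$name"
    done | sort
}
```

For the sample target, this could be invoked as "list_sections /sys/module/kedr_sample_target/sections" while the module is loaded. On some systems, reading these files may require root privileges.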

On the other hand, you could assume that the order in which the sections are located in each of init and core areas is the same as in the object file of the module. So, taking the sizes of the sections and their alignment in memory into account, you could obtain the section layout in these memory areas. For example, Readelf (readelf -S) or GDB (info files command) can be used to list the sections for the object file.

Now that you have found the section a call of interest belongs to as well as its offset in that section, you can use the debug information in the target module to find the corresponding place in its source code. The following sections show how to do this.

4.7.2. Locating the Calls in the Sources with GDB

Start GDB and feed the object file of the target module to it:

gdb kedr_sample_target.ko

Get information about the sections and their file addresses:

(gdb) info files

Symbols from "/home/tester/work/kedr/other/sample_target/kedr_sample_target.ko".
Local exec file:
    '/home/tester/work/kedr/other/sample_target/kedr_sample_target.ko', file type elf64-x86-64.
    Entry point: 0x0
    0x0000000000000000 - 0x0000000000000024 is .note.gnu.build-id
    0x0000000000000030 - 0x00000000000006ac is .text
    0x00000000000006ac - 0x00000000000006bc is .exit.text
    0x00000000000006bc - 0x00000000000008b9 is .init.text
    0x00000000000008c0 - 0x0000000000000919 is .rodata
    0x0000000000000920 - 0x0000000000000a44 is .rodata.str1.8
    0x0000000000000a44 - 0x0000000000000a4a is .rodata.str1.1
    0x0000000000000a60 - 0x0000000000000bb7 is .modinfo
    0x0000000000000bb8 - 0x0000000000000c80 is __param
    0x0000000000000c80 - 0x0000000000000cc0 is __mcount_loc
    0x0000000000000cc0 - 0x0000000000001380 is __versions
    0x0000000000001380 - 0x0000000000001470 is .data
    0x0000000000001480 - 0x00000000000016d0 is .gnu.linkonce.this_module
    0x00000000000016d0 - 0x00000000000016e0 is .bss

The sections of interest are .init.text, .text and .exit.text. We will use the file addresses of the first two of these sections below (0x6bc and 0x30, respectively).

Suppose we have already found out that .init.text lies at the beginning of init area in the memory image of the module and .text lies at the beginning of core area. All the addresses in the trace records are in .init.text and .text sections.

Consider the trace record #2. The address reported for the detected call to __kmalloc is init+0xb9, that is, the instruction following the call is at the offset of 0xb9 from the beginning of the init area in the memory image and hence from the start of .init.text section. Since the section has address 0x6bc in the file (see above), we can get the corresponding position in the source code using the following command:

(gdb) list *(0x6bc + 0xb9)

0x775 is in cfake_init_module (/home/tester/work/kedr/other/sample_target/cfake.c:323).
318     
319     /* Allocate the array of devices */
320     cfake_devices = (struct cfake_dev*)kzalloc(
321         cfake_ndevices * sizeof(struct cfake_dev), 
322         GFP_KERNEL);
323     if (cfake_devices == NULL) {
324         result = -ENOMEM;
325         goto fail;
326     }

GDB points to the line following the call to __kmalloc we are interested in (kzalloc is an inline function). If we used the exact offset of the call instruction rather than the offset of the next one, GDB would show the innards of the inline kmalloc function expanded into the code of the target module, which is probably not very convenient:

(gdb) list *(0x6bc + 0xb4)

0x770 is in cfake_init_module (include/linux/slub_def.h:262).
257             trace_kmalloc(_THIS_IP_, ret, size, s->size, flags);
258 
259             return ret;
260         }
261     }
262     return __kmalloc(size, flags);
263 }

Note

On x86 and x86-64, the length of the call instruction corresponding to that call to __kmalloc is 5 bytes, so we have subtracted 5 from the offset above.

Usually, the offset shown in the trace corresponds either to the source line containing the call or to the next line.

Let us consider the record #3. It corresponds to a call to __kmalloc too, but this call was made at the offset 0x61e from the beginning of core area and of .text section. To find the corresponding source line, execute the following command (0x30 is the file address of .text section):

(gdb) list *(0x30 + 0x61e)

0x64e is in cfake_open (/home/tester/work/kedr/other/sample_target/cfake.c:85).
82     if (dev->data == NULL)
83     {
84         dev->data = (unsigned char*)kzalloc(dev->buffer_size, GFP_KERNEL);
85         if (dev->data == NULL)
86         {
87             printk(KERN_WARNING "[cr_target] open: out of memory\n");
88             return -ENOMEM;
89         }

The remaining records can be processed in a similar way.
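The address arithmetic used in the GDB commands above can also be done in the shell, which may be handy when many records have to be processed. This is a minimal sketch; file_addr is a hypothetical helper name, and the optional third argument is the number of bytes to step back (5 for the call instruction on x86 and x86-64, as noted above).

```shell
# file_addr SECTION_FILE_ADDR OFFSET [BACK]
# Hypothetical helper: adds an offset within a section to the file address
# of that section and optionally steps BACK bytes (e.g. the length of the
# call instruction). The result can be passed to "list *<address>" in GDB.
file_addr() {
    printf '0x%x\n' $(($1 + $2 - ${3:-0}))
}

file_addr 0x6bc 0xb9      # record #2: prints 0x775
file_addr 0x6bc 0xb9 5    # the call instruction itself: prints 0x770
file_addr 0x30 0x61e      # record #3: prints 0x64e
```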

4.7.3. Locating the Calls in the Sources with Objdump

Once we know section names and the offsets in these sections for the call instructions of interest, objdump tool can be used instead of GDB to find the corresponding lines in the source code of target module.

First we need to disassemble the code sections of the module:

objdump -dSlr kedr_sample_target.ko > kedr_sample_target.disasm

Let us locate the call mentioned in the trace record #3. Its position is right before the offset of 0x61e in .text section. The instructions in kedr_sample_target.disasm are marked with their offsets in the corresponding section, so we can get the following:

 610:   48 8b 7b 08             mov    0x8(%rbx),%rdi
 614:   be d0 00 00 00          mov    $0xd0,%esi
 619:   e8 00 00 00 00          callq  61e <cfake_open+0x7e>
              61a: R_X86_64_PC32 __kmalloc-0x4
cfake_open():
/home/tester/work/kedr/other/sample_target/cfake.c:85
    if (dev->data == NULL)
    {
        dev->data = (unsigned char*)kzalloc(dev->buffer_size, GFP_KERNEL);
        if (dev->data == NULL)
 61e:   48 85 c0                test   %rax,%rax
        

So we can see from the above listing that the trace record corresponds to the call to kzalloc at line 84 of cfake.c. The remaining records can be processed in a similar way.
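A fragment like the one above can be extracted from the disassembly with grep. The sketch below is one possible way to do it; before_offset is a hypothetical helper name, and it assumes the listing format shown above, where each instruction line starts with its offset followed by a colon.

```shell
# before_offset DISASM_FILE HEX_OFFSET [LINES]
# Hypothetical helper: prints the line that starts with the given offset
# together with LINES preceding lines (8 by default), so that the call
# instruction right before that offset is visible.
before_offset() {
    grep -B "${3:-8}" "^ *$2:" "$1"
}
```

For the record #3, the command would be "before_offset kedr_sample_target.disasm 61e". Since the offsets are counted separately in each section, the same offset may occur in several sections of the listing; the "Disassembly of section" headings that objdump outputs help tell them apart.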

4.7.4. Obtaining the Call Stack

Note

The technique described below is quite easy to use. Still, to understand it better, it is recommended that you familiarize yourself first with the instructions on how to create custom payload modules using template-based code generation and fault simulation infrastructure (see Section 5, “Customizing and Extending KEDR”).

Note that we are not actually going to do fault simulation here, we will just reuse the infrastructure for a different purpose.

Sometimes the technique described in the sections above is not enough to find out what the target module was actually doing when it called the target function. Consider memory allocation and deallocation, for example. The developers of the target module may choose to use kmalloc() directly. Alternatively, they may choose to provide a set of their own functions for memory management that probably use kmalloc() internally but are higher-level and suit the needs of the developers better. If the latter is the case, it may happen that many of the calls to kmalloc() recorded in the trace are performed from the same address in the code even if the target module services completely different requests each time. If there are many such calls recorded, it could become difficult to analyze what was actually happening in the target module.

If we need to analyze only a few of the recorded calls, it could be helpful if we obtained call stack for each of these calls somehow.

Let us look at two fragments of a trace produced by call monitoring facilities from KEDR during the initialization and finalization of some kernel module:

insmod-1910 [000] 338.670490: 
    called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (36, d0), result: ddad8300
insmod-1910 [000] 338.670576: 
    called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (64, d0), result: ddad6f40
insmod-1910 [000] 338.670595: 
    called_kfree: ([<e0c556d9>] core+0x16d9) arguments: (ddad6f40)
insmod-1910 [000] 338.670676: 
    called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (36, d0), result: ddad6f40
insmod-1910 [000] 338.670760: 
    called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (64, d0), result: de864f00
rmmod-1956 [000] 437.168068: 
    called_kfree: ([<e0c556d9>] core+0x16d9) arguments: (de864f00)
rmmod-1956 [000] 437.168080: 
    called_kfree: ([<e0c5d511>] core+0x9511) arguments: (ddad8300)

You have probably noticed that there is no matching call to kfree() recorded for the third call to __kmalloc() (the one that returned 0xddad6f40). This looks like a memory leak. Note that all the calls to __kmalloc() were made from the same place in the code of the target module: the higher-level allocation function provided and used by that module.
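For longer traces, matching the allocations and deallocations by eye becomes tedious. The following sketch shows how the pairing could be automated with awk. It is an illustration only: unfreed_blocks is a hypothetical helper name, and the script assumes that, as in the fragments above, the result of __kmalloc is the last field on the line containing called___kmalloc and the argument of kfree is the last field on the line containing called_kfree.

```shell
# unfreed_blocks [TRACE_FILE...]
# Hypothetical helper: prints the addresses returned by __kmalloc for which
# no later kfree was recorded. Reads standard input if no file is given.
unfreed_blocks() {
    awk '
        /called___kmalloc:/ {
            live[$NF] = 1               # last field: the returned address
        }
        /called_kfree:/ {
            addr = $NF
            gsub(/[()]/, "", addr)      # "(ddad6f40)" becomes "ddad6f40"
            delete live[addr]
        }
        END { for (a in live) print a }
    ' "$@" | sort
}

unfreed_blocks <<'EOF'
called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (36, d0), result: ddad8300
called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (64, d0), result: ddad6f40
called_kfree: ([<e0c556d9>] core+0x16d9) arguments: (ddad6f40)
called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (36, d0), result: ddad6f40
called___kmalloc: ([<e0c5b55d>] core+0x755d) arguments: (64, d0), result: de864f00
called_kfree: ([<e0c556d9>] core+0x16d9) arguments: (de864f00)
called_kfree: ([<e0c5d511>] core+0x9511) arguments: (ddad8300)
EOF
# prints "ddad6f40"
```

Only __kmalloc and kfree are paired here; other allocation functions would need similar rules.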

Assuming that the above situation is reproducible, let us try to obtain the call stack for each call to __kmalloc() made in the context of insmod process. However, the default payload modules for call monitoring and fault simulation are currently unable to output call stacks. So we need to prepare a custom payload module that suits our needs. Fortunately, it is not that difficult.

As mentioned in Section 5.3, “Writing Custom Payloads for Fault Simulation”, the point and indicator infrastructure provided by KEDR for fault simulation can be used for other purposes as well. Actually, it supports altering the behaviour of the target module according to a scenario chosen by the user (see also Section 4.5, “Fault Simulation”). That is exactly what we need: each time __kmalloc() is called in the context of the specified process (or process tree), the current call stack should be output, say, to the system log and the execution of the target module should then continue normally. To develop payload module payload_dump_stack that implements that, we can follow the steps described in Section 5.3, “Writing Custom Payloads for Fault Simulation”.

First, we copy custom_payload_fsim example to some other directory. The templates located there remain unchanged and we change only the name of the payload module to payload_dump_stack in Kbuild and makefile.

Then, to make things easier, we replace payload.data with the corresponding file for the default payload module for fault simulation for common memory management routines. That file can be found in payloads_fsim/common_memory_management subdirectory in the build tree of KEDR. We need to change this payload.data file as follows.

  • Set module.name and module.author appropriately.

  • Add relevant #include directives to the header part:

    #include <linux/kernel.h>   /* dump_stack() */
    #include <linux/sched.h>    /* current, etc. */
    
  • Turn off point reuse by commenting out fpoint.reuse_point = ... lines in each group. We would like to control the calls to __kmalloc separately from the calls to other memory management functions. For those, we only need a usual trace.

  • The most important part is to specify what exactly to do if a call to __kmalloc matches the chosen scenario. We replace the default definition of fpoint.fault_code for the group for __kmalloc function with the following (for simplicity, we do not care about the concurrency issues here):

        fpoint.fault_code =>>
            static int callNo = 0;
            
            /* just output a message and the call stack and go on normally */
            ++callNo;
            printk(KERN_INFO "[__kmalloc()] Matched call, PID=%d, call #%d\n",
                (int)(current->pid),
                callNo
            );
            dump_stack();
            returnValue = __kmalloc(size, flags);
        <<
    

    That is, if the indicator function returns non-zero for a given call to __kmalloc, we output the call number and PID of the corresponding process and call dump_stack() to output the call stack to the system log. Note that we call __kmalloc at the end because we do not intend to simulate its failure but rather to allow the execution to continue normally.

That is all for the payload module. Now we can build it and instruct KEDR to load it along with the corresponding core modules and the indicator. A configuration file like the following could be used to do this.

# Tracing support
module /usr/local/lib/modules/2.6.34.7-0.5-default/misc/kedr_trace.ko

# Fault simulation infrastructure
module /usr/local/lib/modules/2.6.34.7-0.5-default/misc/kedr_fault_simulation.ko

# Payload modules
payload /home/tester/work/kedr/payload_dump_stack/payload_dump_stack.ko 

# Indicators 
# We could use kedr_fsim_indicator_common.ko as well because we are 
# not going to set scenarios involving restrictions on the arguments 
# of __kmalloc.
module /usr/local/lib/modules/2.6.34.7-0.5-default/misc/kedr_fsim_indicator_kmalloc.ko

Now we can set the indicator for the point corresponding to __kmalloc. We use kmalloc indicator but common would also do this time. We restrict the scenario to the processes launched from the current shell and then enable the scenario as usual:

# echo "kmalloc" > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/current_indicator 
# echo $$ > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/pid 
# echo 1 > /sys/kernel/debug/kedr_fault_simulation/points/__kmalloc/expression

After that, we can load the target module by executing insmod from that shell. Call stacks for the four relevant calls to __kmalloc will be output to the system log. We are particularly interested in the call #3:

[  338.990197] [__kmalloc()] Matched call, PID=2943, call #3
[  338.990199] Pid: 2943, comm: insmod Not tainted 2.6.34.7-0.5-default #1
[  338.990200] Call Trace:
[  338.990203]  [<c0206303>] try_stack_unwind+0x173/0x190
[  338.990206]  [<c020509f>] dump_trace+0x3f/0xe0
[  338.990208]  [<c020636b>] show_trace_log_lvl+0x4b/0x60
[  338.990210]  [<c0206398>] show_trace+0x18/0x20
[  338.990212]  [<c05b9f5b>] dump_stack+0x6d/0x72
[  338.990215]  [<e0871a89>] repl___kmalloc+0xf9/0x150 [payload_dump_stack]
[  338.990223]  [<e0c5b55d>] my_mem_alloc+0x2d/0x60 [frmwk_mod]
[  338.990232]  [<e0c598cb>] my_object_alloc+0xb/0x20 [frmwk_mod]
[  338.990240]  [<e0c5d52f>] my_object_create+0xf/0x50 [frmwk_mod]
[  338.990250]  [<e0cd106f>] impl_init_subsystem+0xf/0x20 [target_mod]
[  338.990256]  [<e0cd100a>] impl_init+0x2a/0x40 [target_mod]
[  338.990262]  [<e0cb801e>] init+0x1e/0x20a [target_mod]
[  338.990264]  [<c020120e>] do_one_initcall+0x2e/0x180
[  338.990267]  [<c0277c11>] sys_init_module+0xb1/0x220
[  338.990269]  [<c0203190>] sysenter_do_call+0x12/0x22
[  338.990275]  [<ffffe430>] 0xffffe430 

Now that we have the stack trace for that call, it will probably be easier to find out what exactly was allocated there. The ordinary trace that KEDR facilities have also prepared could be used to check if this is the call we need to analyze.

If there are still too many calls to the target functions made in the context of a given process, one might want to filter the calls further to avoid filling the system log with lots of call stacks. We could instruct our system to trigger stack dump only for the call with a given number, or for the first N calls, or only for the calls where size parameter is 36, etc. All features of fault simulation scenarios can be used here (see Section 4.5, “Fault Simulation”).
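The filters mentioned above could be expressed as follows. This is a sketch under several assumptions: set_expression is a hypothetical helper name, the expressions use only the operators that appear in the examples of Section 4.5, and the last one relies on kmalloc indicator being set for the point, because common indicator does not provide size variable.

```shell
# set_expression POINT EXPRESSION [BASE_DIR]
# Hypothetical helper: writes a scenario expression for the given point.
# BASE_DIR defaults to the directory used throughout this section and
# assumes that debugfs is mounted to /sys/kernel/debug.
set_expression() {
    base=${3:-/sys/kernel/debug/kedr_fault_simulation}
    printf '%s\n' "$2" > "$base/points/$1/expression"
}

# Possible scenarios (to be run as root with the payload module and the
# indicator loaded):
#   set_expression __kmalloc 'times = 3'    # only the call number 3
#   set_expression __kmalloc 'times < 5'    # only the first 4 calls
#   set_expression __kmalloc 'size = 36'    # only the requests for 36 bytes
```

As noted in Section 4.5, when a nonzero pid is specified, times counts only the calls made by the selected process and its descendants, so the call numbers refer to those calls alone.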

Note

Note that as far as detection of memory leaks is concerned, the special plugin provided by KEDR for this purpose automatically obtains call stack information for suspicious memory allocations and deallocations. So it is not actually necessary to apply the technique described above when analyzing memory leaks. This example is here to demonstrate that the point and indicator infrastructure can be reused to obtain the desired information as well.