Alexey Khoroshilov, 03/14/2016 09:46 PM

h1. RaceHound

RaceHound can be used to detect data races in the Linux kernel on x86.

Kernel 3.14 or newer is required. Kprobes, kallsyms and debugfs support are also needed in the kernel.

The ideas implemented here are similar to how the DataCollider tool operates on MS Windows.

h2. How it works

The kernel part of RaceHound (the racehound.ko module) monitors instructions in the kernel code at runtime. It operates as follows:

1. Place software breakpoints (Kprobes, actually) at the locations specified by the user.

2. When a software breakpoint hits, check if the corresponding instruction is about to access memory. Determine the address and the size of that memory area.

3. Save the contents of that area (or at least a part of it).

4. Place a hardware breakpoint on that memory area to detect writes (if the instruction reads from it) or both reads and writes (if the instruction writes to it).

5. Wait for a delay. The length of the delay is configurable. If some code accesses that memory area during the delay, the hardware breakpoint may detect it.

6. Disarm the hardware breakpoint.

7. Just in case, check if the contents of that memory area have changed during the delay. This may help detect races that were not caught above.

8. Let the execution continue as usual.

The races found are reported to the system log and to the racehound/events file in debugfs.

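The steps above can be illustrated with a small user-space model. This is only a sketch of the logic in steps 3 to 7, not RaceHound's actual code: the names @Access@, @watch_kinds@ and @simulate_hit@ are made up for illustration, "memory" is a plain bytearray, and the @hw_available@ flag is an assumption used to show the case where only the content check of step 7 can notice a change.

```python
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


def watch_kinds(monitored):
    """Step 4: kinds of concurrent access that conflict with the monitored one."""
    if monitored is Access.READ:
        # A read only conflicts with concurrent writes.
        return {Access.WRITE}
    # A write conflicts with both concurrent reads and writes.
    return {Access.READ, Access.WRITE}


def simulate_hit(memory, addr, size, monitored, concurrent, hw_available=True):
    """Model one software breakpoint hit.

    `concurrent` lists (kind, address, value) single-byte accesses that other
    code makes during the delay; `value` is ignored for reads.
    """
    snapshot = bytes(memory[addr:addr + size])        # step 3: save contents
    watched = watch_kinds(monitored)                  # step 4: arm the watch
    reports = []
    for kind, where, value in concurrent:             # step 5: the delay
        if kind is Access.WRITE:
            memory[where] = value
        in_area = addr <= where < addr + size
        if hw_available and in_area and kind in watched:
            reports.append("hardware breakpoint: conflicting %s" % kind.value)
    # Step 6 (disarming the breakpoint) has no counterpart in this model.
    if not reports and bytes(memory[addr:addr + size]) != snapshot:
        reports.append("memory changed during the delay")   # step 7
    return reports
```

For example, @simulate_hit(bytearray(8), 2, 4, Access.READ, [(Access.WRITE, 3, 7)])@ yields a single "hardware breakpoint" report, while a concurrent read of the same byte yields none.
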
h2. Build prerequisites

* cmake 2.8.10 or newer.
* Everything that is needed to build kernel modules (kernel development files, etc.).
* GCC C compiler and C++ compiler with C++11 support (GCC 4.9 or newer is preferable).
* (only if you want to build the "ma_lines" plugin) GCC development files needed to build plugins.
* elfutils, libelf, libdw and their development files.

h3. Build

  $ mkdir racehound.build
  $ cd racehound.build
  $ cmake path_to_racehound_sources
  $ make

To install RaceHound, you can now execute "make install" as root.

Note that RaceHound is installed for the current kernel only. If you update or otherwise change the kernel, please rebuild RaceHound and install it again.

h3. Self-tests (optional)

To build the tests provided with RaceHound, run "make build_tests" in the directory where RaceHound was built. Then you can run "ctest" as root there to run all the tests.

The tests check the basic functionality of RaceHound.

h2. Usage

It is assumed here that debugfs is available and mounted at /sys/kernel/debug.

h3. Scenario 1: monitoring a set of instructions

1. The user can tell the kernel part of RaceHound to monitor particular instructions in the kernel code. The locations of these instructions should be written to /sys/kernel/debug/racehound/breakpoints in the following format:

<pre>
  [<module_name>:]{init|core}+0xoffset[,delay=<value>]
</pre>

If <module_name> is not specified, the kernel or a built-in module is assumed.

"init" and "core" are the areas containing the code of the kernel or a module in memory. See module_init and module_core in struct module. Dealing with ELF sections in the kernel has its difficulties, as does dealing with the addresses and sizes of functions, so RaceHound uses the "init" and "core" areas instead.

"delay" can be used to set the length of the delay for the given monitored instruction when a software breakpoint hits. If it is not specified, the "delay" parameter of the "racehound" kernel module will be used (or "delay_in_atomic" in atomic context, if set). The value is in milliseconds.

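To make the location format concrete, here is a minimal parsing sketch in Python. It is not part of RaceHound; @Location@ and @parse_location@ are illustrative names. It assumes that the delay value is a decimal integer and that a module name contains no colons, commas or whitespace, which the format description above does not state explicitly.

```python
import re
from collections import namedtuple

# Fields: module is None for the kernel proper or a built-in module,
# delay_ms is None when no per-location delay is given.
Location = namedtuple("Location", "module area offset delay_ms")

_LOCATION_RE = re.compile(
    r"^(?:(?P<module>[^:,\s]+):)?"                    # optional "<module_name>:"
    r"(?P<area>init|core)\+0x(?P<offset>[0-9a-fA-F]+)"  # area and hex offset
    r"(?:,delay=(?P<delay>\d+))?$"                    # optional ",delay=<value>"
)


def parse_location(text):
    """Split a location string into its parts; raise ValueError if malformed."""
    m = _LOCATION_RE.match(text.strip())
    if m is None:
        raise ValueError("not a valid location: %r" % text)
    delay = m.group("delay")
    return Location(
        module=m.group("module"),
        area=m.group("area"),
        offset=int(m.group("offset"), 16),
        delay_ms=int(delay) if delay is not None else None,
    )
```

With these assumptions, @parse_location("core+0x77654,delay=50")@ gives a location with no module name, area "core", offset 0x77654 and a delay of 50 milliseconds.
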
The lines2insns tool from this project can help prepare the location strings in the correct format.

Suppose you would like to monitor the memory accesses corresponding to lines 82 and 84 in the something.c source file of test_module.ko. The module should be built with debug info. Then you can do the following.

  $ echo "something.c:82" | lines2insns test_module.ko
  test_module:core+0x22f

  $ echo "something.c:84" | lines2insns test_module.ko
  test_module:core+0x251

If you are interested, say, only in writes at something.c:84, you can specify this as well:

  $ echo "something.c:84:write" | lines2insns test_module.ko
  test_module:core+0x251

If you know the ELF sections and the offsets there for the instructions of interest, lines2insns can convert that to the proper format too:
  $ echo "test_module:.text+0x251" | lines2insns --section-to-area test_module.ko
  test_common_target:core+0x251

  $ echo "test_module:.exit.text+0x1" | lines2insns --section-to-area test_module.ko
  test_common_target:core+0x335

  $ echo "test_module:.init.text+0xdd" | lines2insns --section-to-area test_module.ko
  test_common_target:init+0xdd

See lines2insns --help for more details.

So, suppose you would like to monitor the instructions at the following locations in test_module:
  test_module:core+0x22f
  test_module:core+0x251
...as well as one location in the kernel proper or in a built-in module:
  core+0x77654

2. Load racehound.ko if it is not loaded yet.

  insmod /usr/local/lib/modules/$(uname -r)/misc/racehound.ko [delay=...]

You can optionally specify how long to delay execution of the instructions to check for conflicting memory accesses. The "delay" parameter of the "racehound" kernel module can be used for that. The value is in milliseconds. The default is about 5000/HZ (5 jiffies).

If you would like to use a different delay for atomic context, please specify it in the "delay_in_atomic" parameter.

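The default of "about 5000/HZ" follows from the length of a jiffy: one timer tick lasts 1000/HZ milliseconds, so 5 jiffies last 5000/HZ milliseconds. The snippet below only works through that arithmetic for a few common HZ values; @default_delay_ms@ is an illustrative name, and the value the module actually uses may be rounded.

```python
def default_delay_ms(hz, jiffies=5):
    """Length of `jiffies` timer ticks in milliseconds for a kernel built with HZ=hz."""
    return jiffies * 1000 / hz


for hz in (100, 250, 1000):
    # HZ=100 gives 50 ms, HZ=250 gives 20 ms, HZ=1000 gives 5 ms.
    print("HZ=%d: about %g ms" % (hz, default_delay_ms(hz)))
```
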
3. Instruct RaceHound to monitor the given instructions (as root).

  echo "test_module:core+0x22f" > /sys/kernel/debug/racehound/breakpoints
  echo "test_module:core+0x251" > /sys/kernel/debug/racehound/breakpoints
  echo "core+0x77654,delay=50" > /sys/kernel/debug/racehound/breakpoints

Note that the execution of the instruction at "core+0x77654" will be delayed by 50 milliseconds ("delay=50") regardless of the global delay settings of RaceHound.

Reading /sys/kernel/debug/racehound/breakpoints will show the list of the instructions to be monitored.

Note that it is no longer required to load RaceHound before the analyzed module(s).

As soon as RaceHound receives the list of instructions to monitor, it will check if the corresponding components of the kernel are loaded. If they are, the monitoring will start immediately. If a module is not loaded yet, RaceHound will wait for it to load and will process it after that.

Different kernel components can be monitored simultaneously.

To stop monitoring an instruction, you can write the same string as before to /sys/kernel/debug/racehound/breakpoints but with "-" prepended. Example:

  echo "-core+0x77654" > /sys/kernel/debug/racehound/breakpoints

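The same operations can be scripted. The following is a minimal Python sketch rather than a tool shipped with RaceHound; the function names are made up for illustration. It mirrors the echo commands above by writing one location per open/write/close cycle, and it takes the file path as a parameter so that it can be tried against an ordinary file first.

```python
BREAKPOINTS_FILE = "/sys/kernel/debug/racehound/breakpoints"


def _write_line(line, path):
    # One location per write, like a separate `echo ... > file` for each.
    with open(path, "w") as f:
        f.write(line + "\n")


def add_breakpoint(location, path=BREAKPOINTS_FILE):
    """Ask RaceHound to monitor `location`, e.g. "core+0x77654,delay=50"."""
    _write_line(location, path)


def remove_breakpoint(location, path=BREAKPOINTS_FILE):
    """Stop monitoring `location` by writing it with "-" prepended."""
    _write_line("-" + location, path)


def list_breakpoints(path=BREAKPOINTS_FILE):
    """Return the currently monitored locations, one string per line read."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```

Like the echo commands, these functions need root privileges when used on the real debugfs file. If a write is rejected, Python raises OSError; the system log may then explain why.
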
4. Make the analyzed kernel code run.

5. If RaceHound detects a race, it will output something like this to the system log (see dmesg):

  [rh] Detected a data race on the memory block at ffffffffa09b936c
  between the instruction at test_module:core+0x22f (comm: "sh")
  and the instruction right before my_func+0x18/0x20 [test_module] (comm: "sh").

or

  [rh] Detected a data race on the memory block at ffffffffa09b936c
  that is about to be accessed by the instruction at
  test_module:core+0x251 (comm: "sh"):
  the memory block was modified during the delay.

If the race is detected only because the memory area has been modified during the delay, RaceHound, obviously, has no idea which code made that modification. The hardware breakpoints, on the other hand, report both parties involved.

6. If "test_module" is built with debug info, you can use addr2line or a similar tool to find the source lines for the conflicting access (my_func+0x18/0x20 in the example above).

7. Unload the "racehound" module when it is no longer needed. If the analyzed kernel components are still loaded at that point, RaceHound will automatically "detach" from them first.

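When many reports accumulate in the system log, it may be convenient to extract them with a script. The sketch below is an illustration only: it assumes the reports look like the two samples in step 5, which are themselves described as "something like this", so the patterns may need adjusting for a given RaceHound version. @find_races@ is a hypothetical helper name.

```python
import re

# Start of a report, with the address of the memory block.
_RACE_RE = re.compile(
    r"\[rh\] Detected a data race on the memory block at ([0-9a-f]+)")
# A monitored location such as "test_module:core+0x22f" or "core+0x77654".
_LOC_RE = re.compile(r"(?:[\w.-]+:)?(?:init|core)\+0x[0-9a-f]+")


def find_races(log_text):
    """Return (address, monitored_location) pairs found in `log_text`.

    The location is None if none appears before the next report starts.
    """
    starts = list(_RACE_RE.finditer(log_text))
    races = []
    for i, m in enumerate(starts):
        # Only look inside the current report, up to the next "[rh]" line.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        loc = _LOC_RE.search(log_text, m.end(), end)
        races.append((int(m.group(1), 16), loc.group(0) if loc else None))
    return races
```

The text to scan could come, for instance, from the output of dmesg saved to a file.
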
Notes and tips

* If the kernel is built with CONFIG_PREEMPT=y, it may help make the system more responsive when using RaceHound, especially if you use delays of seconds or longer.

* There are only 4 hardware breakpoints available on an x86 CPU, so monitoring too many instructions that execute often may be pointless. The exact values of "too many" and "often" may vary, of course.

* If the "hot code paths" are constantly monitored, the performance overhead may become significant as well.

* If you suspect two instructions to race against each other, it is usually better to monitor only one of them. The reports about the races might be less useful otherwise: the accesses from RaceHound itself may be listed there.

* If you try to write a location to monitor to /sys/kernel/debug/racehound/breakpoints and it fails, check the system log (dmesg); it may provide more info.

On x86-64, adding a location to monitor may also fail if the kernel does not have the following fix:
"kprobes/x86: Return correct length in __copy_instruction()"
(commit c80e5c0c23ce2282476fdc64c4b5e3d3a40723fd in the mainline)

h3. Scenario 2: sweeping through the code

The idea is as follows. Suppose we have a list of the instructions from a given portion of the code, say, part of a module or of the kernel. The list may be quite long, and monitoring all these instructions at the same time can be a bad idea (performance overhead, etc.).

Note that the "ma_lines" plugin for GCC 4.9+ can be used to get the list of the locations in the source code where memory accesses may happen (except the code written in assembly). lines2insns will translate it to the list of locations in the binary code. See ma_lines/Readme.txt.

The kernel part of RaceHound provides info about which software breakpoints have been hit and which races have been found. It is available via the /sys/kernel/debug/racehound/events file. poll/select can be used on that file to wait until events are available there. The events can then be read, one per line.

The format for the "BP hit" events is the same as used for the racehound/breakpoints file, see above.

For the found races, the corresponding event lines are prefixed with "[race]".

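As a rough illustration of how a reader of the events file could be structured, here is a minimal Python sketch. It is not the events.py example from the project. The helper names are made up, and the behaviour of the file beyond what is described above (poll/select support, one event per line, the "[race]" prefix) is assumed rather than known, including what the rest of a "[race]" line contains and what a zero-length read means.

```python
import os
import select

EVENTS_FILE = "/sys/kernel/debug/racehound/events"


def classify_event(line):
    """Return ("race", rest_of_line) or ("bp_hit", location) for one event line."""
    line = line.strip()
    if line.startswith("[race]"):
        return ("race", line[len("[race]"):].strip())
    return ("bp_hit", line)


def feed(pending, chunk):
    """Append `chunk` to `pending` bytes; return (complete_lines, new_pending)."""
    *lines, rest = (pending + chunk).split(b"\n")
    return [raw.decode() for raw in lines], rest


def watch_events(handler, path=EVENTS_FILE, timeout_ms=1000):
    """Wait for events with poll() and call handler(kind, text) for each line."""
    fd = os.open(path, os.O_RDONLY)
    try:
        poller = select.poll()
        poller.register(fd, select.POLLIN)
        pending = b""
        while True:
            if not poller.poll(timeout_ms):
                continue          # timed out, nothing to read yet
            chunk = os.read(fd, 4096)
            if not chunk:
                break             # assumption: a zero-length read means "done"
            lines, pending = feed(pending, chunk)
            for line in lines:
                handler(*classify_event(line))
    finally:
        os.close(fd)
```

The file is read through a raw file descriptor rather than a buffered file object so that poll() and the reads stay in step with each other.
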
An example that demonstrates the usage of this API is also provided:
https://github.com/winnukem/racehound/blob/master/examples/events.py

A user-space application may use the information about the events to automatically start and stop monitoring the instructions from the list according to some policy. This should make it possible to keep the overhead at an acceptable level.

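One very simple policy, shown here only as a hypothetical sketch, is to sweep through the list in fixed-size batches so that no more than a given number of instructions are monitored at any time. The callbacks @enable@, @disable@ and @run_for_a_while@ are placeholders supplied by the caller (for instance, functions that write to the breakpoints file and then sleep); the policy actually used by the project's example may be different.

```python
def batches(locations, size):
    """Yield consecutive slices of `locations`, each with at most `size` items."""
    for start in range(0, len(locations), size):
        yield locations[start:start + size]


def sweep(locations, size, enable, disable, run_for_a_while):
    """Monitor the locations one batch at a time.

    enable(loc) and disable(loc) start and stop monitoring one location;
    run_for_a_while() lets the analyzed code run while a batch is active.
    """
    for batch in batches(locations, size):
        for loc in batch:
            enable(loc)
        try:
            run_for_a_while()
        finally:
            # Always stop monitoring the batch, even if something failed.
            for loc in batch:
                disable(loc)
```

A more elaborate policy could use the "BP hit" events to drop frequently executed locations early and so reduce the overhead further.
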
An example of such an application is provided here:
https://github.com/winnukem/racehound/blob/master/examples/check_races.py

Note that both events.py and check_races.py need Python 3.4 or newer.