blob: 6f2aeabf7faadd51a33b146a1ee7b2591e2f9cb8 [file] [log] [blame]
lh9ed821d2023-04-07 01:36:19 -07001 Using the Linux Kernel Latency Histograms
2
3
4This document gives a short explanation how to enable, configure and use
5latency histograms. Latency histograms are primarily relevant in the
6context of real-time enabled kernels (CONFIG_PREEMPT/CONFIG_PREEMPT_RT)
7and are used in the quality management of the Linux real-time
8capabilities.
9
10
11* Purpose of latency histograms
12
13A latency histogram continuously accumulates the frequencies of latency
14data. There are two types of histograms
15- potential sources of latencies
16- effective latencies
17
18
19* Potential sources of latencies
20
21Potential sources of latencies are code segments where interrupts,
22preemption or both are disabled (aka critical sections). To create
23histograms of potential sources of latency, the kernel stores the time
24stamp at the start of a critical section, determines the time elapsed
25when the end of the section is reached, and increments the frequency
26counter of that latency value - irrespective of whether any concurrently
27running process is affected by latency or not.
28- Configuration items (in the Kernel hacking/Tracers submenu)
29 CONFIG_INTERRUPT_OFF_LATENCY
30 CONFIG_PREEMPT_OFF_LATENCY
31
32
33* Effective latencies
34
35Effective latencies are actually occuring during wakeup of a process. To
36determine effective latencies, the kernel stores the time stamp when a
37process is scheduled to be woken up, and determines the duration of the
38wakeup time shortly before control is passed over to this process. Note
39that the apparent latency in user space may be somewhat longer, since the
40process may be interrupted after control is passed over to it but before
41the execution in user space takes place. Simply measuring the interval
42between enqueuing and wakeup may also not appropriate in cases when a
43process is scheduled as a result of a timer expiration. The timer may have
44missed its deadline, e.g. due to disabled interrupts, but this latency
45would not be registered. Therefore, the offsets of missed timers are
46recorded in a separate histogram. If both wakeup latency and missed timer
47offsets are configured and enabled, a third histogram may be enabled that
48records the overall latency as a sum of the timer latency, if any, and the
49wakeup latency. This histogram is called "timerandwakeup".
50- Configuration items (in the Kernel hacking/Tracers submenu)
51 CONFIG_WAKEUP_LATENCY
52 CONFIG_MISSED_TIMER_OFSETS
53
54
55* Usage
56
57The interface to the administration of the latency histograms is located
58in the debugfs file system. To mount it, either enter
59
60mount -t sysfs nodev /sys
61mount -t debugfs nodev /sys/kernel/debug
62
63from shell command line level, or add
64
65nodev /sys sysfs defaults 0 0
66nodev /sys/kernel/debug debugfs defaults 0 0
67
68to the file /etc/fstab. All latency histogram related files are then
69available in the directory /sys/kernel/debug/tracing/latency_hist. A
70particular histogram type is enabled by writing non-zero to the related
71variable in the /sys/kernel/debug/tracing/latency_hist/enable directory.
72Select "preemptirqsoff" for the histograms of potential sources of
73latencies and "wakeup" for histograms of effective latencies etc. The
74histogram data - one per CPU - are available in the files
75
76/sys/kernel/debug/tracing/latency_hist/preemptoff/CPUx
77/sys/kernel/debug/tracing/latency_hist/irqsoff/CPUx
78/sys/kernel/debug/tracing/latency_hist/preemptirqsoff/CPUx
79/sys/kernel/debug/tracing/latency_hist/wakeup/CPUx
80/sys/kernel/debug/tracing/latency_hist/wakeup/sharedprio/CPUx
81/sys/kernel/debug/tracing/latency_hist/missed_timer_offsets/CPUx
82/sys/kernel/debug/tracing/latency_hist/timerandwakeup/CPUx
83
84The histograms are reset by writing non-zero to the file "reset" in a
85particular latency directory. To reset all latency data, use
86
87#!/bin/sh
88
89TRACINGDIR=/sys/kernel/debug/tracing
90HISTDIR=$TRACINGDIR/latency_hist
91
92if test -d $HISTDIR
93then
94 cd $HISTDIR
95 for i in `find . | grep /reset$`
96 do
97 echo 1 >$i
98 done
99fi
100
101
102* Data format
103
104Latency data are stored with a resolution of one microsecond. The
105maximum latency is 10,240 microseconds. The data are only valid, if the
106overflow register is empty. Every output line contains the latency in
107microseconds in the first row and the number of samples in the second
108row. To display only lines with a positive latency count, use, for
109example,
110
111grep -v " 0$" /sys/kernel/debug/tracing/latency_hist/preemptoff/CPU0
112
113#Minimum latency: 0 microseconds.
114#Average latency: 0 microseconds.
115#Maximum latency: 25 microseconds.
116#Total samples: 3104770694
117#There are 0 samples greater or equal than 10240 microseconds
118#usecs samples
119 0 2984486876
120 1 49843506
121 2 58219047
122 3 5348126
123 4 2187960
124 5 3388262
125 6 959289
126 7 208294
127 8 40420
128 9 4485
129 10 14918
130 11 18340
131 12 25052
132 13 19455
133 14 5602
134 15 969
135 16 47
136 17 18
137 18 14
138 19 1
139 20 3
140 21 2
141 22 5
142 23 2
143 25 1
144
145
146* Wakeup latency of a selected process
147
148To only collect wakeup latency data of a particular process, write the
149PID of the requested process to
150
151/sys/kernel/debug/tracing/latency_hist/wakeup/pid
152
153PIDs are not considered, if this variable is set to 0.
154
155
156* Details of the process with the highest wakeup latency so far
157
158Selected data of the process that suffered from the highest wakeup
159latency that occurred in a particular CPU are available in the file
160
161/sys/kernel/debug/tracing/latency_hist/wakeup/max_latency-CPUx.
162
163In addition, other relevant system data at the time when the
164latency occurred are given.
165
166The format of the data is (all in one line):
167<PID> <Priority> <Latency> (<Timeroffset>) <Command> \
168<- <PID> <Priority> <Command> <Timestamp>
169
170The value of <Timeroffset> is only relevant in the combined timer
171and wakeup latency recording. In the wakeup recording, it is
172always 0, in the missed_timer_offsets recording, it is the same
173as <Latency>.
174
175When retrospectively searching for the origin of a latency and
176tracing was not enabled, it may be helpful to know the name and
177some basic data of the task that (finally) was switching to the
178late real-tlme task. In addition to the victim's data, also the
179data of the possible culprit are therefore displayed after the
180"<-" symbol.
181
182Finally, the timestamp of the time when the latency occurred
183in <seconds>.<microseconds> after the most recent system boot
184is provided.
185
186These data are also reset when the wakeup histogram is reset.