Commit c49d14a

Walter Erquinigo committed Oct 25, 2022
[trace][intel pt] Simple detection of infinite decoding loops
The low-level decoder might fall into an infinite decoding loop for various reasons, the simplest being an infinite direct loop reached due to wrong handling of self-modified code in the kernel, e.g. it might reach

```
0x0A: pause
0x0C: jump to 0x0A
```

In this case, all the code is sequential and requires no packets to be decoded. The low-level decoder would produce an output like the following

```
0x0A: pause
0x0C: jump to 0x0A
0x0A: pause
0x0C: jump to 0x0A
0x0A: pause
0x0C: jump to 0x0A
... infinite amount of times
```

These cases require stopping the decoder to avoid infinite work and signaling this at least as a trace error.

- Add a check that breaks decoding of a single PSB once 500k instructions have been decoded since the last packet was processed.
- Add a check that looks for infinite loops after a certain number of instructions have been decoded since the last packet was processed.
- Add some `settings` properties for tweaking the thresholds of the checks above. This is also nice because it does the basic work needed for future settings.
- Add an AnomalyDetector class that inspects the DecodedThread and the libipt decoder in search of anomalies. These anomalies are then signaled as fatal errors in the trace.
- Add an ErrorStats class that keeps track of all the errors in a DecodedThread, with a special counter for fatal errors.
- Add an entry for decoded thread errors in the `dump info` command.

Some notes are added in the code and in the documentation of the settings, so please read them.

Besides that, I haven't been able to create a test case in LLVM style, but I've found an anomaly in the thread #12 of the trace 72533820-3eb8-4465-b8e4-4e6bf0ccca99 at Meta. We have to figure out how to artificially create traces with this kind of anomaly in LLVM style.

With this change, that anomalous thread now shows:

```
(lldb) thread trace dump instructions 12 -e -i 23101
thread #12: tid = 8
    ...missing instructions
    23101: (error) anomalous trace: possible infinite loop detected of size 2
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 5 [inlined] rep_nop at processor.h:13:2
    23100: 0xffffffff81342785    pause
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 7 at panic.c:87:2
    23099: 0xffffffff81342787    jmp    0xffffffff81342785    ; <+5> [inlined] rep_nop at processor.h:13:2
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 5 [inlined] rep_nop at processor.h:13:2
    23098: 0xffffffff81342785    pause
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 7 at panic.c:87:2
    23097: 0xffffffff81342787    jmp    0xffffffff81342785    ; <+5> [inlined] rep_nop at processor.h:13:2
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 5 [inlined] rep_nop at processor.h:13:2
    23096: 0xffffffff81342785    pause
  vmlinux-5.12.0-0_fbk8_clang_6656_gc85768aa64da`panic_smp_self_stop + 7 at panic.c:87:2
    23095: 0xffffffff81342787    jmp    0xffffffff81342785    ; <+5> [inlined] rep_nop at processor.h:13:2
```

It used to be in an infinite loop where the decoder never stopped.

Besides that, the dump info command shows

```
(lldb) thread trace dump info 12

  Errors:
    Number of individual errors: 32
    Number of fatal errors: 1
    Number of other errors: 31
```

and in json format

```
(lldb) thread trace dump info 12 -j

  "errors": {
    "totalCount": 32,
    "libiptErrors": {},
    "fatalErrors": 1,
    "otherErrors": 31
  }
```

Differential Revision: https://reviews.llvm.org/D136557
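The loop check described in the commit message can be illustrated outside of LLDB. The following is a minimal sketch, not the code from this commit: `FindTrailingLoopSize` is a hypothetical helper, and it assumes the trace has already been reduced to a plain vector of instruction load addresses (oldest first), whereas the real detector walks `DecodedThread` items and skips non-instruction entries. It finds the previous occurrence of the last address and then checks that the two adjacent segments match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of LLDB. `addrs` holds instruction load
// addresses, oldest first, with events and errors already filtered out.
// Returns the period of a repeating tail, or nullopt if none is found.
std::optional<uint64_t>
FindTrailingLoopSize(const std::vector<uint64_t> &addrs) {
  if (addrs.size() < 2)
    return std::nullopt;
  const size_t last = addrs.size() - 1;

  // Find the most recent earlier occurrence of the last address.
  size_t copy = last;
  bool found = false;
  while (copy > 0) {
    --copy;
    if (addrs[copy] == addrs[last]) {
      found = true;
      break;
    }
  }
  if (!found)
    return std::nullopt;

  const uint64_t loop_size = last - copy;
  assert(loop_size >= 1);

  // Walk both copies backwards in lockstep. Every address of the most recent
  // iteration must match the corresponding address one iteration earlier.
  for (uint64_t k = 1; k < loop_size; ++k) {
    if (copy < k)
      return std::nullopt; // Not enough history for a full second iteration.
    if (addrs[last - k] != addrs[copy - k])
      return std::nullopt;
  }
  return loop_size;
}
```

For the `pause` / `jmp` pair above, the address sequence `0x0A, 0x0C, 0x0A, 0x0C` yields a loop of size 2, which matches the "of size 2" reported in the error message. Like the check it imitates, this sketch only confirms one repetition of the candidate loop, so it is a heuristic rather than a proof that decoding would never terminate.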
1 parent c34de60 commit c49d14a

11 files changed, +453 -38 lines changed

‎lldb/include/lldb/Core/PluginManager.h

+6-1
@@ -342,7 +342,8 @@ class PluginManager {
342342
llvm::StringRef name, llvm::StringRef description,
343343
TraceCreateInstanceFromBundle create_callback_from_bundle,
344344
TraceCreateInstanceForLiveProcess create_callback_for_live_process,
345-
llvm::StringRef schema);
345+
llvm::StringRef schema,
346+
DebuggerInitializeCallback debugger_init_callback);
346347

347348
static bool
348349
UnregisterPlugin(TraceCreateInstanceFromBundle create_callback);
@@ -487,6 +488,10 @@ class PluginManager {
487488
Debugger &debugger, const lldb::OptionValuePropertiesSP &properties_sp,
488489
ConstString description, bool is_global_property);
489490

491+
static bool CreateSettingForTracePlugin(
492+
Debugger &debugger, const lldb::OptionValuePropertiesSP &properties_sp,
493+
ConstString description, bool is_global_property);
494+
490495
static lldb::OptionValuePropertiesSP
491496
GetSettingForObjectFilePlugin(Debugger &debugger, ConstString setting_name);
492497

‎lldb/source/Core/PluginManager.cpp

+14-4
@@ -1051,9 +1051,10 @@ struct TraceInstance
10511051
llvm::StringRef name, llvm::StringRef description,
10521052
CallbackType create_callback_from_bundle,
10531053
TraceCreateInstanceForLiveProcess create_callback_for_live_process,
1054-
llvm::StringRef schema)
1054+
llvm::StringRef schema, DebuggerInitializeCallback debugger_init_callback)
10551055
: PluginInstance<TraceCreateInstanceFromBundle>(
1056-
name, description, create_callback_from_bundle),
1056+
name, description, create_callback_from_bundle,
1057+
debugger_init_callback),
10571058
schema(schema),
10581059
create_callback_for_live_process(create_callback_for_live_process) {}
10591060

@@ -1072,10 +1073,10 @@ bool PluginManager::RegisterPlugin(
10721073
llvm::StringRef name, llvm::StringRef description,
10731074
TraceCreateInstanceFromBundle create_callback_from_bundle,
10741075
TraceCreateInstanceForLiveProcess create_callback_for_live_process,
1075-
llvm::StringRef schema) {
1076+
llvm::StringRef schema, DebuggerInitializeCallback debugger_init_callback) {
10761077
return GetTracePluginInstances().RegisterPlugin(
10771078
name, description, create_callback_from_bundle,
1078-
create_callback_for_live_process, schema);
1079+
create_callback_for_live_process, schema, debugger_init_callback);
10791080
}
10801081

10811082
bool PluginManager::UnregisterPlugin(
@@ -1506,6 +1507,7 @@ CreateSettingForPlugin(Debugger &debugger, ConstString plugin_type_name,
15061507
static const char *kDynamicLoaderPluginName("dynamic-loader");
15071508
static const char *kPlatformPluginName("platform");
15081509
static const char *kProcessPluginName("process");
1510+
static const char *kTracePluginName("trace");
15091511
static const char *kObjectFilePluginName("object-file");
15101512
static const char *kSymbolFilePluginName("symbol-file");
15111513
static const char *kJITLoaderPluginName("jit-loader");
@@ -1559,6 +1561,14 @@ bool PluginManager::CreateSettingForProcessPlugin(
15591561
properties_sp, description, is_global_property);
15601562
}
15611563

1564+
bool PluginManager::CreateSettingForTracePlugin(
1565+
Debugger &debugger, const lldb::OptionValuePropertiesSP &properties_sp,
1566+
ConstString description, bool is_global_property) {
1567+
return CreateSettingForPlugin(debugger, ConstString(kTracePluginName),
1568+
ConstString("Settings for trace plug-ins"),
1569+
properties_sp, description, is_global_property);
1570+
}
1571+
15621572
lldb::OptionValuePropertiesSP
15631573
PluginManager::GetSettingForObjectFilePlugin(Debugger &debugger,
15641574
ConstString setting_name) {

‎lldb/source/Plugins/Trace/intel-pt/CMakeLists.txt

+12-1
@@ -13,6 +13,14 @@ lldb_tablegen(TraceIntelPTCommandOptions.inc -gen-lldb-option-defs
1313
SOURCE TraceIntelPTOptions.td
1414
TARGET TraceIntelPTOptionsGen)
1515

16+
lldb_tablegen(TraceIntelPTProperties.inc -gen-lldb-property-defs
17+
SOURCE TraceIntelPTProperties.td
18+
TARGET TraceIntelPTPropertiesGen)
19+
20+
lldb_tablegen(TraceIntelPTPropertiesEnum.inc -gen-lldb-property-enum-defs
21+
SOURCE TraceIntelPTProperties.td
22+
TARGET TraceIntelPTPropertiesEnumGen)
23+
1624
add_lldb_library(lldbPluginTraceIntelPT PLUGIN
1725
CommandObjectTraceStartIntelPT.cpp
1826
DecodedThread.cpp
@@ -38,4 +46,7 @@ add_lldb_library(lldbPluginTraceIntelPT PLUGIN
3846
)
3947

4048

41-
add_dependencies(lldbPluginTraceIntelPT TraceIntelPTOptionsGen)
49+
add_dependencies(lldbPluginTraceIntelPT
50+
TraceIntelPTOptionsGen
51+
TraceIntelPTPropertiesGen
52+
TraceIntelPTPropertiesEnumGen)

‎lldb/source/Plugins/Trace/intel-pt/DecodedThread.cpp

+31-6
@@ -170,34 +170,36 @@ DecodedThread::GetNanosecondsRangeByIndex(uint64_t item_index) {
170170
return prev(next_it)->second;
171171
}
172172

173+
uint64_t DecodedThread::GetTotalInstructionCount() const {
174+
return m_insn_count;
175+
}
176+
173177
void DecodedThread::AppendEvent(lldb::TraceEvent event) {
174178
CreateNewTraceItem(lldb::eTraceItemKindEvent).event = event;
175179
m_events_stats.RecordEvent(event);
176180
}
177181

178182
void DecodedThread::AppendInstruction(const pt_insn &insn) {
179183
CreateNewTraceItem(lldb::eTraceItemKindInstruction).load_address = insn.ip;
184+
m_insn_count++;
180185
}
181186

182187
void DecodedThread::AppendError(const IntelPTError &error) {
183188
CreateNewTraceItem(lldb::eTraceItemKindError).error =
184189
ConstString(error.message()).AsCString();
190+
m_error_stats.RecordError(/*fatal=*/false);
185191
}
186192

187-
void DecodedThread::AppendCustomError(StringRef err) {
193+
void DecodedThread::AppendCustomError(StringRef err, bool fatal) {
188194
CreateNewTraceItem(lldb::eTraceItemKindError).error =
189195
ConstString(err).AsCString();
196+
m_error_stats.RecordError(fatal);
190197
}
191198

192199
lldb::TraceEvent DecodedThread::GetEventByIndex(int item_index) const {
193200
return m_item_data[item_index].event;
194201
}
195202

196-
void DecodedThread::LibiptErrorsStats::RecordError(int libipt_error_code) {
197-
libipt_errors_counts[pt_errstr(pt_errcode(libipt_error_code))]++;
198-
total_count++;
199-
}
200-
201203
const DecodedThread::EventsStats &DecodedThread::GetEventsStats() const {
202204
return m_events_stats;
203205
}
@@ -207,6 +209,29 @@ void DecodedThread::EventsStats::RecordEvent(lldb::TraceEvent event) {
207209
total_count++;
208210
}
209211

212+
uint64_t DecodedThread::ErrorStats::GetTotalCount() const {
213+
uint64_t total = 0;
214+
for (const auto &[kind, count] : libipt_errors)
215+
total += count;
216+
217+
return total + other_errors + fatal_errors;
218+
}
219+
220+
void DecodedThread::ErrorStats::RecordError(bool fatal) {
221+
if (fatal)
222+
fatal_errors++;
223+
else
224+
other_errors++;
225+
}
226+
227+
void DecodedThread::ErrorStats::RecordError(int libipt_error_code) {
228+
libipt_errors[pt_errstr(pt_errcode(libipt_error_code))]++;
229+
}
230+
231+
const DecodedThread::ErrorStats &DecodedThread::GetErrorStats() const {
232+
return m_error_stats;
233+
}
234+
210235
lldb::TraceItemKind
211236
DecodedThread::GetItemKindByIndex(uint64_t item_index) const {
212237
return static_cast<lldb::TraceItemKind>(m_item_kinds[item_index]);
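
The counting rules of `ErrorStats` can be tried in isolation. The sketch below is a stand-in under stated assumptions, not the class from this commit: `ErrorStatsSketch` and its method names are hypothetical, and it uses `std::map` with `std::string` keys instead of `llvm::DenseMap<const char *, uint64_t>` so that it builds without LLVM headers. It shows that the three counters are disjoint and that the total is their sum.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Illustrative stand-in for DecodedThread::ErrorStats (names are assumptions).
struct ErrorStatsSketch {
  // Each error is counted in exactly one of these three places.
  uint64_t other_errors = 0;
  uint64_t fatal_errors = 0;
  std::map<std::string, uint64_t> libipt_errors; // libipt error name -> count

  // Mirrors RecordError(bool fatal) for custom, non-libipt errors.
  void RecordCustomError(bool fatal) {
    if (fatal)
      fatal_errors++;
    else
      other_errors++;
  }

  // Mirrors RecordError(int libipt_error_code), keyed by name for simplicity.
  void RecordLibiptError(const std::string &name) {
    assert(!name.empty());
    libipt_errors[name]++;
  }

  // The total is derived on demand instead of being stored separately.
  uint64_t GetTotalCount() const {
    uint64_t total = other_errors + fatal_errors;
    for (const auto &entry : libipt_errors)
      total += entry.second;
    return total;
  }
};
```

Recording 31 non-fatal errors and 1 fatal error gives a total of 32, the same breakdown as the `dump info` output quoted in the commit message.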

‎lldb/source/Plugins/Trace/intel-pt/DecodedThread.h

+49-13
@@ -61,15 +61,6 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
6161
public:
6262
using TSC = uint64_t;
6363

64-
// Struct holding counts for libipts errors;
65-
struct LibiptErrorsStats {
66-
// libipt error -> count
67-
llvm::DenseMap<const char *, int> libipt_errors_counts;
68-
size_t total_count = 0;
69-
70-
void RecordError(int libipt_error_code);
71-
};
72-
7364
/// A structure that represents a maximal range of trace items associated to
7465
/// the same TSC value.
7566
struct TSCRange {
@@ -125,16 +116,38 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
125116
bool InRange(uint64_t item_index) const;
126117
};
127118

128-
// Struct holding counts for events;
119+
// Struct holding counts for events
129120
struct EventsStats {
130121
/// A count for each individual event kind. We use an unordered map instead
131122
/// of a DenseMap because DenseMap can't understand enums.
132-
std::unordered_map<lldb::TraceEvent, size_t> events_counts;
133-
size_t total_count = 0;
123+
///
124+
/// Note: We can't use DenseMap because lldb::TraceEvent is not
125+
/// automatically handled correctly by DenseMap. We'd need to implement a
126+
/// custom DenseMapInfo struct for TraceEvent and that's a bit too much for
127+
/// such a simple structure.
128+
std::unordered_map<lldb::TraceEvent, uint64_t> events_counts;
129+
uint64_t total_count = 0;
134130

135131
void RecordEvent(lldb::TraceEvent event);
136132
};
137133

134+
// Struct holding counts for errors
135+
struct ErrorStats {
136+
/// The following counters are mutually exclusive
137+
/// \{
138+
uint64_t other_errors = 0;
139+
uint64_t fatal_errors = 0;
140+
// libipt error -> count
141+
llvm::DenseMap<const char *, uint64_t> libipt_errors;
142+
/// \}
143+
144+
uint64_t GetTotalCount() const;
145+
146+
void RecordError(int libipt_error_code);
147+
148+
void RecordError(bool fatal);
149+
};
150+
138151
DecodedThread(
139152
lldb::ThreadSP thread_sp,
140153
const llvm::Optional<LinuxPerfZeroTscConversion> &tsc_conversion);
@@ -194,12 +207,22 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
194207
/// The load address of the instruction at the given index.
195208
lldb::addr_t GetInstructionLoadAddress(uint64_t item_index) const;
196209

210+
/// \return
211+
/// The number of instructions in this trace (not trace items).
212+
uint64_t GetTotalInstructionCount() const;
213+
197214
/// Return an object with statistics of the trace events that happened.
198215
///
199216
/// \return
200217
/// The stats object of all the events.
201218
const EventsStats &GetEventsStats() const;
202219

220+
/// Return an object with statistics of the trace errors that happened.
221+
///
222+
/// \return
223+
/// The stats object of all the errors.
224+
const ErrorStats &GetErrorStats() const;
225+
203226
/// The approximate size in bytes used by this instance,
204227
/// including all the already decoded instructions.
205228
size_t CalculateApproximateMemoryUsage() const;
@@ -221,7 +244,14 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
221244
void AppendError(const IntelPTError &error);
222245

223246
/// Append a custom decoding.
224-
void AppendCustomError(llvm::StringRef error);
247+
///
248+
/// \param[in] error
249+
/// The error message.
250+
///
251+
/// \param[in] fatal
252+
/// If \b true, then the whole decoded thread should be discarded because a
253+
/// fatal anomaly has been found.
254+
void AppendCustomError(llvm::StringRef error, bool fatal = false);
225255

226256
/// Append an event.
227257
void AppendEvent(lldb::TraceEvent);
@@ -289,10 +319,16 @@ class DecodedThread : public std::enable_shared_from_this<DecodedThread> {
289319
/// TSC -> nanos conversion utility.
290320
llvm::Optional<LinuxPerfZeroTscConversion> m_tsc_conversion;
291321

322+
/// Statistics of all tracing errors.
323+
ErrorStats m_error_stats;
324+
292325
/// Statistics of all tracing events.
293326
EventsStats m_events_stats;
294327
/// Total amount of time spent decoding.
295328
std::chrono::milliseconds m_total_decoding_time{0};
329+
330+
/// Total number of instructions in the trace.
331+
uint64_t m_insn_count = 0;
296332
};
297333

298334
using DecodedThreadSP = std::shared_ptr<DecodedThread>;

‎lldb/source/Plugins/Trace/intel-pt/LibiptDecoder.cpp

+202-9
@@ -128,6 +128,182 @@ CreateQueryDecoder(TraceIntelPT &trace_intel_pt, ArrayRef<uint8_t> buffer) {
128128
return PtQueryDecoderUP(decoder_ptr, QueryDecoderDeleter);
129129
}
130130

131+
/// Class used to identify anomalies in traces, which should often indicate a
132+
/// fatal error in the trace.
133+
class PSBBlockAnomalyDetector {
134+
public:
135+
PSBBlockAnomalyDetector(pt_insn_decoder &decoder,
136+
TraceIntelPT &trace_intel_pt,
137+
DecodedThread &decoded_thread)
138+
: m_decoder(decoder), m_decoded_thread(decoded_thread) {
139+
m_infinite_decoding_loop_threshold =
140+
trace_intel_pt.GetGlobalProperties()
141+
.GetInfiniteDecodingLoopVerificationThreshold();
142+
m_extremely_large_decoding_threshold =
143+
trace_intel_pt.GetGlobalProperties()
144+
.GetExtremelyLargeDecodingThreshold();
145+
m_next_infinite_decoding_loop_threshold =
146+
m_infinite_decoding_loop_threshold;
147+
}
148+
149+
/// \return
150+
/// An \a llvm::Error if an anomaly that includes the last instruction item
151+
/// in the trace was found, or \a llvm::Error::success otherwise.
152+
Error DetectAnomaly() {
153+
RefreshPacketOffset();
154+
uint64_t insn_added_since_last_packet_offset =
155+
m_decoded_thread.GetTotalInstructionCount() -
156+
m_insn_count_at_last_packet_offset;
157+
158+
// We want to check if we might have fallen in an infinite loop. As this
159+
// check is not a no-op, we want to do it when we have a strong suggestion
160+
// that things went wrong. First, we check how many instructions we have
161+
// decoded since we processed an Intel PT packet for the last time. This
162+
// number should be low, because at some point we should see branches, jumps
163+
// or interrupts that require a new packet to be processed. Once we reach
164+
// a certain threshold we start analyzing the trace.
165+
//
166+
// We use the number of decoded instructions since the last Intel PT packet
167+
// as a proxy because, in fact, we don't expect a single packet to give,
168+
// say, 100k instructions. That would mean that there are 100k sequential
169+
// instructions without any single branch, which is highly unlikely, or that
170+
// we found an infinite loop using direct jumps, e.g.
171+
//
172+
// 0x0A: nop or pause
173+
// 0x0C: jump to 0x0A
174+
//
175+
// which is indeed code that is found in the kernel. I presume we reach
176+
// this kind of code in the decoder because we don't handle self-modified
177+
// code in post-mortem kernel traces.
178+
//
179+
// We are right now only signaling the anomaly as a trace error, but it
180+
// would be more conservative to also discard all the trace items found in
181+
// this PSB. I prefer not to do that for the time being to give more
182+
// exposure to this kind of anomalies and help debugging. Discarding the
183+
// trace items would just make investigation harder.
184+
//
185+
// Finally, if the user wants to see if a specific thread has an anomaly,
186+
// it's enough to run the `thread trace dump info` command and look for the
187+
// count of this kind of errors.
188+
189+
if (insn_added_since_last_packet_offset >=
190+
m_extremely_large_decoding_threshold) {
191+
// In this case, we have decoded a massive amount of sequential
192+
// instructions that don't loop. Honestly I wonder if this will ever
193+
// happen, but better safe than sorry.
194+
return createStringError(
195+
inconvertibleErrorCode(),
196+
"anomalous trace: possible infinite trace detected");
197+
}
198+
if (insn_added_since_last_packet_offset ==
199+
m_next_infinite_decoding_loop_threshold) {
200+
if (Optional<uint64_t> loop_size = TryIdentifyInfiniteLoop()) {
201+
return createStringError(
202+
inconvertibleErrorCode(),
203+
"anomalous trace: possible infinite loop detected of size %" PRIu64,
204+
*loop_size);
205+
}
206+
m_next_infinite_decoding_loop_threshold *= 2;
207+
}
208+
return Error::success();
209+
}
210+
211+
private:
212+
Optional<uint64_t> TryIdentifyInfiniteLoop() {
213+
// The infinite decoding loops we'll encounter are due to sequential
214+
// instructions that repeat themselves due to direct jumps, therefore in a
215+
// cycle each individual address will only appear once. We use this
216+
// information to detect cycles by finding the last 2 occurrences of the last
217+
// instruction added to the trace. Then we traverse the trace making sure
218+
// that these two instructions were the ends of a repeating loop.
219+
220+
// This is a utility that returns the most recent instruction index given a
221+
// position in the trace. If the given position is an instruction, that
222+
// position is returned. It skips non-instruction items.
223+
auto most_recent_insn_index =
224+
[&](uint64_t item_index) -> Optional<uint64_t> {
225+
while (true) {
226+
if (m_decoded_thread.GetItemKindByIndex(item_index) ==
227+
lldb::eTraceItemKindInstruction) {
228+
return item_index;
229+
}
230+
if (item_index == 0)
231+
return None;
232+
item_index--;
233+
}
234+
return None;
235+
};
236+
// Similar to most_recent_insn_index but skips the starting position.
237+
auto prev_insn_index = [&](uint64_t item_index) -> Optional<uint64_t> {
238+
if (item_index == 0)
239+
return None;
240+
return most_recent_insn_index(item_index - 1);
241+
};
242+
243+
// We first find the most recent instruction.
244+
Optional<uint64_t> last_insn_index_opt =
245+
prev_insn_index(m_decoded_thread.GetItemsCount());
246+
if (!last_insn_index_opt)
247+
return None;
248+
uint64_t last_insn_index = *last_insn_index_opt;
249+
250+
// We then find the most recent previous occurrence of that last
251+
// instruction.
252+
Optional<uint64_t> last_insn_copy_index = prev_insn_index(last_insn_index);
253+
uint64_t loop_size = 1;
254+
while (last_insn_copy_index &&
255+
m_decoded_thread.GetInstructionLoadAddress(*last_insn_copy_index) !=
256+
m_decoded_thread.GetInstructionLoadAddress(last_insn_index)) {
257+
last_insn_copy_index = prev_insn_index(*last_insn_copy_index);
258+
loop_size++;
259+
}
260+
if (!last_insn_copy_index)
261+
return None;
262+
263+
// Now we check if the segment between these last positions of the last
264+
// instruction address is in fact a repeating loop.
265+
uint64_t loop_elements_visited = 1;
266+
uint64_t insn_index_a = last_insn_index,
267+
insn_index_b = *last_insn_copy_index;
268+
while (loop_elements_visited < loop_size) {
269+
if (Optional<uint64_t> prev = prev_insn_index(insn_index_a))
270+
insn_index_a = *prev;
271+
else
272+
return None;
273+
if (Optional<uint64_t> prev = prev_insn_index(insn_index_b))
274+
insn_index_b = *prev;
275+
else
276+
return None;
277+
if (m_decoded_thread.GetInstructionLoadAddress(insn_index_a) !=
278+
m_decoded_thread.GetInstructionLoadAddress(insn_index_b))
279+
return None;
280+
loop_elements_visited++;
281+
}
282+
return loop_size;
283+
}
284+
285+
// Refresh the internal counters if a new packet offset has been visited
286+
void RefreshPacketOffset() {
287+
lldb::addr_t new_packet_offset;
288+
if (!IsLibiptError(pt_insn_get_offset(&m_decoder, &new_packet_offset)) &&
289+
new_packet_offset != m_last_packet_offset) {
290+
m_last_packet_offset = new_packet_offset;
291+
m_next_infinite_decoding_loop_threshold =
292+
m_infinite_decoding_loop_threshold;
293+
m_insn_count_at_last_packet_offset =
294+
m_decoded_thread.GetTotalInstructionCount();
295+
}
296+
}
297+
298+
pt_insn_decoder &m_decoder;
299+
DecodedThread &m_decoded_thread;
300+
lldb::addr_t m_last_packet_offset = LLDB_INVALID_ADDRESS;
301+
uint64_t m_insn_count_at_last_packet_offset = 0;
302+
uint64_t m_infinite_decoding_loop_threshold;
303+
uint64_t m_next_infinite_decoding_loop_threshold;
304+
uint64_t m_extremely_large_decoding_threshold;
305+
};
306+
131307
/// Class that decodes a raw buffer for a single PSB block using the low level
132308
/// libipt library. It assumes that kernel and user mode instructions are not
133309
/// mixed in the same PSB block.
@@ -155,9 +331,10 @@ class PSBBlockDecoder {
155331
/// appended to. It might have already some instructions.
156332
PSBBlockDecoder(PtInsnDecoderUP &&decoder_up, const PSBBlock &psb_block,
157333
Optional<lldb::addr_t> next_block_ip,
158-
DecodedThread &decoded_thread)
334+
DecodedThread &decoded_thread, TraceIntelPT &trace_intel_pt)
159335
: m_decoder_up(std::move(decoder_up)), m_psb_block(psb_block),
160-
m_next_block_ip(next_block_ip), m_decoded_thread(decoded_thread) {}
336+
m_next_block_ip(next_block_ip), m_decoded_thread(decoded_thread),
337+
m_anomaly_detector(*m_decoder_up, trace_intel_pt, decoded_thread) {}
161338

162339
/// \param[in] trace_intel_pt
163340
/// The main Trace object that own the PSB block.
@@ -192,7 +369,7 @@ class PSBBlockDecoder {
192369
return decoder_up.takeError();
193370

194371
return PSBBlockDecoder(std::move(*decoder_up), psb_block, next_block_ip,
195-
decoded_thread);
372+
decoded_thread, trace_intel_pt);
196373
}
197374

198375
void DecodePSBBlock() {
@@ -213,12 +390,24 @@ class PSBBlockDecoder {
213390
}
214391

215392
private:
216-
/// Decode all the instructions and events of the given PSB block.
217-
///
218-
/// \param[in] status
219-
/// The status that was result of synchronizing to the most recent PSB.
393+
/// Append an instruction and return \b false if and only if a serious anomaly
394+
/// has been detected.
395+
bool AppendInstructionAndDetectAnomalies(const pt_insn &insn) {
396+
m_decoded_thread.AppendInstruction(insn);
397+
398+
if (Error err = m_anomaly_detector.DetectAnomaly()) {
399+
m_decoded_thread.AppendCustomError(toString(std::move(err)),
400+
/*fatal=*/true);
401+
return false;
402+
}
403+
return true;
404+
}
405+
/// Decode all the instructions and events of the given PSB block. The
406+
/// decoding loop might stop abruptly if an infinite decoding loop is
407+
/// detected.
220408
void DecodeInstructionsAndEvents(int status) {
221409
pt_insn insn;
410+
222411
while (true) {
223412
status = ProcessPTEvents(status);
224413

@@ -238,7 +427,9 @@ class PSBBlockDecoder {
238427
} else if (IsEndOfStream(status)) {
239428
break;
240429
}
241-
m_decoded_thread.AppendInstruction(insn);
430+
431+
if (!AppendInstructionAndDetectAnomalies(insn))
432+
return;
242433
}
243434

244435
// We need to keep querying non-branching instructions until we hit the
@@ -247,7 +438,8 @@ class PSBBlockDecoder {
247438
// https://github.com/intel/libipt/blob/master/doc/howto_libipt.md#parallel-decode
248439
if (m_next_block_ip && insn.ip != 0) {
249440
while (insn.ip != *m_next_block_ip) {
250-
m_decoded_thread.AppendInstruction(insn);
441+
if (!AppendInstructionAndDetectAnomalies(insn))
442+
return;
251443

252444
status = pt_insn_next(m_decoder_up.get(), &insn, sizeof(insn));
253445

@@ -313,6 +505,7 @@ class PSBBlockDecoder {
313505
PSBBlock m_psb_block;
314506
Optional<lldb::addr_t> m_next_block_ip;
315507
DecodedThread &m_decoded_thread;
508+
PSBBlockAnomalyDetector m_anomaly_detector;
316509
};
317510

318511
Error lldb_private::trace_intel_pt::DecodeSingleTraceForThread(

‎lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.cpp

+78-4
@@ -16,6 +16,7 @@
1616
#include "TraceIntelPTBundleSaver.h"
1717
#include "TraceIntelPTConstants.h"
1818
#include "lldb/Core/PluginManager.h"
19+
#include "lldb/Interpreter/OptionValueProperties.h"
1920
#include "lldb/Target/Process.h"
2021
#include "lldb/Target/Target.h"
2122
#include "llvm/ADT/None.h"
@@ -39,11 +40,57 @@ TraceIntelPT::GetThreadTraceStartCommand(CommandInterpreter &interpreter) {
3940
new CommandObjectThreadTraceStartIntelPT(*this, interpreter));
4041
}
4142

43+
#define LLDB_PROPERTIES_traceintelpt
44+
#include "TraceIntelPTProperties.inc"
45+
46+
enum {
47+
#define LLDB_PROPERTIES_traceintelpt
48+
#include "TraceIntelPTPropertiesEnum.inc"
49+
};
50+
51+
ConstString TraceIntelPT::PluginProperties::GetSettingName() {
52+
return ConstString(TraceIntelPT::GetPluginNameStatic());
53+
}
54+
55+
TraceIntelPT::PluginProperties::PluginProperties() : Properties() {
56+
m_collection_sp = std::make_shared<OptionValueProperties>(GetSettingName());
57+
m_collection_sp->Initialize(g_traceintelpt_properties);
58+
}
59+
60+
uint64_t
61+
TraceIntelPT::PluginProperties::GetInfiniteDecodingLoopVerificationThreshold() {
62+
const uint32_t idx = ePropertyInfiniteDecodingLoopVerificationThreshold;
63+
return m_collection_sp->GetPropertyAtIndexAsUInt64(
64+
nullptr, idx, g_traceintelpt_properties[idx].default_uint_value);
65+
}
66+
67+
uint64_t TraceIntelPT::PluginProperties::GetExtremelyLargeDecodingThreshold() {
68+
const uint32_t idx = ePropertyExtremelyLargeDecodingThreshold;
69+
return m_collection_sp->GetPropertyAtIndexAsUInt64(
70+
nullptr, idx, g_traceintelpt_properties[idx].default_uint_value);
71+
}
72+
73+
TraceIntelPT::PluginProperties &TraceIntelPT::GetGlobalProperties() {
74+
static TraceIntelPT::PluginProperties g_settings;
75+
return g_settings;
76+
}
77+
4278
void TraceIntelPT::Initialize() {
43-
PluginManager::RegisterPlugin(GetPluginNameStatic(), "Intel Processor Trace",
44-
CreateInstanceForTraceBundle,
45-
CreateInstanceForLiveProcess,
46-
TraceIntelPTBundleLoader::GetSchema());
79+
PluginManager::RegisterPlugin(
80+
GetPluginNameStatic(), "Intel Processor Trace",
81+
CreateInstanceForTraceBundle, CreateInstanceForLiveProcess,
82+
TraceIntelPTBundleLoader::GetSchema(), DebuggerInitialize);
83+
}
84+
85+
void TraceIntelPT::DebuggerInitialize(Debugger &debugger) {
86+
if (!PluginManager::GetSettingForProcessPlugin(
87+
debugger, PluginProperties::GetSettingName())) {
88+
const bool is_global_setting = true;
89+
PluginManager::CreateSettingForTracePlugin(
90+
debugger, GetGlobalProperties().GetValueProperties(),
91+
ConstString("Properties for the intel-pt trace plug-in."),
92+
is_global_setting);
93+
}
4794
}
4895

4996
void TraceIntelPT::Terminate() {
@@ -273,6 +320,20 @@ void TraceIntelPT::DumpTraceInfo(Thread &thread, Stream &s, bool verbose,
273320
event_to_count.second);
274321
}
275322
}
323+
// Trace error stats
324+
{
325+
const DecodedThread::ErrorStats &error_stats =
326+
decoded_thread_sp->GetErrorStats();
327+
s << "\n Errors:\n";
328+
s.Format(" Number of individual errors: {0}\n",
329+
error_stats.GetTotalCount());
330+
s.Format(" Number of fatal errors: {0}\n", error_stats.fatal_errors);
331+
for (const auto &[kind, count] : error_stats.libipt_errors) {
332+
s.Format(" Number of libipt errors of kind [{0}]: {1}\n", kind,
333+
count);
334+
}
335+
s.Format(" Number of other errors: {0}\n", error_stats.other_errors);
336+
}
276337

277338
if (storage.multicpu_decoder) {
278339
s << "\n Multi-cpu decoding:\n";
@@ -353,6 +414,19 @@ void TraceIntelPT::DumpTraceInfoAsJson(Thread &thread, Stream &s,
353414
}
354415
});
355416
});
417+
// Trace error stats
418+
const DecodedThread::ErrorStats &error_stats =
419+
decoded_thread_sp->GetErrorStats();
420+
json_str.attributeObject("errors", [&] {
421+
json_str.attribute("totalCount", error_stats.GetTotalCount());
422+
json_str.attributeObject("libiptErrors", [&] {
423+
for (const auto &[kind, count] : error_stats.libipt_errors) {
424+
json_str.attribute(kind, count);
425+
}
426+
});
427+
json_str.attribute("fatalErrors", error_stats.fatal_errors);
428+
json_str.attribute("otherErrors", error_stats.other_errors);
429+
});
356430

357431
if (storage.multicpu_decoder) {
358432
json_str.attribute(

‎lldb/source/Plugins/Trace/intel-pt/TraceIntelPT.h

+19
@@ -22,6 +22,23 @@ namespace trace_intel_pt {
2222

2323
class TraceIntelPT : public Trace {
2424
public:
25+
/// Properties to be used with the `settings` command.
26+
class PluginProperties : public Properties {
27+
public:
28+
static ConstString GetSettingName();
29+
30+
PluginProperties();
31+
32+
~PluginProperties() override = default;
33+
34+
uint64_t GetInfiniteDecodingLoopVerificationThreshold();
35+
36+
uint64_t GetExtremelyLargeDecodingThreshold();
37+
};
38+
39+
/// Return the global properties for this trace plug-in.
40+
static PluginProperties &GetGlobalProperties();
41+
2542
void Dump(Stream *s) const override;
2643

2744
llvm::Expected<FileSpec> SaveToDisk(FileSpec directory,
@@ -59,6 +76,8 @@ class TraceIntelPT : public Trace {
5976
CreateInstanceForLiveProcess(Process &process);
6077

6178
static llvm::StringRef GetPluginNameStatic() { return "intel-pt"; }
79+
80+
static void DebuggerInitialize(Debugger &debugger);
6281
/// \}
6382

6483
lldb::CommandObjectSP

lldb/source/Plugins/Trace/intel-pt/TraceIntelPTProperties.td

+24
@@ -0,0 +1,24 @@
1+
include "../../../../include/lldb/Core/PropertiesBase.td"
2+
3+
let Definition = "traceintelpt" in {
4+
def InfiniteDecodingLoopVerificationThreshold:
5+
Property<"infinite-decoding-loop-verification-threshold", "UInt64">,
6+
Global,
7+
DefaultUnsignedValue<10000>,
8+
Desc<"Specify how many instructions following an individual Intel PT "
9+
"packet must have been decoded before triggering the verification of "
10+
"infinite decoding loops. If no decoding loop has been found after this "
11+
"threshold T, another attempt will be done after 2T instructions, then "
12+
"4T, 8T and so on, which guarantees a total linear time spent checking "
13+
"this anomaly. If a loop is found, then decoding of the corresponding "
14+
"PSB block is stopped. An error is hence emitted in the trace and "
15+
"decoding is resumed in the next PSB block.">;
16+
def ExtremelyLargeDecodingThreshold:
17+
Property<"extremely-large-decoding-threshold", "UInt64">,
18+
Global,
19+
DefaultUnsignedValue<500000>,
20+
Desc<"Specify how many instructions following an individual Intel PT "
21+
"packet must have been decoded before stopping the decoding of the "
22+
"corresponding PSB block. An error is hence emitted in the trace and "
23+
"decoding is resumed in the next PSB block.">;
24+
}
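
The description of `infinite-decoding-loop-verification-threshold` claims that retrying at T, 2T, 4T, 8T and so on keeps the total checking time linear. The sketch below simulates that schedule under a simplifying assumption that is not taken from the commit: each check is charged a cost equal to the number of instructions decoded since the last packet at the moment it runs. `SimulateDoublingSchedule` and `ScheduleCost` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Result of the simulation: how many checks ran and how much work they did
// under the assumed cost model (one unit per instruction decoded so far).
struct ScheduleCost {
  uint64_t checks = 0;
  uint64_t items_scanned = 0;
};

// Hypothetical simulation of the doubling schedule for a run of `insn_count`
// instructions decoded without reaching a new Intel PT packet.
ScheduleCost SimulateDoublingSchedule(uint64_t insn_count, uint64_t threshold) {
  assert(threshold > 0);
  ScheduleCost cost;
  uint64_t next_check = threshold;
  for (uint64_t decoded = 1; decoded <= insn_count; ++decoded) {
    if (decoded == next_check) {
      cost.checks++;
      cost.items_scanned += decoded; // Assumed worst-case cost of one check.
      next_check *= 2;               // Back off: T, 2T, 4T, 8T, ...
    }
  }
  return cost;
}
```

With the default T = 10000 and 100000 instructions, checks fire at 10000, 20000, 40000 and 80000, for an assumed total of 150000 scanned items. Because the check points form a geometric series, the sum stays below twice the number of decoded instructions, which is the linear bound the setting description refers to.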

‎lldb/test/API/commands/trace/TestTraceDumpInfo.py

+6
@@ -78,6 +78,12 @@ def testDumpRawTraceSizeJSON(self):
7878
"software disabled tracing": 2,
7979
"trace synchronization point": 1
8080
}
81+
},
82+
"errors": {
83+
"totalCount": 0,
84+
"libiptErrors": {},
85+
"fatalErrors": 0,
86+
"otherErrors": 0
8187
}
8288
},
8389
"globalStats": {

‎lldb/test/API/commands/trace/TestTraceLoad.py

+12
@@ -37,6 +37,12 @@ def testLoadMultiCoreTrace(self):
3737
"totalCount": 0,
3838
"individualCounts": {}
3939
},
40+
"errors": {
41+
"totalCount": 0,
42+
"libiptErrors": {},
43+
"fatalErrors": 0,
44+
"otherErrors": 0
45+
},
4046
"continuousExecutions": 0,
4147
"PSBBlocks": 0
4248
},
@@ -72,6 +78,12 @@ def testLoadMultiCoreTrace(self):
7278
"HW clock tick": 8
7379
}
7480
},
81+
"errors": {
82+
"totalCount": 1,
83+
"libiptErrors": {},
84+
"fatalErrors": 0,
85+
"otherErrors": 1
86+
},
7587
"continuousExecutions": 1,
7688
"PSBBlocks": 1
7789
},
