i#6662 public traces, part 2: encoding_filter #6663

Merged
merged 61 commits into from
May 6, 2024
Changes from 14 commits
Commits
c4106e0
i#6662 public traces, part 1: encoding_filter
edeiana Feb 21, 2024
66a29a2
Merge branch 'master' into i6662-public-record-filter
edeiana Apr 15, 2024
710a5d3
Saving previous work.
edeiana Apr 24, 2024
7d16393
Reverted back to not changing trace_entry_t length and pc.
edeiana Apr 25, 2024
d0f102c
Reverted back workaround for virtual register remapping.
edeiana Apr 25, 2024
38c2c60
Merge branch 'master' into i6662-public-record-filter
edeiana Apr 25, 2024
3653ee4
Added -encoding_filter_enabled flag, which also (indirectly) disables…
edeiana Apr 25, 2024
8263e76
Removed unnecessary headers.
edeiana Apr 25, 2024
8a121b4
We now use dynamorio encoding functionality, so we need to link it
edeiana Apr 25, 2024
cedc3bc
Fixing static dynamorio linking to client not working on mac.
edeiana Apr 25, 2024
064697f
Added TODO.
edeiana Apr 25, 2024
edb3d55
Fixed warning as error on windows.
edeiana Apr 25, 2024
8825a91
Minor comment improvement.
edeiana Apr 25, 2024
d4eb820
Another fix of warning as error on windows.
edeiana Apr 25, 2024
8924620
Now modifying the file_type of the trace to regdeps ISA.
edeiana Apr 29, 2024
9a8cd02
Improved comments.
edeiana Apr 29, 2024
f1fdd2c
Moved is_any_instr_type() to trace_entry.h.
edeiana Apr 29, 2024
0ef52fa
Refactoring: renaming of generic encoding_filter to
edeiana Apr 30, 2024
09c4d15
Renaming class encoding_filter_t to encodings2regdeps_t.
edeiana Apr 30, 2024
96d0925
Fixed memory leak.
edeiana Apr 30, 2024
3ebaf39
Added encodings2regdeps test (note: record_filter is tested
edeiana Apr 30, 2024
bfbe022
Fixed opcode_mix analyzer to work with DR_ISA_REGDEPS
edeiana Apr 30, 2024
6950c7b
Removed unnecessary space.
edeiana Apr 30, 2024
241819c
Indentation fix.
edeiana Apr 30, 2024
33e3bf0
Formatting fixed.
edeiana Apr 30, 2024
b61c5b7
Code cleanup.
edeiana Apr 30, 2024
05abd7e
clang-format pass.
edeiana Apr 30, 2024
454aac2
Templatex for new encodings2regdeps test. To fix.
edeiana Apr 30, 2024
84fa480
Improved comments.
edeiana Apr 30, 2024
4c7c201
Added OP_UNDECODED and interface between record_filter and its filters.
edeiana May 1, 2024
599fb04
Addressed minor PR feedback.
edeiana May 1, 2024
19c06a6
Updated version.
edeiana May 1, 2024
f31811c
Fix arm warning as error.
edeiana May 1, 2024
5fcc6c1
Fixed doxygen link in release doc.
edeiana May 1, 2024
201df94
From -encodings2regdeps to -filter_encodings2regdeps in test.
edeiana May 1, 2024
a31cd26
Doxygen could not resolve links.
edeiana May 1, 2024
d763c30
Fixed test, using existing infrastructure.
edeiana May 1, 2024
1fe93cb
Testing arm fix for opcode name.
edeiana May 1, 2024
7d40b2a
Fixed tests.
edeiana May 1, 2024
86bd2f8
Fixed encodings2regdeps opcode_mix analyzer test.
edeiana May 1, 2024
831d0a6
Reverted changed to OP_UNDECODED name for arm.
edeiana May 1, 2024
91fca5f
Moved include to its correct place, not the top of the file like clangd
edeiana May 1, 2024
f12c3a8
Addressing PR feedback.
edeiana May 2, 2024
807df45
Attempt to fix doxygen.
edeiana May 2, 2024
0e5d075
Attempt to fix doxygen 2.
edeiana May 2, 2024
8f483e9
Handling opcode name of OP_INVALID and OP_UNDECODED for x86.
edeiana May 2, 2024
fc50942
Fixed doxygen comment.
edeiana May 2, 2024
0e0a9b9
Added test entries to check different conditions.
edeiana May 2, 2024
3c416cb
Attempt to fix macos build.
edeiana May 2, 2024
db81af6
Attempt to fix macos build 2.
edeiana May 2, 2024
5b20bb5
Now using configure_DynamoRIO_standalone() for drmemtrace_record_filter.
edeiana May 2, 2024
b35faf4
Build fix, use before def in cmakelist.
edeiana May 2, 2024
2509adf
Fixed chunking in encodings2regdeps unit test.
edeiana May 2, 2024
fc0a744
Fixed test_encodings2regdeps_filter().
edeiana May 3, 2024
7a04272
Updated comment on chunk_footer, whose value is the chunk ordinal.
edeiana May 3, 2024
0f76ec8
Fixed doxygen comment on chunk_footer.
edeiana May 3, 2024
6b95e4d
Removed comment on marker value of TRACE_MARKER_TYPE_CHUNK_FOOTER.
edeiana May 5, 2024
4ffe2d1
Added memset for result reproducibility.
edeiana May 6, 2024
3d617ab
Added test for when real ISA encoding has more (or less)
edeiana May 6, 2024
bc7e4a0
Renaming encodings2regdeps* to encodings2regdeps_filter*
edeiana May 6, 2024
82a113b
Merge branch 'master' into i6662-public-record-filter
edeiana May 6, 2024
8 changes: 7 additions & 1 deletion clients/drcachesim/CMakeLists.txt
@@ -196,6 +196,7 @@ add_exported_library(drmemtrace_record_filter STATIC
tools/filter/cache_filter.h
tools/filter/cache_filter.cpp
tools/filter/type_filter.h
+ tools/filter/encoding_filter.h
tools/filter/null_filter.h)
target_link_libraries(drmemtrace_record_filter drmemtrace_simulator)

@@ -392,9 +393,14 @@ add_dependencies(prefetch_analyzer_launcher api_headers)
add_executable(record_filter_launcher
tools/record_filter_launcher.cpp
tests/test_helpers.cpp)
- target_link_libraries(record_filter_launcher drmemtrace_analyzer drmemtrace_record_filter)
+ target_link_libraries(record_filter_launcher drmemtrace_analyzer drmemtrace_record_filter
+ drmemtrace_raw2trace drcovlib_static drfrontendlib ${static_libc} ${zlib_libs})
add_dependencies(record_filter_launcher api_headers)
append_property_list(TARGET record_filter_launcher COMPILE_DEFINITIONS "NO_HELPER_MAIN")
+ use_DynamoRIO_extension(record_filter_launcher droption)
+ if (NOT APPLE)
+ configure_DynamoRIO_static(record_filter_launcher)
+ endif()

# We want to use test_helper's disable_popups() but we have _tmain and so do not want
# the test_helper library's main symbol: so we compile ourselves and disable.
3 changes: 2 additions & 1 deletion clients/drcachesim/analyzer_multi.cpp
@@ -334,7 +334,8 @@ record_analyzer_multi_t::create_analysis_tool_from_options(
op_outdir.get_value(), op_filter_stop_timestamp.get_value(),
op_filter_cache_size.get_value(), op_filter_trace_types.get_value(),
op_filter_marker_types.get_value(), op_trim_before_timestamp.get_value(),
- op_trim_after_timestamp.get_value(), op_verbose.get_value());
+ op_trim_after_timestamp.get_value(), op_encoding_filter_enabled.get_value(),
+ op_verbose.get_value());
}
ERRMSG("Usage error: unsupported record analyzer type \"%s\". Only " RECORD_FILTER
" is supported.\n",
5 changes: 5 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -972,6 +972,11 @@ droption_t<std::string>
"Comma-separated integers for marker types to remove. "
"See trace_marker_type_t for the list of marker types.");

+ droption_t<bool> op_encoding_filter_enabled(
+ DROPTION_SCOPE_FRONTEND, "encoding_filter_enabled", false,
+ "Enable converting the encoding of instructions to synthetic ISA DR_ISA_REGDEPS.",
+ "Enable converting the encoding of instructions to synthetic ISA DR_ISA_REGDEPS.");

droption_t<uint64_t> op_trim_before_timestamp(
DROPTION_SCOPE_ALL, "trim_before_timestamp", 0, 0,
(std::numeric_limits<uint64_t>::max)(),
1 change: 1 addition & 0 deletions clients/drcachesim/common/options.h
@@ -214,6 +214,7 @@ extern dynamorio::droption::droption_t<uint64_t> op_filter_stop_timestamp;
extern dynamorio::droption::droption_t<int> op_filter_cache_size;
extern dynamorio::droption::droption_t<std::string> op_filter_trace_types;
extern dynamorio::droption::droption_t<std::string> op_filter_marker_types;
+ extern dynamorio::droption::droption_t<bool> op_encoding_filter_enabled;
extern dynamorio::droption::droption_t<uint64_t> op_trim_before_timestamp;
extern dynamorio::droption::droption_t<uint64_t> op_trim_after_timestamp;
extern dynamorio::droption::droption_t<bool> op_abort_on_invariant_error;
3 changes: 2 additions & 1 deletion clients/drcachesim/reader/reader.cpp
@@ -210,7 +210,8 @@ reader_t::process_input_entry()
++cur_instr_count_;
// Look for encoding bits that belong to this instr.
if (last_encoding_.size > 0) {
- if (last_encoding_.size != cur_ref_.instr.size) {
+ if (!ignore_encoding_size_vs_instr_length_check_ &&
+ (last_encoding_.size != cur_ref_.instr.size)) {
ERRMSG(
"Encoding size %zu != instr size %zu for PC 0x%zx at ord %" PRIu64
" instr %" PRIu64 " last_timestamp=0x%" PRIx64 "\n",
1 change: 1 addition & 0 deletions clients/drcachesim/reader/reader.h
@@ -276,6 +276,7 @@ class reader_t : public std::iterator<std::input_iterator_tag, memref_t>,
// some thread-based checks may not apply.
bool core_sharded_ = false;
bool found_filetype_ = false;
+ bool ignore_encoding_size_vs_instr_length_check_ = false;

private:
memref_t cur_ref_;
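The reader change above makes the encoding-size check conditional on the new `ignore_encoding_size_vs_instr_length_check_` member. A minimal sketch of that predicate follows. The helper name `should_report_size_mismatch` is hypothetical and not part of the PR, and the motivation in the comments (a converted encoding need not be as long as the original instruction) is an assumption rather than something the diff states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the guarded check in the diff above.
// Returns true when a reader should report an encoding-size mismatch.
bool
should_report_size_mismatch(bool ignore_check, size_t encoding_size, size_t instr_size)
{
    // With no pending encoding bytes there is nothing to compare.
    if (encoding_size == 0)
        return false;
    // Assumption: callers set ignore_check when encodings were rewritten and
    // may legitimately differ in length from the recorded instruction size.
    return !ignore_check && encoding_size != instr_size;
}
```

With the flag left at its default of `false`, the behavior matches the original unconditional comparison.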
14 changes: 9 additions & 5 deletions clients/drcachesim/tests/record_filter_unit_tests.cpp
@@ -42,6 +42,7 @@
#include "tools/filter/record_filter.h"
#include "tools/filter/trim_filter.h"
#include "tools/filter/type_filter.h"
+ #include "trace_entry.h"
#include "zipfile_ostream.h"

#include <inttypes.h>
@@ -89,7 +90,8 @@ class test_record_filter_t : public dynamorio::drmemtrace::record_filter_t {
test_record_filter_t(std::vector<std::unique_ptr<record_filter_func_t>> filters,
uint64_t last_timestamp, bool write_archive = false)
: record_filter_t("", std::move(filters), last_timestamp,
- /*verbose=*/0)
+ /*verbose=*/0,
+ /*ignore_encoding_size_vs_instr_length_check*/ false)
, write_archive_(write_archive)
{
}
@@ -541,7 +543,8 @@ test_chunk_update()
return nullptr;
}
bool
- parallel_shard_filter(trace_entry_t &entry, void *shard_data) override
+ parallel_shard_filter(trace_entry_t &entry, void *shard_data,
+ std::vector<trace_entry_t> &last_encoding) override
{
bool res = true;
if (type_is_instr(static_cast<trace_type_t>(entry.type))) {
@@ -1026,9 +1029,10 @@ test_null_filter()
// other entries are expected to stay.
static constexpr uint64_t stop_timestamp_us = 1;
auto record_filter = std::unique_ptr<dynamorio::drmemtrace::record_filter_t>(
- new dynamorio::drmemtrace::record_filter_t(output_dir, std::move(filter_funcs),
- stop_timestamp_us,
- /*verbosity=*/0));
+ new dynamorio::drmemtrace::record_filter_t(
+ output_dir, std::move(filter_funcs), stop_timestamp_us,
+ /*verbosity=*/0,
+ /*ignore_encoding_size_vs_instr_length_check=*/false));
std::vector<record_analysis_tool_t *> tools;
tools.push_back(record_filter.get());
record_analyzer_t record_analyzer(op_trace_dir.get_value(), &tools[0],
3 changes: 2 additions & 1 deletion clients/drcachesim/tools/filter/cache_filter.cpp
@@ -87,7 +87,8 @@ cache_filter_t::parallel_shard_init(memtrace_stream_t *shard_stream,
return per_shard;
}
bool
- cache_filter_t::parallel_shard_filter(trace_entry_t &entry, void *shard_data)
+ cache_filter_t::parallel_shard_filter(trace_entry_t &entry, void *shard_data,
+ std::vector<trace_entry_t> &last_encoding)
{
if (entry.type == TRACE_TYPE_MARKER && entry.size == TRACE_MARKER_TYPE_FILETYPE) {
if (filter_instrs_)
3 changes: 2 additions & 1 deletion clients/drcachesim/tools/filter/cache_filter.h
@@ -55,7 +55,8 @@ class cache_filter_t : public record_filter_t::record_filter_func_t {
parallel_shard_init(memtrace_stream_t *shard_stream,
bool partial_trace_filter) override;
bool
- parallel_shard_filter(trace_entry_t &entry, void *shard_data) override;
+ parallel_shard_filter(trace_entry_t &entry, void *shard_data,
+ std::vector<trace_entry_t> &last_encoding) override;
bool
parallel_shard_exit(void *shard_data) override;

173 changes: 173 additions & 0 deletions clients/drcachesim/tools/filter/encoding_filter.h
@@ -0,0 +1,173 @@
/* **********************************************************
* Copyright (c) 2022-2024 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#ifndef _ENCODING_FILTER_H_
#define _ENCODING_FILTER_H_ 1

#include "record_filter.h"
#include "trace_entry.h"
#include "utils.h"

#include <cstring>
#include <vector>

/* We are not exporting the defines in core/ir/isa_regdeps/encoding_common.h, so we
* redefine DR_ISA_REGDEPS alignment requirement here.
*/
#define REGDEPS_ALIGN_BYTES 4

#define REGDEPS_MAX_ENCODING_LENGTH 16

namespace dynamorio {
namespace drmemtrace {

class encoding_filter_t : public record_filter_t::record_filter_func_t {
public:
encoding_filter_t()
{
}

void *
parallel_shard_init(memtrace_stream_t *shard_stream,
bool partial_trace_filter) override
{
dcontext_.dcontext = dr_standalone_init();
return nullptr;
}

bool
parallel_shard_filter(trace_entry_t &entry, void *shard_data,
std::vector<trace_entry_t> &last_encoding) override
{
/* TODO i#6662: modify trace_entry_t header entry to regdeps ISA, instead of the
* real ISA of the incoming trace.
*/

/* We have encoding to convert.
* Normally the sequence of trace_entry_t(s) looks like:
* [encoding,]+ instr_with_PC, [read | write]*
* (+ = one or more, * = zero or more)
* If we enter here, trace_entry_t is instr_with_PC.
*/
if (is_any_instr_type(static_cast<trace_type_t>(entry.type)) &&
!last_encoding.empty()) {
/* Gather real ISA encoding bytes looping through all previously saved
* encoding bytes in last_encoding.
*/
const app_pc pc = reinterpret_cast<app_pc>(entry.addr);
byte encoding[MAX_ENCODING_LENGTH];
memset(encoding, 0, sizeof(encoding));
uint encoding_offset = 0;
for (auto &trace_encoding : last_encoding) {
memcpy(encoding + encoding_offset, trace_encoding.encoding,
trace_encoding.size);
encoding_offset += trace_encoding.size;
}

/* Generate the real ISA instr_t by decoding the encoding bytes.
*/
instr_t instr;
instr_init(dcontext_.dcontext, &instr);
app_pc next_pc = decode_from_copy(dcontext_.dcontext, encoding, pc, &instr);
if (next_pc == NULL || !instr_valid(&instr)) {
instr_free(dcontext_.dcontext, &instr);
error_string_ =
"Failed to decode instruction " + to_hex_string(entry.addr);
return false;
}

/* Convert the real ISA instr_t into a regdeps ISA instr_t.
*/
instr_t instr_regdeps;
instr_init(dcontext_.dcontext, &instr_regdeps);
instr_convert_to_isa_regdeps(dcontext_.dcontext, &instr, &instr_regdeps);

/* Obtain regdeps ISA instr_t encoding bytes.
*/
byte ALIGN_VAR(REGDEPS_ALIGN_BYTES)
encoding_regdeps[REGDEPS_MAX_ENCODING_LENGTH];
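app_pc next_pc_regdeps =
instr_encode(dcontext_.dcontext, &instr_regdeps, encoding_regdeps);
/* Neither instr_t is needed once the regdeps encoding bytes exist. */
instr_free(dcontext_.dcontext, &instr);
instr_free(dcontext_.dcontext, &instr_regdeps);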

/* Compute number of trace_entry_t to contain regdeps ISA encoding.
* Each trace_entry_t record can hold up to 8 encoding bytes.
*/
uint trace_entry_encoding_size = (uint)sizeof(entry.addr); /* == 8 */
uint regdeps_encoding_size = (uint)(next_pc_regdeps - encoding_regdeps);
uint num_regdeps_encoding_entries =
ALIGN_FORWARD(regdeps_encoding_size, trace_entry_encoding_size) /
trace_entry_encoding_size;
last_encoding.resize(num_regdeps_encoding_entries);

/* Copy regdeps ISA encoding, splitting it among the last_encoding
* trace_entry_t records.
*/
uint regdeps_encoding_offset = 0;
for (trace_entry_t &encoding_entry : last_encoding) {
encoding_entry.type = TRACE_TYPE_ENCODING;
uint size = regdeps_encoding_size < trace_entry_encoding_size
? regdeps_encoding_size
: trace_entry_encoding_size;
encoding_entry.size = (unsigned short)size;
memset(encoding_entry.encoding, 0, trace_entry_encoding_size);
memcpy(encoding_entry.encoding,
encoding_regdeps + regdeps_encoding_offset, encoding_entry.size);
regdeps_encoding_size -= size;
regdeps_encoding_offset += encoding_entry.size;
}
}
return true;
}

bool
parallel_shard_exit(void *shard_data) override
{
return true;
}

private:
struct dcontext_cleanup_last_t {
public:
~dcontext_cleanup_last_t()
{
if (dcontext != nullptr)
dr_standalone_exit();
}
void *dcontext = nullptr;
};

dcontext_cleanup_last_t dcontext_;
};

} // namespace drmemtrace
} // namespace dynamorio
#endif /* _ENCODING_FILTER_H_ */
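The header above gathers encoding bytes from several records into one buffer and later splits the converted encoding back across records of at most 8 bytes each. The round trip can be sketched in isolation as follows. This is an illustration under stated assumptions, not the PR's code: `encoding_record_t`, `split_encoding`, and `gather_encoding` are hypothetical names, and the struct is a simplified stand-in for an encoding `trace_entry_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed payload capacity of one encoding record (8 bytes, as in the header).
constexpr size_t kRecordPayloadBytes = 8;

// Hypothetical stand-in for an encoding trace_entry_t.
struct encoding_record_t {
    uint16_t size;                      // Number of valid bytes in `bytes`.
    uint8_t bytes[kRecordPayloadBytes]; // Payload, zero-padded past `size`.
};

// Splits `encoding` into ceil(n / 8) records; the last record is zero-padded.
std::vector<encoding_record_t>
split_encoding(const std::vector<uint8_t> &encoding)
{
    std::vector<encoding_record_t> records;
    size_t remaining = encoding.size();
    size_t offset = 0;
    while (remaining > 0) {
        encoding_record_t record;
        std::memset(&record, 0, sizeof(record));
        const size_t chunk = std::min(remaining, kRecordPayloadBytes);
        record.size = static_cast<uint16_t>(chunk);
        std::memcpy(record.bytes, encoding.data() + offset, chunk);
        records.push_back(record);
        offset += chunk;
        // Subtract the bytes actually copied so the counter cannot wrap.
        remaining -= chunk;
    }
    return records;
}

// Concatenates the valid bytes of each record back into a single buffer.
std::vector<uint8_t>
gather_encoding(const std::vector<encoding_record_t> &records)
{
    std::vector<uint8_t> encoding;
    for (const encoding_record_t &record : records) {
        assert(record.size <= kRecordPayloadBytes);
        encoding.insert(encoding.end(), record.bytes, record.bytes + record.size);
    }
    return encoding;
}
```

For a 12-byte encoding this yields two records holding 8 and 4 valid bytes, and gathering them returns the original 12 bytes. Decrementing the remaining count by the chunk actually copied, rather than by the full record capacity, keeps the unsigned counter from wrapping on the final short record.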
3 changes: 2 additions & 1 deletion clients/drcachesim/tools/filter/null_filter.h
@@ -47,7 +47,8 @@ class null_filter_t : public record_filter_t::record_filter_func_t {
return nullptr;
}
bool
- parallel_shard_filter(trace_entry_t &entry, void *shard_data) override
+ parallel_shard_filter(trace_entry_t &entry, void *shard_data,
+ std::vector<trace_entry_t> &last_encoding) override
{
return true;
}
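Every filter in this diff gains a third `last_encoding` parameter, so a filter can inspect or rewrite the pending encoding records of an instruction in addition to the entry itself. A minimal sketch of that interface shape follows. All names here (`record_t`, `filter_func_t`, `pass_through_t`, `clear_encoding_t`, `run_filters`) are hypothetical, the record layout is not the real `trace_entry_t`, and running every filter even after one has voted to drop the entry is an assumption about the driver, not something this diff shows.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical, simplified record; not the real trace_entry_t layout.
struct record_t {
    int type;
    uint64_t value;
};

// Hypothetical filter interface modeled on the extended signature.
class filter_func_t {
public:
    virtual ~filter_func_t() = default;
    // Returns false to drop `entry`; may rewrite `last_encoding` in place.
    virtual bool
    filter(record_t &entry, std::vector<record_t> &last_encoding) = 0;
};

// Pass-through filter, analogous in spirit to null_filter_t.
class pass_through_t : public filter_func_t {
public:
    bool
    filter(record_t &, std::vector<record_t> &) override
    {
        return true;
    }
};

// Rewrites the pending encoding records without dropping the entry,
// which is the kind of access an encoding converter needs.
class clear_encoding_t : public filter_func_t {
public:
    bool
    filter(record_t &, std::vector<record_t> &last_encoding) override
    {
        last_encoding.clear();
        return true;
    }
};

// Runs every filter in order; the entry is kept only if all filters agree.
bool
run_filters(const std::vector<std::unique_ptr<filter_func_t>> &filters, record_t &entry,
            std::vector<record_t> &last_encoding)
{
    bool keep = true;
    for (const auto &f : filters) {
        assert(f != nullptr);
        if (!f->filter(entry, last_encoding))
            keep = false;
    }
    return keep;
}
```

Passing the encoding records by non-const reference lets a single filter change both their contents and their count, which the fixed two-argument signature could not express.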