yactfr (yac·tif·er) is yet another CTF reader.
The yactfr library is written in C++14 and offers a (🥁…) C++14 API.
While the CTF reading libraries that I know about focus on decoding and providing you with complete, ordered event record objects, the yactfr API offers a lower level of CTF processing: you iterate individual element sequences to obtain elements: beginning/end of packet, beginning/end of event record, beggining/end of structure, individual data stream scalar values like fixed-length integers, fixed-length floating point numbers, and null-terminated strings, specific clock value update, known data stream ID, and the rest. This means the sizes of event records and packets don’t influence the performance or memory usage of yactfr.
Some use cases where this library can prove useful:
-
Trace inspection programs, for example when you need to debug a custom CTF producer.
-
Foundation of an API with a higher level of abstraction.
-
Targeted offline analyses with low memory usage.
The name yactfr takes its inspiration from yajl, a JSON parser with a similar approach.
Warning
|
yactfr is not mature yet! Its API is still experimental: the interface could change at any time. |
-
Only depends on Boost at build time, and nothing at run time (except for your C++ runtime library):
$ ldd libyactfr.so linux-vdso.so.1 (0x00007ffe98fa5000) libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fb8337d2000) libm.so.6 => /usr/lib/libm.so.6 (0x00007fb83343d000) libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fb833225000) libc.so.6 => /usr/lib/libc.so.6 (0x00007fb832e69000) /usr/lib64/ld-linux-x86-64.so.2 (0x00007fb833e51000)
-
Simple, documented C++14 API with a focus on immutability.
-
Efficient data stream decoding: zero memory allocation, zero smart pointer reference count change, zero virtual method call, and zero copy during an element sequence iteration.
-
yactfr decodes a data stream fixed-length bit array in place: the result is available as an element member.
-
A data stream string/BLOB is available as one or more raw data parts in consecutive elements. The actual raw data is a pointer to the current data block of the data source (for example, from a memory mapped region), which means zero string/BLOB copy.
This is possible because, as per CTF 1.8.3 and CTF2-SPEC-2.0, data stream strings and BLOBs have an 8-bit alignment requirement.
-
-
Decodes CTF packets of any size and event records of any size with steady memory usage and performance.
-
Offers a memory mapped file view data source, but you can also implement your own data source.
-
It’s safe to use two different iterators on the same element sequence in two different threads.
-
Full CTF metadata text inspection and validation: a parsing error exception contains a list of locations (offset in bytes, line number, and column number) with associated context messages.
-
Full data stream packet decoding information with the input iterator interface, for example:
-
The decoded packet magic number.
-
The decoded metadata stream UUID.
-
The expected total length of the current packet.
-
The expected content length of the current packet.
-
The default clock is updated.
-
The next data stream type to use.
-
The next event record type to use.
-
The data stream (instance) ID.
-
-
“Seek packet” iterator method to seek the beginning of an element known to be located at a specific offset, in bytes, within the same element sequence.
You can use this method with non-CTF packet index information, for example LTTng's packet index files (not directly supported by yactfr).
-
Decoding error reporting with precise message, offset in element sequence (bits) where this message applies within the element sequence, and other relevant properties.
-
Performance (
nop
iteration loop) is similar to Babeltrace 1.5.3’s (-o dummy
) with a release build.I don’t publish numbers: experiment by yourself to confirm or deny this claim.
Note that after a few years of working with all sorts of real life CTF traces, I never caught one which yactfr would refuse to decode, but I’m still honest:
-
Only builds on a Linux/Unix platform.
The non-portable part is the memory mapped file source which uses system functions such as
open()
,close()
,fstat()
,mmap()
, andmadvise()
. -
Decodes up to, and including, 64-bit signed/unsigned CTF fixed-length bit arrays and integers (no “big integers”).
-
Decodes up to, and including, 63-bit (effective, which means 72 total) signed/unsigned CTF variable-length integers.
-
Only decodes 32-bit and 64-bit CTF fixed-length floating point numbers.
-
Packet lengths must be multiples of 8 bits (I’m still not sure, but this could be enforced by the specification anyway), and be at least 8 bits.
-
TSDL (CTF 1.8): Doesn’t handle single-line comment continuation when the line ends with
\
:event { name = "hello"; // this is a comment \ id = 23; we're still in the comment started above here id = 42; ... };
-
TSDL (CTF 1.8): Doesn’t support relative dynamic-length array type lengths and variant type selectors in data type aliases (or named structure/variant types) which target structure member types outside this data type alias.
For example, this is not supported (TSDL):
fields := struct { int len; typealias struct { int sequence[len]; } := my_struct; struct { int len; my_struct a_struct; } field; };
This is also not supported (TSDL):
fields := struct { enum { ... } tag; variant my_variant <tag> { ... } a_variant; my_variant the_variant; };
The example above would work, however, if the selector location of the variant type would be absolute:
fields := struct { enum { ... } tag; variant my_variant <event.fields.tag> { ... } a_variant; my_variant the_variant; };
-
API and ABI backward compatibility is not guaranteed at this point.
Please rebuild your project if you change the yactfr version.
Make sure you have the build time requirements:
$ git clone https://github.com/eepp/yactfr $ cd yactfr $ mkdir build $ cd build $ cmake -DCMAKE_BUILD_TYPE=release .. $ make # make install
You can specify your favorite C and C++ compilers with the usual
CC
and CXX
environment variables when you run cmake
, and
additional options with CFLAGS
and CXXFLAGS
.
Specify -DOPT_BUILD_DOC=YES
to cmake
to enable the HTML API
documentation build (requires Doxygen). The documentation is available
in BUILD/doc/api/output/html
, where BUILD
is your build
directory.
Specify -DCMAKE_INSTALL_PREFIX=PREFIX
to cmake
to install yactfr
to the PREFIX
directory instead of the default /usr/local
directory.
For example, this is how I run cmake
for development:
$ CC=clang CXX=clang++ CXXFLAGS='-Wextra -Wall -pedantic' \ cmake .. -DCMAKE_BUILD_TYPE=debug -DOPT_BUILD_DOC=ON
For production, you should make a release build:
$ CC=clang CXX=clang++ \ cmake .. -DCMAKE_BUILD_TYPE=release -DOPT_BUILD_DOC=ON
Once you have built the project in the build directory, you can run the tests. You need Python 3 and pytest.
$ make check
If you’re in a hurry and you have the
pytest-xdist package, you can
parallelize the testing process. You need to set the YACTFR_BINARY_DIR
environment variable to the build directory (absolute path), for
example:
$ make tests $ YACTFR_BINARY_DIR=$(pwd) pytest -n logical
In the examples below, the program accepts two arguments:
-
The path to the metadata stream file of the trace (required).
-
The path to a data stream file of the same trace (required by some example).
Build the API documentation for a thorough reference.
Note
|
The examples are not necessarily optimal: their purpose is to show what the yactfr API looks like. |
#include <cassert>
#include <fstream>
#include <iostream>
#include <yactfr/yactfr.hpp>
int main(const int argc, const char * const argv[])
{
assert(argc == 3);
// open metadata stream file
std::ifstream metadataFile {argv[1], std::ios::binary};
// create metadata stream object
const auto metadataStream = yactfr::createMetadataStream(metadataFile);
// we have the metadata text at this point: safe to close the file
metadataFile.close();
// get a trace type from the metadata text
auto traceTypeMsUuidPair = yactfr::fromMetadataText(metadataStream->text());
// create a memory mapped file view factory to read the data stream file
yactfr::MemoryMappedFileViewFactory factory {argv[2]};
// create an element sequence from the trace type and data source factory
yactfr::ElementSequence seq {*traceTypeMsUuidPair.first, factory};
// print all the event record names
for (auto& elem : seq) {
if (elem.isEventRecordInfoElement()) {
auto& erInfo = elem.asEventRecordInfoElement();
// the name of an event record type is optional
if (erInfo.type()->name()) {
std::cout << *erInfo.type()->name() << std::endl;
}
}
}
}
sched_switch
event records and their offset.#include <cassert>
#include <fstream>
#include <iostream>
#include <yactfr/yactfr.hpp>
int main(const int argc, const char * const argv[])
{
assert(argc == 3);
// open metadata stream file
std::ifstream metadataFile {argv[1], std::ios::binary};
// create metadata stream object
const auto metadataStream = yactfr::createMetadataStream(metadataFile);
// we have the metadata text at this point: safe to close the file
metadataFile.close();
// get a trace type from the metadata text
auto traceTypeMsUuidPair = yactfr::fromMetadataText(metadataStream->text());
// create a memory mapped file view factory to read the data stream file
yactfr::MemoryMappedFileViewFactory factory {argv[2]};
// create an element sequence from the trace type and data source factory
yactfr::ElementSequence seq {*traceTypeMsUuidPair.first, factory};
// print all the fixed-length signed integers of the `sched_switch` ERs
const auto endIt = seq.end();
bool inSchedSwitchEventRecord = false;
for (auto it = seq.begin(); it != endIt; ++it) {
if (it->isEventRecordInfoElement()) {
auto& ertElem = it->asEventRecordInfoElement();
// the name of an event record type is optional
if (ertElem.type()->name() && *ertElem.type()->name() == "sched_switch") {
std::cout << "---" << std::endl;
inSchedSwitchEventRecord = true;
} else {
inSchedSwitchEventRecord = false;
}
continue;
}
if (inSchedSwitchEventRecord && it->isFixedLengthSignedIntegerElement()) {
std::cout << it.offset() << ": ";
auto& intElem = it->asFixedLengthSignedIntegerElement();
if (intElem.structureMemberType()) {
std::cout << intElem.structureMemberType()->displayName() << ": ";
}
std::cout << intElem.value() << std::endl;
}
}
}
In this example, we iterate all the elements of the data stream. The next example shows how to do the same faster.
#include <cassert>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <yactfr/yactfr.hpp>
int main(const int argc, const char * const argv[])
{
assert(argc == 3);
// open metadata stream file
std::ifstream metadataFile {argv[1], std::ios::binary};
// create metadata stream object
const auto metadataStream = yactfr::createMetadataStream(metadataFile);
// we have the metadata text at this point: safe to close the file
metadataFile.close();
// get a trace type from the metadata text
auto traceTypeMsUuidPair = yactfr::fromMetadataText(metadataStream->text());
// create a memory mapped file view factory to read the data stream file
yactfr::MemoryMappedFileViewFactory factory {argv[2]};
// create an element sequence from the trace type and data source factory
yactfr::ElementSequence seq {*traceTypeMsUuidPair.first, factory};
// print all the packet offsets and lengths (both in bits)
const auto endIt = seq.end();
yactfr::Index curPktOffset = 0;
unsigned long curPktNumber = 0;
for (auto it = seq.begin(); it != endIt; ++it) {
if (it->isPacketBeginningElement()) {
// save packet beginning offset
curPktOffset = it.offset();
} else if (it->isPacketEndElement()) {
// back to first level: end of packet
const auto pktLen = it.offset() - curPktOffset;
std::cout << "Packet #" << curPktNumber << ": " <<
"Offset: " << std::setw(10) << curPktOffset << " " <<
"Size: " << std::setw(10) << pktLen <<
std::endl;
++curPktNumber;
}
}
}
This is a faster version of the previous example.
Instead of decoding the whole packet to find its length, we use the expected packet total length property of the “packet info” element. This element is available after the decoder reads the packet context. Then, we make the iterator seek the next packet directly.
Note that this example doesn’t work if the packet context type doesn’t contain an expected packet total length fixed-length unsigned integer, in which case the data stream must contain a single packet. This could be detected by inspecting the metadata (trace type) and using the size of the whole data stream file as the unique packet total length.
#include <cassert>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <yactfr/yactfr.hpp>
int main(const int argc, const char * const argv[])
{
assert(argc == 3);
// open metadata stream file
std::ifstream metadataFile {argv[1], std::ios::binary};
// create metadata stream object
const auto metadataStream = yactfr::createMetadataStream(metadataFile);
// we have the metadata text at this point: safe to close the file
metadataFile.close();
// get a trace type from the metadata text
auto traceTypeMsUuidPair = yactfr::fromMetadataText(metadataStream->text());
// create a memory mapped file view factory to read the data stream file
yactfr::MemoryMappedFileViewFactory factory {argv[2]};
// create an element sequence from the trace type and data source factory
yactfr::ElementSequence seq {*traceTypeMsUuidPair.first, factory};
// print all the packet offsets and lengths (both in bits)
const auto endIt = seq.end();
auto it = seq.begin();
yactfr::Index curPktOffset = 0;
unsigned long curPktNumber = 0;
while (it != endIt) {
if (it->isPacketBeginningElement()) {
// save packet beginning offset
curPktOffset = it.offset();
} else if (it->isPacketInfoElement()) {
// this element contains the expected total length of the current packet
auto& elem = it->asPacketInfoElement();
assert(elem.expectedTotalLength());
std::cout << "Packet #" << curPktNumber << ": " <<
"Offset: " << std::setw(10) << curPktOffset << " " <<
"Size: " << std::setw(10) << *elem.expectedTotalLength() <<
std::endl;
++curPktNumber;
/*
* Seek the next packet without iterating the intermediate
* elements. The expected offset is in bytes, so we need to
* divide what we have by 8.
*/
it.seekPacket((curPktOffset + *elem.expectedTotalLength()) / 8);
continue;
}
++it;
}
}
Please contribute with GitHub pull requests and report bugs as GitHub issues.
See eepp.ca.
I’m eepp
on Libera.Chat and
OFTC.