Add Natvis visualizations for some Regex types #849

Closed
wants to merge 5 commits into from
11 changes: 11 additions & 0 deletions .github/workflows/ci.yml
@@ -27,6 +27,7 @@ jobs:
- nightly
- macos
- win-msvc
- win-msvc (nightly)
- win-gnu
include:
- build: pinned
@@ -55,6 +56,9 @@ jobs:
- build: win-msvc
os: windows-latest
rust: stable
- build: win-msvc (nightly)
os: windows-latest
rust: nightly
- build: win-gnu
os: windows-latest
rust: stable-x86_64-gnu
@@ -154,6 +158,13 @@ jobs:
run: |
cargo test --test default --no-default-features --features 'std pattern unicode-perl'

# The #[debugger_visualizer] attribute is currently gated behind an unstable feature flag.
# In order to test the visualizers for the regex crate, they have to be tested on a nightly build.
- if: matrix.build == 'win-msvc (nightly)'
name: Run tests with debugger_visualizer feature
run: |
cargo test --test visualizers --features 'debugger_visualizer' -- --test-threads=1

rustfmt:
name: rustfmt
runs-on: ubuntu-18.04
12 changes: 12 additions & 0 deletions Cargo.toml
@@ -104,6 +104,9 @@ unstable = ["pattern"]
# by default if the unstable feature is enabled.
pattern = []

# Enable to use the #[debugger_visualizer] attribute.
debugger_visualizer = []

# For very fast prefix literal matching.
[dependencies.aho-corasick]
version = "0.7.18"
@@ -132,6 +135,9 @@ rand = { version = "0.8.3", default-features = false, features = ["getrandom", "
# See: https://github.com/rust-lang/regex/issues/684
# See: https://github.com/rust-lang/regex/issues/685
# doc-comment = "0.3"
# To test debugger visualizers defined for the regex crate such as regex.natvis
debugger_test = "0.1.0"
debugger_test_parser = "0.1.0"

# Run the test suite on the default behavior of Regex::new.
# This includes a mish mash of NFAs and DFAs, which are chosen automatically
@@ -184,6 +190,12 @@ name = "backtrack-bytes"
path = "tests/test_crates_regex.rs"
name = "crates-regex"

[[test]]
path = "tests/test_visualizers.rs"
name = "visualizers"
required-features = ["debugger_visualizer"]
test = false

[profile.release]
debug = true

7 changes: 7 additions & 0 deletions HACKING.md
@@ -271,6 +271,13 @@ invoking `cargo test`. Note that this variable is inspected at compile
time, so if the tests don't seem to be running, you may need to run
`cargo clean`.

This crate also supports defining and testing custom debugger visualizers.
The `#[debugger_visualizer]` attribute is currently unstable and behind a
`debugger_visualizer` feature gate. To test these visualizers, enable the
`debugger_visualizer` feature for this crate and run the `tests/test_visualizers.rs`
tests using the nightly toolchain. For more information on debugger visualizers,
see `debug_metadata/README.md`.

## Benchmarking

The benchmarking in this crate is made up of many micro-benchmarks. Currently,
111 changes: 111 additions & 0 deletions debug_metadata/README.md
@@ -0,0 +1,111 @@
## Debugger Visualizers

Many languages and debuggers enable developers to control how a type is
displayed in a debugger. These are called "debugger visualizations" or "debugger
views".

The Windows debuggers (WinDbg/CDB) support defining custom debugger visualizations using
the `Natvis` framework. To use Natvis, developers write XML documents with the `.natvis`
extension that follow the Natvis schema and describe how types should be displayed in the debugger.
(See: https://docs.microsoft.com/en-us/visualstudio/debugger/create-custom-views-of-native-objects?view=vs-2019)
The Natvis files provide patterns that match type names and, for matching types, a description
of how to display those types.

The Natvis schema can be found either online (See: https://code.visualstudio.com/docs/cpp/natvis#_schema)
or locally at `<VS Installation Folder>\Xml\Schemas\1033\natvis.xsd`.

The GNU debugger (GDB) supports defining custom debugger views using pretty printers.
Pretty printers are written as Python scripts that describe how a type should be displayed
when loaded up in GDB/LLDB. (See: https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html#Pretty-Printing)
The pretty printers provide patterns that match type names and, for matching
types, describe how to display those types. (For writing a pretty printer, see: https://sourceware.org/gdb/onlinedocs/gdb/Writing-a-Pretty_002dPrinter.html#Writing-a-Pretty_002dPrinter).

### Embedding Visualizers

Through the use of the currently unstable `#[debugger_visualizer]` attribute, the `regex`
crate can embed debugger visualizers into the crate metadata.

Currently, the two supported types of visualizers are Natvis files and pretty printers.

For Natvis files, when linking an executable with a crate that includes Natvis files,
the MSVC linker will embed the contents of all Natvis files into the generated `PDB`.

For pretty printers, the compiler will encode the contents of the pretty printer
in the `.debug_gdb_scripts` section of the generated `ELF` file.

### Testing Visualizers

The `regex` crate supports testing debugger visualizers defined for this crate. The entry point for
these tests is `tests/test_visualizers.rs`. These tests are defined using the `debugger_test` and
`debugger_test_parser` crates. The `debugger_test` crate is a proc macro crate which defines a
single proc macro attribute, `#[debugger_test]`. For more detailed information about this crate,
see https://crates.io/crates/debugger_test. The CI pipeline for the `regex` crate has been updated
to run the debugger visualizer tests to ensure debugger visualizers do not become broken/stale.

The `#[debugger_test]` proc macro attribute may only be used on test functions and will run the
function under the debugger specified by the `debugger` meta item.

This proc macro attribute has three required meta items:

1. The first required meta item, `debugger`, takes a string value which specifies the debugger to launch.
2. The second required meta item, `commands`, takes a string containing a newline (`\n`) separated list of
debugger commands to run.
3. The third required meta item, `expected_statements`, takes a string containing a newline (`\n`) separated
list of statements that must exist in the debugger output. Pattern matching through regular expressions is
also supported by using the `pattern:` prefix for each expected statement.

#### Example:

```rust
#[debugger_test(
debugger = "cdb",
commands = "command1\ncommand2\ncommand3",
expected_statements = "statement1\nstatement2\nstatement3")]
fn test() {

}
```

Using a multiline string is also supported, with a single debugger command/expected statement per line:

```rust
#[debugger_test(
debugger = "cdb",
commands = "
command1
command2
command3",
expected_statements = "
statement1
pattern:statement[0-9]+
statement3")]
fn test() {

}
```

In the example above, the second expected statement uses pattern matching through a regular expression
by using the `pattern:` prefix.
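
To make the semantics of `expected_statements` more concrete, the following is a minimal sketch of how such a string could be split into literal and pattern entries. It is not the implementation used by `debugger_test_parser`: the `Expected` enum and the `parse_expected` and `missing_literals` functions are hypothetical names, and trimming whitespace and skipping blank lines are assumptions made for illustration.

```rust
/// One entry of an `expected_statements` string.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Expected {
    /// Must appear verbatim in the debugger output.
    Literal(String),
    /// Must match as a regular expression (the entry had a `pattern:` prefix).
    Pattern(String),
}

/// Split a newline-separated string into entries.
/// Assumption: surrounding whitespace is trimmed and blank lines are ignored.
fn parse_expected(statements: &str) -> Vec<Expected> {
    statements
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| match line.strip_prefix("pattern:") {
            Some(re) => Expected::Pattern(re.to_string()),
            None => Expected::Literal(line.to_string()),
        })
        .collect()
}

/// Return the literal statements that do not occur in `output`.
/// Pattern entries are skipped here; a real checker would hand them
/// to a regex engine instead.
fn missing_literals<'a>(expected: &'a [Expected], output: &str) -> Vec<&'a str> {
    expected
        .iter()
        .filter_map(|entry| match entry {
            Expected::Literal(s) if !output.contains(s.as_str()) => Some(s.as_str()),
            _ => None,
        })
        .collect()
}

fn main() {
    let expected = parse_expected("\nstatement1\npattern:statement[0-9]+\nstatement3");
    let output = "statement1\nstatement2";
    println!("missing: {:?}", missing_literals(&expected, output));
}
```

Keeping literals and patterns as separate variants means the plain substring check never has to guess whether a statement contains regex metacharacters.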

#### Testing Locally

Currently, only Natvis visualizations have been defined for the `regex` crate, in `debug_metadata/regex.natvis`,
which means the `tests/test_visualizers.rs` tests need to be run on Windows using one of the `*-pc-windows-msvc`
targets. To run these tests locally, first ensure the debugging tools for Windows are installed, or install them by
following the steps at [Debugging Tools for Windows](https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/).
Once the debugging tools have been installed, the tests can be run in the same manner as they are in the CI
pipeline.

#### Note

The debugger visualizer tests in `tests/test_visualizers.rs` need to be run sequentially and not in
parallel. This can be achieved by passing the flag `--test-threads=1` to the test binary, that is, after
the `--` separator in the `cargo test` invocation. This is due to how the debugger tests are run. Each test
marked with the `#[debugger_test]` attribute launches a debugger and attaches it to the current test process.
If tests run in parallel, a test may try to attach a debugger to a process that already has a debugger
attached, causing the test to fail.

For example:

```
cargo test --test visualizers --features debugger_visualizer -- --test-threads=1
```
105 changes: 105 additions & 0 deletions debug_metadata/regex.natvis
@@ -0,0 +1,105 @@
<?xml version="1.0" encoding="utf-8"?>
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
<Type Name="regex::re_builder::unicode::RegexBuilder">
<DisplayString>{{ text={__0.pats[0]} }}</DisplayString>
<Expand>
<ExpandedItem>__0</ExpandedItem>
</Expand>
</Type>

<Type Name="regex::re_bytes::Captures">
<Intrinsic Name="tag" Expression="locs.__0.buf.ptr.pointer.pointer[i].tag">
<Parameter Name="i" Type="int" />
</Intrinsic>
<Intrinsic Name="location" Expression="locs.__0.buf.ptr.pointer.pointer[i].variant1.value.__0">
<Parameter Name="i" Type="int" />
</Intrinsic>
<Intrinsic Name="match_length" Expression="location(end)-location(start)">
<Parameter Name="start" Type="int" />
<Parameter Name="end" Type="int" />
</Intrinsic>
<DisplayString>{{ named_groups={named_groups.ptr.pointer->data.base.table.table.items} }}</DisplayString>
<Expand>
<Item Name="[text]">text</Item>
<Item Name="[named_groups]">named_groups</Item>
<CustomListItems>
<Variable Name="i" InitialValue="0" />
<Variable Name="index" InitialValue="0" />
<Variable Name="len" InitialValue="locs.__0.len" />
<Loop>
<Break Condition="i &gt;= len || tag(i) == 0" />
<Item Name="{index}">(char*)text.data_ptr+location(i),[location(i+1)-location(i)]s8</Item>
<Exec>i+=2</Exec>
<Exec>index++</Exec>
</Loop>
</CustomListItems>
</Expand>
</Type>

<Type Name="regex::re_bytes::Match">
<DisplayString>{text.data_ptr+start,[end-start]s8}</DisplayString>
<Expand>
<Item Name="[text]">text</Item>
<Synthetic Name="[match_text]">
<DisplayString>{(char*)text.data_ptr+start,[end-start]s8}</DisplayString>
</Synthetic>
<Item Name="[start]">start,d</Item>
<Item Name="[end]">end,d</Item>
</Expand>
</Type>

<Type Name="regex::re_bytes::Regex">
<DisplayString>{{ text={__0.ro.ptr.pointer->data.res[0]} }}</DisplayString>
<Expand>
<ExpandedItem>__0.ro</ExpandedItem>
</Expand>
</Type>

<Type Name="regex::re_unicode::Captures">
<Intrinsic Name="tag" Expression="locs.__0.buf.ptr.pointer.pointer[i].tag">
<Parameter Name="i" Type="int" />
</Intrinsic>
<Intrinsic Name="location" Expression="locs.__0.buf.ptr.pointer.pointer[i].variant1.value.__0">
<Parameter Name="i" Type="int" />
</Intrinsic>
<Intrinsic Name="match_length" Expression="location(end)-location(start)">
<Parameter Name="start" Type="int" />
<Parameter Name="end" Type="int" />
</Intrinsic>
<DisplayString>{{ named_groups={named_groups.ptr.pointer->data.base.table.table.items} }}</DisplayString>
<Expand>
<Item Name="[text]">text</Item>
<Item Name="[named_groups]">named_groups</Item>
<CustomListItems>
<Variable Name="i" InitialValue="0" />
<Variable Name="index" InitialValue="0" />
<Variable Name="len" InitialValue="locs.__0.len" />
<Loop>
<Break Condition="i &gt;= len || tag(i) == 0" />
<Item Name="{index}">(char*)text.data_ptr+location(i),[location(i+1)-location(i)]s8</Item>
<Exec>i+=2</Exec>
<Exec>index++</Exec>
</Loop>
</CustomListItems>
</Expand>
</Type>

<Type Name="regex::re_unicode::Match">
<DisplayString>{text.data_ptr+start,[end-start]s8}</DisplayString>
<Expand>
<Item Name="[text]">text</Item>
<Synthetic Name="[match_text]">
<DisplayString>{(char*)text.data_ptr+start,[end-start]s8}</DisplayString>
</Synthetic>
<Item Name="[start]">start,d</Item>
<Item Name="[end]">end,d</Item>
</Expand>
</Type>

<Type Name="regex::re_unicode::Regex">
<DisplayString>{{ text={__0.ro.ptr.pointer->data.res[0]} }}</DisplayString>
<Expand>
<ExpandedItem>__0.ro</ExpandedItem>
</Expand>
</Type>
</AutoVisualizer>
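
As a reading aid for the `CustomListItems` loop in the two `Captures` visualizers above, here is a rough Rust sketch of the same walk over the capture slots. It is not code from this PR: `capture_items` is a hypothetical helper, and it assumes the slots form a flat list of `Option<usize>` byte offsets stored as (start, end) pairs, which is what the `locs.__0`, `tag`, and `location` expressions suggest. Unlike the Natvis loop, which only tests `tag(i)`, the sketch also checks the end slot and the slice bounds so that it cannot panic.

```rust
/// Walk capture slots the way the Natvis `CustomListItems` loop does:
/// slots come in (start, end) pairs and the walk stops at the first unset slot.
fn capture_items(text: &[u8], locs: &[Option<usize>]) -> Vec<String> {
    let mut items = Vec::new();
    let mut i = 0; // slot index, advanced by 2 like `<Exec>i+=2</Exec>`
    while i + 1 < locs.len() {
        // `tag(i) == 0` in the Natvis corresponds to `None` here.
        let (start, end) = match (locs[i], locs[i + 1]) {
            (Some(start), Some(end)) => (start, end),
            _ => break,
        };
        // Rough equivalent of `text.data_ptr + location(i), [len]s8`.
        match text.get(start..end) {
            Some(bytes) => items.push(String::from_utf8_lossy(bytes).into_owned()),
            None => break, // out-of-range or inverted offsets
        }
        i += 2;
    }
    items
}

fn main() {
    let text = b"2010-03-14";
    // Slot pairs for the whole match followed by three groups.
    let locs: Vec<Option<usize>> = vec![
        Some(0), Some(10), Some(0), Some(4), Some(5), Some(7), Some(8), Some(10),
    ];
    for (index, item) in capture_items(text, &locs).iter().enumerate() {
        println!("[{}] {}", index, item);
    }
}
```

One consequence of the break condition, visible in the sketch, is that the listing ends at the first group whose start slot is unset, so groups after an unmatched optional group would not be shown.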
5 changes: 5 additions & 0 deletions src/lib.rs
@@ -610,6 +610,11 @@ another matching engine with fixed memory requirements.

#![deny(missing_docs)]
#![cfg_attr(feature = "pattern", feature(pattern))]
#![cfg_attr(feature = "debugger_visualizer", feature(debugger_visualizer))]
#![cfg_attr(
feature = "debugger_visualizer",
debugger_visualizer(natvis_file = "../debug_metadata/regex.natvis")
)]
#![warn(missing_debug_implementations)]

#[cfg(not(feature = "std"))]