ignore crate: Fix reference cycle for compiled matchers #2692

Merged 1 commit on Jan 6, 2024

@fe9lix (Contributor) commented Dec 20, 2023

This attempts to fix the unbounded memory growth in the ignore crate when ignore flags are enabled; see #2690.

I don't have a full understanding of the ignore crate codebase, but it looks like there is a reference cycle caused by the compiled matchers: the compiled HashMap holds a reference to Ignore, and Ignore holds a reference to the HashMap. Using weak references fixes issue #2690 in my test project. I also confirmed this by profiling the code before and after the change; see the attached screenshots.

[Two attached screenshots: profiler output before and after the change]

@BurntSushi (Owner) left a comment

Very nice, thank you.

It looks like there is a reference cycle caused by the compiled
matchers (compiled HashMap holds ref to Ignore and Ignore holds ref
to HashMap). Using weak refs fixes issue BurntSushi#2690 in my test project.
Also confirmed via before and after when profiling the code, see the
attached screenshots in BurntSushi#2692.

Fixes BurntSushi#2690
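
For readers unfamiliar with this kind of leak, here is a minimal sketch of the pattern described above. It is not the ignore crate's actual code: `StrongMatcher`, `WeakMatcher`, the `cached` helper, and the `"dir"` key are hypothetical stand-ins. It only assumes the general shape from the description, where a shared `HashMap` cache and the values stored in it point at each other. With strong `Arc` handles in the map, the value is never freed; with `Weak` handles, it is freed as soon as the last outside handle goes away.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

// Hypothetical cache types; the real crate's types and keys may differ.
type StrongCache = Arc<RwLock<HashMap<String, Arc<StrongMatcher>>>>;
type WeakCache = Arc<RwLock<HashMap<String, Weak<WeakMatcher>>>>;

/// A matcher that holds the cache, while the cache holds a strong handle back.
#[allow(dead_code)]
struct StrongMatcher {
    cache: StrongCache,
}

/// The same shape, but the cache only holds weak handles to matchers.
struct WeakMatcher {
    cache: WeakCache,
}

impl WeakMatcher {
    /// Look up a cached matcher; yields `None` if it has already been freed.
    fn cached(&self, key: &str) -> Option<Arc<WeakMatcher>> {
        self.cache.read().unwrap().get(key).and_then(Weak::upgrade)
    }
}

/// Returns true if the matcher outlives its last outside handle (a leak).
fn leaks_with_strong_cache() -> bool {
    let cache: StrongCache = Arc::new(RwLock::new(HashMap::new()));
    let m = Arc::new(StrongMatcher { cache: Arc::clone(&cache) });
    cache.write().unwrap().insert("dir".to_string(), Arc::clone(&m));
    let probe = Arc::downgrade(&m);
    drop(m);
    drop(cache);
    // The matcher keeps the cache alive and the cache keeps the matcher
    // alive, so neither strong count can reach zero.
    probe.upgrade().is_some()
}

/// Same scenario with weak handles in the cache.
fn leaks_with_weak_cache() -> bool {
    let cache: WeakCache = Arc::new(RwLock::new(HashMap::new()));
    let m = Arc::new(WeakMatcher { cache: Arc::clone(&cache) });
    cache.write().unwrap().insert("dir".to_string(), Arc::downgrade(&m));
    // While an outside handle exists, lookups through the cache still work.
    assert!(m.cached("dir").is_some());
    let probe = Arc::downgrade(&m);
    drop(m);
    // The cache held no strong handle, so the matcher has been freed.
    probe.upgrade().is_some()
}

fn main() {
    println!("strong cache leaks: {}", leaks_with_strong_cache());
    println!("weak cache leaks:   {}", leaks_with_weak_cache());
}
```

The trade-off in this sketch is that a weak entry can go stale, so every lookup has to handle the case where `upgrade` returns `None`.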
@BurntSushi BurntSushi merged commit b9c7749 into BurntSushi:master Jan 6, 2024
17 checks passed
klensy added a commit to klensy/rust that referenced this pull request Jan 16, 2024
 $ cargo update  -p ignore --precise=0.4.22
    Updating crates.io index
    Updating aho-corasick v1.0.2 -> v1.1.2
    Updating bstr v1.5.0 -> v1.9.0
    Updating globset v0.4.10 -> v0.4.14
    Updating ignore v0.4.20 -> v0.4.22
    Updating log v0.4.19 -> v0.4.20
    Updating memchr v2.5.0 -> v2.7.1
      Adding regex-automata v0.4.3
    Updating walkdir v2.3.3 -> v2.4.0

One notable change is BurntSushi/ripgrep#2692, which reduces memory usage from

==47796== Total:     821,467,407 bytes in 3,955,595 blocks
==47796== At t-gmax: 10,976,209 bytes in 66,100 blocks
==47796== At t-end:  2,944,016 bytes in 12,490 blocks
==47796== Reads:     4,788,959,023 bytes
==47796== Writes:    975,493,639 bytes

to

==66633== Total:     791,565,538 bytes in 3,503,144 blocks
==66633== At t-gmax: 10,914,511 bytes in 65,997 blocks
==66633== At t-end:  395,531 bytes in 941 blocks
==66633== Reads:     4,249,388,949 bytes
==66633== Writes:    814,119,580 bytes

bump regex to dedupe one regex-syntax

$ cargo update -p regex
    Updating crates.io index
    Updating regex v1.8.4 -> v1.10.2
    Removing regex-syntax v0.7.2
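
To put the before/after figures quoted in the commit message in perspective, the short sketch below computes the relative change for a few of them. The numbers are copied from the summaries as quoted; the `percent_reduction` helper is introduced here for illustration only. The summary lines look like Valgrind DHAT output, where "At t-end" would be the heap still allocated at program exit, but that reading is an assumption since the commit message does not name the tool.

```rust
/// Percentage by which `after` is lower than `before`.
fn percent_reduction(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    // (label, before, after), copied from the quoted summaries.
    let rows = [
        ("Total bytes", 821_467_407u64, 791_565_538u64),
        ("t-gmax bytes", 10_976_209, 10_914_511),
        ("t-end bytes", 2_944_016, 395_531),
        ("t-end blocks", 12_490, 941),
    ];
    for (label, before, after) in rows {
        println!(
            "{label:<13} {before:>11} -> {after:>11}  ({:.1}% lower)",
            percent_reduction(before, after)
        );
    }
}
```

By this arithmetic the peak (`t-gmax`) barely moves, while the bytes still live at exit drop by roughly 87% and the live block count by roughly 92%. That would be consistent with memory that was never released because of the cycle, though the commit message itself does not draw that conclusion.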