Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework NFA matching and try with resources filtering #1020

Merged
merged 10 commits into from
May 18, 2022

Conversation

hcoles
Copy link
Owner

@hcoles hcoles commented May 17, 2022

The slot storage system in the NFA matching library was a bit of hack,
storing any matched values even if that branch do not match. Working
around this complicated using the matching library as the behaviour was
not intuitive.

This change fixes the implementation so updates to the context are tied
to the state that made them. The behaviour is now 'correct' and much
easier to understand.

Unfortunately the analysis is now much slower to execute (8 seconds vs < 1 seconds on one example project) if the queries are left unmodified. To ensure that this will scale as the size of the input project grows, mitigation has been put in place. Previously the filters were run in the prescan (because they were so cheap). Most filters will now not be run, unless they have the new type of PRESCAN_FILTER. Some queries have also been rewritten.

The try with resources filter relied on the previous broken behaviour and has been re-written based on the analysis used in jacoco. This should now detect cases previously missed.

Henry Coles added 9 commits May 10, 2022 09:26
The slot storage system in the NFA matching library was a bit of hack,
storing any matched values even if that branch do not match. Working
around this complicated using the matching library as the behaviour was
not intuitive.

This change fixes the implementation so updates to the context are tied
to the state that made them. The behaviour is now 'correct' and much
easier to understand.

Unfortunately this has two consequences.

1. It is much slower. A 1 second analysis on an example project now
   takes 8 seconds.

2. The try with resources filter is broken as it relied on peculiar
   behaviour of the original implementation.

Some mitigation has been put in place for the first point. Previously
the filters were run in the prescan (because they were so cheap). Most
filters will now not be run, unless they have the new type of
PRESCAN_FILTER.

Before this can be merged the try with resources filter must be fixed
and further benchmarks taken with large projects to understand the
impact.
Rework filter based on the analysis done by the jacoco team. Also puts
mitigation in place against the increased analysis cost of properly
branching the filter context.
A large amount of static analysis time is spent computing hashcodes for
maps. A significant performance improvement can be made by creating
specialised classes for the common cases of contexts with 0 or 1 values.
@hcoles hcoles force-pushed the feature/extract_state_in_sequence_match branch from b209fc0 to dfc8d5d Compare May 18, 2022 10:45
@hcoles hcoles merged commit 98f4c55 into master May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant