perf: short circuit logic nodes when appropriate #827

williballenthin · 2021-11-08T20:48:07Z

closes #824
review and merge #828 first

Adds logic to `and`, `or`, and `some` statements, and to substring and regex features, to detect when a node is minimally satisfied (or can no longer be satisfied) and to complete evaluation early. For example, if one child of an `and` statement fails, the `and` statement can never be satisfied, so the remaining children don't have to be evaluated.
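To make the idea concrete, here is a minimal sketch of short circuiting in `and`, `or`, and `some` nodes. It is not the code from this PR: the class names, the `evaluate(features, short_circuit)` signature, and the `counters` object are hypothetical stand-ins chosen for illustration.

```python
import collections
from dataclasses import dataclass
from typing import List, Set

# Hypothetical stand-in for a perf counter: counts node evaluations.
counters = collections.Counter()


@dataclass
class Feature:
    """Leaf node: satisfied when its name is present in the feature set."""
    name: str

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> bool:
        counters["evaluate"] += 1
        return self.name in features


@dataclass
class And:
    children: List

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> bool:
        counters["evaluate"] += 1
        success = True
        for child in self.children:
            if not child.evaluate(features, short_circuit):
                success = False
                if short_circuit:
                    break  # one failed child: the `and` can never match
        return success


@dataclass
class Or:
    children: List

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> bool:
        counters["evaluate"] += 1
        success = False
        for child in self.children:
            if child.evaluate(features, short_circuit):
                success = True
                if short_circuit:
                    break  # one satisfied child is enough
        return success


@dataclass
class Some:
    count: int
    children: List

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> bool:
        counters["evaluate"] += 1
        satisfied = 0
        for child in self.children:
            if child.evaluate(features, short_circuit):
                satisfied += 1
            if short_circuit and satisfied >= self.count:
                break  # threshold reached: stop early
        return satisfied >= self.count


def count_evaluations(node, features: Set[str], short_circuit: bool):
    """Evaluate `node` and return (result, number of node evaluations)."""
    counters.clear()
    result = node.evaluate(features, short_circuit)
    return result, counters["evaluate"]


rule = And([Feature("a"), Feature("b"), Or([Feature("c"), Feature("d")])])
print(count_evaluations(rule, {"b", "c", "d"}, short_circuit=True))   # (False, 2)
print(count_evaluations(rule, {"b", "c", "d"}, short_circuit=False))  # (False, 6)
```

In this toy rule the first child of the `and` fails, so the short circuiting pass touches two nodes while the thorough pass touches all six. Because evaluation stops at the first decisive child, the order of children matters.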

However, in some cases "thorough" evaluation may still be desirable, such as with `or` statements in which more than one child may be satisfied. In a verbose output mode, users may want to see all the evidence related to a rule match, not just the minimal set of evidence.

Therefore, this PR includes logic to invoke the (fast) short circuiting mode first, and only if there's a match (uncommon), go back and collect the thorough results for display.
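The two-pass ("hybrid") approach described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `Result`, `Feature`, `Or`, `match`, and `evidence` are hypothetical names, and only an `or` node is modeled.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Result:
    """Outcome of evaluating one node, plus the child results that were collected."""
    success: bool
    node: str
    children: List["Result"] = field(default_factory=list)


@dataclass
class Feature:
    name: str

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> Result:
        return Result(self.name in features, self.name)


@dataclass
class Or:
    children: List[Feature]

    def evaluate(self, features: Set[str], short_circuit: bool = True) -> Result:
        results = []
        for child in self.children:
            result = child.evaluate(features, short_circuit)
            results.append(result)
            if short_circuit and result.success:
                break  # minimally satisfied: skip the remaining children
        return Result(any(r.success for r in results), "or", results)


def match(rule, features: Set[str]) -> Result:
    """Hybrid matching: fast pass first, thorough pass only on a match."""
    fast = rule.evaluate(features, short_circuit=True)
    if not fast.success:
        return fast  # common case: no match, so nothing to display
    # Uncommon case: re-evaluate without short circuiting to gather all evidence.
    return rule.evaluate(features, short_circuit=False)


def evidence(result: Result) -> List[str]:
    """Names of the satisfied children recorded in a result."""
    return [child.node for child in result.children if child.success]


rule = Or([Feature("a"), Feature("b")])
print(evidence(rule.evaluate({"a", "b"}, short_circuit=True)))  # ['a']
print(evidence(match(rule, {"a", "b"})))                        # ['a', 'b']
```

The design trades a second evaluation on the rare matching rule for cheaper evaluation of the many rules that do not match.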

| label | count(evaluations) | avg(time) | min(time) | max(time) |
|---|---|---|---|---|
| `6c8d246` base | 108,121 | 0.37s | 0.31s | 0.41s |
| `ad119d7` short circuiting (always) | 69,211 | 0.20s | 0.16s | 0.28s |
| `3e74da9` short circuiting (hybrid) | 69,401 | 0.25s | 0.24s | 0.29s |

(via: PMA01-01, 30 iterations)

In the above table, we see that the PR here matches about 30% faster in this test case (0.25s vs. 0.37s on average). When always short circuiting (at the expense of thorough results), we can go a bit faster still (0.20s on average).

As expected, there are slightly more node evaluations in the "hybrid" mode than when always short circuiting (69,401 vs. 69,211), as the engine goes back and collects thorough results once a match has been found.

Checklist

No documentation update needed


github-actions

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

CHANGELOG updated or no update needed, thanks! 😄


williballenthin added 22 commits November 4, 2021 12:20

main: add timing ctx manager

ed3bd4e

main: add coarse timing measurements

f982360

scripts: add utilities for collecting profile traces

3d068fe

add perf counters in module capa.perf

86cab26

main: perf: human format the numbers

6524449

perf: render: show evaluate.feature counter

3a12472

gitignore

702d00d

engine: statement: document that the order of children is important

623bac1

rules: optimize by cost

8d9f418

engine: or: short circuit

18ba986

engine: some: short circuit

a329147

rules: optimizer: use recursive cost of statements

e63f072

rule: optimization: add some documentation

d573b83

common: move Result to capa.common from capa.engine

d86c3f4

fixes circular import error in capa.features.freeze

pep8

35fa50d

perf: add reset routine

a995b53

scripts: add py script for profiling time

480df32

common: move Result to capa.common from capa.engine

0629c58

fixes circular import error in capa.features.freeze

perf: add reset routine

5770d0c

scripts: add py script for profiling time

a35be4a

engine: move optimizer into its own module

e3496b0

pep8

70f0075

williballenthin added the enhancement label Nov 8, 2021

github-actions bot previously requested changes Nov 8, 2021


williballenthin added 4 commits November 8, 2021 13:48

remove old improt

96813c3

engine: some: correctly count satisfied children

d987719

changelog

1a84051

tests: add test demonstrating short circuiting

9fa9c6a

mypy

b621205

williballenthin added 7 commits November 8, 2021 14:24

scripts: remove old profiling scripts

f598acb

Merge branch 'master' into profiling

26b7a0b

fix bad merge

6c8d246

Merge branch 'profiling' into perf/short-circuit

ad119d7

engine: make short circuiting configurable

3e74da9

changelog

334425a

Merge branch 'profiling' into perf/short-circuit

d425bb3

williballenthin marked this pull request as ready for review November 8, 2021 22:17

williballenthin requested review from mr-tz, Ana06 and mike-hunhoff November 8, 2021 22:17

williballenthin added 7 commits November 8, 2021 15:20

perf: document that counters is unstable

6f6831f

Merge branch 'profiling' into perf/short-circuit

9fbbda1

main: remove perf messages

0b517c5

main: remove perf messages

2abebfb

Merge branch 'profiling' into perf/short-circuit

3c4f4d3

main: remove perf debug msgs

18c30e4

Merge branch 'profiling' into perf/short-circuit

a6e2cfc

This was referenced Nov 8, 2021

perf: add query optimizer #829

Merged

perf: don't try to match rules that will never match #830

Merged

mr-tz approved these changes Nov 9, 2021


williballenthin and others added 3 commits November 9, 2021 10:48

Update capa/engine.py

a68812b

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>

Update capa/engine.py

51af2d4

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>

Update capa/engine.py

f427c5e

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>

williballenthin merged commit 9350ee9 into master Nov 9, 2021

williballenthin deleted the perf/short-circuit branch November 9, 2021 23:10
