Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a script for upgrading rules into the new dynamic format #1696

Conversation

yelhamer
Copy link
Collaborator

@yelhamer yelhamer commented Aug 9, 2023

At the time of the opening of this PR, the script provides the main logic for parsing the rules. This logic is composed of 3 recursive functions that each handle a specific task, and that call into each other. I settled on this approach after trying other things (such as the strategy mentioned in #1674) and failing/getting stuff too complicated.

There remains the following important tasks, but they should be easy to add for the most part:

  • merge the static and dynamic statements generated at the initial node (rn, it returns them separately)
  • add all supported dynamic features into the script
  • add the logic for saving the generated rules
  • add static to dynamic scope translation (right now it assumes the "thread" scope for all)
  • sort rules by dependency, and exclude match features that depend on out-of-context rules
  • optimize and clean the code

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

Copy link
Collaborator

@williballenthin williballenthin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you're ready for more review, would you please also provide some example output, such as the git diff for some upgraded rules? this will help us assess the logic of the script.

scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
Co-authored-by: Willi Ballenthin <willi.ballenthin@gmail.com>
Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree that some example output and results would be helpful

scripts/upgrage-legacy-rules.py Outdated Show resolved Hide resolved
@yelhamer
Copy link
Collaborator Author

yelhamer commented Aug 9, 2023

The script's logic is finished I believe, so feel free to review it. All that remains is to add support for filtering-out match rules that point to static-only rules when appropriate, as well as adding the final list of supported dynamic features (perhaps we want to add some characteristics?)

Here's the script's result on a made up complex rule:

original rule:

rule:
  meta:
    name: test rule
    namespace: test
    authors:
      - test@aol.com
    scope: function
    att&ck:
      - Persistence::Event Triggered Execution::Unix Shell Configuration Modification
        [T1546.004]
    examples:
      - 7351f8a40c5450557b24622417fc478d:0x407D11
  features:
    - and:
      - basic block:
        - or:
          - os: linux
          - match: host-interaction/file-system/write
          - or:
            - mnemonic: call
            - api: someapi_1
          - instruction:
            - mnemonic: write
            - string: somestring_1
      - instruction:
        - api: someapi_2
        - number: 1
      - string: somestring_2

modified rule:

rule:
  meta:
    name: test rule
    namespace: test
    authors:
      - test@aol.com
    scopes:
      static: function
      dynamic: process
    att&ck:
      - Persistence::Event Triggered Execution::Unix Shell Configuration Modification
        [T1546.004]
    examples:
      - 7351f8a40c5450557b24622417fc478d:0x407D11
  features:
    - and:
      - or:
        - basic block:
          - or:
            - os: linux
            - match: host-interaction/file-system/write
            - or:
              - mnemonic: call
              - api: someapi_1
            - instruction:
              - mnemonic: write
              - string: somestring_1
        - thread:
          - or:
            - os: linux
            - match: host-interaction/file-system/write
            - or:
              - api: someapi_1
      - or:
        - instruction:
          - api: someapi_2
          - number: 1
        - call:
          - and:
            - api: someapi_2
            - number: 1
      - string: somestring_2

@yelhamer yelhamer linked an issue Aug 9, 2023 that may be closed by this pull request
@williballenthin williballenthin mentioned this pull request Aug 10, 2023
18 tasks
scripts/upgrade-legacy-rules.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • offset features get removed
    2023-08-14_11-23-05_WinMergeU

edit: @yelhamer this seems to still be the case

  • same for property (property/read)

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • comments and quotes get removed
    2023-08-14_11-23-51_WinMergeU

  • also note the escaping sequences
    2023-08-14_11-36-07_WinMergeU

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • number exported as decimal
    2023-08-14_11-24-31_WinMergeU

Co-authored-by: Moritz <mr-tz@users.noreply.github.com>
@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • new line inserted
    2023-08-14_11-27-42_WinMergeU

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • mnemonic/number removed
    2023-08-14_11-29-32_WinMergeU

@williballenthin
Copy link
Collaborator

good finds @mr-tz

these might be tricky to address, since the yaml parser probably doesn't preserve things like the hex/decimal number format once the data is parsed.

worst case, we could do something like:

  1. if its easy to just update scope -> scopes, then do a str.replace(..., ...) on the raw text and emit it, otherwise,
  2. print to the user that manual intervention is required, possibly setting scopes.dynamic: TODO (which is invalid and will force us to make the fix)

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

See capa\scripts\capafmt.py for potential hints on how to handle the formatting.

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

  • may be a good opportunity to convert all rules to use Unix line endings (\n only)

@yelhamer
Copy link
Collaborator Author

@mr-tz regarding the removal of mnemonic and offset, isn't that desirable? since instructions aren't extracted by sandbox output (to my understanding)

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 14, 2023

For the static rule part mnemonic and offset should still be there.

@yelhamer
Copy link
Collaborator Author

Ah yeah I see. Will work on it...

@yelhamer
Copy link
Collaborator Author

TODO(yelhamer):
once this PR is merged, update #1709 in order to use the default ruleset.

@mr-tz
Copy link
Collaborator

mr-tz commented Aug 16, 2023

Currently, getting capa.rules.InvalidRule: invalid rule: subscope must have exactly one child statement on the rule with name: inject pe.

@yelhamer
Copy link
Collaborator Author

@mr-tz could you please take a look at the rules now for any other issues?

what remains is preserving the double backwards slashes (important), and the comments.

@yelhamer yelhamer requested a review from mr-tz August 16, 2023 10:51
@yelhamer yelhamer changed the base branch from dynamic-feature-extraction to master August 17, 2023 08:23
@yelhamer yelhamer changed the base branch from master to dynamic-feature-extraction August 17, 2023 08:24
Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments in mandiant/capa-rules#814
Nice progress, there's some code style issues to address here as well.

@williballenthin williballenthin added this to the v7.0 milestone Oct 16, 2023
@mr-tz
Copy link
Collaborator

mr-tz commented Nov 8, 2023

I think this can be closed?

@mr-tz mr-tz closed this Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrading legacy non-dynamic rules to the new syntax
3 participants