FIND will not recreate variable but append #207

Open
subbyte opened this issue Apr 19, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@subbyte
Member

subbyte commented Apr 19, 2022

Describe the bug
When the FIND command is executed repeatedly with the same return variable, the variable is not recreated; the new results are appended to the existing ones.

Details of the bug

  • What is the hunt flow/script you are executing?
nt = get network-traffic
     from file:///tmp/d.json
     where [network-traffic:src_port > 0]
p = FIND process CREATED nt
  • What is the command that failed? The FIND command: each time it is executed, p grows larger.

To Reproduce
Steps to reproduce the behavior:

  1. data source: https://github.com/opencybersecurityalliance/kestrel-lang/blob/develop/tests/doctored-1k.json
  2. run hunt flow as above

Expected behavior
p should not change even if the FIND command is executed multiple times.

Environment (please complete the following information):

  • OS: Fedora 34
  • Python version: Python 3.9.12
@subbyte subbyte added the bug Something isn't working label Apr 19, 2022
@subbyte
Member Author

subbyte commented Apr 19, 2022

This is caused by prefetch + the non-deterministic process id.

Prefetch queries the original data source, e.g., the stix-bundle in this case, so the prefetch query loads the bundle again. Without a way to generate process ids deterministically, all records are reloaded as different records in the firepit process table and the __reflist table. This doubles the size of the query results.
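
To make the effect concrete, here is a minimal sketch of the behavior described above. It is not firepit code: `load_bundle`, `process_table`, and the two sample records are hypothetical stand-ins, and the random id is only an assumption about how a non-deterministic process id behaves on reload.

```python
import uuid


def load_bundle(table, records):
    """Hypothetical loader: store each record under a fresh random id."""
    for record in records:
        table[str(uuid.uuid4())] = record


# Two made-up process records standing in for the stix-bundle contents.
bundle = [
    {"pid": 100, "name": "cmd.exe"},
    {"pid": 200, "name": "powershell.exe"},
]
process_table = {}

load_bundle(process_table, bundle)  # first load of the bundle
size_after_first = len(process_table)

load_bundle(process_table, bundle)  # prefetch loads the same bundle again
size_after_second = len(process_table)

print(size_after_first, size_after_second)
```

Because the ids differ on every load, the second pass cannot be matched against the first, so the same two records are stored again as new rows.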

Will postpone the fix until we have better process id generation, which is not easy. For a static stix bundle it is feasible, but for stix-shifter created results, the observation id is different for each query even though the results could point to the same process. We need better support from stix-shifter to handle it.
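
For the static bundle case, one possible direction is to derive the id from the record content, so that reloading the same record yields the same id. The sketch below is an assumption, not the planned fix: `deterministic_id`, `load_bundle_idempotent`, the namespace, and the sample records are all hypothetical, and it uses a name-based UUID (version 5) over the canonical JSON of the record.

```python
import json
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would do.
SKETCH_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def deterministic_id(record):
    """Derive an id from the record content (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True)
    return "process--" + str(uuid.uuid5(SKETCH_NAMESPACE, canonical))


def load_bundle_idempotent(table, records):
    """Hypothetical loader: reloading an identical record overwrites its row."""
    for record in records:
        table[deterministic_id(record)] = record


# Two made-up process records standing in for the stix-bundle contents.
bundle = [
    {"pid": 100, "name": "cmd.exe"},
    {"pid": 200, "name": "powershell.exe"},
]
process_table = {}

load_bundle_idempotent(process_table, bundle)  # first load
load_bundle_idempotent(process_table, bundle)  # reload by prefetch
print(len(process_table))
```

Note the trade-off: two genuinely distinct processes with identical attributes would collapse into one row, so the choice of attributes matters. This also does not help with stix-shifter results unless the attributes used are stable across queries.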

@pcoccoli
Collaborator
