Towards better inference: bits → nibbles #3808

originalsouth · 2024-11-06T18:30:28Z

Changes

Bits → Nibbles

Issue link

N/A

Demo

T.B.D.

QA notes

Test all ported bits/new nibbles from front-end:

disallowed_csp_hostnames (with/without config)
check_cve_2021_41773
domain_owner_verification
expiring_certificate
ask_url_params_to_ignore
website_discovery
missing_certificate
cipher_classification
default_findingtype_risk
spf_discovery
max_url_length_config (with config for testing only)
oois_in_headers (with/without config)
ask_port_specification
ask_disallowed_domains
check_hsts_header (with/without config)
url_classification
missing_spf

Code Checklist

All the commits in this PR are properly PGP-signed and verified.
This PR only contains functionality relevant to the issue.
I have written unit tests for the changes or fixes I made.
I have checked the documentation and made changes where necessary.
I have performed a self-review of my code and refactored it to the best of my abilities.

Tickets have been created for newly discovered issues.
For any non-trivial functionality, I have added integration and/or end-to-end tests.
I have informed others of any required .env file changes and updated the .env-dist accordingly.
I have included comments in the code to elaborate on what is not self-evident from the code itself, including references to issues and discussions online, or implicit behavior of an interface.

Checklist for code reviewers:

Copy-paste the checklist from the docs/source/templates folder into your comment.

Checklist for QA:

Copy-paste the checklist from the docs/source/templates folder into your comment.

sonarqubecloud · 2024-12-19T14:13:04Z

Quality Gate failed

Failed conditions
1 Security Hotspot
45.7% Coverage on New Code (required ≥ 80%)
9.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Donnype

Great work! I didn't check all the queries or dive too deep into some parts (such as run/infer and the whole descent down the inference tree), but at least for the queries I am happy to see the integration tests.

For me there is one big caveat: the nibble runner, and in particular its stateful properties, is now in charge of managing enabled/disabled nibbles and selecting them after a query. This will not work in practice over the application's lifetime.

Some less urgent comments are about the number of methods/endpoints and how flexible they should/could be.

Please have a look! I could definitely re-review this to go into more detail for other parts later on.

Donnype · 2024-12-20T07:31:33Z

octopoes/octopoes/models/origin.py

+                        self.method,
+                        f"[{','.join(sorted([str(param) for param in self.parameters_references]))}]"
+                        if self.parameters_references is not None
+                        else "Null",


Perhaps "None" would be more consistent with other instances where fields in the natural key might be None

Donnype · 2024-12-20T08:00:08Z

octopoes/octopoes/models/origin.py

+                        self.__class__.__name__,
+                        self.origin_type.value,
+                        self.method,
+                        f"[{','.join(sorted([str(param) for param in self.parameters_references]))}]"


Should this be the hash of the parameters instead? Since there could be an arbitrary number of them.
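
A minimal sketch of what hashing the parameters could look like, assuming each reference can be rendered with `str()`. The helper name `parameters_key` and the use of `hashlib.sha256` are illustrative assumptions and not code from this PR; the point is only that the natural key stays a fixed length however many parameters there are.

```python
import hashlib
from collections.abc import Iterable
from typing import Optional


def parameters_key(parameters_references: Optional[Iterable[object]]) -> str:
    """Return a fixed-length token for the natural key (hypothetical helper)."""
    if parameters_references is None:
        # Mirrors the "None" placeholder suggested in the previous comment.
        return "None"
    # Sort first so the token does not depend on the order of the references.
    joined = ",".join(sorted(str(param) for param in parameters_references))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

With this shape, two origins with the same references in a different order get the same token, and the token is always 64 hex characters long.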

Donnype · 2024-12-20T09:08:19Z

octopoes/octopoes/repositories/ooi_repository.py

@@ -249,6 +262,30 @@ def deserialize(cls, data: dict[str, Any]) -> OOI:
        stripped["user_id"] = user_id
        return object_cls.model_validate(stripped)

+    @classmethod
+    def objectify(cls, t: list | Any, obj: dict | list | set | Any) -> tuple | frozenset | Any:


Do you mean "hydrate"? This deserves a comment explaining what the method accepts and returns; I doubt Any is the best we can do, given that this method is only called once. Also, the parameter t and the variables tt and o could be more explicit about what they might hold. For the main if/elif branches you could specify which scenario we are in.

Donnype · 2024-12-20T09:37:53Z

octopoes/nibbles/definitions.py

+
+import structlog
+from pydantic import BaseModel
+from xxhash import xxh3_128_hexdigest as xxh3  # INFO: xxh3_64_hexdigest is faster but hash more collision probabilities


Pun intended?

Donnype · 2024-12-20T11:29:49Z

octopoes/octopoes/api/router.py

@@ -582,3 +582,73 @@ def migrate_origins(
        session.add((OperationType.DELETE, origin.id, valid_time))

    session.commit()  # The save-delete order is important to avoid garbage collection of the results
+
+
+@router.get("/nibbles/list", tags=["nibbles"])


For consistency:

Suggested change

@router.get("/nibbles/list", tags=["nibbles"])

@router.get("/nibbles", tags=["nibbles"])

Donnype · 2024-12-20T12:24:30Z

octopoes/tests/test_ooi_repository.py

+        assert self.repository.objectify(int, set((3 * "9 ").split())) == frozenset([9])
+        assert self.repository.objectify(int, {"1", "2", "5"}) == frozenset([1, 2, 5])
+
+        assert self.repository.objectify(str, "potato") == "potato"


Given all these examples: should we rename to something like parse_recursively_as(data, parse_type: type)?
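
To make the rename concrete, here is a rough sketch of a function with the proposed signature that reproduces the assertions quoted in this review. It is not the PR's `objectify` implementation: the branch semantics are inferred from those assertions only, the list/tuple branch is a guess, and dicts are deliberately left out because the quoted tests do not show how they should behave.

```python
from typing import Any


def parse_recursively_as(data: Any, parse_type: Any) -> Any:
    """Coerce `data` to `parse_type`, descending into containers.

    Assumed semantics (hypothetical, see the lead-in):
    - a list of types is applied position by position and yields a tuple;
    - a set or frozenset of values yields a frozenset of coerced members;
    - a list or tuple with a single type yields a tuple (a guess);
    - any other value is passed to the type itself.
    """
    if isinstance(parse_type, list):
        # One type per position, e.g. [str, int] for ["seven", "11"].
        if len(parse_type) != len(data):
            raise ValueError("need exactly one type per element")
        return tuple(
            parse_recursively_as(item, item_type)
            for item, item_type in zip(data, parse_type)
        )
    if isinstance(data, (set, frozenset)):
        return frozenset(parse_recursively_as(item, parse_type) for item in data)
    if isinstance(data, (list, tuple)):
        return tuple(parse_recursively_as(item, parse_type) for item in data)
    return parse_type(data)
```

Putting `data` first and naming the second parameter `parse_type` also makes each branch easier to annotate than `t`, `tt` and `o`.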

Donnype · 2024-12-20T12:25:56Z

octopoes/tests/test_ooi_repository.py

+            "potato2": "king-edward",
+        }
+
+        assert self.repository.objectify([str, int], ["seven", "11"]) == tuple(["seven", 11])


I think this gets a bit tricky. Some fields/query results might be strings that should remain strings even if they represent an integer. That is, perhaps some of this parsing should happen outside the query function, by users that know whether to expect a string or an integer?
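
One way to picture caller-side parsing is the sketch below, in which the query hands back strings and the caller names the fields it wants converted. The function `convert_row` and the field names `port` and `serial_number` are hypothetical and only illustrate the idea; nothing here is taken from the PR.

```python
from collections.abc import Callable, Mapping
from typing import Any


def convert_row(
    row: Mapping[str, str],
    converters: Mapping[str, Callable[[str], Any]],
) -> dict[str, Any]:
    """Convert only the fields the caller names; all others stay strings."""
    return {field: converters.get(field, str)(value) for field, value in row.items()}


# Hypothetical query result: both values arrive as strings.
row = {"port": "443", "serial_number": "0042"}
# The caller knows that only "port" is numeric, so "0042" keeps its leading zeros.
parsed = convert_row(row, {"port": int})
```

Blanket coercion to `int` would turn `"0042"` into `42`, which is the kind of silent change this comment is concerned about.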

Donnype · 2024-12-20T12:33:18Z

octopoes/octopoes/models/origin.py

@@ -13,6 +13,7 @@ class OriginType(Enum):
    OBSERVATION = "observation"
    INFERENCE = "inference"
    AFFIRMATION = "affirmation"
+    NIBBLET = "nibblet"


After Nibble and Nibbler, perhaps we should limit the number of newly introduced names by making this:

Suggested change

NIBBLET = "nibblet"

NIBBLET = "nibble"

And renaming the variables to nibble_origin

Donnype · 2024-12-20T12:36:42Z

octopoes/tests/integration/test_hsts_nibble.py

+
+STATIC_IP = ".".join((4 * "1 ").split())
+
+


So these tests create objects in XTDB and then query them using nibble queries to assert the behavior? Then I'm very pleased to see this!

Donnype · 2024-12-20T12:37:03Z

octopoes/tests/integration/test_missing_spf_nibble.py

+if os.environ.get("CI") != "1":
+    pytest.skip("Needs XTDB multinode container.", allow_module_level=True)
+
+


originalsouth and others added 30 commits August 27, 2024 09:31

Introducing nibbles

ae00a8e

Prototyping

c90fcb0

Merge remote-tracking branch 'origin/main' into feature/nibbles

d57cf19

Merge remote-tracking branch 'origin/main' into feature/nibbles

64ece62

Merge remote-tracking branch 'origin/main' into feature/nibbles

bba22a3

set default in model

0896eba

remove default bit

964b89b

fix test

5915f03

Fix Octopoes tests for patch related changes

ed7be58

Merge branch 'set-default-risk-in-model' of github.com:minvws/nl-kat-…

efa3c97

…coordination into set-default-risk-in-model

Fix Octopoes tests for patch related changes II

663a9bb

Merge branch 'main' into set-default-risk-in-model

bd78ed9

Fix Octopoes tests for patch related changes III

b5ba90a

Merge branch 'set-default-risk-in-model' of github.com:minvws/nl-kat-…

f885652

…coordination into set-default-risk-in-model

Prevent race conditions between Octopoes' event manager and the sched…

b05283e

…uler from recreating already deleted oois trhough affirmations

Merge branch 'main' into set-default-risk-in-model

06d1080

Merge branch 'main' into set-default-risk-in-model

5bf8b35

Merge branch 'main' into set-default-risk-in-model

967d41b

Merge remote-tracking branch 'origin/main' into feature/nibbles

d30b33f

Merge branch 'fix/prevent_race_conditions_between_event_manager_and_s…

86fe7d5

…cheduler_from_reacreating_already_deleted_oois_through_affirmations' into feature/nibbles

Merge branch 'set-default-risk-in-model' into feature/nibbles

dca2b20

Fixes for idle run

7699d93

Merge branch 'main' into feature/nibbles

0eb106f

Manual merge

2ed89fb

Revert "Set default findingtype risk in model instead of in bit (#3562)"

d9c9fa2

This reverts commit 1b4aed6.

Pre-commit after revert

20c5abf

Remove bogus rlu_cache

2d09141

Merge remote-tracking branch 'origin/main' into feature/nibbles

6adeffe

Register origins and add parameters begins

f3f4277

Merge remote-tracking branch 'origin/main' into feature/nibbles

ef9ad80

originalsouth added 4 commits December 16, 2024 23:31

Merge remote-tracking branch 'origin/main' into feature/nibbles

c51dca9

Introduce tests for config nibbles

0ecceec

Fix tests for config nibbles

66a287a

Merge remote-tracking branch 'origin/main' into feature/nibbles

38dc032

stephanie0x00 mentioned this pull request Dec 17, 2024

Documentation - Adding Nibbles #3976

Open

originalsouth added 4 commits December 17, 2024 19:26

Bypass SonarCloudSecurity check?

1e81c09

Merge remote-tracking branch 'origin/main' into feature/nibbles

e9b29b4

Fix config queries

b74d36a

Patch the ROBOT tetst (there should be more objects with the nibble a…

779afb7

…rchitecture because at this point in time they always run, maximizing available information)

originalsouth requested review from Donnype and underdarknl December 18, 2024 09:42

originalsouth marked this pull request as ready for review December 18, 2024 09:42

originalsouth requested a review from a team as a code owner December 18, 2024 09:42

originalsouth added the nibbles Everything nibble related label Dec 18, 2024

originalsouth added 9 commits December 18, 2024 10:59

Remove nibbles reset routine (unwanted and premature for now)

ec0f8bd

Update disallow_csp_hostnames from upstream

15304bc

Allow none configs in new nibble

2076e50

Retire perform_writes option

d7403c5

Make sonarcloud happier

719da7c

Make sonarcloud happier

7524a11

Make sonarcloud happier

a2d383c

Cleanup unused code

e680adc

Allow reruns when nibbles are updated (part I)

b4181f1

madelondohmen assigned underdarknl Dec 19, 2024

originalsouth assigned Donnype Dec 19, 2024

originalsouth added 3 commits December 19, 2024 12:09

Add retrieve functionality

e27fbbe

Implement yields

777b1b8

Add update nibble routines

7cd694f

Donnype reviewed Dec 20, 2024
