Support custom configuration schema, and fault injection testing #398

Merged
merged 91 commits into from
Dec 15, 2024
c414a11
Merge acto-dev commits
tylergu Oct 14, 2024
40ad2b7
Upload scripts
tylergu Oct 14, 2024
d3cce80
Fix diff ignore field
tylergu Oct 14, 2024
53b8bf0
add config for redis
TZ-zzz Oct 16, 2024
c3852bb
fix bugs in acto
TZ-zzz Oct 16, 2024
1a13278
Patch Cass operator's CRD
tylergu Oct 17, 2024
a63ed3f
Delete jvm related config from cass-operator for now
tylergu Oct 18, 2024
92b65ec
Ported elasticsearch operator cloud-on-k8s
Oct 19, 2024
990563b
upload the config generation scripts.
TZ-zzz Oct 20, 2024
07f19ac
update the script for config values
TZ-zzz Oct 20, 2024
0a03a76
Change dir name because python module cannot have dot in name
tylergu Oct 20, 2024
a6d1a93
Support UnderSpecified schema for configuration testing
tylergu Oct 20, 2024
f06f410
Fix import issue
tylergu Oct 20, 2024
55f294d
Fix partial func name
tylergu Oct 20, 2024
b0625eb
Fix get value by path
tylergu Oct 20, 2024
41e4ab0
Fix set value by path callsite
tylergu Oct 20, 2024
5335ae5
Fix null value in toml
tylergu Oct 20, 2024
8e35d7a
Add cass config test
tylergu Oct 20, 2024
2e5dfc7
Add cass config test
tylergu Oct 20, 2024
152b311
Add mongodb config test
tylergu Oct 21, 2024
48e4003
Fix mongodb config schema
tylergu Oct 21, 2024
551306d
Fix cass-operator config crd
tylergu Oct 21, 2024
c8355a6
Fix string schema for loading unknown properties
tylergu Oct 21, 2024
a286782
updated ES operator config
Oct 21, 2024
be1dcb9
Workaround the cass-operator's config CRD
tylergu Oct 22, 2024
3ac82cf
Fix import path
tylergu Oct 22, 2024
df3cf21
Fix merge error
tylergu Oct 22, 2024
cac8e74
Fix cassandra config schema
tylergu Oct 22, 2024
f809920
Fix tidb config mapping
tylergu Oct 23, 2024
d66da9f
Fix tidb config mapping
tylergu Oct 23, 2024
7576a6d
Fix boolean schema for configuration tests
tylergu Oct 23, 2024
edf5e73
Fix mongodb config name
tylergu Oct 23, 2024
a90a04c
added steady state fault injection impl
Oct 24, 2024
b2a08b8
updated elastic search acto config
Oct 24, 2024
bb5aef3
added steady state fault injection
Oct 24, 2024
e6ade70
Add mariadb config test
tylergu Oct 25, 2024
1374e60
Complete MongoDB configuration
tylergu Oct 25, 2024
a1bc178
Fix configparser for MariaDB
tylergu Oct 25, 2024
692f346
Retrieve all pod log when it is unhealthy
tylergu Oct 29, 2024
35b111b
Retry wait for pod to be ready
tylergu Oct 29, 2024
8d21f0f
Fix Cassandra configuration
tylergu Oct 29, 2024
f153f5f
Fix result is_error check for deletion tests
tylergu Oct 29, 2024
e9fe23f
Fix deletion test
tylergu Oct 29, 2024
b021f02
Revert tidb config change
tylergu Oct 29, 2024
5ccdfbb
Separate config test and func test
tylergu Oct 29, 2024
f537449
Fix config name
tylergu Oct 29, 2024
8a882bb
Fix mariadb config
tylergu Oct 29, 2024
25d311f
Put operator port into versioned dir
tylergu Oct 30, 2024
dc52852
Fix mariadb ini file parsing
tylergu Oct 30, 2024
6591d34
Fix mongodb configuration
tylergu Oct 30, 2024
84c1cf0
Fix configparser value set
tylergu Oct 31, 2024
4292ed5
Fix post diff test and run health oracle for rejected inputs
tylergu Oct 31, 2024
1e7a4f3
Update scripts
tylergu Oct 31, 2024
4e4d128
Fix mariadb config
tylergu Oct 31, 2024
58b7979
Raise exception if precondition is not satisfied
tylergu Nov 3, 2024
0cc7e59
Fix cr path bug in deletion tests
tylergu Nov 3, 2024
f27497e
Use updated health oracle to check convergence
tylergu Nov 4, 2024
65f3f51
Format
tylergu Nov 4, 2024
f5321c2
Fix collecting steady_system_state
tylergu Nov 4, 2024
7412bbf
Enlarge tidb operator wait time
tylergu Nov 5, 2024
2eab066
Correct Cassandra configuration schema
tylergu Nov 5, 2024
9773333
Use semantic replicas tests for cass operator and mongodb operator
tylergu Nov 5, 2024
97c2e6c
Cleanup fault injection code
tylergu Nov 5, 2024
10dbccc
Fix Cassandra configuration
tylergu Nov 7, 2024
68ce99f
Fix: skip oracle if cli indicates invalid input
tylergu Nov 8, 2024
da2087f
update bad values for configuration test
TZ-zzz Nov 8, 2024
ce3916f
Support custom oracle
tylergu Nov 11, 2024
ad9091f
add custom oracle for mongodb
TZ-zzz Nov 14, 2024
eb7f923
fix bugs in mongodb oracle
TZ-zzz Nov 15, 2024
447565e
add tidb oracle
TZ-zzz Nov 18, 2024
186711a
remove tidb commented codes
TZ-zzz Nov 18, 2024
dabe831
update tidb oracle
TZ-zzz Nov 18, 2024
05f47cd
add oracle for mariadb and fixes some bugs in acto
TZ-zzz Nov 19, 2024
508198d
update mariadb oracle and fix bugs in mongodb oracle
TZ-zzz Nov 20, 2024
d8a6c9f
fix bugs in oracle in mongodb
TZ-zzz Nov 20, 2024
e56df04
fix bugs in acto
TZ-zzz Nov 21, 2024
c7dc631
fix mongodb oracle
TZ-zzz Nov 21, 2024
5db9a97
fix mariadb oracle
TZ-zzz Nov 21, 2024
74894d8
fix bugs in mariadb oracle
TZ-zzz Nov 21, 2024
f98fc0c
fix mongodb config
TZ-zzz Nov 22, 2024
f7d09a5
make acto compatible with oracle
TZ-zzz Nov 25, 2024
bfa8856
fix bugs in acto
TZ-zzz Nov 25, 2024
6ad9c05
fix cass-operator oracle
TZ-zzz Dec 5, 2024
70ae28b
add missing properties for objects in cass-config
TZ-zzz Dec 6, 2024
ba0d297
run cass-op with custom oracle
TZ-zzz Dec 6, 2024
9d9b0d2
add logs to tidb oracle
TZ-zzz Dec 13, 2024
a7d1000
fix bugs in tidb oracle
TZ-zzz Dec 14, 2024
8135171
Update the run scripts
tylergu Dec 15, 2024
c0322f7
Delete unused scripts
tylergu Dec 15, 2024
38e68a1
Fix failed unittests
tylergu Dec 15, 2024
36f5a60
Fix cass test
tylergu Dec 15, 2024
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Expand Up @@ -16,7 +16,7 @@ repos:
args: [--extra=dev, --output-file=requirements-dev.txt]
files: ^pyproject.toml$
- repo: https://github.com/psf/black
rev: 23.12.0
rev: 24.10.0
hooks:
- id: black
name: black
Expand Down
24 changes: 14 additions & 10 deletions acto/checker/checker_set.py
Expand Up @@ -2,6 +2,8 @@

from typing import Optional

from acto.checker.checker import CheckerInterface

from acto.checker.impl.consistency import ConsistencyChecker
from acto.checker.impl.crash import CrashChecker
from acto.checker.impl.health import HealthChecker
Expand All @@ -23,10 +25,8 @@ def __init__(
trial_dir: str,
input_model: InputModel,
oracle_handle: OracleHandle,
checker_generators: Optional[list] = None,
custom_checker: Optional[type[CheckerInterface]] = None,
):
if checker_generators:
checker_generators.extend(checker_generators)
self.context = context
self.input_model = input_model
self.trial_dir = trial_dir
Expand All @@ -39,7 +39,12 @@ def __init__(
context=self.context,
input_model=self.input_model,
)
_ = oracle_handle

# Custom checker
self._oracle_handle = oracle_handle
self._custom_checker: Optional[CheckerInterface] = (
custom_checker(self._oracle_handle) if custom_checker else None
)

def check(
self,
Expand Down Expand Up @@ -68,12 +73,6 @@ def check(
num_delta,
)

# generation_result_path = os.path.join(
# self.trial_dir, f"generation-{generation:03d}-runtime.json"
# )
# with open(generation_result_path, "w", encoding="utf-8") as f:
# json.dump(run_result.to_dict(), f, cls=ActoEncoder, indent=4)

return OracleResults(
crash=self._crash_checker.check(
generation, snapshot, prev_snapshot
Expand All @@ -87,6 +86,11 @@ def check(
consistency=self._consistency_checker.check(
generation, snapshot, prev_snapshot
),
custom=(
self._custom_checker.check(generation, snapshot, prev_snapshot)
if self._custom_checker
else None
),
)

def count_num_fields(self, snapshot: Snapshot, prev_snapshot: Snapshot):
Expand Down
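The change above replaces the `checker_generators` list with a single optional checker class that is instantiated with the oracle handle, and `acto/engine.py` in this PR resolves that class from the module's `CUSTOM_CHECKER` attribute. A minimal sketch of that wiring follows. The `CheckerInterface` stand-in, `ReplicaChecker`, `load_custom_checker`, and the in-memory module are all hypothetical names for illustration and are not Acto's actual definitions.

```python
import types
from typing import Any, Optional


class CheckerInterface:
    """Stand-in for acto.checker.checker.CheckerInterface (assumed shape)."""

    def __init__(self, oracle_handle: Any) -> None:
        self.oracle_handle = oracle_handle

    def check(
        self, generation: int, snapshot: Any, prev_snapshot: Any
    ) -> Optional[str]:
        raise NotImplementedError


class ReplicaChecker(CheckerInterface):
    """Hypothetical custom oracle: flags a snapshot reporting zero replicas."""

    def check(self, generation, snapshot, prev_snapshot):
        if snapshot.get("replicas", 0) == 0:
            return f"generation {generation}: no replicas"
        return None


def load_custom_checker(module: types.ModuleType):
    """Resolve the optional CUSTOM_CHECKER and ON_INIT attributes of a module."""
    checker = getattr(module, "CUSTOM_CHECKER", None)
    # Accept only classes that derive from the checker interface.
    if not (isinstance(checker, type) and issubclass(checker, CheckerInterface)):
        checker = None
    on_init = getattr(module, "ON_INIT", None)
    return checker, on_init


# Build an in-memory module so the sketch runs without files on disk.
oracle_module = types.ModuleType("hypothetical_oracle")
oracle_module.CUSTOM_CHECKER = ReplicaChecker

checker_t, on_init = load_custom_checker(oracle_module)
# Mirror the diff: instantiate only when a checker class was provided.
custom_checker = checker_t(None) if checker_t else None
result = custom_checker.check(1, {"replicas": 0}, {}) if custom_checker else None
print(result)
```

Defaulting both attributes to `None` keeps the no-custom-oracle path free of attribute lookups that could fail, which is the main design point of the sketch.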
7 changes: 6 additions & 1 deletion acto/checker/impl/health.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,14 @@
"""System health oracle"""

def check(
self, _: int, snapshot: Snapshot, __: Snapshot
self,
_: int = 0,
snapshot: Optional[Snapshot] = None,
__: Optional[Snapshot] = None,
) -> Optional[OracleResult]:
"""System health oracle"""
if snapshot is None:
return None

logger = get_thread_logger(with_prefix=True)

system_state = snapshot.system_state
Expand Down
33 changes: 33 additions & 0 deletions acto/checker/impl/state_compare.py
Expand Up @@ -4,6 +4,7 @@
from deepdiff.helper import NotPresent

from acto.k8s_util.k8sutil import canonicalize_quantity
from acto.common import flatten_dict


def is_none_or_not_present(value: Any) -> bool:
Expand Down Expand Up @@ -83,6 +84,18 @@
return False
return False

def compare_application_config(input_config: Any, output_config: Any) -> bool:
if isinstance(input_config, dict) and isinstance(output_config, dict):
try:
set_input_config = flatten_dict(input_config, ["root"])
set_output_config = flatten_dict(output_config, ["root"])
for item in set_input_config:
if item not in set_output_config:
return False
return True
except configparser.Error:
return False
return False


class CompareMethods:
def __init__(self, enable_k8s_value_canonicalization=True):
Expand Down Expand Up @@ -143,3 +156,23 @@

# return original values
return in_prev, in_curr, out_prev, out_curr

class CustomCompareMethods():
def __init__(self):
self.custom_equality_checkers = []
self.custom_equality_checkers.extend([compare_application_config])


def equals(self, left: Any, right: Any) -> bool:
"""
Compare two values. If the values are not equal, then try to use custom_equality_checkers to see if they are equivalent.
@param left:
@param right:
@return:
"""
if left == right:
return True
else:
for equals in self.custom_equality_checkers:
if equals(left, right):
return True
return False
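`compare_application_config` above treats the input configuration as matching when every flattened input entry also appears in the flattened output. A minimal sketch of that subset check follows. The `flatten` helper here is hypothetical and written only for this example; `acto.common.flatten_dict` may use a different signature and return shape.

```python
from typing import Any


def flatten(value: Any, path: tuple = ()) -> set:
    """Flatten nested dicts and lists into a set of (path, leaf) pairs.

    Hypothetical helper for illustration; leaves must be hashable.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {(path, value)}
    flattened = set()
    for key, child in items:
        flattened |= flatten(child, path + (key,))
    return flattened


def is_config_subset(input_config: Any, output_config: Any) -> bool:
    """Return True when every input entry is present in the output."""
    if not (isinstance(input_config, dict) and isinstance(output_config, dict)):
        return False
    return flatten(input_config) <= flatten(output_config)


declared = {"storage": {"engine": "wiredTiger"}}
observed = {
    "storage": {"engine": "wiredTiger", "journal": True},
    "net": {"port": 27017},
}
print(is_config_subset(declared, observed))  # True
print(is_config_subset(observed, declared))  # False
```

The check is deliberately one-directional: the running application may report defaults that the input never set, so extra output entries are not treated as a mismatch.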

2 changes: 1 addition & 1 deletion acto/checker/impl/tests/test_state.py
Expand Up @@ -24,7 +24,7 @@ def input_model_and_context_mapping() -> (
Dict[str, Tuple[Dict, DeterministicInputModel]]
):
"""Returns a mapping from apiVersion to (context, input_model)"""
configs = glob.glob("./data/**/config.json")
configs = glob.glob("./data/**/config.json", recursive=True)
ret = {}
for config_path in configs:
with open(config_path, "r", encoding="utf-8") as f:
Expand Down
7 changes: 6 additions & 1 deletion acto/deploy.py
Expand Up @@ -20,7 +20,12 @@
"""Wait for all pods to be ready"""
now = time.time()
try:
p = kubectl_client.wait_for_all_pods(timeout=600)
i = 0
while i < 3:
p = kubectl_client.wait_for_all_pods(timeout=600)
if p.returncode == 0:
break
i += 1

except subprocess.TimeoutExpired:
logging.error("Timeout waiting for all pods to be ready")
return False
Expand Down
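The hunk above retries the pod readiness wait up to three times and stops at the first zero return code. A minimal sketch of the same pattern follows, under stated assumptions: `wait_with_retries` and `fake_wait` are hypothetical names, and the fake stands in for `kubectl_client.wait_for_all_pods` so the example runs without a cluster.

```python
import subprocess
from typing import Callable


def wait_with_retries(
    wait_once: Callable[[], subprocess.CompletedProcess],
    attempts: int = 3,
) -> bool:
    """Call wait_once until it exits with 0 or the attempts run out."""
    for _ in range(attempts):
        try:
            proc = wait_once()
        except subprocess.TimeoutExpired:
            # As in the diff, a timeout ends the wait without further retries.
            return False
        if proc.returncode == 0:
            return True
    return False


# Fake wait that fails twice and then succeeds.
codes = iter([1, 1, 0])


def fake_wait() -> subprocess.CompletedProcess:
    return subprocess.CompletedProcess(args=["wait"], returncode=next(codes))


print(wait_with_retries(fake_wait))  # True
```

Returning an explicit boolean makes the exhausted-retries case visible to the caller, whereas a bare loop leaves the last return code to be checked afterwards.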
63 changes: 35 additions & 28 deletions acto/engine.py
Expand Up @@ -16,6 +16,7 @@
import jsonpatch
import yaml

from acto.checker.checker import CheckerInterface
from acto.checker.checker_set import CheckerSet
from acto.checker.impl.health import HealthChecker
from acto.common import (
Expand Down Expand Up @@ -81,6 +82,11 @@
testcase.mutator(field_curr_value), list(path)
)
curr = value_with_schema.raw_value()
else:
raise RuntimeError(

"Running test case while precondition fails"
f" {path} {field_curr_value}"
)

# Satisfy constraints
assumptions: list[tuple[PropertyPath, bool]] = []
Expand Down Expand Up @@ -141,16 +147,16 @@
# remove pods that belong to jobs from both states to avoid observability problem
curr_pods = curr_system_state["pod"]
prev_pods = prev_system_state["pod"]
curr_system_state["pod"] = {
k: v
for k, v in curr_pods.items()
if v["metadata"]["owner_references"][0]["kind"] != "Job"
}
prev_system_state["pod"] = {
k: v
for k, v in prev_pods.items()
if v["metadata"]["owner_references"][0]["kind"] != "Job"
}

for k, v in curr_pods.items():
if "owner_reference" in v["metadata"] and v["metadata"]["owner_reference"] is not None and ["owner_references"][0]["kind"] == "Job":
continue
curr_system_state[k] = v

for k, v in prev_pods.items():
if "owner_reference" in v["metadata"] and v["metadata"]["owner_reference"] is not None and ["owner_references"][0]["kind"] == "Job":
continue
prev_system_state[k] = v


for obj in prev_system_state["secret"].values():
if "data" in obj and obj["data"] is not None:
Expand Down Expand Up @@ -249,8 +255,8 @@
runner_t: type,
checker_t: type,
wait_time: int,
custom_on_init: list[Callable],
custom_oracle: list[Callable],
custom_on_init: Optional[Callable],
custom_checker: Optional[type[CheckerInterface]],
workdir: str,
cluster: base.KubernetesEngine,
worker_id: int,
Expand Down Expand Up @@ -288,7 +294,7 @@
)

self.custom_on_init = custom_on_init
self.custom_oracle = custom_oracle
self.custom_checker = custom_checker

self.dryrun = dryrun
self.is_reproduce = is_reproduce

Expand Down Expand Up @@ -407,8 +413,8 @@
)
# first run the on_init callbacks if any
if self.custom_on_init is not None:
for on_init in self.custom_on_init:
on_init(oracle_handle)
for callback in self.custom_on_init:
callback(oracle_handle)


runner: Runner = self.runner_t(
self.context,
Expand All @@ -423,7 +429,7 @@
trial_dir,
self.input_model,
oracle_handle,
self.custom_oracle,
self.custom_checker,
)

curr_input = self.input_model.get_seed_input()
Expand Down Expand Up @@ -555,7 +561,6 @@
run_result.oracle_result.differential = self.run_recovery( # pylint: disable=assigning-non-slot
runner
)
generation += 1
trial_err = run_result.oracle_result
setup_fail = True
break
Expand Down Expand Up @@ -586,7 +591,6 @@
run_result.oracle_result.differential = self.run_recovery(
runner
)
generation += 1
trial_err = run_result.oracle_result
break

Expand All @@ -596,10 +600,10 @@
break

if trial_err is not None:
trial_err.deletion = self.run_delete(runner, generation=generation)
trial_err.deletion = self.run_delete(runner, generation=0)
else:
trial_err = OracleResults()
trial_err.deletion = self.run_delete(runner, generation=generation)
trial_err.deletion = self.run_delete(runner, generation=0)


return TrialResult(
trial_id=trial_id,
Expand Down Expand Up @@ -767,9 +771,9 @@
logger = get_thread_logger(with_prefix=True)

logger.debug("Running delete")
success = runner.delete(generation=generation)
deletion_failed = runner.delete(generation=generation)

if not success:
if deletion_failed:
return DeletionOracleResult(message="Deletion test case")
else:
return None
Expand Down Expand Up @@ -884,13 +888,16 @@

self.sequence_base = 0

self.custom_oracle: Optional[type[CheckerInterface]] = None
self.custom_on_init: Optional[Callable] = None
if operator_config.custom_oracle is not None:
module = importlib.import_module(operator_config.custom_oracle)
self.custom_oracle = module.CUSTOM_CHECKER
self.custom_on_init = module.ON_INIT
else:
self.custom_oracle = None
self.custom_on_init = None
if hasattr(module, "CUSTOM_CHECKER") and issubclass(
module.CUSTOM_CHECKER, CheckerInterface
):
self.custom_checker = module.CUSTOM_CHECKER
if hasattr(module, "ON_INIT"):
self.custom_on_init = module.ON_INIT

# Generate test cases
self.test_plan = self.input_model.generate_test_plan(
Expand Down Expand Up @@ -1125,7 +1132,7 @@
self.checker_type,
self.operator_config.wait_time,
self.custom_on_init,
self.custom_oracle,
self.custom_checker,
self.workdir_path,
self.cluster,
i,
Expand Down
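One hunk in this file removes pods that belong to Jobs from both system states while tolerating pods whose owner references are missing or null. A minimal sketch of that filtering follows, assuming each pod is a plain dict with an optional `metadata.owner_references` list. `owned_by_job` and `drop_job_pods` are hypothetical helpers, and the sketch checks every owner reference rather than only the first one.

```python
def owned_by_job(pod: dict) -> bool:
    """Return True when any owner reference of the pod has kind 'Job'."""
    owners = pod.get("metadata", {}).get("owner_references") or []
    return any(owner.get("kind") == "Job" for owner in owners)


def drop_job_pods(system_state: dict) -> dict:
    """Return a shallow copy of the state without Job-owned pods."""
    filtered = dict(system_state)
    filtered["pod"] = {
        name: pod
        for name, pod in system_state.get("pod", {}).items()
        if not owned_by_job(pod)
    }
    return filtered


state = {
    "pod": {
        "db-0": {"metadata": {"owner_references": [{"kind": "StatefulSet"}]}},
        "backup-x": {"metadata": {"owner_references": [{"kind": "Job"}]}},
        "bare": {"metadata": {"owner_references": None}},
    }
}
print(sorted(drop_job_pods(state)["pod"]))  # ['bare', 'db-0']
```

Building a new `pod` mapping avoids mutating the dict being iterated and leaves the caller's original state untouched.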
2 changes: 0 additions & 2 deletions acto/input/constraint.py
@@ -1,5 +1,3 @@
""""""

from typing import Literal, Optional

import pydantic
Expand Down