Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix DB unittest reliability #4548

Merged
merged 37 commits into from
Sep 20, 2024
Merged

fix: Fix DB unittest reliability #4548

merged 37 commits into from
Sep 20, 2024

Conversation

anticorrelator
Copy link
Contributor

@anticorrelator anticorrelator commented Sep 7, 2024

Updates our DB management strategy for unittests, this resolves the majority of stability issues we were seeing in CI.

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Sep 7, 2024
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. size:M This PR changes 30-99 lines, ignoring generated files. labels Sep 7, 2024
@anticorrelator anticorrelator changed the title fix(DO NOT MERGE): Lifespan testing fix: Fix DB unittest reliability Sep 11, 2024
.github/workflows/python-CI.yml Outdated Show resolved Hide resolved
pyproject.toml Show resolved Hide resolved
@@ -248,21 +248,21 @@ dependencies = [
]

[tool.hatch.envs.default.scripts]
tests = "pytest -n auto {args}"
tests = "pytest {args}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we no longer need pytest-xdist?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many cores are we using if we don't specify this option? do we still need pytest-xdist?

tests/datasets/test_experiments.py Show resolved Hide resolved
@@ -116,7 +111,7 @@ def openai_api_key(monkeypatch: pytest.MonkeyPatch) -> str:
postgresql_connection = factories.postgresql("postgresql_proc")


@pytest.fixture()
@pytest.fixture(scope="function")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't "function" the default scope?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but it's better to be explicit in case the default changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. fwiw i doubt pytest will change that default since it would be a very disruptive change.

@@ -248,21 +248,21 @@ dependencies = [
]

[tool.hatch.envs.default.scripts]
tests = "pytest -n auto {args}"
tests = "pytest {args}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how many cores are we using if we don't specify this option? do we still need pytest-xdist?

@anticorrelator anticorrelator merged commit 5e1d0b5 into auth Sep 20, 2024
16 checks passed
@anticorrelator anticorrelator deleted the dustin/lifespan-testing branch September 20, 2024 15:40
RogerHYang pushed a commit that referenced this pull request Sep 21, 2024
* Remove busywait

* Ruff 🐶

* Use closure loop

* Use nest-asyncio for nested asgi fixture management

* Wait for db insertions before reading in test

* Ensure the entire experiment has run

* Experiment with locks

* xfail unstable tests

* Use asyncio.sleep before querying database after client interactions

* Ruff 🐶

* Reduce number of evaluators to make tests more reliable

* Only bypass lock for unittests

* Convert to an integration test

* Set default loop scope for unit tests

* Remove loop policy

* xfail tests where evals do not reliably write to the database

* Ensure databases are function scoped

* Ensure inmemory sqlite testing

* Ruff 🐶

* Wipe DBs between tests

* Continue github actions on error

* Use async sleep in spans test

* Remove needless import

* Refactor engine setup to potentially reduce deadlock risk

* Wait for evaluations for more stable tests

* Don't continue on failure

* Ruff 🐶

* BulkInsterters insert immediately in tests

* Remove xdist

* Increase timeout to 30

* Xfail test

* Use shared cache

* Use tempfile based sqlite db

* Use tempdirs for windows compatibility

* Xfail test again

* Wait a waiter to llm eval test

* Skip flaky tests only on windows and mac
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants