Add integration test for notebooks #389

Closed · wants to merge 423 commits

Conversation

@thejumpman2323 (Contributor) commented Jul 8, 2023

Solves a task in #135

This pull request introduces an integration test that automates the conversion of Jupyter notebooks to Python files and runs each converted file as an individual test. This strengthens our testing infrastructure by providing a streamlined way to verify that the notebook conversions are functional and correct.

Changes Made

- Added a new integration test module, test_notebooks.py, which performs the Jupyter notebook to Python file conversion and execution.
- Implemented the logic to convert Jupyter notebooks to Python files using the nbconvert library.
- Configured the test module to execute the converted Python files as individual tests via pytest.

Test Execution Flow

1. The integration test module test_notebooks.py locates the target Jupyter notebook files.
2. Each notebook file is automatically converted to a corresponding Python file using the nbconvert library.
3. The converted Python files are executed individually as tests, verifying that each one runs to completion without errors.
4. Test results, including any failures or errors, are reported in the test execution summary (see the sketch just below this list).
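
For illustration, here is a minimal sketch of that flow, assuming pytest and nbconvert's PythonExporter; the notebooks/ directory, the helper name _convert_to_script, and the 60-second timeout are illustrative assumptions, not taken from this PR:

import subprocess
import sys
import tempfile
from pathlib import Path

import pytest
from nbconvert import PythonExporter

NOTEBOOK_DIR = Path('notebooks')  # hypothetical location of the target notebooks


def _convert_to_script(notebook_path: Path, out_dir: Path) -> Path:
    # Convert one .ipynb file into a plain .py script using nbconvert.
    source, _ = PythonExporter().from_filename(str(notebook_path))
    script_path = out_dir / (notebook_path.stem + '.py')
    script_path.write_text(source)
    return script_path


@pytest.mark.parametrize('notebook', sorted(NOTEBOOK_DIR.glob('*.ipynb')), ids=lambda p: p.name)
def test_notebook_runs(notebook):
    # Each notebook becomes its own test case: convert, execute, and assert a clean exit.
    with tempfile.TemporaryDirectory() as tmp_dir:
        script = _convert_to_script(notebook, Path(tmp_dir))
        completed = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, timeout=60,
        )
        assert completed.returncode == 0, completed.stderr

With this shape, a failing notebook surfaces as an individual pytest failure with its stderr attached, which matches the reporting behaviour described above.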

Benefits

- Automation: the integration test automates the process of converting Jupyter notebooks to Python files and executing them as tests.
- Verification: it ensures that Jupyter notebook conversions produce valid Python code that can be executed successfully.
- Continuous Integration: the test can be incorporated into our CI/CD pipeline to catch potential issues early in the development process.
- Improved Code Reliability: running Jupyter notebooks as tests helps identify code inconsistencies or errors introduced during conversion, leading to more reliable code.

This integration test contributes to the overall quality and reliability of our codebase, enabling us to confidently convert Jupyter notebooks to Python files and validate their correctness through automated testing.

GitHub Actions, SuperDuperDB, and others added 30 commits (March 29, 2023 15:16 onward), including:

- renamed all mentions of urls to uris to enable s3 support
- Feature/50/logger instead of print statements
- added common data types
- handled error in case learning task, deleting original task
@thejumpman2323 requested a review from blythed July 8, 2023 14:33
@thejumpman2323 self-assigned this Jul 8, 2023
@thejumpman2323 marked this pull request as draft July 8, 2023 17:07
@fkiraly commented Jul 10, 2023

Great PR summary, @thejumpman2323!
I immediately get what we're doing here and that allows me to make hopefully useful comments.

FYI, in case it helps (I think this is still WiP?), the same type of workflow in sktime is here:

It's interesting how you do the same thing, but from within python / pytest.

Question: would it be cleaner if it is its own CI element (like in sktime)?

@thejumpman2323 call logic: GHA -> pytest -> python -> jupyter -> python
sktime call logic: GHA -> sh -> jupyter -> python

@thejumpman2323 (Contributor, Author) replied, quoting the comment above:

> Question: would it be cleaner if it is its own CI element (like in sktime)?

Hi,
This makes sense, creating a separate CI workflow for the integration tests on notebooks!
Thanks

# Excerpt from test_notebooks.py as shown in the review diff (tmp_dir and py_file_path
# are defined elsewhere in the module, outside this excerpt):
    completed_process = subprocess.run(
        ['python3', os.path.join(tmp_dir, py_file_path)],
        capture_output=True, text=True, timeout=10)
    assert completed_process.returncode == 0

def test_notebooks():
Review comment from a Collaborator:

Would prefer this to be configured manually - some notebooks won't work without credentials, etc.
Could we do just the MNIST notebook initially?
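
A possible way to do that (a sketch only; the allowlist contents, the mnist.ipynb filename, and the run_notebook helper wrapping the convert-and-execute step are assumptions, not something settled in this thread) is to parametrize over a manually curated list instead of globbing every notebook:

import pytest

# Hypothetical allowlist: only notebooks known to run without external credentials.
NOTEBOOK_ALLOWLIST = ['mnist.ipynb']


@pytest.mark.parametrize('notebook_name', NOTEBOOK_ALLOWLIST)
def test_allowlisted_notebook(notebook_name):
    # Convert and execute the notebook exactly as in the sketch above,
    # but only for the manually configured entries.
    run_notebook(notebook_name)  # hypothetical helper wrapping convert + subprocess.run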

Review comment from a Collaborator:

We would need a test account for OpenAI, and secrets management on GitHub.
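
If credential-dependent notebooks are kept in the suite, one hedged option (assuming the GitHub secret is exposed to the job as the OPENAI_API_KEY environment variable, which is an assumption here, not something decided in this thread) is to skip those tests whenever the variable is absent:

import os

import pytest

# Skip credential-dependent notebook tests unless the secret has been injected
# into the environment (e.g. from GitHub Actions secrets in CI).
requires_openai = pytest.mark.skipif(
    not os.environ.get('OPENAI_API_KEY'),
    reason='OPENAI_API_KEY not set; skipping notebooks that call the OpenAI API',
)


@requires_openai
def test_openai_notebook():
    # Convert and execute the OpenAI-dependent notebook here,
    # following the same convert-and-run pattern as above.
    ...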

@rec force-pushed the main branch 2 times, most recently from bf5f0c5 to 0f516a0 on July 11, 2023 09:50
@thejumpman2323 added the "📉 technical debt" (Things that slow us down) label Jul 11, 2023
@rec closed this Jul 15, 2023
Labels: 📉 technical debt (Things that slow us down)
8 participants