feat: add pytest for e2e testing #1188

Merged
merged 4 commits into dev from j/add-pytest-for-e2e-testing on Sep 12, 2024
Conversation

@namchuai (Contributor) commented Sep 11, 2024

Describe Your Changes

  • Sample testing for our CLI and API server

Items

  • Engine list
  • Engine get
  • Engine install
  • Engine uninstall
  • Adding some tests for HTTP requests

How to use

  1. Build the cortex binary first. The binary should be available at engine/build/cortex-cpp.
  2. Install Python.
  3. pip install pytest requests
    3.1. On Windows, I had to use pip install pytest requests --user
  4. python e2e-test/main.py
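
For orientation, a test in this suite looks roughly like the sketch below: it shells out to the cortex binary, asserts on the exit code and stdout, and uses pytest markers to skip platform-specific variants. This is a minimal illustration only; the run_cli helper, the binary path, the engines list subcommand, and the expected output string are assumptions, not the exact code in this PR.

```python
# Hypothetical sketch of a platform-gated CLI e2e test; not the PR's exact code.
import platform
import subprocess

import pytest

CORTEX_BINARY = "build/cortex-cpp"  # assumed path, relative to the engine/ folder


def run_cli(*args):
    """Run the cortex binary with the given arguments and capture its output."""
    return subprocess.run(
        [CORTEX_BINARY, *args], capture_output=True, text=True, timeout=60
    )


class TestCliEngineList:
    @pytest.mark.skipif(platform.system() != "Darwin", reason="macOS-specific test")
    def test_engines_list_run_successfully_on_macos(self):
        result = run_cli("engines", "list")  # assumed subcommand spelling
        assert result.returncode == 0
        # Assumed output check: the list should at least mention llamacpp.
        assert "llamacpp" in result.stdout
```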

Remaining things to do

  • Retest on Windows to see if this works as expected
  • Wire this into our CI/CD?

Screenshots

[Screenshot 2024-09-11 at 12:38:11]

Windows:

[image]

Sample output

~/w/c/engine ❯❯❯ python e2e-test/main.py
===================================================================== test session starts ======================================================================
platform darwin -- Python 3.11.4, pytest-8.3.2, pluggy-1.5.0 -- /Users/jamesnguyen/.pyenv/versions/3.11.4/bin/python
cachedir: .pytest_cache
rootdir: /Users/jamesnguyen/workspace/cortex/engine
plugins: anyio-4.3.0
collected 7 items                                                                                                                                              

e2e-test/main.py::TestCliEngineList::test_engines_list_run_successfully_on_windows SKIPPED (Windows-specific test)                                       [ 14%]
e2e-test/main.py::TestCliEngineList::test_engines_list_run_successfully_on_macos PASSED                                                                  [ 28%]
e2e-test/main.py::TestCliEngineList::test_engines_list_run_successfully_on_linux SKIPPED (Linux-specific test)                                           [ 42%]
e2e-test/main.py::TestCliEngineGet::test_engines_list_run_successfully_on_windows SKIPPED (Windows-specific test)                                        [ 57%]
e2e-test/main.py::TestCliEngineGet::test_engines_get_run_successfully_on_macos PASSED                                                                    [ 71%]
e2e-test/main.py::TestCliEngineGet::test_engines_get_llamacpp_on_macos PASSED                                                                            [ 85%]
e2e-test/main.py::TestCliEngineGet::test_engines_list_run_successfully_on_linux SKIPPED (Linux-specific test)                                            [100%]

================================================================= 3 passed, 4 skipped in 2.12s =================================================================

Fixes Issues

Self Checklist

  • Added relevant comments, esp in complex areas
  • Updated docs (for bug fixes / features)
  • Created issues for follow-up changes or refactoring needed

@dan-homebrew (Contributor) commented Sep 11, 2024

@namchuai I think this is a great start, but I'd like to push our testing strategy to be more rigorous:

  • What are the key failure modes of engine installation? (e.g. detecting hardware type?)
  • Or are we pushing hardware detection to a separate group of tests?
  • EDIT: I see what you are doing with checking for CLI output - I agree that it's the correct way to do it

Also: can I check my understanding:

  • Are we intending to do an actual pull of TensorRT-LLM or ONNX engines? (i.e. e2e)
  • How often do we intend this test to run?

(Resolved, now-outdated review threads on engine/e2e-test/test_runner.py and engine/e2e-test/test_cli_engine_uninstall.py)
@namchuai (Contributor, Author) replied:

> @namchuai I think this is a great start, but I'd like to push our testing strategy to be more rigorous:
>
>   • What are the key failure modes of engine installation? (e.g. detecting hardware type?)
>   • I see that most of our tests are cursory log level checks
>   • Or are we pushing hardware detection to a separate group of tests?

  1. What are the key failure modes of engine installation? (e.g. detecting hardware type?)
    IMO, some of the failure modes of engine installation are:
    • Unsupported hardware, or a CUDA or driver version that is too old.
    • Network issues (an engine such as TensorRT can be a heavy download, so a user could end up with a corrupted file).
  2. I see that most of our tests are cursory log-level checks.
    This PR is intended to lay a foundation for end-to-end testing. It's not meant to cover every feature of the app; other devs should check in and write test cases for their own features.

To support this, I added:
  • An API to start and stop the API server, so other developers can easily write tests against the API; it is covered by setup and teardown for each test suite (sketched below).
  • An API to run a command, in case a developer wants to write tests for the CLI.
  • A test for engine download that checks cortex actually downloads the engine suitable for the user's specs.

IMO, end-to-end testing is written from the user's perspective. It ensures that whatever we change or update in the future, the feature still works, and it will mostly involve stdout and stderr validation.
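
For illustration, here is a rough sketch of that setup/teardown pattern, assuming the API server is launched from the cortex binary and answers on a local HTTP port; the launch command, port, and /v1/engines endpoint are placeholders rather than the actual test_runner.py API:

```python
# Hypothetical setup/teardown sketch; launch command, port and endpoint are placeholders.
import subprocess
import time

import pytest
import requests

API_BASE = "http://127.0.0.1:3928"  # assumed local port


@pytest.fixture(scope="class")
def api_server():
    """Start the API server before a test class runs and stop it afterwards."""
    proc = subprocess.Popen(["build/cortex-cpp"])  # assumed launch command
    try:
        # Poll until the server answers, giving up after ~10 seconds.
        for _ in range(20):
            try:
                requests.get(f"{API_BASE}/v1/engines", timeout=1)  # assumed endpoint
                break
            except requests.ConnectionError:
                time.sleep(0.5)
        yield API_BASE
    finally:
        proc.terminate()
        proc.wait(timeout=10)


class TestApiEngineList:
    def test_engines_list_returns_ok(self, api_server):
        response = requests.get(f"{api_server}/v1/engines")  # assumed endpoint
        assert response.status_code == 200
```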

  3. Or are we pushing hardware detection to a separate group of tests?
    Other tests should come in other PRs; let's keep this PR as the foundation.
    Also, I'm a bit curious about the hardware detection test. Since users' hardware specs vary, I don't think we can cover that in e2e tests.

I think we can separate it into 2 parts:
Part 1: Detect the user's hardware.
Part 2: Based on the result of Part 1, pick the most suitable engine variant.

With this split, we can mock Part 1 and unit test Part 2 with ease (sketched below).
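
To illustrate that split, here is a minimal sketch of how Part 2 could be unit tested by stubbing Part 1's result; HardwareInfo, select_engine_variant, and the variant names are hypothetical names invented for this example, not existing cortex code:

```python
# Hypothetical example: stub Part 1 (hardware detection) to unit test Part 2 (variant selection).
from dataclasses import dataclass
from typing import Optional


@dataclass
class HardwareInfo:  # stand-in for the output of Part 1
    os: str
    arch: str
    has_avx2: bool
    gpu_vendor: Optional[str]


def select_engine_variant(hw: HardwareInfo) -> str:
    """Toy version of Part 2: map detected hardware to an engine variant name."""
    if hw.gpu_vendor == "nvidia":
        return "llamacpp-cuda"
    if hw.os == "mac" and hw.arch == "arm64":
        return "llamacpp-mac-arm64"
    return "llamacpp-avx2" if hw.has_avx2 else "llamacpp-noavx"


def test_selects_cuda_variant_when_nvidia_gpu_present():
    hw = HardwareInfo(os="linux", arch="x86_64", has_avx2=True, gpu_vendor="nvidia")
    assert select_engine_variant(hw) == "llamacpp-cuda"


def test_falls_back_to_cpu_variant_without_gpu():
    hw = HardwareInfo(os="linux", arch="x86_64", has_avx2=True, gpu_vendor=None)
    assert select_engine_variant(hw) == "llamacpp-avx2"
```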

Part 1 is more complicated and involves:

  • CPU checks (AVX flags, ARM/x86_64, etc.)
    -> Current state: we can gather all of this information in our code base.
  • GPU checks (NVIDIA/AMD/Intel Arc/integrated GPUs, etc.)
    -> Current state: we have to rely on other software and parse its output to read the information (nvidia-smi, vulkanSDKInfo, etc.)
    Since the inputs vary so much, I can't think of a way to test this against every kind of hardware.

For this, I think we can write an e2e test where cortex-cpp runs tinyllama, expect the run to succeed, and improve from there.
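
As a starting point, such a smoke test could look roughly like the sketch below; the run subcommand, the model name handling, and the output check are all assumptions about the CLI, not verified behavior:

```python
# Hypothetical smoke test for running a small model end to end; command and output are assumed.
import subprocess


def test_run_tinyllama_succeeds():
    result = subprocess.run(
        ["build/cortex-cpp", "run", "tinyllama"],  # assumed CLI invocation
        capture_output=True,
        text=True,
        timeout=600,  # generous timeout, since the model may need to be downloaded
    )
    # Assumed success criteria: clean exit and some mention of the model in stdout.
    assert result.returncode == 0
    assert "tinyllama" in result.stdout.lower()
```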

@dan-homebrew (Contributor) replied:

@namchuai Got it - thank you for thinking through this very clearly. Approved, and fingers crossed that these new e2e tests will help us catch bugs before they go out to users 🙏

@dan-homebrew (Contributor) left a review comment:


Thanks for driving this 🙏

@namchuai merged commit 848af19 into dev on Sep 12, 2024
4 checks passed
@namchuai deleted the j/add-pytest-for-e2e-testing branch on September 12, 2024 at 08:25