
Run testbed with Textual in CI #2823

Merged
merged 2 commits into beeware:main from textual-testbed on Sep 12, 2024

Conversation

@rmartin16 (Member) commented Sep 8, 2024

Changes

  • Adds a poor man's test for Textual...since it just starts the app with Textual and falsely reports testing success

Notes

  • While this is far from actually testing Textual...it does, at least, confirm a Textual app can start

Related

PR Checklist:

  • All new features have been tested
  • All new features have been documented
  • I have read the CONTRIBUTING.md file
  • I will abide by the code of conduct

@rmartin16 force-pushed the textual-testbed branch 4 times, most recently from 4a63acb to 5b52443 on September 8, 2024 20:33

@rmartin16 (Member, Author) commented Sep 8, 2024

Well, there's something weird happening with the stub app:

[testbed] Starting test_suite...

>>> Running Command:
>>>     /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/bin/testbed -ci
>>> Working Directory:
>>>     /home/russell
>>> Environment Overrides:
>>>     BRIEFCASE_DEBUG=1
>>>     BRIEFCASE_MAIN_MODULE=tests.testbed
===========================================================================
Install path: /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr
Pre-initializing Python runtime...
config.executable: /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/bin/testbed
config.run_module: tests.testbed
PYTHONPATH:
- /usr/lib/python3.10
- /usr/lib/python3.10/lib-dynload
- /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app
- /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app_packages
Configure argc/argv...
Initializing Python runtime...
Running app module: tests.testbed
---------------------------------------------------------------------------
TypeError: 'str' object cannot be interpreted as an integer

This doesn't happen when I use the system Python:

❯ PYTHONPATH="/usr/lib/python3.10/:/usr/lib/python3.10/lib-dynload/:/home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app/:/home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app_packages/" /usr/bin/python3.10 -c 'import runpy; runpy._run_module_as_main("tests.testbed")'
inside tests.testbed
Waiting for app to be ready for testing... /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app_packages/toga/__init__.py:62: NotImplementedWarning: [Textual] Not implemented: Status Icons
  warnings.warn(NotImplementedWarning(f"[{platform}] Not implemented: {feature}"))
/home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app_packages/toga/__init__.py:62: NotImplementedWarning: [Textual] Not implemented: App.create_menus()
  warnings.warn(NotImplementedWarning(f"[{platform}] Not implemented: {feature}"))
/home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/lib/testbed/app_packages/toga/__init__.py:62: NotImplementedWarning: [Textual] Not implemented: Window.create_toolbar()
  warnings.warn(NotImplementedWarning(f"[{platform}] Not implemented: {feature}"))
ready.
>>>>>>>>>> EXIT 0 <<<<<<<<<<

It's even stranger that it happens for Windows as well....

@rmartin16 (Member, Author)

At any rate, if something like this was working, would you be interested in adding it to CI? Or just wait until the test suite can actually run? Or some other middle ground?

@freakboy3742 (Member)

At any rate, if something like this was working, would you be interested in adding it to CI? Or just wait until the test suite can actually run? Or some other middle ground?

Yes, I'd be interested in seeing this (or a working version of it). While I'd love a full CI test suite, any validation would be better than nothing. Even a basic "will it start" test would have caught the 0.63.2 upgrade issue.

As a side note - Textual has unit test support, and that support will allow doing things like changing the screen size. However, I haven't done any investigation into how to make it work, beyond "it exists".

My assumption to date has been that we'd need a second app configuration to provide a testbed for textual (and probably a third for web) that is not much more than "from testbed import *", but with a configuration hook to pull in the toga-textual requirements.
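
For reference, here is a minimal sketch of what Textual's documented run_test()/Pilot testing interface looks like, independent of Toga or Briefcase; the stub app below is invented for illustration, and the exact API may vary between Textual versions:

import asyncio

from textual.app import App, ComposeResult
from textual.widgets import Label


class StubApp(App):
    """A trivial Textual app, used only to exercise run_test()."""

    def compose(self) -> ComposeResult:
        yield Label("hello")


async def drive() -> None:
    app = StubApp()
    # run_test() starts the app headless and yields a Pilot object;
    # size controls the emulated terminal dimensions.
    async with app.run_test(size=(80, 24)) as pilot:
        await pilot.pause()          # let pending messages settle
        assert app.query_one(Label)  # probe the running app


asyncio.run(drive())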

@rmartin16 (Member, Author)

Do you have any intuition where this error might be coming from?

TypeError: 'str' object cannot be interpreted as an integer

As best I can tell, the error text is from operator.py...but other projects have used this error message as well.

If I take Briefcase out of the equation, the same thing happens:

❯ BRIEFCASE_MAIN_MODULE=test.testbed /home/russell/github/beeware/toga/testbed/build/testbed/pop/jammy/testbed-0.0.1/usr/bin/testbed
TypeError: 'str' object cannot be interpreted as an integer

Without BRIEFCASE_MAIN_MODULE=test.testbed, the app runs normally.

Debugging the stub app, Python is not exiting with an exception; just a return code of -1.

And as best I can tell, the error is occurring before any code in testbed.py actually runs....

@rmartin16 (Member, Author)

Interesting....just throwing spaghetti at the wall at this point...

This fails

briefcase run --test --config 'requires=["../core","../textual"]' --config 'test_sources=[]'

But this works...

briefcase build --test --config 'requires=["../core","../textual"]' --config 'test_sources=[]' \
    && briefcase run --test

@freakboy3742 (Member)

Do you have any intuition where this error might be coming from?

TypeError: 'str' object cannot be interpreted as an integer

I don't - it's clearly a Python error, but I don't get that on macOS; and there's no Windows-specific Python logic in the stub that I can think of...

As best I can tell, the error text is from operator.py

Yeah - it would be consistent with something trying to do "some value" + 42... but I can't think of anywhere obvious that would be happening. My only suggestion at this point would be to strip down testbed.py bit by bit until the offender is found... it might be something that pytest is generating, which is why most of the stack trace is being swallowed?

@rmartin16 (Member, Author)

Yeah - it would be consistent with something trying to do "some value" + 42... but I can't think of anywhere obvious that would be happening. My only suggestion at this point would be to strip down testbed.py bit by bit until the offender is found... it might be something that pytest is generating, which is why most of the stack trace is being swallowed?

That's just it, though...this error is occurring seemingly before any of testbed.py runs...

It looks like the Briefcase args are leaking into the stub app...

@rmartin16 force-pushed the textual-testbed branch 4 times, most recently from 6227b25 to 72f654b on September 9, 2024 05:45

@rmartin16 (Member, Author)

Ok; I think I figured this out. It has nothing to do with Textual or command line arguments leaking.

This is just the error you get if you ask the stub app to run a module that doesn't exist, e.g.:

❯ BRIEFCASE_MAIN_MODULE=asdf /home/russell/tmp/beeware/configtest/build/configtest/pop/jammy/configtest-0.0.1/usr/bin/configtest
TypeError: 'str' object cannot be interpreted as an integer

This was happening for the testbed app because I was passing --config 'test_sources=[]'; so, when Briefcase set BRIEFCASE_MAIN_MODULE to tests.testbed, that module didn't exist in the app, and this error was thrown.

I think this simulates a similar operation with the system Python:

/usr/bin/python3.10 -c 'import runpy; runpy._run_module_as_main("asdf")'
/usr/bin/python3.10: No module named asdf

@rmartin16 (Member, Author)

Also....first evidence that one second isn't long enough for the DOM storage test I added...

AssertionError: assert 'SecurityError: The operation is insecure.' == 'Hello World'

https://github.com/beeware/toga/actions/runs/10767292112/job/29854361006?pr=2823

@freakboy3742 (Member)

Ok; I think figured this out. It has nothing to do with Textual or command line arguments leaking.

This is just the error you get if you ask the stub app to run a module that doesn't exist, e.g.:

Well that's an interesting manifestation... it's unlikely apps will hit this in the wild, but if we can make it present nicer, we probably should.

Also....first evidence that one second isn't long enough for the DOM storage test I added...

Some errors like this are going to be inevitable when we're dependent on network access; unless it becomes a regular fixture in test failures, I'm not overly concerned.

@rmartin16 (Member, Author)

As for running the testbed app with Textual on macOS in test mode, I'm not sure that Briefcase supports this...

Even if I create the helloworld console app and run briefcase run macOS --test, I just get the "Test suite didn't report a result" message. On macOS, Briefcase will run a console app like a GUI app in order to capture stdout; while I can confirm that pytest runs, the output is not captured by Briefcase.

@freakboy3742 (Member)

As for running the testbed app with Textual on macOS in test mode, I'm not sure that Briefcase supports this...

You're right, there's no explicit support for Textual in Briefcase. The same goes for Web.

Test mode at present is really not much more than "run tests.<appname> as the entry point, capture stdout looking for a test completion marker". That's not compatible with console-mode apps, and it's difficult for Web.

I have been thinking recently that we may need to rethink how we run testing. This has been mostly driven by the web backend, but it has some possible benefits for other platforms.

At present, the test suite runs "in" the app, and reports success via console. That means the app needs to support threads (which Web can't); and it means the state of the test suite gets tied up in the state of the app.

An alternate approach would be to do something closer to Playwright/Selenium. The test suite runs as a normal Python test suite, and it opens a "session" to a remote "app" instance. In the case of Playwright, that's a browser session that has been instrumented. You send "remote control" commands to the browser, and poll the browser to query state and evaluate test conditions. This removes the need for any special handling - the test suite is "just" a test suite; it also means your test suite can, if necessary, start multiple apps, allowing you to test startup conditions etc. (speed might be an issue for this, but it's not an option at all at present).

This would require a pretty radical rework of how Toga does testing... but not that radical, as we've already got the "instrumentation" stuff mostly abstracted. We need to set up an "over the wire" control mechanism, including an in-app server to accept control connections... but that's also something the Toga API is literally designed to make possible.

It also solves some big issues we have right now with log streaming - some of our failure modes are currently "macOS/iOS dropped a log message due to load/volume, and now the test suite doesn't know it finished". In the remote-control setup, all testing is a local test suite, so there's no ambiguity.

NGL, it would be a big change - but I think it makes Toga's testing story a lot more consistent and stable.
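
To make the shape of that concrete, here is a purely hypothetical sketch of the Playwright-style approach — the TestbedSession client, its JSON-over-socket protocol, and the widget IDs are all invented for illustration; nothing like this exists in Toga or the testbed today:

import json
import socket


class TestbedSession:
    """Hypothetical remote-control client: sends JSON commands to an
    in-app control server and reads back JSON state."""

    def __init__(self, host: str = "localhost", port: int = 8075) -> None:
        self.sock = socket.create_connection((host, port))
        self.reader = self.sock.makefile("r")

    def send(self, command: str, **kwargs) -> dict:
        payload = {"command": command, **kwargs}
        self.sock.sendall(json.dumps(payload).encode() + b"\n")
        return json.loads(self.reader.readline())


def test_button_press():
    # The test suite is "just" a test suite; the app runs separately and
    # exposes a control endpoint that the session connects to.
    session = TestbedSession()
    session.send("press", widget_id="example_button")
    state = session.send("get_text", widget_id="example_label")
    assert state["text"] == "hello"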

@rmartin16 (Member, Author) commented Sep 10, 2024

Yeah; that makes sense. FWIW, if we implemented Textual's current testing support, the app would run in headless mode and there's an object that provides an interface to probe the running app. At least, that's my very cursory review of Textual's testing functionality....so, I'd expect it to be possible to still shoehorn in support for Textual in Toga's current setup if we had to.

@rmartin16 (Member, Author)

I was thinking, though....we could bring macOS to parity with Windows and Linux for testing console apps by simply streaming the output instead of calling open. That'd at least let users (and Toga CI) run briefcase run --test (even if the exact testing strategy may still be a complicated story).
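
Roughly, the streaming idea amounts to a direct invocation of the app binary with its output echoed line by line, rather than handing it to open; a sketch, assuming a hypothetical binary path (this is not how Briefcase currently runs macOS apps under test):

import subprocess

# Hypothetical path to a built macOS console-app binary.
BINARY = "build/testbed/macos/app/Testbed.app/Contents/MacOS/Testbed"

# Invoke the binary directly, merge stderr into stdout, and echo lines as
# they arrive so a test-completion marker can be spotted in the stream.
with subprocess.Popen(
    [BINARY],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
    bufsize=1,
) as proc:
    for line in proc.stdout:
        print(line, end="")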

@freakboy3742 (Member)

Yeah; that makes sense. FWIW, if we implemented Textual's current testing support, the app would run in headless mode and there's an object that provides an interface to probe the running app. At least, that's my very cursory review of Textual's testing functionality....so, I'd expect it to be possible to still shoehorn in support for Textual in Toga's current setup if we had to.

Yeah - that's my impression as well - although the entry point for the test might need to be structured a little differently. However, that's also compatible with needing a second Briefcase configuration to set up the textual entry point, so it might not be a big deal.

@freakboy3742 (Member)

I was thinking, though....we could bring macOS to parity with Windows and Linux for testing console apps by simply streaming the output instead of calling open. That'd at least let users (and Toga CI) run briefcase run --test (even if the exact testing strategy may still be a complicated story).

Agreed that this would be technically possible; the complication with that approach is that app startup is very slightly different when you use open. We've hit this in the past with argument handling in Briefcase. In terms of guaranteeing that the app actually starts the way we think it is, I feel it's important that we preserve the "normal" startup as much as possible - mocking it should be a last resort.

@rmartin16 (Member, Author)

Agreed that this would be technically possible; the complication with that approach is that app startup is very slightly different when you use open. We've hit this in the past with argument handling in Briefcase. In terms of guaranteeing that the app actually starts the way we think it is, I feel it's important that we preserve the "normal" startup as much as possible - mocking it should be a last resort.

I'm not sure I follow this entirely; if only for my own edification, how does open interact with the console app on macOS?

Are we expecting that open will be used to run console apps for users following installation? Or is something else happening here?

For context, I was thinking that installed console apps would just be in PATH and available to run straight from the user's terminal session.

@freakboy3742 (Member)

Agreed that this would be technically possible; the complication with that approach is that app startup is very slightly different when you use open. We've hit this in the past with argument handling in Briefcase. In terms of guaranteeing that the app actually starts the way we think it is, I feel it's important that we preserve the "normal" startup as much as possible - mocking it should be a last resort.

I'm not sure I follow this entirely; if for only my own edification, how does open interact with the console app on macOS?

It doesn't - unless you're testing the app. We are using the same code path to run a console app under test as we do any GUI app, specifically so that we can get the streamed output log.

I think I misunderstood what you were proposing though - I thought you were proposing replacing the use of open with a direct execution of the binary across the board for macOS. However, on re-reading, it sounds like you're suggesting that for the macOS case, we have a third execution option: Windows/Linux-style streaming of stdout, wrapped around a direct invocation of the binary - but only for testing a console binary.

I think that makes sense; since there's now a different stub binary for console apps, the stdout handling never has the system log redirection installed.

@rmartin16 force-pushed the textual-testbed branch 2 times, most recently from 2e88b21 to 94e641b on September 10, 2024 15:41

@freakboy3742 (Member)

The iOS CI failure was a gap in binary package support - I forgot that Toga requires a specific point release of Pillow (9.2.0). I've compiled and published those wheels, and re-run the CI step; looks like it's building now.

@rmartin16 (Member, Author)

So, for this, do you think it's necessary right now to introduce a wholly new app definition for Textual? or just stick with the --config overrides?

I also need to decide on how to start the app as "headless"...not sure I like my currently hacked-together solution.

@freakboy3742 (Member)

So, for this, do you think it's necessary right now to introduce a wholly new app definition for Textual? or just stick with the --config overrides?

I hadn't thought of using --config overrides - that's a neat approach. I don't know if it's the permanent solution, but I don't object to it as a "get something working" solution.

I also need to decide on how to start the app as "headless"...not sure I like my currently hacked-together solution.

Two thoughts on this:

  1. If there's going to be a "headless" flag, I'd rather it was tied down to the Textual backend, because it's not clear to me what an interface-level "headless" attribute would represent
  2. Instead of a "headless" flag, could we use something that identifies test mode? Something like __main__.startswith('test')? Or possibly a variant on this sort of approach, where we document a specific attribute (like toga.App._test_mode) for this?
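
As a rough sketch of what option 2 could look like — the helper below is hypothetical, and no such check or attribute exists in Toga today:

import sys


def running_under_test() -> bool:
    # Briefcase's test mode runs "tests.<appname>" as the main module, so
    # inspect the main module's spec name (which survives runpy rewriting
    # __name__ to "__main__").
    main = sys.modules.get("__main__")
    spec = getattr(main, "__spec__", None)
    return bool(spec and spec.name.startswith("tests."))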

@rmartin16 force-pushed the textual-testbed branch 3 times, most recently from 996757f to 23eb758 on September 12, 2024 20:23

@rmartin16 (Member, Author) commented Sep 12, 2024

I also need to decide on how to start the app as "headless"...not sure I like my currently hacked-together solution.

Two thoughts on this:

  1. If there's going to be a "headless" flag, I'd rather it was tied down to the Textual backend, because it's not clear to me what an interface-level "headless" attribute would represent
  2. Instead of a "headless" flag, could we use something that identifies test mode? Something like __main__.startswith('test')? Or possibly a variant on this sort of approach, where we document a specific attribute (like toga.App._test_mode) for this?

Hmm...I think I'm hesitant to over-implement this right now. While adding a "test" flag should be relatively straightforward, I don't feel like I understand everything it should or could be doing...especially since it would only drive functionality on a PoC backend. Additionally, I'm reluctant to try to automate setting it based on the environment; while that discussion focuses on pytest, there are other ways to run test suites.

Sooo, I've just added a hacky flag to the Textual backend itself. I imagine this being replaced once the testing strategy for Textual is more fleshed out.

I'm still open to other implementations, though.

@rmartin16 changed the title from "Experiment with running testbed with Textual" to "Run testbed with Textual in CI" on Sep 12, 2024
@rmartin16 marked this pull request as ready for review September 12, 2024 20:46

@freakboy3742 (Member)

Sooo, I've just added a hacky flag to the Textual backend itself. I imagine this being replaced once the testing strategy for Textual is more fleshed out.

I'm still open to other implementations, though.

That's fair. As you've noted, Textual is still an experimental backend; the changes here are limited to either the testbed app, or the Textual backend; I can live with that until we get a more substantial textual testbed.

@freakboy3742 merged commit b0ab9be into beeware:main on Sep 12, 2024
38 checks passed
@rmartin16 deleted the textual-testbed branch September 13, 2024 00:31