Do not fail on exception message formatting #2061

Open
wants to merge 5 commits into master

thomasfire
Contributor

ARTIQ Pull Request

Description of Changes

Related Issue

Closes #2058

Type of Changes

Type
✓ 🐛 Bug fix

Steps

Testing

See the example from #2058:

Core Device Traceback:
Traceback (most recent call first):
  File "mar_exception.py", line 12, in throw
    raise Exception("{foo}")
  File "mar_exception.py", line 9, in ?? (RA=+0x184)
    self.throw()
builtins.Exception(5): {foo}

End of Core Device Traceback

Traceback (most recent call last):
  File "/nix/store/llvq4q4jdv072x9s0zwckd3307dhdvr9-python3.10-artiq-8.8314.11c6ebc.beta/bin/.artiq_run-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/frontend/artiq_run.py", line 226, in main
    return run(with_file=True)
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/frontend/artiq_run.py", line 212, in run
    raise exn
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/frontend/artiq_run.py", line 205, in run
    exp_inst.run()
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/language/core.py", line 54, in run_on_core
    return getattr(self, arg).run(run_on_core, ((self,) + k_args), k_kwargs)
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/coredevice/core.py", line 140, in run
    self._run_compiled(kernel_library, embedding_map, symbolizer, demangler)
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/coredevice/core.py", line 130, in _run_compiled
    self.comm.serve(embedding_map, symbolizer, demangler)
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/coredevice/comm_kernel.py", line 716, in serve
    self._serve_exception(embedding_map, symbolizer, demangler)
  File "/nix/store/3ry1d3n6c770cxk6zjl0jixgl4sd1y5f-python3-3.10.9-env/lib/python3.10/site-packages/artiq/coredevice/comm_kernel.py", line 698, in _serve_exception
    raise python_exn
Exception: {foo}
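
For context, the failure mode behind #2058 can be reproduced on the host without a core device. The sketch below assumes the message template is passed to `str.format` with positional parameters only, as the snippets under review suggest; `try_format` and the `params` values are hypothetical names used for illustration.

```python
def try_format(template, params):
    """Return the formatted string, or the name of the exception str.format raised."""
    try:
        return template.format(*params)
    except (KeyError, IndexError, ValueError) as exc:
        return type(exc).__name__


params = [0, 0, 0]  # hypothetical positional parameters

print(try_format("{foo}", params))      # named field, but no keyword arguments are given
print(try_format("{3}", params))        # positional index beyond the supplied parameters
print(try_format("{", params))          # unbalanced brace
print(try_format("slack {0}", params))  # well-formed template
```

Only the last template formats cleanly; the other three raise, which is why a user message that happens to contain braces could break exception reporting before this change.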

Licensing

See copyright & licensing for more info.
ARTIQ files that do not contain a license header are copyrighted by M-Labs Limited and are licensed under LGPLv3+.

@dnadlinger
Collaborator

Could you add some regression tests, please? This seems quite brittle otherwise.

@dnadlinger
Collaborator

It might be worth wrapping the format calls for the core-device case (where we do want it to happen) in a try/except with a better error message, as the user experience for that case seems quite bad (see original report).

@thomasfire force-pushed the 2058-do-not-fmt-host-exceptions branch from 11c6ebc to 2df5ccf on March 20, 2023 04:43
@thomasfire changed the title from "Do not format host messages" to "Do not fail on exception message formatting" on Mar 20, 2023
@thomasfire

This comment was marked as resolved.

    message = nested_exceptions[0][1].format(*nested_exceptions[0][2])
except Exception as ex:
    message = nested_exceptions[0][1]
    logger.error("Couldn't format exception message `{}`: {}: {}".format(message, type(ex).__name__, str(ex)))
Member

As mentioned in several previous review comments, and consistently with the rest of the codebase, the pattern to use here is exc_info=True.
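
A minimal sketch of that logging pattern, assuming a module-level `logger`; the helper name `format_message` is hypothetical and this is not the code from the PR:

```python
import logging

logger = logging.getLogger(__name__)


def format_message(template, params):
    """Return template.format(*params), or the raw template if formatting fails."""
    try:
        return template.format(*params)
    except Exception:
        # exc_info=True attaches the active exception and its traceback to the
        # log record, so the type and message need not be formatted by hand.
        logger.error("Couldn't format exception message %r", template, exc_info=True)
        return template
```

With this shape, the fallback still returns the unformatted message, and the log output carries the full traceback of the formatting error rather than a hand-built summary.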

@sbourdeauducq
Member

Please check your diff and remove the binary file that shouldn't be committed.

try:
    lines.append("{}({}): {}".format(name, exn_id, message.format(*params)))
except:
    lines.append("{}({}): {}".format(name, exn_id, message))
Member

Silent failure is not a good idea.

Contributor Author

It is not a failure: if the user has a message that would break message.format(*params), it will simply not be formatted, which is the intended behavior. The user is not able to use such messages anyway.

Member

Why are we even formatting those user exception messages anyway?

Contributor Author

Because the alternative is to add a marker at the compile stage and then transmit it with every exception message. Otherwise it will all be hackable one way or another.

Contributor Author

Also, changing the protocol and compiler seems to be overkill, since we do not assume it to be secure.

Member

In any case it should get documented that exception messages get formatted...
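
As an illustration of what such documentation might tell users, literal braces survive `str.format` when they are doubled. The sketch below assumes the message is formatted exactly once on the host; `escape_braces` is a hypothetical helper, not part of ARTIQ.

```python
def escape_braces(text):
    """Double every brace so that str.format reproduces the text verbatim."""
    return text.replace("{", "{{").replace("}", "}}")


raw = "{foo}"
escaped = escape_braces(raw)           # "{{foo}}"
round_trip = escaped.format(0, 0, 0)   # formatting turns "{{" and "}}" back into single braces
print(round_trip == raw)
```

If a message were formatted more than once, or not at all, the doubled braces would not round-trip this way, which is one reason the behavior is worth documenting.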

with self.assertLogs() as captured:
    with self.assertRaisesRegex(RTIOUnderflow,
                                re.compile(
                                    r"RTIO underflow at channel 0x\d+?:led\d*?, \d+? mu, slack -\d+? mu.*?RTIOUnderflow\(\d+\): RTIO underflow at channel 0x\d+?:led\d+?, \d+? mu, slack -\d+? mu",
Member

If you are relying on the device map then you need to update the CI to load it.
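
The nested `assertLogs` / `assertRaisesRegex` structure in the snippet under review can be sketched in isolation as follows. Everything here is a stand-in: `DemoUnderflow`, `failing_call`, the `"demo"` logger, and the message text are hypothetical and do not come from the ARTIQ test suite.

```python
import logging
import re
import unittest

logger = logging.getLogger("demo")


class DemoUnderflow(Exception):
    """Hypothetical stand-in for a core device exception."""


def failing_call():
    # Log an error and then raise, so that both context managers are exercised.
    logger.error("could not format message")
    raise DemoUnderflow("underflow at channel 0x1a:led0, 100 mu, slack -5 mu")


class ExceptionMessageTest(unittest.TestCase):
    def test_message_and_log(self):
        # The channel number is matched as hex digits, since it follows "0x".
        pattern = re.compile(
            r"underflow at channel 0x[0-9a-f]+:led\d+, \d+ mu, slack -\d+ mu")
        with self.assertLogs("demo", level="ERROR") as captured:
            with self.assertRaisesRegex(DemoUnderflow, pattern):
                failing_call()
        self.assertEqual(len(captured.records), 1)
```

Because the pattern includes the channel name (`led0` here), a test written this way only passes when the name is actually present in the message, which is the dependency on the device map raised in the comment above.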

@thomasfire force-pushed the 2058-do-not-fmt-host-exceptions branch 2 times, most recently from c5352c6 to 63754bc, on April 25, 2023 05:50
flake.nix Outdated
export ARTIQ_LOW_LATENCY=1

artiq_rtiomap --device-db $ARTIQ_ROOT/device_db.py device_map.bin
artiq_mkfs -s ip `python -c "import artiq.examples.kc705_nist_clock.device_db as ddb; print(ddb.core_addr)"`/24 -f device_map device_map.bin kasli.config
Collaborator

This is a bit confusing; it's not a Kasli.

Collaborator

@dnadlinger left a comment

Should be good to merge (squashing commits 1/2 and 4/5 while doing so).

@thomasfire force-pushed the 2058-do-not-fmt-host-exceptions branch from 63754bc to 732707c on September 29, 2023 02:18
@thomasfire
Contributor Author

> Should be good to merge (squashing commits 1/2 and 4/5 while doing so).

Thank you.
The commits seem to be squashed into one anyway; I will keep them in my repo for reference.

Comment on lines 57 to 61
def run(self):
    self.core.reset()
    for _ in range(1000):
        self.led.on()
        self.led.off()
Collaborator

This seems like a very roundabout/brittle way of inducing an underflow. Perhaps just at_mu(self.core.get_rtio_counter_mu() - 1000); self.led.on() or something like that?

Member

Actually, the original code should cause a sequence error, not an underflow. Did you actually get an underflow, @thomasfire?

Collaborator

I had the same observation, but figured the channel presumably has event replacement enabled, and maybe that is then the correct behavior?

Member

Event replacement is done after lane allocation.

Contributor Author

I got an underflow, as described in the tests.

Member

Then please investigate, it's not supposed to happen.

Collaborator

Maybe splitting this off into a separate issue and replacing this with a test case that actually is supposed to underflow is the way to go?

Contributor Author

In the docs:

If the current cursor is in the past, an artiq.coredevice.exceptions.RTIOUnderflow exception is thrown.

Isn't that exactly the case with at_mu(self.core.get_rtio_counter_mu() - 1000); self.led.on()? So it seems to be fine here.

@thomasfire force-pushed the 2058-do-not-fmt-host-exceptions branch from 732707c to 907a1a2 on October 16, 2023 04:46
Signed-off-by: Egor Savkin <es@m-labs.hk>
Also add tests for kernel exceptions

@thomasfire force-pushed the 2058-do-not-fmt-host-exceptions branch from 907a1a2 to bfe4a07 on June 28, 2024 04:34
Successfully merging this pull request may close these issues.

Exception marshalling vs. accidental format string syntax in messages
3 participants