-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not fail on exception message formatting #2061
base: master
Are you sure you want to change the base?
Do not fail on exception message formatting #2061
Conversation
Could you add some regression tests, please? This seems quite brittle otherwise. |
It might be worth wrapping the format calls for the core-device case (where we do want it to happen) in a try/except with a better error message, as the user experience for that case seems quite bad (see original report). |
11c6ebc
to
2df5ccf
Compare
This comment was marked as resolved.
This comment was marked as resolved.
a28219a
to
c347bb2
Compare
artiq/coredevice/comm_kernel.py
Outdated
message = nested_exceptions[0][1].format(*nested_exceptions[0][2]) | ||
except Exception as ex: | ||
message = nested_exceptions[0][1] | ||
logger.error("Couldn't format exception message `{}`: {}: {}".format(message, type(ex).__name__, str(ex))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in several previous review comments, and consistently with the rest of the codebase, the pattern to use here is exc_info=True
.
Please check your diff and remove the binary file that shouldn't be committed. |
try: | ||
lines.append("{}({}): {}".format(name, exn_id, message.format(*params))) | ||
except: | ||
lines.append("{}({}): {}".format(name, exn_id, message)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent failure is not a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a failure, if user will have some message which would break the message.format(*params)
, it will just not be formatted, which is intended behavior. User is not able to use such messages anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we even formatting those user exception messages anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the alternative is to add marker on the compile stage, and then transmit it each time with exception message. Otherwise it all will be hackable one way or another
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also changing the protocol and compiler seems to be overkill, since we do not assume it to be secure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case it should get documented that exception messages get formatted...
with self.assertLogs() as captured: | ||
with self.assertRaisesRegex(RTIOUnderflow, | ||
re.compile( | ||
r"RTIO underflow at channel 0x\d+?:led\d*?, \d+? mu, slack -\d+? mu.*?RTIOUnderflow\(\d+\): RTIO underflow at channel 0x\d+?:led\d+?, \d+? mu, slack -\d+? mu", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are relying on the device map then you need to update the CI to load it.
c5352c6
to
63754bc
Compare
flake.nix
Outdated
export ARTIQ_LOW_LATENCY=1 | ||
|
||
artiq_rtiomap --device-db $ARTIQ_ROOT/device_db.py device_map.bin | ||
artiq_mkfs -s ip `python -c "import artiq.examples.kc705_nist_clock.device_db as ddb; print(ddb.core_addr)"`/24 -f device_map device_map.bin kasli.config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit confusing; it's not a Kasli.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be good to merge (squashing commits 1/2 and 4/5 while doing so).
63754bc
to
732707c
Compare
Thank you. |
def run(self): | ||
self.core.reset() | ||
for _ in range(1000): | ||
self.led.on() | ||
self.led.off() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a very roundabout/brittle way of inducing an underflow. Perhaps just at_mu(self.core.get_rtio_counter_mu() - 1000); self.led.on()
or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the original code should cause a sequence error, not underflow. Did you actually get underflow @thomasfire ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the same observation, but figured the channel presumably has event replacement enabled, and maybe that is then the correct behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Event replacement is done after lane allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got underflow, as described in the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then please investigate, it's not supposed to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe splitting this off into a separate issue and replacing this with a test case that actually is supposed to underflow is the way to go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the docs:
If the current cursor is in the past, an artiq.coredevice.exceptions.RTIOUnderflow exception is thrown.
Isn't it actually this at at_mu(self.core.get_rtio_counter_mu() - 1000); self.led.on()
? So it seems to be fine here
732707c
to
907a1a2
Compare
Signed-off-by: Egor Savkin <es@m-labs.hk>
Also add tests for kernel exceptions Signed-off-by: Egor Savkin <es@m-labs.hk>
Signed-off-by: Egor Savkin <es@m-labs.hk>
Signed-off-by: Egor Savkin <es@m-labs.hk>
Signed-off-by: Egor Savkin <es@m-labs.hk>
907a1a2
to
bfe4a07
Compare
ARTIQ Pull Request
Description of Changes
Related Issue
Closes #2058
Type of Changes
Steps
Testing
See example from #2058 :
Licensing
See copyright & licensing for more info.
ARTIQ files that do not contain a license header are copyrighted by M-Labs Limited and are licensed under LGPLv3+.