-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: Improved treatment of whitespace in assembly format #3044
Conversation
@@ -487,12 +487,21 @@ def parse_optional_group(self) -> FormatDirective: | |||
anchor = then_elements[-1] | |||
self.parse_punctuation("?") | |||
|
|||
if not then_elements: | |||
# Pull whitespace element of front, as they are not parsed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be keen to know if there's a more "pythonic" way of doing this.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3044 +/- ##
==========================================
- Coverage 89.86% 89.86% -0.01%
==========================================
Files 415 415
Lines 52060 52078 +18
Branches 8045 8048 +3
==========================================
+ Hits 46784 46799 +15
- Misses 3983 3984 +1
- Partials 1293 1295 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!! stencil dialect be pretty 👌
And I thought it was partially supported ahah, but it was actually accidentally accepted as a degenerate keyword, is that right?
whitespace_counter = 0 | ||
while whitespace_counter < len(then_elements) and isinstance( | ||
then_elements[whitespace_counter], WhitespaceDirective | ||
): | ||
whitespace_counter += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a maybe Pythonic way?
whitespace_counter = 0 | |
while whitespace_counter < len(then_elements) and isinstance( | |
then_elements[whitespace_counter], WhitespaceDirective | |
): | |
whitespace_counter += 1 | |
whitespace_counter = next( | |
( | |
i | |
for i, e in enumerate(then_elements) | |
if not isinstance(e, WhitespaceDirective) | |
), | |
len(then_elements), | |
) |
I don't find it necessarily clearer myself, but answering as requested!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I personally find it less clear what this does, but happy to go either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an attempt at pedagogy:
#next: get the next thing of an iterable
whitespace_counter = next(
(
# Just list the indices
i
# enumerate: list all elements and their indices ((0, l[0]), (1, l[1]), ..., (len(l)-1, ...))
for i, e in enumerate(then_elements)
# Here, filter those pairs to get only the non-whitespace ones
if not isinstance(e, WhitespaceDirective)
),
# Default value for next to return, if the iterable is empty/exhausted
# (Here, if there are no elements or no non-whitespace elements)
# Could use a None or what, I was just golfing for getting the exact same counting
# with those here :) Do what suits you
len(then_elements),
)
So the above is literally "give me the index of the first non-whitespace element, or the number of elements if none"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I personally find it less clear what this does, but happy to go either way
Yeah I don't mind myself; as long as it works and is understandable - I'd say both styles are both for the everyday xDSL dev!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if all the elements are whitespace then it should throw an error (as the next part of the code does)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep this one kept it, you get the length of the list out so you would trigger the exception down
Otherwise, just remove the len(then_elements)
from there and next()
itself would throw an error, but not a nice hand-written one; You could thus try/catch but I'd find this way OTT here.
You could also just replace len(then_elements)
in there by None
and just
if whitespace_counter is None:
# fancy_error
To make it less cognitively loaded!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend something like this:
whitespace_counter = 0 | |
while whitespace_counter < len(then_elements) and isinstance( | |
then_elements[whitespace_counter], WhitespaceDirective | |
): | |
whitespace_counter += 1 | |
first_non_whitespace_index = 0 | |
for first_non_whitespace_index, el in enumerate(then_elements): | |
if isinstance(el, WhitespaceDirective): | |
break |
The content parsed for any whitespace directive was whatever whitespace followed the directive. It just so happens that in nearly every case this is a single space for both cases. This meant |
Also I don't know much about the stencil dialect, but is the result of this pr correct for it? |
Yeah, the stencil dialect is a modified one anyway so there is no strict compatibility to have; thanks for asking! |
"""The whitespace that should be printed.""" | ||
|
||
def parse(self, parser: Parser, state: ParsingState) -> None: | ||
pass | ||
|
||
def print(self, printer: Printer, state: PrintingState, op: IRDLOperation) -> None: | ||
printer.print(self.whitespace) | ||
state.last_was_punctuation = False | ||
state.last_was_punctuation = len(self.whitespace) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Length of strings is O(n) :)
state.last_was_punctuation = len(self.whitespace) == 0 | |
state.last_was_punctuation = not bool(self.whitespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This string always has length 0 or 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could compromise on the simpler self.whitespace == ""
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
I feel that this is pretty clean now (an amalgamation of some of the suggestions): # Pull whitespace element of front, as they are not parsed
first_non_whitespace_index = None
for i, x in enumerate(then_elements):
if not isinstance(x, WhitespaceDirective):
first_non_whitespace_index = i
break
if first_non_whitespace_index is None:
self.raise_error("An optional group must have a non-whitespace directive") |
If no-one objects to the above way of getting the index I think this is ready to merge? |
This makes the following changes:
``
stencil.store
These changes bring the assembly format closer to the assembly format of mlir