Skip to content

Conversation

connieyzhu
Copy link
Contributor

@connieyzhu connieyzhu commented Aug 5, 2024

This patch extends support for using cat -e in lit tests that use the internal
shell. The logic for enabling options is changed to accommodate for the
possibility of having several options. This patch is dependent on #101530.

Fixes #102377

This patch extends support for using cat -e in lit tests that use the internal
shell. The logic for enabling options is changed to accommodate for the
possibility of having several options.
This patch changes the storing of enabled options from a list of bools to a
dataclass for better efficiency when accessing and checking the options.
@llvmbot
Copy link
Member

llvmbot commented Aug 5, 2024

@llvm/pr-subscribers-testing-tools

Author: Connie (connieyzhu)

Changes

This patch extends support for using cat -e in lit tests that use the internal
shell. The logic for enabling options is changed to accommodate for the
possibility of having several options. This patch is dependent on #101530.


Full diff: https://github.com/llvm/llvm-project/pull/102061.diff

1 Files Affected:

  • (modified) llvm/utils/lit/lit/builtin_commands/cat.py (+34-19)
diff --git a/llvm/utils/lit/lit/builtin_commands/cat.py b/llvm/utils/lit/lit/builtin_commands/cat.py
index d6dbe889cc2b1..14056e3ccd444 100644
--- a/llvm/utils/lit/lit/builtin_commands/cat.py
+++ b/llvm/utils/lit/lit/builtin_commands/cat.py
@@ -1,29 +1,42 @@
 import getopt
 import sys
+from dataclasses import dataclass, fields
 
 try:
     from StringIO import StringIO
 except ImportError:
     from io import StringIO
 
+@dataclass
+class Options:
+    show_ends: bool
+    show_nonprinting: bool
 
-def convertToCaretAndMNotation(data):
+def convertTextNotation(data, options):
     newdata = StringIO()
     if isinstance(data, str):
         data = bytearray(data.encode())
-
+    
     for intval in data:
-        if intval == 9 or intval == 10:
-            newdata.write(chr(intval))
-            continue
-        if intval > 127:
-            intval = intval - 128
-            newdata.write("M-")
-        if intval < 32:
-            newdata.write("^")
-            newdata.write(chr(intval + 64))
-        elif intval == 127:
-            newdata.write("^?")
+        if options.show_ends:
+            if intval == 10:
+                newdata.write("$")
+                newdata.write(chr(intval))
+                continue
+        if options.show_nonprinting:
+            if intval == 9 or intval == 10:
+                newdata.write(chr(intval))
+                continue
+            if intval > 127:
+                intval = intval - 128
+                newdata.write("M-")
+            if intval < 32:
+                newdata.write("^")
+                newdata.write(chr(intval + 64))
+            elif intval == 127:
+                newdata.write("^?")
+            else:
+                newdata.write(chr(intval))
         else:
             newdata.write(chr(intval))
 
@@ -32,9 +45,9 @@ def convertToCaretAndMNotation(data):
 
 def main(argv):
     arguments = argv[1:]
-    short_options = "v"
+    short_options = "ve"
     long_options = ["show-nonprinting"]
-    show_nonprinting = False
+    enabled_options = Options(show_ends=False, show_nonprinting=False)
 
     try:
         options, filenames = getopt.gnu_getopt(arguments, short_options, long_options)
@@ -43,8 +56,10 @@ def main(argv):
         sys.exit(1)
 
     for option, value in options:
-        if option == "-v" or option == "--show-nonprinting":
-            show_nonprinting = True
+        if option == "-v" or option == "--show-nonprinting" or option == "-e":
+            enabled_options.show_nonprinting = True
+        if option == "-e":
+            enabled_options.show_ends = True
 
     writer = getattr(sys.stdout, "buffer", None)
     if writer is None:
@@ -69,8 +84,8 @@ def main(argv):
                 fileToCat = open(filename, "rb")
                 contents = fileToCat.read()
 
-            if show_nonprinting:
-                contents = convertToCaretAndMNotation(contents)
+            if True in fields(enabled_options):
+                contents = convertTextNotation(contents, enabled_options)
             elif is_text:
                 contents = contents.encode()
             writer.write(contents)

@connieyzhu
Copy link
Contributor Author

CC: @ilovepi @petrhosek

Copy link

github-actions bot commented Aug 5, 2024

✅ With the latest revision this PR passed the Python code formatter.

@ilovepi ilovepi requested review from ilovepi and petrhosek August 5, 2024 21:46
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly OK, with some minor comments, but don't the tests from #101530 need to be updated in this patch?

Comment on lines 11 to 14
@dataclass
class Options:
show_ends: bool
show_nonprinting: bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some comment/documentation on this classes intended usage? probably at least worth doing for the fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 90 to 91
if True in fields(enabled_options):
contents = convertTextNotation(contents, enabled_options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if I add a field to Options that isn't either show_nonprinting or show_ends?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both show_nonprinting and show_ends are False and this added field was True, convertTextNotation has an else statement on lines 43/44 that should account for this case, and not change the text (assuming that there is no existing implementation for this new option). But this is just a precaution, because I don't think this case should technically ever occur though - any option that exists in Options should have an implementation in convertTextNotation. Does this address your concern or am I misunderstanding?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that means well run the conversion code when we don't need to right? It's preferable in my opinion to avoid that work if we don't have to, specially since that routine a character by character copy.

I'm also wondering if we should have an early exit from the routine if the neither option is set. WDYT?

Comment on lines 24 to 25
if options.show_ends:
if intval == 10:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can combine these and drop a level of nesting. Could also be if/else if/else but that isn't too important since there's a continue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@connieyzhu
Copy link
Contributor Author

connieyzhu commented Aug 5, 2024

Mostly OK, with some minor comments, but don't the tests from #101530 need to be updated in this patch?

@ilovepi I wouldn't be able to update the tests in this patch until #101530 has landed right, since I would need to rebase first?

Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly OK, with some minor comments, but don't the tests from #101530 need to be updated in this patch?

Comment on lines 90 to 91
if True in fields(enabled_options):
contents = convertTextNotation(contents, enabled_options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that means well run the conversion code when we don't need to right? It's preferable in my opinion to avoid that work if we don't have to, specially since that routine a character by character copy.

I'm also wondering if we should have an early exit from the routine if the neither option is set. WDYT?

This patch adds a new bool variable convert_text that is turned on if
show_nonprinting, show_ends, or any option added in the future that is
meant to modify text is True. This fixes the faulty logic that assumes
that any option that is True is one that will modify text. Now,
convertTextNotation() will only be called for options that need it.
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks basically fine to me (once tests are updated). One question I do have is how many tests actually use this functionality and do they really need to?

@connieyzhu
Copy link
Contributor Author

One question I do have is how many tests actually use this functionality and do they really need to?

@jh7370 I believe only one test currently uses cat -e. I'm not quite sure what its purpose is, but it's in TestUseColor.test in lldb.

This patch adds an assert to guarantee that when calling
convertTextNotation(), the show_ends and show_nonprinting options are
turned on. Some logic is also simplified when converting characters.
This patch allows for lit's built-in cat to use the -E option. The
functionality was already implemented in previous patches, but this
patch allows for the parsing and usage of -E.
@jh7370
Copy link
Collaborator

jh7370 commented Aug 8, 2024

One question I do have is how many tests actually use this functionality and do they really need to?

@jh7370 I believe only one test currently uses cat -e. I'm not quite sure what its purpose is, but it's in TestUseColor.test in lldb.

Thanks, I don't see the need for the $ line ending behaviour in that test. It's possible the non-printing characters behaviour is needed though. @JDevlieghere, since you last touched that test, can you confirm the purpose of the -e usage is to print non-printing characters?

We should avoid adding functionality that isn't needed, so for example, -v might be more appropriate in that test, which would then allow us to avoid adding -e functionality.

@petrhosek
Copy link
Member

If we decide to go ahead with this change, I'd consider replacing the use of getopt with argparse which is more idiomatic in modern Python and should simplify the implementation.

ilovepi pushed a commit that referenced this pull request Aug 20, 2024
…ernal shell (#104878)

This patch changes the test that uses the `cat -e` option to `cat -v` so
that the test can be run using lit's internal shell. For `cat`, the `-v`
option prints non-printing characters in ^ and M- notation, while the
`-e` option adds `$` to the end of lines in addition to printing
non-printing characters in ^ and M- notation. This is an alternative
patch to #102061, opting to
rewrite the test that uses `cat -e` instead of extending support to the
`-e` option.

Fixes #102377
@connieyzhu
Copy link
Contributor Author

Decided to go with the solution in #104878, so I will be closing this PR.

@connieyzhu connieyzhu closed this Aug 21, 2024
@connieyzhu connieyzhu deleted the lldb-cat branch August 22, 2024 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[llvm-lit] lit internal shell does not support -e option for cat commands

5 participants