add support to ignore ansi sequences when formatting usage display. Fixes #879 #880

Open: wants to merge 1 commit into base: main
5 changes: 5 additions & 0 deletions pkgs/args/CHANGELOG.md
@@ -1,3 +1,8 @@
## 2.7.0+1

* Fix usage column formatting to calculate correct string lengths when there are ANSI
coloring/styling escape sequences present

## 2.7.0

* Remove sorting of the `allowedHelp` argument in usage output. Ordering will
8 changes: 4 additions & 4 deletions pkgs/args/lib/src/usage.dart
@@ -149,16 +149,16 @@ class _Usage {
if (option.hide) continue;

// Make room in the first column if there are abbreviations.
- abbr = math.max(abbr, _abbreviation(option).length);
+ abbr = math.max(abbr, _abbreviation(option).lengthWithoutAnsi);

// Make room for the option.
title = math.max(
- title, _longOption(option).length + _mandatoryOption(option).length);
+ title, _longOption(option).lengthWithoutAnsi + _mandatoryOption(option).lengthWithoutAnsi);

// Make room for the allowed help.
if (option.allowedHelp != null) {
for (var allowed in option.allowedHelp!.keys) {
- title = math.max(title, _allowedTitle(option, allowed).length);
+ title = math.max(title, _allowedTitle(option, allowed).lengthWithoutAnsi);
}
}
}
@@ -218,7 +218,7 @@ class _Usage {

if (column < _columnWidths.length) {
// Fixed-size column, so pad it.
- _buffer.write(text.padRight(_columnWidths[column]));
+ _buffer.write(padRight(text, _columnWidths[column]));
} else {
// The last column, so just write it.
_buffer.write(text);
31 changes: 30 additions & 1 deletion pkgs/args/lib/src/utils.dart
@@ -3,9 +3,38 @@
// BSD-style license that can be found in the LICENSE file.
import 'dart:math' as math;


/// A utility class for finding and stripping ANSI codes from strings.
class _AnsiUtils {
static final String ansiCodePattern = [
Member

@lrhn lrhn Apr 29, 2025

It would be very nice to have a textual description of what the pattern is supposed to match.
It's hard enough to debug RegExps even if you have a specification of what they're intended to do.
Without that, it's impossible to tell whether it's right or wrong.

Is this a correct description:

  • One of U+001b or U+009b
  • Followed by any number of the characters: [, ], (, ), #, ;, ?
  • Followed by either:
    • zero or more letters or digits
    • followed by any number of repeats of:
      • ;
      • followed by any number of digits, letters, or characters from -/#^.:=?%@~_
  • or:
    • 1-4 digits
    • followed by any number of repeats of:
      • ;
      • followed by 0-4 digits
    • followed by one digit, letter other than QUVWXY or abdeopsuvwxz, or one of =><~.

I don't know what the official specification of this is.
I'd be tempted to just go with what Wikipedia says is a simple control sequence introduction:

r'\x1b\[[0-?]*[ -/]*[@-~]'

Using U+009B as well is dangerous. It means something else in both UTF-8 and Windows-1252 code pages. (But I guess we're working on Dart strings, so we're past encoding issues.)
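
As a rough sketch of the simpler approach suggested above, the following matches only CSI sequences: ESC, a literal `[`, any parameter bytes, any intermediate bytes, and one final byte. The names `csiPattern` and `stripCsi` are hypothetical and not part of `package:args`, and the sketch assumes that the literal `[` after ESC is intended.

```dart
/// Hypothetical pattern for a CSI sequence:
///   ESC `[`, parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F),
///   and a single final byte (0x40-0x7E).
/// Assumption: only the 7-bit introducer ESC `[` is recognized, not U+009B.
final RegExp csiPattern = RegExp(r'\x1b\[[0-?]*[ -/]*[@-~]');

/// Returns [source] with every CSI sequence removed.
String stripCsi(String source) => source.replaceAll(csiPattern, '');
```

With this pattern, `stripCsi('\x1B[1;34mBold Blue\x1B[0m')` yields `'Bold Blue'`, and non-color sequences such as `\x1B[2K` (erase line) are removed as well.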

'[\\u001B\\u009B][[\\]()#;?]*(?:(?:(?:[a-zA-Z\\d]*(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\\u0007)',
'(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-ntqry=><~]))'
].join('|');
Member

Use raw strings for RegExp patterns. No need to use lists and join, just have adjacent strings:

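  static const String ansiCodePattern =
     r'[\x1b\x9b][[\]()#;?]*' // Introducer, then any prefix characters.
     r'(?:' // Either:
     r'(?:' // an optional parameter list terminated by BEL (U+0007),
     r'(?:[a-zA-Z\d]*'
     r'(?:;[\-a-zA-Z\d/#&.:=?%@~_]*)*'
     r')?\x07'
     r')'
     r'|' // or:
     r'(?:' // optional numeric parameters and a final character.
     r'(?:\d{1,4}(?:;\d{0,4})*)?'
     r'[\dA-PR-TZcf-ntqry=><~]'
     r')'
     r')';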


static final RegExp ansiRegex = RegExp(ansiCodePattern);

static String stripAnsi(String source) {
return source.replaceAll(ansiRegex, '');
}

static bool hasAnsi(String source) {
return ansiRegex.hasMatch(source);
}
}

/// A utility extension on [String] to provide ANSI code stripping and length
/// calculation without ANSI codes.
extension StringUtils on String {
/// Returns the length of the string without ANSI codes.
int get lengthWithoutAnsi {
if (!_AnsiUtils.hasAnsi(this)) return length;
return _AnsiUtils.stripAnsi(this).length;
}
Member

This is more expensive than it needs to be.
You know the length of the total string, and you have the matches for every Ansi escape sequence, so you could just subtract the lengths of that, without needing to actually create a new string.

  int get lengthWithoutAnsi {
    var length = this.length;
    for (var escape in _AnsiUtils.ansiRegex.allMatches(this)) {
      length -= escape.end - escape.start;
    }
    return length;
  }

(If we are a little lucky, the RegExp match won't eagerly allocate a match string if we don't ask for it.)

}

/// Pads [source] to [length] by adding spaces at the end.
String padRight(String source, int length) =>
- source + ' ' * (length - source.length);
+ source + ' ' * (length - source.lengthWithoutAnsi);
Member

@lrhn lrhn Apr 29, 2025

I'd consider adding .ansiLength which only accumulates the length of the Ansi sequences,
so that it's just int get lengthWithoutAnsi => length - ansiLength;.
Then this can be source.padRight(length + source.ansiLength). (Using padRight is more efficient because it doesn't have to create the intermediate ' ' * ... string. I guess a good optimizing compiler can avoid that too, but it's better to be safe.)
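
One way the `ansiLength` suggestion above could look, as a minimal sketch rather than the PR's actual code. The extension name `AnsiLength`, the helper `padRightAnsi`, and the CSI-only pattern `_ansiPattern` are all assumptions made for illustration.

```dart
/// Assumed pattern: CSI sequences only (ESC `[` ... final byte).
final RegExp _ansiPattern = RegExp(r'\x1b\[[0-?]*[ -/]*[@-~]');

/// Hypothetical extension; not part of `package:args`.
extension AnsiLength on String {
  /// Number of UTF-16 code units occupied by ANSI sequences.
  int get ansiLength {
    var total = 0;
    for (final match in _ansiPattern.allMatches(this)) {
      // Use the match bounds so no substring has to be allocated.
      total += match.end - match.start;
    }
    return total;
  }

  /// Length of the string, ignoring ANSI sequences.
  int get lengthWithoutAnsi => length - ansiLength;
}

/// Pads [source] with spaces until it is [width] code units wide,
/// not counting ANSI sequences.
String padRightAnsi(String source, int width) =>
    source.padRight(width + source.ansiLength);
```

For `'\x1B[31mRed\x1B[0m'`, `ansiLength` is 9 and `lengthWithoutAnsi` is 3, so `padRightAnsi(..., 6)` appends three spaces. A string that is already wider than `width` comes back unchanged, because `String.padRight` never truncates.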


/// Wraps a block of text into lines no longer than [length].
///
2 changes: 1 addition & 1 deletion pkgs/args/pubspec.yaml
@@ -1,5 +1,5 @@
name: args
- version: 2.7.0
+ version: 2.7.0+1
description: >-
Library for defining parsers for parsing raw command-line arguments into a set
of options and values using GNU and POSIX style options.
57 changes: 57 additions & 0 deletions pkgs/args/test/utils_test.dart
@@ -16,6 +16,12 @@ final _indentedLongLineWithNewlines =
const _shortLine = 'Short line.';
const _indentedLongLine = ' This is an indented long line that needs to be '
'wrapped and indentation preserved.';
const _ansiReset = 'This is normal text. \x1B[0m<- Reset point.';
const _ansiBoldTextSpecificReset = 'This is normal, \x1B[1mthis is bold\x1B[22m, and this uses specific reset.';
const _ansiMixedStyles = 'Normal, \x1B[31mRed\x1B[0m, \x1B[1mBold\x1B[0m, \x1B[4mUnderline\x1B[0m, \x1B[1;34mBold Blue\x1B[0m, Normal again.';
const _ansiLongSequence = 'Start \x1B[1;3;4;5;7;9;31;42;38;5;196;48;5;226m Beaucoup formatting! \x1B[0m End';
const _ansiCombined256 = '\x1B[1;38;5;27;48;5;220mBold Bright Blue FG (27) on Gold BG (220)\x1B[0m';
const _ansiCombinedTrueColor = '\x1B[4;48;2;50;50;50;38;2;150;250;150mUnderlined Light Green FG on Dark Grey BG\x1B[0m';
Member

These are all of the form \x1b\[[\d;]*m. That doesn't cover most of the complexity of the RegExp.

If this is all the functionality needs to do, the RegExp shouldn't be as complicated. (It probably shouldn't, even if recognizing more escapes than just the color ones.)
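
To illustrate the trade-off described above, here is a small sketch of a pattern that recognizes only the color/style (SGR) form used by the test constants. `sgrPattern` and `visibleLength` are hypothetical names, not part of the PR.

```dart
/// Hypothetical pattern: SGR sequences only, i.e. ESC `[`, digits and
/// semicolons, then `m`.
final RegExp sgrPattern = RegExp(r'\x1b\[[\d;]*m');

/// Length of [text] after removing SGR sequences. Other escape sequences
/// are deliberately left in place and still count toward the length.
int visibleLength(String text) => text.replaceAll(sgrPattern, '').length;
```

This is enough for every test input in the diff: `visibleLength('This is normal text. \x1B[0m<- Reset point.')` is 36. It does not cover other escapes, so `visibleLength('\x1B[2Kabc')` is 7 rather than 3.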


void main() {
group('padding', () {
@@ -213,4 +219,55 @@ needs to be wrapped.
wrapTextAsLines('$_longLine \t'), equals(['$_longLine \t']));
});
});

group('text lengthWithoutAnsi is correct with no ANSI sequences', () {
test('lengthWithoutAnsi returns correct length on lines without ansi', () {
expect(_longLine.lengthWithoutAnsi, equals(_longLine.length));
});
test('lengthWithoutAnsi returns correct length on lines newlines and without ansi', () {
expect(_longLineWithNewlines.lengthWithoutAnsi, equals(_longLineWithNewlines.length));
});
test('lengthWithoutAnsi returns correct length on lines indented/newlines and without ansi', () {
expect(_indentedLongLineWithNewlines.lengthWithoutAnsi, equals(_indentedLongLineWithNewlines.length));
});
test('lengthWithoutAnsi returns correct length on short line without ansi', () {
expect(_shortLine.lengthWithoutAnsi, equals(_shortLine.length));
});
});

group('lengthWithoutAnsi is correct with no ANSI sequences', () {
test('lengthWithoutAnsi returns correct length on lines without ansi', () {
expect(_longLine.lengthWithoutAnsi, equals(_longLine.length));
});
test('lengthWithoutAnsi returns correct length on lines newlines and without ansi', () {
expect(_longLineWithNewlines.lengthWithoutAnsi, equals(_longLineWithNewlines.length));
});
test('lengthWithoutAnsi returns correct length on lines indented/newlines and without ansi', () {
expect(_indentedLongLineWithNewlines.lengthWithoutAnsi, equals(_indentedLongLineWithNewlines.length));
});
test('lengthWithoutAnsi returns correct length on short line without ansi', () {
expect(_shortLine.lengthWithoutAnsi, equals(_shortLine.length));
});
});

group('lengthWithoutAnsi is correct with variety of ANSI sequences', () {
test('lengthWithoutAnsi returns correct length - ansi reset', () {
expect(_ansiReset.lengthWithoutAnsi, equals(36));
});
test('lengthWithoutAnsi returns correct length - ansi bold, bold specific reset', () {
expect(_ansiBoldTextSpecificReset.lengthWithoutAnsi, equals(59));
});
test('lengthWithoutAnsi returns correct length - ansi mixed styles', () {
expect(_ansiMixedStyles.lengthWithoutAnsi, equals(54));
});
test('lengthWithoutAnsi returns correct length- ansi long sequence', () {
expect(_ansiLongSequence.lengthWithoutAnsi, equals(32));
});
test('lengthWithoutAnsi returns correct length - ansi 256 color sequence', () {
expect(_ansiCombined256.lengthWithoutAnsi, equals(41));
});
test('lengthWithoutAnsi returns correct length - ansi true color sequences', () {
expect(_ansiCombinedTrueColor.lengthWithoutAnsi, equals(41));
});
});
}