-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Text is wrapped at small and inconsistent widths when attempting to wrap at a very large width. #416
Comments
This line in the optimized-fit algorithm is suspicious. It subtracts the target width from the line width where the target width is derived from the width given to I believe that subtraction could underflow for inputs that aren't as pathological as |
Hi @olson-sean-k, thanks for the test case and the excellent analysis. I have a test cases for wrapping with the maximum line width, but perhaps I only checked the first-fit algorithm... I'll have to check tomorrow. As for fixing this, I'm not sure exactly how, unfortunately. The algorithm does not lend itself well to saturated addition/subtraction (unfortunately!) and so I've been fighting with corner cases like this one for a while. Basically, the original algorithm assumed infinite precision math (I ported it from Python) and this breaks down in Rust. |
I found the test mentioned above and while it does use the optimal-fit algorithm, it only tested a very short string: #[test]
fn max_width() {
assert_eq!(wrap("foo bar", usize::max_value()), vec!["foo bar"]);
} That is of course rather unfortunate and something I will dig into. |
Thanks for taking a look!
I'm not sure this would be ideal, but maybe the optimized-fit algorithm could use (or fall back to) something like That may be overkill though and may decrease performance for common use cases that work properly today. If the bounds are known, maybe the algorithm could accept a fixed-sized width (e.g., |
Yeah, you're right: we could go all-in and use a big-int crate of some sort... but I feel it's overkill as you suggest. For "normal" terminal widths, the wrapping works fine from what I can tell, so I would rather limit the size of the input widths. I've been playing with that in the past, but I didn't finish it. |
I've done more testing with a build where line widths are limited to I will have to clean it up a bit and try to understand the consequences of limiting the line width in this way, especially since I'm also trying to make the library work outside of the console (as the Wasm demo). See also #247 for related discussions. |
This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for the line widths. This could make the optimal-fit wrapping algorithm overflow when it tries to compute the optimal wrapping cost. The problem is that the algorithm computes a cost using integer values formed by (line_width - target_width)**2 When `line_width` is near `usize::MAX`, this computation can easily overflow. By using an `f64` for the cost computation, we achieve two things: * A much larger range for the cost computation: `f64::MAX` is about 1.8e308 whereas `u64::MAX` is only 1.8e19. Computing the cost with a fragment width in the range of `u64`, will thus not exceed 3e38, something which is easily represented with a `f64`. This means that wrapping fragments derived from a `&str` cannot overflow. Overflows can still be triggered when fragments with extreme proportions are formed directly. The boundary seems to be around 1e170 with fragment widths above this limit triggering overflows. * Applications which wrap text using proportional fonts will already be operating with widths measured in floating point units. Using such units internally makes life easier for such applications, as shown by the changes in the Wasm demo. Fixes #247 Fixes #416
I've been testing this a lot more and I believe the solution in #421 will work well: use |
The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties since all penalties are there to discourage certain features: * `nline_penalty` discourages breaks with more lines than necessary, * `overflow_penalty` discourages lines longer than the line width, * `short_last_line_penalty` discourages short last lines, * `hyphen_penalty` discourages hyphenation Making this change surfaces the overflow bug behind #247 and #416. This will be fixed next via #421 and this commit can be seen as a way of simplifying that PR.
The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties since all penalties are there to discourage certain features: * `nline_penalty` discourages breaks with more lines than necessary, * `overflow_penalty` discourages lines longer than the line width, * `short_last_line_penalty` discourages short last lines, * `hyphen_penalty` discourages hyphenation Making this change surfaces the overflow bug behind #247 and #416. This will be fixed next via #421 and this commit can be seen as a way of simplifying that PR.
The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties since all penalties are there to discourage certain features: * `nline_penalty` discourages breaks with more lines than necessary, * `overflow_penalty` discourages lines longer than the line width, * `short_last_line_penalty` discourages short last lines, * `hyphen_penalty` discourages hyphenation Making this change surfaces the overflow bug behind #247 and #416. This will be fixed next via #421 and this commit can be seen as a way of simplifying that PR.
The optimization problem solved by the optimal-fit algorithm is fundamentally a minimization problem. It is therefore not sensible to allow negative penalties since all penalties are there to discourage certain features: * `nline_penalty` discourages breaks with more lines than necessary, * `overflow_penalty` discourages lines longer than the line width, * `short_last_line_penalty` discourages short last lines, * `hyphen_penalty` discourages hyphenation Making this change surfaces the overflow bug behind #247 and #416. This will be fixed next via #421 and this commit can be seen as a way of simplifying that PR.
This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for the line widths. This could make the optimal-fit wrapping algorithm overflow when it tries to compute the optimal wrapping cost. The problem is that the algorithm computes a cost using integer values formed by (line_width - target_width)**2 When `line_width` is near `usize::MAX`, this computation can easily overflow. By using an `f64` for the cost computation, we achieve two things: * A much larger range for the cost computation: `f64::MAX` is about 1.8e308 whereas `u64::MAX` is only 1.8e19. Computing the cost with a fragment width in the range of `u64`, will thus not exceed 3e38, something which is easily represented with a `f64`. This means that wrapping fragments derived from a `&str` cannot overflow. Overflows can still be triggered when fragments with extreme proportions are formed directly. The boundary seems to be around 1e170 with fragment widths above this limit triggering overflows. * Applications which wrap text using proportional fonts will already be operating with widths measured in floating point units. Using such units internally makes life easier for such applications, as shown by the changes in the Wasm demo. Fixes #247 Fixes #416
This changes the type used for internal width computations in the wrap algorithms. Before, we used `usize` to represent the fragment widths and for the line widths. This could make the optimal-fit wrapping algorithm overflow when it tries to compute the optimal wrapping cost. The problem is that the algorithm computes a cost using integer values formed by (line_width - target_width)**2 When `line_width` is near `usize::MAX`, this computation can easily overflow. By using an `f64` for the cost computation, we achieve two things: * A much larger range for the cost computation: `f64::MAX` is about 1.8e308 whereas `u64::MAX` is only 1.8e19. Computing the cost with a fragment width in the range of `u64`, will thus not exceed 3e38, something which is easily represented with a `f64`. This means that wrapping fragments derived from a `&str` cannot overflow. Overflows can still be triggered when fragments with extreme proportions are formed directly. The boundary seems to be around 1e170 with fragment widths above this limit triggering overflows. * Applications which wrap text using proportional fonts will already be operating with widths measured in floating point units. Using such units internally makes life easier for such applications, as shown by the changes in the Wasm demo. Fixes #247 Fixes #416
When wrapping text at a very large width (e.g.,
usize::MAX
) viatextwrap::wrap
, it is instead wrapped at small and inconsistent widths. This code......prints this:
Note that the wrapped width is not consistent, as the word "English" could fit on the first line without exceeding the width of the last line. I would not expect the text to be wrapped here at all, since it never reaches a width of
usize::MAX
. I'm not sure at what large widths this behavior begins to manifest.The text was updated successfully, but these errors were encountered: