Skip to content

Commit 9ad3ef1

Browse files
authored
Rollup merge of #98814 - fmease:minimal-fix-for-issue-97933, r=GuillaumeGomez
rustdoc: Censor certain complex unevaluated const exprs Fixes #97933. This is more of a hotfix for the aforementioned issue. By that, I mean that my proposed patch is not the best solution but one that does not change as much existing code. It treats symptoms rather than the root cause. This PR “censors” certain complex unevaluated constant expressions like `match`es, blocks, function calls, struct literals etc. by pretty-printing them as `_` / `{ _ }` (number and string literals, paths and `()` are still printed as one would expect). Resorting to this placeholder is preferable to printing the full expression verbatim since they can be quite large and verbose resulting in an unreadable mess in the generated documentation. Further, mindlessly printing the const would leak private and `doc(hidden)` struct fields (#97933), at least in the current stable & nightly implementations which rely on `span_to_snippet` (!) and `rustc_hir_pretty::id_to_string`. The censoring of _verbose_ expressions is probably going to stay longer term. However, in regards to private and `doc(hidden)` struct fields, I have a more proper fix in mind which I have already partially implemented locally and for which I am going to open a separate PR sometime soon. For that, I was already in contact with `@GuillaumeGomez.` The proper fix involves rustdoc not falling back on pretty-printing unevaluated consts so easily (what this PR is concerned about) and instead preferring to print evaluated consts which contain more information allowing it to selectively hide private and `doc(hidden)` fields, create hyperlinks etc. generally making the output more granular and precise (compared to the brutal `_` placeholder). Unfortunately, I was a bit too late and the issue just hit stable (1.62). Should this be backported to beta or even a potential 1.62.1? r? `@GuillaumeGomez`
2 parents 2a40911 + d3181a9 commit 9ad3ef1

8 files changed

+259
-10
lines changed

Diff for: src/librustdoc/clean/utils.rs

+87-6
Original file line numberDiff line numberDiff line change
@@ -340,17 +340,98 @@ pub(crate) fn is_literal_expr(tcx: TyCtxt<'_>, hir_id: hir::HirId) -> bool {
340340
false
341341
}
342342

343+
/// Build a textual representation of an unevaluated constant expression.
344+
///
345+
/// If the const expression is too complex, an underscore `_` is returned.
346+
/// For const arguments, it's `{ _ }` to be precise.
347+
/// This means that the output is not necessarily valid Rust code.
348+
///
349+
/// Currently, only
350+
///
351+
/// * literals (optionally with a leading `-`)
352+
/// * unit `()`
353+
/// * blocks (`{ … }`) around simple expressions and
354+
/// * paths without arguments
355+
///
356+
/// are considered simple enough. Simple blocks are included since they are
357+
/// necessary to disambiguate unit from the unit type.
358+
/// This list might get extended in the future.
359+
///
360+
/// Without this censoring, in a lot of cases the output would get too large
361+
/// and verbose. Consider `match` expressions, blocks and deeply nested ADTs.
362+
/// Further, private and `doc(hidden)` fields of structs would get leaked
363+
/// since HIR datatypes like the `body` parameter do not contain enough
364+
/// semantic information for this function to be able to hide them –
365+
/// at least not without significant performance overhead.
366+
///
367+
/// Whenever possible, prefer to evaluate the constant first and try to
368+
/// use a different method for pretty-printing. Ideally this function
369+
/// should only ever be used as a fallback.
343370
pub(crate) fn print_const_expr(tcx: TyCtxt<'_>, body: hir::BodyId) -> String {
344371
let hir = tcx.hir();
345372
let value = &hir.body(body).value;
346373

347-
let snippet = if !value.span.from_expansion() {
348-
tcx.sess.source_map().span_to_snippet(value.span).ok()
349-
} else {
350-
None
351-
};
374+
#[derive(PartialEq, Eq)]
375+
enum Classification {
376+
Literal,
377+
Simple,
378+
Complex,
379+
}
352380

353-
snippet.unwrap_or_else(|| rustc_hir_pretty::id_to_string(&hir, body.hir_id))
381+
use Classification::*;
382+
383+
fn classify(expr: &hir::Expr<'_>) -> Classification {
384+
match &expr.kind {
385+
hir::ExprKind::Unary(hir::UnOp::Neg, expr) => {
386+
if matches!(expr.kind, hir::ExprKind::Lit(_)) { Literal } else { Complex }
387+
}
388+
hir::ExprKind::Lit(_) => Literal,
389+
hir::ExprKind::Tup([]) => Simple,
390+
hir::ExprKind::Block(hir::Block { stmts: [], expr: Some(expr), .. }, _) => {
391+
if classify(expr) == Complex { Complex } else { Simple }
392+
}
393+
// Paths with a self-type or arguments are too “complex” following our measure since
394+
// they may leak private fields of structs (with feature `adt_const_params`).
395+
// Consider: `<Self as Trait<{ Struct { private: () } }>>::CONSTANT`.
396+
// Paths without arguments are definitely harmless though.
397+
hir::ExprKind::Path(hir::QPath::Resolved(_, hir::Path { segments, .. })) => {
398+
if segments.iter().all(|segment| segment.args.is_none()) { Simple } else { Complex }
399+
}
400+
// FIXME: Claiming that those kinds of QPaths are simple is probably not true if the Ty
401+
// contains const arguments. Is there a *concise* way to check for this?
402+
hir::ExprKind::Path(hir::QPath::TypeRelative(..)) => Simple,
403+
// FIXME: Can they contain const arguments and thus leak private struct fields?
404+
hir::ExprKind::Path(hir::QPath::LangItem(..)) => Simple,
405+
_ => Complex,
406+
}
407+
}
408+
409+
let classification = classify(value);
410+
411+
if classification == Literal
412+
&& !value.span.from_expansion()
413+
&& let Ok(snippet) = tcx.sess.source_map().span_to_snippet(value.span) {
414+
// For literals, we avoid invoking the pretty-printer and use the source snippet instead to
415+
// preserve certain stylistic choices the user likely made for the sake legibility like
416+
//
417+
// * hexadecimal notation
418+
// * underscores
419+
// * character escapes
420+
//
421+
// FIXME: This passes through `-/*spacer*/0` verbatim.
422+
snippet
423+
} else if classification == Simple {
424+
// Otherwise we prefer pretty-printing to get rid of extraneous whitespace, comments and
425+
// other formatting artifacts.
426+
rustc_hir_pretty::id_to_string(&hir, body.hir_id)
427+
} else if tcx.def_kind(hir.body_owner_def_id(body).to_def_id()) == DefKind::AnonConst {
428+
// FIXME: Omit the curly braces if the enclosing expression is an array literal
429+
// with a repeated element (an `ExprKind::Repeat`) as in such case it
430+
// would not actually need any disambiguation.
431+
"{ _ }".to_owned()
432+
} else {
433+
"_".to_owned()
434+
}
354435
}
355436

356437
/// Given a type Path, resolve it to a Type using the TyCtxt

Diff for: src/librustdoc/html/render/mod.rs

+3-1
Original file line numberDiff line numberDiff line change
@@ -716,12 +716,14 @@ fn assoc_const(
716716
ty = ty.print(cx),
717717
);
718718
if let Some(default) = default {
719+
write!(w, " = ");
720+
719721
// FIXME: `.value()` uses `clean::utils::format_integer_with_underscore_sep` under the
720722
// hood which adds noisy underscores and a type suffix to number literals.
721723
// This hurts readability in this context especially when more complex expressions
722724
// are involved and it doesn't add much of value.
723725
// Find a way to print constants here without all that jazz.
724-
write!(w, " = {}", default.value(cx.tcx()).unwrap_or_else(|| default.expr(cx.tcx())));
726+
write!(w, "{}", Escape(&default.value(cx.tcx()).unwrap_or_else(|| default.expr(cx.tcx()))));
725727
}
726728
}
727729

Diff for: src/librustdoc/html/render/print_item.rs

+9
Original file line numberDiff line numberDiff line change
@@ -1360,6 +1360,15 @@ fn item_constant(w: &mut Buffer, cx: &mut Context<'_>, it: &clean::Item, c: &cle
13601360
typ = c.type_.print(cx),
13611361
);
13621362

1363+
// FIXME: The code below now prints
1364+
// ` = _; // 100i32`
1365+
// if the expression is
1366+
// `50 + 50`
1367+
// which looks just wrong.
1368+
// Should we print
1369+
// ` = 100i32;`
1370+
// instead?
1371+
13631372
let value = c.value(cx.tcx());
13641373
let is_literal = c.is_literal(cx.tcx());
13651374
let expr = c.expr(cx.tcx());

Diff for: src/test/rustdoc/assoc-consts.rs

+4
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,10 @@ impl Bar {
2727
// @has assoc_consts/struct.Bar.html '//*[@id="associatedconstant.BAR"]' \
2828
// 'const BAR: usize'
2929
pub const BAR: usize = 3;
30+
31+
// @has - '//*[@id="associatedconstant.BAR_ESCAPED"]' \
32+
// "const BAR_ESCAPED: &'static str = \"<em>markup</em>\""
33+
pub const BAR_ESCAPED: &'static str = "<em>markup</em>";
3034
}
3135

3236
pub struct Baz<'a, U: 'a, T>(T, &'a [U]);

Diff for: src/test/rustdoc/const-value-display.rs

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
#![crate_name = "foo"]
22

33
// @has 'foo/constant.HOUR_IN_SECONDS.html'
4-
// @has - '//*[@class="docblock item-decl"]//code' 'pub const HOUR_IN_SECONDS: u64 = 60 * 60; // 3_600u64'
4+
// @has - '//*[@class="docblock item-decl"]//code' 'pub const HOUR_IN_SECONDS: u64 = _; // 3_600u64'
55
pub const HOUR_IN_SECONDS: u64 = 60 * 60;
66

77
// @has 'foo/constant.NEGATIVE.html'
8-
// @has - '//*[@class="docblock item-decl"]//code' 'pub const NEGATIVE: i64 = -60 * 60; // -3_600i64'
8+
// @has - '//*[@class="docblock item-decl"]//code' 'pub const NEGATIVE: i64 = _; // -3_600i64'
99
pub const NEGATIVE: i64 = -60 * 60;
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,82 @@
1+
// Test that certain unevaluated constant expression arguments that are
2+
// deemed too verbose or complex and that may leak private or
3+
// `doc(hidden)` struct fields are not displayed in the documentation.
4+
//
5+
// Read the documentation of `rustdoc::clean::utils::print_const_expr`
6+
// for further details.
7+
#![feature(const_trait_impl, generic_const_exprs)]
8+
#![allow(incomplete_features)]
9+
10+
// @has hide_complex_unevaluated_const_arguments/trait.Stage.html
11+
pub trait Stage {
12+
// A helper constant that prevents const expressions containing it
13+
// from getting fully evaluated since it doesn't have a body and
14+
// thus is non-reducible. This allows us to specifically test the
15+
// pretty-printing of *unevaluated* consts.
16+
const ABSTRACT: usize;
17+
18+
// Currently considered "overly complex" by the `generic_const_exprs`
19+
// feature. If / once this expression kind gets supported, this
20+
// unevaluated const expression could leak the private struct field.
21+
//
22+
// FIXME: Once the line below compiles, make this a test that
23+
// ensures that the private field is not printed.
24+
//
25+
//const ARRAY0: [u8; Struct { private: () } + Self::ABSTRACT];
26+
27+
// This assoc. const could leak the private assoc. function `Struct::new`.
28+
// Ensure that this does not happen.
29+
//
30+
// @has - '//*[@id="associatedconstant.ARRAY1"]' \
31+
// 'const ARRAY1: [u8; { _ }]'
32+
const ARRAY1: [u8; Struct::new(/* ... */) + Self::ABSTRACT * 1_000];
33+
34+
// @has - '//*[@id="associatedconstant.VERBOSE"]' \
35+
// 'const VERBOSE: [u16; { _ }]'
36+
const VERBOSE: [u16; compute("thing", 9 + 9) * Self::ABSTRACT];
37+
38+
// Check that we do not leak the private struct field contained within
39+
// the path. The output could definitely be improved upon
40+
// (e.g. printing sth. akin to `<Self as Helper<{ _ }>>::OUT`) but
41+
// right now “safe is safe”.
42+
//
43+
// @has - '//*[@id="associatedconstant.PATH"]' \
44+
// 'const PATH: usize = _'
45+
const PATH: usize = <Self as Helper<{ Struct { private: () } }>>::OUT;
46+
}
47+
48+
const fn compute(input: &str, extra: usize) -> usize {
49+
input.len() + extra
50+
}
51+
52+
pub trait Helper<const S: Struct> {
53+
const OUT: usize;
54+
}
55+
56+
impl<const S: Struct, St: Stage + ?Sized> Helper<S> for St {
57+
const OUT: usize = St::ABSTRACT;
58+
}
59+
60+
// Currently in rustdoc, const arguments are not evaluated in this position
61+
// and therefore they fall under the realm of `print_const_expr`.
62+
// If rustdoc gets patched to evaluate const arguments, it is fine to replace
63+
// this test as long as one can ensure that private fields are not leaked!
64+
//
65+
// @has hide_complex_unevaluated_const_arguments/trait.Sub.html \
66+
// '//*[@class="rust trait"]' \
67+
// 'pub trait Sub: Sup<{ _ }, { _ }> { }'
68+
pub trait Sub: Sup<{ 90 * 20 * 4 }, { Struct { private: () } }> {}
69+
70+
pub trait Sup<const N: usize, const S: Struct> {}
71+
72+
pub struct Struct { private: () }
73+
74+
impl Struct {
75+
const fn new() -> Self { Self { private: () } }
76+
}
77+
78+
impl const std::ops::Add<usize> for Struct {
79+
type Output = usize;
80+
81+
fn add(self, _: usize) -> usize { 0 }
82+
}

Diff for: src/test/rustdoc/hide-complex-unevaluated-consts.rs

+71
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
// Regression test for issue #97933.
2+
//
3+
// Test that certain unevaluated constant expressions that are
4+
// deemed too verbose or complex and that may leak private or
5+
// `doc(hidden)` struct fields are not displayed in the documentation.
6+
//
7+
// Read the documentation of `rustdoc::clean::utils::print_const_expr`
8+
// for further details.
9+
10+
// @has hide_complex_unevaluated_consts/trait.Container.html
11+
pub trait Container {
12+
// A helper constant that prevents const expressions containing it
13+
// from getting fully evaluated since it doesn't have a body and
14+
// thus is non-reducible. This allows us to specifically test the
15+
// pretty-printing of *unevaluated* consts.
16+
const ABSTRACT: i32;
17+
18+
// Ensure that the private field does not get leaked:
19+
//
20+
// @has - '//*[@id="associatedconstant.STRUCT0"]' \
21+
// 'const STRUCT0: Struct = _'
22+
const STRUCT0: Struct = Struct { private: () };
23+
24+
// @has - '//*[@id="associatedconstant.STRUCT1"]' \
25+
// 'const STRUCT1: (Struct,) = _'
26+
const STRUCT1: (Struct,) = (Struct{private: /**/()},);
27+
28+
// Although the struct field is public here, check that it is not
29+
// displayed. In a future version of rustdoc, we definitely want to
30+
// show it. However for the time being, the printing logic is a bit
31+
// conservative.
32+
//
33+
// @has - '//*[@id="associatedconstant.STRUCT2"]' \
34+
// 'const STRUCT2: Record = _'
35+
const STRUCT2: Record = Record { public: 5 };
36+
37+
// Test that we do not show the incredibly verbose match expr:
38+
//
39+
// @has - '//*[@id="associatedconstant.MATCH0"]' \
40+
// 'const MATCH0: i32 = _'
41+
const MATCH0: i32 = match 234 {
42+
0 => 1,
43+
_ => Self::ABSTRACT,
44+
};
45+
46+
// @has - '//*[@id="associatedconstant.MATCH1"]' \
47+
// 'const MATCH1: bool = _'
48+
const MATCH1: bool = match Self::ABSTRACT {
49+
_ => true,
50+
};
51+
52+
// Check that we hide complex (arithmetic) operations.
53+
// In this case, it is a bit unfortunate since the expression
54+
// is not *that* verbose and it might be quite useful to the reader.
55+
//
56+
// However in general, the expression might be quite large and
57+
// contain match expressions and structs with private fields.
58+
// We would need to recurse over the whole expression and even more
59+
// importantly respect operator precedence when pretty-printing
60+
// the potentially partially censored expression.
61+
// For now, the implementation is quite simple and the choices
62+
// rather conservative.
63+
//
64+
// @has - '//*[@id="associatedconstant.ARITH_OPS"]' \
65+
// 'const ARITH_OPS: i32 = _'
66+
const ARITH_OPS: i32 = Self::ABSTRACT * 2 + 1;
67+
}
68+
69+
pub struct Struct { private: () }
70+
71+
pub struct Record { pub public: i32 }

Diff for: src/test/rustdoc/show-const-contents.rs

+1-1
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ pub const CONST_NEG_I32: i32 = -42;
2121
// @!has show_const_contents/constant.CONST_EQ_TO_VALUE_I32.html '// 42i32'
2222
pub const CONST_EQ_TO_VALUE_I32: i32 = 42i32;
2323

24-
// @has show_const_contents/constant.CONST_CALC_I32.html '= 42 + 1; // 43i32'
24+
// @has show_const_contents/constant.CONST_CALC_I32.html '= _; // 43i32'
2525
pub const CONST_CALC_I32: i32 = 42 + 1;
2626

2727
// @!has show_const_contents/constant.CONST_REF_I32.html '= &42;'

0 commit comments

Comments
 (0)