Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lint against getting pointers from immediately dropped temporaries #128985

Merged
merged 1 commit into from
Oct 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 6 additions & 6 deletions compiler/rustc_lint/messages.ftl
Original file line number Diff line number Diff line change
Expand Up @@ -203,14 +203,14 @@ lint_confusable_identifier_pair = found both `{$existing_sym}` and `{$sym}` as i
.current_use = this identifier can be confused with `{$existing_sym}`
.other_use = other identifier used here

lint_cstring_ptr = getting the inner pointer of a temporary `CString`
.as_ptr_label = this pointer will be invalid
.unwrap_label = this `CString` is deallocated at the end of the statement, bind it to a variable to extend its lifetime
.note = pointers do not have a lifetime; when calling `as_ptr` the `CString` will be deallocated at the end of the statement because nothing is referencing it as far as the type system is concerned
.help = for more information, see https://doc.rust-lang.org/reference/destructors.html

lint_custom_inner_attribute_unstable = custom inner attributes are unstable

lint_dangling_pointers_from_temporaries = a dangling pointer will be produced because the temporary `{$ty}` will be dropped
.label_ptr = this pointer will immediately be invalid
.label_temporary = this `{$ty}` is deallocated at the end of the statement, bind it to a variable to extend its lifetime
.note = pointers do not have a lifetime; when calling `{$callee}` the `{$ty}` will be deallocated at the end of the statement because nothing is referencing it as far as the type system is concerned
.help = for more information, see <https://doc.rust-lang.org/reference/destructors.html>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too bad that we don't have place explaining what a "dangling pointer" is, we could have linked it here.

Opened #132286 about that.


lint_default_hash_types = prefer `{$preferred}` over `{$used}`, it has better performance
.note = a `use rustc_data_structures::fx::{$preferred}` may be necessary

Expand Down
223 changes: 223 additions & 0 deletions compiler/rustc_lint/src/dangling.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@
use rustc_ast::visit::{visit_opt, walk_list};
use rustc_hir::def_id::LocalDefId;
use rustc_hir::intravisit::{FnKind, Visitor, walk_expr};
use rustc_hir::{Block, Body, Expr, ExprKind, FnDecl, LangItem};
use rustc_middle::ty::{Ty, TyCtxt};
use rustc_session::{declare_lint, impl_lint_pass};
use rustc_span::Span;
use rustc_span::symbol::sym;

use crate::lints::DanglingPointersFromTemporaries;
use crate::{LateContext, LateLintPass};

declare_lint! {
/// The `dangling_pointers_from_temporaries` lint detects getting a pointer to data
/// of a temporary that will immediately get dropped.
///
/// ### Example
///
/// ```rust
/// # #![allow(unused)]
/// # unsafe fn use_data(ptr: *const u8) { }
/// fn gather_and_use(bytes: impl Iterator<Item = u8>) {
/// let x: *const u8 = bytes.collect::<Vec<u8>>().as_ptr();
/// unsafe { use_data(x) }
/// }
/// ```
///
/// {{produces}}
///
/// ### Explanation
///
/// Getting a pointer from a temporary value will not prolong its lifetime,
/// which means that the value can be dropped and the allocation freed
/// while the pointer still exists, making the pointer dangling.
/// This is not an error (as far as the type system is concerned)
/// but probably is not what the user intended either.
///
/// If you need stronger guarantees, consider using references instead,
/// as they are statically verified by the borrow-checker to never dangle.
pub DANGLING_POINTERS_FROM_TEMPORARIES,
Warn,
"detects getting a pointer from a temporary"
}

/// FIXME: false negatives (i.e. the lint is not emitted when it should be)
/// 1. Method calls that are not checked for:
/// - [`temporary_unsafe_cell.get()`][`core::cell::UnsafeCell::get()`]
/// - [`temporary_sync_unsafe_cell.get()`][`core::cell::SyncUnsafeCell::get()`]
/// 2. Ways to get a temporary that are not recognized:
/// - `owning_temporary.field`
/// - `owning_temporary[index]`
/// 3. No checks for ref-to-ptr conversions:
/// - `&raw [mut] temporary`
/// - `&temporary as *(const|mut) _`
/// - `ptr::from_ref(&temporary)` and friends
#[derive(Clone, Copy, Default)]
pub(crate) struct DanglingPointers;

impl_lint_pass!(DanglingPointers => [DANGLING_POINTERS_FROM_TEMPORARIES]);

// This skips over const blocks, but they cannot use or return a dangling pointer anyways.
impl<'tcx> LateLintPass<'tcx> for DanglingPointers {
fn check_fn(
&mut self,
cx: &LateContext<'tcx>,
_: FnKind<'tcx>,
_: &'tcx FnDecl<'tcx>,
body: &'tcx Body<'tcx>,
_: Span,
_: LocalDefId,
) {
DanglingPointerSearcher { cx, inside_call_args: false }.visit_body(body)
}
}

/// This produces a dangling pointer:
/// ```ignore (example)
/// let ptr = CString::new("hello").unwrap().as_ptr();
/// foo(ptr)
/// ```
///
/// But this does not:
/// ```ignore (example)
/// foo(CString::new("hello").unwrap().as_ptr())
/// ```
///
/// But this does:
/// ```ignore (example)
/// foo({ let ptr = CString::new("hello").unwrap().as_ptr(); ptr })
/// ```
///
/// So we have to keep track of when we are inside of a function/method call argument.
struct DanglingPointerSearcher<'lcx, 'tcx> {
cx: &'lcx LateContext<'tcx>,
/// Keeps track of whether we are inside of function/method call arguments,
/// where this lint should not be emitted.
///
/// See [the main doc][`Self`] for examples.
inside_call_args: bool,
}

impl Visitor<'_> for DanglingPointerSearcher<'_, '_> {
fn visit_expr(&mut self, expr: &Expr<'_>) -> Self::Result {
if !self.inside_call_args {
lint_expr(self.cx, expr)
}
match expr.kind {
ExprKind::Call(lhs, args) | ExprKind::MethodCall(_, lhs, args, _) => {
self.visit_expr(lhs);
self.with_inside_call_args(true, |this| walk_list!(this, visit_expr, args))
}
ExprKind::Block(&Block { stmts, expr, .. }, _) => {
self.with_inside_call_args(false, |this| walk_list!(this, visit_stmt, stmts));
visit_opt!(self, visit_expr, expr)
}
_ => walk_expr(self, expr),
}
}
}

impl DanglingPointerSearcher<'_, '_> {
fn with_inside_call_args<R>(
&mut self,
inside_call_args: bool,
callback: impl FnOnce(&mut Self) -> R,
) -> R {
let old = core::mem::replace(&mut self.inside_call_args, inside_call_args);
let result = callback(self);
self.inside_call_args = old;
result
}
}

fn lint_expr(cx: &LateContext<'_>, expr: &Expr<'_>) {
if let ExprKind::MethodCall(method, receiver, _args, _span) = expr.kind
&& matches!(method.ident.name, sym::as_ptr | sym::as_mut_ptr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to do it in this PR (otherwise in a follow-up), we could add an rustc attribute to those as_ptr/as_mut_ptr methods.

You can follow afcb09b and 2a930d3 (from a PR of mine) with check_applied_to_fn_or_method instead, and something similar to this to check the attribute

&& let Some(def_id) = cx.typeck_results().type_dependent_def_id(e.hir_id)
&& cx.tcx.has_attr(def_id, sym::rustc_never_returns_null_ptr)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to do this as a follow-up, just not to stall this one any longer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created #132281 to track the use of a rustc attr for this lint.

Feel free to assign it to your-self (@rustbot assign I think).

&& is_temporary_rvalue(receiver)
&& let ty = cx.typeck_results().expr_ty(receiver)
&& is_interesting(cx.tcx, ty)
{
// FIXME: use `emit_node_lint` when `#[primary_span]` is added.
cx.tcx.emit_node_span_lint(
DANGLING_POINTERS_FROM_TEMPORARIES,
expr.hir_id,
method.ident.span,
DanglingPointersFromTemporaries {
callee: method.ident.name,
ty,
ptr_span: method.ident.span,
temporary_span: receiver.span,
},
)
}
}

fn is_temporary_rvalue(expr: &Expr<'_>) -> bool {
match expr.kind {
// Const is not temporary.
ExprKind::ConstBlock(..) | ExprKind::Repeat(..) | ExprKind::Lit(..) => false,

// This is literally lvalue.
ExprKind::Path(..) => false,

// Calls return rvalues.
ExprKind::Call(..) | ExprKind::MethodCall(..) | ExprKind::Binary(..) => true,

// Inner blocks are rvalues.
ExprKind::If(..) | ExprKind::Loop(..) | ExprKind::Match(..) | ExprKind::Block(..) => true,

// FIXME: these should probably recurse and typecheck along the way.
// Some false negatives are possible for now.
ExprKind::Index(..) | ExprKind::Field(..) | ExprKind::Unary(..) => false,

ExprKind::Struct(..) => true,

// FIXME: this has false negatives, but I do not want to deal with 'static/const promotion just yet.
ExprKind::Array(..) => false,

// These typecheck to `!`
ExprKind::Break(..) | ExprKind::Continue(..) | ExprKind::Ret(..) | ExprKind::Become(..) => {
false
}

// These typecheck to `()`
ExprKind::Assign(..) | ExprKind::AssignOp(..) | ExprKind::Yield(..) => false,

// Compiler-magic macros
ExprKind::AddrOf(..) | ExprKind::OffsetOf(..) | ExprKind::InlineAsm(..) => false,

// We are not interested in these
ExprKind::Cast(..)
| ExprKind::Closure(..)
| ExprKind::Tup(..)
| ExprKind::DropTemps(..)
| ExprKind::Let(..) => false,

// Not applicable
ExprKind::Type(..) | ExprKind::Err(..) => false,
}
}

// Array, Vec, String, CString, MaybeUninit, Cell, Box<[_]>, Box<str>, Box<CStr>,
// or any of the above in arbitrary many nested Box'es.
fn is_interesting(tcx: TyCtxt<'_>, ty: Ty<'_>) -> bool {
if ty.is_array() {
true
} else if let Some(inner) = ty.boxed_ty() {
inner.is_slice()
|| inner.is_str()
|| inner.ty_adt_def().is_some_and(|def| tcx.is_lang_item(def.did(), LangItem::CStr))
|| is_interesting(tcx, inner)
} else if let Some(def) = ty.ty_adt_def() {
for lang_item in [LangItem::String, LangItem::MaybeUninit] {
if tcx.is_lang_item(def.did(), lang_item) {
return true;
}
}
tcx.get_diagnostic_name(def.did())
.is_some_and(|name| matches!(name, sym::cstring_type | sym::Vec | sym::Cell))
} else {
false
}
}
7 changes: 4 additions & 3 deletions compiler/rustc_lint/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ mod async_closures;
mod async_fn_in_trait;
pub mod builtin;
mod context;
mod dangling;
mod deref_into_dyn_supertrait;
mod drop_forget_useless;
mod early;
Expand All @@ -65,7 +66,6 @@ mod levels;
mod lints;
mod macro_expr_fragment_specifier_2024_migration;
mod map_unit_fn;
mod methods;
mod multiple_supertrait_upcastable;
mod non_ascii_idents;
mod non_fmt_panic;
Expand All @@ -91,6 +91,7 @@ mod unused;
use async_closures::AsyncClosureUsage;
use async_fn_in_trait::AsyncFnInTrait;
use builtin::*;
use dangling::*;
use deref_into_dyn_supertrait::*;
use drop_forget_useless::*;
use enum_intrinsics_non_enums::EnumIntrinsicsNonEnums;
Expand All @@ -103,7 +104,6 @@ use invalid_from_utf8::*;
use let_underscore::*;
use macro_expr_fragment_specifier_2024_migration::*;
use map_unit_fn::*;
use methods::*;
use multiple_supertrait_upcastable::*;
use non_ascii_idents::*;
use non_fmt_panic::NonPanicFmt;
Expand Down Expand Up @@ -231,7 +231,7 @@ late_lint_methods!(
UngatedAsyncFnTrackCaller: UngatedAsyncFnTrackCaller,
ShadowedIntoIter: ShadowedIntoIter,
DropTraitConstraints: DropTraitConstraints,
TemporaryCStringAsPtr: TemporaryCStringAsPtr,
DanglingPointers: DanglingPointers,
NonPanicFmt: NonPanicFmt,
NoopMethodCall: NoopMethodCall,
EnumIntrinsicsNonEnums: EnumIntrinsicsNonEnums,
Expand Down Expand Up @@ -356,6 +356,7 @@ fn register_builtins(store: &mut LintStore) {
store.register_renamed("non_fmt_panic", "non_fmt_panics");
store.register_renamed("unused_tuple_struct_fields", "dead_code");
store.register_renamed("static_mut_ref", "static_mut_refs");
store.register_renamed("temporary_cstring_as_ptr", "dangling_pointers_from_temporaries");

// These were moved to tool lints, but rustc still sees them when compiling normally, before
// tool lints are registered, so `check_tool_name_for_backwards_compat` doesn't work. Use
Expand Down
17 changes: 10 additions & 7 deletions compiler/rustc_lint/src/lints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1137,16 +1137,19 @@ pub(crate) struct IgnoredUnlessCrateSpecified<'a> {
pub name: Symbol,
}

// methods.rs
// dangling.rs
#[derive(LintDiagnostic)]
#[diag(lint_cstring_ptr)]
#[diag(lint_dangling_pointers_from_temporaries)]
#[note]
#[help]
pub(crate) struct CStringPtr {
#[label(lint_as_ptr_label)]
pub as_ptr: Span,
#[label(lint_unwrap_label)]
pub unwrap: Span,
// FIXME: put #[primary_span] on `ptr_span` once it does not cause conflicts
pub(crate) struct DanglingPointersFromTemporaries<'tcx> {
pub callee: Symbol,
pub ty: Ty<'tcx>,
#[label(lint_label_ptr)]
pub ptr_span: Span,
#[label(lint_label_temporary)]
pub temporary_span: Span,
}

// multiple_supertrait_upcastable.rs
Expand Down
69 changes: 0 additions & 69 deletions compiler/rustc_lint/src/methods.rs

This file was deleted.

Loading
Loading