Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use rustc attrs in dangling_pointers_from_temporaries lint #132281

Open
Urgau opened this issue Oct 28, 2024 · 17 comments
Open

Use rustc attrs in dangling_pointers_from_temporaries lint #132281

Urgau opened this issue Oct 28, 2024 · 17 comments
Assignees
Labels
A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Urgau
Copy link
Member

Urgau commented Oct 28, 2024

If you want to do it in this PR (otherwise in a follow-up), we could add an rustc attribute to those as_ptr/as_mut_ptr methods.

You can follow afcb09b and 2a930d3 (from a PR of mine) with check_applied_to_fn_or_method instead, and something similar to this to check the attribute

&& let Some(def_id) = cx.typeck_results().type_dependent_def_id(e.hir_id)
&& cx.tcx.has_attr(def_id, sym::rustc_never_returns_null_ptr)

Originally posted by @Urgau in #128985 (comment)

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 28, 2024
@Urgau Urgau added C-cleanup Category: PRs that clean code up or issues documenting cleanup. A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Oct 28, 2024
@Urgau Urgau added E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. labels Oct 29, 2024
@Urgau
Copy link
Member Author

Urgau commented Oct 29, 2024

Steps

  1. Create the attribute (symbol, register, and validation), let's say rustc_as_ptr_function (better name to be found)
  2. Apply the newly created attribute to the relevant as_ptr/as_mut_ptr methods to the types listed
    // Array, Vec, String, CString, MaybeUninit, Cell, Box<[_]>, Box<str>, Box<CStr>,
    // or any of the above in arbitrary many nested Box'es.
    fn is_interesting(tcx: TyCtxt<'_>, ty: Ty<'_>) -> bool {
    if ty.is_array() {
    true
    } else if let Some(inner) = ty.boxed_ty() {
    inner.is_slice()
    || inner.is_str()
    || inner.ty_adt_def().is_some_and(|def| tcx.is_lang_item(def.did(), LangItem::CStr))
    || is_interesting(tcx, inner)
    } else if let Some(def) = ty.ty_adt_def() {
    for lang_item in [LangItem::String, LangItem::MaybeUninit] {
    if tcx.is_lang_item(def.did(), lang_item) {
    return true;
    }
    }
    tcx.get_diagnostic_name(def.did())
    .is_some_and(|name| matches!(name, sym::cstring_type | sym::Vec | sym::Cell))
  3. Use the newly created attribute instead of is_interesting fn, with something similar to this
    && let Some(def_id) = cx.typeck_results().type_dependent_def_id(e.hir_id)
    && cx.tcx.has_attr(def_id, sym::rustc_never_returns_null_ptr)

More context in #128985 (comment)

@gavincrawford

This comment has been minimized.

@rustbot

This comment has been minimized.

@gavincrawford
Copy link
Contributor

@rustbot claim
Oops

@gavincrawford
Copy link
Contributor

@GrigorenkoPV sorry, didn't realize this your follow up issue. Apologies.
@rustbot release-assignment

@GrigorenkoPV
Copy link
Contributor

@GrigorenkoPV sorry, didn't realize this your follow up issue. Apologies. @rustbot release-assignment

I, most likely, won't have a chance to work on it in the next few weeks, so you can claim it in the meantime if you want to.

@gavincrawford
Copy link
Contributor

Alright, I'll give it a shot. @rustbot claim

@GrigorenkoPV
Copy link
Contributor

3. Use the newly created attribute instead of is_interesting fn

I think the actual place where the attribute checking will have to go is this:

&& matches!(method.ident.name, sym::as_ptr | sym::as_mut_ptr)

so that we trigger on methods not based on their name but rather based on them having this attribute.

The is_interesting function checks that the type that we have owns the allocation. i.e. it's a Vec<T>, but not a &Vec<T> or a &[T].

Ideally this should probably be unified with is_temporary_rvalue, so that we can properly handle field projections and whatnot, but that sounds like a lot of pain and chances to get it wrong. Probably out of scope for now.

In the meantime I see 3 options for is_interesting:

  1. Leave as is (probably rename to owns_allocation or something along those lines for clarity). But add UnsafeCell and other missing things whose pointer-producing methods aren't named as_ptr/as_mut_ptr but will now be marked with the attribute:
    /// 1. Method calls that are not checked for:
    /// - [`temporary_unsafe_cell.get()`][`core::cell::UnsafeCell::get()`]
    /// - [`temporary_sync_unsafe_cell.get()`][`core::cell::SyncUnsafeCell::get()`]
  2. Mark the interesting types with an attribute. Probably not the same attribute as the methods. Replace this part
    for lang_item in [LangItem::String, LangItem::MaybeUninit] {
    if tcx.is_lang_item(def.did(), lang_item) {
    return true;
    }
    }
    tcx.get_diagnostic_name(def.did())
    .is_some_and(|name| matches!(name, sym::cstring_type | sym::Vec | sym::Cell))
  3. Try to deduce the type based on the method's signature. Should work for String, but will probably not work for Vec<T> or Box<[T]> as their as_ptr methods actually come from [T] through Deref.

In conclusion,

  1. Create a new attribute (for methods)
  2. Mark the already-checked methods with it, as well as the currently unchecked [Sync]UnsafeCell::get()
  3. Replace
    && matches!(method.ident.name, sym::as_ptr | sym::as_mut_ptr)
    with a check for the attribute.
  4. Rename is_interesting to something about owning the memory.
  5. Optionally: create a new attribute for types, mark the types mentioned here
    for lang_item in [LangItem::String, LangItem::MaybeUninit] {
    if tcx.is_lang_item(def.did(), lang_item) {
    return true;
    }
    }
    tcx.get_diagnostic_name(def.did())
    .is_some_and(|name| matches!(name, sym::cstring_type | sym::Vec | sym::Cell))
    and replace this snippet with a check for the attribute.

If you have any questions, feel free to reach out to me or Urgau (I guess they will not mind either, considering they've marked this issue with E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. )

@gavincrawford
Copy link
Contributor

gavincrawford commented Nov 2, 2024

Thanks for all the helpful resources! I've created a new trait to track functions that implement as_ptr or as_ptr_mut. I'm having an issue where, when I attempt to apply this newly created attribute to functions, in this case, Vec's as_ptr method, the compiler can't find the symbol.
Image
I don't see where all the other macros and getting imported from, and my LSP can't figure it out either, showing no definitions found. Any help would be appreciated.

I'll list the commit from my personal fork here, which I did mostly with inspiration from the two commits posted for reference by Urgau. Thanks again for the help!

@Urgau
Copy link
Member Author

Urgau commented Nov 2, 2024

You need to put them behind the bootstrap cfg, as the first build is with the beta compiler and not your local changes.

Like this, #[cfg_attr(not(bootstrap), rustc_as_ptr)].

Btw, I would highly recommend, keeping the stage0 std (as to not recompile everything) with --keep-stage-std 0.

@GrigorenkoPV
Copy link
Contributor

GrigorenkoPV commented Nov 2, 2024

Image

I think I know what the problem might be.

Currently a new version of rustc is built as follows:

  1. An old preexisting version of the compiler (called bootstrap) builds std.
  2. The same bootstrap compiler builds the new version of the compiler using the std from the previous step.
  3. This new version of the compiler now builds std too.
  4. Some other steps (not important for now).

The problem is that the preexisting bootstrap has no idea about this new attribute. Only the new compiler will. Solution? #[cfg(bootstrap)]. (I assume you're familiar with the cfg mechanism).

Effectively bootstrap cfg will be set during the step 1, but not during the step 3.
So to hide this new attribute during step 1, but not during step 3, you should use

#[cfg_attr(not(bootstrap), rustc_as_ptr)]

As for the IDE problems, https://rustc-dev-guide.rust-lang.org/building/suggested.html#suggested-workflows might help.
TLDR: ./x setup can generate the configurations necessary for your IDE to launch the LSP with appropriate settings. It can also generate config.toml for you.

Damn, Urgau was faster.

@gavincrawford
Copy link
Contributor

gavincrawford commented Nov 7, 2024

I believe I've finished up with the primary objectives, which are:

  1. Create a new attribute to track functions that return pointers, instead of matching function names
  2. Add this new attribute to Vec, String, CString, MaybeUninit, Cell, Box, and finally, the UnsafeCell::get() function
  3. Rename is_interesting to something more concise
  4. Change the implementation for the dangling pointer lint to match for the new attribute

I've decided to postpone the fifth step mentioned in this comment further up in the thread, as this is my first contribution here and I'd like to make sure I'm not making any mistakes before bundling that many edits into a single PR.

If everything sounds good, I'll go ahead and make a PR for review. My branch is linked here if you'd like to take a look for any reason. Again, thank you @Urgau and @GrigorenkoPV for being so wonderfully helpful!

@Urgau
Copy link
Member Author

Urgau commented Nov 7, 2024

The changes in your branch looks good. Feel free to create a PR (and assign it to me).

As for in_interesting/is_own_allocation I think we could/should simply replace it with something like this:

!matches!(ty.kind, hir::TyKind::Ref(_, _))

which checks if a type isn't a reference.

@GrigorenkoPV
Copy link
Contributor

As for in_interesting/is_own_allocation I think we could/should simply replace it with something like this:

!matches!(ty.kind, hir::TyKind::Ref(_, _))
which checks if a type isn't a reference.

I think then the lint will fire on things like Box<&String>, which is not what we want.

@Urgau
Copy link
Member Author

Urgau commented Nov 7, 2024

Ah, right, forgot about that. I think that should handled with expr_ty_adjusted instead of expr_ty which handles auto-ref/auto-deref, but this becomes more trick so let's do that as a follow-up.

@gavincrawford
Copy link
Contributor

gavincrawford commented Nov 7, 2024

Just made my PR. Also, would like to mention the few abnormalities I found about the types listed in the following snippet:

// Array, Vec, String, CString, MaybeUninit, Cell, Box<[_]>, Box<str>, Box<CStr>,
// or any of the above in arbitrary many nested Box'es.
fn is_interesting(tcx: TyCtxt<'_>, ty: Ty<'_>) -> bool {
if ty.is_array() {
true
} else if let Some(inner) = ty.boxed_ty() {
inner.is_slice()
|| inner.is_str()
|| inner.ty_adt_def().is_some_and(|def| tcx.is_lang_item(def.did(), LangItem::CStr))
|| is_interesting(tcx, inner)
} else if let Some(def) = ty.ty_adt_def() {
for lang_item in [LangItem::String, LangItem::MaybeUninit] {
if tcx.is_lang_item(def.did(), lang_item) {
return true;
}
}
tcx.get_diagnostic_name(def.did())
.is_some_and(|name| matches!(name, sym::cstring_type | sym::Vec | sym::Cell))

Firstly, the array primitive type does not implement any as_ptr functions, as far as I know, so I didn't tag anything with the attribute. Am I missing anything?

Second, MaybeUninit's slice_as_ptr method: I did not tag this function, as it internally references a regular as_ptr function, which has the attribute. Should the slice_as_ptr method also get the attribute, given that it does return a raw pointer, despite the implementation?

@GrigorenkoPV
Copy link
Contributor

GrigorenkoPV commented Nov 7, 2024

Just made my PR.

#132732 (mentioning it explicitly so that GitHub cross-links it with this issue)

Firstly, the array primitive type does not implement any as_ptr functions, as far as I know, so I didn't tag anything with the attribute. Am I missing anything?

I think it comes from slices and arrays just deref to those.

Second, MaybeUninit's slice_as_ptr method: I did not tag this function, as it internally references a regular as_ptr function, which has the attribute.

I don't think the attribute has any transitive effect. And the lint itself will only be matched against the code in the linted crate, so it won't peek into standard library's source code.

Should the slice_as_ptr method also get the attribute, given that it does return a raw pointer, despite the implementation?

Well, this function/method should probably be accounted for as well, but it is not a method on temporary (i.e. it is not temp.slice_as_ptr(), but rather rather MaybeUninit::slice_as_ptr(temp)), and thus requires some different handling, so I think this is out of scope for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants