Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wishlist: allow adding intrinsics in a way that doesn't break every backend #93145

Closed
scottmcm opened this issue Jan 21, 2022 · 9 comments · Fixed by #120500
Closed

Wishlist: allow adding intrinsics in a way that doesn't break every backend #93145

scottmcm opened this issue Jan 21, 2022 · 9 comments · Fixed by #120500
Assignees
Labels
A-codegen Area: Code generation A-intrinsics Area: Intrinsics C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@scottmcm
Copy link
Member

scottmcm commented Jan 21, 2022

See previous conversation in https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Musing.20on.20intrinsics.20with.20MIR.20available/near/243719813 which was split off from https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Considering.20a.20future.20integration.20of.20the.20gcc.20backend/near/243686539

As we get more backends, and as people write more out-of-tree backends, it gets harder and harder to add intrinsics. Right now, you already have to implement every new intrinsic

  1. in a way that works for bootstrap
  2. in MIRI, for CTFE
  3. in cg_llvm
  4. in cg_clif

And that list will just get longer with cg_gcc and whatever.

It would be really nice if there was an easy way to add an intrinsic that only required updating the type/safety checking parts of the compilers, not the backends at all. Then the changes the specific backends to implement the intrinsic in a more efficient way are separable from adding the intrinsic in the first place.

That would probably mean having MIR available for the intrinsic somehow, so the fallback path for intrinsics in backends could be to just use that MIR, rather than failing.

The intrinsics would then only need to be implemented in backends that care. For example, cg_clif might not bother overriding some of the unchecked math ones.

And it would be really nice for simd, as the fallback implementations would define their semantics in terms of normal scalar rust, which MIRI will always want anyway, and will be very convenient for other platforms or backends that don't have it implemented (yet or ever).

A sketch

This might not be the best way, but for the sake of a starting point, here's a sketch of how maybe it could work.

Back in #61885, which added an unchecked_sub intrinsic, I solved step (1) above by adding

/// For bootstrapping, implement unchecked_sub as just wrapping_sub.
#[cfg(bootstrap)]
pub unsafe fn unchecked_sub<T>(x: T, y: T) -> T {
    sub_with_overflow(x, y).0
}

It would be nice if, rather than the intrinsic being only declared

extern "rust-intrinsic" {
    #[cfg(not(bootstrap))]
    pub fn unchecked_sub<T>(x: T, y: T) -> T;
}

it could instead be defined in core with the fallback body

pub unsafe extern "rust-intrinsic" fn unchecked_sub<T>(x: T, y: T) -> T {
    sub_with_overflow(x, y).0
}

Then the backend could use that MIR, available like the MIR of any generic function (or inline, for non-generics).

@scottmcm scottmcm added A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-feature-request Category: A feature request, i.e: not implemented / a PR. A-intrinsics Area: Intrinsics labels Jan 21, 2022
@oli-obk
Copy link
Contributor

oli-obk commented Jan 21, 2022

Oh I should've written this up long ago... basically the plan @eddyb and I had for this was to add an attribute (like lang item, but intrinsic) that can be added to functions and methods. It would essentially work like your proposal, but you could annotate the methods on the integer types directly instead of having to redeclare an intrinsic for them.

@scottmcm
Copy link
Member Author

#96946 is a nice example where this would have been really handy -- ptr.mask(mask) can always be emitted as ptr.with_addr(ptr.addr() & mask), so only needing to update LLVM (not cranelift & gcc too) to use the intrinsic would make that PR way simpler.

bors added a commit to rust-lang-ci/rust that referenced this issue May 17, 2022
Add a query for checking whether a function is an intrinsic.

work towards rust-lang#93145

This will reduce churn when we add more ways to declare intrinsics

r? `@scottmcm`
xFrednet pushed a commit to xFrednet/rust that referenced this issue May 21, 2022
Add a query for checking whether a function is an intrinsic.

work towards rust-lang#93145

This will reduce churn when we add more ways to declare intrinsics

r? `@scottmcm`
@scottmcm
Copy link
Member Author

Another thought here: since GCC Rust is trying to use an unmodified core, this would be even more of a help. It's just annoying to update the llvm+cranelift+gcc backends when adding an intrinsic. But updating GCC at the same time is impossible -- and even not at the same time is impractical for many contributors due to licenses and copyright assignments.

@WaffleLapkin
Copy link
Member

So, I was thinking about working on this ever since @scottmcm mentioned it in #96946. Today I've finally started looking into it...

  • Ast changes seem to be trivial, we just need to disable a check that intrinsic don't have bodies
  • In lowering we just need to lower the block if it's present
  • In HIR mostly adding Option<BodyId> field and visiting it in visitors + returning the body from various getter functions
  • In typeck we need to assume that unknown intrinsic with a body has correct signature; body checking is made a little weird by primary_body_of returning an FnSig which ForeignItemKind::Fn doesn't have for some reason, I've worked it around by returning (hir::FnHeader, &hir::FnDecl<'_>) with a fabricated header, but still ugh
  • Codegen is far less trivial than all the other parts, there we need to somehow define and call the fallback and I haven't figured it out yet

If someone would be able to mentor me that would be very nice 🥲

@oli-obk
Copy link
Contributor

oli-obk commented Aug 22, 2022

I think we'll want to avoid the extern "rust-intrinsic" logic entirely. Ideally we'll be able to just annotate all the integer functions that forward to an intrinsic with #[rust_intrinsic = "name"], even if that means multiple functions are annotated with the same intrinsic name. For an MVP we should probably start with a single such function to keep the diff small.

We can keep the existing logic around, but it should mostly not matter, as query is_intrinsic can then be adjusted to handle the new rust_intrinsic functions, too, causing all users of is_intrinsic to automatically pick it up. There are still a few manual RustIntrinsic ABI checks that we need to review, but I was hopeful that the effort should be smaller now that we have is_intrinsic. It may be useful to make is_intrinsic return an Option<Symbol> of the intrinsic name, as they may now differ from the function's name.

oli-obk added a commit to oli-obk/rust that referenced this issue Feb 13, 2024
Implement intrinsics with fallback bodies

fixes rust-lang#93145 (though we can port many more intrinsics)
cc rust-lang#63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc `@scottmcm` `@WaffleLapkin`

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
oli-obk added a commit to oli-obk/rust that referenced this issue Feb 13, 2024
Implement intrinsics with fallback bodies

fixes rust-lang#93145 (though we can port many more intrinsics)
cc rust-lang#63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc ``@scottmcm`` ``@WaffleLapkin``

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
geektype added a commit to geektype/rust that referenced this issue Feb 13, 2024
Implement intrinsics with fallback bodies

fixes rust-lang#93145 (though we can port many more intrinsics)
cc rust-lang#63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc ```@scottmcm``` ```@WaffleLapkin```

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
oli-obk added a commit to oli-obk/rust that referenced this issue Feb 15, 2024
Implement intrinsics with fallback bodies

fixes rust-lang#93145 (though we can port many more intrinsics)
cc rust-lang#63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc `@scottmcm` `@WaffleLapkin`

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
@bors bors closed this as completed in dfa88b3 Feb 16, 2024
Nadrieril added a commit to Nadrieril/rust that referenced this issue Feb 17, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 17, 2024
Rollup merge of rust-lang#121192 - oli-obk:intrinsics2.0, r=WaffleLapkin

Give some intrinsics fallback bodies

cc rust-lang#93145
flip1995 pushed a commit to flip1995/rust that referenced this issue Feb 26, 2024
Implement intrinsics with fallback bodies

fixes rust-lang#93145 (though we can port many more intrinsics)
cc rust-lang#63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc `@scottmcm` `@WaffleLapkin`

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
@celinval
Copy link
Contributor

celinval commented Apr 6, 2024

@oli-obk if I understood this correctly, trying to get the instance body of an intrinsic with fallback body should be OK, is that correct? I was debugging this failure in StableMIR tests, that seemed related to the fact that we are trying to get the body of unlikely, but the compiler crashes instead:

error: internal compiler error: compiler/rustc_mir_transform/src/shim.rs:134:13: creating shims from intrinsics (Intrinsic(DefId(2:1527 ~ core[7c83]::intrinsics::unlikely))) is unsupported

@oli-obk
Copy link
Contributor

oli-obk commented Apr 6, 2024

You can't get the body of an intrinsic instance. You either need to handle the intrinsic instance yourself, or convert it to an item instance with the same def id and args. Then get the body of that

@celinval
Copy link
Contributor

celinval commented Apr 6, 2024

Would it be OK if StableMIR does that under the hood? If the user requests the body of an intrinsic that has a fallback body, we do the conversion under the hood and return it.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 6, 2024

Yea that seems reasonable. We'll still ICE if the intrinsic does not have a fallback body though. But that's how backends do this, too (or they error out instead of panicking, but it's basically the same thing at this point)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-intrinsics Area: Intrinsics C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants