Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debuginfo: Refactor debuginfo generation for types #94261

Merged
merged 9 commits into from
Mar 15, 2022
Merged
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_gcc/src/debuginfo.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ impl<'a, 'gcc, 'tcx> DebugInfoBuilderMethods for Builder<'a, 'gcc, 'tcx> {
}

impl<'gcc, 'tcx> DebugInfoMethods<'tcx> for CodegenCx<'gcc, 'tcx> {
fn create_vtable_metadata(&self, _ty: Ty<'tcx>, _trait_ref: Option<PolyExistentialTraitRef<'tcx>>, _vtable: Self::Value) {
fn create_vtable_debuginfo(&self, _ty: Ty<'tcx>, _trait_ref: Option<PolyExistentialTraitRef<'tcx>>, _vtable: Self::Value) {
// TODO(antoyo)
}

Expand Down
4 changes: 2 additions & 2 deletions compiler/rustc_codegen_llvm/src/allocator.rs
Original file line number Diff line number Diff line change
Expand Up @@ -140,8 +140,8 @@ pub(crate) unsafe fn codegen(
llvm::LLVMDisposeBuilder(llbuilder);

if tcx.sess.opts.debuginfo != DebugInfo::None {
let dbg_cx = debuginfo::CrateDebugContext::new(llmod);
debuginfo::metadata::compile_unit_metadata(tcx, module_name, &dbg_cx);
let dbg_cx = debuginfo::CodegenUnitDebugContext::new(llmod);
debuginfo::metadata::build_compile_unit_di_node(tcx, module_name, &dbg_cx);
dbg_cx.finalize(tcx.sess);
}
}
2 changes: 1 addition & 1 deletion compiler/rustc_codegen_llvm/src/consts.rs
Original file line number Diff line number Diff line change
Expand Up @@ -428,7 +428,7 @@ impl<'ll> StaticMethods for CodegenCx<'ll, '_> {
llvm::LLVMSetGlobalConstant(g, llvm::True);
}

debuginfo::create_global_var_metadata(self, def_id, g);
debuginfo::build_global_var_di_node(self, def_id, g);

if attrs.flags.contains(CodegenFnAttrFlags::THREAD_LOCAL) {
llvm::set_thread_local_mode(g, self.tls_model);
Expand Down
10 changes: 7 additions & 3 deletions compiler/rustc_codegen_llvm/src/context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ pub struct CodegenCx<'ll, 'tcx> {
pub isize_ty: &'ll Type,

pub coverage_cx: Option<coverageinfo::CrateCoverageContext<'ll, 'tcx>>,
pub dbg_cx: Option<debuginfo::CrateDebugContext<'ll, 'tcx>>,
pub dbg_cx: Option<debuginfo::CodegenUnitDebugContext<'ll, 'tcx>>,

eh_personality: Cell<Option<&'ll Value>>,
eh_catch_typeinfo: Cell<Option<&'ll Value>>,
Expand Down Expand Up @@ -396,8 +396,12 @@ impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {
};

let dbg_cx = if tcx.sess.opts.debuginfo != DebugInfo::None {
let dctx = debuginfo::CrateDebugContext::new(llmod);
debuginfo::metadata::compile_unit_metadata(tcx, codegen_unit.name().as_str(), &dctx);
let dctx = debuginfo::CodegenUnitDebugContext::new(llmod);
debuginfo::metadata::build_compile_unit_di_node(
tcx,
codegen_unit.name().as_str(),
&dctx,
);
Some(dctx)
} else {
None
Expand Down
57 changes: 4 additions & 53 deletions compiler/rustc_codegen_llvm/src/debuginfo/doc.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ The function will take care of probing the cache for an existing node for
that exact file path.

All private state used by the module is stored within either the
CrateDebugContext struct (owned by the CodegenCx) or the
CodegenUnitDebugContext struct (owned by the CodegenCx) or the
FunctionDebugContext (owned by the FunctionCx).

This file consists of three conceptual sections:
Expand Down Expand Up @@ -72,21 +72,16 @@ describe(t = List)
...
```

To break cycles like these, we use "forward declarations". That is, when
To break cycles like these, we use "stubs". That is, when
the algorithm encounters a possibly recursive type (any struct or enum), it
immediately creates a type description node and inserts it into the cache
*before* describing the members of the type. This type description is just
a stub (as type members are not described and added to it yet) but it
allows the algorithm to already refer to the type. After the stub is
inserted into the cache, the algorithm continues as before. If it now
encounters a recursive reference, it will hit the cache and does not try to
describe the type anew.

This behavior is encapsulated in the 'RecursiveTypeDescription' enum,
which represents a kind of continuation, storing all state needed to
continue traversal at the type members after the type has been registered
with the cache. (This implementation approach might be a tad over-
engineered and may change in the future)
describe the type anew. This behavior is encapsulated in the
`type_map::build_type_with_children()` function.


## Source Locations and Line Information
Expand Down Expand Up @@ -134,47 +129,3 @@ detection. The `create_argument_metadata()` and related functions take care
of linking the `llvm.dbg.declare` instructions to the correct source
locations even while source location emission is still disabled, so there
is no need to do anything special with source location handling here.

## Unique Type Identification

In order for link-time optimization to work properly, LLVM needs a unique
type identifier that tells it across compilation units which types are the
same as others. This type identifier is created by
`TypeMap::get_unique_type_id_of_type()` using the following algorithm:

1. Primitive types have their name as ID

2. Structs, enums and traits have a multipart identifier

1. The first part is the SVH (strict version hash) of the crate they
were originally defined in

2. The second part is the ast::NodeId of the definition in their
original crate

3. The final part is a concatenation of the type IDs of their concrete
type arguments if they are generic types.

3. Tuple-, pointer-, and function types are structurally identified, which
means that they are equivalent if their component types are equivalent
(i.e., `(i32, i32)` is the same regardless in which crate it is used).

This algorithm also provides a stable ID for types that are defined in one
crate but instantiated from metadata within another crate. We just have to
take care to always map crate and `NodeId`s back to the original crate
context.

As a side-effect these unique type IDs also help to solve a problem arising
from lifetime parameters. Since lifetime parameters are completely omitted
in debuginfo, more than one `Ty` instance may map to the same debuginfo
type metadata, that is, some struct `Struct<'a>` may have N instantiations
with different concrete substitutions for `'a`, and thus there will be N
`Ty` instances for the type `Struct<'a>` even though it is not generic
otherwise. Unfortunately this means that we cannot use `ty::type_id()` as
cheap identifier for type metadata -- we have done this in the past, but it
led to unnecessary metadata duplication in the best case and LLVM
assertions in the worst. However, the unique type ID as described above
*can* be used as identifier. Since it is comparatively expensive to
construct, though, `ty::type_id()` is still used additionally as an
optimization for cases where the exact same type has been seen before
(which is most of the time).
Loading