From db7c38aebd531f7e964d0bb97a439b2a26dcf86c Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 26 Sep 2024 17:46:39 +0800 Subject: [PATCH 1/8] Introduce new relocation for landing pad The R_RISCV_LPAD relocation can be used for PLT entry generation and also for linker relaxation. Additionally, we defined a new mapping symbol type to help users understand the function signature for the corresponding function. The addend value is the label value, and it will point to the mapping symbol placed at the beginning of the function. e.g. ```asm foo: # void foo(void) $sFvvE: lpad 123 # R_RISCV_LPAD $sFvvE + 123 ``` We propose two linker relaxations for the landing pad. The first is removing the entire landing pad, which can be used when symbols have local visibility, and the address is not taken by any other reference. The second is a landing pad scheme conversion, designed for backward compatibility (or as a workaround) for legacy programs that may use functions without declarations. --- riscv-elf.adoc | 84 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 83 insertions(+), 1 deletion(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index edcd1e02..2d672994 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -548,7 +548,9 @@ Description:: Additional information about the relocation <| S - P .2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only <| -.2+| 66-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use +.2+| 66 .2+| LPAD .2+| Static | .2+| Annotates the landing pad instruction inserted at the beginning of the function. The addend indicates the label value of the landing pad, and the symbol value is the address of the mapping symbol for the function signature, which will have the same address as the function. 
+ <| +.2+| 67-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use <| .2+| 191 .2+| VENDOR .2+| Static | .2+| Paired with a vendor-specific relocation and must be placed immediately before it, indicates which vendor owns the relocation. <| @@ -1582,6 +1584,7 @@ A number of symbols, named mapping symbols, describe the boundaries. | $x. | $x .2+| Start of a sequence of instructions with extension. | $x. +| $s | Marker for the landing pad instruction. This should only be used with the function signature-based scheme and should be placed only at the beginning of the function. |=== The mapping symbol should set the type to `STT_NOTYPE`, binding to `STB_LOCAL`, @@ -2317,6 +2320,85 @@ instructions. It is recommended to initialize `jvt` CSR immediately after csrw jvt, a0 ---- +==== Landing Pad Relaxation + + Target Relocation::: R_RISCV_LPAD + + Description:: This relaxation type can relax lpad instruction into a none, + which removed the lpad instruciton. + This relaxation type can be performe even without `R_RISCV_RELAX`, + but the linker should pad nop instruciton to the same length of the original + instruction sequence. + + Condition:: The associated function of this lpad must have local visibility, and + it must not be referenced by any relocation other than `R_RISCV_CALL` and + `R_RISCV_CALL_PLT`. + This relaxation can also be performed when the function has global visibility, + if the symbol does not have a corresponding PLT entry and is not referenced by + the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`. + + Relaxation:: + - Lpad instruciton associated with `R_RISCV_LPAD` can be removed. + - Lpad instruciton associated with `R_RISCV_LPAD` can be replaced with nop + instruction if the relacation isn't paired with `R_RISCV_RELAX`. 
+ + Example:: ++ +-- +Relaxation candidate: +[,asm] +---- + lpad 0x123 # R_RISCV_LPAD, R_RISCV_RELAX +---- + +Relaxation result: +[,asm] +---- + # No instruction +---- +Can be relaxed into `nop` if no `R_RISCV_RELAX` is paired with `R_RISCV_LPAD`. +[,asm] +---- + nop +---- +-- + +==== Landing Pad Scheme Relaxation + + Target Relocation::: R_RISCV_LPAD + + Description:: This relaxation type allows an `lpad` instruction to be relaxed + into `lpad 0`, which is a universal landing pad that ignores the label value + comparison. This relaxation is used when the label value is not computed + correctly. + + Condition:: This relaxation can be performed without `R_RISCV_RELAX`, and + should not be enabled by default. The user must explicitly enable this + relaxation, and it should only be applied during static linking. + + Relaxation:: + - Lpad instruction associated with `R_RISCV_LPAD` will be replaced with + `lpad 0`. + + Example:: ++ +-- +Relaxation candidate: +[,asm] +---- + lpad 0x123 # R_RISCV_LPAD +---- + +Relaxation result: +[,asm] +---- + lpad 0 +---- +-- + +NOTE: This relaxation is designed to be compatible with legacy programs that + may not declare the function signature correctly. + [bibliography] == References From 0726ba119c834d460cbab96a0f2d06b67dc51916 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 18 Oct 2024 16:36:21 +0800 Subject: [PATCH 2/8] Fix typo --- riscv-elf.adoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 2d672994..9d574dab 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2011,7 +2011,7 @@ NOTE: Tag_RISCV_x3_reg_usage is treated as 0 if it is not present. Relaxation:: - The `auipc` instruction associated with `R_RISCV_GOT_HI20` can be - removed if the symbol is absolute. + removes if the symbol is absolute. 
- The instruction or instructions associated with `R_RISCV_PCREL_LO12_I` can be rewritten to either `c.li` or `addi` to materialize the symbol's @@ -2338,8 +2338,8 @@ instructions. It is recommended to initialize `jvt` CSR immediately after the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`. Relaxation:: - - Lpad instruciton associated with `R_RISCV_LPAD` can be removed. - - Lpad instruciton associated with `R_RISCV_LPAD` can be replaced with nop + - Lpad instruction associated with `R_RISCV_LPAD` can be removed. + - Lpad instruction associated with `R_RISCV_LPAD` can be replaced with nop instruction if the relacation isn't paired with `R_RISCV_RELAX`. Example:: From 1e21e4257699276ae52aaf90ecceece8effe75c8 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 18 Oct 2024 16:37:01 +0800 Subject: [PATCH 3/8] Revise 'Landing Pad Relaxation' Rephrase to make it clear that the instruction can be removed. --- riscv-elf.adoc | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 9d574dab..138b82bb 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2324,10 +2324,15 @@ instructions. It is recommended to initialize `jvt` CSR immediately after Target Relocation::: R_RISCV_LPAD + Description:: This relaxation type allows the `lpad` instruction to be removed. + However, if `R_RISCV_RELAX` is not present, the `lpad` instruction can only be + replaced with a sequence of `nop` instructions of the same length as the + original instruction. + Description:: This relaxation type can relax lpad instruction into a none, - which removed the lpad instruciton. + which removed the lpad instruction. 
+ This relaxation type can be performed even without `R_RISCV_RELAX`, + but the linker should pad nop instruction to the same length of the original instruction sequence. Condition:: The associated function of this lpad must have local visibility, and From 5d43b5780494b922e6cd0a6cdb32f0c4d4c71ba6 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 18 Oct 2024 16:38:08 +0800 Subject: [PATCH 4/8] Revise 'Landing Pad Scheme Relaxation' - Drop the static-linking restriction - Emphasize that it must be applied to all `R_RISCV_LPAD` relocations - GNU property and PLT entries must be adjusted too. --- riscv-elf.adoc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 138b82bb..47bad138 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2379,11 +2379,14 @@ Can be relaxed into `nop` if no `R_RISCV_RELAX` is paired with `R_RISCV_LPAD`. Condition:: This relaxation can be performed without `R_RISCV_RELAX`, and should not be enabled by default. The user must explicitly enable this - relaxation, and it should only be applied during static linking. + relaxation. Additionally, if this relaxation is applied, it must be applied + consistently to all `R_RISCV_LPAD` relocations in the entire binary. Relaxation:: - Lpad instruction associated with `R_RISCV_LPAD` will be replaced with `lpad 0`. + - The GNU property must be adjusted to reflect the use of this relaxation. + - The format of the PLT entries must also be adjusted accordingly. 
Example:: + From 32688be38f8de70c74322272a11774a4f902543a Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 18 Oct 2024 16:43:33 +0800 Subject: [PATCH 5/8] Add Note to 'Landing Pad Scheme Relaxation' --- riscv-elf.adoc | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 47bad138..77dd6f14 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2407,6 +2407,11 @@ Relaxation result: NOTE: This relaxation is designed to be compatible with legacy programs that may not declare the function signature correctly. +NOTE: Dependent shared libraries will not undergo the corresponding +transformation. Therefore, if this Landing Pad Scheme Relaxation is used in a +dynamically linked environment, ensure that all dependent shared libraries are +rebuilt with the corresponding version. + [bibliography] == References From 02546dea2621f2a4850cd9a6920bc6f5bfd32495 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Fri, 18 Oct 2024 16:53:39 +0800 Subject: [PATCH 6/8] Relaxation condition updated based on symbol export to dynamic symbol table - Updated the relaxation condition to apply only when the symbol is not exported to the dynamic symbol table. --- riscv-elf.adoc | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 77dd6f14..701c1a16 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2335,12 +2335,10 @@ instructions. It is recommended to initialize `jvt` CSR immediately after but the linker should pad nop instruction to the same length of the original instruction sequence. - Condition:: The associated function of this lpad must have local visibility, and - it must not be referenced by any relocation other than `R_RISCV_CALL` and - `R_RISCV_CALL_PLT`. 
- This relaxation can also be performed when the function has global visibility, - if the symbol does not have a corresponding PLT entry and is not referenced by - the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`. + Condition:: This relaxation can only be applied if the symbol is **NOT** + exported to the dynamic symbol table and is only referenced by `R_RISCV_CALL` + or `R_RISCV_CALL_PLT` relocations. If the symbol is exported or referenced by + other relocations, relaxation cannot be performed. Relaxation:: - Lpad instruction associated with `R_RISCV_LPAD` can be removed. From bc7fd8774e17dcbfe66beb71712c68b08282b7c2 Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 21 Nov 2024 16:10:55 +0800 Subject: [PATCH 7/8] Revert the change made accidentally in an earlier patch --- riscv-elf.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 701c1a16..1541e1a9 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -2011,7 +2011,7 @@ NOTE: Tag_RISCV_x3_reg_usage is treated as 0 if it is not present. Relaxation:: - The `auipc` instruction associated with `R_RISCV_GOT_HI20` can be - removes if the symbol is absolute. + removed if the symbol is absolute. - The instruction or instructions associated with `R_RISCV_PCREL_LO12_I` can be rewritten to either `c.li` or `addi` to materialize the symbol's From 6e1638850425dd2b9e80c9d13070144238274fbb Mon Sep 17 00:00:00 2001 From: Kito Cheng Date: Thu, 21 Nov 2024 16:11:07 +0800 Subject: [PATCH 8/8] Add new Landing Pad Information Section --- riscv-elf.adoc | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/riscv-elf.adoc b/riscv-elf.adoc index 1541e1a9..5023073f 100644 --- a/riscv-elf.adoc +++ b/riscv-elf.adoc @@ -1212,6 +1212,7 @@ The defined processor-specific section types are listed in <>. 
| Name | Value | Attributes | SHT_RISCV_ATTRIBUTES | 0x70000003 | none +| SHT_RISCV_LADING_PAD_INFO | 0x70000004 | none |=== ==== Special Sections @@ -1226,12 +1227,16 @@ The defined processor-specific section types are listed in <>. | Name | Type | Attributes | .riscv.attributes | SHT_RISCV_ATTRIBUTES | none +| .riscv.lpadinfo | SHT_RISCV_LADING_PAD_INFO | none | .riscv.jvt | SHT_PROGBITS | SHF_ALLOC + SHF_EXECINSTR | .note.gnu.property | SHT_NOTE | SHF_ALLOC |=== +++.riscv.attributes+++ names a section that contains RISC-V ELF attributes. ++++.riscv.lpadinfo+++ names a section that contains RISC-V landing pad +information, which is used for generating PLTs and can also be used for debugging. + +++.riscv.jvt+++ is a linker-created section to store table jump target addresses. The minimum alignment of this section is 64 bytes. @@ -1570,6 +1575,51 @@ the `Zicfilp` extension. An executable or shared library with this bit set is required to generate PLTs with the landing pad (`lpad`) instruction, and all label are set to a value which hashed from its function signature. +=== Landing Pad Information Section (`.riscv.lpadinfo`) + +The landing pad information section contains the necessary information +for generating function-signature-based landing pad PLTs. This section may also +exist when the unlabeled landing pad scheme is used. 
+ +This section consists of entries with the following structure: + +``` +typedef struct +{ + Elf32_Word lpi_name; /* Symbol name (string tbl index) */ + Elf32_Word lpi_sig; /* Signature for the symbol (string tbl index) */ + Elf32_Word lpi_value; /* Landing pad value for the symbol */ +} Elf32_Lpadinfo; + +typedef struct +{ + Elf64_Word lpi_name; /* Symbol name (string tbl index) */ + Elf64_Word lpi_sig; /* Signature for the symbol (string tbl index) */ + Elf64_Word lpi_value; /* Landing pad value for the symbol */ +} Elf64_Lpadinfo; +``` + +The `lpi_name` field is the index into the string table for the symbol name. +The `lpi_sig` field is the index into the string table for the function +signature; it can be 0 if the signature string is not present. +The `lpi_value` field is the landing pad value for the symbol. + +The string referenced by the `lpi_sig` field is the function signature string, which +uses the same encoding as the mapping symbol for the function signature. + +NOTE: Using the same encoding as the mapping symbol aims to reduce the size of the +string table. + +Every symbol with global or weak binding must have a corresponding entry in this +section, and the `lpi_name` field must be the same as the symbol's name string table +index. + +This section can be discarded after the static linking stage. + +The static linker should emit an error if objects with the same symbol but different +landing pad values are being merged; however, it may suppress the error if +the landing pad scheme relaxation is enabled. + === Mapping Symbol The section can have a mixture of code and data or code with different ISAs.
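
The merge rule stated in the Landing Pad Information Section of PATCH 8/8 (an error when the same symbol carries different landing pad values, unless the landing pad scheme relaxation is enabled) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: `ResolvedLpadinfo` and `lpadinfo_can_merge` are hypothetical names that do not appear in the patch or in any existing linker, and the sketch assumes the `lpi_name` and `lpi_sig` string table indices have already been resolved to C strings, since the raw indices are not comparable across object files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of one .riscv.lpadinfo entry after its
   string table indices have been resolved (not part of the patch). */
typedef struct {
    const char *name;      /* resolved from lpi_name */
    const char *signature; /* resolved from lpi_sig; NULL when lpi_sig is 0 */
    uint32_t    value;     /* lpi_value: landing pad value for the symbol */
} ResolvedLpadinfo;

/* Returns true when two entries coming from different objects may be merged.
   Entries for the same symbol with different landing pad values are rejected
   unless the landing pad scheme relaxation is enabled. */
static bool lpadinfo_can_merge(const ResolvedLpadinfo *a,
                               const ResolvedLpadinfo *b,
                               bool scheme_relaxation_enabled)
{
    if (strcmp(a->name, b->name) != 0)
        return true;                  /* different symbols never conflict */
    if (a->value == b->value)
        return true;                  /* same symbol, same landing pad value */
    return scheme_relaxation_enabled; /* mismatch tolerated only when relaxing */
}
```

A linker following this sketch would report an error whenever `lpadinfo_can_merge` returns false; comparing only `value` rather than the signature string is a simplification, since the text states the rule in terms of the landing pad value.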