This document describes the use of the ELF binary file format in the Application Binary Interface (ABI) of the LoongArch Architecture.
Version | Description |
---|---|
20230519 | initial version, derived from the original LoongArch ELF psABI document. |
20231102 | added relocation R_LARCH_CALL36, removed R_LARCH_DELETE / R_LARCH_CFA, and fixed the uleb128 relocation name. |
20231219 | added the Code Models chapter; added TLS DESC relocations; polished the description of relocations. |
This specification provides the processor-specific definitions required by ELF for LoongArch-based systems.
All common ELF definitions referenced in this section can be found in the latest SysV gABI specification.
ELF
Executable and Linking Format
SysV gABI
Generic System V Application Binary Interface
PC
Program Counter
GOT
Global Offset Table
PLT
Procedure Linkage Table
TLS
Thread-Local Storage
An object file conforming to this specification must have the value EM_LOONGARCH (258, 0x102) in the e_machine field of its ELF header.
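As a rough illustration of the requirement above, the following Python sketch checks the e_machine field of a raw ELF header. The function name `is_loongarch_elf` is hypothetical, and the sketch assumes a little-endian header without inspecting EI_DATA; it is not a complete ELF parser.

```python
import struct

EM_LOONGARCH = 258  # 0x102, the value required by this specification


def is_loongarch_elf(header: bytes) -> bool:
    """Return True if the header bytes carry e_machine == EM_LOONGARCH."""
    # The header must at least cover e_ident (16 bytes), e_type and e_machine.
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    # e_machine is a 16-bit field at byte offset 18 in both ELF32 and ELF64.
    # Assumption: little-endian encoding (EI_DATA is not checked here).
    (e_machine,) = struct.unpack_from("<H", header, 18)
    return e_machine == EM_LOONGARCH
```

A caller would typically pass the first 20 or more bytes read from an object file.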
e_flags
Bit 31 - 8 | Bit 7 - 6 | Bit 5 - 3 | Bit 2 - 0 |
---|---|---|---|
(reserved) | ABI version | ABI extension | Base ABI Modifier |
The ABI type of an ELF object is uniquely identified by EI_CLASS and e_flags[7:0] in its header.
Within this combination, EI_CLASS and e_flags[2:0] correspond to the base ABI type, where the expression of C integral and pointer types (data model) is uniquely determined by the EI_CLASS value, and e_flags[2:0] represents additional properties of the base ABI type, including the FP calling convention. We refer to e_flags[2:0] as the base ABI modifier.
As a result, programs in lp64* / ilp32* ABI should only be encoded with ELF64 / ELF32 object files, respectively.
0x0, 0x4, 0x5, 0x6 and 0x7 are reserved values for e_flags[2:0].
Name | EI_CLASS | Base ABI Modifier (e_flags[2:0]) | Description |
---|---|---|---|
| | | | Uses 64-bit GPRs and the stack for parameter passing. Data model is |
| | | | Uses 64-bit GPRs, 32-bit FPRs and the stack for parameter passing. Data model is |
| | | | Uses 64-bit GPRs, 64-bit FPRs and the stack for parameter passing. Data model is |
| | | | Uses 32-bit GPRs and the stack for parameter passing. Data model is |
| | | | Uses 32-bit GPRs, 32-bit FPRs and the stack for parameter passing. Data model is |
| | | | Uses 32-bit GPRs, 64-bit FPRs and the stack for parameter passing. Data model is |
e_flags[5:3] corresponds to the ABI extension type.
Name | e_flags[5:3] | Description |
---|---|---|
| | | No extra ABI features. |
| | | (reserved) |
e_flags[7:6] marks the ABI version of an ELF object.
ABI version | Value | Description |
---|---|---|
| | | Stack operands base relocation type. |
| | | Supporting relocation types directly writing to immediate slots. Can be implemented separately without compatibility with v0. |
| | | Reserved. |
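The bit layout of e_flags described above can be unpacked with a few shifts and masks. The following Python sketch is only an illustration; the names `decode_e_flags` and `has_reserved_base_modifier` are hypothetical and not part of this specification.

```python
# Values of e_flags[2:0] that the text above marks as reserved.
RESERVED_BASE_MODIFIERS = {0x0, 0x4, 0x5, 0x6, 0x7}


def decode_e_flags(e_flags: int) -> dict:
    """Split a 32-bit e_flags value into the four fields of the layout table."""
    return {
        "reserved": (e_flags >> 8) & 0xFFFFFF,   # bits 31..8
        "abi_version": (e_flags >> 6) & 0x3,     # bits 7..6
        "abi_extension": (e_flags >> 3) & 0x7,   # bits 5..3
        "base_abi_modifier": e_flags & 0x7,      # bits 2..0
    }


def has_reserved_base_modifier(e_flags: int) -> bool:
    """Return True if e_flags[2:0] holds one of the reserved values."""
    return (e_flags & 0x7) in RESERVED_BASE_MODIFIERS
```

For example, `decode_e_flags(0x43)` yields ABI version 1, ABI extension 0 and base ABI modifier 3.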
Enum | ELF reloc type | Usage | Detail |
---|---|---|---|
0 | | | |
1 | | Runtime address resolving | |
2 | | Runtime address resolving | |
3 | | Runtime fixup for load-address | |
4 | | Runtime memory copy in executable | |
5 | | Runtime PLT supporting | implementation-defined |
6 | | Runtime relocation for TLS-GD | |
7 | | Runtime relocation for TLS-GD | |
8 | | Runtime relocation for TLS-GD | |
9 | | Runtime relocation for TLS-GD | |
10 | | Runtime relocation for TLS-IE | |
11 | | Runtime relocation for TLS-IE | |
12 | | Runtime local indirect function resolving | |
13 | | Runtime relocation for TLS descriptors | |
14 | | Runtime relocation for TLS descriptors | |
… Reserved for dynamic linker. | | | |
20 | | Mark la.abs | Load absolute address for static link. |
21 | | Mark external label branch | Access PC relative address for static link. |
22 | | Push PC-relative offset | |
23 | | Push constant or absolute address | |
24 | | Duplicate stack top | |
25 | | Push for access GOT entry | |
26 | | Push for TLS-LE | |
27 | | Push for TLS-IE | |
28 | | Push for TLS-GD | |
29 | | Push for external function calling | |
30 | | Assert stack top | |
31 | | Stack top operation | |
32 | | Stack top operation | |
33 | | Stack top operation | |
34 | | Stack top operation | |
35 | | Stack top operation | |
36 | | Stack top operation | |
37 | | Stack top operation | |
38 | | Instruction imm-field relocation | with check 5-bit signed overflow |
39 | | Instruction imm-field relocation | with check 12-bit unsigned overflow |
40 | | Instruction imm-field relocation | with check 12-bit signed overflow |
41 | | Instruction imm-field relocation | with check 16-bit signed overflow |
42 | | Instruction imm-field relocation | with check 18-bit signed overflow and 4-bit aligned |
43 | | Instruction imm-field relocation | with check 20-bit signed overflow |
44 | | Instruction imm-field relocation | with check 23-bit signed overflow and 4-bit aligned |
45 | | Instruction imm-field relocation | with check 28-bit signed overflow and 4-bit aligned |
46 | | Instruction fixup | with check 32-bit unsigned overflow |
47 | | 8-bit in-place addition | |
48 | | 16-bit in-place addition | |
49 | | 24-bit in-place addition | |
50 | | 32-bit in-place addition | |
51 | | 64-bit in-place addition | |
52 | | 8-bit in-place subtraction | |
53 | | 16-bit in-place subtraction | |
54 | | 24-bit in-place subtraction | |
55 | | 32-bit in-place subtraction | |
56 | | 64-bit in-place subtraction | |
57 | | GNU C++ vtable hierarchy | |
58 | | GNU C++ vtable member usage | |
… Reserved | | | |
64 | | 18-bit PC-relative jump | with check 18-bit signed overflow and 4-bit aligned |
65 | | 23-bit PC-relative jump | with check 23-bit signed overflow and 4-bit aligned |
66 | | 28-bit PC-relative jump | with check 28-bit signed overflow and 4-bit aligned |
67 | | [31 … 12] bits of 32/64-bit absolute address | |
68 | | [11 … 0] bits of 32/64-bit absolute address | |
69 | | [51 … 32] bits of 64-bit absolute address | |
70 | | [63 … 52] bits of 64-bit absolute address | |
71 | | [31 … 12] bits of 32/64-bit PC-relative offset | See Code Models for how it works on various code models. |
72 | | [11 … 0] bits of 32/64-bit address | See Code Models for how it works on various code models. |
73 | | [51 … 32] bits of 64-bit PC-relative offset | |
74 | | [63 … 52] bits of 64-bit PC-relative offset | |
75 | | [31 … 12] bits of 32/64-bit PC-relative offset to GOT entry | |
76 | | [11 … 0] bits of 32/64-bit GOT entry address | |
77 | | [51 … 32] bits of 64-bit PC-relative offset to GOT entry | |
78 | | [63 … 52] bits of 64-bit PC-relative offset to GOT entry | |
79 | | [31 … 12] bits of 32/64-bit GOT entry absolute address | |
80 | | [11 … 0] bits of 32/64-bit GOT entry absolute address | |
81 | | [51 … 32] bits of 64-bit GOT entry absolute address | |
82 | | [63 … 52] bits of 64-bit GOT entry absolute address | |
83 | | [31 … 12] bits of TLS LE 32/64-bit offset from TP register | |
84 | | [11 … 0] bits of TLS LE 32/64-bit offset from TP register | |
85 | | [51 … 32] bits of TLS LE 64-bit offset from TP register | |
86 | | [63 … 52] bits of TLS LE 64-bit offset from TP register | |
87 | | [31 … 12] bits of 32/64-bit PC-relative offset to TLS IE GOT entry | |
88 | | [11 … 0] bits of 32/64-bit TLS IE GOT entry address | |
89 | | [51 … 32] bits of 64-bit PC-relative offset to TLS IE GOT entry | |
90 | | [63 … 52] bits of 64-bit PC-relative offset to TLS IE GOT entry | |
91 | | [31 … 12] bits of 32/64-bit TLS IE GOT entry absolute address | |
92 | | [11 … 0] bits of 32/64-bit TLS IE GOT entry absolute address | |
93 | | [51 … 32] bits of 64-bit TLS IE GOT entry absolute address | |
94 | | [63 … 52] bits of 64-bit TLS IE GOT entry absolute address | |
95 | | [31 … 12] bits of 32/64-bit PC-relative offset to TLS LD GOT entry | |
96 | | [31 … 12] bits of 32/64-bit TLS LD GOT entry absolute address | |
97 | | [31 … 12] bits of 32/64-bit PC-relative offset to TLS GD GOT entry | |
98 | | [31 … 12] bits of 32/64-bit TLS GD GOT entry absolute address | |
99 | | 32-bit PC relative | |
100 | | Instruction can be relaxed, paired with a normal relocation at the same address | |
101 | | | |
102 | | Alignment statement. If the symbol index is 0, the addend indicates the number of bytes occupied by nop instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two. If the symbol index is not 0, the addend indicates the first and third expressions of .align. The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression. | |
103 | | 22-bit PC-relative offset | |
104 | | | |
105 | | low 6-bit in-place addition | |
106 | | low 6-bit in-place subtraction | |
107 | | ULEB128 in-place addition | |
108 | | ULEB128 in-place subtraction | |
109 | | 64-bit PC relative | |
110 | | Used for medium code model function call sequence | |
111 | | [31 … 12] bits of 32/64-bit PC-relative offset to TLS DESC GOT entry | |
112 | | [11 … 0] bits of 32/64-bit TLS DESC GOT entry address | |
113 | | [51 … 32] bits of 64-bit PC-relative offset to TLS DESC GOT entry | |
114 | | [63 … 52] bits of 64-bit PC-relative offset to TLS DESC GOT entry | |
115 | | [31 … 12] bits of 32/64-bit TLS DESC GOT entry absolute address | |
116 | | [11 … 0] bits of 32/64-bit TLS DESC GOT entry absolute address | |
117 | | [51 … 32] bits of 64-bit TLS DESC GOT entry absolute address | |
118 | | [63 … 52] bits of 64-bit TLS DESC GOT entry absolute address | |
119 | | Used on ld.[wd] for TLS DESC to get the resolve function address from GOT entry | |
120 | | Used on jirl for TLS DESC to call the resolve function | |
121 | | [31 … 12] bits of TLS LE 32/64-bit offset from TP register, can be relaxed | |
122 | | TLS LE thread pointer usage, can be relaxed | |
123 | | [11 … 0] bits of TLS LE 32/64-bit offset from TP register, sign-extended, can be relaxed. | |
124 | | 22-bit PC-relative offset to TLS LD GOT entry | |
125 | | 22-bit PC-relative offset to TLS GD GOT entry | |
126 | | 22-bit PC-relative offset to TLS DESC GOT entry | |
Variable | Description |
---|---|
| | Runtime address of the symbol in the relocation entry |
| | The address of the instruction to be relocated |
| | Base address of an object loaded into the memory |
| | The address of the symbol in the relocation entry |
| | Addend field in the relocation entry associated with the symbol |
| | The address of GOT (Global Offset Table) |
| | GOT-relative offset of the GOT entry of a symbol. For TLS LD/GD symbols, G is always equal to GD. |
| | TP-relative offset of a TLS LE/IE symbol |
| | GOT-relative offset of the GOT entry of a TLS IE symbol |
| | GOT-relative offset of the GOT entry of a TLS LD/GD/DESC symbol. If a symbol is referenced by IE, GD/LD and DESC simultaneously, this symbol has five GOT entries. The first two are for GD/LD; the next two are for DESC; the last one is for IE. |
| | The address of PLT entry of a function symbol |
As a RISC architecture, LoongArch is limited in the range of memory addresses that can be encoded and accessed with a single instruction. Several code models are defined as schemes for implementing memory accesses in different circumstances, each using instruction sequences with the addressing capability it needs and a corresponding performance cost.
Generally speaking, wider addressing range requires more instructions and brings higher overhead. The performance and size of an application can benefit from a code model that does not overestimate the memory space accessed by the code.
The normal code model allows the code to address a 4GiB PC-relative memory space [(PC & ~0xfff)-2GiB-0x800, (PC & ~0xfff)+2GiB-0x800) for data accesses and a 256MiB PC-relative addressing space [PC-128MiB, PC+128MiB-4] for function calls.
This is the default code model.
The following example shows how to load a value from a global 32-bit integer variable g1 in this code model:
00: pcalau12i $t0, %pc_hi20(g1)
    0: R_LARCH_PCALA_HI20 g1
04: ld.w $a0, $t0, %pc_lo12(g1)
    4: R_LARCH_PCALA_LO12 g1
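The data range quoted for the normal code model follows from how the two immediates combine: pcalau12i adds a sign-extended 20-bit page delta to the page of PC, and the load adds a sign-extended 12-bit offset. The Python sketch below is an illustration derived from those instruction semantics rather than from the relocation formulas of this document. The helper names are hypothetical, and the symbol address is assumed to already include any addend.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def pcala_fields(pc: int, target: int) -> tuple:
    """Return (hi20, lo12) immediates for a pcalau12i + load pair."""
    # The +0x800 compensates for the sign extension of the 12-bit offset.
    hi20 = ((target + 0x800) >> 12) - (pc >> 12)
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("target outside the normal code model data range")
    return hi20 & 0xFFFFF, target & 0xFFF


def pcala_resolve(pc: int, hi20: int, lo12: int) -> int:
    """Model the address formed by pcalau12i followed by a 12-bit offset."""
    base = (pc & ~0xFFF) + (sext(hi20, 20) << 12)
    return base + sext(lo12, 12)
```

With pc = 0x120000abc and target = 0x120008f00, `pcala_fields` returns (0x9, 0xf00): the page delta is rounded up by one page and the low part acts as a negative offset of 0x100.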
The following example shows how to make function calls in this code model:
00: bl %plt(puts)
    0: R_LARCH_B26 puts
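The function call range [PC-128MiB, PC+128MiB-4] corresponds to a signed 26-bit instruction offset counted in units of 4 bytes. A minimal Python sketch of that reachability check follows; the function names are hypothetical.

```python
def b26_reachable(pc: int, target: int) -> bool:
    """Return True if a single bl at pc can reach target."""
    offset = target - pc
    # Offsets must be 4-byte aligned and fit in 26 signed bits after >> 2.
    return offset % 4 == 0 and -(1 << 27) <= offset <= (1 << 27) - 4


def b26_immediate(pc: int, target: int) -> int:
    """Return the 26-bit immediate for a reachable target."""
    if not b26_reachable(pc, target):
        raise ValueError("target not reachable with a 26-bit branch offset")
    return ((target - pc) >> 2) & 0x3FFFFFF
```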
For data accesses, the medium code model behaves the same as the normal code model.
For function calls, this code model allows the code to address a 256GiB PC-relative memory space [PC-128GiB-0x20000, PC+128GiB-0x20000-4].
The following example shows how to make a function call to foo in this code model:
00: pcaddu18i $ra, %call36(foo)
    0: R_LARCH_CALL36 foo
04: jirl $ra, $ra, 0
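The asymmetric range of the medium code model comes from splitting the offset between a sign-extended 20-bit immediate shifted left by 18 (pcaddu18i) and a sign-extended 16-bit immediate shifted left by 2 (jirl). The Python sketch below is derived from those instruction semantics under that assumption; the helper names are hypothetical.

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def call36_fields(pc: int, target: int) -> tuple:
    """Return (hi20, lo16) immediates for a pcaddu18i + jirl pair."""
    offset = target - pc
    if offset % 4:
        raise ValueError("call target must be 4-byte aligned")
    # The +0x20000 compensates for the sign extension of the jirl offset.
    hi20 = (offset + 0x20000) >> 18
    if not -(1 << 19) <= hi20 < (1 << 19):
        raise ValueError("target outside the medium code model call range")
    lo16 = (offset >> 2) & 0xFFFF
    return hi20 & 0xFFFFF, lo16


def call36_resolve(pc: int, hi20: int, lo16: int) -> int:
    """Model the jump target formed by the two instructions."""
    ra = pc + (sext(hi20, 20) << 18)      # pcaddu18i
    return ra + (sext(lo16, 16) << 2)     # jirl
```

The largest accepted offset is 128GiB-0x20000-4 and the smallest is -128GiB-0x20000, matching the interval stated above.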
The extreme code model uses the sequence pcalau12i + addi.d + lu32i.d + lu52i.d followed by {ld,st}x.[bhwd] or {add,ldx}.d + jirl to address the full 64-bit memory space for data accesses and function calls, respectively.
Note: Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that the linker can infer the PC of pcalau12i to apply relocations to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if these four instructions are not in the same 4KiB page.
The following example shows how to load a value from a global 32-bit integer variable g2 in this code model:
00: pcalau12i $t1, %pc_hi20(g2)
    0: R_LARCH_PCALA_HI20 g2
04: addi.d $t0, $zero, %pc_lo12(g2)
    4: R_LARCH_PCALA_LO12 g2
08: lu32i.d $t0, %pc64_lo20(g2)
    8: R_LARCH_PCALA64_LO20 g2
0c: lu52i.d $t0, $t0, %pc64_hi12(g2)
    c: R_LARCH_PCALA64_HI12 g2
10: ldx.w $a0, $t1, $t0
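To see why the upper fields depend on the PC of pcalau12i, the four-instruction sequence can be modelled in Python. This sketch derives the immediates purely from the instruction semantics (sign-extending addi.d, lu32i.d replacing bits 63..32, lu52i.d replacing bits 63..52) and is not the linker's actual algorithm; all function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extreme_fields(pc: int, target: int) -> tuple:
    """Return (hi20, lo12, lo20_64, hi12_64) for the extreme sequence."""
    page = pc & ~0xFFF
    lo12 = target & 0xFFF
    t0_low = sext(lo12, 12) & 0xFFFFFFFF           # low half of $t0 after addi.d
    hi20 = ((target - t0_low - page) & 0xFFFFFFFF) >> 12
    t1 = (page + (sext(hi20, 20) << 12)) & MASK64  # $t1 after pcalau12i
    upper = ((target - t1 - t0_low) & MASK64) >> 32
    return hi20, lo12, upper & 0xFFFFF, upper >> 20


def extreme_resolve(pc, hi20, lo12, lo20_64, hi12_64) -> int:
    """Model $t1 + $t0 as used by ldx.w after the four instructions."""
    t1 = ((pc & ~0xFFF) + (sext(hi20, 20) << 12)) & MASK64
    t0 = sext(lo12, 12) & MASK64                                     # addi.d
    t0 = ((sext(lo20_64, 20) << 32) | (t0 & 0xFFFFFFFF)) & MASK64    # lu32i.d
    t0 = (hi12_64 << 52) | (t0 & ((1 << 52) - 1))                    # lu52i.d
    return (t0 + t1) & MASK64
```

Because `upper` is computed from the value pcalau12i leaves in $t1, the last two immediates are only correct when they are derived from the PC of that pcalau12i.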
The following example shows how to make a call to function bar in this code model:
00: pcalau12i $t1, %pc_hi20(bar)
    0: R_LARCH_PCALA_HI20 bar
04: addi.d $t0, $zero, %pc_lo12(bar)
    4: R_LARCH_PCALA_LO12 bar
08: lu32i.d $t0, %pc64_lo20(bar)
    8: R_LARCH_PCALA64_LO20 bar
0c: lu52i.d $t0, $t0, %pc64_hi12(bar)
    c: R_LARCH_PCALA64_HI12 bar
10: add.d $t0, $t0, $t1
14: jirl $ra, $t0, 0
[SysVelf] System V Application Binary Interface - DRAFT, 10 Jun. 2013, http://www.sco.com/developers/gabi/latest/contents.html