[RV64_DYNAREC] Fix some typos in docs and dynarec/rv64 #1758

Merged
merged 5 commits into from
Aug 26, 2024
6 changes: 3 additions & 3 deletions docs/COMPILE.md
@@ -98,7 +98,7 @@ Using a 64bit OS:

Using a 64bit OS:

-Caution: please use gcc-11 or higher, older gcc dosen't know cortex-a78ae
+Caution: please use gcc-11 or higher, older gcc doesn't know cortex-a78ae
 ```
 -D TEGRA_T234=1 -D CMAKE_BUILD_TYPE=RelWithDebInfo
 ```
@@ -211,7 +211,7 @@ If you encounter some linking errors, try using `NOLOADADDR=ON` (`cmake -D NOLOA

### Use ccmake

-Alternatively, you can **use the curses-bases ccmake (or any other gui frontend for cmake)** to select wich platform to use interactively.
+Alternatively, you can **use the curses-based ccmake (or any other gui frontend for cmake)** to select which platform to use interactively.

### Customize your build

@@ -245,7 +245,7 @@ You need to add `-DWITH_MOLD=1` if GNU ld is extremely slow. Then run `mold -run

#### Build a statically linked box64

-You can now build box64 staticaly linked, with `-DSTATICBUILD`. This is to use inside an x86_64 chroot. Note that this version of box64 will have just the minimum of wrapped libs. So only libc, libm and libpthread basically are wrapped. Other libs (like libGL or libvulkan, SDL2, etc...) will not be wrapped and x86_64 version will be used. It's designed to be used in docker image, or in headless server.
+You can now build box64 statically linked, with `-DSTATICBUILD`. This is to use inside an x86_64 chroot. Note that this version of box64 will have just the minimum of wrapped libs. So only libc, libm and libpthread basically are wrapped. Other libs (like libGL or libvulkan, SDL2, etc...) will not be wrapped and x86_64 version will be used. It's designed to be used in docker image, or in headless server.
Also, the Static Build is highly experimental, but feedback are always welcomed.

----
26 changes: 13 additions & 13 deletions docs/USAGE.md
@@ -7,7 +7,7 @@ Env. var with * can also be put inside box64rc files.
Box64 look for 2 places for rcfile: `/etc/box64.box64rc` and `~/.box64rc`
The second takes precedence to the first, on an APP level
(that means if an [MYAPP] my appears in both file, only the settings in `~/.box64rc` will be applied)
-There is also some égeneric" name, like [*SETUP*] that will be applied to every program containg "setup" in the name
+There is also some égeneric" name, like [*SETUP*] that will be applied to every program containing "setup" in the name
(Note that this is not a full regex rules, it's just a name between '[*' and '*]', nothing else)

#### BOX64_LOG *
@@ -43,7 +43,7 @@ Enables/Disables the logging of `dlsym` errors.
#### BOX64_TRACE_FILE *
Send all log and trace to a file instead of `stdout`
Also, if name contains `%pid` then this is replaced by the actual PID of box64 instance
-End the filename with `+` to have thetrace appended instead of overwritten
+End the filename with `+` to have the trace appended instead of overwritten
Use `stderr` to use this instead of default `stdout`
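
As a rough illustration of the `BOX64_TRACE_FILE` rules in this hunk (`%pid` substitution and a trailing `+` for append), here is a minimal C sketch. `resolve_trace_name` and its parameters are hypothetical names introduced for this example; this is not box64's actual implementation, and it does not cover the `stderr` special case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper, NOT box64's real code: resolves a BOX64_TRACE_FILE-style
// name. Strips a trailing '+' (append mode) and replaces the first "%pid" with
// the given pid. Returns 1 when append mode was requested, 0 otherwise.
int resolve_trace_name(const char* name, long pid, char* out, size_t outsz)
{
    assert(name != NULL && out != NULL && outsz > 0);

    char tmp[512];
    snprintf(tmp, sizeof(tmp), "%s", name); // bounded copy, truncates long names

    int append = 0;
    size_t len = strlen(tmp);
    if (len > 0 && tmp[len - 1] == '+') {
        tmp[len - 1] = '\0'; // drop the marker, remember the mode
        append = 1;
    }

    const char* p = strstr(tmp, "%pid");
    if (p != NULL) {
        // text before "%pid", then the pid, then the text after "%pid"
        snprintf(out, outsz, "%.*s%ld%s", (int)(p - tmp), tmp, pid, p + 4);
    } else {
        snprintf(out, outsz, "%s", tmp);
    }
    return append;
}
```

A caller would then open the resolved name with `fopen(out, append ? "a" : "w")`.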

#### BOX64_TRACE *
@@ -97,7 +97,7 @@ Show Segfault signal even if a signal handler is present
* 1 : Show SIGSEGV detail, even if a signal handler is present

#### BOX64_SHOWBT *
-Show some Backtrace (Nativ e and Emulated) whgen a signal (SEGV, ILL or BUS) is caught
+Show some Backtrace (Native and Emulated) when a signal (SEGV, ILL or BUS) is caught
* 0 : Don"t show backtraces (Default.)
* 1 : Show Backtrace detail (for native, box64 is rename as the x86_64 binary run)

@@ -151,7 +151,7 @@ Forbid dynablock creation in the interval specified (helpful for debugging behav
#### BOX64_DYNAREC_TEST *
Dynarec will compare it's execution with the interpreter (super slow, only for testing)
* 0 : No comparison. (Default.)
-* 1 : Each opcode runs on interepter and on Dynarec, and regs and memory are compared and print if different.
+* 1 : Each opcode runs on interpreter and on Dynarec, and regs and memory are compared and print if different.
* 2 : Thread-safe tests, extremely slow.
* 0xXXXXXXXX-0xYYYYYYYY : define the interval where dynarec is tested (inclusive-exclusive)

@@ -203,7 +203,7 @@ Optimisation of CALL/RET opcodes (not compatible with jit/dynarec/smc)
#### BOX64_DYNAREC_ALIGNED_ATOMICS *
Generated code for aligned atomics only
* 0 : The code generated can handle unaligned atomics (Default)
-* 1 : Generated code only for aligned atomics (faster and less code generated, but will SEGBUS if LOCK prefix is unsed on unaligned data)
+* 1 : Generated code only for aligned atomics (faster and less code generated, but will SEGBUS if LOCK prefix is unused on unaligned data)

#### BOX64_DYNAREC_BLEEDING_EDGE *
Detect MonoBleedingEdge and apply conservative settings
@@ -290,7 +290,7 @@ Box64 will use wrapped libs even if the lib is specified with absolute path
* 1 : Use Wrapped native libs even if path is absolute

#### BOX64_PREFER_EMULATED *
-Box64 will prefer emulated libs first (execpt for glibc, alsa, pulse, GL, vulkan and X11
+Box64 will prefer emulated libs first (except for glibc, alsa, pulse, GL, vulkan and X11
* 0 : Native libs are preferred (Default.)
* 1 : Emulated libs are preferred (Default for program running inside pressure-vessel)

@@ -321,14 +321,14 @@ Disables the load of vulkan libraries.
* 1 : Disables the load of vulkan libraries, both the native and the i386 version (can be useful on Pi4, where the vulkan driver is not quite there yet.)

#### BOX64_SHAEXT *
-Expose or not SHAEXT (a.k.a. SHA_NI) capabilites
-* 0 : Do not expose SHAEXT capabilites
-* 1 : Expose SHAEXT capabilites (Default.)
+Expose or not SHAEXT (a.k.a. SHA_NI) capabilities
+* 0 : Do not expose SHAEXT capabilities
+* 1 : Expose SHAEXT capabilities (Default.)

#### BOX64_SSE42 *
-Expose or not SSE 4.2 capabilites
-* 0 : Do not expose SSE 4.2 capabilites (default when libjvm is detected)
-* 1 : Expose SSE 4.2 capabilites (Default.)
+Expose or not SSE 4.2 capabilities
+* 0 : Do not expose SSE 4.2 capabilities (default when libjvm is detected)
+* 1 : Expose SSE 4.2 capabilities (Default.)

#### BOX64_FUTEX_WAITV *
Use of the new fuext_waitc syscall
@@ -361,7 +361,7 @@ Define x86_64 bash to launch script
`set waiting=0` to exit the infinite loop.
* 2 : Launch `gdbserver` when a segfault, bus error or illegal instruction signal is trapped, attached to the offending process, and go in an endless loop, waiting.
Use `gdb /PATH/TO/box64` and then `target remote 127.0.0.1:1234` to connect to the gdbserver (or use actual IP if not on the machine). After that, the procedure is the same as with ` BOX64_JITGDB=1`.
-This mode can be usefullwhen programs redirect all console output to a file (like Unity3D Games)
+This mode can be usefull when programs redirect all console output to a file (like Unity3D Games)
* 3 : Launch `lldb` when a segfault, bus error or illegal instruction signal is trapped, attached to the offending process and go in an endless loop, waiting.

#### BOX64_NORCFILES
10 changes: 5 additions & 5 deletions docs/box64.pod
@@ -205,14 +205,14 @@ Disable handling of SigILL (to ease debugging mainly).

Show Segfault signal even if a signal handler is present

-* 0 : Don"t force show the SIGSEGV analysis (Default.)
+* 0 : Don't force show the SIGSEGV analysis (Default.)
* 1 : Show SIGSEGV detail, even if a signal handler is present

=item B<BOX64_SHOWBT>=I<0|1>

-Show some Backtrace (Nativ e and Emulated) whgen a signal (SEGV, ILL or BUS) is caught
+Show some Backtrace (Nativ e and Emulated) when a signal (SEGV, ILL or BUS) is caught

-* 0 : Don"t show backtraces (Default.)
+* 0 : Don't show backtraces (Default.)
* 1 : Show Backtrace detail (for native, box64 is rename as the x86_64 binary run)

=item B<BOX64_X11THREADS>=I<0|1>
@@ -407,7 +407,7 @@ Box64 will use wrapped libs even if the lib is specified with absolute path

=item B<BOX64_PREFER_EMULATED>=I<0|1>

-Box64 will prefer emulated libs first (execpt for glibc, alsa, pulse, GL,
+Box64 will prefer emulated libs first (except for glibc, alsa, pulse, GL,
vulkan and X11

* 0 : Native libs are preferred (Default.)
@@ -465,7 +465,7 @@ script. yyyy needs to be a full path to a valid x86_64 version of bash

* 0 : Just print the Segfault message on segfault (default)
* 1 : Launch `gdb` when a segfault, bus error or illegal instruction signal is trapped, attached to the offending process and go in an endless loop, waiting. When in gdb, you need to find the correct thread yourself (the one with `my_box64signalhandler` in is stack) then probably need to `finish` 1 or 2 functions (inside `usleep(..)`) and then you'll be in `my_box64signalhandler`, just before the printf of the Segfault message. Then simply `set waiting=0` to exit the infinite loop.
-* 2 : Launch `gdbserver` when a segfault, bus error or illegal instruction signal is trapped, attached to the offending process, and go in an endless loop, waiting. Use `gdb /PATH/TO/box64` and then `target remote 127.0.0.1:1234` to connect to the gdbserver (or use actual IP if not on the machine). After that, the procedure is the same as with ` BOX64_JITGDB=1`. This mode can be usefullwhen programs redirect all console output to a file (like Unity3D Games)
+* 2 : Launch `gdbserver` when a segfault, bus error or illegal instruction signal is trapped, attached to the offending process, and go in an endless loop, waiting. Use `gdb /PATH/TO/box64` and then `target remote 127.0.0.1:1234` to connect to the gdbserver (or use actual IP if not on the machine). After that, the procedure is the same as with ` BOX64_JITGDB=1`. This mode can be usefull when programs redirect all console output to a file (like Unity3D Games)

=back

2 changes: 1 addition & 1 deletion src/dynarec/rv64/dynarec_rv64_00_3.c
@@ -1181,7 +1181,7 @@ uintptr_t dynarec64_00_3(dynarec_rv64_t* dyn, uintptr_t addr, uintptr_t ip, int
&& dyn->insts[ninst-1].x64.addr
&& *(uint8_t*)(dyn->insts[ninst-1].x64.addr)==0xB8
&& *(uint32_t*)(dyn->insts[ninst-1].x64.addr+1)==0) {
-// hack for some protection that check a divide by zero actualy trigger a divide by zero exception
+// hack for some protection that check a divide by zero actually trigger a divide by zero exception
MESSAGE(LOG_INFO, "Divide by 0 hack\n");
GETIP(ip);
STORE_XEMU_CALL(x3);
2 changes: 1 addition & 1 deletion src/dynarec/rv64/dynarec_rv64_0f.c
@@ -1718,7 +1718,7 @@ uintptr_t dynarec64_0F(dynarec_rv64_t* dyn, uintptr_t addr, uintptr_t ip, int ni
NOTEST(x1);
MV(A1, xRAX);
CALL_(my_cpuid, -1, 0);
-// BX and DX are not synchronized durring the call, so need to force the update
+// BX and DX are not synchronized during the call, so need to force the update
LD(xRDX, xEmu, offsetof(x64emu_t, regs[_DX]));
LD(xRBX, xEmu, offsetof(x64emu_t, regs[_BX]));
break;
2 changes: 1 addition & 1 deletion src/dynarec/rv64/dynarec_rv64_660f.c
@@ -269,7 +269,7 @@ uintptr_t dynarec64_660F(dynarec_rv64_t* dyn, uintptr_t addr, uintptr_t ip, int

ADDI(x5, xEmu, offsetof(x64emu_t, scratch));

-// perserve gd
+// preserve gd
LD(x3, gback, gdoffset + 0);
LD(x4, gback, gdoffset + 8);
SD(x3, x5, 0);
2 changes: 1 addition & 1 deletion src/dynarec/rv64/dynarec_rv64_f0.c
@@ -49,7 +49,7 @@ uintptr_t dynarec64_F0(dynarec_rv64_t* dyn, uintptr_t addr, uintptr_t ip, int ni

GETREX();

-// TODO: Add support for unligned memory access for all the LOCK ones.
+// TODO: Add support for unaligned memory access for all the LOCK ones.
// TODO: Add support for BOX4_DYNAREC_ALIGNED_ATOMICS.

switch(opcode) {
8 changes: 4 additions & 4 deletions src/dynarec/rv64/dynarec_rv64_helper.c
@@ -888,7 +888,7 @@ int extcache_st_coherency(dynarec_rv64_t* dyn, int ninst, int a, int b)
return i1;
}

-// On step 1, Float/Double for ST is actualy computed and back-propagated
+// On step 1, Float/Double for ST is actually computed and back-propagated
// On step 2-3, the value is just read for inst[...].n.neocache[..]
// the reg returned is *2 for FLOAT
int x87_do_push(dynarec_rv64_t* dyn, int ninst, int s1, int t)
@@ -2207,7 +2207,7 @@ static void fpuCacheTransform(dynarec_rv64_t* dyn, int ninst, int s1, int s2, in
extcache_t cache = dyn->e;
int s1_val = 0;
int s2_val = 0;
-// unload every uneeded cache
+// unload every unneeded cache
// check SSE first, than MMX, in order, for optimisation issue
for (int i = 0; i < 16; ++i) {
int j = findCacheSlot(dyn, ninst, EXT_CACHE_SS, i, &cache);
@@ -2374,7 +2374,7 @@ void CacheTransform(dynarec_rv64_t* dyn, int ninst, int cacheupd, int s1, int s2

void rv64_move32(dynarec_rv64_t* dyn, int ninst, int reg, int32_t val, int zeroup)
{
-// Depending on val, the following insns are emitted.
+// Depending on val, the following insts are emitted.
// val == 0 -> ADDI
// lo12 != 0 && hi20 == 0 -> ADDI
// lo12 == 0 && hi20 != 0 -> LUI
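
The `lo12`/`hi20` cases in the comment above come from splitting a 32-bit constant into the 20-bit upper part that a `LUI` can load and the sign-extended 12-bit immediate that an `ADDI` can add. A minimal sketch of that split, assuming the usual RISC-V convention; `imm_split_t`, `split_imm32` and `join_imm32` are hypothetical names for this note and are not taken from `rv64_move32`:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical types/helpers for illustration only (not box64 code).
typedef struct {
    int32_t hi20; // value for the LUI immediate field (20 bits)
    int32_t lo12; // sign-extended ADDI immediate, in [-2048, 2047]
} imm_split_t;

static inline imm_split_t split_imm32(int32_t val)
{
    imm_split_t s;
    // Low 12 bits, sign-extended the way ADDI interprets its immediate.
    s.lo12 = (int32_t)((uint32_t)val & 0xFFFu);
    if (s.lo12 >= 0x800)
        s.lo12 -= 0x1000;
    // Upper 20 bits, compensated for the sign of lo12 (wraps modulo 2^32).
    s.hi20 = (int32_t)((((uint32_t)val - (uint32_t)s.lo12) >> 12) & 0xFFFFFu);
    assert(s.lo12 >= -2048 && s.lo12 <= 2047);
    return s;
}

// Recombines the two parts using 32-bit wrap-around arithmetic.
static inline int32_t join_imm32(imm_split_t s)
{
    return (int32_t)(((uint32_t)s.hi20 << 12) + (uint32_t)s.lo12);
}
```

With this split, `val == -1` yields `hi20 == 0` and `lo12 == -1`, which falls in the "ADDI only" row of the comment, while `0x1000` yields `hi20 == 1` and `lo12 == 0`, the "LUI only" row.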
@@ -2423,7 +2423,7 @@ void fpu_reflectcache(dynarec_rv64_t* dyn, int ninst, int s1, int s2, int s3)

void fpu_unreflectcache(dynarec_rv64_t* dyn, int ninst, int s1, int s2, int s3)
{
-// need to undo the top and stack tracking that must not be reflected permenatly yet
+// need to undo the top and stack tracking that must not be reflected permanently yet
x87_unreflectcache(dyn, ninst, s1, s2, s3);
}

2 changes: 1 addition & 1 deletion src/dynarec/rv64/dynarec_rv64_helper.h
@@ -1398,7 +1398,7 @@ void emit_shld16c(dynarec_rv64_t* dyn, int ninst, rex_t rex, int s1, int s2, uin
void emit_pf(dynarec_rv64_t* dyn, int ninst, int s1, int s3, int s4);

// x87 helper
-// cache of the local stack counter, to avoid upadte at every call
+// cache of the local stack counter, to avoid update at every call
int x87_stackcount(dynarec_rv64_t* dyn, int ninst, int scratch);
// restore local stack counter
void x87_unstackcount(dynarec_rv64_t* dyn, int ninst, int scratch, int count);
14 changes: 7 additions & 7 deletions src/dynarec/rv64/dynarec_rv64_private.h
@@ -87,10 +87,10 @@ typedef struct flagcache_s {

typedef struct instruction_rv64_s {
instruction_x64_t x64;
-uintptr_t address; // (start) address of the arm emitted instruction
+uintptr_t address; // (start) address of the riscv emitted instruction
uintptr_t epilog; // epilog of current instruction (can be start of next, or barrier stuff)
-int size; // size of the arm emitted instruction
-int size2; // size of the arm emitted instrucion after pass2
+int size; // size of the riscv emitted instruction
+int size2; // size of the riscv emitted instruction after pass2
int pred_sz; // size of predecessor list
int *pred; // predecessor array
uintptr_t mark[3];
@@ -107,8 +107,8 @@ typedef struct instruction_rv64_s {
uint16_t ymm0_out; // the ymmm0 at th end of the opcode
uint16_t ymm0_pass2, ymm0_pass3;
int barrier_maybe;
-flagcache_t f_exit; // flags status at end of intruction
-extcache_t e; // extcache at end of intruction (but before poping)
+flagcache_t f_exit; // flags status at end of instruction
+extcache_t e; // extcache at end of instruction (but before poping)
flagcache_t f_entry; // flags status before the instruction begin
uint8_t vector_sew;
} instruction_rv64_t;
@@ -120,8 +120,8 @@ typedef struct dynarec_rv64_s {
uintptr_t start; // start of the block
uint32_t isize; // size in byte of x64 instructions included
void* block; // memory pointer where next instruction is emitted
-uintptr_t native_start; // start of the arm code
-size_t native_size; // size of emitted arm code
+uintptr_t native_start; // start of the riscv code
+size_t native_size; // size of emitted riscv code
uintptr_t last_ip; // last set IP in RIP (or NULL if unclean state) TODO: move to a cache something
uint64_t* table64; // table of 64bits value
int table64size;// size of table (will be appended at end of executable code)
14 changes: 7 additions & 7 deletions src/dynarec/rv64/rv64_emitter.h
@@ -257,7 +257,7 @@ f28–31 ft8–11 FP temporaries Caller
#define BLTU(rs1, rs2, imm13) EMIT(B_type(imm13, rs2, rs1, 0b110, 0b1100011))
#define BGEU(rs1, rs2, imm13) EMIT(B_type(imm13, rs2, rs1, 0b111, 0b1100011))

-// TODO: Find a better way to have conditionnal jumps? Imm is a relative jump address, so the the 2nd jump needs to be addapted
+// TODO: Find a better way to have conditionnal jumps? Imm is a relative jump address, so the the 2nd jump needs to be adapted
#define BEQ_safe(rs1, rs2, imm) \
if ((imm) > -0x1000 && (imm) < 0x1000) { \
BEQ(rs1, rs2, imm); \
@@ -605,7 +605,7 @@ f28–31 ft8–11 FP temporaries Caller
#define FSGNJS(rd, rs1, rs2) EMIT(R_type(0b0010000, rs2, rs1, 0b000, rd, 0b1010011))
// move rs1 to rd
#define FMVS(rd, rs1) FSGNJS(rd, rs1, rs1)
-// store rs1 with oposite rs2 sign bit to rd
+// store rs1 with opposite rs2 sign bit to rd
#define FSGNJNS(rd, rs1, rs2) EMIT(R_type(0b0010000, rs2, rs1, 0b001, rd, 0b1010011))
// -rs1 => rd
#define FNEGS(rd, rs1) FSGNJNS(rd, rs1, rs1)
@@ -619,7 +619,7 @@ f28–31 ft8–11 FP temporaries Caller
#define FMVWX(frd, rs1) EMIT(R_type(0b1111000, 0b00000, rs1, 0b000, frd, 0b1010011))
// Convert from signed 32bits to Single
#define FCVTSW(frd, rs1, rm) EMIT(R_type(0b1101000, 0b00000, rs1, rm, frd, 0b1010011))
-// Convert from Single to signed 32bits (trucated)
+// Convert from Single to signed 32bits (truncated)
#define FCVTWS(rd, frs1, rm) EMIT(R_type(0b1100000, 0b00000, frs1, rm, rd, 0b1010011))

#define FADDS(frd, frs1, frs2) EMIT(R_type(0b0000000, frs2, frs1, 0b000, frd, 0b1010011))
@@ -644,7 +644,7 @@ f28–31 ft8–11 FP temporaries Caller
#define FCVTLS(rd, frs1, rm) EMIT(R_type(0b1100000, 0b00010, frs1, rm, rd, 0b1010011))
// Convert from Single to unsigned 64bits
#define FCVTLUS(rd, frs1, rm) EMIT(R_type(0b1100000, 0b00011, frs1, rm, rd, 0b1010011))
-// onvert from Single to signed 32/64bits (trucated)
+// Convert from Single to signed 32/64bits (truncated)
#define FCVTSxw(rd, frs1, rm) EMIT(R_type(0b1100000, rex.w ? 0b00010 : 0b00000, frs1, rm, rd, 0b1010011))

// RV32D
@@ -664,7 +664,7 @@ f28–31 ft8–11 FP temporaries Caller
#define FSGNJD(rd, rs1, rs2) EMIT(R_type(0b0010001, rs2, rs1, 0b000, rd, 0b1010011))
// move rs1 to rd
#define FMVD(rd, rs1) FSGNJD(rd, rs1, rs1)
-// store rs1 with oposite rs2 sign bit to rd
+// store rs1 with opposite rs2 sign bit to rd
#define FSGNJND(rd, rs1, rs2) EMIT(R_type(0b0010001, rs2, rs1, 0b001, rd, 0b1010011))
// -rs1 => rd
#define FNEGD(rd, rs1) FSGNJND(rd, rs1, rs1)
@@ -939,15 +939,15 @@ f28–31 ft8–11 FP temporaries Caller


// Zbc
-// Carry-less multily (low-part)
+// Carry-less multiply (low-part)
#define CLMUL(rd, rs1, rs2) EMIT(R_type(0b0000101, rs2, rs1, 0b001, rd, 0b0110011))
// Carry-less multiply (high-part)
#define CLMULH(rd, rs1, rs2) EMIT(R_type(0b0000101, rs2, rs1, 0b011, rd, 0b0110011))
// Carry-less multiply (reversed)
#define CLMULR(rd, rs1, rs2) EMIT(R_type(0b0000101, rs2, rs1, 0b010, rd, 0b0110011))
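
For readers unfamiliar with the `R_type(...)` argument order used by these macros, the sketch below packs the six fields the way the RISC-V base ISA defines the R-type format (funct7, rs2, rs1, funct3, rd, opcode, from the top bit down). `encode_r_type` and `encode_clmul` are hypothetical stand-ins written for this note; the real `R_type` macro in `rv64_emitter.h` may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical stand-in for R_type(): packs an R-type instruction word.
// Bit layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
static inline uint32_t encode_r_type(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                                     uint32_t funct3, uint32_t rd, uint32_t opcode)
{
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32);
    assert(funct3 < 8 && rd < 32 && opcode < 128);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
         | (funct3 << 12) | (rd << 7) | opcode;
}

// Illustrative only: same field values as the CLMUL macro above
// (funct7 = 0b0000101, funct3 = 0b001, opcode = 0b0110011).
static inline uint32_t encode_clmul(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return encode_r_type(0x05, rs2, rs1, 0x1, rd, 0x33);
}
```

As a sanity check on the layout, `encode_r_type(0, 3, 2, 0, 1, 0x33)` gives `0x003100B3`, the standard encoding of `add x1, x2, x3`.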

// Zbs
-// encoding of the "imm" on RV64 use a slight different mask, but it will work using R_type with high bit of imm ovewriting low bit op func
+// encoding of the "imm" on RV64 use a slight different mask, but it will work using R_type with high bit of imm overwriting low bit op func
// Single-bit Clear (Register)
#define BCLR(rd, rs1, rs2) EMIT(R_type(0b0100100, rs2, rs1, 0b001, rd, 0b0110011))
// Single-bit Clear (Immediate)
2 changes: 1 addition & 1 deletion src/dynarec/rv64/rv64_prolog.S
@@ -1,4 +1,4 @@
-//arm prologue for dynarec
+//riscv prologue for dynarec
//Save stuff, prepare stack and register
//called with pointer to emu as 1st parameter
//and address to jump to as 2nd parameter