Support building with tcc compiler #63

Closed
rick-masters opened this issue Dec 28, 2023 · 11 comments · Fixed by #64
Labels
enhancement New feature or request

Comments

@rick-masters
Contributor

The live-bootstrap project must compile Fiwix with tcc because gcc is not available until much later.
Note that the tcc used to build Fiwix must be patched to handle the physical / virtual address scheme used by Fiwix.
With gcc, the address scheme is handled by a linker script but tcc does not support linker scripts.

In the forthcoming PR, documentation is provided in docs/tcc.txt which explains where to get tcc, how to patch it, and how to build Fiwix.

The following is an explanation of the various changes to support tcc.
Some of these changes are significant and so I am open to discussing better alternatives.

Makefile:

  • Created new CCEXE variable to specify the gcc or tcc compiler
  • Moved CONFFLAGS to CC so that all -D flags are in the same place
  • Created compiler-specific flags for CC, LD, and LDFLAGS
  • tcc does not use CPP because it does not use a linker script, see fiwix.ld below
  • tcc does not have a VERSION variable by default, so I created one
  • tcc does not have a separate linker so we use tcc (which requires ARCH) for LD
  • tcc does not support elf_386, nostartfiles, or nodefaultlibs
  • tcc does not support linker scripts, so specify the text address manually

drivers/block/ata.c:

  • tcc does not support 64-bit division, so replace with bit shifts
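
To illustrate the kind of substitution described here, the following is a minimal sketch, not the actual ata.c change: when the divisor is a power of two, an unsigned 64-bit division can be written as a right shift by a constant. The helper names and the 512-byte sector size are assumptions made for the example.

```c
#include <stdint.h>

/* Assumption for this sketch: 512-byte sectors, i.e. 2^9 bytes. */
#define SKETCH_SECTOR_SHIFT 9
/* 1 MiB is 2^20 bytes. */
#define SKETCH_MIB_SHIFT 20

/* Equivalent to bytes / 512, written without the '/' operator. */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
    return bytes >> SKETCH_SECTOR_SHIFT;
}

/* Equivalent to (sectors * 512) / (1024 * 1024), i.e. sectors / 2048. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
    return sectors >> (SKETCH_MIB_SHIFT - SKETCH_SECTOR_SHIFT);
}
```

This only holds for unsigned values and power-of-two divisors; the shift amounts are kept as compile-time constants on purpose.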

fiwix.ld:

  • tcc must use _start instead of start, so I changed the linker script to use _start
  • tcc does not support custom sections, or the linker script at all, therefore:
    The _kstack section is not available, so I removed the code that changes the kernel stack.
    The kernel stack is kept at 0xF000 to 0x10000 as specified originally in setup_kernel.
    I hesitated to remove all the _kstack-related code but, honestly, changing the kernel
    stack does not appear to have a clear purpose and Fiwix appears to work fine without
    doing that.

include/fiwix/asm.h:

  • tcc does not allow specifying register order so hard code the order in assembly

include/fiwix/config.h:

  • Make support for 64-bit printk types optional because tcc does not support 64-bit division

kernel/boot.S:

  • Since tcc does not support linker script custom addresses, physical addresses in the .setup section must be computed manually
  • SAVE_ALL - preserve ebx
  • tcc does not recognize pushal. Both tcc and gcc recognize pusha so that is used instead.
  • An .align 4 was removed because it didn't appear necessary and with tcc it pads code with nulls which caused problems.
  • do_switch: preserve ebx
  • tlbinfo: preserve ebx
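
As a rough illustration of what computing a physical address by hand involves, here is a sketch in C rather than in the assembly used by boot.S. It assumes the kernel's virtual addresses are the physical ones plus a fixed offset; the offset value 0xC0000000 and the helper names are assumptions for the example, not taken from this issue.

```c
#include <stdint.h>

/* Assumed virtual base of the kernel; hypothetical value for this sketch. */
#define SKETCH_KERNEL_OFFSET 0xC0000000u

/* Virtual -> physical: subtract the fixed offset. */
static uint32_t sketch_virt_to_phys(uint32_t vaddr)
{
    return vaddr - SKETCH_KERNEL_OFFSET;
}

/* Physical -> virtual: add the fixed offset back. */
static uint32_t sketch_phys_to_virt(uint32_t paddr)
{
    return paddr + SKETCH_KERNEL_OFFSET;
}
```

With a linker script the linker performs this subtraction for symbols placed in the low-address section; without one, each reference made before paging is enabled has to apply it explicitly.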

kernel/init.c:

  • INIT_TRAMPOLINE needed to be bigger because tcc produced larger code

lib/printk.c:

  • Make support for 64-bit printk types optional because tcc does not support 64-bit division

mm/memory.c:

  • memmove is normally included in the compiled code (I believe) for copying structures, but tcc excludes it with our compile options (gcc still includes it). So, I added an explicit memmove function for tcc.

@mikaku
Owner

mikaku commented Dec 29, 2023

Wow! I'm a bit scared by the amount of changes in the core. This will require a lot of testing.

Some questions:

  1. I see you removed the kernel stack lines in the linker script, but I don't see where you defined the new kernel stack location. I expected to see it in boot.S but it's not there.
  2. In asm.h, why do you need to move the arguments of USER_SYSCALL in the order %eax, %ecx, %edx and %ebx, instead of using the natural order? Is this something related to the way tcc works?
  3. In _start (in boot.S) I want to disable interrupts (cli) right from the beginning, but it seems to me that it won't happen for tcc compilations. I think that cli should be placed right above the line #ifdef __TINYC__.
  4. You say SAVE_ALL - preserve ebx but I don't see any change affecting the %ebx register in the SAVE_ALL macro.
  5. I'm surprised by the change in do_switch. If %ebx has been clobbered until now, shouldn't this have caused a more visible malfunction?
  6. Where is memmove() used? (Also, this type of function is defined in lib/strings.c.)

Changes:

  1. In all the new lines, I'd replace ((unsigned int)_end & 0xFFFFF000) with ((unsigned int)_end & PAGE_MASK), just for clarity.

@rick-masters
Contributor Author

Wow! I'm a bit scared by the amount of changes in the core. This will require a lot of testing.

Thankfully, we're in the final stretch. The only change remaining after this one is kexec for linux!

Some questions:

  1. I see you removed the kernel stack lines in the linker script, but I don't see where you defined the new kernel stack location. I expected to see it in boot.S but it's not there.

The kernel stack is left where it was set at the start of the kernel.
https://github.com/rick-masters/Fiwix/blob/40ef6e832a394a48a89d23377fc7db0889201678/kernel/boot.S#L115

I didn't see a reason to relocate the stack.

  2. In asm.h, why do you need to move the arguments of USER_SYSCALL in the order %eax, %ecx, %edx and %ebx, instead of using the natural order? Is this something related to the way tcc works?

Yes, tcc has a strange behavior of moving arguments into registers automatically, and in a strange order.

Consider the following syscall:

        USER_SYSCALL(SYS_open, "/dev/console", O_RDWR, 0);      /* stdin */

Here is how tcc compiles this:

   a:   b8 05 00 00 00          mov    $0x5,%eax
   f:   b9 02 00 00 00          mov    $0x2,%ecx
  14:   ba 00 00 00 00          mov    $0x0,%edx
  19:   bb 00 00 00 00          mov    $0x0,%ebx
  1e:   89 c0                   mov    %eax,%eax
  20:   89 c9                   mov    %ecx,%ecx
  22:   89 d2                   mov    %edx,%edx
  24:   89 db                   mov    %ebx,%ebx
  26:   cd 80                   int    $0x80

(Note that the pointer to "/dev/console" appears as a zero in the instruction mov $0x0,%ebx. The zero is replaced later, during the linking stage.)

So, %0 is eax, %1 is ecx, %2 is edx, and %3 is ebx.
To match gcc and to be explicit I then move the %0, %1, %2, %3 arguments into the appropriate named registers. However, I can see how that does not make it easier to understand. I think a comment might be more appropriate than inserting code which is essentially redundant. The comment can explain that tcc moves the arguments into registers automatically and what order it uses.

I've changed the macro to the following:

#ifdef __TINYC__
/* tcc loads "r" (register) arguments automatically into registers using this order:
 * eax, ecx, edx, ebx
 * Therefore, we rearrange the arguments so they go into the correct registers.
 */
#define USER_SYSCALL(num, arg1, arg2, arg3)    \
        __asm__ __volatile__(                   \
                "int    $0x80\n\t"              \
                : /* no output */               \
                : "r"((unsigned int)num), "r"((unsigned int)arg2), "r"((unsigned int)arg3), "r"((unsigned int)arg1)     \
        );
#else
#define USER_SYSCALL(num, arg1, arg2, arg3)     \
        __asm__ __volatile__(                   \
                "movl   %0, %%eax\n\t"          \
                "movl   %1, %%ebx\n\t"          \
                "movl   %2, %%ecx\n\t"          \
                "movl   %3, %%edx\n\t"          \
                "int    $0x80\n\t"              \
                : /* no output */               \
                : "eax"((unsigned int)num), "ebx"((unsigned int)arg1), "ecx"((unsigned int)arg2), "edx"((unsigned int)arg3)     \
        );
#endif

  3. In _start (in boot.S) I want to disable interrupts (cli) right from the beginning, but it seems to me that it won't happen for tcc compilations. I think that cli should be placed right above the line #ifdef __TINYC__.

I have made this change.

  4. You say SAVE_ALL - preserve ebx but I don't see any change affecting the %ebx register in the SAVE_ALL macro.

Sorry, this was a mistake. The change in SAVE_ALL was replacing pushal/popal with pusha/popa.

  5. I'm surprised by the change in do_switch. If %ebx has been clobbered until now, shouldn't this have caused a more visible malfunction?

I was surprised by this as well. I had to look at the assembly to understand why tcc was having a problem but gcc was not. It turns out that gcc was either not using ebx or using it in a way that avoided problems but tcc uses ebx more often.

  6. Where is memmove() used? (Also, this type of function is defined in lib/strings.c.)

tcc uses memmove to copy structures. Wherever a structure is assigned to another structure tcc inserts a call to memmove. Normally, memmove is included in the executable by linking with libtcc. However, we compile with -nostdlib -nostdinc which excludes libtcc.

I have moved memmove to lib/strings.c
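
For reference, a freestanding memmove can be quite small. The following is a sketch under the assumption that a simple byte-by-byte copy is acceptable; it is not the function added in the PR, and the name memmove_sketch is hypothetical so that it does not clash with a libc memmove.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies n bytes from src to dest, handling overlapping regions. */
void *memmove_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    if (d == s || n == 0)
        return dest;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination starts before source: copy forward. */
        while (n--)
            *d++ = *s++;
    } else {
        /* Destination starts after source: copy backward so that
           bytes are read before they are overwritten. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}
```

A structure assignment such as a = b; is the kind of statement for which, per the explanation above, tcc emits a call to memmove.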

Changes:

  1. In all the new lines, I'd replace ((unsigned int)_end & 0xFFFFF000) with ((unsigned int)_end & PAGE_MASK), just for clarity.

I have made this change.
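
The two expressions are equivalent when pages are 4 KiB. A small sketch, assuming PAGE_MASK is defined in the usual way as the complement of PAGE_SIZE - 1 (the SKETCH_ names are hypothetical stand-ins for the kernel's own definitions):

```c
#include <stdint.h>

/* Assumed 4 KiB pages; hypothetical names for this sketch. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1u))

/* Rounds an address down to the start of the page that contains it. */
static uint32_t sketch_page_align_down(uint32_t addr)
{
    return addr & SKETCH_PAGE_MASK;
}
```

Using the named mask keeps the page size in one place instead of repeating the literal 0xFFFFF000.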

@mikaku
Owner

mikaku commented Dec 29, 2023

Thankfully, we're in the final stretch. The only change remaining after this one is kexec for linux!

Nice, it has been a long road. You made a lot of changes to fit Fiwix into the live-bootstrap project. Congratulations.

The kernel stack is left where it was set at the start of the kernel.
https://github.com/rick-masters/Fiwix/blob/40ef6e832a394a48a89d23377fc7db0889201678/kernel/boot.S#L115

I didn't see a reason to relocate the stack.

Ah yes, I missed that, sorry. Now I wonder, wouldn't it be better to point the kernel stack at 0x10000-4 instead of 0x10000? I mean, just to make sure that it resides in the page between 0xF000 and 0xFFFF. What do you think?

Yes, tcc has a strange behavior of moving arguments into registers automatically, and in a strange order.
I think a comment might be more appropriate than inserting code which is essentially redundant.

Yes, the comment will clear up things. Thanks.

I was surprised by this as well. I had to look at the assembly to understand why tcc was having a problem but gcc was not. It turns out that gcc was either not using ebx or using it in a way that avoided problems but tcc uses ebx more often.

In a way, this change is good and sanitizes the code.

@rick-masters
Contributor Author

Ah yes, I missed that, sorry. Now I wonder, wouldn't it be better to point the kernel stack at 0x10000-4 instead of 0x10000? I mean, just to make sure that it resides in the page between 0xF000 and 0xFFFF. What do you think?

This shouldn't be necessary but there is no harm in doing so.

You can see from this pseudocode that the stack register is decremented before the value is stored in memory:
https://c9x.me/x86/html/file_module_x86_id_269.html

So, the first push will already go into 0xFFFC.
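
The same behavior can be modeled in a few lines of C. This is only a sketch of the decrement-then-store order for a 32-bit push, with a hypothetical push32 helper that returns the address the value would be written to:

```c
#include <stdint.h>

/* Models the address arithmetic of a 32-bit PUSH:
   the stack pointer is decremented by 4 first, and the value is
   then stored at the new stack pointer. Returns that store address. */
static uint32_t push32(uint32_t *esp)
{
    *esp -= 4;
    return *esp;
}
```

Starting from a stack pointer of 0x10000, the first call yields 0xFFFC and the second 0xFFF8, so nothing is ever written at 0x10000 itself.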

@mikaku
Owner

mikaku commented Dec 29, 2023

You can see from this pseudocode that the stack register is decremented before the value is stored in memory:

That explains why it is not necessary, indeed.

@mikaku
Owner

mikaku commented Dec 29, 2023

Thank you very much.

@mikaku mikaku added the enhancement New feature or request label Dec 30, 2023
@mikaku
Owner

mikaku commented Jan 7, 2024

I'm reviewing the code and I think that when Fiwix is built by the tcc compiler, it will show something like this:

[...]
             (built on Sat Jan  6 17:10:24 UTC 2024 with GCC tcc)
[...]

This is because the following line:

printk(" (built on %s with GCC %s)\n", UTS_VERSION, __VERSION__);

is the same for each compiler; only the constant __VERSION__ changes.

I think that a better approach would be:

  1. Remove the following line in the Makefile:

    Fiwix/Makefile

    Line 30 in 00974ef

    CC += -D__VERSION__=\"tcc\"

  2. Add an #ifdef in main.c and replace the printk line with this block:
#ifdef __TINYC__
        printk("             (built on %s with tcc)\n", UTS_VERSION);
#else
        printk("             (built on %s with GCC %s)\n", UTS_VERSION, __VERSION__);
#endif 

Thoughts?

With this change we don't need to regenerate a new 1.5.0-lb1 version, it will just appear in the next 1.5.0-lb2 version.

@mikaku mikaku reopened this Jan 7, 2024
@rick-masters
Contributor Author

Yes, your suggested change looks better to me.

@mikaku
Owner

mikaku commented Jan 8, 2024

I've checked that this change did not make any difference when compiling Fiwix using GCC. Please check that everything is also correct when using tcc.

@rick-masters
Contributor Author

It works fine with tcc.

@mikaku
Owner

mikaku commented Jan 8, 2024

Perfect!
Thank you very much.
