[WIP] Threading phase 2 #301

Merged
merged 4 commits into from
Oct 25, 2021

lemaitre
Contributor

Continuation of #282

The goal is to implement the pthread interface (or at least something similar) for cosmopolitan.
The key observation that makes such a daunting task feasible is the following:
Very few parts of the code appear to be OS dependent, namely OS thread creation, OS thread destruction, futex, and TLS.

The complex parts are built on top of those, and on top of atomics which are OS independent.

The goal of this PR is to focus on the portable abstraction and make it work on Linux (because that's the platform I know).

Status:

  • thread creation
  • thread exit
  • detach
  • join
  • kill
  • cancel
  • mutex
  • rw_lock
  • semaphore
  • condition variable
  • TLS setup
  • TLS allocation
  • TLS for main thread
  • dynamic TLS
  • custom stack size
  • custom stack
  • thread-safe libc
  • actual pthread interface
  • anything I forgot

Owner

@jart jart left a comment

Looks good to merge. Glad to see continued progress. I'll push a fixup for you after this for other platform breakages.

@@ -45,6 +45,7 @@ cosmo: push %rbp
pop %rax
#endif
call _init
call _main_thread_init # FIXME: use .init.start macro
Owner

I will show you how this can be fixed in a follow-up change.

The simplest right thing to do here is to say:

#if SupportsLinux()
        call    _main_thread_init
#endif 

Then put:

  if (!IsLinux()) return;

At the start of that function. Since

But ideally, we wouldn't want to link in the threading initialization runtime if threading isn't used. So the idiomatic trick that Cosmopolitan uses to keep binaries minimal (i.e. you only pay for what you use) is what we call "yoinking". To do it in pure C, you could have an individual file that looks like this:

static textstartup void cthread_init() {
  /* do stuff */
}

const void *const cthread_ctor[] initarray = {
    cthread_init,
};

Then, assuming the cthread library implements one function per file (a good practice in general), any function that assumes cthread_init() was called would simply put:

STATIC_YOINK("cthread_ctor");

At the top of the file. It's a code size saving technique compared to the more conventional alternative of having every cthread API call cthread_init() at the beginning, and then putting static bool once; if (!lockcmpxchg(&once, false, true)) return; at the beginning of the init function.
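
For comparison, the conventional alternative described above can be sketched in standard C11. This is only an illustration under stated assumptions: atomic_compare_exchange_strong stands in for lockcmpxchg, and cthread_init_once, cthread_example_api, and the cthread_init_calls counter are hypothetical names that do not come from the PR.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter so the effect of initialization is observable. */
static int cthread_init_calls;

/* One-shot guard: only the first caller gets past the compare-exchange.
   atomic_compare_exchange_strong is used here in place of lockcmpxchg. */
void cthread_init_once(void) {
  static atomic_bool once; /* static storage, so it starts out false */
  bool expected = false;
  if (!atomic_compare_exchange_strong(&once, &expected, true)) return;
  ++cthread_init_calls; /* do stuff */
}

/* Hypothetical API entry point: every public function pays for the check. */
int cthread_example_api(void) {
  cthread_init_once();
  return cthread_init_calls;
}
```

Calling cthread_example_api() any number of times runs the initialization body once. Note that this simple guard does not make a second thread wait until the first one has finished initializing, and each API function carries the extra call, which is the cost the yoink approach avoids.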

The textstartup keyword is optional and basically asks the linker to relocate initialization code to the same section of the binary, so that fewer page faults occur during startup for large binaries.

Finally, there's the extreme code size saving technique of embedding code in the _init() function, which runs before all constructors. This has to be written in assembly and diverges from the System V ABI in order to make efficient use of the LODS and STOS instructions. The best example of this pattern is in libc/nexgen32e/kcpuids.S

: "rcx", "r11", "cc", "memory");
return rc;
}
return -1;
Owner

What we normally do here is return enosys(), but to be consistent with the above code you would likely want to return -ENOSYS;
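
The two conventions can be contrasted with a small sketch. It assumes the raw wrapper reports failures as negative errno values, as the surrounding code suggests. The names raw_call_sketch and to_libc_result are hypothetical, and the enosys() helper itself is not reproduced here.

```c
#include <errno.h>

/* Hypothetical raw wrapper: reports failure as a negative errno value,
   the same encoding a Linux system call returns in its result register. */
long raw_call_sketch(int supported) {
  if (supported) {
    return 0; /* a real wrapper would return the kernel's result here */
  }
  return -ENOSYS; /* rather than a bare -1, which carries no error code */
}

/* Caller-side translation to the libc convention: errno is set and -1
   is returned. */
int to_libc_result(long rc) {
  if (rc < 0) {
    errno = (int)-rc;
    return -1;
  }
  return (int)rc;
}
```

Keeping the negative-errno encoding on the unsupported path lets the caller treat it exactly like any other failed call.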

Contributor Author

You're right, sorry for the quick and dirty.

int cthread_native_sem_init(cthread_native_sem_t* sem, int count) {
static void pause(int attempt) {
if (attempt < 16) {
for (int i = 0; i < (1 << attempt); ++i) {
Owner

Here's the exponential backoff latency in nanoseconds for my CPU, assuming a 31 nanosecond pause:

0 = 31
1 = 62
2 = 124
3 = 248
4 = 496
5 = 992
6 = 1,984
7 = 3,968
8 = 7,936
9 = 15,872
10 = 31,744
11 = 63,488
12 = 126,976
13 = 253,952
14 = 507,904
15 = 1,015,808
16 = 2,031,616
17 = 4,063,232
18 = 8,126,464
19 = 16,252,928
20 = 32,505,856
21 = 65,011,712
22 = 130,023,424
23 = 260,046,848
24 = 520,093,696
25 = 1,040,187,392
26 = 2,080,374,784
27 = 4,160,749,568
28 = 8,321,499,136
29 = 16,642,998,272
30 = 33,285,996,544

After 6 you might consider switching to nanosleep.
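
A minimal sketch of that suggestion follows, assuming GCC or Clang. The names backoff_wait, backoff_estimate_ns, and cpu_relax, the sleep cap of 20, and the choice to reuse the 31 nanosecond figure for sleep durations are all assumptions made for illustration, not code from the PR.

```c
#define _POSIX_C_SOURCE 199309L /* for nanosleep under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SPIN_LIMIT 6 /* assumed cutoff, taken from the suggestion above */
#define SLEEP_CAP 20 /* assumed cap, keeps a single sleep near 32 ms */
#define PAUSE_NS 31  /* per-pause cost quoted above, CPU specific */

/* Estimated busy-wait time for one attempt: pause_ns * 2^attempt. */
uint64_t backoff_estimate_ns(int attempt, uint64_t pause_ns) {
  assert(attempt >= 0 && attempt < 32);
  return pause_ns << attempt;
}

static inline void cpu_relax(void) {
#if defined(__x86_64__) || defined(__i386__)
  __asm__ volatile("pause");
#else
  __asm__ volatile("" ::: "memory"); /* plain compiler barrier elsewhere */
#endif
}

/* Spin for short waits, then hand the CPU back to the kernel. */
void backoff_wait(int attempt) {
  assert(attempt >= 0);
  if (attempt <= SPIN_LIMIT) {
    for (int i = 0; i < (1 << attempt); ++i) cpu_relax();
    return;
  }
  if (attempt > SLEEP_CAP) attempt = SLEEP_CAP;
  uint64_t ns = backoff_estimate_ns(attempt, PAUSE_NS);
  struct timespec ts = {.tv_sec = (time_t)(ns / 1000000000u),
                        .tv_nsec = (long)(ns % 1000000000u)};
  nanosleep(&ts, NULL);
}
```

With these constants, attempts 0 through 6 spin for at most about 2 microseconds, and later attempts sleep instead of burning cycles. The actual wake-up latency of nanosleep depends on the kernel's timer resolution, so the right cutoff would need measuring.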

Contributor Author

This needs tweaking, indeed.

@@ -342,6 +342,19 @@ SECTIONS {
/*END: Read Only Data (only needed for initialization) */
/*END: Read Only Data */
} :Rom

.tdata . : {
Owner

The contents of these might need to be moved into .data and .bss so as to not break APE on non-Linux. I'll take a look into it after merging.

Contributor Author

@lemaitre lemaitre Oct 26, 2021

I thought just before .data would be fine, as it is just more data somehow. However, .tdata (and maybe .tbss) should most likely be kept in its own section because it has a special TLS flag for the ELF header.

I think it would also be nice to keep them together to enable a smarter init when cthread is disabled (see 91d7833#commitcomment-58693093).
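
As a reminder of what ends up in these sections, here is a small sketch assuming an ELF target built with GCC or Clang. The variable and function names are hypothetical.

```c
/* Initialized thread-local data: on ELF targets the toolchain places the
   initial image of this variable in .tdata. */
_Thread_local int tls_initialized = 42;

/* Zero-initialized thread-local data: only its size is recorded, in .tbss. */
_Thread_local int tls_zeroed;

/* Each thread that calls this sees its own copies of both variables. */
int tls_sum(void) {
  return tls_initialized + tls_zeroed;
}
```

Every new thread needs a fresh copy of the .tdata image plus zeroed space for .tbss, which is why keeping the two sections adjacent simplifies manual TLS setup.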

Owner

There does appear to be a TLS program header. Linux and FreeBSD support it for sure, so it could shave a few microseconds off startup time. OpenBSD probably doesn't and I would hope ignores it, but we'll have to see. I think you're doing the right thing for now, in setting it up manually.

@jart jart merged commit 45a7435 into jart:master Oct 25, 2021
jart added a commit that referenced this pull request Oct 25, 2021
Cosmopolitan Threads are currently Linux-only (with some NetBSD
and Windows support too!). This change ensures we only initialize
the high-level threading runtime when Cosmopolitan Threads are used.