Broken IO in multithreaded apps #791

Open
plzombie opened this issue Jan 1, 2022 · 15 comments
Labels
bug CRTL C run-time library

Comments

@plzombie

plzombie commented Jan 1, 2022

Hello. I found that if I call I/O functions in separate threads (with more than one thread using I/O functions at the same time), the threads crash. Sometimes the program prints the error "Semaphore unlocked by wrong owner" or "Thread has no thread-specific data" before crashing.

I created a test program that starts many threads, each of which writes data to a separate file. The program was compiled with the -bm flag. In very rare cases everything works fine, but usually the threads crash and some of them print "Thread has no thread-specific data". Here is a link to the test program
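For readers without access to the linked gist, the kind of test described above can be sketched roughly as follows. This is not the original program: it assumes POSIX threads instead of the Windows `_beginthreadex` call discussed in this thread, and the names `run_writers`, `writer_thread`, `writer_job`, and `MAX_WRITERS` are hypothetical. It also records whether each thread actually started, so a creation failure is not mistaken for a run-time library problem.

```c
/* Portable sketch only: POSIX threads stand in for _beginthreadex,
   and every identifier here is hypothetical (not from the gist). */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_WRITERS 64

struct writer_job {
    int    id;     /* index used to build the file name */
    size_t bytes;  /* number of bytes this thread should write */
    int    ok;     /* set to 1 by the thread when the file is complete */
};

/* Thread body: fill testdata<id>.bin with 'bytes' bytes. */
static void *writer_thread(void *arg)
{
    struct writer_job *job = arg;
    char name[64];
    char chunk[256];
    size_t left = job->bytes;
    FILE *f;

    snprintf(name, sizeof name, "testdata%d.bin", job->id);
    f = fopen(name, "wb");
    if (f == NULL)
        return NULL;
    memset(chunk, job->id & 0xFF, sizeof chunk);
    while (left > 0) {
        size_t n = left < sizeof chunk ? left : sizeof chunk;
        if (fwrite(chunk, 1, n, f) != n)
            break;              /* short write: leave ok at 0 */
        left -= n;
    }
    if (fclose(f) == 0 && left == 0)
        job->ok = 1;
    return NULL;
}

/* Starts 'count' writer threads and waits for all of them.
   Returns the number of threads that failed to start or to fill
   their file, or -1 if 'count' is out of range. */
int run_writers(int count, size_t bytes_per_file)
{
    pthread_t tids[MAX_WRITERS];
    struct writer_job jobs[MAX_WRITERS];
    int started[MAX_WRITERS];
    int i, failures = 0;

    if (count < 0 || count > MAX_WRITERS)
        return -1;
    for (i = 0; i < count; i++) {
        jobs[i].id = i;
        jobs[i].bytes = bytes_per_file;
        jobs[i].ok = 0;
        /* pthread_create returns 0 on success */
        started[i] = (pthread_create(&tids[i], NULL, writer_thread, &jobs[i]) == 0);
    }
    for (i = 0; i < count; i++) {
        if (started[i])
            pthread_join(tids[i], NULL);
        if (!jobs[i].ok)
            failures++;
    }
    return failures;
}
```

Calling `run_writers(20, 262144)` would mirror the 20-thread, 262,144-byte-per-file run shown later in this thread; a non-zero result means at least one thread did not start or did not finish its file.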

@jmalak
Member

jmalak commented Jan 2, 2022

The Visual Studio C compiler and the Watcom/Open Watcom compilers report a problem with memory.
Do you have enough memory for the executable to run with that number of threads and that stack size?
Anyway, the return value from _beginthreadex is -1L if it fails, so you don't detect the failure properly.

@plzombie
Author

plzombie commented Jan 2, 2022

I checked my example again, and it looks like the maximum number of descriptors in use concurrently is 64. I updated my gist; now it should work in VC (it even works in MinGW with the VC6 runtime). And it works under OW... sometimes. This means OW has the same restriction of 64 simultaneously open files somewhere.

I have a 16-thread CPU. If I set the number of threads to 20, then sometimes (about 50/50) I get this error:

Thread 0 started
Thread 2 started
Thread 1 started
Thread 7 started
Thread 2 failed to fill file testdata2.bin
Semaphore unlocked by wrong owner
Thread 9 started
Thread has no thread-specific data
Thread has no thread-specific data
Thread has no thread-specific data
Thread has no thread-specific data
Thread has no thread-specific data
Thread has no thread-specific data
Thread has no thread-specific data

Semaphore unlocked by wrong owner
I get the same error in real multithreaded apps.
In rare cases it doesn't even print an error and just hangs.

@plzombie
Author

plzombie commented Jan 2, 2022

image
Works fine, prints an error, and hangs... three runs in a row.

When it hangs, there are usually corrupted files, e.g. data that should have been written to descriptor A is written to descriptor B.

@jmalak
Member

jmalak commented Jan 2, 2022

Your new sample code doesn't exhibit any problem.
Below is the directory listing:

02.01.2022  13:38           262 144 testdata0.bin
02.01.2022  13:38           262 144 testdata1.bin
02.01.2022  13:38           262 144 testdata10.bin
02.01.2022  13:38           262 144 testdata11.bin
02.01.2022  13:38           262 144 testdata12.bin
02.01.2022  13:38           262 144 testdata13.bin
02.01.2022  13:38           262 144 testdata14.bin
02.01.2022  13:38           262 144 testdata15.bin
02.01.2022  13:38           262 144 testdata16.bin
02.01.2022  13:38           262 144 testdata17.bin
02.01.2022  13:38           262 144 testdata18.bin
02.01.2022  13:38           262 144 testdata19.bin
02.01.2022  13:38           262 144 testdata2.bin
02.01.2022  13:38           262 144 testdata3.bin
02.01.2022  13:38           262 144 testdata4.bin
02.01.2022  13:38           262 144 testdata5.bin
02.01.2022  13:38           262 144 testdata6.bin
02.01.2022  13:38           262 144 testdata7.bin
02.01.2022  13:38           262 144 testdata8.bin
02.01.2022  13:38           262 144 testdata9.bin

No problem with any thread.
I attached the output of 100 sequential test runs; no memory problems.
lst.txt

I don't know how you compile it. I am using the following command for Windows:
wcl386 -bm -bt=nt -fm t1.c

@jmalak
Member

jmalak commented Jan 2, 2022

I think the source of the problem is incorrect use of WaitForMultipleObjects(THREADS_NUM, threads, TRUE, INFINITE); it presupposes some synchronization setup for the threads.
I added Sleep(1000) after it to give all threads time to finish, and you can see that all threads are finished before the program ends. With your original code, it looks like not all threads are started or finished before the program ends.
lst.txt

@plzombie
Author

plzombie commented Jan 2, 2022

No, there is nothing wrong with WaitForMultipleObjects. I created the simplest possible example. It just reopens files in an infinite loop. https://gist.github.com/plzombie/9af6f2aaa8ec7a8b12dbc86e8296fd33
And for me the program crashes. I use the latest OW build from GitHub.
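The gist itself is not reproduced here. As a rough illustration of the pattern it describes (several threads repeatedly opening and closing files), here is a minimal sketch under stated assumptions: it uses POSIX threads rather than the Windows API, it bounds the loop instead of running forever, and the names `run_reopen_stress`, `reopen_thread`, `reopen_job`, and `REOPEN_THREADS` are hypothetical.

```c
/* Portable sketch only: a bounded variant of "reopen files in a loop".
   All identifiers are hypothetical and not taken from the gist. */
#include <pthread.h>
#include <stdio.h>

#define REOPEN_THREADS 8

struct reopen_job {
    int id;          /* index used to build the file name */
    int iterations;  /* how many open/close cycles to run */
    int failed;      /* number of fopen/fclose calls that failed */
};

/* Thread body: open and close the same file 'iterations' times. */
static void *reopen_thread(void *arg)
{
    struct reopen_job *job = arg;
    char name[64];
    int i;

    snprintf(name, sizeof name, "reopen%d.tmp", job->id);
    for (i = 0; i < job->iterations; i++) {
        FILE *f = fopen(name, "wb");
        if (f == NULL) {
            job->failed++;
            continue;
        }
        if (fclose(f) != 0)
            job->failed++;
    }
    remove(name);  /* clean up the scratch file */
    return NULL;
}

/* Runs REOPEN_THREADS threads concurrently and returns the total
   number of failures (thread creation, fopen, or fclose). */
int run_reopen_stress(int iterations)
{
    pthread_t tids[REOPEN_THREADS];
    struct reopen_job jobs[REOPEN_THREADS];
    int started[REOPEN_THREADS];
    int i, failures = 0;

    for (i = 0; i < REOPEN_THREADS; i++) {
        jobs[i].id = i;
        jobs[i].iterations = iterations;
        jobs[i].failed = 0;
        started[i] = (pthread_create(&tids[i], NULL, reopen_thread, &jobs[i]) == 0);
        if (!started[i])
            failures++;
    }
    for (i = 0; i < REOPEN_THREADS; i++) {
        if (started[i]) {
            pthread_join(tids[i], NULL);
            failures += jobs[i].failed;
        }
    }
    return failures;
}
```

On a run-time library with correct stream locking, a call such as `run_reopen_stress(1000)` should return 0. A crash, hang, or non-zero count would point at the library's file handling rather than at the wait logic in the caller.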

@jmalak
Member

jmalak commented Jan 2, 2022

I did some testing, and it looks like a mistake in the CRTL, because I get the run-time message "Semaphore unlocked by wrong owner".
I will check the locking calls in the file operations; it looks like one of them is unpaired under specific conditions.

@jmalak jmalak added the bug label Jan 2, 2022
@jmalak
Member

jmalak commented Feb 3, 2022

Could you please recheck with the latest version of OW V2?
The problem was indirectly caused by a bug in the code generator, issue #805.
The generated code for the CRTL was affected by this bug.

@plzombie
Author

plzombie commented Feb 4, 2022

Hello. "Thread has no thread-specific data" is gone (although I can reproduce it in my first example). But I still get the "Semaphore unlocked by wrong owner" error.

@jmalak
Member

jmalak commented Feb 4, 2022

Thanks for checking.
Hm, that is strange. I checked it on my system and didn't get any problem, except that a message is reported if the number of threads is too high. I will try to investigate.
Which exact host OS version are you using?
I tested it on old 32-bit Windows XP with SP3.

@plzombie
Author

plzombie commented Feb 4, 2022

Windows 10 x64 21H2, 8-core CPU. I will try it on my old system later; maybe I can reproduce it there.

@jmalak
Member

jmalak commented Feb 4, 2022

I will try to check it on Windows 10, because OW may use some old construct (from the Windows NT 3/4 era) that has a problem with Windows 10 WoW64.

@jmalak jmalak added the CRTL C run-time library label Feb 4, 2022
@plzombie
Author

plzombie commented Feb 4, 2022

I did more tests, using my first and last examples.
Win10 x64 2c - reproduced
Win10 x86 2c4t - reproduced. For the 2nd example, dumps occurred (see https://disk.yandex.ru/d/Kgi0NdSP6LzYUA; I compiled it with the 03.02.2022 build)
ReactOS 0.4.14, x86 1c - first is not reproduced, 2nd is reproduced. Also has stack dumps
I will also try to check it on a P3 with WinXP later

@plzombie
Author

plzombie commented Feb 5, 2022

Tested on WinXP SP3 (P3) - can reproduce the 2nd example

@jmalak
Member

jmalak commented Feb 5, 2022

I will do more investigation.
Anyway, OW has a limit of 64 threads per internally allocated memory block, but it looks like there is a mistake in the block handling. That is the reason for the problems when the number of threads is over 64.

2 participants