Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Disable drd, helgrind in vmmalloc_fork tests #2328

Merged
merged 1 commit into from
Nov 6, 2017

Conversation

GBuella
Copy link
Contributor

@GBuella GBuella commented Nov 3, 2017

Ref: pmem/issues#639
Ref: pmem/issues#665

Apparently valgrind just can't handle the way jemalloc
uses mutexes across forks, as demonstrated by this
dummy program:

$ cat dummy.c
pthread_mutex_t dummy = PTHREAD_MUTEX_INITIALIZER;

static void prefork(void)
{
	pthread_mutex_lock(&dummy);
}

static void postfork_child(void)
{
	pthread_mutexattr_t attr;

	if (pthread_mutexattr_init(&attr) != 0)
	        abort();
	if (pthread_mutex_init(&dummy, &attr) != 0) {
	        pthread_mutexattr_destroy(&attr);
	        abort();
	}
	pthread_mutexattr_destroy(&attr);
}

static void postfork_parent(void)
{
	pthread_mutex_unlock(&dummy);
}

int main()
{
	pthread_atfork(prefork,
	    postfork_parent, postfork_child);
	fork();
	pthread_mutex_lock(&dummy);
	pthread_mutex_unlock(&dummy);
}
$ cc -pthread dummy.c -o dummy
$ valgrind --tool=helgrind ./dummy 2>&1 | grep holds
==26890== Thread #1: Exiting thread still holds 1 lock
$ valgrind --tool=drd ./dummy 2>&1 | grep -i mutex
==26898== Mutex reinitialization:
		mutex 0x30a040, recursion count 1, owner 1.
==26898==    at 0x4C385F0: pthread_mutex_init
==26898== mutex 0x30a040 was first observed at:
==26898==    at 0x4C390D3: pthread_mutex_lock
==26898== Recursive locking not allowed:
		mutex 0x30a040, recursion count 1, owner 1.
==26898==    at 0x4C390D3: pthread_mutex_lock
==26898== mutex 0x30a040 was first observed at:
==26898==    at 0x4C390D3: pthread_mutex_lock

This change is Reviewable

Ref: pmem/issues#639
Ref: pmem/issues#665

Apparently valgrind just can't handle the way jemalloc
uses mutexes across forks, as demonstrated by this
dummy program:

$ cat dummy.c
pthread_mutex_t dummy = PTHREAD_MUTEX_INITIALIZER;

static void prefork(void)
{
	pthread_mutex_lock(&dummy);
}

static void postfork_child(void)
{
	pthread_mutexattr_t attr;

	if (pthread_mutexattr_init(&attr) != 0)
	        abort();
	if (pthread_mutex_init(&dummy, &attr) != 0) {
	        pthread_mutexattr_destroy(&attr);
	        abort();
	}
	pthread_mutexattr_destroy(&attr);
}

static void postfork_parent(void)
{
	pthread_mutex_unlock(&dummy);
}

int main()
{
	pthread_atfork(prefork,
	    postfork_parent, postfork_child);
	fork();
	pthread_mutex_lock(&dummy);
	pthread_mutex_unlock(&dummy);
}
$ cc -pthread dummy.c -o dummy
$ valgrind --tool=helgrind ./dummy 2>&1 | grep holds
==26890== Thread pmem#1: Exiting thread still holds 1 lock
$ valgrind --tool=drd ./dummy 2>&1 | grep -i mutex
==26898== Mutex reinitialization:
		mutex 0x30a040, recursion count 1, owner 1.
==26898==    at 0x4C385F0: pthread_mutex_init
==26898== mutex 0x30a040 was first observed at:
==26898==    at 0x4C390D3: pthread_mutex_lock
==26898== Recursive locking not allowed:
		mutex 0x30a040, recursion count 1, owner 1.
==26898==    at 0x4C390D3: pthread_mutex_lock
==26898== mutex 0x30a040 was first observed at:
==26898==    at 0x4C390D3: pthread_mutex_lock
@pbalcer
Copy link
Member

pbalcer commented Nov 3, 2017

should we report hg issue?

@GBuella
Copy link
Contributor Author

GBuella commented Nov 3, 2017

should we report hg issue?

I don't know.
If we find whether POSIX or Linux man pages say using mutexes across fork this way is supported or not, we can decide.

@krzycz
Copy link
Contributor

krzycz commented Nov 3, 2017

pthread_mutex_init in postfork_child is really weird. I have no idea why the mutex needs to be re-initialized.


Reviewed 2 of 2 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.


Comments from Reviewable

@pbalcer
Copy link
Member

pbalcer commented Nov 3, 2017

Because it was locked by the parent, and you can't just call pthread_mutex_unlock (it's a different thread technically speaking).

@krzycz
Copy link
Contributor

krzycz commented Nov 3, 2017

So what? The entire process state/memory is cloned, including the mutex, so you only need to unlock it.

Do you mean the mutex internally holds its current owner ID and you cannot unlock mutex from the thread that is not an owner? In general you may be right, but in practice owner is used mainly to handle recursive mutexes and ROBUST/ERRORCHECK types, but for normal mutexes (like those used in jemalloc) it probably doesn't matter.
I did some tests and unlocking mutex in child process works fine.


Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.


Comments from Reviewable

@GBuella
Copy link
Contributor Author

GBuella commented Nov 3, 2017

For example, thread id is going to be different in the new process, so if the implementation relies on that, unlock would would be troublesome.

@GBuella
Copy link
Contributor Author

GBuella commented Nov 3, 2017

And the same thing goes on with rwlocks BTW

@pbalcer
Copy link
Member

pbalcer commented Nov 3, 2017

yep, @krzycz what you are suggesting is relying on an implementation detail. The value of that mutex, as far as the new process is concerned, is undefined.
And I'm pretty sure valgrind will complain either way.

@krzycz
Copy link
Contributor

krzycz commented Nov 3, 2017

I don't think its value is undefined. For me, it's exactly the same as in the parent process. The behavior however is in fact undefined - it works for normal pthread mutexes, but it might not for other mutex types/implementations.
Anyway, if we know for sure there is nothing wrong here, I think we can safely suppress drd/helgrind warnings and/or disable those tools in these particular test cases.


Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.


Comments from Reviewable

@krzycz
Copy link
Contributor

krzycz commented Nov 3, 2017

:lgtm:


Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.


Comments from Reviewable

@pbalcer
Copy link
Member

pbalcer commented Nov 6, 2017

What I meant is that, yes, you have your bytes, but you are missing the context in which those values have any meaning. Thus, the state of the mutex is undefined.

@krzycz krzycz merged commit 7302b15 into pmem:master Nov 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants