This repository was archived by the owner on Jan 23, 2023. It is now read-only.

@kbaladurin
Member

@kbaladurin kbaladurin commented Feb 8, 2018

If sigsegv_handler is called on the original stack (for example, when a segmentation fault occurs in a native application thread that has no alternate signal stack), we should call common_signal_handler directly; otherwise sigsegv_handler's stack frame will be corrupted.

+--------------------+
|                    |      <- ExecuteHandlerOnOriginalStack's frame
|                    | 
+--------------------+
|                    |
+--------------------+
|    returnPoint     |      <- sigsegv_handler's frame
+--------------------+
|                    |
+--------------------+
|                    |
|   saved registers  |  
|                    |
+--------------------+      <- SIGSEGV (original stack sp)
|                    |
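
To illustrate the situation above: a handler registered with SA_ONSTACK still runs on the thread's original stack when that thread has no alternate stack installed. The sketch below is not PAL code. `HandlerRunsOnAltStack`, `OnSignal` and `ALT_STACK_SIZE` are hypothetical names, SIGUSR1 stands in for SIGSEGV, and it assumes Linux with GCC or Clang (for `__builtin_frame_address`).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)  /* arbitrary size for the demo */

/* Written by the handler; read only after raise() has returned. */
static volatile uintptr_t g_handler_frame;

static void OnSignal(int sig)
{
    (void)sig;
    /* GCC/Clang builtin: address of this handler's own frame. */
    g_handler_frame = (uintptr_t)__builtin_frame_address(0);
}

/* Returns true if the SA_ONSTACK handler ran inside the alternate stack. */
bool HandlerRunsOnAltStack(bool installAltStack)
{
    void *mem = malloc(ALT_STACK_SIZE);
    assert(mem != NULL);

    stack_t ss, oldStack;
    memset(&ss, 0, sizeof(ss));
    if (installAltStack) {
        ss.ss_sp = mem;
        ss.ss_size = ALT_STACK_SIZE;
    } else {
        ss.ss_flags = SS_DISABLE;  /* behaves like a foreign thread */
    }
    int rc = sigaltstack(&ss, &oldStack);
    assert(rc == 0);

    struct sigaction sa, oldAction;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = OnSignal;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, &oldAction);
    assert(rc == 0);
    (void)rc;

    raise(SIGUSR1);  /* the handler completes before raise() returns */

    uintptr_t base = (uintptr_t)mem;
    uintptr_t frame = g_handler_frame;
    bool onAltStack = frame >= base && frame - base <= ALT_STACK_SIZE;

    /* Restore the previous state before releasing the stack memory. */
    sigaction(SIGUSR1, &oldAction, NULL);
    sigaltstack(&oldStack, NULL);
    free(mem);
    return onAltStack;
}
```

Calling `HandlerRunsOnAltStack(true)` should report that the handler frame lies inside the allocated stack, while `HandlerRunsOnAltStack(false)` should report that it does not, which is the case where switching back to the "original" stack would overwrite the handler's own frame.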

This kind of crash can be reproduced with the following example:

using System;
using System.Threading;
using System.Runtime.InteropServices;


public class Test
{
    [DllImport("libtest.so")]
    internal static extern void SleepAndSegFault(int ms);

    static void Main()
    {
        SleepAndSegFault(200);

        while (true) {}
    }
}

libtest.c:

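#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

// Runs on a native thread that the runtime did not create,
// so the thread has no alternate signal stack.
void *TestTask(void *arg)
{
    // The delay travels by value inside the pointer, so nothing dangles
    // after SleepAndSegFault returns.
    int ms = (int)(intptr_t)arg;
    usleep(ms * 1000);          // milliseconds to microseconds
    volatile int *p = NULL;
    *p = 1;                     // deliberate null write: raises SIGSEGV
    return NULL;
}

extern void SleepAndSegFault(int ms)
{
    pthread_t TestThread;
    pthread_create(&TestThread, NULL, TestTask, (void *)(intptr_t)ms);
}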

In some cases it can lead to stack smashing errors that are described in https://github.com/dotnet/coreclr/issues/16208

@kbaladurin
Member Author

cc @Dmitri-Botcharnikov @alpencolt

@kbaladurin
Member Author

@dotnet-bot test Ubuntu x64 Checked Innerloop Build and Test

@jkotas jkotas requested a review from janvorli February 8, 2018 12:10
@kbaladurin
Member Author

@dotnet-bot test Alpine.3.6 x64 Debug Build

@janvorli
Member

@kbaladurin thank you for looking into the issue. I am sorry for a delayed response, I was on vacation last week. I would prefer making the necessary change in a simpler way. I would check for the alternate stack in the sigsegv_handler and if the thread doesn't have the alternate stack installed, I would call common_signal_handler directly like e.g. the sigtrap_handler does instead of going through the unnecessary hoops.
However, there is a problem with detecting the alternate stack presence (even in your change). The sigaltstack function is not listed as an async-signal-safe function, so it should not be called from the SIGSEGV signal handler.
I was thinking a bit about how we can detect it in an async signal safe way and so far got two ideas:

  1. Use a bool __thread variable that PAL would set to true when it installs the alternate stack for a thread. But it seems that access to a __thread variable may call malloc if the TLS model is not set to initial-exec - see https://www.bountysource.com/issues/1326546-access-to-__thread-variable-may-call-malloc. And malloc is not async-signal-safe. But maybe we can change the TLS model for this new variable - I am not very familiar with all the consequences of the various TLS models, so this would have to be investigated.
  2. Keep a data structure containing all the alternate stack ranges PAL has created and use it to check if the __builtin_frame_address(0) is in one of these. This may be tricky since access to this data structure would have to be thread safe.
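
The second idea could look roughly like the sketch below: a fixed-size table of alternate stack ranges, published with atomics so that the lookup takes no lock and allocates nothing. Every name here (`RegisterAltStack`, `UnregisterAltStack`, `IsAddressOnAltStack`, `MAX_ALT_STACKS`) is hypothetical and not part of the PAL. It assumes C11 atomics on `uintptr_t` and `size_t` are lock-free on the target, and it does not handle a slot being reused while another thread is in the middle of a lookup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ALT_STACKS 64  /* hypothetical fixed capacity: no malloc needed */

typedef struct {
    _Atomic uintptr_t base;  /* 0 means the slot is free */
    _Atomic size_t size;     /* 0 until the range is fully published */
} AltStackRange;

static AltStackRange g_alt_stacks[MAX_ALT_STACKS];

/* Normal (non-signal) context: record a newly installed alternate stack. */
bool RegisterAltStack(void *base, size_t size)
{
    assert(base != NULL && size != 0);
    for (int i = 0; i < MAX_ALT_STACKS; i++) {
        uintptr_t expected = 0;
        /* Claim a free slot; the CAS keeps concurrent registrations apart. */
        if (atomic_compare_exchange_strong(&g_alt_stacks[i].base, &expected,
                                           (uintptr_t)base)) {
            atomic_store(&g_alt_stacks[i].size, size);
            return true;
        }
    }
    return false;  /* table full */
}

/* Normal context: forget a stack before its memory is released. */
void UnregisterAltStack(void *base)
{
    for (int i = 0; i < MAX_ALT_STACKS; i++) {
        if (atomic_load(&g_alt_stacks[i].base) == (uintptr_t)base) {
            atomic_store(&g_alt_stacks[i].size, 0);
            atomic_store(&g_alt_stacks[i].base, 0);
            return;
        }
    }
}

/* Intended for signal handlers: only atomic loads, no locks, no allocation. */
bool IsAddressOnAltStack(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    for (int i = 0; i < MAX_ALT_STACKS; i++) {
        uintptr_t base = atomic_load(&g_alt_stacks[i].base);
        size_t size = atomic_load(&g_alt_stacks[i].size);
        if (base != 0 && a >= base && a - base < size) {
            return true;
        }
    }
    return false;
}
```

In a handler the check would be `IsAddressOnAltStack(__builtin_frame_address(0))`. The fixed capacity is a trade-off: it avoids allocation entirely, but registration fails once the table is full, so a real implementation would need a policy for that case.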

@kbaladurin
Member Author

@janvorli thank you for the suggestion! I missed that sigaltstack isn't async-signal-safe. We also have problems with sigaltstack and TLS in other places:
sigaltstack:

Thread 6 "corerun" hit Breakpoint 1, sigaltstack () at ../sysdeps/unix/syscall-template.S:84
84	in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  sigaltstack () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ffff256a3d5 in EnsureSignalAlternateStack () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/signal.cpp:147
#2  0x00007ffff2599066 in AllocatePalThread (ppThread=0x7ffff1ff7430) at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/init/sxs.cpp:115
#3  0x00007ffff2599204 in CreateCurrentThreadData () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/init/sxs.cpp:95
#4  0x00007ffff254de31 in CorUnix::InternalGetCurrentThread () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/include/pal/thread.hpp:727
#5  0x00007ffff256944d in CatchHardwareExceptionHolder::IsEnabled () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/seh.cpp:347
#6  0x00007ffff25693d9 in SEHProcessException (exception=0x7ffff1ff7740) at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/seh.cpp:263
#7  0x00007ffff256c5df in common_signal_handler (code=11, siginfo=0x7fff440044b0, sigcontext=0x7fff44004380, numParams=2)
    at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/signal.cpp:888
#8  0x00007ffff256befd in signal_handler_worker (code=11, siginfo=0x7fff440044b0, context=0x7fff44004380, returnPoint=0x7fff44003e80)
    at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/signal.cpp:436
#9  0x00007ffff2651fe6 in CallSignalHandlerWrapper0 () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/arch/amd64/callsignalhandlerwrapper.S:30
#10 0x00007ffff1ff9861 in TestTask (arg=0x7fffffffc7bc) at stackdump.c:23
#11 0x00007ffff79bd6ba in start_thread (arg=0x7ffff1ff8700) at pthread_create.c:333
#12 0x00007ffff6e5241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

First access to TLS variable:

(gdb) bt
#0  __GI___libc_malloc (bytes=64) at malloc.c:2902
#1  0x00007ffff7de9ff8 in allocate_and_init (map=<optimized out>) at dl-tls.c:603
#2  tls_get_addr_tail (ti=0x7ffff6cfefc8, dtv=0x6ff140, the_map=0x62c140) at dl-tls.c:791
#3  0x00007ffff5d20908 in GetThread () at /media/kbaladurin/data/dotnet/forked/coreclr/src/vm/threads.inl:34
#4  0x00007ffff611695d in IsSafeToCallExecutionManager () at /media/kbaladurin/data/dotnet/forked/coreclr/src/vm/exceptionhandling.cpp:5129
#5  0x00007ffff610a737 in IsSafeToHandleHardwareException (contextRecord=0x7fff440048d0, exceptionRecord=0x7fff44004da0) at /media/kbaladurin/data/dotnet/forked/coreclr/src/vm/exceptionhandling.cpp:5182
#6  0x00007ffff65abeea in SEHProcessException (exception=0x7ffff1ff6660) at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/seh.cpp:235
#7  0x00007ffff65af02f in common_signal_handler (code=11, siginfo=0x7ffff1ff7a30, sigcontext=0x7ffff1ff7900, numParams=2)
    at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/signal.cpp:888
#8  0x00007ffff65ae94d in signal_handler_worker (code=11, siginfo=0x7ffff1ff7a30, context=0x7ffff1ff7900, returnPoint=0x7ffff1ff6e80)
    at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/exception/signal.cpp:436
#9  0x00007ffff669c946 in CallSignalHandlerWrapper0 () at /media/kbaladurin/data/dotnet/forked/coreclr/src/pal/src/arch/amd64/callsignalhandlerwrapper.S:30
#10 0x00007ffff1ff988f in TestTask (arg=0x7fffffffc7bc) at stackdump.c:26
#11 0x00007ffff79bd6ba in start_thread (arg=0x7ffff1ff8700) at pthread_create.c:333
#12 0x00007ffff6e5241d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

So we should fix this behavior too.

It seems we can specify the TLS model for a particular variable using the tls_model attribute, for example:

__thread int g_var __attribute__ ((tls_model("initial-exec"))) = 0;

In this case the TLS block is allocated before the first access.
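
As a rough illustration of how such a variable might be used, the sketch below sets a per-thread flag in normal code and reads it from a signal handler. `t_has_alt_stack`, `OnSignal` and `RaiseAndReadFlag` are hypothetical names, and SIGUSR1 stands in for SIGSEGV. It assumes GCC or Clang on Linux; in a plain executable the access is cheap under any TLS model, so the attribute only matters when the variable lives in a shared library.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical per-thread flag. With initial-exec the variable sits at a
   fixed offset from the thread pointer, so reading it needs no lazy setup. */
static __thread volatile sig_atomic_t t_has_alt_stack
    __attribute__((tls_model("initial-exec"))) = 0;

/* What the handler observed; read only after raise() has returned. */
static volatile sig_atomic_t g_seen_flag = -1;

static void OnSignal(int sig)
{
    (void)sig;
    g_seen_flag = t_has_alt_stack;  /* plain TLS load inside the handler */
}

/* Sets the flag for the calling thread, raises a signal on that thread,
   and returns the value the handler saw. */
int RaiseAndReadFlag(int flagValue)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = OnSignal;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    t_has_alt_stack = flagValue;
    raise(SIGUSR1);  /* the handler completes before raise() returns */
    return g_seen_flag;
}
```

`RaiseAndReadFlag(1)` returns the value the handler read for the current thread, so a handler could branch on the flag without calling sigaltstack.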

@kbaladurin
Member Author

Calling SEHProcessException leads to allocation of the CPalThread object, which sets up the alternate signal stack and accesses a TLS variable. Should we call it for a thread without an alternate stack?

@janvorli
Member

@kbaladurin thank you for finding the problematic cases we already have!
The first case (with CatchHardwareExceptionHolder::IsEnabled on the call stack) could be fixed by calling GetCurrentPalThread instead of InternalGetCurrentThread. That one doesn't allocate the thread data. When it returns nullptr, we know that there are no hardware exception handlers in the native code, since such handlers can only be on a thread that we have created or joined. It seems that we would also want to convert GetCurrentPalThread from using pthread_getspecific to using a TLS variable with tls_model set to initial-exec (provided that it works for us), or convert CatchHardwareExceptionHolder to use a TLS variable for the hardware exception holder count instead of using m_hardwareExceptionHolderCount on the current thread object.

@kbaladurin
Member Author

@janvorli thank you! Seems like pthread_getspecific doesn't allocate any memory:

void *
__pthread_getspecific (pthread_key_t key)
{
  struct pthread_key_data *data;

  /* Special case access to the first 2nd-level block.  This is the
     usual case.  */
  if (__glibc_likely (key < PTHREAD_KEY_2NDLEVEL_SIZE))
    data = &THREAD_SELF->specific_1stblock[key];
  else
    {
      /* Verify the key is sane.  */
      if (key >= PTHREAD_KEYS_MAX)
	/* Not valid.  */
	return NULL;

      unsigned int idx1st = key / PTHREAD_KEY_2NDLEVEL_SIZE;
      unsigned int idx2nd = key % PTHREAD_KEY_2NDLEVEL_SIZE;

      /* If the sequence number doesn't match or the key cannot be defined
	 for this thread since the second level array is not allocated
	 return NULL, too.  */
      struct pthread_key_data *level2 = THREAD_GETMEM_NC (THREAD_SELF,
							  specific, idx1st);
      if (level2 == NULL)
	/* Not allocated, therefore no data.  */
	return NULL;

      /* There is data.  */
      data = &level2[idx2nd];
    }

  void *result = data->data;
  if (result != NULL)
    {
      uintptr_t seq = data->seq;

      if (__glibc_unlikely (seq != __pthread_keys[key].seq))
	result = data->data = NULL;
    }

  return result;
}

sysdeps/x86_64/nptl/tls.h:
# define THREAD_GETMEM_NC(descr, member, idx) \
  ({ __typeof (descr->member[0]) __value;				      \
     if (sizeof (__value) == 1)						      \
       asm volatile ("movb %%fs:%P2(%q3),%b0"				      \
		     : "=q" (__value)					      \
		     : "0" (0), "i" (offsetof (struct pthread, member[0])),   \
		       "r" (idx));					      \
     else if (sizeof (__value) == 4)					      \
       asm volatile ("movl %%fs:%P1(,%q2,4),%0"				      \
		     : "=r" (__value)					      \
		     : "i" (offsetof (struct pthread, member[0])), "r" (idx));\
     else								      \
       {								      \
	 if (sizeof (__value) != 8)					      \
	   /* There should not be any value with a size other than 1,	      \
	      4 or 8.  */						      \
	   abort ();							      \
									      \
	 asm volatile ("movq %%fs:%P1(,%q2,8),%q0"			      \
		       : "=r" (__value)					      \
		       : "i" (offsetof (struct pthread, member[0])),	      \
			 "r" (idx));					      \
       }								      \
     __value; })

sysdeps/arm/nptl/tls.h:
#define THREAD_GETMEM_NC(descr, member, idx) \
  descr->member[idx]

sysdeps/i386/nptl/tls.h:
# define THREAD_GETMEM_NC(descr, member, idx) \
  ({ __typeof (descr->member[0]) __value;				      \
     if (sizeof (__value) == 1)						      \
       asm volatile ("movb %%gs:%P2(%3),%b0"				      \
		     : "=q" (__value)					      \
		     : "0" (0), "i" (offsetof (struct pthread, member[0])),   \
		     "r" (idx));					      \
     else if (sizeof (__value) == 4)					      \
       asm volatile ("movl %%gs:%P1(,%2,4),%0"				      \
		     : "=r" (__value)					      \
		     : "i" (offsetof (struct pthread, member[0])),	      \
		       "r" (idx));					      \
     else								      \
       {								      \
	 if (sizeof (__value) != 8)					      \
	   /* There should not be any value with a size other than 1,	      \
	      4 or 8.  */						      \
	   abort ();							      \
									      \
	 asm volatile  ("movl %%gs:%P1(,%2,8),%%eax\n\t"		      \
			"movl %%gs:4+%P1(,%2,8),%%edx"			      \
			: "=&A" (__value)				      \
			: "i" (offsetof (struct pthread, member[0])),	      \
			  "r" (idx));					      \
       }								      \
     __value; })

So it's enough to fix CatchHardwareExceptionHolder::IsEnabled and GetThread (change the TLS model for the gCurrentThreadInfo variable).

@janvorli
Member

@kbaladurin the thing that makes me nervous about using pthread_getspecific is that it is not guaranteed to be async signal safe. So even though it is that way today, it doesn't have to be the same way tomorrow. Although it seems that it was actually fixed to be async signal safe in glibc three years ago, so it sounds unlikely that it would be changed back to non-safe in the future.
I have also looked at MUSL and FreeBSD and it also seems to be async signal safe there.
That means we are safe using the pthread_getspecific today.
But making all the places use __thread would also be nice for consistency, since the PAL is the only place where we use pthread_set/getspecific. So I'd like to make that change, but not as part of your change.

@alpencolt

cc @mlabiuk

@kbaladurin kbaladurin changed the title from "ExecuteHandlerOnOriginalStack: handle case when it is called on original stack" to "sigsegv_handler: handle case when it is called on original stack" Feb 13, 2018
@kbaladurin
Member Author

@janvorli thank you! It's better to replace pthread_set/getspecific with __thread in this case. I could make these changes in a separate PR.


// We should use the initial-exec TLS model for this variable because access to it must be async-signal-safe, as we
// want to use it in common_signal_handler
static __thread bool g_has_alternate_stack __attribute__ ((tls_model("initial-exec"))) = false;
Member

I was thinking about this a bit and I've figured we actually don't need to add a new variable. We can use the result of GetCurrentPalThread(): if it returns null, the current thread is foreign and so it doesn't have our alternate stack. If it returns non-null, it has the alternate stack.

Member Author

Thank you! I've removed unnecessary variable.

contextInitialization = false;
ExecuteHandlerOnOriginalStack(code, siginfo, context, &returnPoint);
_ASSERTE(FALSE); // The ExecuteHandlerOnOriginalStack should never return
if (GetCurrentPalThread())
Member

One last nit - the returnPoint structure is quite large due to the CONTEXT it contains. That unnecessarily adds to the stack usage of foreign threads, which might cause a stack overflow on distros with a very small default stack, e.g. Alpine, when the SIGSEGV happens at a point where the foreign thread has only a small amount of stack space left. So it would be nice to make the check before the volatile bool contextInitialization = true and allocate the SignalHandlerWorkerReturnPoint using alloca only for the case when we need it. Btw, the address returned by alloca would need to be manually aligned to alignof(SignalHandlerWorkerReturnPoint), so the allocation would need to be sizeof(SignalHandlerWorkerReturnPoint) + alignof(SignalHandlerWorkerReturnPoint) - 1 bytes.
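
A minimal sketch of that allocation pattern, assuming Linux and a stand-in type: `ReturnPoint`, `AlignUp` and `HandleFault` are hypothetical and only mirror the shape of the suggestion, not the actual PAL code. The large, over-aligned structure is carved out of the stack only on the path that needs it.

```c
#include <alloca.h>
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SignalHandlerWorkerReturnPoint:
   large and with a stricter alignment than alloca promises. */
typedef struct {
    alignas(64) unsigned char context[2048];
    int returnFromHandler;
} ReturnPoint;

/* Round p up to the next multiple of alignment (a power of two). */
static void *AlignUp(void *p, size_t alignment)
{
    uintptr_t v = (uintptr_t)p;
    return (void *)((v + alignment - 1) & ~(uintptr_t)(alignment - 1));
}

/* Returns 0 on the "foreign thread" path, which never allocates the
   large structure, and 1 on the path that uses it. */
int HandleFault(int hasAltStack)
{
    if (!hasAltStack) {
        return 0;  /* nothing large is placed on this thread's stack */
    }

    /* Over-allocate by alignof - 1 bytes so the aligned pointer still has
       sizeof(ReturnPoint) bytes behind it. alloca memory belongs to this
       function's frame, so it must not be returned to the caller. */
    size_t rawSize = sizeof(ReturnPoint) + alignof(ReturnPoint) - 1;
    unsigned char *raw = alloca(rawSize);
    ReturnPoint *rp = AlignUp(raw, alignof(ReturnPoint));

    assert((uintptr_t)rp % alignof(ReturnPoint) == 0);
    assert((unsigned char *)rp + sizeof(*rp) <= raw + rawSize);

    memset(rp, 0, sizeof(*rp));
    rp->returnFromHandler = 1;
    return rp->returnFromHandler;
}
```

The two asserts check the properties the comment relies on: the adjusted pointer is correctly aligned, and the structure still fits inside the over-sized allocation.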

Member Author

Done. Thanks for the suggestion!

Konstantin Baladurin added 3 commits February 14, 2018 17:21

InternalGetCurrentThread in IsEnabled method.

InternalGetCurrentThread tries to create a PAL thread if one doesn't
exist for the current thread. That is unnecessary because in this case
there are no hardware exception handlers for such a thread.

Also, CatchHardwareExceptionHolder::IsEnabled is called from signal
handlers, and non-async-signal-safe functions are called during PAL
thread creation.

We should use the initial-exec TLS model to avoid memory allocations
during the first access to this variable, because it may occur in
signal handlers.

If sigsegv_handler is called on the original stack (for example, if a segmentation
fault occurs in a native application thread that has no alternate signal stack),
we should call common_signal_handler directly; otherwise sigsegv_handler's
stack frame will be corrupted.

Member

@janvorli janvorli left a comment

LGTM, thank you!
