2.5 alpine thread stack size/level #196

Daniel-ltw · 2018-03-08T19:44:09Z

https://bugs.ruby-lang.org/issues/14387#change-70914

This is for people who might encounter this issue.

ncopa · 2018-03-09T10:02:35Z

Testcase:

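# test.rb
n = 1000
res = {}
# Each block call returns the new empty hash, so inject nests n hashes deep.
1.upto(n).to_a.inject(res) do |r, i|
  r[i] = {}
end

# Recurse through every nesting level; deep enough nesting exhausts the stack.
def f(x)
  x.each_value { |v| f(v) }
end

f(res)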

The problem is how Ruby calculates the stack size for the main thread, combined with what musl reports as available via the non-portable pthread_getattr_np() call. musl reports the memory size that is guaranteed by the kernel, while glibc reports the size you will "hopefully" get.
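
One rough way to observe the symptom from Ruby itself is to compare the soft RLIMIT_STACK value with how deep a recursion gets before SystemStackError, both on the main thread and on a freshly created thread. This is only a sketch: the helper name max_recursion_depth is made up for illustration, and the depth counts depend on frame size and Ruby version, so they are not a direct measure of bytes.

```ruby
# Hypothetical helper: count nested calls until Ruby raises SystemStackError.
def max_recursion_depth
  depth = 0
  recurse = lambda do
    depth += 1
    recurse.call
  end
  begin
    recurse.call
  rescue SystemStackError
    # Expected: this is the point where Ruby believes the stack is exhausted.
  end
  depth
end

# Soft and hard stack limits as the kernel reports them to the process.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_STACK)

main_depth = max_recursion_depth
thread_depth = Thread.new { max_recursion_depth }.value

puts "RLIMIT_STACK soft=#{soft_limit} hard=#{hard_limit}"
puts "main thread depth:  #{main_depth}"
puts "other thread depth: #{thread_depth}"
```

Running the same script in a glibc-based image and in an Alpine image should make any difference in the main-thread figure visible, if the issue described here applies.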

Daniel-ltw · 2018-03-11T19:26:10Z

@ncopa is there a config variable we could throw, during the make/compilation phase, to increase that size? I'm just wondering if this is a viable solution/workaround at this point.

ncopa · 2018-03-12T12:44:14Z

No, that is sort of the problem. The stack size is big enough (sort of); the issue is the way musl reports it. A possible workaround would be to implement Ruby's own pthread_getattr_np() and use it when the current thread (pthread_self()) is the main thread (syscall(SYS_gettid) == getpid()), falling back to libc's pthread_getattr_np() when it is not the main thread.

The code there is already messy as it is and "fixing" it or adding a workaround in ruby will make it worse 😞 What ruby tries to do here is somewhat controversial in the first place. Kernel could still deny ruby the stack memory it believes it has available.

Daniel-ltw · 2018-03-12T18:53:21Z

This is kind of frustrating, as my static code analysis fails on the stack size and I have had to fall back to Ruby 2.4 for that.

The normal application tests do run fine with 2.5.

ncopa · 2018-03-13T17:36:29Z

Possible workaround:

diff --git a/thread_pthread.c b/thread_pthread.c
index 951885ffa0..e2d662143b 100644
--- a/thread_pthread.c
+++ b/thread_pthread.c
@@ -721,7 +721,7 @@ ruby_init_stack(volatile VALUE *addr
         native_main_thread.register_stack_start = (VALUE*)bsp;
     }
 #endif
-#if MAINSTACKADDR_AVAILABLE
+#if MAINSTACKADDR_AVAILABLE && !(defined(__linux__) && !defined(__GLIBC__))
     if (native_main_thread.stack_maxsize) return;
     {
        void* stackaddr;
@@ -1680,7 +1680,7 @@ ruby_stack_overflowed_p(const rb_thread_t *th, const void *addr)
 
 #ifdef STACKADDR_AVAILABLE
     if (get_stack(&base, &size) == 0) {
-# ifdef __APPLE__
+# if defined(__APPLE__) || (defined(__linux__) && !defined(__GLIBC__))
        if (pthread_equal(th->thread_id, native_main_thread.id)) {
            struct rlimit rlim;
            if (getrlimit(RLIMIT_STACK, &rlim) == 0 && rlim.rlim_cur > size) {

It looks like __APPLE__ systems may have a similar problem? I need to investigate why they do the check if (pthread_equal(th->thread_id, native_main_thread.id)). That's the logic we need: if the current thread is the main thread, then use getrlimit(RLIMIT_STACK).
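
The decision described here can be sketched in Ruby as a small pure function. This is not Ruby's C code: effective_stack_size is a hypothetical name, the diff above is cut off right after the rlim_cur > size comparison, and the sketch assumes the larger value is then the one used. Treating an unlimited soft limit as "keep the reported size" is also an assumption made for the example.

```ruby
# Hypothetical helper: choose which stack size to trust for a thread.
# reported_size stands in for what pthread_getattr_np() returned.
def effective_stack_size(reported_size, soft_limit, main_thread:)
  # Non-main threads keep whatever libc reported.
  return reported_size unless main_thread
  # Assumption: an unlimited soft limit gives no usable number.
  return reported_size if soft_limit == Process::RLIM_INFINITY
  # Mirrors the "rlim_cur > size" comparison: prefer the larger value.
  [reported_size, soft_limit].max
end

soft_limit, = Process.getrlimit(Process::RLIMIT_STACK)
reported = 128 * 1024 # arbitrary example value, not a measured one
puts effective_stack_size(reported, soft_limit,
                          main_thread: Thread.current == Thread.main)
```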

ncopa · 2018-03-14T15:01:40Z

The workaround will omit the reserve_stack thing on linux. A proper fix based on http://www.openwall.com/lists/musl/2013/03/31/10:

diff --git a/thread_pthread.c b/thread_pthread.c
index 951885ffa0..d9814e789e 100644
--- a/thread_pthread.c
+++ b/thread_pthread.c
@@ -530,9 +530,6 @@ hpux_attr_getstackaddr(const pthread_attr_t *attr, void **addr)
 #   define MAINSTACKADDR_AVAILABLE 0
 # endif
 #endif
-#if MAINSTACKADDR_AVAILABLE && !defined(get_main_stack)
-# define get_main_stack(addr, size) get_stack(addr, size)
-#endif
 
 #ifdef STACKADDR_AVAILABLE
 /*
@@ -614,6 +611,54 @@ get_stack(void **addr, size_t *size)
     return 0;
 #undef CHECK_ERR
 }
+
+#if defined(__linux__) && !defined(__GLIBC__) && defined(HAVE_GETRLIMIT)
+
+#ifndef PAGE_SIZE
+#include <unistd.h>
+#define PAGE_SIZE sysconf(_SC_PAGE_SIZE)
+#endif
+
+static int
+get_main_stack(void **addr, size_t *size)
+{
+    size_t start, end, limit, prevend = 0;
+    struct rlimit r;
+    FILE *f;
+    char buf[PATH_MAX+80], s[8];
+    int n;
+
+    f = fopen("/proc/self/maps", "re");
+    if (!f)
+        return -1;
+    n = 0;
+    while (fgets(buf, sizeof buf, f)) {
+        n = sscanf(buf, "%zx-%zx %*s %*s %*s %*s %7s", &start, &end, s);
+        if (n >= 2) {
+            if (n == 3 && strcmp(s, "[stack]") == 0)
+                break;
+            prevend = end;
+        }
+        n = 0;
+    }
+    fclose(f);
+    if (n == 0)
+        return -1;
+
+    limit = 100 << 20; /* 100MB stack limit */
+    if (getrlimit(RLIMIT_STACK, &r)==0 && r.rlim_cur < limit)
+        limit = r.rlim_cur & -PAGE_SIZE;
+    if (limit > end) limit = end;
+    if (prevend < end - limit) prevend = end - limit;
+    if (start > prevend) start = prevend;
+    *addr = (void *)end;
+    *size = end - start;
+    return 0;
+}
+#else
+# define get_main_stack(addr, size) get_stack(addr, size)
+#endif
+
 #endif
 
 static struct {
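
For readers who find the C hard to follow, the arithmetic in get_main_stack() can be approximated in Ruby. This is an illustrative port, not part of the patch: main_stack_extent and STACK_CAP are made-up names, the maps text and soft limit are passed in so the function stays pure, and the page size defaults to 4096 as an assumption (the patch uses sysconf(_SC_PAGE_SIZE)).

```ruby
STACK_CAP = 100 << 20 # same 100MB cap as the patch

# Illustrative port: returns { addr:, size: } or nil if no [stack] line exists.
def main_stack_extent(maps_text, soft_limit, page_size: 4096)
  start = stack_end = nil
  prev_end = 0
  found = false

  maps_text.each_line do |line|
    m = /\A(\h+)-(\h+)\s/.match(line)
    next unless m
    start = m[1].hex
    stack_end = m[2].hex
    # The sixth field is the mapping name; like %7s, compare at most 7 chars.
    if line.split[5].to_s[0, 7] == "[stack]"
      found = true
      break
    end
    prev_end = stack_end # remember where the previous mapping ended
  end
  return nil unless found

  limit = STACK_CAP
  # Round a smaller soft limit down to a page boundary.
  limit = soft_limit & -page_size if soft_limit < limit
  limit = stack_end if limit > stack_end
  # The stack may grow down to the rlimit, but not into the previous mapping.
  prev_end = stack_end - limit if prev_end < stack_end - limit
  start = prev_end if start > prev_end
  { addr: stack_end, size: stack_end - start }
end

if File.readable?("/proc/self/maps")
  soft_limit, = Process.getrlimit(Process::RLIMIT_STACK)
  p main_stack_extent(File.read("/proc/self/maps"), soft_limit)
end
```

With an 8 MiB soft limit and nothing mapped just below the stack, the computed size is the full 8 MiB; a mapping that ends closer to the stack shrinks the result to the gap that is actually free.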

jottr · 2018-04-05T10:41:42Z

Any updates on this? Will this be fixed upstream instead?
Thank you for working on this!

ncopa · 2018-04-05T11:29:39Z

I did send the patch to https://bugs.ruby-lang.org/issues/14387#change-70914 but I haven't got any response. Maybe ask for response there?

The problem is not too complicated to understand, but understanding how to properly fix it is a bit complicated.

Daniel-ltw · 2018-04-06T03:44:31Z

@ncopa Should we apply this patch at this level for now, since the issue still affects Ruby 2.5.0 and Ruby 2.5.1?

That way downstream users get the fix now, and users on other operating systems are not affected by it.

ncopa · 2018-04-10T11:53:22Z

yes, i think this patch should be applied at this level.

Daniel-ltw · 2018-04-10T18:46:34Z

As for when the Ruby team wants to apply the patch, let them ponder that; you have already released what you think is an appropriate patch.

I guess all contributors now have to be aware that this patch is needed for Ruby 2.5.

tianon · 2018-05-16T17:49:40Z

I'm still a little bit wary of applying the patch at this level, especially given that upstream doesn't seem keen on it. 😞

Daniel-ltw · 2018-05-16T20:42:38Z

@tianon I guess this issue currently only affects Alpine due to its use of musl libc.

Upstream does not seem concerned, as the majority of users are on glibc, which is not affected.

Based on @ncopa's patch, this should work nicely with musl libc, as Yui Naruse said. As for other non-glibc platforms, this might not be the right patch.

Daniel-ltw · 2018-06-07T03:15:56Z

@ncopa https://bugs.ruby-lang.org/issues/14387#note-19
New reply in regards to your patch.

The Alpine Docker image has a smaller stack size, which causes SystemStackError to be raised when the call stack is deep. This error was raised in this repository when running Capybara tests involving rack-test and SASS compilation. Switching to Debian should fix this, as it doesn't have this stack size problem. See more information on this issue here: docker-library/ruby#196

When switching to Debian, we pull in the yarn binary via its apt repository, as default Debian doesn't provide an npm with which we could run `npm install -g yarn`. An additional problem is that the default Debian install doesn't support apt repositories on HTTPS URLs, so we first have to install `apt-transport-https` before we can use the yarn repo.

stephendolan · 2018-08-31T13:55:08Z

Is it possible to apply the patch from @ncopa within a Dockerfile using the ruby:2.5.1-alpine image? The larger debian images are compounding and making quite a difference in our CI builds.

t-anjan · 2018-10-01T07:43:14Z

When will this change get pushed to Docker Hub? I see here that the images on Docker Hub are still built using this commit from BEFORE the PR merge. The last time the official images were built was 11 days ago.

yosifkit mentioned this issue Mar 8, 2018

Stack Level Too deep error when using alpine ruby image #197

Closed

Daniel-ltw mentioned this issue Mar 12, 2018

Alpine Ruby 2.5 Stack Size Issue Starefossen/docker-ruby-node#22

Open

algitbot pushed a commit to alpinelinux/aports that referenced this issue Mar 16, 2018

main/ruby: fix calculation of stack size for main thread

818d6e0

Upstream bug reports: https://bugs.ruby-lang.org/issues/14387 docker-library/ruby#196

voidpart pushed a commit to voidpart/dockerfiles that referenced this issue Mar 31, 2018

ruby: add patch for 2.5 alpine version

7da1ea1

see docker-library/ruby#196

wglambert added the Issue label Apr 25, 2018

yosifkit mentioned this issue May 23, 2018

Is there a plan to make an alpine build of mysql? docker-library/mysql#179

Closed

romatr mentioned this issue Jul 27, 2018

SystemStackError: stack level too deep rubocop/rubocop-rspec#665

Closed

adrianclay mentioned this issue Aug 8, 2018

Replace Alpine for Debian in our Dockerfile GovWifi/govwifi-admin#24

Merged

gruz0 mentioned this issue Aug 12, 2018

Enable running RuboCop in Travis CI after fixing the errors with the ruby-2.5:alpine image burn-my-fat/web#30

Closed

tianon mentioned this issue Sep 20, 2018

Apply Alpine thread stack size patch #237

Merged

tianon closed this as completed in #237 Sep 28, 2018

ledermann added a commit to ledermann/docker-rails that referenced this issue Oct 3, 2018

Upgrade Ruby to 2.5.1

539322e

docker-library/ruby#196

ledermann added a commit to ledermann/docker-rails that referenced this issue Oct 4, 2018

Upgrade Ruby to 2.5.1

6ee9595

docker-library/ruby#196

ledermann added a commit to ledermann/docker-rails that referenced this issue Oct 4, 2018

Upgrade Ruby to 2.5.1

d1a2181

docker-library/ruby#196

yosifkit mentioned this issue Nov 1, 2018

[Alpine] Segfault #245

Closed

This was referenced Dec 28, 2018

Segmentation Fault (Segfault) on the ruby gem protocolbuffers/protobuf#4460

Closed

Cross building doesn't work for alpine3.7 rake-compiler/rake-compiler-dock#20

Closed
