Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queue & Deque segfault upon GC #114

Closed
orls opened this issue May 10, 2018 · 8 comments
Closed

Queue & Deque segfault upon GC #114

orls opened this issue May 10, 2018 · 8 comments
Assignees
Labels

Comments

@orls
Copy link

orls commented May 10, 2018

Queue & Deque can segfault when GC is triggered:

<?php

$q = new \Ds\Queue();

for ($i = 0; $i < 100 ; $i++) {
    $q->push(123);
    gc_collect_cycles();
}

Observed on:

  • php 7.2.5 & ds 1.2.5, Mac & linux (via docker for mac);
  • php 7.1.14 & ds 1.2.3, Mac

Not observed on php 7.1.4 & ds 1.2.3 (using this docker image mentioned in another issue).

I can avoid this locally by altering php_ds_queue_get_gc as follows (and similarlyphp_ds_dueue_get_gc):

     ds_queue_t *queue = Z_DS_QUEUE_P(obj);

     *gc_data  = queue->deque->buffer;
-    *gc_count = (int) queue->deque->capacity;
+    *gc_count = (int) queue->deque->size;

     return NULL;
 }

...but I'm not confident that this is a proper fix, since the bug seems to indicate that the spare capacity in the buffer is mis-initialized. Perhaps something about the lower-level mem mgmt used by deque reallocation has changed in PHP?

@orls
Copy link
Author

orls commented May 10, 2018

Backtrace:

(lldb) target create "php"
Current executable set to 'php' (x86_64).
(lldb) settings set -- target.run-args  "queue-segfault.php"
(lldb) run
Process 28473 launched: '/usr/local/bin/php' (x86_64)
Process 28473 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x000000010090dde3 php`gc_mark_grey(ref=0xbfdffd17d03fb088) at zend_gc.c:510
   507 					while (zv != end) {
   508 						if (Z_REFCOUNTED_P(zv)) {
   509 							ref = Z_COUNTED_P(zv);
-> 510 							GC_REFCOUNT(ref)--;
   511 							gc_mark_grey(ref);
   512 						}
   513 						zv++;
Target 0: (php) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x000000010090dde3 php`gc_mark_grey(ref=0xbfdffd17d03fb088) at zend_gc.c:510
    frame #1: 0x000000010090d523 php`gc_mark_roots at zend_gc.c:583
    frame #2: 0x000000010090cfc5 php`zend_gc_collect_cycles at zend_gc.c:1057
    frame #3: 0x00000001008f7c21 php`zif_gc_collect_cycles(execute_data=0x0000000103c21120, return_value=0x00007fff5fbfdaa0) at zend_builtin_functions.c:356
    frame #4: 0x0000000100995782 php`ZEND_DO_ICALL_SPEC_RETVAL_UNUSED_HANDLER(execute_data=0x0000000103c21030) at zend_vm_execute.h:573
    frame #5: 0x00000001009368a4 php`execute_ex(ex=0x0000000103c21030) at zend_vm_execute.h:59723
    frame #6: 0x0000000100936aaa php`zend_execute(op_array=0x0000000103c80600, return_value=0x0000000000000000) at zend_vm_execute.h:63760
    frame #7: 0x00000001008cd922 php`zend_execute_scripts(type=8, retval=0x0000000000000000, file_count=3) at zend.c:1496
    frame #8: 0x00000001008213ed php`php_execute_script(primary_file=0x00007fff5fbfe8a8) at main.c:2590
    frame #9: 0x00000001009f399d php`do_cli(argc=2, argv=0x00007fff5fbfefd0) at php_cli.c:1011
    frame #10: 0x00000001009f292c php`main(argc=2, argv=0x00007fff5fbfefd0) at php_cli.c:1404
    frame #11: 0x00007fffab124235 libdyld.dylib`start + 1
(lldb) fr var
(zend_refcounted *) ref = 0xbfdffd17d03fb088
(HashTable *) ht = 0x0000000000000000
(Bucket *) p = 0x0000000103aa542c
(Bucket *) end = 0x00007fff5fbfd9b0
(zval *) zv = 0x0000000103c78000
(zend_object_get_gc_t) get_gc = 0x0000000103ab21c0 (ds.so`php_ds_deque_get_gc at php_deque_handlers.c:120)
(zend_object *) obj = 0x0000000103c02780
(int) n = 64
(zval *) zv = 0x0000000103c78290
(zval *) end = 0x0000000103c783f0
(zval) tmp = {
  value = {
    lval = 4357891968
    dval = 2.1530847096763009E-314
    counted = 0x0000000103c02780
    str = 0x0000000103c02780
    arr = 0x0000000103c02780
    obj = 0x0000000103c02780
    res = 0x0000000103c02780
    ref = 0x0000000103c02780
    ast = 0x0000000103c02780
    zv = 0x0000000103c02780
    ptr = 0x0000000103c02780
    ce = 0x0000000103c02780
    func = 0x0000000103c02780
    ww = (w1 = 62924672, w2 = 1)
  }
  u1 = {
    v = (type = '\b', type_flags = '\x04', const_flags = '\0', reserved = '\0')
    type_info = 1032
  }
  u2 = {
    next = 0
    cache_slot = 0
    lineno = 0
    num_args = 0
    fe_pos = 0
    fe_iter_idx = 0
    access_flags = 0
    property_guard = 0
    extra = 0
  }
}

@rtheunissen rtheunissen self-assigned this May 10, 2018
@rtheunissen
Copy link
Member

rtheunissen commented May 10, 2018

I'm not confident that this is a proper fix, since the bug seems to indicate that the spare capacity in the buffer is mis-initialized.

I think you're absolutely right here. I'm assuming that the GC is stepping across the buffer but encounters an uninitialized zval, and fails. I don't think using size is safe here because the values in the buffer aren't necessarily positioned between 0 and size-1. It would be interesting to see what happens when you push and unshift, which would put a value at 0 and another at size-1, with uninitialized values in between. Your solution worked because you only ever pushed, which would result in values at 0...size-1.

The only solution I see here is to investigate how the buffer is allocated. Might be worth taking a look at how some of the SPL structures do this.

It's also worth noting that Queue uses a Deque internally so fixing Deque's GC will fix Queue as well.

@orls
Copy link
Author

orls commented May 10, 2018

Hm – upon further investigation, I can't (yet) recreate this on the official docker php image. I'll try to dig deeper to see if it's connected to another extension.

@rtheunissen
Copy link
Member

I haven't encountered this myself, but I'll try to reproduce this locally today and see what happens. 👍

@rtheunissen
Copy link
Member

rtheunissen commented May 10, 2018

@orls keep in mind that unallocated memory is unpredictable, and this might only segfault sometimes.

@rtheunissen
Copy link
Member

rtheunissen commented May 10, 2018

I have a feeling we're not clearing out on reallocation.

Current:

/**
 * Reallocates a zval buffer to a specified length.
 */
#define REALLOC_ZVAL_BUFFER(ptr, n)          \
do {                                         \
    ptr = erealloc(ptr, (n) * sizeof(zval)); \
} while (0)

SPL uses safe_erealloc and a memset to '\0' for the extension of the buffer. @krakjoe / @nikic is this something we should definitely be doing as well?

See https://lxr.room11.org/xref/php-src%40master/ext/spl/spl_fixedarray.c#123

@orls
Copy link
Author

orls commented May 10, 2018

keep in mind that unallocated memory is unpredictable, and this might only segfault sometimes.

Indeed. Fiddling with extensions does seem to affect it, but I doubt that's related to the particular extensions involved, it'll just be happenstance of memory allocations over time.

Here's the best recreation I can get, by restricting memory & allocating some stuff first. Using latest official php docker image (7.2.5)

<?php

ini_set('memory_limit','8M');

$junk = [];
for ($i = 0; $i < 1024; $i++) {
    $junk []= random_bytes(1024);
}

$q = new \Ds\Queue();
for ($i = 0; $i < 100 ; $i++) {
    $q->push(123);
    gc_collect_cycles();
    echo $q->count() . PHP_EOL;
}

@rtheunissen
Copy link
Member

This has now been fixed and released as 1.2.6 on PECL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants