detecting problems with zero-length vectors? #4

Open
tdhock opened this issue Jun 9, 2020 · 10 comments

tdhock commented Jun 9, 2020

In binsegRcpp there is an argument IntegerVector max_segments which could have zero length and cause problems. How can we detect that using RcppDeepState and tell the user to fix it?
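
One way to surface the problem early is to reject a zero-length argument before indexing it. The following is a minimal sketch, not binsegRcpp code: std::vector<int> stands in for IntegerVector, and the helper name first_max_segments and the error text are hypothetical. Inside an Rcpp function the error would presumably be raised with Rcpp::stop instead of the exception type chosen here.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper: std::vector<int> stands in for Rcpp::IntegerVector.
// Returns the first element, but refuses zero-length input instead of
// reading past the end of the data.
int first_max_segments(const std::vector<int>& max_segments) {
  if (max_segments.empty()) {
    throw std::invalid_argument("max_segments must have length >= 1");
  }
  return max_segments[0];
}
```

A length check like this only protects the one access it guards; it does not help a tool detect unchecked accesses elsewhere.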

Here is the related rcpp-devel thread: http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2020-June/010465.html

from R internals, 1.13.1 Internals of R_alloc:

The memory used by R_alloc is allocated as R vectors, of type RAWSXP. Thus the allocation is in units of 8 bytes, and is rounded up. A request for zero bytes currently returns NULL (but this should not be relied on). For historical reasons, in all other cases 1 byte is added before rounding up so the allocation is always 1–8 bytes more than was asked for: again this should not be relied on.

The vectors allocated are protected via the setting of R_VStack, as the garbage collector marks everything that can be reached from that location. When a vector is R_allocated, its ATTRIB pointer is set to the current R_VStack, and R_VStack is set to the latest allocation. Thus R_VStack is a single-linked chain of the vectors currently allocated via R_alloc. Function vmaxset resets the location R_VStack, and should be to a value that has previously be obtained via vmaxget: allocations after the value was obtained will no longer be protected and hence available for garbage collection.
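
The rounding rule in that passage can be worked through with a little arithmetic. This is a sketch of the rule as quoted, not R's actual source; the function name r_alloc_reserved_bytes is made up for illustration.

```cpp
#include <cstddef>

// Sketch of the rounding rule quoted above, not R's actual source code.
// Returns the number of bytes reserved for a request of n bytes.
std::size_t r_alloc_reserved_bytes(std::size_t n) {
  if (n == 0) return 0;          // zero-byte request: currently NULL, nothing reserved
  const std::size_t unit = 8;    // allocation unit in bytes
  std::size_t padded = n + 1;    // 1 byte added for historical reasons
  return (padded + unit - 1) / unit * unit;  // round up to a multiple of 8
}
```

Under this rule a request for 7 bytes reserves 8 and a request for 8 bytes reserves 16, so the surplus is between 1 and 8 bytes, matching the manual's statement.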

tdhock commented Jun 10, 2020

Writing R Extensions (https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-valgrind) says:
Since this memory all belongs to R, valgrind would not (and did not) detect the problem in an uninstrumented build of R.
So we should maybe try building R with --with-valgrind-instrumentation=2.

tdhock commented Jun 10, 2020

this program which uses new/delete

#include <stdlib.h>
#include <stdio.h>

int read_new(int i){
  int* ptr = new int[0];  // zero-length allocation
  int x = ptr[i];         // out-of-bounds read, even for i == 0
  delete[] ptr;
  return x;
}

int main(){
  read_new(0);
}

saved in one_past_end.c, compiled via

g++ -g -o one_past_end one_past_end.c

causes an invalid read when run through valgrind

(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ valgrind ./one_past_end
==8854== Memcheck, a memory error detector
==8854== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8854== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8854== Command: ./one_past_end
==8854== 
==8854== Invalid read of size 4
==8854==    at 0x1086F7: read_new(int) (one_past_end.c:7)
==8854==    by 0x108721: main (one_past_end.c:13)
==8854==  Address 0x5b7dc80 is 0 bytes after a block of size 0 alloc'd
==8854==    at 0x4C3089F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8854==    by 0x1086DE: read_new(int) (one_past_end.c:6)
==8854==    by 0x108721: main (one_past_end.c:13)
==8854== 
==8854== 
==8854== HEAP SUMMARY:
==8854==     in use at exit: 0 bytes in 0 blocks
==8854==   total heap usage: 2 allocs, 2 frees, 72,704 bytes allocated
==8854== 
==8854== All heap blocks were freed -- no leaks are possible
==8854== 
==8854== For counts of detected and suppressed errors, rerun with: -v
==8854== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ 

When I use the same function via Rcpp, I also get an invalid read from valgrind. There are lots of false positive messages (from R's own code), but the important one is read_new(int) (rcpp_interface.cpp:15), reported after the read_new(1) call. It finally crashes with a segfault in read_new(10^7).

(base) tdhock@maude-MacBookPro:~/R/R-4.0.0$ R -d valgrind -e 'for(i in 10^(0:10)){print(i);binsegRcpp::read_new(i)}'
==8380== Memcheck, a memory error detector
==8380== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8380== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8380== Command: /home/tdhock/lib/R/bin/exec/R -e for(i~+~in~+~10^(0:10)){print(i);binsegRcpp::read_new(i)}
==8380== 
==8380== Conditional jump or move depends on uninitialised value(s)
==8380==    at 0x55AC9E0: __wcsnlen_sse4_1 (strlen.S:147)
==8380==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8380==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8380==    by 0x50BBD27: wcstombs (stdlib.h:154)
==8380==    by 0x50BBD27: tre_parse_bracket_items (tre-parse.c:336)
==8380==    by 0x50BBD27: tre_parse_bracket (tre-parse.c:453)
==8380==    by 0x50BBD27: tre_parse (tre-parse.c:1380)
==8380==    by 0x50B37B8: tre_compile (tre-compile.c:1920)
==8380==    by 0x50B0F00: tre_regcompb (regcomp.c:150)
==8380==    by 0x4FAA672: do_gsub (grep.c:2023)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F86032: Rf_eval (eval.c:846)
==8380== 

R version 4.0.0 (2020-04-24) -- "Arbor Day"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: grDevices
> for(i in 10^(0:10)){print(i);binsegRcpp::read_new(i)}
[1] 1
==8380== Invalid read of size 16
==8380==    at 0x55AC988: __wcsnlen_sse4_1 (strlen.S:117)
==8380==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8380==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8380==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8380==    by 0x4EFB122: do_makenames (character.c:939)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==  Address 0xac2abf0 is 8 bytes after a block of size 8 alloc'd
==8380==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8380==    by 0x4FBDC80: R_chk_calloc (memory.c:3428)
==8380==    by 0x4EFB0AB: do_makenames (character.c:932)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380== 
==8380== Invalid read of size 16
==8380==    at 0x55AC98D: __wcsnlen_sse4_1 (strlen.S:117)
==8380==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8380==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8380==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8380==    by 0x4EFB122: do_makenames (character.c:939)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==  Address 0xac2ac00 is 0 bytes after a block of size 32 in arena "client"
==8380== 
==8380== Invalid read of size 16
==8380==    at 0x55AC992: __wcsnlen_sse4_1 (strlen.S:117)
==8380==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8380==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8380==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8380==    by 0x4EFB122: do_makenames (character.c:939)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==  Address 0xac2ac10 is 16 bytes after a block of size 32 in arena "client"
==8380== 
==8380== Conditional jump or move depends on uninitialised value(s)
==8380==    at 0x55ACA61: __wcsnlen_sse4_1 (strlen.S:161)
==8380==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8380==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8380==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8380==    by 0x4EFB122: do_makenames (character.c:939)
==8380==    by 0x4F79775: bcEval (eval.c:7090)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380== 
==8380== Invalid read of size 4
==8380==    at 0x1137915B: read_new(int) (rcpp_interface.cpp:15)
==8380==    by 0x1137111C: _binsegRcpp_read_new (RcppExports.cpp:26)
==8380==    by 0x4F3B85F: R_doDotCall (dotcode.c:598)
==8380==    by 0x4F7BB2D: bcEval (eval.c:7646)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F8DB75: R_compileAndExecute (eval.c:1514)
==8380==    by 0x4F8DDDA: do_for (eval.c:2296)
==8380==    by 0x4F86251: Rf_eval (eval.c:798)
==8380==    by 0x4FB9A69: Rf_ReplIteration (main.c:264)
==8380==  Address 0xa95f4b4 is 4 bytes after a block of size 0 alloc'd
==8380==    at 0x4C3089F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8380==    by 0x1137915A: read_new(int) (rcpp_interface.cpp:14)
==8380==    by 0x1137111C: _binsegRcpp_read_new (RcppExports.cpp:26)
==8380==    by 0x4F3B85F: R_doDotCall (dotcode.c:598)
==8380==    by 0x4F7BB2D: bcEval (eval.c:7646)
==8380==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8380==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8380==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8380==    by 0x4F7D038: bcEval (eval.c:7058)
==8380==    by 0x4F8DB75: R_compileAndExecute (eval.c:1514)
==8380==    by 0x4F8DDDA: do_for (eval.c:2296)
==8380==    by 0x4F86251: Rf_eval (eval.c:798)
==8380== 
[1] 10
[1] 100
[1] 1000
[1] 10000
[1] 1e+05
[1] 1e+06
[1] 1e+07

 *** caught segfault ***
address 0xd42e2a0, cause 'invalid permissions'

Traceback:
 1: binsegRcpp::read_new(i)
An irrecoverable exception occurred. R is aborting now ...
==8380== 
==8380== Process terminating with default action of signal 11 (SIGSEGV)
==8380==    at 0x551AE75: raise (raise.c:46)
==8380==    by 0x4FB798C: sigactionSegv (main.c:638)
==8380==    by 0x551AF1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
==8380==    by 0x1137915A: read_new(int) (rcpp_interface.cpp:14)
==8380== 
==8380== HEAP SUMMARY:
==8380==     in use at exit: 54,421,656 bytes in 11,579 blocks
==8380==   total heap usage: 33,735 allocs, 22,156 frees, 97,906,197 bytes allocated
==8380== 
==8380== LEAK SUMMARY:
==8380==    definitely lost: 0 bytes in 0 blocks
==8380==    indirectly lost: 0 bytes in 0 blocks
==8380==      possibly lost: 0 bytes in 0 blocks
==8380==    still reachable: 54,421,656 bytes in 11,579 blocks
==8380==                       of which reachable via heuristic:
==8380==                         newarray           : 4,264 bytes in 1 blocks
==8380==         suppressed: 0 bytes in 0 blocks
==8380== Rerun with --leak-check=full to see details of leaked memory
==8380== 
==8380== For counts of detected and suppressed errors, rerun with: -v
==8380== Use --track-origins=yes to see where uninitialised values come from
==8380== ERROR SUMMARY: 103 errors from 6 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
(base) tdhock@maude-MacBookPro:~/R/R-4.0.0$ 

tdhock commented Jun 10, 2020

same thing for malloc.

// [[Rcpp::export]]
int read_malloc(int i){
  int *ptr = (int*)malloc(0);
  int x = ptr[i];
  free(ptr);
  return x;
}

Under valgrind, R reports an invalid read at read_malloc(int) (rcpp_interface.cpp:23) for read_malloc(1), and again ends with a segfault in read_malloc(10^7):

(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ R -d valgrind -e 'for(i in 10^(0:10)){print(i);binsegRcpp::read_malloc(i)}'
==8950== Memcheck, a memory error detector
==8950== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8950== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==8950== Command: /home/tdhock/lib/R/bin/exec/R -e for(i~+~in~+~10^(0:10)){print(i);binsegRcpp::read_malloc(i)}
==8950== 
==8950== Conditional jump or move depends on uninitialised value(s)
==8950==    at 0x55AC9E0: __wcsnlen_sse4_1 (strlen.S:147)
==8950==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8950==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8950==    by 0x50BBD27: wcstombs (stdlib.h:154)
==8950==    by 0x50BBD27: tre_parse_bracket_items (tre-parse.c:336)
==8950==    by 0x50BBD27: tre_parse_bracket (tre-parse.c:453)
==8950==    by 0x50BBD27: tre_parse (tre-parse.c:1380)
==8950==    by 0x50B37B8: tre_compile (tre-compile.c:1920)
==8950==    by 0x50B0F00: tre_regcompb (regcomp.c:150)
==8950==    by 0x4FAA672: do_gsub (grep.c:2023)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F86032: Rf_eval (eval.c:846)
==8950== 

R version 4.0.0 (2020-04-24) -- "Arbor Day"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: grDevices
> for(i in 10^(0:10)){print(i);binsegRcpp::read_malloc(i)}
[1] 1
==8950== Invalid read of size 16
==8950==    at 0x55AC988: __wcsnlen_sse4_1 (strlen.S:117)
==8950==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8950==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8950==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8950==    by 0x4EFB122: do_makenames (character.c:939)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==  Address 0xac2abf0 is 8 bytes after a block of size 8 alloc'd
==8950==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8950==    by 0x4FBDC80: R_chk_calloc (memory.c:3428)
==8950==    by 0x4EFB0AB: do_makenames (character.c:932)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950== 
==8950== Invalid read of size 16
==8950==    at 0x55AC98D: __wcsnlen_sse4_1 (strlen.S:117)
==8950==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8950==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8950==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8950==    by 0x4EFB122: do_makenames (character.c:939)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==  Address 0xac2ac00 is 0 bytes after a block of size 32 in arena "client"
==8950== 
==8950== Invalid read of size 16
==8950==    at 0x55AC992: __wcsnlen_sse4_1 (strlen.S:117)
==8950==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8950==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8950==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8950==    by 0x4EFB122: do_makenames (character.c:939)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==  Address 0xac2ac10 is 16 bytes after a block of size 32 in arena "client"
==8950== 
==8950== Conditional jump or move depends on uninitialised value(s)
==8950==    at 0x55ACA61: __wcsnlen_sse4_1 (strlen.S:161)
==8950==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==8950==    by 0x551FB20: wcstombs (wcstombs.c:34)
==8950==    by 0x4EFB122: wcstombs (stdlib.h:154)
==8950==    by 0x4EFB122: do_makenames (character.c:939)
==8950==    by 0x4F79775: bcEval (eval.c:7090)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950== 
==8950== Invalid read of size 4
==8950==    at 0x1137917B: read_malloc(int) (rcpp_interface.cpp:23)
==8950==    by 0x113715CC: _binsegRcpp_read_malloc (RcppExports.cpp:37)
==8950==    by 0x4F3B85F: R_doDotCall (dotcode.c:598)
==8950==    by 0x4F7BB2D: bcEval (eval.c:7646)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F8DB75: R_compileAndExecute (eval.c:1514)
==8950==    by 0x4F8DDDA: do_for (eval.c:2296)
==8950==    by 0x4F86251: Rf_eval (eval.c:798)
==8950==    by 0x4FB9A69: Rf_ReplIteration (main.c:264)
==8950==  Address 0xa95f4b4 is 4 bytes after a block of size 0 alloc'd
==8950==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8950==    by 0x1137917A: read_malloc(int) (rcpp_interface.cpp:22)
==8950==    by 0x113715CC: _binsegRcpp_read_malloc (RcppExports.cpp:37)
==8950==    by 0x4F3B85F: R_doDotCall (dotcode.c:598)
==8950==    by 0x4F7BB2D: bcEval (eval.c:7646)
==8950==    by 0x4F85E5F: Rf_eval (eval.c:723)
==8950==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==8950==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==8950==    by 0x4F7D038: bcEval (eval.c:7058)
==8950==    by 0x4F8DB75: R_compileAndExecute (eval.c:1514)
==8950==    by 0x4F8DDDA: do_for (eval.c:2296)
==8950==    by 0x4F86251: Rf_eval (eval.c:798)
==8950== 
[1] 10
[1] 100
[1] 1000
[1] 10000
[1] 1e+05
[1] 1e+06
[1] 1e+07

 *** caught segfault ***
address 0xd42e2a0, cause 'invalid permissions'

Traceback:
 1: binsegRcpp::read_malloc(i)
An irrecoverable exception occurred. R is aborting now ...
==8950== 
==8950== Process terminating with default action of signal 11 (SIGSEGV)
==8950==    at 0x551AE75: raise (raise.c:46)
==8950==    by 0x4FB798C: sigactionSegv (main.c:638)
==8950==    by 0x551AF1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
==8950==    by 0x1137917A: read_malloc(int) (rcpp_interface.cpp:22)
==8950== 
==8950== HEAP SUMMARY:
==8950==     in use at exit: 54,421,675 bytes in 11,579 blocks
==8950==   total heap usage: 33,735 allocs, 22,156 frees, 97,906,216 bytes allocated
==8950== 
==8950== LEAK SUMMARY:
==8950==    definitely lost: 0 bytes in 0 blocks
==8950==    indirectly lost: 0 bytes in 0 blocks
==8950==      possibly lost: 0 bytes in 0 blocks
==8950==    still reachable: 54,421,675 bytes in 11,579 blocks
==8950==                       of which reachable via heuristic:
==8950==                         newarray           : 4,264 bytes in 1 blocks
==8950==         suppressed: 0 bytes in 0 blocks
==8950== Rerun with --leak-check=full to see details of leaked memory
==8950== 
==8950== For counts of detected and suppressed errors, rerun with: -v
==8950== Use --track-origins=yes to see where uninitialised values come from
==8950== ERROR SUMMARY: 103 errors from 6 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ 
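
Both read_new and read_malloc index a zero-length allocation without knowing its length, and valgrind only reports the bad read after the fact. A caller-side alternative is to carry the element count next to the pointer and check the index before dereferencing. This is a minimal sketch; IntBuffer, make_buffer, read_checked and free_buffer are hypothetical names, not part of binsegRcpp or Rcpp.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper: keeps the element count next to the raw pointer.
struct IntBuffer {
  int* data;
  std::size_t size;  // number of ints that may be read through data
};

// Allocates zero-initialised room for n ints. For n == 0 no memory is
// requested at all, which sidesteps the implementation-defined result
// of malloc(0).
IntBuffer make_buffer(std::size_t n) {
  IntBuffer b;
  b.data = (n == 0) ? nullptr : static_cast<int*>(std::calloc(n, sizeof(int)));
  b.size = (b.data == nullptr) ? 0 : n;
  return b;
}

// Stores element i in *out and returns true, or returns false when i is
// out of range (always the case for a zero-length buffer).
bool read_checked(const IntBuffer& b, std::size_t i, int* out) {
  if (i >= b.size) return false;
  *out = b.data[i];
  return true;
}

// Releases the buffer; std::free(nullptr) is a no-op.
void free_buffer(IntBuffer& b) {
  std::free(b.data);
  b.data = nullptr;
  b.size = 0;
}
```

With this wrapper, read_checked on a buffer made with make_buffer(0) returns false for every index instead of touching memory.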

tdhock commented Jun 10, 2020

this function uses IntegerVector(0) instead of malloc(0)

// [[Rcpp::export]]
int read_memory(int i){
  Rcpp::IntegerVector x(0);
  return x[i];
}

For read_memory(1) valgrind reports no invalid read in read_memory.
An invalid read message only appears eventually, for read_memory(10^7), and read_memory(10^8) then segfaults.

(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ R -d valgrind -e 'for(i in 10^(0:10)){print(i);binsegRcpp::read_memory(i)}'
==9379== Memcheck, a memory error detector
==9379== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9379== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==9379== Command: /home/tdhock/lib/R/bin/exec/R -e for(i~+~in~+~10^(0:10)){print(i);binsegRcpp::read_memory(i)}
==9379== 
==9379== Conditional jump or move depends on uninitialised value(s)
==9379==    at 0x55AC9E0: __wcsnlen_sse4_1 (strlen.S:147)
==9379==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==9379==    by 0x551FB20: wcstombs (wcstombs.c:34)
==9379==    by 0x50BBD27: wcstombs (stdlib.h:154)
==9379==    by 0x50BBD27: tre_parse_bracket_items (tre-parse.c:336)
==9379==    by 0x50BBD27: tre_parse_bracket (tre-parse.c:453)
==9379==    by 0x50BBD27: tre_parse (tre-parse.c:1380)
==9379==    by 0x50B37B8: tre_compile (tre-compile.c:1920)
==9379==    by 0x50B0F00: tre_regcompb (regcomp.c:150)
==9379==    by 0x4FAA672: do_gsub (grep.c:2023)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F86032: Rf_eval (eval.c:846)
==9379== 

R version 4.0.0 (2020-04-24) -- "Arbor Day"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: grDevices
> for(i in 10^(0:10)){print(i);binsegRcpp::read_memory(i)}
[1] 1
==9379== Invalid read of size 16
==9379==    at 0x55AC988: __wcsnlen_sse4_1 (strlen.S:117)
==9379==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==9379==    by 0x551FB20: wcstombs (wcstombs.c:34)
==9379==    by 0x4EFB122: wcstombs (stdlib.h:154)
==9379==    by 0x4EFB122: do_makenames (character.c:939)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==  Address 0xac2abf0 is 8 bytes after a block of size 8 alloc'd
==9379==    at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9379==    by 0x4FBDC80: R_chk_calloc (memory.c:3428)
==9379==    by 0x4EFB0AB: do_makenames (character.c:932)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379== 
==9379== Invalid read of size 16
==9379==    at 0x55AC98D: __wcsnlen_sse4_1 (strlen.S:117)
==9379==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==9379==    by 0x551FB20: wcstombs (wcstombs.c:34)
==9379==    by 0x4EFB122: wcstombs (stdlib.h:154)
==9379==    by 0x4EFB122: do_makenames (character.c:939)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==  Address 0xac2ac00 is 0 bytes after a block of size 32 in arena "client"
==9379== 
==9379== Invalid read of size 16
==9379==    at 0x55AC992: __wcsnlen_sse4_1 (strlen.S:117)
==9379==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==9379==    by 0x551FB20: wcstombs (wcstombs.c:34)
==9379==    by 0x4EFB122: wcstombs (stdlib.h:154)
==9379==    by 0x4EFB122: do_makenames (character.c:939)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==  Address 0xac2ac10 is 16 bytes after a block of size 32 in arena "client"
==9379== 
==9379== Conditional jump or move depends on uninitialised value(s)
==9379==    at 0x55ACA61: __wcsnlen_sse4_1 (strlen.S:161)
==9379==    by 0x5599EC1: wcsrtombs (wcsrtombs.c:104)
==9379==    by 0x551FB20: wcstombs (wcstombs.c:34)
==9379==    by 0x4EFB122: wcstombs (stdlib.h:154)
==9379==    by 0x4EFB122: do_makenames (character.c:939)
==9379==    by 0x4F79775: bcEval (eval.c:7090)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379== 
[1] 10
[1] 100
[1] 1000
[1] 10000
[1] 1e+05
[1] 1e+06
[1] 1e+07
==9379== Invalid read of size 4
==9379==    at 0x11379023: read_memory(int) (rcpp_interface.cpp:9)
==9379==    by 0x11370C6C: _binsegRcpp_read_memory (RcppExports.cpp:15)
==9379==    by 0x4F3B85F: R_doDotCall (dotcode.c:598)
==9379==    by 0x4F7BB2D: bcEval (eval.c:7646)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379==    by 0x4F88A46: Rf_applyClosure (eval.c:1814)
==9379==    by 0x4F7D038: bcEval (eval.c:7058)
==9379==    by 0x4F8DB75: R_compileAndExecute (eval.c:1514)
==9379==    by 0x4F8DDDA: do_for (eval.c:2296)
==9379==    by 0x4F86251: Rf_eval (eval.c:798)
==9379==    by 0x4FB9A69: Rf_ReplIteration (main.c:264)
==9379==  Address 0x10a68a10 is 41,216 bytes inside a block of size 49,672 free'd
==9379==    at 0x4C30D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9379==    by 0x4FC477B: ReleaseLargeFreeVectors (memory.c:1114)
==9379==    by 0x4FC477B: RunGenCollect (memory.c:1896)
==9379==    by 0x4FC477B: R_gc_internal (memory.c:3125)
==9379==    by 0x4FC4EC9: Rf_allocSExp (memory.c:2365)
==9379==    by 0x5026C47: ReadItem (serialize.c:1853)
==9379==    by 0x5026D19: ReadItem (serialize.c:1871)
==9379==    by 0x5026D19: ReadItem (serialize.c:1871)
==9379==    by 0x5027390: ReadItem (serialize.c:1858)
==9379==    by 0x5027776: ReadItem (serialize.c:1959)
==9379==    by 0x5028A9D: R_Unserialize (serialize.c:2179)
==9379==    by 0x5029E39: R_unserialize (serialize.c:2890)
==9379==    by 0x502A259: do_lazyLoadDBfetch (serialize.c:3181)
==9379==    by 0x4F863E5: Rf_eval (eval.c:830)
==9379==  Block was alloc'd at
==9379==    at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9379==    by 0x4FC682C: Rf_allocVector3 (memory.c:2805)
==9379==    by 0x4FC7858: Rf_allocVector (Rinlinedfuns.h:593)
==9379==    by 0x4FC7858: R_alloc (memory.c:2257)
==9379==    by 0x4F21C1D: R_decompress1 (connections.c:6037)
==9379==    by 0x502A2A4: do_lazyLoadDBfetch (serialize.c:3178)
==9379==    by 0x4F863E5: Rf_eval (eval.c:830)
==9379==    by 0x4F8681B: forcePromise (eval.c:551)
==9379==    by 0x4F85F0B: Rf_eval (eval.c:759)
==9379==    by 0x4F5687F: Rf_findFun3 (envir.c:1545)
==9379==    by 0x4F78DF9: bcEval (eval.c:6885)
==9379==    by 0x4F85E5F: Rf_eval (eval.c:723)
==9379==    by 0x4F87C7E: R_execClosure (eval.c:1888)
==9379== 
[1] 1e+08

 *** caught segfault ***
address 0x246bb230, cause 'memory not mapped'

Traceback:
 1: binsegRcpp::read_memory(i)
An irrecoverable exception occurred. R is aborting now ...
==9379== 
==9379== Process terminating with default action of signal 11 (SIGSEGV)
==9379==    at 0x551AE75: raise (raise.c:46)
==9379==    by 0x4FB798C: sigactionSegv (main.c:638)
==9379==    by 0x551AF1F: ??? (in /lib/x86_64-linux-gnu/libc-2.27.so)
==9379==    by 0x11379022: Rcpp_ReleaseObject (RcppCommon.h:98)
==9379==    by 0x11379022: ~PreserveStorage (PreserveStorage.h:13)
==9379==    by 0x11379022: ~Vector (Vector.h:29)
==9379==    by 0x11379022: read_memory(int) (rcpp_interface.cpp:8)
==9379== 
==9379== HEAP SUMMARY:
==9379==     in use at exit: 54,424,227 bytes in 11,579 blocks
==9379==   total heap usage: 33,728 allocs, 22,149 frees, 97,908,768 bytes allocated
==9379== 
==9379== LEAK SUMMARY:
==9379==    definitely lost: 0 bytes in 0 blocks
==9379==    indirectly lost: 0 bytes in 0 blocks
==9379==      possibly lost: 0 bytes in 0 blocks
==9379==    still reachable: 54,424,227 bytes in 11,579 blocks
==9379==                       of which reachable via heuristic:
==9379==                         newarray           : 4,264 bytes in 1 blocks
==9379==         suppressed: 0 bytes in 0 blocks
==9379== Rerun with --leak-check=full to see details of leaked memory
==9379== 
==9379== For counts of detected and suppressed errors, rerun with: -v
==9379== Use --track-origins=yes to see where uninitialised values come from
==9379== ERROR SUMMARY: 101 errors from 6 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
(base) tdhock@maude-MacBookPro:~/R/RcppDeepState$ 

tdhock commented Jun 11, 2020

Simon Urbanek says that valgrind cannot detect these issues because of the way R treats zero-length vectors.

"Rcpp simply calls allocVector() so regular R rules apply. R's SEXP can hold vectors up to length 1 inside without additional allocations*, therefore from memory management perspective writes to the first element of a 0-length vector are not invalid. The valgrind instrumentation of R doesn't guard against that case, i.e., it doesn't mark those 8 bytes as NOACCESS, it only marks additional allocated memory accordingly (not relevant in this case).

* see also R-ints 1.1.4 for details on allocator classes"
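
Simon's point can be pictured with a toy model. The sketch below is an assumption for illustration only and is not R's real SEXP layout: ToyNode is a hypothetical header that reserves one 8-byte data slot inside the same allocation, so a read of element 0 stays inside the block even when the logical length is 0.

```cpp
#include <cstddef>

// Toy model only, NOT R's actual SEXP layout: a header with room for
// exactly one 8-byte element inside the same allocated block.
struct ToyNode {
  std::size_t length;  // logical length; may be 0
  double slot;         // inline storage for at most one element
};

// True when reading element i (8 bytes each, starting at slot) would stay
// inside the ToyNode block itself, regardless of the logical length.
bool read_stays_inside_block(std::size_t i) {
  const std::size_t begin = offsetof(ToyNode, slot) + i * sizeof(double);
  return begin + sizeof(double) <= sizeof(ToyNode);
}

// True when index i is valid for the logical length stored in the node.
bool index_is_valid(const ToyNode& node, std::size_t i) {
  return i < node.length;
}
```

In this model index 0 of a zero-length node is invalid by the logical length yet still inside the allocated block, which is consistent with a memory checker staying silent for small indices.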

Dirk says that he would accept a PR which adds range checking, so that would probably be the easiest way to solve this problem.

| Another idea to detect these errors would be to create a modified Rcpp
| package where the C++ subscript operator [] is modified to check for
| size and report an incorrect dereferencing.

There is a very cute and less-than-a-dozen-lines class Vec in Stroustrup's "A
Tour of C++" (2nd ed, 2018, section 11.2). The Vec class inherits "everything"
from std::vector but then redefines 'T& operator[](int i)' and 'const T&
operator[](int i) const' to use .at(i) and provide range checking.
The discussion there is brief, but good. Range checking introduces about 10%
overhead. Quoting: "However, experience shows that such overhead can lead
people to prefer the far more unsafe builtin arrays.". Well put. He also
notes that "some implementations" offer this via a compile-time option.
Seems like an opportunity for someone to get famous and add a #define to Rcpp
to support this too. Well written PRs, ideally supported by more than merely
superficial testing, are always welcome.
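
The Vec idea described in that reply can be sketched in a few lines. This is written from the description (inherit from std::vector, route operator[] through .at(i)) rather than copied from the book, and it is not an Rcpp patch; the helper read_is_rejected is a hypothetical name added only to show the behaviour.

```cpp
#include <stdexcept>
#include <vector>

// Range-checked vector along the lines described above: everything is
// inherited from std::vector, but operator[] goes through at().
template <typename T>
struct Vec : std::vector<T> {
  using std::vector<T>::vector;  // inherit the constructors

  T& operator[](int i) {
    return std::vector<T>::at(i);  // throws std::out_of_range on a bad index
  }
  const T& operator[](int i) const {
    return std::vector<T>::at(i);
  }
};

// Returns true when reading v[i] throws std::out_of_range.
bool read_is_rejected(const Vec<int>& v, int i) {
  try {
    (void)v[i];
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}
```

With this class, indexing an empty Vec<int> at 0 throws instead of silently reading memory, which is the behaviour a range-checking option in Rcpp would aim for.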

tdhock commented Jul 13, 2020

hi can you please keep this open until the issue is fixed? (it's not fixed yet, right?)

akhikolla (Owner) commented

The issue can be resolved only by creating a PR for range checking as Dirk suggested. @tdhock Should we keep this open till then?

akhikolla reopened this Jul 13, 2020

tdhock commented Jul 13, 2020

yes please keep it open as a reminder that it has not yet been fixed.
