-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent logging w/cl-syslog results in memory corruption #92
Comments
Simpler reproduction that doesn't involve RPCQ or (let ((threads '()))
(unwind-protect
(loop :repeat 4 :do
(push (bt:make-thread (lambda ()
(loop :repeat 20 :do
(cl-syslog:log "foo" ':user ':info "hello"))))
threads))
(dolist (thread threads)
(bt:join-thread thread)))) |
The plot thickens! That last test does not reproduce on linux laptop (with older SBCL). My linux laptop is only dual-core (4x logical) though, so maybe that's the reason. I'll try installing a linux VM on my macbook and see if it reproduces there. |
Neither of the above tests reproduce on the latest
|
Not sure I am looking in the correct place (or correct version), but Apple's syslog wrapper here appears to maintain a global asl client and doesn't appear to add any locking around it for |
Backtrace of failure on Mac OS from the bordeaux-threads-only testcase:
|
Dumping state before context-switch. Need some time to ponder the correct fix. Long story short: it looks like if you want thread-safe syslogging on macOS, you have to do the locking yourself. You thought POSIX would protect you? Muuuhahahaa. Fools! Strangely Silver lining: you can get a thread-safe (I think) Observe:
Indeed the
And in the Apple syslog sources in
In other words, calling (the mach-o-object symbol) Tying this back in to (cffi:defcfun "syslog" :void
(priority :int)
(format :string)
&rest) If I apply the following patch, then memory corruption woes disappear when diff --git a/cl-syslog.asd b/cl-syslog.asd
index 760be33..1f5ae1f 100644
--- a/cl-syslog.asd
+++ b/cl-syslog.asd
@@ -3,7 +3,7 @@
:version (:read-file-form "VERSION.txt")
:licence "MIT (See LICENSE)"
:description "Common Lisp syslog interface."
- :depends-on (#:alexandria #:cffi #:global-vars #:usocket #:split-sequence #:babel #:local-time
+ :depends-on (#:alexandria #:cffi #:global-vars #:usocket #:split-sequence #:babel #:local-time #:trivial-features
#+sbcl #:sb-posix)
:in-order-to ((asdf:test-op (asdf:test-op #:cl-syslog-tests)))
:components ((:file "package")
diff --git a/cl-syslog.lisp b/cl-syslog.lisp
index 2b33bdf..95eb197 100644
--- a/cl-syslog.lisp
+++ b/cl-syslog.lisp
@@ -31,7 +31,9 @@
(cffi:defcfun "closelog" :void)
-(cffi:defcfun "syslog" :void
+(cffi:defcfun (syslog #-os-macosx "syslog"
+ #+os-macosx "syslog$DARWIN_EXTSN")
+ :void
(priority :int)
(format :string)
&rest) That Weird fact no. 47: calling |
Did some more digging on this today. The good news, I was able to use cffi-grovel's The bad news: it appears that
For the record, here is the proof-of-concept diff --git a/cl-syslog.asd b/cl-syslog.asd
index 760be33..e61ead8 100644
--- a/cl-syslog.asd
+++ b/cl-syslog.asd
@@ -3,11 +3,15 @@
:version (:read-file-form "VERSION.txt")
:licence "MIT (See LICENSE)"
:description "Common Lisp syslog interface."
+ :defsystem-depends-on ("cffi-grovel")
:depends-on (#:alexandria #:cffi #:global-vars #:usocket #:split-sequence #:babel #:local-time
- #+sbcl #:sb-posix)
+ #:trivial-features #+sbcl #:sb-posix)
:in-order-to ((asdf:test-op (asdf:test-op #:cl-syslog-tests)))
+ :serial t
:components ((:file "package")
(:file "variable")
+ #+darwin
+ (:cffi-wrapper-file "macos-syslog-wrapper")
(:file "cl-syslog")
(:file "rfc5424")
(:file "rfc5424-reserved")
diff --git a/cl-syslog.lisp b/cl-syslog.lisp
index 2b33bdf..989a2b4 100644
--- a/cl-syslog.lisp
+++ b/cl-syslog.lisp
@@ -31,7 +31,9 @@
(cffi:defcfun "closelog" :void)
-(cffi:defcfun "syslog" :void
+(cffi:defcfun (syslog #-darwin "syslog"
+ #+darwin "syslog_cffi_darwin_wrapper")
+ :void
(priority :int)
(format :string)
&rest)
diff --git a/macos-syslog-wrapper.lisp b/macos-syslog-wrapper.lisp
new file mode 100644
index 0000000..cf21a27
--- /dev/null
+++ b/macos-syslog-wrapper.lisp
@@ -0,0 +1,35 @@
+(in-package #:syslog)
+
+(include "syslog.h")
+(include "stdarg.h")
+
+;; This flag is needed because CFFI-TOOLCHAIN winds up calling
+;; CFFI-TOOLCHAIN:SBCL-TOOLCHAIN-PARAMETERS on SBCL, which reads
+;; src/runtime/sbcl.mk and sets CFFI-TOOLCHAIN:*CC-FLAGS* to whatever
+;; sbcl.mk defines for CFLAGS. On my macOS 10.14 machine, sbcl.mk's
+;; CFLAGS includes the flag -mmacosx-version-min=10.6. According to
+;; syslog.h on my mac, __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__
+;; must be at least __MAC_10_13 in order to get the thread-safe
+;; _syslog$DARWIN_EXTSN, rather than the unsafe _syslog.
+(flag "-mmacosx-version-min=10.13")
+
+
+;; DEFWRAPPER doesn't support the use of &REST to indicate varargs
+;; like DEFCFUN, neither does it support :VOID return type, so we
+;; resort to a small C snippet to get 'er done.
+;;
+;; There is an open PR to add support for VOID return type to
+;; DEFWRAPPER: https://github.com/cffi/cffi/pull/144
+;;
+;; (defwrapper "syslog" :void
+;; (priority :int)
+;; (format :string)
+;; &rest)
+(c
+ "void syslog_cffi_darwin_wrapper(int priority, char* format, ...)"
+ "{"
+ " va_list ap;"
+ " va_start(ap, format);"
+ " syslog(priority, format, ap);"
+ " va_end(ap);"
+ "}") |
While attempting to run the pyquil test suite in parallel via the pytest-xdist plugin, I noticed occasional "Unhandled memory fault" errors like the following.
Based on the error message, this looked similar to quil-lang/qvm#110, so I tried disabling logging in quilc and, sure enough, the errors disappeared.
The work-around in qvm-app was to add a
WITH-LOCKED-LOG
macro and use it to acquire a global lock around any locking calls.In the case of RPCQ, it's not so simple since the logger instance is passed in by the caller of
RPCQ:START-SEVER
.Ideally, this would be resolved in
CL-SYSLOG
, if possible, but we might want to implement a workaround in QUILC/RPCQ in case that turns out to be impossible / impractical / slow to get merged.Here is a minimal-ish testcase that reproduces the issue.
The text was updated successfully, but these errors were encountered: