-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add (de)compression tracing functionality #2482
Conversation
c87598a
to
ab94d0f
Compare
/* We depend on the trace header to avoid duplicating the ZSTD_trace struct. | ||
* But, we check the version so it is compatible with dynamic linking. | ||
*/ | ||
#include "../lib/common/zstd_trace.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, is zstd_trace.h
a new "public" API ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this API would eventually become public. I was planning on leaving it out of the default installation for now, but publishing it internally.
Agreed, the CLI is a great place to use, show and experiment with this new |
|
||
#if ZSTD_TRACE && ZSTD_HAVE_WEAK_SYMBOLS | ||
|
||
ZSTD_WEAK_ATTR int ZSTD_trace_compress_begin(ZSTD_CCtx const* cctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, by default, these functions exist but don't do anything,
they need to be replaced in target binary by real functions that are actually doing something
which is possible thanks to weak linking.
I wonder if there is a (future) opportunity here to have some "good enough" default functions, instead of empty ones,
although to be fair I don't know the topic nearly enough to tell if it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We (zstd) don't really know what to do with the data that is provided. So by default we just disable the tracing, so the cost is only one function call + one branch per (de)compression.
We could also introduce other methods of tracing in the future, like attaching the tracing functions to the cctx/dctx itself.
int ZSTD_trace_compress_begin(ZSTD_CCtx const* cctx) | ||
{ | ||
int enabled = 0; | ||
if (g_traceFile == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this design mandates the use of global variables ?
Could it be mitigated by allowing passing some void*
state parameter ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two options:
- Global state inside the tracing library.
- return
void*
from thebegin()
function, and consume it in theend()
function.
I chose the former, because the void*
would have to be malloc
'd. That would mean we have to be certain never to leak the void*
. Currently end()
is only called on success, so we would leak it on errors. We could fix that by calling free()
whenever we overwrite it with a new void*
or free the context. But, that introduces extra complexity to both zstd and the tracing library. Because now, what if the tracing library is in C++
. It couldn't use new
to allocate the object, it would have to use malloc
. And generally having one library allocate and another free seems complex.
So I decided to keep it simpler and have the tracing library keep track of the state. A more complex implementation would have a map<uintptr_t, State>
to keep track of a separate state per-context, using the context pointer as a key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also change the zstd implementation to make it compulsory to call end()
. Basically, if zstd detected that the void* tracingContext
is not NULL
when calling begin()
or during ZSTD_freeCCtx()
it would call end()
and set the error
member to some generic error (and we would have the flexibility to set a more specific error code later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the argument about simplicity.
It's just that, as it's a user API (even a hidden one), I'm easily worried that it could be more difficult to update it afterwards.
But overall, I'm fine with the idea of starting simple, experiment, and introduce complexity only when it feels more palatable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just that, as it's a user API (even a hidden one), I'm easily worried that it could be more difficult to update it afterwards.
At this point I expect this API to change between versions. Once we have enough experience with it, we can think about what stability guarantees we want to provide.
f1384cd
to
54a4998
Compare
This PR adds tracing hooks for zstd compression and decompression calls. Users can inject tracing by overriding the weakly linked tracing functions.
lib/common/trace.{h,c}
- see this file for what is tracedlib/
programs/zstdcli_tracing.{h,c}
enabled by--trace FILE
.The tracing functionality in the CLI is optional, if we don't want it I can remove it. I figured it would both provide an example of how to hook up the tracing, and potentially be useful to measure (de)compression speed and output in a machine readable format in some scenarios.
E.g.