-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new wasm rewriter w/ orchestrion #5494
base: master
Are you sure you want to change the base?
Conversation
packages/dd-trace/src/appsec/iast/taint-tracking/rewriter-esm.mjs
Outdated
Show resolved
Hide resolved
getRewriterOriginalPathAndLineFromSourceMap = | ||
getGetOriginalPathAndLineFromSourceMapFunction(chainSourceMap, getOriginalPathAndLineFromSourceMap) | ||
} | ||
setGetOriginalPathAndLineFromSourceMapFunction(chainSourceMap, iastRewriter) | ||
|
||
rewriter = new Rewriter({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you planning to move this outside IAST ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but not in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we supposed to have one single orchestrion.yml
file or one for each library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One single one.
That said, I can have it generated by having a separate one for each library and just combining them. That's relatively easy. Would you prefer that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only reviewed the part related to the rewriter, main comment:
- the rewriter part should be extracted as soon as possible from the taint-tracking (I get this is like that to release it quickly). I can help if needed.
telemetryVerbosity, | ||
chainSourceMap | ||
} | ||
|
||
try { | ||
Module.register('./rewriter-esm.mjs', { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this is not going to be specific thing for iast anymore, should we move this or at least import this from initialize.mjs
or loader-hooks.mjs
?
For iast we can assume that the rewriter is not working if node version is <v20.6.0
, but can whole orchestrion assume that? Asking because module.register
was added in 20.6.0.
Keep in mind that this code is not executed if there is no loader configured before, we added it for iast because there isn't any need to rewrite esm customer code when esm is not properly configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was talking about this with @rochdev and the biggest issue with doing this early on in the loading process is that we don't have config. We need to see if config.iast.enabled
before initializing the rewriter. That might be a breaking change. We can probably figure out a way of initializing at least that config early on, but changing how all that works is still a bigger change, and I'm not sure yet if it's breaking. What I've done instead is ensure that the existing langchain integration (i.e. by traditional RITM instrumentation) still works, so we're not reliant on Orchestrion for that. That way this PR is not a breaking change, and we can figure out how to enable Orchestrion for older Node.js at a later time, maybe in v6. How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to see if config.iast.enabled before initializing the rewriter.
oh, I'm stupid, yes. It was also one of the reasons why I wrote this here originally.
Keep in mind that this code is not executed if there is no loader configured before, we added it for iast because there isn't any need to rewrite esm customer code when esm is not properly configured.
I emphasize on this, this code is not executed if customer has not --loader
configured or initialize.mjs
loaded.
Is something that you can assume for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is something that you can assume for now?
That's fine for now. We can adjust this later if needed, but for now we just assume this way of enabling ESM support was used.
To reduce PR noise, I'd like to do that as a separate PR after this one. Is that alright? |
What does this PR do?
Adds support for
langchain
via Orchestrion-JS, by way of the newwasm-js-rewriter
, which replaces the oldnative-iast-rewriter
. The new rewriter contains both Orchestrion-JS and IAST rewriting for taint tracking.Motivation
Prior to Orchestrion-JS, we were unable to support
langchain
when used in an ESM app.Additional Notes
Right now this PR depends on a branch of the rewriter. Once the branch is merged on the rewriter, a cleanup commit will be added that removes things like
yarn link
and git dependencies.