-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search for .mem in same directory as Node script #4942
Conversation
…ed code with separate .js.mem from Node.js, while not being in the same working directory. It was possible to workaround this and set search path using memoryInitializerPrefixURL since 7e9e24, but I think it makes sense to have a nice default too, which would be to search for .js.mem file in the same directory as .js itself when running using Node.js. Added regression test for this behaviour, which also revealed that SpiderMonkey / V8 shells already got this right, so this PR aligns Node.js behaviour to others.
A general question, why is node.js special here? Is it just because we have an easy way to get the dir of the |
@kripken As I said above, pure JS shells (V8 / SpiderMonkey) already do the right thing:
That is, Theoretically we could implement something similar for browsers too, but there 1) |
Thanks, now I see. |
src/postamble.js
Outdated
@@ -43,6 +43,8 @@ if (memoryInitializer) { | |||
memoryInitializer = Module['locateFile'](memoryInitializer); | |||
} else if (Module['memoryInitializerPrefixURL']) { | |||
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer; | |||
} else if (ENVIRONMENT_IS_NODE) { | |||
memoryInitializer = __dirname + '/' + memoryInitializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a comment here saying what this does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, somehow missed this - added a comment.
This may break existing users, I think? People that had the mem file in the current dir but the js in another? |
@kripken Do you think it's a common case when they had files in different directories? Given that |
Yeah, I agree it's very unlikely. Possibly also something we should have recommended people not do anyhow :) But I'm not 100% sure, and I worry about annoying users. Perhaps @juj has a sense of the risk here? One possibility to reduce the risk is look in both places, but it seems like that isn't easy to implement in the code, so maybe not worth the complexity? |
@kripken I've added a comment as per request. Anything else missing / what are the generic thoughts on whether this can be moved forward? |
What's left is to assess the risk of breakage, I think. @juj: maybe you have time to give your thoughts now, after GDC? Another possible thing to reduce the risk here is to improve the error message in the breakage case (i.e., when we fail to find the mem file, say on node "it should be in the same dir as the js file")? |
Sounds reasonable to me. |
I have some thoughts about this code... This will add another node-js dependency into resulting js file. Atm emscripten already loads "fs" "dir" module, which makes webpack "unhappy". And to make webpack accept current code, I had to preamble module file with "require=function(){}" line. For the issue itself, emscripten result can be preambled with
This allows users to set path to file themselves and put ".mem" file anywhere they want. Same for browsers
Using this method for overriding function I was able to load .mem file using axios lib and custom url. |
@Deamon87 It can be worked around, yes (I mentioned this in issue description), but the underlying issue in inconsistency of file search between engines still exists and looking in the same folder is sane default to have, as this is what most users expect. As for webpack - it's very flexible and you should configure it to use aliases or ignore specific modules if your issue is that they are accidentally included; preambling module with custom Anyway, I just don't think having problems with webpack configuration should be a blocker for fixing actual Node.js behavior. |
And btw, this change doesn't break webpack either as it knows what |
@RReverser |
@@ -43,6 +43,9 @@ if (memoryInitializer) { | |||
memoryInitializer = Module['locateFile'](memoryInitializer); | |||
} else if (Module['memoryInitializerPrefixURL']) { | |||
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer; | |||
} else if (ENVIRONMENT_IS_NODE) { | |||
// Look for .mem file in the same folder as script by default | |||
memoryInitializer = __dirname + '/' + memoryInitializer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions do come to mind:
- does node.js have a path join function that could be used here? I.e. is it guaranteed that
__dirname
does not end in a'/'
so that this won't do a/path/to/foo//foo.mem
? On Windows, should this be a backslash'\'
or is node.js satisfied or even expect'/'
on Windows as well? - are other file loads with node.js affected similarly? e.g. why is
.mem
file special here for node.js?
Last, I wonder if this would be better to read
if (typeof Module['locateFile'] === 'function') {
memoryInitializer = Module['locateFile'](memoryInitializer);
} else if (Module['memoryInitializerPrefixURL']) {
memoryInitializer = Module['memoryInitializerPrefixURL'] + memoryInitializer;
}
+ if (ENVIRONMENT_IS_NODE && isThisARelativeUrl(memoryInitializer)) {
+ // Relative paths should be treated relative to the folder where the .js script resides in
+ memoryInitializer = __dirname + '/' + memoryInitializer;
+ }
since Module['locateFile']
and Module['memoryInitializerPrefixURL']
may also be relative - do we ever want to load files relative to the cwd in node.js?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively to make it more generic, perhaps we could have a Module['documentDirectory']
which would be initialized to __dirname
on node.js by default, and the above type of code would always read relative directories with respect to Module['documentDirectory']
if that exists. That way we'd avoid having to put in node.js specifics at each place of use (assuming we will have more of them in the future, e.g. when node.js adds .wasm support?), and if we need a similar thing for SpiderMonkey or other shells, it would be decoupled as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I.e. is it guaranteed that __dirname does not end in a '/' so that this won't do a /path/to/foo//foo.mem?
Yeah, it is.
On Windows, should this be a backslash '' or is node.js satisfied or even expect '/' on Windows as well?
Windows itself doesn't mind forward slashes and accepts either, so I wouldn't bother.
e.g. why is .mem file special here for node.js?
.mem
is special only because Emscripten 1) generates it alongside the .js
but 2) searches in current working dir and not alongside .js
, which causes app to immediately crash if they don't match (e.g. when invoked by Cargo or other builder from top-level working directory). For regular file reads by the app itself, I think it's reasonable to leave behaviour as-is because they're not generated alongside .js
in the first place.
Module['memoryInitializerPrefixURL'] may also be relative - do we ever want to load files relative to the cwd in node.js
Yeah, as said above, I think for normal cases we want to load files relative to cwd, so that program would behave in the same way as when compiled natively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I don't understand the idea for Module['documentDirectory']
- what does "document" mean here?
Perhaps we need something like Module['pwd']
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kripken Overriding current working directory sounds irrelevant to this issue though? We want cwd to be still the same (at least where it exists for real, like in Node.js), it's only .mem
that needs to be fixed to be found alongside .js
as I explained above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@curiousdannii I already commented about the browser situations above couple of times - TL;DR it's quite different because browser doesn't have a notion of "pwd" unlike all the console shells, and not having a solution for everything is not a reason not to fix at least those we can. Anyway, this is blocked due to the other reasons now.
The idea to add this path name is a good solution. As commented earlier, I think it would be good to separate the location of initialization (platform specific) from the location of use (platform independent), so that we don't need to sprinkle if (ENVIRONMENT_IS_NODE) x = __dirname + x;
statements to wherever we need to load items relative to the main .js
document in question. I understand there is only one such place where we are fixing this for now, but as we discussed above, there will likely be more of such fixes in the future (.data
, .wasm
?), so having something like
// in preamble:
if (!Module['documentDirectory']) {
if (ENVIRONMENT_IS_NODE) Module['documentDirectory'] = __dirname + '/';
}
// in location(s) of use:
memoryInitializer = Module['documentDirectory'] + memoryInitializer;
This way when we expand in the future to having multiple locations that need this fix, we can just slap the Module['documentDirectory'] +
bit to them without having to worry about node.js specifics in such locations of use. That will keep platform specifics down to that one line at init time. It's a small change to the current proposed form, but I think it would show a good form for future fixes. @kripken, did you have a thought about the name of such a variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that we don't need to sprinkle if (ENVIRONMENT_IS_NODE) x = __dirname + x; statements to wherever we need to load items relative to the main .js document in question
Oh of course, I was thinking about having a generic function rather than copy-pasting in the worst case.
Anyway, this idea sounds good to me, except the naming which might be confusing, especially on the browser side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about scriptDirectory
? It should state pretty clearly that it should be a directory with generated script (and, consequently, other artifacts it depends upon).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module['scriptDirectory']
sounds good to me. @kripken ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me too.
In browser environment, the field
then at the place of use, it could complete relative paths with respect to |
@juj Please take another look at what I wrote above. Your proposed I mentioned browser side above that there we could / should use
|
TL;DR - If you prefer not to fix Node.js behaviour here, I can close the PR, just want to make sure that we're not going forth and back on exact same questions. |
Emscripten also generates other build outputs alongside .js in the same directory: these are similar to how the |
Ah yeah in this sense |
Regarding |
Can we fix |
@saschanaz No, see above. |
@juj So what is the latest plan on this given the information above? Do you want to introduce some generic solution for these generated dependencies? |
Another one is the Emterpreter data, which does currently apply to Node. |
In case anyone else finds this thread in search for a solution, this can be worked around in Node.js for an already-built module by using require-inject-scope. This allows you to inject the (My issue with this was, I had an Emscripten-built module, and needed to load it from another folder.) |
I think this can be closed as #5368 has been merged. |
I believe so, thanks @curiousdannii |
This seems to be a common source of issues when trying to run optimised code with separate
.js.mem
from Node.js, while not being in the same working directory as script itself.It was possible to workaround this and set search path using
memoryInitializerPrefixURL
since 7e9e24, but I think it makes sense to have a nice default too, which would be to search for.js.mem
file in the same directory as.js
itself when running using Node.js.Added regression test for this behaviour, which also revealed that SpiderMonkey / V8 shells already got this right, so this PR aligns Node.js behaviour to others.
(Note: this is reopening of #4936, cc @juj)