-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Environment variable and CLI argument handling in user-land startup snapshots #55603
Comments
Opened #55604 for 1, since I think it's a handy flag on its own (and for 2 or 3 we'll also need some changes already in there to track the accessed environment variables in |
|
I was implementing process.accessedEnv.record(true);
// Do things..
process.accessedEnv.get(); // Returns a set of accessed environment variables
process.accessedEnv.record(false); So users can do what the process.accessedEnv.record(true);
v8.startupSnapshot.addSerializeCallback(() => {
// Get all the environment variables accessed by the script and do some checks.
const envs = process.accessedEnv.get();
process.accessedEnv.record(false);
});
// Run scripts If it's necessary to start the recording as soon as the process is launched instead of waiting until some JavaScript is executed, the configuration can be passed into the process via command line flags or snapshot configuration JSON. That also allows users to leave environment variable accesses coming from Node.js internals to Node.js and not worry about them. |
I can imagine that being useful for asserting about ENV access in unit tests too. How would you recommend getting the set of environment variables read by Node internals during the snapshot creation, such as |
Actually for that I think I can use |
I think we can support a configuration in the snapshot configuration json ( {
"builder": "snapshot.js",
"traceEnvInAPI": true
} which instructs Node.js to track all the environment variables in a set as soon as the configuration is read (currently snapshot config is read after a few very global initializations so we'll need to do some refactoring to move the read of the snapshot configs earlier once we notice that // Returns all the environment variables accessed up until the point
// process.accessedEnv.record(false) is called (if it is ever called)
// Set(*) { 'NODE_ICU_DATA', 'OPENSSL_CONF', 'NODE_EXTRA_CA_CERTS', ... }
process.accessedEnv.get(); The same can be implemented in a |
Maybe for a more dedicated API it's better to have a global record & local ones (e.g. for tests), so it can be something like: process.traceEnv.global.start(); // can also be started by --trace-env-start-global or traceEnvStartGlobal in snapshot config
process.traceEnv.global.get(); // Set(13) { 'NODE_ICU_DATA', 'OPENSSL_CONF', 'NODE_EXTRA_CA_CERTS', ... }
process.traceEnv.global.stop(); // Generally no need to do this for application code controlling the process We can implement the global one first and then look into a local version later, the local version would be a bit more complex to implement so should be in a follow up, though it's useful to think about what the local version looks like so the API looks more consistent. const trace = process.traceEnv.createLocal();
trace.start(); // By default it's started as soon as it's created.
trace.get();
trace.stop();
// When records is unreachable and gc'ed, the tracking will be stopped as well, along with it's env var stores Returning a set (lower memory usage) or an array seems adequate for the basic use cases though we can also think about more complex options (in |
Ah right, good point. |
During the snapshot building process, the
process.env
andprocess.execArgv
can be queried and captured in the snapshot, which may not match what the environment variables or CLI arguments actually are when the snapshot is deserialized.For the environment variables queried by Node.js internals, we currently work around this problem by avoiding to cache these values in JS land and always run the pre-execution code to refresh whatever states that are dependent on environment variables or CLI arguments, so after the snapshot is deserialized, any code would get new values from
process.env
andprocess.execArgv
that reflect the environment where the deserialized application is run. For user land access to these states in the builder script, the expectation is that users should do something similar - either avoid accessing them in the builder script to avoid capturing them in JS, or refreshing any cached states using the deserialize callbacks.During discussions with @acutmore it seems that while this can be worked around using the solution described above, it would be better if Node.js provides more utilities for working with the environment variables. I think some APIs may be useful:
Some CLI flags to trace access to environment variables, so that users can have a better idea about what variables are used and potentially cached during the snapshot building process, and whether the access need to be delayed to run time or at least refreshed during deserialization. This seems useful in general, not necessarily limited to the startup snapshot feature.
An API that passes a list of environment variables accessed by the builder script to a user-provided serialization callback. Users can e.g. store a copy of those (if necessary they can encrypt sensitive values) as part of the snapshot, and specify a deserializer callback to check for mismatches using that copy or adjust accordingly. Or generate a dotenv file using that API to ensure that some important environment variables are always configured to be consistent with what's used by the builder script, etc. A rough sketch looks like this (more ideas welcomed!)
Some field in the snapshot configurations that can be used to allow-list access to environment variables in the builder script, or some field that be used to run an alternative main script if certain environment variables mismatch as a fallback (this should be opt-in since serializing environment variables can store sensitive information into the blob) - this is also a rough sketch of what the general idea may look like:
The text was updated successfully, but these errors were encountered: