Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questionable internal casing of property name process.env.Path #20605

Closed
JamesMGreene opened this issue May 8, 2018 · 10 comments
Closed

Questionable internal casing of property name process.env.Path #20605

JamesMGreene opened this issue May 8, 2018 · 10 comments

Comments

@JamesMGreene
Copy link
Contributor

JamesMGreene commented May 8, 2018

  • Version: v8.11.1 (but affects all versions AFAIK)
  • Platform: Windows 10 Enterprise x64
  • Subsystem: process, child_process

There is some quirky-ness regarding the property name casing of process.env.PATH (which is actually cased as Path, it seems) and/or the usage of it via the child_process module, e.g. in child_process.spawn.

Does this internal casing of the property name as Path originate from Node.js or is that something that Node is provided by Windows/libuv under the covers? It results in weird behavior when trying to pass in a copied version of process.env to child_process.spawn (etc.).

Using the Node REPL:

> var childProcess = require('child_process');

> Object.keys(process.env).filter(k => k.toUpperCase() === 'PATH')
[ 'Path' ]

> process.env.PATH = 'C:\\Python27';
'C:\\Python27'

> childProcess.spawnSync('python', ['--version']).stderr.toString().trim()
'Python 2.7.11'

> let env = { ...process.env };  // a.k.a. `Object.assign({}, process.env);`
undefined

> Object.keys(env).filter(k => k.toUpperCase() === 'PATH')
[ 'Path' ]

> env.PATH
undefined

> env.Path
'C:\\Python27'

// ISSUE: Node unexpectedly using `process.env.PATH` instead of `options.env.Path`
> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
'Python 2.7.11'

> process.env.PATH = '';
''

// ISSUE: Node unexpectedly using `process.env.PATH` instead of `options.env.Path`
> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
TypeError: Cannot read property 'toString' of null

> env.PATH = env.Path;
'C:\\Python27'

> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
'Python 2.7.11'

I don't really understand how the difference is happening here. As far as I could tell looking through the Node.js core code, the env properties are enumerated and turned into envPairs (an array of ${key}=${value} strings), that are then utilized by "process_wrap.cc"'s Spawn method as env_pairs, so I didn't see anywhere where there is an expectation of having an uppercase PATH variable available at all. 😕

@richardlau
Copy link
Member

I believe the issue is that libuv is looking for uppercase PATH in the passed in options->env in the following code:

/*
* Attempt to find the value of the PATH environment variable in the child's
* preprocessed environment.
*
* If found, a pointer into `env` is returned. If not found, NULL is returned.
*/
static WCHAR* find_path(WCHAR *env) {
for (; env != NULL && *env != 0; env += wcslen(env) + 1) {
if (wcsncmp(env, L"PATH=", 5) == 0)
return &env[5];
}
return NULL;
}

which is called from the following code to find the path of the application being spawned:

node/deps/uv/src/win/process.c

Lines 1010 to 1046 in cf44abb

/* Get PATH environment variable. */
path = find_path(env);
if (path == NULL) {
DWORD path_len, r;
path_len = GetEnvironmentVariableW(L"PATH", NULL, 0);
if (path_len == 0) {
err = GetLastError();
goto done;
}
alloc_path = (WCHAR*) uv__malloc(path_len * sizeof(WCHAR));
if (alloc_path == NULL) {
err = ERROR_OUTOFMEMORY;
goto done;
}
path = alloc_path;
r = GetEnvironmentVariableW(L"PATH", path, path_len);
if (r == 0 || r >= path_len) {
err = GetLastError();
goto done;
}
}
err = uv__stdio_create(loop, options, &process->child_stdio_buffer);
if (err)
goto done;
application_path = search_path(application,
cwd,
path);
if (application_path == NULL) {
/* Not found. */
err = ERROR_FILE_NOT_FOUND;
goto done;
}

cc @nodejs/libuv @nodejs/platform-windows

@JamesMGreene
Copy link
Contributor Author

Nice find, @richardlau!

@JamesMGreene
Copy link
Contributor Author

That said, how is it that process.env.Path ends up matching on the PATH= check either? The envPairs are generated by enumerating through the process.env keys, which would give it Path rather than PATH as far as I can see.

@richardlau
Copy link
Member

Probably because this function sets any "missing" required environment variables:

/*
* The way windows takes environment variables is different than what C does;
* Windows wants a contiguous block of null-terminated strings, terminated
* with an additional null.
*
* Windows has a few "essential" environment variables. winsock will fail
* to initialize if SYSTEMROOT is not defined; some APIs make reference to
* TEMP. SYSTEMDRIVE is probably also important. We therefore ensure that
* these get defined if the input environment block does not contain any
* values for them.
*
* Also add variables known to Cygwin to be required for correct
* subprocess operation in many cases:
* https://github.com/Alexpux/Cygwin/blob/b266b04fbbd3a595f02ea149e4306d3ab9b1fe3d/winsup/cygwin/environ.cc#L955
*
*/
int make_program_env(char* env_block[], WCHAR** dst_ptr) {
WCHAR* dst;
WCHAR* ptr;
char** env;
size_t env_len = 0;
int len;
size_t i;
DWORD var_size;
size_t env_block_count = 1; /* 1 for null-terminator */
WCHAR* dst_copy;
WCHAR** ptr_copy;
WCHAR** env_copy;
DWORD* required_vars_value_len = alloca(n_required_vars * sizeof(DWORD*));
/* first pass: determine size in UTF-16 */
for (env = env_block; *env; env++) {
int len;
if (strchr(*env, '=')) {
len = MultiByteToWideChar(CP_UTF8,
0,
*env,
-1,
NULL,
0);
if (len <= 0) {
return GetLastError();
}
env_len += len;
env_block_count++;
}
}
/* second pass: copy to UTF-16 environment block */
dst_copy = (WCHAR*)uv__malloc(env_len * sizeof(WCHAR));
if (!dst_copy) {
return ERROR_OUTOFMEMORY;
}
env_copy = alloca(env_block_count * sizeof(WCHAR*));
ptr = dst_copy;
ptr_copy = env_copy;
for (env = env_block; *env; env++) {
if (strchr(*env, '=')) {
len = MultiByteToWideChar(CP_UTF8,
0,
*env,
-1,
ptr,
(int) (env_len - (ptr - dst_copy)));
if (len <= 0) {
DWORD err = GetLastError();
uv__free(dst_copy);
return err;
}
*ptr_copy++ = ptr;
ptr += len;
}
}
*ptr_copy = NULL;
assert(env_len == ptr - dst_copy);
/* sort our (UTF-16) copy */
qsort(env_copy, env_block_count-1, sizeof(wchar_t*), qsort_wcscmp);
/* third pass: check for required variables */
for (ptr_copy = env_copy, i = 0; i < n_required_vars; ) {
int cmp;
if (!*ptr_copy) {
cmp = -1;
} else {
cmp = env_strncmp(required_vars[i].wide_eq,
required_vars[i].len,
*ptr_copy);
}
if (cmp < 0) {
/* missing required var */
var_size = GetEnvironmentVariableW(required_vars[i].wide, NULL, 0);
required_vars_value_len[i] = var_size;
if (var_size != 0) {
env_len += required_vars[i].len;
env_len += var_size;
}
i++;
} else {
ptr_copy++;
if (cmp == 0)
i++;
}
}
/* final pass: copy, in sort order, and inserting required variables */
dst = uv__malloc((1+env_len) * sizeof(WCHAR));
if (!dst) {
uv__free(dst_copy);
return ERROR_OUTOFMEMORY;
}
for (ptr = dst, ptr_copy = env_copy, i = 0;
*ptr_copy || i < n_required_vars;
ptr += len) {
int cmp;
if (i >= n_required_vars) {
cmp = 1;
} else if (!*ptr_copy) {
cmp = -1;
} else {
cmp = env_strncmp(required_vars[i].wide_eq,
required_vars[i].len,
*ptr_copy);
}
if (cmp < 0) {
/* missing required var */
len = required_vars_value_len[i];
if (len) {
wcscpy(ptr, required_vars[i].wide_eq);
ptr += required_vars[i].len;
var_size = GetEnvironmentVariableW(required_vars[i].wide,
ptr,
(int) (env_len - (ptr - dst)));
if (var_size != len-1) { /* race condition? */
uv_fatal_error(GetLastError(), "GetEnvironmentVariableW");
}
}
i++;
} else {
/* copy var from env_block */
len = wcslen(*ptr_copy) + 1;
wmemcpy(ptr, *ptr_copy, len);
ptr_copy++;
if (cmp == 0)
i++;
}
}
/* Terminate with an extra NULL. */
assert(env_len == (ptr - dst));
*ptr = L'\0';
uv__free(dst_copy);
*dst_ptr = dst;
return 0;
}

of which uppercase PATH is one:

static const env_var_t required_vars[] = { /* keep me sorted */
E_V("HOMEDRIVE"),
E_V("HOMEPATH"),
E_V("LOGONSERVER"),
E_V("PATH"),
E_V("SYSTEMDRIVE"),
E_V("SYSTEMROOT"),
E_V("TEMP"),
E_V("USERDOMAIN"),
E_V("USERNAME"),
E_V("USERPROFILE"),
E_V("WINDIR"),
};

so if PATH is missing the OS will be queried (and will get Path, environment variables at the OS level on Windows are case insensitive). It looks like the code will ignore Path in the passed in envPairs (!).

@JamesMGreene
Copy link
Contributor Author

I would agree that seems to be what the code suggests but I also do not have Python in my OS-level PATH variable, so I do think it is somehow utilizing process.env.Path.

@richardlau
Copy link
Member

richardlau commented May 8, 2018

At the OS level PATH and Path are the same thing. process.env is special in that it has custom getter/setter methods:

node/src/node.cc

Lines 2618 to 2647 in b00c34c

static void EnvGetter(Local<Name> property,
const PropertyCallbackInfo<Value>& info) {
Isolate* isolate = info.GetIsolate();
if (property->IsSymbol()) {
return info.GetReturnValue().SetUndefined();
}
#ifdef __POSIX__
node::Utf8Value key(isolate, property);
const char* val = getenv(*key);
if (val) {
return info.GetReturnValue().Set(String::NewFromUtf8(isolate, val));
}
#else // _WIN32
node::TwoByteValue key(isolate, property);
WCHAR buffer[32767]; // The maximum size allowed for environment variables.
SetLastError(ERROR_SUCCESS);
DWORD result = GetEnvironmentVariableW(reinterpret_cast<WCHAR*>(*key),
buffer,
arraysize(buffer));
// If result >= sizeof buffer the buffer was too small. That should never
// happen. If result == 0 and result != ERROR_SUCCESS the variable was not
// not found.
if ((result > 0 || GetLastError() == ERROR_SUCCESS) &&
result < arraysize(buffer)) {
const uint16_t* two_byte_buffer = reinterpret_cast<const uint16_t*>(buffer);
Local<String> rc = String::NewFromTwoByte(isolate, two_byte_buffer);
return info.GetReturnValue().Set(rc);
}
#endif
}

node/src/node.cc

Lines 2650 to 2679 in b00c34c

static void EnvSetter(Local<Name> property,
Local<Value> value,
const PropertyCallbackInfo<Value>& info) {
Environment* env = Environment::GetCurrent(info);
if (config_pending_deprecation && env->EmitProcessEnvWarning() &&
!value->IsString() && !value->IsNumber() && !value->IsBoolean()) {
if (ProcessEmitDeprecationWarning(
env,
"Assigning any value other than a string, number, or boolean to a "
"process.env property is deprecated. Please make sure to convert the "
"value to a string before setting process.env with it.",
"DEP0104").IsNothing())
return;
}
#ifdef __POSIX__
node::Utf8Value key(info.GetIsolate(), property);
node::Utf8Value val(info.GetIsolate(), value);
setenv(*key, *val, 1);
#else // _WIN32
node::TwoByteValue key(info.GetIsolate(), property);
node::TwoByteValue val(info.GetIsolate(), value);
WCHAR* key_ptr = reinterpret_cast<WCHAR*>(*key);
// Environment variables that start with '=' are read-only.
if (key_ptr[0] != L'=') {
SetEnvironmentVariableW(key_ptr, reinterpret_cast<WCHAR*>(*val));
}
#endif
// Whether it worked or not, always return value.
info.GetReturnValue().Set(value);
}

So if you set process.env.Path the variable will be set at the OS level. (This doesn't apply if you copy process.env, e.g., let env = { ...process.env }; and set Path on the copy).

C:\work\node\bin\node-v8.11.1-win-x64>node
> process.env.Path;
'C:\\Python27'
> process.env.PATH;
'C:\\Python27'
> process.env.Path='';
''
> process.env.PATH; // for process.env on Windows, Path and PATH are the same (case insensitivity)
''
> var childProcess = require('child_process');
undefined
> let env = { ...process.env };
undefined
> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
TypeError: Cannot read property 'toString' of null
> env.Path='C:\\Python27' // libuv ignores this
'C:\\Python27'
> env.PATH // Path and PATH different in the env copy
undefined
> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
TypeError: Cannot read property 'toString' of null
> process.env.Path='C:\\Python27'
'C:\\Python27'
> process.env.PATH // for process.env on Windows, Path and PATH are the same (case insensitivity)
'C:\\Python27'
> childProcess.spawnSync('python', ['--version'], { env }).stderr.toString().trim()
'Python 2.7.13'
>

@JamesMGreene
Copy link
Contributor Author

So if you set process.env.Path the variable will be set at the OS level.

Ah, that was the missing piece of the puzzle for me. 👌

calebboyd pushed a commit to nexe/nexe that referenced this issue May 8, 2018
cjihrig added a commit to cjihrig/libuv that referenced this issue May 9, 2018
Refs: nodejs/node#20605
PR-URL: libuv#1837
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
@richardlau
Copy link
Member

This is kind of a note for myself since I haven't had time to look into this further, but thinking about this a bit more, I'm wondering if the libuv behaviour of using the passed in env.PATH (or env.Path after libuv/libuv#1837) on Windows to find the command to execute is correct. I don't think the Unix platforms behave this way (need to check) and the passed in env is given to the child process but not used to execute the child process?

@lundibundi
Copy link
Member

ping @richardlau should this stay open?
Would note that on Windows the first case-insensitively found path in env (after sorting it) if any will be used to look up the command to execute suffice?

@richardlau
Copy link
Member

Better documentation is always good and would be good enough to close this out. My last comment should probably still be investigated; if I remember correctly the libuv Windows code was using the path passed through in the env options to look up the command to execute which I don’t think is the case for Unix.

lundibundi added a commit to lundibundi/node that referenced this issue Mar 4, 2020
MylesBorins pushed a commit that referenced this issue Mar 11, 2020
Fixes: #20605
Refs: libuv/libuv#1837

PR-URL: #32091
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
MylesBorins pushed a commit that referenced this issue Mar 11, 2020
Fixes: #20605
Refs: libuv/libuv#1837

PR-URL: #32091
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
targos pushed a commit that referenced this issue Apr 22, 2020
Fixes: #20605
Refs: libuv/libuv#1837

PR-URL: #32091
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants