Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-path envvars with colons get mangled #448

Closed
mikepurvis opened this issue Mar 24, 2017 · 4 comments
Closed

Non-path envvars with colons get mangled #448

mikepurvis opened this issue Mar 24, 2017 · 4 comments

Comments

@mikepurvis
Copy link
Member

This is a regression introduced by the changes in #443. Specifically, @tspicer01 and I are seeing issues with $ROS_MASTER_URI, because the logic splits on : and can end up reordering the pieces within a given workspace, so we get crazy stuff in our environments like:

ROS_MASTER_URI=http:11311://localhost

I'll have a fix and some proper tests up shortly.

@mikepurvis
Copy link
Member Author

mikepurvis commented Mar 24, 2017

A couple paths forward on this:

  • Make the logic smarter in terms of preserving order of things, so that the mangling doesn't happen.
  • Make the logic smarter in terms of only merging keys that are definitely lists of colon-separated paths (eg, look for at least one that has a leading slash).
  • Avoid invoking any merge logic until there's an actual name collision (not convinced this would have saved ROS_MASTER_URI, since it would still end up getting merged down the package tree).
  • Include a blacklist for known problem keys
  • Include a whitelist, so that we only merge PYTHONPATH, INCLUDE_DIRS, PATH, CMAKE_PREFIX_PATH, etc.

The whitelist seems the safest— only perform mutations on things you really do understand, but it seems a little scary as well to ensure everything's covered. For example, what about LD_LIBRARY_PATH vs. DYLD_LIBRARY_PATH, ROSLISP_PACKAGE_DIRECTORIES, etc.

@wjwwood
Copy link
Member

wjwwood commented Mar 24, 2017

To be clear, I'm assuming this only matters if you're using --env-cache, right?


I now realize that your new logic (from #443) assumes that these paths can be reorder, which is probably not what contributors of the path extensions (catkin or individual packages) intended. Even if you preserve some of the order by avoiding a set(), the looping order looks arbitrary. However, some of these assumptions, that don't hold imo, may have been in the --env-cache logic before your recent pr.

Also, I think the only "perfect" way to know what should be blacklisted or whitelisted is to get that information from the packages which contributed the env hooks. For any reasonable group of variables I do not think you assume any of these things without information from the packages modifying them:

  • all variables are lists separated by :
  • deduplicating the list ok
  • whether or not you append or prepend doesn't matter (or order in general doesn't matter)
  • topological order of packages making changes to variables doesn't matter

Only the first one listed is what you've run into here.

Since some of these things are set implicitly by catkin itself (i.e. the packages don't explicitly ask to set these variables) I think you could possibly whitelist those env variables. Things like PATH, LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, PYTHONPATH, CMAKE_PREFIX_PATH, etc. can somewhat safely have assumptions made about them. But that doesn't account for a package that might manually manipulate these variables in a custom environment hook. You just have to assume (hope) they don't do that, otherwise the env cache thing will be broken I think.

For other variables like ROSLISP_PACKAGE_DIRECTORIES and ROS_MASTER_URI, which are set by packages using custom env hooks, you simply cannot assume anything about them in catkin_tools.


W.r.t. the order in which hooks are sourced (topological or alphabetical or by workspace or unspecified), I can talk a bit about what we did in ament to try to clarify this. In ament, env hooks are sourced in topological order, even across workspaces. I think this makes sense, because it allows packages later in the topological order to override dependent packages through manipulation of paths, like the PYTHONPATH, for example. Though I'm not sure how they ought to be handled in catkin when doing an isolated build like you're doing here. I think, but I'm not certain, that this is simply not specified in catkin. Within a single workspace I believe it is the alphabetical order of the env hook files themselves, which are sourced after caktin does its default setting of variables. In fact the environment hooks are done last, see:

https://github.com/ros/catkin/blob/kinetic-devel/cmake/templates/setup.sh.in#L78

And they are ordered (alphabetically and some other rules) in the Python script that's run during setup.sh:

https://github.com/ros/catkin/blob/5a7f61d594a56ca4b3ae1705e64140e9f29e0bd7/cmake/templates/_setup_util.py.in#L219

But once you cross the workspace threshold, then it is done by workspace (older first).

ament's design is slightly cleaner, imo, in another way too because the ament tool itself doesn't set environment variables itself. For example, an ament_cmake package will add itself to the CMAKE_PREFIX_PATH rather than relying on the build tool to so, and can choose to append or prepend itself and to only do so if the value is not already in the path. Whereas a pure Python package (a setup.py but no CMakeLists.txt) might only add itself to the PYTHONPATH, again choosing whether or not to prepend, append, or to do so uniquely. This is done with environment hook templates which are automatically added by the package (so no extra work for users), but it does so in each shell language, i.e. .sh or .bat for Windows. Here's an example for the PYTHONPATH in .sh:

https://github.com/ament/ament_package/blob/master/ament_package/template/environment_hook/pythonpath.sh.in#L3

But as you can see in that example, it uses these common functions like ament_prepend_unique_value, which is a shell function that does just what the name implies.

I can imagine a system where the packages which cause these changes declare what they want to do in a DSL-ish thing, something like "prepend X to PYTHONPATH" or "set ROS_MASTER_URI to Y", rather than expressing it in each shell language. From this DSL, shell scripts could be generated, and a tool like catkin_tools could also use this DSL to know what needs to be done in each case to avoid running the shell on each hook every time, or without having to rely on heuristics to determine how to treat the env variables due to a lack of this "intent" information.


Ultimately you loose information when you stop calling the env hooks each time. This is something I brought up at the time, and is why env cache defaults to off (see #249), and now we're trying to recover the missing logic which happens when you just take a snapshot of the env after each package so that we can avoid weird issues when trying to combine paths.

@mikepurvis
Copy link
Member Author

mikepurvis commented Mar 25, 2017

The code as written is a problem even without the cache (I'm working on a fix; a PR will be up later this evening or tomorrow). I see the issue with not evaluating hooks each time in terms of how it limits what the hooks are capable of doing, but I think there are other more fundamental limits which arise from the "isolated" mode of building when one also attempts to build packages in parallel, specifically:

  • catkin has no concept of extending multiple workspaces, so if a package sources two different dependencies' install spaces, the second will wipe the changes made by the first (this is the issue that motivated Correctly merge envvars from isolated workspaces. #443 in the first place, see Isolated result spaces broken when not extending #382 (comment)).
  • comparing the merge-install and isolate-install build modes, there's never going to be exact parity as far as hook execution order since a workspace boundary is being introduced between every build (the same holds true for catkin_make vs catkin_make_isolated— the hook order is alphabetical in the first and topological in the second).
  • I stand by the order as being arbitrary within a single package's immediate, in-workspace dependencies. With a merged workspace, you have the build products in a single resultspace, and then all the underlay stuff in one or more others behind that. With all the "in-workspace" dependencies being split up across multiple isolated installspaces, the order that those paths appear in really has no meaning.

As far as a plan forward, I'm reworking this logic:

  • To maintain order (which, in the isolated case, will mean the env loader order, which is given by get_cached_recursive_build_depends_in_workspace)
  • To only be applied to envvars which match .*PATH$.
  • To be in a standalone function covered by clear test cases.

At the end of the day, isolated installspaces with this logic do appear to work well for the env hooks used in the desktop_full package tree, and that's mostly what I care about for my particular use case.

@mikepurvis
Copy link
Member Author

I like the idea of this merge logic always running, even in the non isolated case, however I see the argument that this breaks the ability of a hook to do something like, for example, remove a directory from a path var.

If you'd like, I can add a condition that skips calling merge_envs in the case where there's only a single overlay. That would more closely mirror the catkin_make behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants