-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: discussion of workspace "extending" behavior #48
Comments
Could explain what you mean by auto-extension? |
Whenever you call unset CMAKE_PREFIX_PATH
source /opt/ros/hydro/setup.bash
# calling catkin build will extend /opt/ros/hydro
mkdir -p ~/ws1/src
cd ~/ws1
catkin build
source ~/ws1/devel/setup.bash
# calling catkin build somewhere else will extend /opt/ros/hydro and ~/ws1
mkdir -p ~/ws2/src
cd ~/ws2
catkin build
source ~/ws2/devel/setup.bash
# calling catkin build somewhere else will extend /opt/ros/hydro and ~/ws1 and ~/ws2 etc |
This is not something unset CMAKE_PREFIX_PATH
source /opt/ros/hydro/setup.bash
# CMAKE_PREFIX_PATH will be "/opt/ros/hydro:"
cd ~/some_single_package
mkdir build
cd build
cmake ..
make
source ~/some_single_package/build/devel/setup.bash
# CMAKE_PREFIX_PATH will now be "/home/user/some_single_package/build/devel:/opt/ros/hydro" |
Yeah, I know that it's a feature of the underlying |
@tkruse @jack-oquin You guys brought this up over in #47 and it might be good to focus that discussion here. |
OK, @tkruse said it well there. Workspace chaining should not depend on a bunch of poorly-understood variables lying around in the users' shell environments. The workflow should declare the chaining explicitly when creating a new workspace. I don't know what that means for backwards compatibility or how to deal with all the documentation showing people various ways of making the current design work. In the absence of explicit chaining, I suppose one could fall back on implicit chaining via the shell environment, but that preserves one of the main causes for confusion and failure. |
Okay, so some other resources from the past on the topic: catkin currently uses the CMAKE_PREFIX_PATH when creating a new workspace to determine the workspaces the new one will depend on. An alternative approach would be that when creating a workspace, this variable is not used, but instead explicit user input is required, else the new workspace has no parents. I will guess this would require changing catkin itself, as I believe it currently does not offer any high-level python API to change that behavior. I believe the code doing this is in catkin/cmake/templates/_setup_util.py.in, so even from the filename you can see that this would be a messy operation. |
This behavior has nothing to do with
It depends on one variable,
The intention is that the user's environment drives the behavior of the build (sort of like
I agree, being explicit is always better, though commonly people complain about that too, e.g. defining build and run depends separately in the
Ok, so lets consider that for a moment. When I build a workspace I could pass the list of workspaces explicitly to
I think the biggest problem conceptually is that the act of building (invoking |
No tool for general audiences should ever remove anything from any environment variable (other than environment variables exclusively managed and used by that tool, i.e. 'private' env variables). Any sh file provided by ROS should only add to the environment. And the delta between the environment before and after sourcing must be fully predictable from just the command line used when creating the workspace and the current state of the workspace. No knowledge of the environment at workspace creation time should ever be required to predict the additions that any sh file will do to environment variables. When you source two sh files in sequence, then they will perform their modifications in sequence, so the order obviously matters. Given that workspace chaining implies some kind of overlaying, only the last element of the chain should be sourced, and that source action should do what is necessary to provide overlaying respecting the chain order. The CMAKE_PREFIX_PATH is intended to help resolve the location of packages for building against them, that's the only purpose it should ever be used for. I have always argued for that. Explicit is better than implicit, but DRY still applies, as does convention over configuration. Convention means using fixed values when no explicit value is given, not reading values from the environment. Catkin is based on implicit over explicit, makes people repeat themselves, and in some places does not provide useful conventions allowing to avoid explicit configuration. The preference order is for configuring stuff is (> means better than): |
@wjwwood: Please do not interpret my comments as harshly critical of you or of the current implementation. I hold you and your work in the highest regard. But, repeatability is the most important thing when building, and complex interactions with the shell environment work against that. I fully understand the need for some shell variables, and I know how tempting it can be to solve a tricky design problem by adding an extra variable. But they need to be simple, well-documented and relatively constant per-user. I did not like the excessive shell environment usage of rosbuild 5 years ago. With catkin, it's gotten even more complex. My personal experience with catkin over the last year and a half has been frustrating. Explaining catkin to intelligent lower-division undergraduates is very difficult. They waste a lot of time on the robots because they botched their shell environment, each in some different way. I defer to your judgement to determine what, if anything, can be done to improve matters at this late date. If there is no solution to this problem, then we'll need to document how it works better. But, that is difficult, given the present complexity. A cleaner design would be easier to document well. |
Even if a "new" cleaner design might not be applicable in the near term I would consider it very valuable if we would come up with something like that. Based on a clear and better vision we can always strive to get closer to it. But until now we have only identified some aspects which makes the current approach difficult to use. I don't see any concrete idea for a cleaner design yet. |
Well, like @dirk-thomas said, I think if we feel that there is a problem we have to identify it, and come up with a concrete way to change it. Only then can we try to come to some consensus on if and how it should change. So @jack-oquin you mentioned that you believe it:
In order to act on these points we need examples of how that is the case and suggestions on how it should/could be different.
We cannot act on this, we need to know HOW it was botched and we need suggestions and insight on how it could be different to avoid that. I sympathize with the point, but it is simply not helpful in resolving the issues to simply state that they exist without describing them in detail. I already tried to make progress on the "many, complex" environment variables discussion by pointing out that we only use the Additionally, in this thread we've already discussed having the build command take a list of workspaces explicitly (ignoring any workspaces in the environment) and I've raised some questions about how that affects certain workflows and the user experience which haven't been addressed since. So, I'm trying to solicit concrete input from you guys and then discuss it, but I'm having trouble keeping us on track. |
Nobody except you, Dirk or Tully knows or cares which parts of the environment is used by which tools. As best I can tell, here's what I need in order to get useful work done on ROS:
There's probably something I am forgetting, and maybe some of it is no longer necessary. My ROS_PACKAGE_PATH is relatively short at the moment, because my current workspace does not have much in it. It was over 5KB long recently. If our students get any of that wrong, they are in deep trouble. Most of them don't know how to deal with it. They all have to share a common student account on the robots, so they each need to create a separate workspace. Disaster! |
There have been suggestion for a cleaner design already. The first is that no part of the current So if a user wants to chaining, he has to say explicitly which workspace he wants to chain to, and only this information will be used. So if there is a lot or rubbish in the current environment, the generated setup files still remain pristine. The rubbish is not persisted to reappear later on. Don't store rubbish in the workspace configuration, don't blindly mash together setup files. The next suggestion is: Do not make any other modifications to the And given the way catkin was designed, do not expect the community to humbly report their problems to you. catkin has many flaws, and every flaws may have many symptoms, none of which have any useful error message pointing to the source of the problem, and the problems may appear in combinations. Users are happy enough if after yet another wasted day, their robot can start acting on the demo. Users have then no motivation at all to go back and find what was originally wrong and write a bug report. catkin has made it too difficult to debug problems and create clean reports. Again, I do not get my hopes up here because the current catkin maintainer still thinks this is a great idea:
(taken from catkin_make) As long as you have this attitude to reporting failures, you cannot expect useful feedback. I think it is rather ironical that you throw such design at users and then say something like "We cannot act on this, we need to know HOW it was botched". If you need to know that, then write tools that generate useful failure reports. Duh. |
Let's focus on the use of Proposal 1: Use a Catkin-specific veriable in place of $CMAKE_PREFIX_PATH(inspired by comments in #47) So I think the first thing is that catkin uses (and some argue abuses) the What if we add a catkin-specific environment variable, call it Proposal 2: Add an optional argument to
|
Thank you @jbohren for keeping this on track. I completely agree with Proposal 1. Having a separate variable for the catkin workspaces is clearly the better choice. I also agree that having an option as in Proposal 2 would make it easier. If I understand it correctly it should work the same way as when the following would be invoked manually One potential problem with this is the following: if the user has sourced a different workspace before (e.g. Regarding Proposal 3 I am not sure if I do see the need for having a default stored somewhere rather then just spelling it out explicitly. But lets not focus on that specific feature - as you described it can be added on top if found useful. |
Exactly.
Yeah, we can return to this in #47 |
What about my remark and question regarding proposal 2? Should we only allow the usage when the environment is clean to guarantee that the result is correct? |
@tkruse Please focus on the issue at hand. It is inappropriate and damaging to our discussion to bring in separate topics and use hyperbole to make your point. Please speak for yourself and make your opinions heard while focusing on constructive solutions for the topic at hand. |
We are starting to have multiple conversations here, but hopefully that's ok. @jack-oquin I wanted to address your feedback on the environment variables.
In fairness I think that list includes other develops of things which are represented in your list and anyone who would like to understand the system (the people who would stand to benefit from this being simpler). Lets take a closer look at the environment variables:
These two variables are the only two variables catkin actually uses internally and these are the only two variables which we could remove and replace with some other functionality. I'm not sure how we would do that, maybe a marker file or something. Currently the discussion is to stop using
These are all Gazebo specific, you'd have to take up these with them, catkin neither can control this nor is it forcing them to do things this way.
These, along with
ROS sets quite a few variables, but the more mercurial of the them is probably the I guess my point here is that all of the environment variables serve a purpose and most of them cannot be controlled by catkin.
That only occurs in the extreme case when you have every package installed into a different folder so you have N workspace's chained where N in the number of packages you are building.
They shouldn't be setting these manually, they should only be sourcing a setup file, so I'm not sure what they would do to get it wrong, but I believe you that they do get in erroneous states and I take the point that it would be complicated to debug in that case. So given that we want to improve this scenario, what do we do to make it simpler? We've already discussed making a custom |
I didn't say that there had not been, I said that I had raised questions about how that change would behave in a few described workflows and I got no direct feedback on that.
I'm not sure that makes sense, but as a user of software, I would never expect to go to an issue tracker and say "There was a problem and it sucked" and then have the developer be able to fix that. I think it is far more unreasonable to expect that the developers can address something which hasn't been described in detail and/or made reproducible. Further more, problems that user run into which have not been described cannot support any engineering decision to change the design because you have no idea what the actual problem was. Also the Python snippet you pulled has nothing to do with any of the problems described here, though I agree that silently suppressing the traceback makes it harder for users to make good reports. However, catching and suppressing the traceback in
Do you have any constructive suggestions on how we can generate better reports? (excluding the catkin_make Python traceback one) |
It would be ideal if someone could run the command even when the environment isn't clean. When the user sources the generated setup file, then it should get cleaned. Even if it only manages to clean |
Only cleaning the CATKIN_PREFIX_PATH would be incomplete. It needs to undo previous prepends of PATH, PYTHONPATH, LD_LIBRARY_PATH, etc. So I do think when sourcing a different environment it is mandatory to rollback changes introduces by previously sourcing a different file. Currently that is done implicitly. Arguably it could be done with an explicit command instead. And if not done explicitly by the user it could error out telling the user the command to first rollback the previous environment. Anyway this kind of rollback is always limited to the environment variable modified by the _setup_util.py file. For any user defined environment hook it is not possible since there is just not enough knowledge about it in order to roll it back. |
Yeah, I think that's fine as long as it's done with the current rollback functionality so that we don't blow away custom additions to things like
Yeah, I think to determine which is better we'd need to try out both methods.
Of course, but I think (and I expect we're all in agreement that) those variables are the ones that are most likely to cause problems. Is there any way to define a |
Somehow using the file system seems like a huge step in the right direction. The fundamental design problem is that we are effectively using the equivalent of C++ global variables for things that should be per-object. Because the scope of these variables is inappropriate, people dealing with multiple "objects" (i.e. workspaces) get very confused.
Yes, I did notice that. Perhaps I have not succeeded in convincing anyone that there is a problem.
I would like to understand how adding yet another environment variable will "make things more straightforward". Maybe it's true, but I don't get it.
The Gazebo variables have not caused much trouble, probably because they are relatively static, not changing for every workspace. I only mentioned them to remind you that actual users must to keep track of a large number of messy details. Saying "catkin only adds two" tends to obscure that point.
Along with ROS_PACKAGE_PATH, they seem to be the root cause of most problems. Every one of them contains per-workspace components. Wrong scope! Maybe it's time to think outside of the box.
It occurs because I am using
Each robot has a single student account which they must all share. So, we asked them to create a separate workspace for each project. If everybody were to edit the I see the mess after it has happened. It's usually hard to tell exactly how it got that way. In many cases, they probably had cruft in their environment when creating the workspace initially. ROS is not very helpful in diagnosing these kinds of problems. The students are intelligent, but inexperienced.
I don't know how it works right now, nor how to fix it. If I were designing something like this from scratch, I would use the file system and not the shell environment for per-workspace customization. The variables set in I have been trying to document the fact that there is a problem. Not every user with a problem knows how to fix it. That does not make the report invalid. |
If you're sharing a single user account I would suggest that putting anything in your bashrc which relates to a users workspace is inappropriate and that is very much like a power user where you're jumping workspaces all the time. Many people i've seen that reuse workspaces just add an alias to source their workspace for convenience. Often one letter or a very short word. "h" or "hydro" or "hsrc" Stepping back I don't think that we want to reinvent the isolation of shell processes. If we want to leave a clean environment behind after running something we should use a child shell process and then terminate it. (Or a new terminal) The other solution is that we can do a lot of accounting and try to clean things up automatically. But there's always going to be a corner case where something was manually modified which was automatically set and it will get rolled back incorrectly. The other option we could pursue would be to make it more visible that you're in an environment and what environment you're in. In the same way that virtualenv does this already by updating your PS1 display. We can probably even do this via a hooks package and allow customization. Virtualenv does not have any tools for rolling back env changes except quitting. I believe this approach can solve the useability issues without requiring us to change how the tool we rely on underneath us, such as the linux runtime linking, gcc and cmake. The workspaces sourced can be displayed, and you'll be able to see what is already on your path. And we don't need to try to support rolling back. If you want to roll back you will need to reconstruct from an empty environment. |
I think maintaining context is really important, too, but I've found that tools that are based on file structure can get really messy, themselves. I think a marker file is important, and I even think putting configuration info into that marker file is really the way to go. In that case, we could change the way catkin determines the environment at configure/build time like the following: _Given the following actions:_ # Load /opt/ros/hydro workspace
source /opt/ros/hydro/setup.bash
# Make a new workspace, called "ws1"
mkdir -p ~/ws1/src
cd ~/ws1/src && catkin_create_pkg foo
# Build the new workspace, chaining it from /opt/ros/hydro
cd ~/ws1 && catkin_make _Current Catkin Sticky Behavior:_ (correct me if I'm wrong)
_Idea for New Catkin Sticky Behavior:_
It makes things more straightforward in three ways:
|
Such restrictions will give you better error reports because the circumstances of any failure will have less variation.
As a result of all the hidden mechanics, when faced with any failure, users will start hacking away in unpredictable ways before asking for help, at which point their setup is too broken to reconstruct the original almost pristine state where the original failure occured. And thus you would not get a useful error report from the original incident. Make this test: Take some novice ROS user, give him a pencil, tell him a given
|
I would also again like to encourage people to consider that
is not a case where the second command should act by default as if the user wanted to create a workspace chained to hydro. Between those two commands weeks may have passed, the user may have run a thousand commands in between, the first command may have come from a .bashrc, the user might not be aware that he returned to a shell where previously he ran the source command, etc. Viable alternatives are explicit arguments or user interaction ("Do you really want to chain against ...(n):") |
Dirk's patch ros/catkin#641 fixes the rollback failure, and now I want to look at overriding the auto-chaining behavior. I just did a little experimenting and looking into the catkin source, I'm going to continue discussing overriding Once we figure out what the interface is to control that, I figure we can get back to discussing how |
In order to advance this discussion I have proposed a conference call to talk with all interested people directly (https://groups.google.com/forum/#!topic/ros-sig-buildsystem/K7_QBJzFlVA). Please consider to reply to the email on the mailing list. |
Here's a working example of how we can enable manual extension, make the |
I think we can close this issue. There was a lot of really helpful discussion, and I think the features in #58 and #80 enabled much greater control over workspace extension. I understand that there still aren't good guarantees when not starting with a clean shell, but at least there's more introspection into it, now. |
Currently,
catkin build
performs the workspace auto-extension behavior thatcatkin_make
also does by default. I think it would be really helpful to people to be able to create a clean workspace without having to messily unset various environment variables. This could be related to #47The text was updated successfully, but these errors were encountered: