Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PR196 1/3] New asyncio-based execution engine #249

Merged
merged 33 commits into from
Jan 20, 2016
Merged

Conversation

jbohren
Copy link
Contributor

@jbohren jbohren commented Dec 15, 2015

This PR is the first set of refactored partial changes from PR #196 relating only to the execution engine and I/O handling, without the "linked" develspace support nor improved catkin clean support. It also includes the changes from #247 (and supersedes #248). This is meant to be a "functional" PR and it is expected that it might need further small revisions before being merged. This also adds a Travis CI configuration that builds on Linux as well as OS X.

asciicast

Main Module Changes

catkin_tools/common.py

The changes in catkin_tools.common include the following:

  • Replacing FakeLock with an asyncio-based implementation
  • Addition of get_recursive_build_dependants_in_workspace function which gives the packages that depend on a given package in the workspace. (This is used in the future)
  • Replacement of os.Popen with subprocess.Popen
  • Fixing bug where slice_to_printed_length fails if the length is larger than the length of the lookup array.
  • Fixing uncaught IOError when the program receives a sigint during a wide_log message
  • Adds a mkdir_p function which recursively makes directories

catkin_tools/resultspace.py

The main change to catkin_tools.resultspace is the addition of environment and env hook checksum caching. This enables faster builds by using the resultspace mechanism instead of the build_env.sh mechanism. The resultspace loading can also cache the environment, and updates the cache when a resultspace's Catkin env_hooks change in any way.

It also adds a strict flag which controls whether or not to check for a .catkin file before loading the environment.

catkin_tools/argument_parsing.py

The changes in catkin_tools.argument_parsing mainly support the new job server, which is initialized even when there isn't any support for the GNU Make jobserver, to allow for future extensions to the jobserver in other contexts (distcc, ninja, etc).

Execution Engine

The execution engine consists of the main asyncio-based Executor, Jobs, and job Stages described in #196. Jobs for different catkin build types are defined via setuptools entry_points. This improves job parallelism, but also provides a mechanism for capturing stdout and stderr independently. As such, warnings and errors are now detected and printed in a much clearer manner.

This PR improves on the CMake and Catkin jobs defined in #196 by forgoing using the build_env.sh script, and instead using the catkin_tools.resultspace-based environment loading. This not only speeds up builds, but it also fixes a bug that prevented single-pass workspace building as described in this thread.

More details on the design of the execution engine are forthcoming via a documentation PR #250

Job Environments

Previously, each build job got its environment from a build_env.sh file, which was generated before the package was built. This shell script essentially sources the resultspace that each job is meant to be built against.

This PR adds the ability to cache the resulting environment from sourcing these resultspace setup file. In addition to speeding up all builds, it speeds up building already-built packages dramatically. For the Hydro ros-base workspace (172 packages), catkin build without env caching takes about 30 seconds, and catkin build with env caching takes about 10 seconds. This means that it's saving between 0.1-0.2 seconds per package, which adds up. Note that this effect becomes more dramatic as more workspaces are chained.

This PR removes the build_env.sh files entirely, and instead opts for a non-static way to get the environment for a job. This is the --env option which can be passed to any verb which supports it. So to get the environment in which qt_gui_cpp is built, you could run catkin build --env qt_gui_cpp, and it will print the environment to stdout. This is used instead of build_env.sh to reproduce build stages as well, as shown below:

_________________________________________________________________________________
Warnings   << qt_gui_cpp:make /home/jbohren/ws/ct_roscomm/build/_logs/qt_gui_cpp/build.make.000.log
cd /home/jbohren/ws/ct_roscomm/build/qt_gui_cpp; catkin build --env qt_gui_cpp | xargs -I %ENV% env %ENV% /usr/bin/make --jobserver-fds=6,7 -j; cd -
sip: Deprecation warning: qt_gui_cpp.sip:1: %Module version number should be specified using the 'version' argument
** WARNING expected ``)'' = 40
.................................................................................

Since this feature changes behavior, it is off by default, but can be enabled / disabled with --env-cache and --no-env-cache, respectively.

Addresses the Following Issues

Enhancements

Definite Bugfixes

Likely Bugfixes (through different implementation)

Outstanding Issues

log(clr("[build] @!@{rf}Error:@| The workspace packages have a circular "
"dependency, and cannot be built. Please run `catkin list "
"--deps` to determine the problematic package(s)."))
return
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above provides a more graceful failure for #229

@xqms
Copy link
Contributor

xqms commented Dec 15, 2015

Sorry, answered to the wrong PR. I'm putting this here again so it's in the right place ;-)

Thanks for your work on separating this!

I just saw a unicode exception here:

Warnings << gazebo_ros_control:make /var/lib/jenkins/jobs/spacebot/workspace/build/_logs/gazebo_ros_control/build.make.000.log
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/home/max/install/catkin_tools/catkin_tools/execution/controllers.py", line 452, in run
    wide_log('\n'.join(lines))
  File "/home/max/install/catkin_tools/catkin_tools/common.py", line 450, in wide_log
    wide_log_fn(msg, **kwargs)
  File "/home/max/install/catkin_tools/catkin_tools/common.py", line 419, in disabled_wide_log
    log(msg, **kwargs)
  File "/home/max/install/catkin_tools/catkin_tools/common.py", line 272, in log
    print(*args, **kwargs)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 129: ordinal not in range(128)

The unicode character 2018 is "LEFT SINGLE QUOTATION MARK", emitted by gcc on my system:

[...] warning: enumeration value ‘EFFORT’ not handled in switch

@xqms
Copy link
Contributor

xqms commented Dec 15, 2015

Sorry, this may have been caused by running catkin_tools under Jenkins, which apparently sets stdout encoding to ascii. Running catkin build straight in the terminal works just fine.

I'll investigate further.

@jbohren
Copy link
Contributor Author

jbohren commented Dec 15, 2015

Sorry, this may have been caused by running catkin_tools under Jenkins, which apparently sets stdout encoding to ascii. Running catkin build straight in the terminal works just fine.

@xqms Yeah, overall, the unicode encoding/decoding needs to be cleaned up before or after this gets merged. Can you simulate the stdout encoding in a unit test?

@xqms
Copy link
Contributor

xqms commented Dec 16, 2015

For reference, I was able to fix my issue by setting the environment variable PYTHONIOENCODING=utf_8 when executing catkin_tools under jenkins. This forces the stdout encoding to UTF-8.

@xqms Yeah, overall, the unicode encoding/decoding needs to be cleaned up before or after this gets merged. Can you simulate the stdout encoding in a unit test?

I guess we should define the wanted behavior first. I'd propose that bytes captured from a job should be passed through 1:1 without caring about the encoding. First decoding to unicode and then re-encoding again to some variable output encoding is just asking for trouble.

This might conflict with the need to actually parse the output (e.g. line splitting). Checking for the newline byte should still be possible without decoding to unicode, though...

@NikolausDemmel
Copy link
Member

One question: Why are we downloading and compiling catkin from github and then sourcing the resulting setup.bash in the travis config? It turns out that (on my OS X machine) 5 tests fail without this, but why should that be a prerequisite?

@jbohren
Copy link
Contributor Author

jbohren commented Dec 17, 2015

One question: Why are we downloading and compiling catkin from github and then sourcing the resulting setup.bash in the travis config?

Well, we're getting catkin from github on Travis because it's a prerequisite for building the Catkin packages.

It turns out that (on my OS X machine) 5 tests fail without this, but why should that be a prerequisite?

@NikolausDemmel Which tests are failing, and are you running the tests in a clean environment, or after sourcing some setup file? If it's a clean environment, I would expect any tests which build Catkin CMake packages to fail.

@NikolausDemmel
Copy link
Member

Well, we're getting catkin from github on Travis because it's a prerequisite for building the Catkin packages.

Would it make sense to have a copy of catkin in the test suite for that?

Which tests are failing, and are you running the tests in a clean environment, or after sourcing some setup file? If it's a clean environment, I would expect any tests which build Catkin CMake packages to fail.

Clean environment w-r-t ROS, i.e. I haven't sourced any setup file, catkin is not available. It makes sense as you say. Output of test command is: https://gist.github.com/NikolausDemmel/b7e5acea3ec3539672c3#file-catkin_tools-nosetests

@jbohren
Copy link
Contributor Author

jbohren commented Dec 17, 2015

Well, we're getting catkin from github on Travis because it's a prerequisite for building the Catkin packages.

Would it make sense to have a copy of catkin in the test suite for that?

I don't think that's necessary. It might be best to fix the version (or even add more versions to the matrix), though. What do you think, @wjwwood?

Which tests are failing, and are you running the tests in a clean environment, or after sourcing some setup file? If it's a clean environment, I would expect any tests which build Catkin CMake packages to fail.

Clean environment w-r-t ROS, i.e. I haven't sourced any setup file, catkin is not available. It makes sense as you say. Output of test command is: https://gist.github.com/NikolausDemmel/b7e5acea3ec3539672c3#file-catkin_tools-nosetests

Yeah, that's what I would expect, all's good.

@NikolausDemmel
Copy link
Member

I don't think that's necessary. It might be best to fix the version (or even add more versions to the matrix), though.

I just find it a bit strange that after downloading catkin_tools and installing all python and system dependencies you can't successfully run the unit tests without making sure catkin is in your environment. That's why I think keeping a copy to make the tests self-contained might be reasonable. It would also fix the version of catkin that is tested against and can be updated explicitely when needed.

But in any case there should be some documentation about this if catkin is not included. I believe on your docs branch you have already some stuff for developers, like how to add verbs etc, so that might be the best place.

@jbohren
Copy link
Contributor Author

jbohren commented Dec 17, 2015

I don't think that's necessary. It might be best to fix the version (or even add more versions to the matrix), though.

I just find it a bit strange that after downloading catkin_tools and installing all python and system dependencies you can't successfully run the unit tests without making sure catkin is in your environment. That's why I think keeping a copy to make the tests self-contained might be reasonable. It would also fix the version of catkin that is tested against and can be updated explicitely when needed.

Yeah, I'm just always weary of copying source code around. If anything, I'd have the tests check it out from source and build/source it.

But in any case there should be some documentation about this if catkin is not included. I believe on your docs branch you have already some stuff for developers, like how to add verbs etc, so that might be the best place.

That sounds reasonable, I'll add that to #250.

@wjwwood
Copy link
Member

wjwwood commented Dec 17, 2015

My preference would be to have the tests check out a copy of catkin during the test setup. Other necessities like catkin_pkg and empy are actually dependencies (directly or indirectly) so they should be installed when you run the tests.

@mikepurvis
Copy link
Member

By necessity, catkin_tools and catkin are joined at the hip— it makes sense to have tests on catkin_tools which will detect breaking changes in catkin proper.

@wjwwood
Copy link
Member

wjwwood commented Dec 17, 2015

By necessity, catkin_tools and catkin are joined at the hip— it makes sense to have tests on catkin_tools which will detect breaking changes in catkin proper.

I don't actually agree with the first half of that statement 😄. It should be perfectly capable (in the future if not now) of installing a workspace full of pure cmake packages.

But I do agree that the main use case is to use it in conjunction with catkin. It's not clear to me what version of catkin it should checkout, e.g. the indigo-devel branch or the latest release tag and in the future when there are multiple "latest" releases to choose from, which one? It might even require it to be a configuration of the tests and to be added to the matrix of things to test.

@jbohren
Copy link
Contributor Author

jbohren commented Dec 18, 2015

But I do agree that the main use case is to use it in conjunction with catkin. It's not clear to me what version of catkin it should checkout, e.g. the indigo-devel branch or the latest release tag and in the future when there are multiple "latest" releases to choose from, which one? It might even require it to be a configuration of the tests and to be added to the matrix of things to test.

Here's the simplest way to implement this, as shown over in #251
https://github.com/catkin/catkin_tools/blob/pre-0.4.0-destdir/.travis.yml#L14

@jbohren jbohren changed the title New asyncio-based execution engine (PR#196 1/3) [PR196 1/3] New asyncio-based execution engine Dec 18, 2015
@NikolausDemmel NikolausDemmel mentioned this pull request Dec 21, 2015
#sudo add-apt-repository ppa:fkrull/deadsnakes -y
#sudo apt-get update
#sudo apt-get install python3.4 python3-dev
#fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it should be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is left over from before 3.4 was available on Travis Precise.

@wjwwood
Copy link
Member

wjwwood commented Dec 21, 2015

Does this need to be rebased?

@jbohren
Copy link
Contributor Author

jbohren commented Dec 21, 2015

Does this need to be rebased?

@wjwwood I don't think it needs to be rebased, it built on #247 and doesn't interfere with the other ones that you merged yesterday.

@wjwwood
Copy link
Member

wjwwood commented Jan 19, 2016

You need to lead the output with '\r' + (' ' * terminal_width()) + <output>, otherwise you get lines like this: [100%] Built target cpp_common] [1/4 jobs] [64 queued] [High Load] [cpp_common:make (100%) - 2.7]

@wjwwood
Copy link
Member

wjwwood commented Jan 19, 2016

Sorry that should be: '\r' + (' ' * terminal_width()) + '\r' + <output>

@wjwwood
Copy link
Member

wjwwood commented Jan 19, 2016

Also, should the warnings be re-quoted when using -i? I get stuff like this:

[ ... ]
-- Found PY_em: /Library/Python/2.7/site-packages/em.pyc
-- Using empy: /Library/Python/2.7/site-packages/em.pyc
-- Using CATKIN_ENABLE_TESTING: ON
-- Call enable_testing()
-- Using CATKIN_TEST_RESULTS_DIR: /Users/william/indigo/build/rosconsole_bridge/test_results
-- Found gtest: gtests will be built
-- Using Python nosetests: /usr/local/bin/nosetests-2.7
-- catkin 0.6.16
-- Configuring done
CMake Warning (dev):
  Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
-- Generating done
  --help-policy CMP0042" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  MACOSX_RPATH is not specified for the following targets:

   rosconsole_bridge

This warning is for project developers.  Use -Wno-dev to suppress it.

-- Build files have been written to: /Users/william/indigo/build/rosconsole_bridge
_________________________________________________________________________________________________________________________________________________________________________________
Warnings   << rosconsole_bridge:cmake /Users/william/indigo/build/_logs/rosconsole_bridge/build.cmake.000.log
CMake Warning (dev):
  Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  Run "cmake
  --help-policy CMP0042" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

  MACOSX_RPATH is not specified for the following targets:

   rosconsole_bridge

This warning is for project developers.  Use -Wno-dev to suppress it.

cd /Users/william/indigo/build/rosconsole_bridge; catkin build --get-env rosconsole_bridge | catkin env -si  /usr/local/bin/cmake /Users/william/indigo/src/rosconsole_bridge --no-warn-unused-cli -DCATKIN_DEVEL_PREFIX=/Users/william/indigo/devel -DCMAKE_INSTALL_PREFIX=/Users/william/indigo/install; cd -
.................................................................................................................................................................................

I haven't checked the behavior when using -v.

@wjwwood
Copy link
Member

wjwwood commented Jan 19, 2016

Another small thing I noticed are snippets like this one: @{yf}{cf}--@{yf}| Found the following Boost libraries: which appear to have been skipped over when doing substitutions.

@wjwwood
Copy link
Member

wjwwood commented Jan 20, 2016

Other than the overwriting of the status line when using -i, feel free to ticket the additional items to be done separately from this pr.

@jbohren
Copy link
Contributor Author

jbohren commented Jan 20, 2016

@wjwwood OK. All the console controller issues are resolved in 0359197 / ff9a92a

@wjwwood
Copy link
Member

wjwwood commented Jan 20, 2016

I think we need to print the status line after new input, but there might a performance trade-off.

For now I think it's good enough; we can address things after merging.

@wjwwood
Copy link
Member

wjwwood commented Jan 20, 2016

I also think we may not be closing out child processes correctly. But I can investigate that more later.

wjwwood added a commit that referenced this pull request Jan 20, 2016
[PR196 1/3] New asyncio-based execution engine
@wjwwood wjwwood merged commit c5c8807 into master Jan 20, 2016
@wjwwood wjwwood deleted the pre-0.4.0-executor branch January 20, 2016 02:10
@jbohren
Copy link
Contributor Author

jbohren commented Jan 20, 2016

I think we need to print the status line after new input, but there might a performance trade-off.

Yeah, I was thinking that, too.

For now I think it's good enough; we can address things after merging.

👏 Awesome.

@NikolausDemmel
Copy link
Member

Nice!!

@xqms
Copy link
Contributor

xqms commented Jan 20, 2016

Congrats on getting this merged - I think this has been a major step forward :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants