Python distribution task for user-defined setup.py + integration with ./pants {run/binary/test} #5141
Conversation
Quoting myself from #4740
@CMLivingston @kwlzn thoughts? Edit: correct issue #
@wisechengyi it's not specific to c-extensions tho, so that'd be framing it up wrong. the term maybe
@wisechengyi I am pretty torn on it; on one hand I see the confusion that could be had there in terms of the name, however I agree with Kris that the convention of using
Thanks @kwlzn @CMLivingston for explanation. I think
looks like a good start - handful of initial thoughts.
class PythonDistribution(PythonTarget):
  """A Python distribution containing c/cpp extensions.

  :API: public
once this or anything else goes in marked public, you won't be able to remove or change it without a two-minor-release deprecation cycle. so it would probably make sense to revisit the marking once we've demonstrated good use of it and nothing else needs to change.
✅
# Like Hello world, but built with Pants.

python_binary(
as-is, this example isn't the most readable because of how many files/dirs it's split across. it would probably make sense to collapse this example into a single dir that has a singular BUILD file containing a python_dist + python_binary and all of the sources in one location.
✅
from pants.base.payload_field import PrimitiveField

class PythonDistribution(PythonTarget):
  """A Python distribution containing c/cpp extensions.
there's nothing specific to c/cpp extensions here - it's just a python dist, full stop.
✅
python_distribution(
  name='superhello'
)
defining sources in the setup.py won't be enough to support things like changed or other features that rely on target->sources mappings. since the Target impl subclasses PythonTarget, if you weren't overriding sources=[] in the super call it would have an implicit sources= field that would be partially correct in this example (i.e. it only globs *.py at the moment, not the c files or other files). it would probably also make sense to 1) permit/require a definition of sources, 2) explicitly define sources= in this example since the point is to be illustrative about how to use it, and 3) consider what the most appropriate default sources globs should be for python_dist.
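For illustration, a collapsed single-directory version of the example with explicit sources might look like the following BUILD file (a hedged sketch using the python_dist alias adopted later in this PR; target and file names are illustrative):

python_dist(
  name='superhello',
  sources=[
    'setup.py',
    'super_greet.c',
    'hello_package/__init__.py',
    'hello_package/hello.py',
  ],
)

python_binary(
  name='main',
  source='main.py',
  dependencies=[
    ':superhello',
  ],
)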
Along the lines of #3, what if we just used rglobs(*.h|*.py|*.c|*.cpp) as an implicit default? This way the user would not need to define the sources they care about in two places (setup.py, and then also having to list them out in a sources param). I'm not sure I understand the value in allowing the user to define sources beyond restricting change detection to a subset of rglobs(*.h|*.py|*.c|*.cpp).
- rglobs would prevent python_dist from being able to co-exist with other targets in a dir or any child dir, which I think would be bad in general. I'm fairly sure (but not positive) that all of the existing implicit sources globs you'd find are regular globs for this reason.
- globbing all of the .h/.c/.cpp for python_dist feels unusual/unexpected as a user. I could see *.py - but I'd argue that C extension dists are typically an exception vs the norm in the general case, so we should probably require that users explicitly declare those when needed.
I'm open to other perspectives and counter arguments tho, but from an adoption strategy perspective in my experience it's way easier to add shortcuts like this later than it would be to take them away once delivered if we find they lead to bad/unwanted behavior - so IMHO, starting with the least surprising implicit behavior would be ideal.
✅
payload.add_fields({
  'platforms': PrimitiveField(tuple(maybe_list(platforms or [])))
})
super(PythonDistribution, self).__init__(sources=[], payload=payload, **kwargs)
worth noting that subclassing PythonTarget is going to net you a handful of fields that may or may not make sense in the dist context, like provides, resources and resource_targets. should consider the ramifications of what would happen if a user defines these expecting them to Just Work - and potentially add tests to cover those cases or disallow their definition.
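A minimal sketch of the "disallow" option, assuming the pants v1 target API and mirroring the TargetDefinitionException pattern used elsewhere in this PR (which fields to intercept, and rejecting rather than testing, are assumptions):

from pants.backend.python.targets.python_target import PythonTarget
from pants.base.exceptions import TargetDefinitionException

class PythonDistribution(PythonTarget):
  def __init__(self, provides=None, resources=None, resource_targets=None, **kwargs):
    # Inherited PythonTarget fields with no clear meaning when the dist is
    # built from a user-defined setup.py are rejected up front, rather than
    # being silently accepted and ignored.
    if provides or resources or resource_targets:
      raise TargetDefinitionException(
        self, 'provides, resources and resource_targets are not supported '
        'on python_dist targets.')
    super(PythonDistribution, self).__init__(**kwargs)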
✅
# Create a temporary working directory, pydistworkdir
pydist_workdir = os.path.join(workdir, 'pydistworkdir')
safe_mkdir(pydist_workdir)
pex_name = "%s.pex" % target.name
I would expect to see this keying all writes to some path under e.g. .pants.d/pyprep/local_dists/<target fingerprint>/... - ideally with that key computed outside of this function and handed in as a target dir vs being handed the workdir directly. it would also probably be a good idea to include some form of no-op reuse of previously built wheels on re-keys to the same fingerprint for build performance.
in order to ensure that sources are properly declared, it should likely also copy the declared sources for the target into a temporary dir in that path before invoking the setup.py in it. this will ensure that any temporary files created as part of that land in the workdir and thus don't need manual cleanup such as what you're doing for *.egg-info below.
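A rough sketch of the suggested keying (safe_mkdir is pants' real helper; the function shape is hypothetical):

import os

from pants.util.dirutil import safe_mkdir

def dist_workdir_for(local_dists_workdir, dist_tgt):
  # Key all writes by the target's payload fingerprint so that a re-build of
  # an unchanged target lands in the same dir and can no-op reuse its wheel.
  target_dir = os.path.join(local_dists_workdir, dist_tgt.payload.fingerprint())
  safe_mkdir(target_dir)
  return target_dir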
✅
:param dist_targets: PythonDistribution targets to dump
:param workdir: Working directory for python targets (./pantsd/python)
:param log: Use this logger.
"""
would be good, in general, to add type info to all docstrings. e.g.
:param list dist_targets: A list of `PythonDistribution` targets to build.
✅
We noticed that docstrings in most of the files we have touched don't have types. Should we add them to this file alone or should we address it in a separate pull request?
it doesn't need to be uniform, but in general always adding type info going forward would be preferred.
whl_dist = chroot_deps_contents[0]  # TODO: find better way to grab .whl from chroot
whl_location = os.path.join(pydist_workdir, fingerprint, '.deps', whl_dist)
return whl_location
return None
not successfully building a dist here should probably be an error.
✅
for tgt in dist_targets:
  whl_location = build_python_distribution_from_target(tgt, workdir)
  if whl_location:
    locations.add(whl_location)
if two different python_dist targets here key to the same wheel, that should probably be an error because otherwise one will be silently ignored.
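A hedged sketch of that check, layered onto the loop quoted above (TaskError is pants' standard task failure):

for tgt in dist_targets:
  whl_location = build_python_distribution_from_target(tgt, workdir)
  if whl_location in locations:
    # Fail loudly instead of silently dropping one of the two wheels.
    raise TaskError('Duplicate local dist {} built by {}'.format(whl_location, tgt.address.spec))
  locations.add(whl_location)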
✅
builder = PEXBuilder(path=path, interpreter=interpreter, copy=True)
dump_requirements(builder, interpreter, req_libs, self.context.log)
if python_dist_targets:
  dump_python_distributions(builder, python_dist_targets, self.workdir, self.context.log)
what happens here if the py_dist target happens to conflict with or overwrite a resolved requirement? seems to me like we should be dumping those prior to the resolve and then considering them in the full transitive resolve closure to reveal any resolution impact.
this also raises the question of how to handle transitive requirements of a python_dist's setup.py.. ideally we could construct the wheel intransitively and then let the resolve handle pulling in the transitive requirements as part of this.
there's also a question of how to handle setup_requires in a given setup.py, but that's something we can probably just punt on for now and revisit.
That's a good point. Would this approach require anything beyond moving this logic up higher (above dump_requirements) and letting dump_requirements de-dupe transitive requirements?
How would we ensure that 3rd party requirements resolved as part of a python distribution (specified by 3rdparty/python:dep) and requirements specified in setup.py are not duplicated? It seems like this would necessitate handling setup_requires in the here and now as well.
Overall, looks promising. I gave verbal feedback before and I will add those points as comments here:
- I would suggest giving python_distribution a sources field, even if it is always used as sources=rglob('**/*'). This should trigger the tests of dependent modules.
- In the long term, I would like to see some way to break the C/C++ code down into libraries that we can create a build graph from. This should facilitate better modularization.
Force-pushed 7c3ecc6 to 9b08c4e.
@@ -53,6 +56,7 @@ def register_goals():
task(name='interpreter', action=SelectInterpreter).install('pyprep')
task(name='requirements', action=ResolveRequirements).install('pyprep')
task(name='sources', action=GatherSources).install('pyprep')
task(name='python-dists', action=PythonCreateDistributions).install('pyprep')
Installing this task on pyprep seems like the best place to put it; however, with this choice it will only work in ./pants run if the produced .whl is dumped into the leftmost runner pex, which seems like an anti-pattern. Thoughts?
This is ready for another look when convenient. Unit/integration tests will follow after we solidify the task approach.
looking good mod a handful of questions and comments.
would also need some basic test coverage here before this can be merged.
'super_greet.c',
'hello_package/hello.py',
'hello_package/__init__.py',
'setup.py'
indent
✅
version='1.0.0',
ext_modules=[c_module],
packages=find_packages(),
)
indent
✅
python_tests(
  name='superhello',
  sources=[
    'test_superhello.py'
indent
Hmm, so I guess the linter catches only certain indent errors in *.py files, but not in BUILD files? Pretty sure it flagged indent issues in *.py.
✅
def test_superhello():
  assert hello.hello() == "Super hello"
indent
name='main',
source='main.py',
dependencies=[
  'examples/src/python/example/python_distribution/hello/superhello:superhello',
this should just reference :superhello, since it's in the same BUILD file
✅
fingerprint = dist_tgt.payload.fingerprint()
dist_target_dir = os.path.join(local_dists_workdir, fingerprint)
if not os.path.exists(dist_target_dir):
  log.debug('Creating working directory for target %s with fingerprint %s',
is there no way to use a FingerprintStrategy under care of a Task.invalidated() block to do this instead?
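A minimal sketch of that pattern, matching the Task.invalidated() usage that later commits in this PR adopt (_create_dist is a hypothetical helper):

with self.invalidated(dist_targets, invalidate_dependents=True) as invalidation_check:
  for vt in invalidation_check.invalid_vts:
    # vt.results_dir is keyed by the target's fingerprint, so targets whose
    # sources are unchanged stay valid and their wheels are reused for free.
    self._create_dist(vt.target, vt.results_dir)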
built_dists = set()

if dist_targets:
  # Check for duplicate distribution names, since we write the pexes to <dist>/<name>.pex.
this task produces wheels, not pex files tho - which I wouldn't expect to write to dist (until consumed by a pex).
afaict, it's possible for two python_dist targets to share the same base target name - but have totally different dist names based on their setup.py files. so I'm not sure this validation buys much here, fwict? thoughts?
We don't really know what dist_name will be created from the target definition currently. It seems like a reasonable approximation?
Yeah, I think you're right that this doesn't really protect against much. Just removed.
if len(dists) == 0:
  raise TaskError('No distributions were produced by python_create_distribution task.')
elif len(dists) > 1:
  raise TaskError('Ambiguous whls found: %s' % (' '.join(dists)))
this check is reasonable, but is this a case you've actually seen before? I wouldn't expect this to be possible without seriously non-standard setup.py.
I think this was a check we added before we implemented the fingerprint strategy, so it may not be necessary anymore. Seems safe/reasonable enough to keep, but do you think we should remove?
:param req_libs: A list of :class:`PythonRequirementLibrary` targets to resolve.
:param local_python_dist_targets: A list of :class:`PythonDistribution` targets to resolve.
:returns a PEX containing target requirements and any specified python dist targets.
nit: :returns:
✅
built_dists = self.context.products.get_data(PythonCreateDistributions.PYTHON_DISTS)
if built_dists:
  for dist in built_dists:
    builder.add_dist_location(dist)
seems to me like this is sidestepping the actual resolve, but ultimately I'm curious if this covers the case of:
- user has a target that depends on 3rdparty/python:pytorch which provides e.g. pytorch==1.0.0
- user also depends on a python_dist that provides pytorch==2.0.0
- when the pex is built, it blows up indicating two incompatible versions of pytorch?
if so, great. if not, this could spell out trouble at runtime - particularly because of transitive deps that users might not be aware of. either way this case should probably get covered with an integration test.
Great question. Let us get back to you on that, definitely need an integ test regardless.
that's a really good observation.
Does this issue get addressed here?
This is covered in the same way a resolve would blow up on normal python targets (i.e. throwing an error indicating two incompatible versions of pytorch). I have an integration test covering this.
safe_mkdir(dist_target_dir)

# Copy sources and setup.py over for packaging.
sources_rel_to_target_base = dist_tgt.sources_relative_to_target_base()
The code for copying source files should be taken care of with an invalidated block, I believe. (https://github.com/pantsbuild/pants/blob/master/src/python/pants/task/task.py#L309)
if not 'setup.py' in sources_basenames:
  raise TargetDefinitionException(self,
    'A setup.py is required to create a python_dist. You must include a setup.py file in your sources field.')
Also should check for multiple setup.py
✅
if not os.path.exists(install_dir):
  safe_mkdir(install_dir)
setup_runner = SetupPyRunner(dist_target_dir, 'bdist_wheel', interpreter=interpreter, install_dir=install_dir)
setup_runner.run()
Just a thought: how long does a single setup runner run take? Is there any way to run multiple setup runners simultaneously?
Not sure at the moment, and this will likely be a follow up concern after this PR has landed.
@@ -18,6 +18,7 @@
from pants.contrib.jax_ws.targets.jax_ws_library import JaxWsLibrary
Is this because of a style issue? If not, I would rather not submit a change here.
This is a style fix.
python_tests(
  name='superhello',
  sources=[
    'test_superhello.py'
4 spaces here vs. 2 below in dependencies
✅
names[name] = dist_target

with self.invalidated(dist_targets, invalidate_dependents=True) as invalidation_check:
  local_dists_workdir = self.local_dists_workdir
no need for local_dists_workdir any more.
✅
@@ -101,7 +104,8 @@ def extra_requirements(self):
def create_pex(self, pex_info=None):
  """Returns a wrapped pex that "merges" the other pexes via PEX_PATH."""
  relevant_targets = self.context.targets(
    lambda tgt: isinstance(tgt, (PythonRequirementLibrary, PythonTarget, Files)))
    lambda tgt: isinstance(tgt, (PythonDistribution,
indentation here is valid but a bit odd. I think we generally prefer to break earlier:

lambda tgt: isinstance(tgt, (
  PythonDistribution, ...))

or

lambda tgt: isinstance(
  tgt,
  (PythonDistribution, ...)
)

seems to be a bit better.
✅
@@ -0,0 +1,50 @@
# coding=utf-8
# Copyright 2014 Pants project contributors (see CONTRIBUTORS.md).
2017
✅
output = subprocess.check_output(pex)
self.assertIn('Super hello', output)
else:
  print('No python {} found. Skipping.'.format(version))
suggestion: instead of skipping the test, if the wrong version of python is installed, we should test that an error is generated appropriately.
This is a pants convention to skip tests if the system does not have the target interpreter. The test util checks for the interpreter by making a subprocess.check_output call, so worrying about the error is out of the scope of these test changes, I think. We can discuss further offline.
One small suggestion, feel free to take or leave.
@@ -57,7 +57,12 @@ def __init__(self,
sources_basenames = [os.path.basename(source) for source in sources]
if not 'setup.py' in sources_basenames:
Suggestion:

count_of_setup_py = [os.path.basename(s) for s in sources].count('setup.py')
if count_of_setup_py == 0:
  ...
✅
handful of comments - fwict it seems like there's still a decent functional gap here in that a python_dist's transitive deps won't be considered during resolution. this feels like a feature customers will need from day 1.
if count_of_setup_py > 1:
  raise TargetDefinitionException(self,
    'Multiple setup.py files detected. You can only specify one '
    'setup.py in a python_dist target.')
this second check seems like an arbitrary restriction to me - it's not clear why a file named setup.py in e.g. a nested child dir rglob'd into a python_dist would matter, as long as we can identify the main setup.py entrypoint target? conversely, if that invalid, nested setup.py was the only one in a target then this wouldn't complain either.
checking only for the existence of at least one "top-level" setup.py relative to the target definition would probably be more correct, since that's the shape of a typical python dist.
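A hedged sketch of the narrower check, assuming sources are paths relative to the target base (this is essentially the phrasing the PR settles on later):

# A bare 'setup.py' entry means it sits at the top level of the target;
# setup.py files nested elsewhere in the glob are simply ignored.
if 'setup.py' not in sources:
  raise TargetDefinitionException(
    self, 'A setup.py in the top-level directory relative to the target definition is required.')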
Makes sense. ✅
"""Return whether `invalid_targets` contains at least one target from `targets`. | ||
|
||
:param invalid_targets: A list of targets that have been invalidated by a cache manager. | ||
:param targets: A list of targets to check for membership in `invalid_targets`. |
nit: docstring param ordering
✅
# Dump built python distributions, if any, into the requirements pex.
# NB: If a python_dist depends on a requirement X and another target in play requires
# an incompatible version of X, this will cause the pex resolver to throw an incompatibility
# error and Pants will abort the build.
this is good to test for, but I'm not sure I'd expect to see many outbound dependency edges from a python_dist other than for transitive pins. in fact, it could make sense to disallow dependencies= for this new target type altogether as a simplification measure, since they don't make much sense in the dist building context (and pins could still be composed with a python_dist under a bare target).
but the test case I mentioned earlier (and the one I would be most worried about) would be if the python_dist's setup.py declared itself as pytorch==1.2.3 and then a target depended on that same requirement + e.g. python_requirement_library(pytorch==1.2.4) transitively. by just dumping the local dist into the pex chroot (vs considering that as part of the resolve), I believe this could lead to a pex that would be broken in subtle, weird ways.
I also don't see how any transitive deps of the python_dist would be resolved and included currently - and both of these problems appear to be somewhat intertwined at the impl level.
These sound like fair concerns. Let me ask a couple of clarifying questions to make sure I understand them correctly. I'm not entirely clear, so this might be easier to clarify over chat / hangouts. Lmk if that's viable for you.
1. I assume disallowing dependencies= would be mainly a #simplify measure, but it wouldn't cover the test case that you mention?
2. Re: the test case, are you referring to a python_dist with a setup.py that has:

setup(
  name='pytorch',
  version='1.2.3'
)

and a target that depends on this python_dist, and another target that depends on 1.2.4 transitively, like:

python_binary(
  name='main',
  source='main.py',
  dependencies=[
    ':main2',
    ':pytorch',
  ]
)

python_library(
  name='main2',
  source='main2.py',
  requirements=[
    'python_requirement_library(pytorch==1.2.4)',
  ]
)

3. Hope I'm wrong here, but it seems like resolving transitive deps is a pretty complex/nontrivial undertaking. I understand the need to #BeRigorous, but I'm wondering if we can balance it with "Iterative, team-based development"? What if "this feels like a feature customers will need from day 1" weren't true for our internal customer that requested this feature? Assuming we address all other concerns, is there a path we can take to unblock our customer now and tackle transitive deps in a separate pull request?
1. correct, two separate things there.
2. that would be exactly it for the conflict case. for transitive dep resolution, the idea would be a python_binary that depended on a python_dist with a setup.py that declared an install_requires=['some_transitive_dep'] - which would be transitively imported from the python_binary's main.py during a test/run. I believe currently this would fail on the import of some_transitive_dep due to no resolver treatment for setup.py declared deps.
3. hmm, I don't get the sense it'd be super complex. the first bit would be extracting the injectable requirement spec from the built wheel or setup.py, but you can probably get away with simple post-build wheel filename parsing for that (and then e.g. '=='.join([name, version]) - see the sketch below). the second bit would be injecting that inferred spec in with the rest of the transitive requirement specs at build time before they're all passed to the resolver (and ensuring the dir containing the wheel is injected into the find-links paths with precedence). afaict, this would solve for both cases described in 2. but a TODO (ideally w/ github issue link) and iteration is fine also.
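A minimal sketch of that post-build wheel filename parsing (essentially what the PR adopts further below; the helper name is illustrative):

import os

def requirement_spec_from_wheel(whl_path):
  # Wheel filenames are '<name>-<version>-<python tag>-<abi>-<platform>.whl',
  # so the first two dash-separated fields yield an injectable spec.
  name, version = os.path.basename(whl_path).split('-')[:2]
  return '=='.join([name, version])

# requirement_spec_from_wheel('superhello-1.0.0-cp27-none-any.whl')
# -> 'superhello==1.0.0'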
Cool, thank you for the clarifications and guidance on #3. I'll take a stab at it, and if it proves too complex / time consuming I may just TODO it.
So, related to #1 - the use case for including deps on a python_dist is that our internal customer wants to import tensorflow in setup.py so that they can load all the tensorflow includes into include_dirs. Would there be a way around this if we weren't going to include deps in python_dist?
"...our internal customer wants to import tensorflow in setup.py so that they can load all the tensorflow includes into include_dirs. Would there be a way around this, if we weren't going to include deps in python_dist?"
the dependencies declared on this target would never be made available to the executing setup.py for use in the current model. they would only be pulled in transitively for any target roots and resolved into deps for a final pex product (pex, chroot or otherwise).
setuptools/setup.py et al already make an affordance for this case with setup_requires. this ensures the declared setup_requires deps are present at setup.py execution time vs install time. making it so that the user could declare "i need tensorflow during setup.py execution time" would, afaict, involve plumbing full support for setup_requires in SetupPyRunner via either native setup.py declarations or BUILD file declarations. I think from the pants/pex side, this would involve composing a setup-time pex/chroot and executing the setup.py inside of that, where it'd be able to import tensorflow from a hermetic environment.
we typically don't deal with setup_requires much in pants/pex land because of the emphasis on bdists (wherein setup.py is pre-executed for us) for repeatability - but it'd certainly be a nice feature to have for python_dist consumption, since it'd bring us closer to parity with pip.
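For reference, the user-facing half of that would be plain setuptools - setup_requires itself is standard, while all of the pants-side plumbing described above remains hypothetical:

from setuptools import Extension, setup

setup(
  name='superhello',
  version='1.0.0',
  # Makes tensorflow importable while this setup.py *executes*, which is
  # when include_dirs for the extension would be computed.
  setup_requires=['tensorflow'],
  ext_modules=[Extension('super_greet', ['super_greet.c'])],
)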
format, e.g. ``'CPython>=3', or just ['>=2.7','<3']`` for requirements
agnostic to interpreter class.
"""
self.address = address
afaict, this happens in Target.__init__() so can kill that here.
✅
:return: A boolean indicating if any target in `targets` exists in `invalid_targets`.
"""
return any([target in invalid_targets for target in targets])
could use set logic here:

>>> set([3,2,6]).isdisjoint(set([3,5,9]))
False
>>> set([3,2,6]).isdisjoint(set([1,5,9]))
True
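Applied to the helper above, that suggestion collapses the body to a one-liner (same function name, same behavior):

def targets_are_invalid(targets, invalid_targets):
  """Return whether `invalid_targets` contains at least one target from `targets`."""
  return not set(targets).isdisjoint(invalid_targets)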
✅
if len(dists) == 0:
  raise TaskError('No distributions were produced by python_create_distribution task.')
else:
  return os.path.join(os.path.abspath(install_dir), dists[0])
can kill the else here and treat the if check as a guard clause.
✅
# Note that we check for the existence of the directory, instead of for invalid_vts,
# to cover the empty case.
if not os.path.isdir(path):
if not os.path.isdir(path) or targets_are_invalid(python_dist_targets, invalid_targets):
curious.. what is special about PythonDistribution targets vs others covered by the target_set_id path keying that necessitated this extra check?
You were right, we didn't need that and removed it.
def resolve_requirements(self, req_libs):
  with self.invalidated(req_libs) as invalidation_check:
def resolve_requirements(self, req_libs, python_dist_targets=()):
based on how python_dist_targets is used below, it seems like you could easily get at this with an inner local_python_dists = self.context.targets(is_local_python_dist) vs the new signature + changing the calling pattern of all callers. this would #simplify and reduce duplication.
✅
# Note that we check for the existence of the directory, instead of for invalid_vts,
# to cover the empty case.
if not os.path.isdir(path):
if not os.path.isdir(path) or targets_are_invalid(python_dist_targets, invalid_targets):
ditto.
✅
superhello_project = 'examples/src/python/example/python_distribution/hello/superhello'
superhello_tests = 'examples/tests/python/example/python_distribution/hello/test_superhello'

def test_pants_binary(self):
probably better to use the skipIf decorator for these?

@unittest.skipIf(has_python_version(...), "no python ...")
def test_thing(...):
  pass

..but I would also expect python2.7 to be a strict requirement of pants' CI and test environments, so it's probably not even worth doing this check in the first place.
I wrote it like this to leave room for extension when we want to test with Python 3 (similar to the interpreter selection tests).
In this snippet, wouldn't we be doing not has_python_version(...) instead?
✅
if not 'setup.py' in sources:
  raise TargetDefinitionException(
    self, 'A setup.py in the top-level directory relative to the target definition is required.'
A reference in the error message to the target address would be nice.
TargetDefinitionException contains this information.
@@ -128,6 +131,12 @@ def _create_binary(self, binary_tgt, results_dir):
dump_sources(builder, tgt, self.context.log)
dump_requirements(builder, interpreter, req_tgts, self.context.log, binary_tgt.platforms)

# Dump built python distributions, if any, into builder's chroot.
built_dists = self.context.products.get_data(BuildLocalPythonDistributions.PYTHON_DISTS)
If there are two local dists belonging to two separate targets (say A and B), running pants binary A B should generate pexes that each contain only the right local dist. Does this code handle the multiple-targets scenario correctly?
✅
import super_greet

def hello():
  print(super_greet.super_greet())
Remove? I assume this was for debugging.
Nope. That returns a string which needs to be printed for verification in stdout.
def test_pants_binary(self):
  self._maybe_test_pants_binary('2.7')
  command = ['binary', '{}:main'.format(self.superhello_project)]
  pants_run_27 = self.run_pants(command=command)
Since it's assumed that pants is running 2.7, is it necessary to specify that in these variables? Or are you future-proofing a bit?
Def not necessary, I can clean them up.
python_dist(
  name='superhello_with_conflicting_dep',
  sources=[
where is the 'conflicting dep' in this example case? your setup.py provides a package called superhello. your python_dist and python_binary both depend on the same pex_lib @ pex==1.2.11.
to have a conflicting case here to exercise the core resolver consideration that we've been discussing, your setup.py would need to either stomp on the pex name as built into a dist with a differing version - or depend on one via install_requires - in order to create a conflict (ideally, tests for both).
and again, your python_dist should also not declare a dependencies= here, because it is /providing/ a dep (and dragging its transitive dependencies along with it if/when the resolver is wired up correctly).
This case was for declaring a dep in the binary below (3rdparty/pex which conflicts iirc), but I think this is all probably unnecessary because we know targets can handle this correctly already.
So you are proposing we kill off the dependencies field of python_dist altogether and have internal teams list their deps (like tensorflow) in install_requires instead? Because writing python code that depends on TF in your py_dist and linking to 3rdparty/python:tensorflow in the dependencies field of a consuming binary target seems confusing.
"""Return the absolute path of the whl in a setup.py install directory.""" | ||
dists = glob.glob(os.path.join(install_dir, '*.whl')) | ||
if len(dists) == 0: | ||
raise TaskError('No distributions were produced by python_create_distribution task.') |
producing >1 wheels should probably be an error condition, or at least a warning - vs the silent masking happening here now.
✅
extra_reqs_pex = [
  self.resolve_requirements([self.context.build_graph.get_target(addr)])
]
pexes = extra_reqs_pex + pexes
this was probably better before, as the one liner.
@@ -1,5 +1,5 @@
# coding=utf-8
# Copyright 2016 Pants project contributors (see CONTRIBUTORS.md).
# Copyright 2017 Pants project contributors (see CONTRIBUTORS.md).
xx
✅
My previous suggestions have been addressed in previous commits. Just one minor nit here.
@@ -30,6 +33,10 @@ def has_python_sources(tgt):
  return isinstance(tgt, (PythonLibrary, PythonTests, PythonBinary)) and tgt.has_sources()

def is_local_python_dist(tgt):
We have a similar helper method above (is_distribution) and we have another call to isinstance in setup_py.py. Would it be useful to have one helper method, maybe in a util file?
Force-pushed ac3afc8 to cb4204d.
Is there something we needed to do before we can land this change? The failed tests seem to come from TravisCI infrastructure.
I've just restarted them. But the answer to your question is really "get a shipit from @kwlzn", probably.
Force-pushed 8631104 to 887d496.
Force-pushed 887d496 to 1a04e80.
The latest CI run's only failure comes from the golang lint check.
I've restarted the failing shard - looks good. Will review tomorrow.
Ping.
looking good - on the test cases in particular. handful of comments.
@@ -34,6 +37,7 @@ def build_file_aliases():
  PythonBinary.alias(): PythonBinary,
  PythonLibrary.alias(): PythonLibrary,
  PythonTests.alias(): PythonTests,
  'python_dist': PythonDistribution,
this should use PythonDistribution.alias() vs 'python_dist'
✅
if vt.valid:
  built_dists.add(self._get_whl_from_dir(os.path.join(vt.results_dir, 'dist')))
else:
  if vt.target.dependencies :
nit: whitespace
✅
raise TargetDefinitionException(
  vt.target, 'The `dependencies` field is disallowed on `python_dist` targets. List any 3rd '
  'party requirements in the install_requirements argument of your setup function.'
)
I would expect this check for dependencies + the TargetDefinitionException raising to happen in PythonDistribution.__init__() vs at task execution time?
AFAICT, this is not possible in the init method. My attempts have been blocked by AssertionError: Cannot retrieve dependencies of BuildFileAddress(testprojects/src/python/python_distribution/fasthello_with_install_requires/BUILD, fasthello) because it is not in the BuildGraph.
Yea, not feasible elsewhere: target constructors don't see their dependencies currently.
ah, k
✅
if len(dists) == 0:
  raise TaskError('No distributions were produced by python_create_distribution task.')
if len(dists) > 1:
  raise TaskError('Ambiguous local python distributions found: %s' % (' '.join(dists)))
nit: pants repo style dictates the '{}'.format(blah) form (vs pex's %s).
✅
def build_req_libs_provided_by_setup_file(context, local_built_dists, class_name):
  """Build a requirements library from a local wheel.
the use of the term 'build' here is misleading.. how about inject_synthetic_req_libs?
also, how about taking just a build_graph vs the entire context?
✅
@@ -19,6 +19,7 @@ def product_types(cls):
def execute(self):
  req_libs = self.context.targets(has_python_requirements)
  if req_libs:
  dist_tgts = self.context.targets(is_local_python_dist)
  if req_libs or dist_tgts:
I'm not seeing how dist_tgts is used here?
..but maybe has_python_requirements should be expanded to also match PythonDistribution targets, if they now provide a resolvable requirement on the surface?
This would not be viable in the current implementation fwict, as PythonDistribution targets would get treated as requirements libraries elsewhere in the codebase and break things. I see the need for the distinction because tasks that operate on req libs need PythonRequirementsLibrary targets, and while these are provided by py dist targets, it appears to me that the current backend does not accept PythonDistribution targets as has_python_requirements targets as of right now.
""" | ||
local_dist_targets = self.context.targets(is_local_python_dist) | ||
tgts = req_libs + local_dist_targets | ||
with self.invalidated(tgts) as invalidation_check: |
I'd expect this to conform more to the existing calling convention - so ideally the caller passes in local_dist_targets as either a combined or separate param here? that would align better with how things are used below too, afaict?
✅
tgts = req_libs + (local_dist_targets or []) works as well.
req_name = '=='.join([whl_metadata[0], whl_metadata[1]])
local_whl_reqs.append(PythonRequirement(req_name, repository=whl_dir))
if local_whl_reqs:
  addr = Address.parse(class_name)
the class name feels like an odd Address to use here? it won't be unique, for one - so this likely needs to be generative into some synthetic namespace.
✅
# binary targets that each have their own unique python_dist dependency.
if any([tgt.id in dist for tgt in binary_tgt.closure(exclude_scopes=Scopes.COMPILE)]):
  builder.add_dist_location(dist)
how are you ensuring alignment between the `dump_requirements` provided req->dists vs the dists explicitly added here? i.e. how does `dump_requirements` know where to resolve custom dists from in the case where they're injected into `req_tgts` above?
✅
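For context on how the resolve stays aligned: the earlier snippet pins the synthetic requirement to the exact built version and points `repository=` at the directory holding the fresh wheel, so the resolver finds it locally instead of on a remote index. A minimal sketch (the file path is hypothetical):

```python
# Assumed import path for PythonRequirement in this era of the codebase.
from pants.backend.python.python_requirement import PythonRequirement

# 'superhello==0.0.1' pins the exact built version; repository= points the
# resolver at the local directory containing superhello-0.0.1-*.whl.
req = PythonRequirement('superhello==0.0.1',
                        repository='/tmp/pants.d/python_dists/superhello')
```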
```python
# This protects against the case where a single `./pants binary` command builds two
# binary targets that each have their own unique python_dist dependency.
if any([tgt.id in dist for tgt in binary_tgt.closure(exclude_scopes=Scopes.COMPILE)]):
  builder.add_dist_location(dist)
```
might be good to add a test case for the case described here - did not see one below.
✅
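Something along these lines might cover it, sketched against `PantsRunIntegrationTest` (the two target paths are hypothetical):

```python
from pants_test.pants_run_integration_test import PantsRunIntegrationTest


class PythonDistributionIntegrationTest(PantsRunIntegrationTest):

  def test_multiple_binaries_with_unique_dists(self):
    # Two binary targets, each with its own python_dist dependency, built in
    # a single `./pants binary` invocation: neither pex should pick up the
    # other's locally built wheel.
    command = ['binary',
               'testprojects/src/python/python_distribution/superhello:main',
               'testprojects/src/python/python_distribution/superhello2:main2']
    pants_run = self.run_pants(command=command)
    self.assert_success(pants_run)
```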
LGTM - one last important comment about synthetic address namespacing and a couple of readability/doc string things - but otherwise I think we're good to land this and iterate.
```python
  in the binary for the case where a user specifies multiple binary targets in a single invocation of
  `./pants binary`.
  :return: a :class:`PythonRequirementLibrary` containing a local wheel and its
    transitive dependencies.
```
afaict, it doesn't actually include transitive dependencies here - just the surface requirement that aligns to the built wheel.
```python
addr = Address.parse(synthetic_address)
build_graph.inject_synthetic_target(addr, PythonRequirementLibrary, requirements=local_whl_reqs)
req_libs = [build_graph.get_target(addr)]
return req_libs
```
the logic here seems fine but I think this function would be more readable/maintainable if you decomposed it a bit - something along the lines of (note the renamed function name + docstring, too):
```python
def inject_synthetic_dist_requirements(build_graph, local_built_dists, synthetic_address, binary_tgt=None):
  """Injects synthetic `PythonRequirement` objects from a `PythonDistribution` into the `BuildGraph`.
  ...
  """
  def should_create_req(bin_tgt, loc):
    if not bin_tgt:
      return True
    # <-- Add comment here about why this is necessary
    return any([tgt.id in loc for tgt in bin_tgt.closure()])

  def python_requirement_from_wheel(path):
    base = os.path.basename(path)
    whl_dir = os.path.dirname(path)
    whl_metadata = base.split('-')
    req_name = '=='.join([whl_metadata[0], whl_metadata[1]])
    return PythonRequirement(req_name, repository=whl_dir)

  local_whl_reqs = [
    python_requirement_from_wheel(whl_location)
    for whl_location in local_built_dists
    if should_create_req(binary_tgt, whl_location)
  ]
  if not local_whl_reqs:
    return []

  addr = Address.parse(synthetic_address)
  build_graph.inject_synthetic_target(addr, PythonRequirementLibrary, requirements=local_whl_reqs)
  return [build_graph.get_target(addr)]
```
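One note on `python_requirement_from_wheel` above: wheel filenames follow PEP 427 (`{name}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, with any `-` in the project name escaped to `_`), which is why splitting the basename on `-` safely recovers the name and version:

```python
# Example wheel filename (hypothetical dist) parsed per PEP 427.
base = 'superhello-0.0.1-cp27-cp27m-macosx_10_12_x86_64.whl'
name, version = base.split('-')[:2]
print('=='.join([name, version]))  # -> superhello==0.0.1
```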
```python
if built_dists:
  req_tgts = inject_req_libs_provided_by_setup_file(self.context.build_graph,
                                                    built_dists,
                                                    binary_tgt.invalidation_hash(), binary_tgt) + req_tgts
```
generally speaking, synthetic addresses would map to the path hierarchy of a given target w/ some synthetic name (e.g. `//path/to/the/dir:my_injected_dist_0`).. this maps to e.g. `//<some_sha_here>` in the root.

at a minimum I think you would want to key this into the target's path namespace, e.g. something like:

`':'.join((binary_tgt.address.rel_path, binary_tgt.invalidation_hash()))`

..it'd also be good to break out the last two params here each onto its own line for readability while you're there.
and at that point, it's probably just as safe/sane to use a static name key like:

`':'.join((binary_tgt.address.rel_path, '{}_dist_reqs'.format(binary_tgt.address.target_name)))`
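For a binary defined at, say, `src/python/app/BUILD` with target name `bin` (hypothetical coordinates), that static key yields a stable synthetic spec inside the target's own namespace:

```python
# Hypothetical binary target coordinates.
rel_path, target_name = 'src/python/app', 'bin'
synthetic_address = ':'.join((rel_path, '{}_dist_reqs'.format(target_name)))
print(synthetic_address)  # -> src/python/app:bin_dist_reqs
```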
```python
# Note that we check for the existence of the directory, instead of for invalid_vts,
# to cover the empty case.
if not os.path.isdir(path):
  with safe_concurrent_creation(path) as safe_path:
    # Handle locally-built python distribution dependencies.
    built_dists = self.context.products.get_data(BuildLocalPythonDistributions.PYTHON_DISTS)
```
Just noticed something: What happens if only these local dists change? The invalidated() block won't notice that, so the requirements pex won't be rebuilt. It seems like their originating targets should participate in the invalidation?
Oh, I guess this is why the `local_dist_targets` arg exists? But they aren't passed in practice, as far as I can tell. And the duplication is a bit weird. Any reason not to do the `inject_synthetic_dist_requirements()` when the distribution is created, i.e., in `build_local_python_distributions.py`? That seems more idiomatic. Then downstream tasks like this one don't have to distinguish between the different provenances of requirement libraries, and can treat them uniformly.
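A rough sketch of that suggestion (hypothetical helper; reuses `inject_synthetic_dist_requirements` from the snippet above): perform the injection in the dist-building task as each wheel is produced, so every downstream task sees an ordinary `PythonRequirementLibrary`.

```python
# Hypothetical helper inside build_local_python_distributions.py.
def register_built_dist(build_graph, dist_target, whl_path):
  # Key the synthetic target into the dist target's own path namespace.
  synthetic_address = ':'.join((dist_target.address.rel_path,
                                '{}_dist_reqs'.format(dist_target.address.target_name)))
  # Returns the injected PythonRequirementLibrary wrapping the built wheel.
  return inject_synthetic_dist_requirements(build_graph, [whl_path], synthetic_address)
```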
Add unit tests, refactor C/C++ targets which can be compiled/linked and used in python_dist() with ctypes (#5815)

### Problem

It is currently possible to expose native code to Python by compiling it in a `python_dist()` target, specifying C or C++ source files as a `distutils.core.Extension` in the `setup.py` file, as well as in the target's sources. `python_dist()` was introduced in #5141. We introduced a "native toolchain" to compile native sources for this use case in #5490. Exposing Python code this way requires using the Python native API and `#include <Python.h>` in your source files. However, python can also interact with native code that does not use the Python native API, using the provided `ctypes` library. For this to work, the `python_dist()` module using `ctypes` needs to have a platform-specific shared library provided within the package. This PR introduces the targets, tasks, and subsystems to compile and link a shared library from native code, then inserts it into the `python_dist()` where it is easily accessible.

### Solution

- Introduce the `ctypes_compatible_c_library()` target which covers C sources (`ctypes_compatible_cpp_library()` for C++), and can specify what to name the shared library created from linking the object files compiled from its sources and dependencies.
- Introduce `CCompile`, `CppCompile`, and `LinkSharedLibraries` to produce the shared libraries from the native sources. The compile tasks use options `CCompileSettings` or `CppCompileSettings` to define file extensions for "header" and "source" files.
- Introduce the `CCompileSettings` and `CppCompileSettings` subsystems to control compile settings for those languages.
- Convert `BuildLocalPythonDistributions` to proxy to the native backend through the new `PythonNativeCode` subsystem.
- Move all the `BinaryTool` subsystems to a `subsystems/binaries` subdirectory, and expose them to the v2 engine through `@rule`s defined in the subsystem's file.
- Move some of the logic in `pex_build_util.py` to `setup_py.py`, and expose datatypes composing the setup.py environment through `@rule`s in `setup_py.py`. `SetupPyRunner.for_bdist_wheel()` was created to set the wheel's platform, if the `python_dist()` target contains any native sources of its own, or depends on any `ctypes_compatible_*_library`s.

**Note:** the new targets are specifically prefixed with `ctypes_compatible_` because we don't yet eclipse the functionality of `contrib/cpp`. When the targets become usable for more than this one use case, the name should be changed.

### Result

To see how to link up native and Python code with `ctypes`, here's (most of) the contents of `testprojects/src/python/python_distribution/ctypes`:

*BUILD*:
```python
ctypes_compatible_c_library(
  name='c_library',
  sources=['some_math.h', 'some_math.c', 'src-subdir/add_three.h', 'src-subdir/add_three.c'],
  ctypes_dylib=native_artifact(lib_name='asdf-c'),
)

ctypes_compatible_cpp_library(
  name='cpp_library',
  sources=['some_more_math.hpp', 'some_more_math.cpp'],
  ctypes_dylib=native_artifact(lib_name='asdf-cpp'),
)

python_dist(
  sources=[
    'setup.py',
    'ctypes_python_pkg/__init__.py',
    'ctypes_python_pkg/ctypes_wrapper.py',
  ],
  dependencies=[
    ':c_library',
    ':cpp_library',
  ],
)
```

*setup.py*:
```python
setup(
  name='ctypes_test',
  version='0.0.1',
  packages=find_packages(),
  # Declare two files at the top-level directory (denoted by '').
  data_files=[('', ['libasdf-c.so', 'libasdf-cpp.so'])],
)
```

*ctypes_python_pkg/ctypes_wrapper.py*:
```python
import ctypes
import os


def get_generated_shared_lib(lib_name):
  # These are the same filenames as in setup.py.
  filename = 'lib{}.so'.format(lib_name)
  # The data files are in the root directory, but we are in ctypes_python_pkg/.
  rel_path = os.path.join(os.path.dirname(__file__), '..', filename)
  return os.path.normpath(rel_path)


asdf_c_lib_path = get_generated_shared_lib('asdf-c')
asdf_cpp_lib_path = get_generated_shared_lib('asdf-cpp')

asdf_c_lib = ctypes.CDLL(asdf_c_lib_path)
asdf_cpp_lib = ctypes.CDLL(asdf_cpp_lib_path)


def f(x):
  added = asdf_c_lib.add_three(x)
  multiplied = asdf_cpp_lib.multiply_by_three(added)
  return multiplied
```

Now, the target `testprojects/src/python/python_distribution/ctypes` can be depended on in a BUILD file, and other python code can freely use `from ctypes_python_pkg.ctypes_wrapper import f` to start jumping into native code.

#### Follow-up issues:
1. #5933
2. #5934
3. #5949
4. #5950
5. #5951
6. #5962
7. #5967
8. #5977
Add simple integration test
Pull in master
Minor cleanup
Fix CI errors
Debug log stdout
Fix integration test
Fix integration test
Fix lint error on third party
Problem

There is no way to consume a python_library with c_extensions. Relates to #4740.

Solution

Create a new target, `python_distribution`, that specifies only a distribution name that maps to the directory the BUILD file lives in (a distribution's project directory). Pants will treat the directory that the `python_distribution` target is defined in as a distribution project directory (containing a setup.py and sources) and run the pex command on that directory to build an inline wheel using the setup.py. The `./pants binary` and `./pants run` tasks will dump this distribution into the binary pex for consumption (or into the requirements pex for the `./pants run` case). The setup.py takes care of linking the package sources and c/cpp code together as setuptools `Extension`s. We've run a similar example by our customer to validate that this meets their needs.

Result

Running `./pants binary` or `./pants run` on a `python_binary` target that depends on a python distribution will package the distribution as a wheel and include it in the created pex binary or requirements pex (for `./pants run`) for consumption. This example has simple C code to demonstrate the ability to call C extensions from a package built using this `python_distribution` target.
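For reference, a minimal sketch of the BUILD layout this enables (directory and target names hypothetical); the directory holding the BUILD file must also contain the setup.py and the sources it packages:

```python
# src/python/superhello/BUILD
python_distribution(
  name='superhello',
)

python_binary(
  name='main',
  source='main.py',
  dependencies=[
    ':superhello',
  ],
)
```

`./pants binary src/python/superhello:main` would then build the wheel from the setup.py and bundle it into the resulting `main.pex`.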