Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move testinfra code from tests/python/pants_tests to src/python/pants/testutil #8400

Merged
merged 14 commits into from
Nov 6, 2019

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Oct 5, 2019

Problem

We are in the process of moving our tests to live within the src directory, rather than tests (#7489). We should consequently have our core utility code live in src.

Further, the current utils are spread out across several folders, which makes discovery more difficult.

Solution

Create a new pants/testutil namespace with all of the files used by the pants.testinfra wheel. Specifically, it has this directory structure:

src/python/pants/testutil
├── BUILD
├── __init__.py
├── base
│   ├── BUILD
│   ├── __init__.py
│   └── context_utils.py
├── console_rule_test_base.py
├── engine
│   ├── BUILD
│   ├── __init__.py
│   ├── base_engine_test.py
│   └── util.py
├── file_test_util.py
├── git_util.py
├── interpreter_selection_utils.py
├── jvm
│   ├── BUILD
│   ├── __init__.py
│   ├── jar_task_test_base.py
│   ├── jvm_task_test_base.py
│   ├── jvm_tool_task_test_base.py
│   └── nailgun_task_test_base.py
├── mock_logger.py
├── option
│   ├── BUILD
│   ├── __init__.py
│   └── fakes.py
├── pants_run_integration_test.py
├── pexrc_util.py
├── process_test_util.py
├── subsystem
│   ├── BUILD
│   ├── __init__.py
│   └── util.py
├── task_test_base.py
└── test_base.py

We can't yet delete the equivalent files in pants_test due to a public API, so we instead have those files simply re-export the implementation in pants/testutils.

Does not move some util code

There are several util files not exposed to the pantsbuild.testinfra wheel, like engine/scheduler_test_base.py. To reduce the scope of this PR, we keep those there for now. Presumably, they will be able to be moved into pants/testinfra without a deprecation cycle.

Creates new wheel pantsbuild.pants.testutil

We had to create a new package, rather than using pantsbuild.pants.testinfra, due to rules about ownership of files (#8400 (comment)).

In a followup, we will deprecate pantsbuild.pants.testinfra.

Result

We can now either use pants.testutil or the conventional pants_test imports. Both have the same functionality.

In a followup PR, we will deprecate the pants_test version.

Copy link
Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!

@stuhood
Copy link
Member

stuhood commented Oct 6, 2019

We view src as the root of our code; every other source root, like contrib and tests, depend on src.

This is not true in general, although it might be true for the files targets and possibly for integration test pex fingerprinting. I mentioned a solution to the former thing in the slack thread about this, and the latter thing could be resolved by swapping to using ./pants filedeps on the target that we use to build the pex.

It would be good to make sure that we're not doing this only for the above reason. In general: yea, I agree, test helper code is still a library, and so that could be a good reason to do this.

@Eric-Arellano
Copy link
Contributor Author

It would be good to make sure that we're not doing this only for the above reason. In general: yea, I agree, test helper code is still a library, and so that could be a good reason to do this.

I agree! I realized that regardless of trying to chroot those Python ITs, I think this is a good change to make. If we're moving in the direction of all tests living in src, it will help to have the util code for it be in src.

Another big benefit of this new scheme is that it centralizes all util code into one root folder. It makes discovery much easier of all the different base classes we can use, for example.

I mentioned a solution to the former thing in the slack thread about this,

Haven't looked closely at this yet because I'm not sure yet it's necessary. If this approach falls through, will look more closely.

But, yes, to reiterate, regardless of chrooting ITs, I think this is a good change to make.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Eric-Arellano
Copy link
Contributor Author

I can't figure out how to land this due to the wheel building shards. /pants setup-py --run="bdist_wheel --python-tag py36.py37" src/python/pants/testinfra:test_infra and it fails too when the library is kept at pants_test via /pants setup-py --run="bdist_wheel --python-tag py36.py37" tests/python/pants_test:test_infra.

The issue looks like:

FAILURE: No exported target owner found for PythonLibrary(BuildFileAddress(src/python/pants/testinfra/BUILD, int-test-for-export))

The validation in setup_py.py ensures that every target in the dependency has an "owner" that ultimately resolves to :test_infra.

def reduced_dependencies(self, exported_target):
"""Calculates the reduced transitive dependencies for an exported target.
The reduced set of dependencies will be just those transitive dependencies "owned" by
the `exported_target`.
A target is considered "owned" if:
1. It's "3rdparty" and "directly reachable" from `exported_target` by at least 1 path.
2. It's not "3rdparty" and not "directly reachable" by any of `exported_target`'s "3rdparty"
dependencies.
Here "3rdparty" refers to targets identified as either `is_third_party` or `is_exported`.
And in this context "directly reachable" means the target can be reached by following a series
of dependency links from the `exported_target`, never crossing another exported target and
staying within the `exported_target` address space. It's the latter restriction that allows for
unambiguous ownership of exportable targets and mirrors the BUILD file convention of targets
only being able to own sources in their filesystem subtree. The single ambiguous case that can
arise is when there is more than one exported target in the same BUILD file family that can
"directly reach" a target in its address space.
:raises: `UnExportedError` if the given `exported_target` is not, in-fact, exported.
:raises: `NoOwnerError` if a transitive dependency is found with no proper owning exported
target.
:raises: `AmbiguousOwnerError` if there is more than one viable exported owner target for a
given transitive dependency.
"""
# The strategy adopted requires 3 passes:
# 1.) Walk the exported target to collect provisional owned exportable targets, but _not_
# 3rdparty since these may be introduced by exported subgraphs we discover in later steps!
# 2.) Determine the owner of each target collected in 1 by walking the ancestor chain to find
# the closest exported target. The ancestor chain is just all targets whose spec path is
# a prefix of the descendant. In other words, all targets in descendant's BUILD file family
# (its siblings), all targets in its parent directory BUILD file family, and so on.
# 3.) Finally walk the exported target once more, replacing each visited dependency with its
# owner.
if not self.is_exported(exported_target):
raise self.UnExportedError('Cannot calculate reduced dependencies for a non-exported '
'target, given: {}'.format(exported_target))
owner_by_owned_python_target = OrderedDict()
# Only check ownership on the original target graph.
original_exported_target = exported_target.derived_from
def collect_potentially_owned_python_targets(current):
if current.is_original:
owner_by_owned_python_target[current] = None # We can't know the owner in the 1st pass.
return (current == exported_target) or not self.is_exported(current)
self._walk(original_exported_target, collect_potentially_owned_python_targets)
for owned in owner_by_owned_python_target:
if self.requires_export(owned) and not self.is_exported(owned):
potential_owners = set()
for potential_owner in self._ancestor_iterator.iter_target_siblings_and_ancestors(owned):
if self.is_exported(potential_owner) and owned in self._closure(potential_owner):
potential_owners.add(potential_owner)
if not potential_owners:
raise self.NoOwnerError('No exported target owner found for {}'.format(owned))

The key function for finding potential owners looks at ancestor and sibling targets:

def iter_target_siblings_and_ancestors(self, target):
"""Produces an iterator over a target's siblings and ancestor lineage.
:returns: A target iterator yielding the target and its siblings and then it ancestors from
nearest to furthest removed.
"""
def iter_targets_in_spec_path(spec_path):
try:
siblings = SiblingAddresses(spec_path)
for address in self._build_graph.inject_specs_closure([siblings]):
yield self._build_graph.get_target(address)
except AddressLookupError:
# A spec path may not have any addresses registered under it and that's ok.
# For example:
# a:a
# a/b/c:c
#
# Here a/b contains no addresses.
pass
def iter_siblings_and_ancestors(spec_path):
for sibling in iter_targets_in_spec_path(spec_path):
yield sibling
parent_spec_path = os.path.dirname(spec_path)
if parent_spec_path != spec_path:
for parent in iter_siblings_and_ancestors(parent_spec_path):
yield parent
for target in iter_siblings_and_ancestors(target.address.spec_path):
yield target

Reading this code, especially line 149 parent_spec_path = os.path.dirname(spec_path), I don't think tests/python/pants_test:int-test-for-export could ever end up with src/python/pants/testinfra:test_infra as a sibling/ancestor target because they have different source roots. So, I don't think it's possible to have the top level targets like int-test-for-target both in src and tests?

@jsirois
Copy link
Contributor

jsirois commented Oct 13, 2019

So, I don't think it's possible to have the top level targets like int-test-for-target both in src and tests?

Correct.

I read quickly but I think the fundamental issue is maintaining backward compat. As such, just introduce 2 distributions. The new distribution exports src/python/pants/testinfra, the old just exports the re-exports.

@@ -13,36 +12,18 @@ python_library(
'src/python/pants/testinfra/engine:engine_test_base',
'src/python/pants/testinfra/subsystem',
],
# TODO(???): Once we remove `pants_test:testinfra`, reclaim the wheel name
Copy link
Contributor

@jsirois jsirois Oct 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't. In any language an atfiact name really needs to map to its contents and not lose the mapping. It allows version resolution to work. If you provide the same contents under multiple distribution names, the resolver is unaware and can place old==1.0.0 and new==1.2.3 in the same claspath/pythonpath, etc and now you have a 1st in path silent wins rest silently fail situation. This is a very real problem in jvm land, it would be great not to do this ourselves.

provides=pants_setup_py(
name='pantsbuild.pants.testinfra',
name='pantsbuild.pants.testinfra.new_namespace',
Copy link
Contributor

@jsirois jsirois Oct 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider just 'pantsbuild.pants.tests' or '''pantsbuild.pants.testsupport' - not as good names but probably good enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like testsupport or testutil (tests sounds like it contains actual tests).

@Eric-Arellano Eric-Arellano changed the title Move testinfra code from tests/python/pants_tests to src/python/pants/testinfra Move testinfra code from tests/python/pants_tests to src/python/pants/testutils Nov 6, 2019
@Eric-Arellano
Copy link
Contributor Author

Ready for re-review. I decided not to change setup_py.py to allow a package to own two top level modules (#8400 (comment)). The change was too complex to warrant the benefit, imo.

Instead, we simply use a new wheel pantsbuild.pants.testutils. (@benjyw note the extra s compared to your proposed testutil. I can change this if you would like. I only slightly prefer the s.) In a followup PR, we'll deprecate pantsbuild.pants.testinfra, including modifying its notes that go to PyPI to say that it's deprecated.

Copy link
Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I prefer util, singular, and that's a more common usage in our codebase:

     386
statler:[~/src/pants][master]$ git grep "util" | wc -l
    3213```

build_graph=build_graph, build_configuration=build_configuration,
address_mapper=address_mapper, console_outstream=console_outstream,
workspace=workspace, scheduler=scheduler)
from pants.testutils.base.context_utils import TestContext as TestContext # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the need for the apparently redundant aliasing? Afaict its not needed technically - a proof:

$ cat a/b/c/d.py 
from os import path
import sys


def func():
  return 42 
$ cat e/f/g.py 
from a.b.c.d import func, path, sys


print(f'{func()} {sys.executable} {path.basename(sys.executable)}')
$ python -c 'import e.f.g'
42 /usr/bin/python python

If it's for emphasis that these imports are just for re-exporting symbols in a new namespace a boilerplate comment on each file seems more savory (subjective), than a bunch of isort suppressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iirc, I believe we have a MyPy lint to require explicit exports in this style.

The noqa is to suppress pycodestyle saying that we don’t use the import. We would need those no matter what.

(On my phone cleaning old apartment - pardon me not confirming this all)

Copy link
Contributor

@jsirois jsirois Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha. OK, I'm happy to see this change, so not a huge deal, but a pretty explicit statement of all this then would be:

from a import b
...
from foo.bar import Baz


b = b
...
Baz = Baz

That seems pretty clearly to be importing for the purpose of re-export in the current __name__ space with no violations of any linter's sensibilities.

I'll underscore though that I'm happy enough with all this.

@Eric-Arellano Eric-Arellano changed the title Move testinfra code from tests/python/pants_tests to src/python/pants/testutils Move testinfra code from tests/python/pants_tests to src/python/pants/testutil Nov 6, 2019
@Eric-Arellano Eric-Arellano merged commit a0bcc23 into pantsbuild:master Nov 6, 2019
@Eric-Arellano Eric-Arellano deleted the move-testinfra branch November 6, 2019 21:54
Eric-Arellano added a commit that referenced this pull request Nov 8, 2019
…tutil (#8583)

Follows up on #8400. This deprecates the old modules and updates our internal usage to use the new `pants.testutil` namespace.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants