Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporate BUILD-file deletion defenses into --changed-include-dependees #7581

Closed
stuhood opened this issue Apr 16, 2019 · 2 comments
Closed

Comments

@stuhood
Copy link
Member

stuhood commented Apr 16, 2019

When using --changed-include-dependees, it's important to locate targets affected by deleted/modified BUILD files.

In Twitter's CI (equivalent to), there are a few defenses layered on top of --changed-include-dependees in order to detect cases that are currently missed there. We should incorporate those defenses into the implementation of --changed-include-dependees such that the effect of change detection is that we definitely end up parsing and validating targets that might have been transitively affected by a file removal.

A sketch of the relevant snippet:

# If any build files have been deleted ('D'), list ::
if git diff --diff-filter=D --name-only "${SQ_BASE_BRANCH}"...HEAD | grep -E "${BUILD_PATH_REGEX}"; then
  # list ::
  ..
else
  # otherwise, inspect added/changed/modified/renamed BUILD files
  BUILD_FILES="$(git diff --diff-filter=ACMR --name-only "${SQ_BASE_BRANCH}"...HEAD | grep -E "${BUILD_PATH_REGEX}")"
  for BUILD_FILE in $BUILD_FILES
  do
    # This function has a gap when a target is removed but the BUILD file is
    # not deleted, and another target depends on the deleted target. Since a
    # common case of this has been users commenting out a whole BUILD file,
    # check for that and treat it as if the file was deleted.
    if ! grep -v "^\s*#" "${BUILD_FILE}" | grep -v "^\s*$"; then
      # list ::
      ..
      break
    fi
  done
  # if we're not doing a :: anyway, then validate all touched BUILD files
  # (or really, their directories, since that's all the flexibility we have)
  # list `$(echo "${BUILD_FILES}" | sed -r "s#${BUILD_PATH_REGEX}#:#")`
  ..
fi
@stuhood
Copy link
Member Author

stuhood commented Apr 16, 2019

It occurs to me that the current implementation of

@rule(BuildFileAddresses, [BuildConfiguration, AddressMapper, OwnersRequest])
def find_owners(build_configuration, address_mapper, owners_request):
sources_set = OrderedSet(owners_request.sources)
dirs_set = OrderedSet(dirname(source) for source in sources_set)
# Walk up the buildroot looking for targets that would conceivably claim changed sources.
candidate_specs = tuple(AscendantAddresses(directory=d) for d in dirs_set)
candidate_targets = yield Get(HydratedTargets, Specs(candidate_specs))
# Match the source globs against the expanded candidate targets.
def owns_any_source(legacy_target):
"""Given a `HydratedTarget` instance, check if it owns the given source file."""
target_kwargs = legacy_target.adaptor.kwargs()
# Handle `sources`-declaring targets.
# NB: Deleted files can only be matched against the 'filespec' (ie, `PathGlobs`) for a target,
# so we don't actually call `fileset.matches` here.
# TODO: This matching logic should be implemented using the rust `fs` crate for two reasons:
# 1) having two implementations isn't great
# 2) we're expanding sources via HydratedTarget, but it isn't necessary to do that to match
target_sources = target_kwargs.get('sources', None)
if target_sources and any_matches_filespec(sources_set, target_sources.filespec):
return True
return False
direct_owners = tuple(ht.adaptor.address
for ht in candidate_targets
if LegacyAddressMapper.any_is_declaring_file(ht.adaptor.address, sources_set) or
owns_any_source(ht))
# If the OwnersRequest does not require dependees, then we're done.
if owners_request.include_dependees == 'none':
yield BuildFileAddresses(direct_owners)
else:
# Otherwise: find dependees.
all_addresses = yield Get(BuildFileAddresses, Specs((DescendantAddresses(''),)))
all_structs = yield [Get(HydratedStruct, Address, a.to_address()) for a in all_addresses]
all_structs = [s.value for s in all_structs]
bfa = build_configuration.registered_aliases()
graph = _DependentGraph.from_iterable(target_types_from_build_file_aliases(bfa),
address_mapper,
all_structs)
if owners_request.include_dependees == 'direct':
yield BuildFileAddresses(tuple(graph.dependents_of_addresses(direct_owners)))
else:
assert owners_request.include_dependees == 'transitive'
yield BuildFileAddresses(tuple(graph.transitive_dependents_of_addresses(direct_owners)))
might already address this, since it unconditionally constructs the dependee graph...?

Worth experimenting with / encoding in some tests.

@stuhood
Copy link
Member Author

stuhood commented Apr 17, 2019

Ok, false alarm.

It looks like all of these cases are covered by --changed-include-dependees due to it unconditionally constructing the dependee graph. And they are additionally already covered in tests/python/pants_test/engine/legacy/test_changed_integration.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant