Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for relative links when multiple symbols in the hierarchy have the same name #578

Merged

Conversation

d-ronnqvist
Copy link
Contributor

@d-ronnqvist d-ronnqvist commented May 5, 2023

Bug/issue #, if applicable:

Summary

This implements much better support for relative symbol links when multiple symbols in the hierarchy have the same name (or have the same name as the module). This also unblocks—and reenables—support for omitting the module name in documentation extensions.

This change is smaller than the diff makes it appear. 867 of the additions are tests and test data and much of the implementation changes are moved or updated code (153 added, 143 removed).

Details

To describe these changes in detail it's easiest to look at an example. If we consider this Swift code:

/* in a module named "Something" */

public struct Something {
    /// Example 1: ``Something/second``
    /// Example 2: ``Something/fourth``
    public enum Something {
        case first
    }
    /// Example 3: ``Something/Something/Something``
    public var second = 0
}

/// Example 4: ``Something/second``
public struct Wrapper {
    public struct Something {
        /// Example 5: ``Something/Something/second``
        public var third = 0
    }
}

Considering example 1 in the code above

The link resolution starts at the scope of the enum (since the link is in the enum's documentation comment).

Before these changes, the link would match the "Something" with the enum and look for "second" among the enum's members. This would result in an error that "second" doesn't exist at "Something/Something" (the enum).

With these changes, the link resolution would start out the same way as before but when it fails to find "second" within the the enum it would remember that error and walk up the hierarchy to find a new starting point for the search. Since the struct also matches "Something", the implementation would proceed to look for "second" among the struct's members. This succeeds, so the implementation discards the previous error and returns the found "second" struct property.

Considering example 2 in the code above

The link resolution starts at the scope of the enum (since the link is in the enum's documentation comment).

Before these changes, just like in example 1, the link would match the "Something" with the enum and look for "fourth" among the enum's members. This would result in an error that "fourth" doesn't exist at "Something/Something" (the enum).

With these changes, like in example 1, the link resolution would start out the same way as before but when it fails to find "fourth" within the the enum it would remember that error and walk up the hierarchy to find a new starting point for the search. Since the struct also matches "Something", the implementation would proceed to look for "fourth" among the struct's members. This also fails, so the implementation walks up to the module which also match "Something". When the implementation fails to find "fourth" all the way up the hierarchy it raises the first (inner most) error that it encountered. This is the same error that would be raised before these changes.

Considering example 3 in the code above

A bit more condensed. Link resolutions starts at the "second" property. It's not a good starting point for the search (neither it nor its members match the first path component) so the implementation checks the outer scope. The "Something" struct matches the first component and the "Something" enum matches the second component but the enum doesn't have a "Something" member. Before these changes, link resolution would stop here with this error.

With these changes, the implementation encounters the same error, remembers that error, and tries again at the module scope (the scope that contained the previous starting point). At this scope the first "Something" match the module, the second "Something" match the struct, and the final "Something" match the enum. Since all path components are found the implementation discards the previous error and returns the found "Something" enum.

Considering example 4 in the code above

A bit more condensed. Link resolutions starts at the Wrapper. Wrapper has a "Something" member but that doesn't have a "second" member. Before these changes, link resolution would stop here with this error.

With these changes, the implementation encounters the same error, remembers that error, and tries the link at the module's scope. The module has a "Something" member (the struct) which has a "second" member (the property) so the property is returned.

Considering example 5 in the code above

A bit more condensed. Link resolutions starts at the "third" property. Similar to example 3, the property isn't a good starting point so the implementation checks the outer scope (the "Something" struct within the "Wrapper"). This matches the first component so the implementation tries—and fails—the second "Something" as an member of that inner struct. Before these changes, link resolution would stop here with this error.

With these changes the implementation encounters the same error, remembers that error, and tries the link at the Wrapper scope (the scope that contained the previous starting point). The wrapper isn't a good starting point so the implementation tries its outer scope (the module). The module match the first "Something" component, the "Something" top-level struct match the second path component, and the "something". property match the last path component. Since all path components are found the implementation discards the previous error and returns the found "Something" enum.

Performance

In my testing across different OS versions and architectures I haven't found any measurable performance difference from these changes. The retry logic is lazy and only happens when a link previously couldn't be resolved. Traversing the path hierarchy is pretty fast and the implementation peeks one level down before descending to avoid unnecessary traversal and error handling.

In practice symbol hierarchies aren't very deep (almost always single digit) and name collisions in one path branch are uncommon so multiple retries for a single link are expected to be very uncommon.

Previous regressions

The first PR that enabled module-relative documentation extension links needed to be reverted (#574) because it introduced a regression.

This root cause for that regression was that documentation links were resolved relative to the module scope with a special case for matching a single path component against the module. This meant that the two links in this documentation extension resolved to the same symbol (the module).

# ``Something``

## Topics

- ``Something``

I added a new test to verify that with these new changes the documentation extension match the module but the curation math the top-level symbol.

I also manually tested the project where this regression was first encountered.

Dependencies

None

Testing

See the Details section for examples of links that used to fail to resolve but work with these changes.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

@d-ronnqvist d-ronnqvist marked this pull request as ready for review May 5, 2023 17:49
@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist
Copy link
Contributor Author

@swift-ci please test

@d-ronnqvist d-ronnqvist added the link resolution About DocC's link resolution system label May 5, 2023
Copy link
Contributor

@daniel-grumberg daniel-grumberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation, it made it much easier to review these changes.

@d-ronnqvist d-ronnqvist merged commit dd9c44c into swiftlang:main May 5, 2023
d-ronnqvist added a commit to d-ronnqvist/swift-docc that referenced this pull request May 5, 2023
…chy have the same name (swiftlang#578)

* Walk up the path hierarchy if links fail to resolve in a specific scope

rdar://108672152

Also, check the module's scope if the link couldn't otherwise resolve

rdar://76252171

* Fix test linking to heading that doesn't exist

* Update expression that was very slow to type check

* Fix warning about mutating a captured sendable value

* Remove outdated comment about adding more test assertions

* Update test for old link resolver implementation
d-ronnqvist added a commit that referenced this pull request May 9, 2023
…chy have the same name (#578) (#580)

* Walk up the path hierarchy if links fail to resolve in a specific scope

rdar://108672152

Also, check the module's scope if the link couldn't otherwise resolve

rdar://76252171

* Fix test linking to heading that doesn't exist

* Update expression that was very slow to type check

* Fix warning about mutating a captured sendable value

* Remove outdated comment about adding more test assertions

* Update test for old link resolver implementation
@d-ronnqvist d-ronnqvist deleted the retry-links-up-the-path-hierarchy branch May 9, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
link resolution About DocC's link resolution system
Projects
None yet
2 participants