Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TSCUtility] Correct semantic version parsing and comparison #214

Conversation

WowbaggersLiquidLunch
Copy link
Contributor

@WowbaggersLiquidLunch WowbaggersLiquidLunch commented May 13, 2021

This PR is similar to swiftlang/swift-package-manager#3486, and likely conflicts with #212.

parsing

The semantic versioning specification 2.0.0 states that pre-release identifiers must be positioned after the version core, and build metadata identifiers after pre-release identifiers. It also states that both pre-release and build metadata identifiers can contain "-" (hyphens), while at the same time "-" is used to indicate where pre-release identifiers begin.

In the old (currently shipped) implementation, if a version core was appended with build metadata identifiers that contain "-", the first "-" would be mistaken as an indication of pre-release identifiers thereafter. Then, the position of the first "-" would be treated as where the version core ends, resulting in a false negative after it was found that the version core (plus a part of the build metadata identifiers) contained non-numeric characters.

For example: the semantic version 1.2.3+some-meta.data is a well-formed, with 1.2.3 being the version core and some-meta.data the build metadata identifiers. However, the old implementation of Version.init?(_ versionString: String) would incorrectly treat 1.2.3+some as the version core and meta.data the pre-release identifiers.

The new implementation fixes this problem by restricting the search area for "-" to the substring before the first "+".

The initialiser wherein the parsing takes place has been renamed from init?(string: String) to init?(_ versionString: String) which calls init(versionString: String) throws. The old initialiser is not removed but marked as deprecated for source compatibility with SwiftPM. With the new initialiser name, Version now conforms to LosslessStringConvertible.

In addition, the logic for breaking up the version core into numeric identifiers has been rewritten to be more understandable.

comparison

Version already conforms to Comparable, but Comparable does not provide a default implementation for ==, so the compiler synthesises one composed of member-wise comparisons. This leads to a false false when 2 semantic versions differ by only their build metadata identifiers, contradicting SemVer 2.0.0's comparison rules.

This PR adds an implementation of == to Version that returns true iff one version is neither greater nor less than the other. One consequence, though, is that now two versions that differ by only their build metadata identifiers are not allowed in the same set, and one assertion in the tests is inverted accordingly.

Also, because Version declares conformance to Hashable, this PR adds a custom hash(into:) that aligns with the custom ==.

Comment on lines 47 to 280
XCTAssertNotEqual(Set([Version(1,2,3)]), Set([Version(1,2,3, buildMetadataIdentifiers: ["1011"])]))
XCTAssertEqual(Set([Version(1,2,3)]), Set([Version(1,2,3, buildMetadataIdentifiers: ["1011"])]))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that two version that differ only by their build metadata identifiers are considered equal, they can not exist in the same set, or as keys in the same dictionary, etc. Would this be considered a correct behaviour?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether it would or not — could this cause problems for existing packages? I suppose in that case there would already have been an error about there being two versions of a single package in the same graph, when according to semver rules those should always have been treated the same. So if I understand correctly this would be safe because it's more permissive than it would have been in the past?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether that really answers anything — I'm trying to understand whether there could be anything here that causes existing packages to break.

@WowbaggersLiquidLunch
Copy link
Contributor Author

When given an invalid version string, most Version initialisers crash instead of creating a dummy version 0.0.0 like what they do in SwiftPM. How do I write tests to check that invalid strings do result in fatalerror without actually crashing the tests?

@tomerd
Copy link
Contributor

tomerd commented May 13, 2021

When given an invalid version string, most Version initialisers crash instead of creating a dummy version 0.0.0 like what they do in SwiftPM. How do I write tests to check that invalid strings do result in fatalerror without actually crashing the tests?

the way to do this with Swift would be a throwing or failable initializer, fatalError is designed for more severe assertion like situation when the program must stop running due to risk of memory corruption etc

@WowbaggersLiquidLunch
Copy link
Contributor Author

the way to do this with Swift would be a throwing or failable initializer, fatalError is designed for more severe assertion like situation when the program must stop running due to risk of memory corruption etc

I agree that fatalError doesn't seem right in Version's initializers. Instead of changing them to throwing or failable, which is going to affect their call sites in SwiftPM, what do you think of having them creating 0.0.0 dummy versions, like SwiftPM's Version initialisers do?

Also, should I change them as part of this PR, or in a follow-up PR?

@tomerd
Copy link
Contributor

tomerd commented May 13, 2021

what do you think of having them creating 0.0.0 dummy versions, like SwiftPM's Version initialisers do?

IMO returning dummy versions is not ideal, and a throwable or optional feels safer. that said, I dont have the history on why SwiftPM does that. @abertelrud @neonichu wdyt?

@tomerd
Copy link
Contributor

tomerd commented May 13, 2021

Also, should I change them as part of this PR, or in a follow-up PR?

personally, I always prefer smaller PRs where possible

@tomerd tomerd assigned neonichu and unassigned abertelrud May 13, 2021
@abertelrud
Copy link
Contributor

what do you think of having them creating 0.0.0 dummy versions, like SwiftPM's Version initialisers do?

IMO returning dummy versions is not ideal, and a throwable or optional feels safer. that said, I dont have the history on why SwiftPM does that. @abertelrud @neonichu wdyt?

Creating dummy versions (or any kind of value, really) on error is a surprise to me, and seems wrong. So if SwiftPM is doing that it should probably change (in a separate PR to keep each one as focused as possible), instead throwing an error if there is a problem. The caller can then always decide to substitute a default value after reporting the error, if that's appropriate.

@WowbaggersLiquidLunch
Copy link
Contributor Author

Sorry for the late response.

So if SwiftPM is doing that it should probably change (in a separate PR to keep each one as focused as possible), instead throwing an error if there is a problem.

Would a failable initializer be better than a throwing one? I don't remember where I read it (I can't find it in any documentation or the API design guidelines), but I remember reading that failable initializers should be used where the reason of the failure is clear. For example, init?(string:). And throwing initializers should be used where there are many reasons why it can fail.

@neonichu
Copy link
Contributor

@swift-ci please test

@WowbaggersLiquidLunch
Copy link
Contributor Author

Regarding the fatalError in one of TSC's Version.init and dummy version 0.0.0 in one of SwiftPM's, I just checked them again with fresh eyes.

Both of them happen in init(stringLiteral value: String), which is a requirement of ExpressibleByStringLiteral. Because the requirement is non-throwing and now-failable, neither throwing nor failable initializers can satisfy it. This leaves fatalError and dummy version as the only solutions.

I don't know if it was even correct to make Version be ExpressibleByStringLiteral to begin with, since it only works with specifically formatted strings. However, the correctness probably doesn't matter much now, because it doesn't seem possible to either remove or replace the conformance, because doing so is a big source break.

I also don't know which is the better choice between fatalError and dummy version. On one hand, it's (probably) better to crash than to create invalid output from invalid input; on the other hand, SwiftPM (probably), like the compiler, shouldn't crash even when the input is invalid. However, regardless of which one is better, I think TSC and SwiftPM should be consistent and use the same one.

@neonichu
Copy link
Contributor

neonichu commented Jul 8, 2021

I don't know if it was even correct to make Version be ExpressibleByStringLiteral to begin with, since it only works with specifically formatted strings.

Not entirely sure, but I am assuming this was done to be able to use literal strings for versions in a package manifest which does seem desirable. That said, it should be possible to express that in a different way today, e.g. possibly an enum with a case for an invalid version?

@neonichu
Copy link
Contributor

@WowbaggersLiquidLunch Apart from the question of how this API should potentially look like, I think for the purpose of this PR we can leave it as-is and you could write all your tests around the fallible initializer that also exists. What do you think?

@WowbaggersLiquidLunch
Copy link
Contributor Author

Sorry for the late response!

Not entirely sure, but I am assuming this was done to be able to use literal strings for versions in a package manifest which does seem desirable.

Yes I think this is exactly why SwiftPM's Version is ExpressibleByStringLiteral. However, I don't think TSC's Version is used in the manifest.

That said, it should be possible to express that in a different way today, e.g. possibly an enum with a case for an invalid version?

I haven't thought of it. It does seem like a possible solution though.

Apart from the question of how this API should potentially look like, I think for the purpose of this PR we can leave it as-is

I agree. What this API should look like should be its own PR. Would it be better if I move this discussion to the forums?

you could write all your tests around the fallible initializer that also exists.

This prompted me to check the tests again, and I think init?(_) is missing some tests for invalid inputs. I'm going to add those tests.

The semantic versioning specification 2.0.0 [states](https://semver.org/#spec-item-9) that pre-release identifiers must be positioned after the version core, and build metadata identifiers after pre-release identifiers.

In the old implementation, if a version core was appended with metadata identifiers that contain hyphens ("-"), the first hyphen would be mistaken as an indication of pre-release identifiers thereafter. Then, the position of the first hyphen would be treated as where the version core ends, resulting in a false negative after it was found that the "version core" contained non-numeric characters.

For example: the semantic version `1.2.3+some-meta.data` is a well-formed, with `1.2.3` being the version core and `some-meta.data` the metadata identifiers. However, the old implementation of `Version.init?(_ versionString: String)` would falsely treat `1.2.3+some` as the version core and `meta.data` the pre-release identifiers.

The new implementation fixes this problem by restricting the search area for "-" to the substring before the first "+".

The initialiser wherein the parsing takes place has been renamed from `init?(string: String)` to `init?(_ versionString: String)`. The old initialiser is not removed but marked as deprecated for source compatibility with SwiftPM. With the new initialiser name, `Version` now conforms to `LosslessStringConvertible`.

In addition, the logic for breaking up the version core into numeric identifiers has been rewritten to be more understandable.
`Comparable` does not provide a default implementation for `==`, so the compiler synthesises one composed of [member-wise comparisons](https://github.com/apple/swift-evolution/blob/main/proposals/0185-synthesize-equatable-hashable.md#implementation-details). This leads to a false `false` when 2 semantic versions differ by only their build metadata identifiers, contradicting to SemVer 2.0.0's [comparison rules](https://semver.org/#spec-item-10).

This commit adds a manual implementation of `==` for `Version`, along with appropriate tests. One consequence, though, is that now two versions that differ by only their build metadata identifiers are not allowed in the same set.
Because we have a non-synthesised `Equatable` conformance, the synthesised `Hashable` conformance composed of member-wise hashes is incorrect. `buildMetadataIdentifiers` does not participate in `Version`'s `Equatable` conformance, so it shouldn't participate in `Version`'s `Hashable` conformance either.

Relevant: [SR-11588](https://bugs.swift.org/browse/SR-11588)
@WowbaggersLiquidLunch WowbaggersLiquidLunch force-pushed the correct-semantic-version-parsing-and-comparison branch from a95194d to 4c4689c Compare July 29, 2021 18:04
@abertelrud
Copy link
Contributor

@neonichu Do you think this can be merged at this point, given that it's blocking #212?

@WowbaggersLiquidLunch WowbaggersLiquidLunch force-pushed the correct-semantic-version-parsing-and-comparison branch from 4bfd1ce to 9feba0d Compare August 2, 2021 21:17
@abertelrud
Copy link
Contributor

@swift-ci please test

This new initialiser throws a `VersionError` instance when initialisation fails. This gives the user more information and control over error handling. `Version`'s conformance to `LosslessStringConvertible` is preserved by having `init?(_ versionString: String)` call this new initialiser, and return `nil` when an error is thrown.
@WowbaggersLiquidLunch WowbaggersLiquidLunch force-pushed the correct-semantic-version-parsing-and-comparison branch from 9feba0d to 7336c0c Compare August 3, 2021 21:10
@WowbaggersLiquidLunch
Copy link
Contributor Author

Sorry for the new force-push. Just fixed a few more typos and squashed it with one of the commits. No more changes to this PR unless requested.

@tomerd
Copy link
Contributor

tomerd commented Aug 3, 2021

@swift-ci please test

@tomerd tomerd merged commit df0b2ea into swiftlang:main Aug 4, 2021
@tomerd
Copy link
Contributor

tomerd commented Aug 4, 2021

thank you @WowbaggersLiquidLunch!

WowbaggersLiquidLunch added a commit to WowbaggersLiquidLunch/swift that referenced this pull request Sep 20, 2021
Currently `Comparable` inherits from `Equatable`, but does not provide a default implementation for `==`, so the compiler synthesizes one composed of [member-wise `==`s](https://github.com/apple/swift-evolution/blob/main/proposals/0185-synthesize-equatable-hashable.md#implementation-details). This leads to a problem where if a type's `<` is not composed of member-wise inequalities, then `<`, `>`, and `==` can all evaluate to `false` for some pairs of values, contradicting `Comparable`'s documentation:

> Types with Comparable conformance implement the less-than operator (`<`) and the equal-to operator (`==`). These two operations impose a strict total order on the values of a type, in which exactly one of the following must be true for any two values `a` and `b`:
> * `a == b`
> * `a < b`
> * `b < a`

For example:

```swift
struct Length: Comparable {
    enum Unit: Double, Comparable {
        case mm = 0.001
        case m = 1
        case banana = 0.178
    }

    let magnitude: Double
    let unit: Unit

    static func < (lhs: Self, rhs: Self) -> Bool {
        lhs.magnitude * lhs.unit.rawValue < rhs.magnitude * rhs.unit.rawValue
    }
}

let aBanana = Length(magnitude: 1, unit: .banana)
let oneBanana = Length(magnitude: 0.178, unit: .m)

print(aBanana < oneBanana)  // prints "false", because Length's < says so.
print(aBanana > oneBanana)  // prints "false", because Comparable's default implementation of >(a,b) is <(b,a).
print(aBanana == oneBanana) // prints "false", because the 2 Length instances are not member-wise equal.
```

Relevant forums discussion: https://forums.swift.org/t/add-default-implementation-of-to-comparable/48832

This bug has previously resulted in incorrect semantic version comparison in SwiftPM (swiftlang/swift-package-manager#3486 and swiftlang/swift-tools-support-core#214)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants