release regex 1.0 (May 1, 2018) #457
Comments
Previously, we had some inconsistencies in how we were handling ASCII word boundaries. In particular, the translator was accepting a negated ASCII word boundary even if the caller didn't disable the UTF-8 invariant. This is wrong, since a negated ASCII word boundary can match between any two arbitrary bytes. However, fixing this is a breaking change, so for now we document the bug. We plan to fix it with regex 1.0. See rust-lang#457. Additionally, we were incorrectly declaring that an ASCII word boundary matched invalid UTF-8 via the Hir::is_always_utf8 property. An ASCII word boundary must always match an ASCII byte on one side, which implies a valid UTF-8 position.
I'd advocate against this. Requiring a newer Rust version should be considered a breaking change. Discussion on this topic: rust-lang/api-guidelines#123
@jethrogb It's a compromise. The topic has been discussed to death. I frankly don't have the energy to continue discussing it.
My thoughts on the matter are here: rust-lang/api-guidelines#123 (comment)
Let me rephrase this, because I do want to discuss it if there are things that I've missed or haven't thought about deeply. In particular, I would appreciate it if further discussion built off of my thoughts here. If I'm missing something, then let's talk about it, but let's please try to avoid rehashing things.
My hope is that a community-wide decision on the matter can be reached (possibly as a part of the API guidelines initiative) before committing to a particular strategy just for this crate.
I don't see how that's going to happen. Did you read my comment that I linked? The regex crate will not be the first to adopt this policy.
@jethrogb in that case, you can just pin to the minor version, which still allows bugfix updates without upgrading your compiler.
@BurntSushi but it will be (I think) the first crate under the rust-lang umbrella to adopt that policy, no? @WiSaGaN Not if you have dependencies that silently update their dependency version requirements (which should be ok because they're semver compatible)! See the linked thread for additional discussion.
@jethrogb If the dependency crates' owners decides to upgrade to As long as |
@WiSaGaN Yes. The point of semver is to not have to meticulously read each dependency's "compatibility policy" because it's spelled out in semver.
This conversation is frustrating. Semver does not spell out anything other than a high level interpretation of version numbers. There is already broad precedent in the Rust ecosystem that certain types of technically backwards incompatible changes aren't actually backwards incompatible when reasoning about semver. Moreover, the assumption that bumping the minimum Rust version is even a semver incompatible change is contested, and acting like it isn't is not productive. I will stress this again: let's please table this discussion unless you have something new to add. @jethrogb Your distaste has been acknowledged. I don't find any of your arguments compelling because they don't bring anything new to the table. My plan at the moment is to continue as planned, and if that in practice causes problems in the ecosystem, then we can re-evaluate that policy.
@BurntSushi Sorry, it wasn't my intention to have this discussion in this thread all over again. I'm fine with having the discussion in rust-lang/api-guidelines#123 and I suggest @WiSaGaN express their arguments over there as well. My main gripe is with a crate under the rust-lang umbrella (this crate) unilaterally defining what the policy is before consensus is achieved/a decision is made in that issue.
OK, well, if you want to take that angle, then my proposal is far more conservative than what we currently do, at least for the nursery crates anyway. The nursery crates have bumped the minimum Rust version required to compile the crate in semver compatible releases quite a bit (some have even done so recently, albeit with good reason). The only official policy I'm aware of is that rust-lang crates must support at least stable minus 2 releases back, which is compatible with my proposal.
@WiSaGaN As @jethrogb suggested, please take this discussion to rust-lang/api-guidelines#123 In particular, please carefully read my comment on that thread, which points out problems with constraints like |
This also clarifies our policy on increasing the minimum Rust version required. In particular, we reserve the right to increase the minimum Rust version in minor version releases of regex, but never in patch releases. We will default to a reasonably conservative interpretation of this policy, and not bump the minimum required Rust version lightly. If this policy turns out to be too aggressive, then we may alter it in the future to state that the minimum Rust version is fixed for all of regex 1.y.z, and can only be bumped on major regex version releases. See: rust-lang#457
This commit disables octal syntax by default, which will permit us to produce useful error messages if a user tried to invoke a backreference. This commit adds a new `octal` method to RegexBuilder and RegexSetBuilder which permits callers to re-enable octal syntax. See rust-lang#457
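The ambiguity this commit resolves can be sketched without the regex crate. The following is a minimal, std-only illustration of how a numeric escape such as `\1` might be interpreted with octal syntax enabled or disabled. The function `parse_numeric_escape` and the type `EscapeError` are hypothetical names invented for this sketch; they are not the regex-syntax implementation. Only the `octal` option itself comes from the commit message above.

```rust
/// Hypothetical error type for this sketch (not part of the regex crate).
#[derive(Debug, PartialEq)]
enum EscapeError {
    /// Octal syntax is disabled, so `\N` is reported as an unsupported backreference.
    BackreferenceUnsupported,
    /// The digits do not form a valid escape in this sketch.
    Invalid,
}

/// Interprets the digits that follow a backslash (for example "1" for `\1`).
/// `octal` mirrors the opt-in described in the commit message.
fn parse_numeric_escape(digits: &str, octal: bool) -> Result<char, EscapeError> {
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(EscapeError::Invalid);
    }
    if !octal {
        // With octal disabled, a digit escape can get a targeted error message.
        return Err(EscapeError::BackreferenceUnsupported);
    }
    // With octal enabled, the digits are read as a base-8 code point.
    let value = u32::from_str_radix(digits, 8).map_err(|_| EscapeError::Invalid)?;
    char::from_u32(value).ok_or(EscapeError::Invalid)
}

fn main() {
    // Octal enabled: `\1` denotes U+0001, the old behavior.
    println!("{:?}", parse_numeric_escape("1", true));
    // Octal disabled (the proposed default): `\1` is rejected with a clear error.
    println!("{:?}", parse_numeric_escape("1", false));
}
```

Treating every digit escape as a backreference error when octal is off is a simplification chosen for this sketch; the real parser may distinguish more cases.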
The issue with the ASCII version of \B is that it can match between code units of UTF-8, which means it can cause match indices reported to be on invalid UTF-8 boundaries. Therefore, similar to things like `(?-u:\xFF)`, we ban negated ASCII word boundaries from Unicode regular expressions. Normal ASCII word boundaries remain accessible from Unicode regular expressions. See: rust-lang#457
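The claim in this commit message can be checked with the standard library alone. The sketch below assumes the usual definition of an ASCII word byte (`[0-9A-Za-z_]`) and lists the positions where an ASCII word boundary assertion, negated or not, would match inside a multi-byte UTF-8 sequence. The helper names are invented for illustration and are not taken from the regex crate.

```rust
/// ASCII word bytes: [0-9A-Za-z_].
fn is_ascii_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// True when `at` is an ASCII word boundary: exactly one neighbor is a word byte.
fn is_ascii_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_ascii_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_ascii_word_byte(haystack[at]);
    before != after
}

/// Positions where the assertion matches but which are not on a UTF-8
/// character boundary. `negated = true` models `(?-u:\B)`,
/// `negated = false` models `(?-u:\b)`.
fn matches_inside_char(text: &str, negated: bool) -> Vec<usize> {
    let bytes = text.as_bytes();
    (0..=text.len())
        .filter(|&at| is_ascii_word_boundary(bytes, at) != negated)
        .filter(|&at| !text.is_char_boundary(at))
        .collect()
}

fn main() {
    // "aéb" is encoded as the bytes 0x61, 0xC3, 0xA9, 0x62.
    let text = "a\u{e9}b";
    // The negated boundary matches between 0xC3 and 0xA9, splitting the character.
    println!("\\B inside a character: {:?}", matches_inside_char(text, true));
    // The plain boundary always has an ASCII byte on one side, so it never does.
    println!("\\b inside a character: {:?}", matches_inside_char(text, false));
}
```

The plain boundary is safe because an ASCII byte is always a complete one-byte character, so the position on either side of it is a valid UTF-8 boundary.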
This removes a public `From` impl that automatically converts errors from the regex-syntax crate to a regex::Error. This actually causes regex-syntax to be a public dependency of regex, which was an oversight. We now remove it, which completely breaks any source code coupling between regex and regex-syntax. See rust-lang#457
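To see why a public `From` impl creates this coupling, consider the following std-only sketch. The `syntax` module is a hypothetical stand-in for the regex-syntax crate, and `Error`, `from_syntax`, and `compile` are illustrative names rather than the regex crate's actual internals. The idea is to convert the dependency's error through a private function, so the dependency's type never appears in a public signature or trait impl.

```rust
/// Hypothetical stand-in for the `regex-syntax` crate.
mod syntax {
    #[derive(Debug)]
    pub struct Error {
        pub message: String,
    }

    /// Toy parser: rejects patterns that end in a dangling backslash.
    pub fn parse(pattern: &str) -> Result<(), Error> {
        if pattern.ends_with('\\') {
            Err(Error { message: "incomplete escape sequence".to_string() })
        } else {
            Ok(())
        }
    }
}

/// Stand-in for the public error type of the outer crate.
#[derive(Debug, PartialEq)]
pub enum Error {
    Syntax(String),
}

impl Error {
    /// Private conversion. Unlike `impl From<syntax::Error> for Error`,
    /// this does not expose `syntax::Error` in the public API, so the
    /// inner crate can change its error type without breaking callers.
    fn from_syntax(err: syntax::Error) -> Error {
        Error::Syntax(err.message)
    }
}

/// Public entry point: callers only ever see the outer `Error`.
pub fn compile(pattern: &str) -> Result<(), Error> {
    syntax::parse(pattern).map_err(Error::from_syntax)
}

fn main() {
    println!("{:?}", compile("a+"));
    println!("{:?}", compile("a\\"));
}
```

Flattening the inner error to a `String` is one possible design; it trades structured error details for a public API that is fully decoupled from the dependency.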
This commit adds a new 'std' feature and enables it by default. This permits us to one day add support for building regex without 'std' (but with 'alloc', probably) by avoiding the introduction of incompatibilities. Namely, this setup ensures that all of today's uses of '--no-default-features' won't compile without also adding the 'std' feature. Closes rust-lang#457
For anyone following along at home, I've opened #471 that implements the above changes to bring us to 1.0. Unless something comes up, my plan is to release this on Tuesday (May 1).
This commit adds a new 'use_std' feature and enables it by default. This permits us to one day add support for building regex without 'use_std' (but with 'alloc', probably) by avoiding the introduction of incompatibilities. Namely, this setup ensures that all of today's uses of '--no-default-features' won't compile without also adding the 'use_std' feature. Closes rust-lang#457
I think the `0.2` release has baked long enough. I propose that `regex 1.0` be released on May 1, 2018.
Here are the key breaking changes (all supremely minor) I'd like to make:
- Patch releases (`1.x.y`) should never increase the minimum Rust version required to compile `regex`, but new minor version releases (`1.x`) may increase the minimum Rust version required to compile `regex`.
- Disable octal syntax by default. Currently, `Regex::new(r"\1").unwrap().is_match("\u{1}")` evaluates to `true`. Instead, I'd like it to emit an error that backreferences are not supported. We will provide a method on `RegexBuilder` to opt into the old syntax with octal escape sequences supported.
- Ban `(?-u:\B)` from use in `Regex::new`, since it is permitted to match invalid UTF-8 boundaries. We, of course, continue to allow it for `bytes::Regex::new`. `(?-u:\b)` remains legal in `Regex::new`, since it cannot match invalid UTF-8 boundaries.
- Remove the `impl From<regex_syntax::Error> for regex::Error` definition. The fact that this exists was an oversight, and it actually causes `regex-syntax` to be a public dependency of `regex`, which we very much do not want to happen.
- Eventually support `core`/`alloc`-only use cases. To make this happen, we'll need to gate things on a `std` feature that is enabled by default. We need to add this feature in 1.0 and gate the entire crate on it. If we didn't, and added this gate in the future, then existing uses of `default-features = false` would likely break, which would be a breaking change.