Added newlines in tex for easier vim navigation
tonowak committed Mar 1, 2023
1 parent bcb760e commit e1addc6
Showing 1 changed file with 56 additions and 18 deletions.
74 changes: 56 additions & 18 deletions thesis-en.tex
@@ -152,7 +152,8 @@ \chapter{State of the art}\label{r:chapter_stateoftheart}

\section{Problems with using semver in Rust}\label{r:section_usageofsemver}

It might seem easy to maintain semver, but some violations are really hard to notice
unless you are actively searching for them. Let's look at an example.
\vspace{-3pt}
\begin{verbatim}
struct Foo {
@@ -165,18 +166,32 @@ \section{Problems with using semver in Rust}\label{r:section_usageofsemver}
\end{verbatim}
\vspace{-5pt}

Changing the type of {\ttfamily Foo.x} from {\ttfamily String} to {\ttfamily Rc<str>}
causes a semver break, even though it is a non-public field of a non-public struct.
Why? {\ttfamily String} implements the {\ttfamily Send} and {\ttfamily Sync} traits,
which are derived automatically, so both {\ttfamily Foo} and {\ttfamily Bar}
implement {\ttfamily Send} and {\ttfamily Sync}.
In contrast, {\ttfamily Rc<str>} implements neither of them,
so the change results in the publicly visible struct {\ttfamily Bar} losing both traits.
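
The effect can be reproduced with a small, self-contained sketch.
It is not a reconstruction of the listing above: the names {\ttfamily InnerV1}, {\ttfamily OuterV1}, {\ttfamily InnerV2}, {\ttfamily OuterV2} and the helper functions are hypothetical and only stand in for the "before" and "after" versions of such a pair of structs.

```rust
use std::rc::Rc;
use std::thread;

// "Before" version (hypothetical names): a private struct with a String field,
// wrapped by a public struct. Both are Send + Sync automatically.
struct InnerV1 {
    x: String,
}

pub struct OuterV1 {
    inner: InnerV1,
}

impl OuterV1 {
    pub fn new(text: &str) -> Self {
        OuterV1 { inner: InnerV1 { x: text.to_string() } }
    }
}

// "After" version: only the type of the private field changed.
// Rc<str> is neither Send nor Sync, so OuterV2 is neither as well.
#[allow(dead_code)]
struct InnerV2 {
    x: Rc<str>,
}

#[allow(dead_code)]
pub struct OuterV2 {
    inner: InnerV2,
}

// Compiles only for types that implement both auto traits.
pub fn assert_send_sync<T: Send + Sync>() {}

// Downstream-style code that relies on OuterV1 being Send.
pub fn len_on_other_thread(value: OuterV1) -> usize {
    thread::spawn(move || value.inner.x.len())
        .join()
        .expect("worker thread panicked")
}

pub fn check_v1() {
    assert_send_sync::<OuterV1>();
    // assert_send_sync::<OuterV2>();
    // ^ uncommenting this line is rejected by the compiler,
    //   because Rc<str> implements neither Send nor Sync.
}
```

Calling {\ttfamily len\_on\_other\_thread(OuterV1::new("hello"))} works with the first version; the same function written against {\ttfamily OuterV2} would not compile, although no public signature mentions {\ttfamily Rc}.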

Of course, things can get far more complex.
For example, having these structs in very different locations
makes such behaviour harder to track.
A similar error crept into release v3.2.0 of a well-known crate, {\ttfamily clap}.
More on that later, in section \ref{r:section_real_life_semver_breaks}.
It should be clear by now that breaking semver by accident is possible.

\section{Consequences of breaking semver}

When you publish a new version of a crate that breaks semver,
you cause a major inconvenience for the crate's users.
Their code might just stop compiling when the offending version gets downloaded.
This can also happen if the crate containing the violation is not a direct dependency,
so a single semver break could result in many broken crates.
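
Why the offending version gets downloaded at all can be illustrated with a minimal sketch of Cargo's default (caret) version requirement, under which any newer release with the same leftmost non-zero component is considered compatible.
The types and function names below are hypothetical, pre-release versions are ignored, and the version list in the usage note is purely illustrative.

```rust
/// A plain (major, minor, patch) version triple; fields compare lexicographically.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version(pub u64, pub u64, pub u64);

/// Returns true if `candidate` satisfies the caret requirement `^req`.
pub fn caret_matches(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    if req.0 > 0 {
        // ^1.2.3 allows >=1.2.3, <2.0.0
        candidate.0 == req.0
    } else if req.1 > 0 {
        // ^0.2.3 allows >=0.2.3, <0.3.0
        candidate.0 == 0 && candidate.1 == req.1
    } else {
        // ^0.0.3 allows only 0.0.3
        candidate == req
    }
}

/// Picks the newest published version that satisfies `^req`, if any.
pub fn pick_newest(req: Version, published: &[Version]) -> Option<Version> {
    published
        .iter()
        .copied()
        .filter(|v| caret_matches(req, *v))
        .max()
}
```

With a requirement of {\ttfamily 3.1.0} and published versions 3.1.0, 3.1.5, 3.2.0 and 4.0.0, {\ttfamily pick\_newest} selects 3.2.0, so a semver-breaking minor release reaches users without any change on their side.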

Debugging a cryptic compilation error that starts showing up one day,
without any change to the code, can be really frustrating
and might drive users to stop using your crate.

\section{Real-life examples of semver breaks} \label{r:section_real_life_semver_breaks}

@@ -185,22 +200,45 @@ \section{Real-life examples of semver breaks} \label{r:section_real_life_semver_
\item {\ttfamily pyo3 v0.5.1} accidentally changed a function signature\footnote{https://github.com/PyO3/pyo3/issues/285};
\item {\ttfamily clap v3.2.0} accidentally had a type stop implementing an auto-trait\footnote{https://github.com/clap-rs/clap/issues/3876};
\item multiple {\ttfamily block-buffer} versions accidentally broke their MSRV contract\footnote{https://github.com/RustCrypto/utils/issues/22};
\item and many more. We have developed a script that scans all releases
for the semver breaks we can detect; the results are covered in section \ref{r:section_scanning_script}.
\end{itemize}

Of course, the problem is even more prominent in less popular crates,
where developers might not be as experienced. A paper\footnote{https://arxiv.org/pdf/2201.11821.pdf}
claims that among the yanked (unpublished) releases,
a semver break was the leading reason for yanking, at a striking 43\% rate.
It also mentions that 3.7\% of all releases (and there are already more than 300 000 of them)
are yanked, which shows the scale of the problem: thousands of detected semver breaks.
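
As a rough back-of-the-envelope check of that scale, assuming both percentages apply to the same set of releases and taking 300 000 as a lower bound on their number:
\[
0.037 \times 300\,000 = 11\,100 \mbox{ yanked releases},
\qquad
0.43 \times 11\,100 \approx 4\,773 \mbox{ releases yanked because of a semver break}.
\]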

\section{Existing tools for detecting semver breaks}

There are not many mature tools for semver checking.
The main reason is that the semantics of popular languages
do not allow for complete automatic verification.
There are some initiatives to combat this: for example,
the Elm language\footnote{https://elm-lang.org/} enforces semantic versioning.
Its type system enables automatic detection of all API changes.
Beyond that, tools for checking semver
in established languages like Python or C++ do not appear to be commonly used in the industry.
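
The idea behind such enforcement can be sketched as a function that derives the minimum version bump from a comparison of two API descriptions.
This is only an illustration written in Rust, not Elm's actual implementation: the API model (exported name mapped to a type signature string) and the names {\ttfamily Api}, {\ttfamily Bump}, {\ttfamily required\_bump} and {\ttfamily api} are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// The smallest version bump that a set of API changes requires.
#[derive(Debug, PartialEq, Eq)]
pub enum Bump {
    Patch,
    Minor,
    Major,
}

/// Simplified API model: exported name -> type signature text.
pub type Api = BTreeMap<String, String>;

/// Compares two API snapshots and returns the required bump.
pub fn required_bump(old: &Api, new: &Api) -> Bump {
    // Any removed or changed item can break existing callers.
    for (name, signature) in old {
        if new.get(name) != Some(signature) {
            return Bump::Major;
        }
    }
    // Every old item survived unchanged; extra entries are pure additions.
    if new.len() > old.len() {
        Bump::Minor
    } else {
        Bump::Patch
    }
}

/// Small helper for building an `Api` from string pairs.
pub fn api(items: &[(&str, &str)]) -> Api {
    items
        .iter()
        .map(|(name, signature)| (name.to_string(), signature.to_string()))
        .collect()
}
```

For instance, adding a new exported function yields {\ttfamily Bump::Minor}, while changing the signature of an existing one yields {\ttfamily Bump::Major}. The approach only works when every relevant property of the API is visible in the compared description, which is the difficulty discussed for Rust below.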

Unfortunately, the Rust language's semantics were not designed with semver in mind.
Despite this, there are some existing tools for semver checking.
The first of them, cargo-breaking, works on the abstract syntax tree.
The problem here is that to compare API changes, you must navigate two trees at once,
which can get really complex and tedious, because the abstract syntax tree can change quite a lot
even without any public API changes.
Another issue is that both the language syntax and the structure of the abstract syntax tree
might change as the language develops, which makes maintenance time-consuming.

The second existing tool is rust-semverver, which focuses on
the metadata present in the Rust-specific rlib binary library format.
Because of that, unfortunately, the user experience is far from ideal,
as it forces the user onto specific unstable versions of the language, and the quality of error messages is limited.

In comparison, cargo-semver-checks' approach of writing lints as queries seems to work really well.
Adding new queries is designed to be quite accessible, and the maintenance comes down to
keeping the adapter up to date with rustdoc API changes, which seems to be about as low-effort as it could be.
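
To give an intuition for "lints as queries", the following toy sketch models each lint as a named query over two API snapshots.
It does not use the real query language or API of cargo-semver-checks: the snapshot model (a set of public item paths), the {\ttfamily Lint} struct and the lint name {\ttfamily public\_item\_missing} are all hypothetical.

```rust
use std::collections::BTreeSet;

/// Toy API model: the set of public item paths in one crate version.
pub type ApiItems = BTreeSet<String>;

/// A lint is a name plus a query over (baseline, current) snapshots.
/// The query returns the items that violate the lint.
pub struct Lint {
    pub name: &'static str,
    pub query: fn(&ApiItems, &ApiItems) -> Vec<String>,
}

/// Query: public items present in the baseline but missing now.
fn removed_items(baseline: &ApiItems, current: &ApiItems) -> Vec<String> {
    baseline.difference(current).cloned().collect()
}

/// The lint registry; adding a lint means adding one entry here.
pub fn lints() -> Vec<Lint> {
    vec![Lint {
        name: "public_item_missing",
        query: removed_items,
    }]
}

/// Runs every registered lint and collects (lint name, offending item) pairs.
pub fn run_lints(baseline: &ApiItems, current: &ApiItems) -> Vec<(String, String)> {
    let mut findings = Vec::new();
    for lint in lints() {
        for item in (lint.query)(baseline, current) {
            findings.push((lint.name.to_string(), item));
        }
    }
    findings
}
```

The point of the design is that the checking engine stays fixed while lints are added as data: a new check only needs a new query, and only the layer that produces the snapshots has to follow changes in the underlying format.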

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Vision %