Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace non-breaking spaces with plain spaces #44228

Merged
merged 4 commits into from
Mar 15, 2022
Merged

replace non-breaking spaces with plain spaces #44228

merged 4 commits into from
Mar 15, 2022

Conversation

StefanKarpinski
Copy link
Member

Also end all text files with newlines.

@StefanKarpinski
Copy link
Member Author

Some of my most important work.

@goretkin
Copy link
Contributor

goretkin commented Feb 17, 2022

These kinds of changes can be a bit disruptive as far as git blame, etc. goes. There may be a way to mitigate it: https://tekin.co.uk/2020/09/ignore-linting-and-formatting-commits-when-running-git-blame

test/file.jl Outdated Show resolved Hide resolved
@StefanKarpinski
Copy link
Member Author

This is mostly adding trailing newlines which don't affect git blame.

@StefanKarpinski
Copy link
Member Author

Well this snowballed. We now check for all trailing whitespace, and that line endings are UNIX style (\n only, no \r\n), and that text files end with exactly one newline.

@KristofferC
Copy link
Member

I think it would be worth adding a git blame ignore file for this afterward.

@StefanKarpinski
Copy link
Member Author

It doesn't actually touch a lot of meaningful lines, but sure. There's one file that had Windows line endings, a bunch of files with too many or too few trailing newlines, and a bunch of places where there were non-breaking spaces (but many were due to me anyway). Blame for most of that stuff doesn't matter.

@KristofferC
Copy link
Member

and a bunch of places where there were non-breaking spaces (but many were due to me anyway).

Having those refer to the original commit where they were added seems like it makes sense. What's the harm in adding the file?

@StefanKarpinski
Copy link
Member Author

No problem if you want to do it.

@StefanKarpinski
Copy link
Member Author

@staticfloat, I assume that the whitespace check that's failing is running the version of contrib/check-whitespace.sh that's on this branch, right? If so, this failure extra fun because it is claiming that line 56 of src/flisp/julia_opsuffs.h has trailing whitespace. Which it doesn't: the line ends with . However, the UTF-8 encoding of is e1 b5 a0 and a0 is the Latin-1 encoding of non-breaking space (code point U+00a0). So that failure indicates that while my git's Perl-compatible mode correctly treats files as UTF-8, the git on the test machines is compiled differently and treats files as Latin-1 instead. Potential solutions:

  1. Upgrade the git binary on the test machines
  2. Rewrite the whitespace check using a tool that handles UTF-8 correctly

Option two seems better and more reliable. This check would be easy to implement in Julia, but we don't have Julia at that point, so presumably the check would have to be implemented in some other language like Python or Perl. What language do we already depend on that I can do this in?

@staticfloat
Copy link
Member

I assume that the whitespace check that's failing is running the version of contrib/check-whitespace.sh that's on this branch, right?

Yes, that's correct. You can see precisely which gitsha (0d072a4be93a3df9e1b092acd6af2e1ad45e368c) was checked out in the preparing work directory section of the build output.

Upgrade the git binary on the test machines

The test job is running inside of a rootfs, which is installing the default git from debian when we debootstrap. If we use test_rootfs.jl to get a shell inside the sandbox, we can see what version of git that actually installs, and whether your suspicion is correct:

$ julia --project test_rootfs.jl -a x86_64 -t 2a058481b567f0e91b9aa3ce4ad4f09e6419355a -u https://github.com/JuliaCI/rootfs-images/releases/download/v4.8/package_linux.x86_64.tar.gz
[ Info: Artifact did not exist locally, downloading
[ Info: Unpacking /tmp/jl_0iuc5f-download.gz into /home/sabae/.julia/artifacts/jl_tplFGg...
juliaci@kholin:/build$ git --version
git version 2.20.1
$ git clone -b sk/nbsp https://github.com/JuliaLang/julia
Cloning into 'julia'...
remote: Enumerating objects: 365316, done.        
remote: Counting objects: 100% (15/15), done.        
remote: Compressing objects: 100% (15/15), done.        
remote: Total 365316 (delta 3), reused 5 (delta 0), pack-reused 365301        
Receiving objects: 100% (365316/365316), 224.39 MiB | 35.13 MiB/s, done.
Resolving deltas: 100% (270641/270641), done.
juliaci@kholin:/build$ cd julia
juliaci@kholin:/build/julia$ ./contrib/check-whitespace.sh 
src/flisp/julia_opsuffs.h:56:   0x00001d60, // ᵠ
src/flisp/julia_opsuffs.h:68:   0x00001da0, // ᶠ
Error: trailing whitespace found in source file(s)

This can often be fixed with:
    git rebase --whitespace=fix HEAD~1
or
    git rebase --whitespace=fix master
and then a forced push of the correct branch

So we could try a different built of git or something, but mostly I just wanted to highlight that we've built a good chunk of the functionality of docker on top of Artifacts, Sandbox, etc... :P

This check would be easy to implement in Julia, but we don't have Julia at that point

Perhaps somewhat surprisingly, we can have Julia here; with one small tweak, we can have the Julia that we used to setup the sandbox (v1.6) available inside of the sandbox. Here's the tweak, and here's the proof

@StefanKarpinski
Copy link
Member Author

StefanKarpinski commented Mar 8, 2022

@staticfloat, what do you prefer? Should I reimplement the whitespace check as a Julia program or is it easier to upgrade git on the CI systems? Or I can implement in a different language altogether (Python, Perl).

@staticfloat
Copy link
Member

It's up to you, but I'm not confident that a newer version of git from apt will work; my guess is it's some transitive dependency of git that needs to be altered, and that might be complicated to fix.

In addition to whatever version of Julia you desire, you have access to the following tools (or newer, if we upgrade the rootfs image):

$ python3 --version
Python 3.9.2
# perl --version

This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-gnu-thread-multi
(with 65 registered patches, see perl -V for more detail)

@StefanKarpinski
Copy link
Member Author

Ok, wrote the whitespace check script in Julia.

@staticfloat
Copy link
Member

@StefanKarpinski
Copy link
Member Author

Btw, please DON'T squash merge this—having the changes separate makes it much easier to resolve any merge conflicts that may come up since each commit does a specific thing that can be automated.

@StefanKarpinski
Copy link
Member Author

This is good as far as I'm concerned. If it passes CI it can be merged (non-squash).

@StefanKarpinski
Copy link
Member Author

StefanKarpinski commented Mar 10, 2022

It would be ideal to get all CI passes green but it seems pretty unlikely that this change is related to the various failures here. Up to those managing these CI failures to decide what to do here (wait for green or merge [NO SQUASH PLEASE] before).

@KristofferC
Copy link
Member

Maybe try rebase it on lastest master to get a fresh CI run (CI doesn't run on merge commit).

@DilumAluthge
Copy link
Member

We'll need to install Julia inside the whitespace job: JuliaCI/julia-buildkite#58

@StefanKarpinski
Copy link
Member Author

Can you CI guys fix this up with additional commits to make it work? I can rebase the history once it's working but I don't think I can get this working myself with the CI stuff in flux.

@DilumAluthge
Copy link
Member

DilumAluthge commented Mar 11, 2022

Definitely.

Is it cool if I push directly to this branch? And then you can rebase when it's all green.

@StefanKarpinski
Copy link
Member Author

Yeah, go for it. Just add commits and I'll fix it up later.

This now checks for:

- trailing non-ASCII whitespace
- non-breaking spaces anywhere
- non-UNIX line endings
- trailing blank lines
- no trailing newline

Git can't handle this, largely because whether it interprets files as
UTF-8 or Latin-1 depends on how system libraries that it uses for regex
matching are compiled, which is inconsistent and hard to fix. The end of
file checks are also quite awkard and inefficient to implement with Git
and shell scripting. Julia is fast and lets us present results clearly.
@DilumAluthge
Copy link
Member

@StefanKarpinski Looks like Buildkite is green on this now.

@StefanKarpinski
Copy link
Member Author

Does that mean it's safe to merge this PR? Buildbot failures look unrelated.

@DilumAluthge
Copy link
Member

Yes, I think you can merge this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants