Skip to content
This repository has been archived by the owner on Mar 22, 2022. It is now read-only.

configlet lint: Should check whether UUIDs are valid. #99

Open
petertseng opened this issue Oct 15, 2017 · 17 comments · May be fixed by #168
Open

configlet lint: Should check whether UUIDs are valid. #99

petertseng opened this issue Oct 15, 2017 · 17 comments · May be fixed by #168

Comments

@petertseng
Copy link
Member

petertseng commented Oct 15, 2017

Decision: configlet lint should check whether UUIDs are valid.

Support:

Opposition: None.

Task: Make configlet lint check whether UUIDs are valid.


Original issue text/question:

What does valid mean? Is it the standard xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx where each x is a hexadecimal digit?

Whatever "valid" means in the Exercism context, would it be good for configlet to check for invalid UUIDs? If whatever is using those UUIDs is expecting them all to be valid, it might be needed.

[gems/trackler-2.2.1.45]$ for c in tracks/*/config.json; do echo $c; ruby -rjson -e 'h = /[0-9a-fA-F]/; j = JSON.parse(ARGF.read); j["exercises"].each { |e| u = e["uuid"]; puts "%24s has %s" % [e["slug"], u] if u !~ /^#{h}{8}-#{h}{4}-#{h}{4}-#{h}{4}-#{h}{12}$/ }' $c; done 
tracks/bash/config.json
tracks/c/config.json
tracks/ceylon/config.json
tracks/cfml/config.json
tracks/clojure/config.json
tracks/coffeescript/config.json
tracks/common-lisp/config.json
tracks/coq/config.json
tracks/cpp/config.json
           atbash-cipher has ba4e0186-0389-af80-d3e4-695dfc4e63bff1fa503
            bracket-push has ce9fd8ac-0beb-1a80-4ff6-be19ce1540d905c31b7
                 pangram has ea8d6224-0440-6d80-1569-a04127908a200d2ecf0
tracks/crystal/config.json
tracks/csharp/config.json
tracks/dart/config.json
               raindrops has d6657d4a-0257-0c80-6667-fddd127a571f0ad1fd4
tracks/d/config.json
tracks/delphi/config.json
             hello-world has 765b5bcf-00e0-d180-ea7f-92dae26e6da0b30151c
                 two-fer has b38f504c-098e-fe80-896f-60e6c45751033d26161
                     bob has cd80ef91-0451-c380-3d58-01a2ddff3029f430eea
                    leap has 9234d900-050c-5980-561b-e1f772ae2b9c7b3fd08
       rna-transcription has e8849e37-0b3e-ec80-c41c-c53f2a871f6ee59e2d6
               raindrops has 3b1391da-04c2-e580-30a5-6063e888da196dfba6a
                 hamming has 68c5b50b-03c0-4880-af0f-9cfacb1ba507e837d12
            bank-account has bb32f666-0f50-e480-9738-e4e3e8f7ece13c9b460
        nucleotide-count has d5cae090-0626-c880-3b62-58ed5d301931de3171d
                     etl has 4e538a24-0e30-2b80-2b3f-82ca73b593a81a8cc8d
                  grains has acf5b5ab-03d7-0e80-f419-255df25d35a2466a421
      collatz-conjecture has d8093412-0665-0280-b4ae-edd0a7b483a316f236b
               beer-song has 311b59d8-01ed-0780-14a9-fa94a2937508a4fbf40
            phone-number has ce2587db-02b3-3880-e5aa-0765496de558966abf5
         perfect-numbers has ea99e05d-0f87-4280-aec3-907d40556314f89a033
           binary-search has 5a865bce-0f07-e980-5c71-c5307dc82e30076faeb
                   clock has 4e38035b-040b-0880-b592-f750e82b574e0e58c41
                triangle has ad99318a-0397-e080-5d92-0043fa439ccc604f7ed
               allergies has aedbd675-0c9f-3680-126d-38e9c675c8a00297062
           saddle-points has dbfbea24-014a-0080-5b0e-fe38d498443be3df149
         circular-buffer has a1cb2ebe-05f0-4380-c2c4-7efe9eba69d05f6de75
             minesweeper has b36d2d0d-0ef6-0780-cb76-de828d83c7ce7f5b62b
          roman-numerals has 90d241e1-07d4-ff80-de15-77f2042d2d249233b28
                 bowling has 0bfab0d4-04ca-2080-f5ae-3199d09960b80633e82
               pig-latin has 40c44251-0dff-3380-6a14-ef8e02d1e9242dca58f
              book-store has 234df60b-0b1a-1e80-67c5-f8bcefb32e61245df7d
                   wordy has 426eaa07-02bf-cc80-d4b6-d444f233c0b0eab3b8e
                   poker has 14675cf2-0504-1480-8495-d9aa32915b3a76f2383
tracks/ecmascript/config.json
             twelve-days has 0e43944b-0a68-5680-ef11-70999d2df897c894476
tracks/elisp/config.json
tracks/elixir/config.json
tracks/elm/config.json
      collatz-conjecture has 1f52ab74-02d5-db80-8f95-521aba04d183e704422
           all-your-base has 7890045d-0cf5-1980-21ae-7bd0228ad2a2993ae82
        pascals-triangle has a730ca85-057d-db80-e44e-25b8028acede6337cca
                 isogram has 5f540090-061e-2f80-40a8-d9782700ed2efdf8965
tracks/erlang/config.json
                 two-fer has 1a8abaa2-04f5-0e80-7986-5fb33995971caeb1496
tracks/factor/config.json
tracks/fortran/config.json
tracks/fsharp/config.json
tracks/gnu-apl/config.json
                    leap has 9a89822e-0d61-f980-49fa-c273c0a0e0869b9fc7f
tracks/go/config.json
                 two-fer has dc4ad67d-0fe4-0880-5a13-3aebb8759bcc3095de1
tracks/groovy/config.json
                 two-fer has 3227f4a2-0ffc-1480-5018-4ef3479dbcffdcf3f2b
          scrabble-score has 1d098fb3-0a27-5380-3849-73f2ca77ff2f84ca698
tracks/haskell/config.json
tracks/haxe/config.json
                    leap has d30952cf-0b7d-a980-9351-caedebfed16362c6a7a
tracks/idris/config.json
              accumulate has 612f5f98-09a6-9d80-5820-edeec5ce1e1544073ed
tracks/java/config.json
              tournament has 99191d53-0fed-f680-de7a-55cfb872de17bc58629
tracks/javascript/config.json
tracks/julia/config.json
      collatz-conjecture has 43f1dc63-0412-9a80-49ba-b620f4bbfccac2731e4
         robot-simulator has deb5a654-056c-2180-2dd8-efaa14dc5225f5af86c
                  grains has 2ed6e171-04e1-b080-f383-a801b2564b3e0f4f1ee
tracks/kotlin/config.json
tracks/lfe/config.json
tracks/lua/config.json
tracks/mips/config.json
tracks/nim/config.json
tracks/objective-c/config.json
tracks/ocaml/config.json
tracks/perl5/config.json
tracks/perl6/config.json
tracks/php/config.json
tracks/plsql/config.json
             hello-world has 53156fc9-0945-f080-08a9-ce269bfc9121a5d2c5c
tracks/pony/config.json
tracks/powershell/config.json
tracks/prolog/config.json
tracks/purescript/config.json
tracks/python/config.json
                 two-fer has bacb2184-025f-0980-38d6-34fb419e03a6d3d8b44
              two-bucket has 8c89f739-05fb-7b80-b5f9-6ad079c750ba8302be8
tracks/racket/config.json
tracks/r/config.json
tracks/ruby/config.json
tracks/rust/config.json
tracks/scala/config.json
tracks/scheme/config.json
tracks/sml/config.json
             hello-world has 4450455d-0786-6f80-7672-48368a115df16e0aa7e
                     bob has 9d8e76f1-0e85-a280-b779-16335f29d1a96ce3a72
   difference-of-squares has b6713a74-08d5-4480-6524-f9dbf5cf2563d2b2a91
           atbash-cipher has 2bfc685e-0d13-0280-de5e-53bc093e3a8c5ac7c73
         perfect-numbers has b83ce171-06e6-dd80-080e-9dd189bf6df83833e92
                 pangram has 42adb6ed-09f0-6980-2d4f-75ac90cc2bf1f59fb3a
           prime-factors has 6c191163-0a98-2380-debc-75144c1787f3853e868
        sum-of-multiples has b9377be8-06c1-5280-33b6-a2071298947c9ddf65e
tracks/swift/config.json
tracks/typescript/config.json
      binary-search-tree has b2f90523-0cb1-f080-a03c-5aee5f919abf93b491b
                triangle has cf2d545c-036e-0980-3ac2-64ad54c3867e37f4cfa
tracks/vbnet/config.json
tracks/vimscript/config.json
@ferhatelmas
Copy link
Contributor

@petertseng I am happy it's brought up because I had seen it while doing exercism/go#894 but waited for input from others since I was away for sometime and I might be missing something.

Also, created this issue in UUID dependency: mattetti/uuid#2

@nywilken
Copy link
Contributor

nywilken commented Oct 16, 2017

@petertseng @ferhatelmas the concept of a UUID is fairly new, at least the support of it within Configlet. In fact, the lint command only checks for a non-empty UUID and uniqueness across exercises and tracks. Seeing as configlet is the first line of defense for any track changes, I think validating UUIDs should be part of Configlet. But I will defer to anyone on the team who may have more insight on UUID usage and validation across Nextercism.

@NobbZ
Copy link
Member

NobbZ commented Oct 16, 2017

I think formal validity of an UUID should be checked before even asking upstream for duplicates… Of course the API server should check again ;) We know this already from web forms ;)

Since I already copied from wikipedia, a the canonical representation is written with 32 hexadecimal digits, grouped with a rythm of 8-4-4-4-12, therefore a quick and dirty check in configlet could be to match against a regex ~r/[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}/.

This should catch at least obviously invalid UUIDs. If though the package that is used to generate, is also able to parse them, I'd prefer full offline validation, depending on which version we use.

@nywilken
Copy link
Contributor

I think formal validity of an UUID should be checked before even asking upstream for duplicates…

Agreed.

@ferhatelmas
Copy link
Contributor

If though the package that is used to generate, is also able to parse them ...

https://godoc.org/github.com/satori/go.uuid might be a good option.

@NobbZ
Copy link
Member

NobbZ commented Oct 17, 2017

At least the documentation looks good.

I've looked at the currently used UUID generator and its in fact only capable of generating UUIDv4 (randomly generated). And has no option to read or verify UUIDs.

As a matter of fact, I'd suggest not to use plain UUIDv4. I'd generate a single UUID using v1 or v4 by @kytrinyx as a base. We could use this to namespace UUIDv3/v5 per track and then use the tracks UUID as namespace for the exercises.

This way, we will still have low chances of collision, while the UUID per track/exercise combination is deterministic.

I'm not sure though, how we would handle clashes then. With v4 we could simply regenerate, With v3/v5 we could switch version once, and after that do a v4, but to be honest, if we clash 2 times in a row, then we have another kind of trouble :)

@tleen
Copy link
Member

tleen commented Oct 18, 2017

Are we seeing UUID clashing as an issue here? Lint will check for a clash although if using the uuid command to generate the chances of having a clash are remote. Most likely it would come in from a copy/paste type action. At which point a new one can be generated. Because @NobbZ is right, if we get two in a row we have bigger problems 😄

I'm not sure what we use the UUID for in nexterism so understanding that may help. Is it to represent the exercise in the DB or just give it some sort of unique identifier in general?

@tleen
Copy link
Member

tleen commented Oct 19, 2017

The uuid generating library is just plain broken it seems (and perhaps abandoned). If we are going by the strict definition of UUID the final group is too long and the library's own tests do not catch it. We should move to a more robust option like the one @ferhatelmas suggested or google/uuid?

@kytrinyx
Copy link
Member

The UUID is used to represent the exercise, and it needs to be unique across all exercises on Exercism.

I'd suggest that we use google/uuid.

We have until v2 goes live to get the uuids right in some final/usable form. I would suggest that we switch out the dependent library, and submit PRs to fix all the tracks to fix the UUIDs.

@Stargator
Copy link

Is there a summary of what was ultimately the decision?

@kytrinyx
Copy link
Member

We decided to:

  • replace the current uuid library with the google/uuid one
  • write a (temporary, unofficial) script/commend in configlet to help detect incorrect uuids and replace them
  • submit PRs to each of the tracks with fixes

This doesn't impact the v1 exercism site, but will require a full database refresh of v2 (which was already planned, so we're not losing any data that we thought we would keep around).

@kytrinyx
Copy link
Member

kytrinyx commented Dec 5, 2017

Going back to the original title of this issue:

Should we detect invalid UUIDs on CI? (I think it would be worth doing).

@tleen
Copy link
Member

tleen commented Dec 6, 2017

@kytrinyx we should during the lint? We can message as to the nature of the failure and how to correct it.

@nywilken
Copy link
Contributor

nywilken commented Dec 8, 2017

I agree that checking on CI is crucial, even at lint. But my reservation, and I admit that I have not thought much into the legitimacy of this problem, is that if we run this at lint we will need to change all test data to use valid UUIDs unless there is a disable flag.

@Stargator
Copy link

Stargator commented Dec 8, 2017

@nywilken Are you saying some tests for exercises may trigger a failure if this is run in lint?

@nywilken
Copy link
Contributor

nywilken commented Dec 8, 2017

@Stargator I was specifically speaking about the unit tests for configlet. We currently use UUIDs like “aaa” in configlet fixtures. UUID validation in configlet should not affect tests within exercises.

@petertseng petertseng changed the title configlet lint: Should it check whether UUIDs are valid? configlet lint: Should check whether UUIDs are valid. Feb 27, 2018
@petertseng
Copy link
Member Author

Rephrased from question to task.

petertseng referenced this issue in petertseng/configlet Aug 26, 2019
all tests that involve lint are hereby updated to do one of the two:
1. have valid UUIDs
2. keep their invalid UUIDs, and expect that an error for them is shown.

Closes exercism/configlet#99
@petertseng petertseng linked a pull request Aug 26, 2019 that will close this issue
petertseng referenced this issue in petertseng/configlet Aug 26, 2019
all tests that involve lint are hereby updated to do one of the two:
1. have valid UUIDs
2. keep their invalid UUIDs, and expect that an error for them is shown.

Closes exercism/configlet#99
petertseng referenced this issue in petertseng/exercism-ceylon Aug 30, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
petertseng referenced this issue in petertseng/exercism-ceylon Aug 30, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
petertseng referenced this issue in petertseng/exercism-ceylon Aug 30, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
petertseng referenced this issue in petertseng/exercism-ceylon Aug 30, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
petertseng referenced this issue in exercism/ceylon Aug 30, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
petertseng referenced this issue in exercism/rust Aug 31, 2019
Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
sshine referenced this issue in exercism/haskell Sep 3, 2019
* travis: reject invalid UUIDs

Want to prevent any invalid UUIDs from being entered.

Want to put this in configlet lint, but can't:
exercism/configlet#99
exercism/configlet#168
So it will go in individual tracks' .travis.yml for now.

It appears that the site will accept pretty much arbitrary strings as
UUIDs for now, but we want to make less work for ourselves when valid
UUIDs are required.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants