-
-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lint(track_config): add more checks of exercise slugs #417
Conversation
With this commit, `configlet lint` now checks that a track-level `config.json` file follows the below rules: - The `exercises.concept[].slug` value must be unique in `exercises.concept[].slug` and may not exist in `exercises.practice[].slug` - The `exercises.practice[].slug` value must be unique in `exercises.practice[].slug` and may not exist in `exercises.concept[].slug` - There must be exactly one `exercises.practice[].slug` value that is the string `hello-world`
With this commit, `configlet lint` now checks that a track-level `config.json` file follows the below rule: - The `exercises.foregone` values must not match any of the concept or practice exercise slugs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent
src/lint/track_config.nim
Outdated
## only exists once on the track. | ||
## - There is exactly one Practice Exercise with the slug `hello-world`. | ||
## - The `foregone` array does not contain a slug of an implemented exercise. | ||
var conceptExerciseSlugs = initHashSet[string](200) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the 200
just an initial size, or is it a maximum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an initial size - how many items the HashSet
should support. But, like Nim sequences, a HashSet
"expands" automatically if you keep adding items.
In case you're interested, the implementation details for HashSet
are here:
We allocate an initial number of "slots" using the formula theIntegerThatYouPass * 3 div 2 + 4
, and in general the HashSet
expands when it's more than two-thirds full.
As a quick demonstration: let's try initHashSet(3)
(8 slots, by the above calculation), then insert 5 items, and print the representation:
import std/sets
var s = initHashSet[string](3)
s.incl "aaa"
s.incl "bbb"
s.incl "ccc"
s.incl "ddd"
s.incl "eee"
echo s.repr
The output looks something like the below.
[
data = 0x70000000c050@[
[Field0 = 735026264, Field1 = 0x70000000d0c0"ccc"],
[Field0 = 2446311049, Field1 = 0x70000000d090"bbb"],
[Field0 = 4095589471, Field1 = 0x70000000d0f0"ddd"],
[Field0 = 1505603105, Field1 = 0x70000000d120"eee"],
[Field0 = 0, Field1 = ""],
[Field0 = 0, Field1 = ""],
[Field0 = 0, Field1 = ""],
[Field0 = 3033554871, Field1 = 0x70000000d060"aaa"]
],
counter = 5
]
We can see that a HashSet[string]
is just an object
(again, see the implementation here) with:
- A
data
field of typeseq[tuple[hcode: Hash, key: string]]
, whereHash
is just anint
- A
counter
field, corresponding to the number of included elements
Now if we add another item, and print the representation again:
s.incl "fff"
echo s.repr
We see something like:
[
data = 0x70000000e050@[
Field0 = 966543936, Field1 = 0x70000000dab0"fff"],
Field0 = 1505603105, Field1 = 0x70000000da50"eee"],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 3033554871, Field1 = 0x70000000da80"aaa"],
Field0 = 735026264, Field1 = 0x70000000d9c0"ccc"],
Field0 = 2446311049, Field1 = 0x70000000d9f0"bbb"],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 0, Field1 = ""],
Field0 = 4095589471, Field1 = 0x70000000da20"ddd"]
],
counter = 6
]
In particular:
- The
HashSet
grows as we insert the 6th item - The resulting
HashSet
occupies double the memory (becausegrowthFactor == 2
) - (If you stare hard enough) the pointer to the
HashSet
changes - we allocated a new block of memory - The rehashing condition is
(t.dataLen * 2 < t.counter * 3) or (t.dataLen - t.counter < 4)
. AninitialSize
of 3 produces adataLen
of 8, so the above condition becomes true whent.counter
is 5. With non-tiny sets, we rehash when theHashSet
is more than two-thirds full.
Needless allocations are an important source of performance problems, so it's a good optimization to avoid growing strings/sequences/HashSet
s etc when they're large and you're doing a lot of work. But of course, it doesn't matter at all for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks.
This signals to the reader that we know the size of the HashSet in advance.
This makes it clearer that this proc doesn't use `trackConfig.concepts`.
With this PR,
configlet lint
now checks that a track-levelconfig.json
file follows the below rules:exercises.concept[].slug
value must be unique inexercises.concept[].slug
and may not exist inexercises.practice[].slug
exercises.practice[].slug
value must be unique inexercises.practice[].slug
and may not exist inexercises.concept[].slug
exercises.practice[].slug
value that isthe string
hello-world
exercises.foregone
values must not match any of the concept orpractice exercise slugs
At the time of writing (
2021-09-07T13:37:00Z
), this PR produces the following diff to the output ofconfiglet lint
, per track:julia
https://github.com/exercism/julia/blob/0a134640aeed/config.json#L102-L296
Looks like Sascha has a PR for it: exercism/julia#365