Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a practice exercise generator #480

Merged
merged 38 commits into from
Apr 13, 2022
Merged

Conversation

jiegillet
Copy link
Contributor

@jiegillet jiegillet commented Apr 2, 2022

I know that this is not a priority compared to concept exercises, but I love implementing practice exercises, and I had some time off, so I decided to build a generator to help creating those.

The generator code is in generate, all you need to make it work is running bin/generate.sh <exercise-name> and follow the instructions.

For example, try to run .bin/generate.sh dominoes, it will ask you to modify config.json and after that the following files will be created:

exercises/practice/dominoes/
├── .docs
│   └── instructions.md
├── .meta
│   ├── config.json
│   └── src
│       └── Dominoes.example.elm
├── elm.json
├── src
│   └── Dominoes.elm
└── tests
    └── Tests.elm

the docs and config file are created by configlet, and the rest by the generator. Once the solution is implemented for real, the instructions will ask you to run one more configlet command to generate the tests.toml.

The Elm files generated are by no means perfect, but it really gets the ball going because all of the data is there. With some good regex-fu, the tests are not too hard to get into shape. For examples for dominoes I feel it would be enough to replace the lists [1, 2] by tuples (1, 2) and start working on the solution, it can be done with one clever sed command.
I tried to catch some corner cases (all sorts of input, reserved words, reimplemented exercises) but it's probably not going to be perfect for all exercises. I've ran it on a bunch and it seems to work well enough though.

The code is not really critical to anything, so I feel that a deep review may not be necessary here, but of course suggestions are welcome.

I decided to test it on a real case: complex-numbers, but if that's too much to review, I can put it in a different PR.

As with all large PRs, I do not expect a quick review, especially since there are other ones pending.

@jiegillet jiegillet added x:module/generator Work on Exercise generators x:module/practice-exercise Work on Practice Exercises x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises) x:size/massive Massive amount of work labels Apr 2, 2022
Copy link
Member

@mpizenberg mpizenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must say I wasn't expecting an elm cli app to generate templates. My main objection is that elm is not suited for this task. This is a lot of template filling, network requesting, json parsing. My feeling is that doing all this directly in python or javascript will be much more straightforward and clearer, making it easier to maintain.

I don't want to discard all you work, and I totally get wanting to do it in elm, though I'm feeling that there is nothing fundamentally requiring elm for this (we don't need to parse elm files as I understand) and elm is not very suited for this kind of tasks. So I'm wondering if the most adequate way for this isn't instead a markdown readme containing a few commands reusing the files in template/ and filling the data with appropriate data from the canonical repo.

What do you think?

Also I think it's worth splitting this PR and the one for the creation of the complex exercise.

@jiegillet jiegillet changed the title Create a practice exercise generator and use it on complex-numbers Create a practice exercise generator Apr 4, 2022
@jiegillet jiegillet removed the x:module/practice-exercise Work on Practice Exercises label Apr 4, 2022
@jiegillet
Copy link
Contributor Author

It's true that this is well suited for a scripting language...

The logic in generate/src/Main.elm is not completely straightforward though, for example searchFunctions, and I think that more than a few commands would be necessary would be required to transform canonical-data.json into a functional Test.elm.

But if you think this is a maintenance burden, I understand and I can close the PR. I'll definitely keep using it for myself on my branch :)

@mpizenberg
Copy link
Member

I'll have a closer look at this in the coming days, and probably give a try at a JS equivalent, to see the tradeoffs.

@mpizenberg
Copy link
Member

I've started some review of the code but it would be nice if you could add a bit more comments on how things work. I've added some comments and some remarks directly in comments and I have not spend enough time yet to understand it all.

Some questions I had while reading, why is reimplements per test and not per exercise? and what is the purpose of replaceReservedWord? I didn't get why there could be elm keywords in the canonical value decoded but I might see it from the wrong perspective.

@mpizenberg
Copy link
Member

Ok so for reimplements it's because tests should be immutable so when changing a test case we can track which test a new one is re-implementing.

@mpizenberg
Copy link
Member

mpizenberg commented Apr 8, 2022

I've added set -euo pipefail because I got in a confusing situation where I had answered yes to he question without actually changing the config.json (I thought I didn't need if I didn't change anything). Then the configlet sync command failed, and so the directory for the folder was not created, and so the cp template/elm.json $exercise_dir create a file of the name of the exercise instead of inside it. Basically a cascade of errors.

With the fail fast, I saw the error with configlet sync saying the exercise was missing from the config file.

@mpizenberg
Copy link
Member

Also there might be some issue in the escaping of strings for function parameters in the generated test file since when run with accumulate, I'm seeing this in the generated test file:

Accumulate.accumulate [ "a", "b", "c" ] "(x) => accumulate([" 1 ", " 2 ", " 3 "], (y) => x + y))"
    |> Expect.equal [ [ "a1", "a2", "a3" ], [ "b1", "b2", "b3" ], [ "c1", "c2", "c3" ] ]

Where the " inside the accumulate expression are not escaped and thus split the string into multiple strings.

@mpizenberg
Copy link
Member

I've also started / made an attempt at a JS equivalent of the elm script. Don't hesitate to have a look and to let me know what you think. I'll try to finish it tomorrow.

@jiegillet
Copy link
Contributor Author

what is the purpose of replaceReservedWord? I didn't get why there could be elm keywords in the canonical value decoded but I might see it from the wrong perspective.

It's for when variable names from the inputs are Elm reserved words. It actually happened to me during a test, I think the variable was called type or something and the file wouldn't compile/format. Those are printed in the stub file.

Also there might be some issue in the escaping of strings for function parameters in the generated test file since when run [...] Where the " inside the accumulate expression are not escaped and thus split the string into multiple strings.

Good catch!

@jiegillet
Copy link
Contributor Author

I've also started / made an attempt at a JS equivalent of the elm script. Don't hesitate to have a look and to let me know what you think. I'll try to finish it tomorrow.

I don't actually know JS (big gap in my knowledge, I'll fill it some time) so I won't be much help :)

@ceddlyburge
Copy link
Contributor

Ok, I've taken a look! Personally I am happy with either Javascript or Elm. There are pros and cons, but to me, it's not a huge deal either way.

For me, the pros of the Elm solution are: I enjoy writing Elm more than I like writing Javascript, the json parsing is more robust, and the types improve readability and error messages.

And I think the pros of the Javascript are: it is shorter (although not by that much, as Elm code is famous for large vertical spacing), JSON is Javascript so parsing it is easy, IO is easy, no extra step is required in the script.

So I would be happy with either, and will leave the decision to people that have a stronger opinion one way or the other ...

bin/generate.sh Outdated
@@ -0,0 +1,53 @@
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this to make it clear what is being generated?

@@ -0,0 +1,24 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this directory to make it clear what is being generated? And maybe add a readme if we feel that it is warranted? It could have some instructions for how to generate a new practice exercise from the canonical data for example.

@mpizenberg
Copy link
Member

Thanks @ceddlyburge . I'm slightly in favor of the JS version, but since this is @jiegillet proposition and he's certainly the one going to use it most, let's leave the decision to Jeremie then since you don't have preferences Cedd.

If we go with the Elm version, I'd like to give another review pass and try some refactoring (refactoring is my favorite hobby)

@jiegillet
Copy link
Contributor Author

Well if you leave it to me, I guess I'll pick the Elm, because I love Elm, I don't know JS (yet), I'm probably going to use it a lot (and most likely modify it), and finally who am I to deprive @mpizenberg from his favorite hobby?

I'll leave it in your hands then, refactor away and I'll review it.

@ceddlyburge
Copy link
Contributor

Refactoring is very satisfying :)

@mpizenberg
Copy link
Member

mpizenberg commented Apr 12, 2022

I've made the following changes to the Elm version:

  1. Replace resultType : ResultType by canError : Bool in the Function type. Generally it's better to use custom types over booleans, especially when these booleans are just identified by position in a list of arguments. But here, the boolean has a clear attribute name in a record, and the use of booleans enables simpler logic to combine them in the search function. As for the canError name. I thought it reflected better the actual data, because it might also be relevant to use Maybe instead of Result in some cases (maybe in the future).

  2. Use Dict.insert instead of Dict.update in the search function. The Dict.update has a rather cumbersome api, and it's usually simpler to just insert, with a default value that's the current one.

  3. Use multiline strings templates, with String.replace for printFunction and printTest as those are much easier to read.

  4. Delay the file path generation to the JS code. This enables doing it in an OS-independent way and help with windows compatibility if someone writes a windows equivalent to the bash script.

  5. Split the decoding and the string conversion of JSON values in two steps. This enables simpler code for both the decoder and the string conversion to Elm code.

@mpizenberg
Copy link
Member

There is one last thing I want to try is another custom type for canonical data.

@mpizenberg
Copy link
Member

Ok I only did a tiny custom type change for test cases since what I had in mind didn't pan out.

With the few changes I did regarding the decoding and string conversions, the order in which all functions are in the file may be a bit untidy, feel free to rearrange as you like. I'm done with the modifications I wanted to try, the ball is in your court now.

Copy link
Member

@mpizenberg mpizenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition to have a better guess at types! I think you could even do the same for the arguments types. Using multiple tests to guess better the types. Since tests usually start with trivial tests, it's highly probable that first tests will have arguments like [] : List todo or null : Maybe todo that could be refined with the other tests using the same function, as you do for the return type.

curl https://raw.githubusercontent.com/exercism/problem-specifications/main/exercises/${SLUG}/canonical-data.json \
| node generate_practice_exercise/src/cli.js $SLUG
elm-format --yes ${exercise_dir}/**/*.elm
elm-format --yes ${exercise_dir}/.meta/src/*.elm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't these filesd be matched by the previous line too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ** patern doesn't go in hidden directory to my knowledge so I think both are needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, when I do ** it doesn't match hidden directories. Maybe it's a config thing on my side, but I'd rather leave it this way if others are in the same case.

@jiegillet
Copy link
Contributor Author

Thank you for all the reviews, improvements all around.

I renamed the script and folder to generate_practice_exercise, removed the now redundant stub-new-exercise.sh, added type annotation, which JsonValue made possible (it's not perfect by any means but it's good enough for now), improved the bash script and added a README. I'll merge it as it is, I'll probably make some changes as I use it in the future.

This has been really, really fun, thank you all. Hopefully this is a useful tool.

@jiegillet jiegillet merged commit 1458dd4 into exercism:main Apr 13, 2022
@jiegillet jiegillet deleted the jie-generate branch April 13, 2022 11:34
@mpizenberg
Copy link
Member

Thank you! I hope I haven't been too picky even if you said it was fun, I know I can be a bit too much sometimes ^^.

PS, we can safely remove the generate.js in a following PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
x:module/generator Work on Exercise generators x:size/massive Massive amount of work x:type/coding Write code that is not student-facing content (e.g. test-runners, generators, but not exercises)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants