Dune & Jbuild validation for atoms #891
src/usexp/atom.ml
Outdated
  let rec jbuild s i len =
    i = len ||
    match String.unsafe_get s i with
    | '"' | '(' | ')' | ';' | '\000'..'\032' | '\127'..'\255' -> false
We need to get back the old function here, the one that was checking for |# and #| as well.
I think we were only doing this check on the lexer side before, so the
lexer would actually accept only a subset of the symbols that were is_valid.
In any case, I agree that we should harmonize the two. Should I do it in
this PR?
…On Tue, Jun 19, 2018, 4:26 PM Jérémie Dimino ***@***.***> wrote:
***@***.**** requested changes on this pull request.
------------------------------
In src/usexp/atom.ml
<#891 (comment)>:
> @@ -0,0 +1,55 @@
+type t = A of string [@@unboxed]
+
+let invalid_argf fmt = Printf.ksprintf invalid_arg fmt
+
+type syntax = Jbuild | Dune
+
+let string_of_syntax = function
+  | Jbuild -> "jbuild"
+  | Dune -> "dune"
+
+let (is_valid_jbuild, is_valid_dune) =
+  let rec jbuild s i len =
+    i = len ||
+    match String.unsafe_get s i with
+    | '"' | '(' | ')' | ';' | '\000'..'\032' | '\127'..'\255' -> false
We need to get back the old function here. The one that was checking for
|# and #| as well
Yes, we might as well do it now.
ea04cad
to
4e0e165
Compare
@diml I improved the checks in the jbuild atom validator.
test/unit-tests/sexp.mlt
Outdated
@@ -140,7 +140,7 @@ parse {|"$bar%foo%"|}

parse {|\%{foo}|}
[%%expect{|
- : parse_result = Same (Ok [\%{foo}])
Exception: Invalid_argument "'\\%{foo}' is not a valid dune atom".
Could you also modify parse so that Invalid_argument exceptions are captured? That way we can see whether the behavior is different or not.
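One way to make that comparison possible is to turn the exception into a value before the result is printed. The sketch below is only an illustration: `capture` is a hypothetical helper name, and `parse_atom` is a toy stand-in for the real parser, neither of which comes from the PR.

```ocaml
(* Hypothetical helper: run [f x] and reflect [Invalid_argument] as an
   [Error], so the outcome can be compared and printed like any value. *)
let capture f x =
  match f x with
  | v -> Ok v
  | exception Invalid_argument msg -> Error msg

(* Toy stand-in for the real parser, for illustration only. *)
let parse_atom s =
  if s = "" then invalid_arg "empty atom" else s

let () =
  assert (capture parse_atom "foo" = Ok "foo");
  assert (capture parse_atom "" = Error "empty atom")
```

With a wrapper of this kind, a test can report the two syntaxes as the same or different even when one of them raises.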
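BTW, following the conversation on slack, I assumed that of_string wouldn't do the validation anymore and it would be done in the pretty-printer. In particular we shouldn't need to pass Dune or Jbuild to Atom.of_string.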
Yup. Will remove that extra validation.
…On Tue, Jun 19, 2018, 10:35 PM Jérémie Dimino ***@***.***> wrote:
BTW, following the conversation on slack, I assumed that of_string
wouldn't do the validation anymore and it would be done in the
pretty-printer. In particular we shouldn't need to pass Dune or Jbuild to
Atom.of_string.
958ac68
to
260591c
Compare
@diml this is now rebased on master. PS, we should probably get rid of Invalid_argument in the sexp library as well, but for that we'll need to invert the dependency of stdlib and usexp first.
@diml that failure from camomile is concerning:
Is it because we aren't setting the version properly in |
Wondering how we should version include. I suppose we could just make it use the same syntax as the file it's used in.
src/usexp/atom.ml
Outdated
    match String.unsafe_get s i with
    | '#' -> disallow_next '|' s (i + 1) len
    | '|' -> disallow_next '#' s (i + 1) len
    | '"' | '(' | ')' | ';' | '\000'..'\032' | '\127'..'\255' -> false
Some of these characters were allowed in jbuild syntax. Let's just restore the old function:
let is_valid str =
  let len = String.length str in
  len > 0 &&
  let rec loop ix =
    match str.[ix] with
    | '"' | '(' | ')' | ';' -> true
    | '|' -> ix > 0 && let next = ix - 1 in str.[next] = '#' || loop next
    | '#' -> ix > 0 && let next = ix - 1 in str.[next] = '|' || loop next
    | ' ' | '\t' | '\n' | '\012' | '\r' -> true
    | _ -> ix > 0 && loop (ix - 1)
  in
  not (loop (len - 1))
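To make the behaviour of the function above concrete, here is a small self-contained sketch that repeats the definition and checks a handful of inputs. The sample strings are illustrative choices, not taken from the dune test suite.

```ocaml
(* Same definition as quoted above: [loop] scans right to left and returns
   [true] as soon as it finds something that cannot appear in an atom. *)
let is_valid str =
  let len = String.length str in
  len > 0 &&
  let rec loop ix =
    match str.[ix] with
    | '"' | '(' | ')' | ';' -> true
    | '|' -> ix > 0 && let next = ix - 1 in str.[next] = '#' || loop next
    | '#' -> ix > 0 && let next = ix - 1 in str.[next] = '|' || loop next
    | ' ' | '\t' | '\n' | '\012' | '\r' -> true
    | _ -> ix > 0 && loop (ix - 1)
  in
  not (loop (len - 1))

let () =
  assert (is_valid "foo");
  assert (is_valid "a#b");          (* a lone '#' is accepted *)
  assert (is_valid "caf\195\169");  (* bytes above '\127' are accepted *)
  assert (not (is_valid ""));
  assert (not (is_valid "a b"));
  assert (not (is_valid "a|#b"));   (* block-comment delimiters are rejected *)
  assert (not (is_valid "a#|b"))
```

Note that this version accepts bytes in the range '\127'..'\255', which the validator under review rejects.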
The old function was this:
let is_valid =
  let rec loop s i len =
    i = len ||
    match String.unsafe_get s i with
    | '"' | '(' | ')' | ';' | '\000'..'\032' | '\127'..'\255' -> false
    | _ -> loop s (i + 1) len
  in
  fun s ->
    let len = String.length s in
    len > 0 && loop s 0 len
I'll use your definition however.
I meant from before #837
src/usexp/atom.mli
Outdated
val of_string : string -> t

val to_string : t -> syntax -> string
I don't think to_string should do the validation. to_string should just return the contents of the atom. For instance I think that's why the build of camomile is broken in travis.
The validation should be done by a separate printing function.
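A minimal sketch of that split, under stated assumptions: `print` is a hypothetical name for the separate printing function, and `is_valid` is a placeholder check rather than the real per-syntax rules discussed in this thread. Only the type definitions and `string_of_syntax` mirror the diff quoted earlier.

```ocaml
type syntax = Jbuild | Dune

let string_of_syntax = function
  | Jbuild -> "jbuild"
  | Dune -> "dune"

type t = A of string [@@unboxed]

let of_string s = A s

(* [to_string] never fails: it only unwraps the atom. *)
let to_string (A s) = s

(* Placeholder validity check, identical for both syntaxes here. *)
let is_valid (_ : syntax) s =
  s <> "" && not (String.contains s ' ') && not (String.contains s '(')

(* Hypothetical printing function: validation happens at print time. *)
let print (A s as atom) syntax =
  if is_valid syntax s then to_string atom
  else
    invalid_arg
      (Printf.sprintf "atom '%s' cannot be in %s syntax" s
         (string_of_syntax syntax))
```

With this shape, callers that only need the raw contents never hit the validator, and only code that writes an atom back out in a given syntax can raise.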
test/unit-tests/sexp_tests.ml
Outdated
@@ -23,13 +23,14 @@ let () =
    | Atom _ -> true
    | _ -> false
  in
  if Usexp.Atom.is_valid s <> parser_recognizes_as_atom then begin
  let valid_dune_atom = Usexp.Atom.is_valid_dune s in
  if valid_dune_atom <> parser_recognizes_as_atom then begin
Could you update this test so that it runs for both the jbuild and Dune syntax? For instance it would have caught the issue related to is_valid_jbuild
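As a rough illustration of such a test, the sketch below runs the same agreement check for every syntax. Everything in it is a toy: `is_valid` and `parser_recognizes_as_atom` are stand-ins written for this example, and the jbuild validator is made deliberately stricter than the "parser" so that the check has something to report.

```ocaml
type syntax = Jbuild | Dune

let string_of_syntax = function
  | Jbuild -> "jbuild"
  | Dune -> "dune"

(* True if [s] contains a byte in the range '\127'..'\255'. *)
let has_high_byte s =
  let found = ref false in
  String.iter (fun c -> if Char.code c >= 127 then found := true) s;
  !found

(* Toy validator: rejects high bytes in jbuild syntax only. *)
let is_valid syntax s =
  s <> "" && not (String.contains s ' ')
  && (match syntax with
      | Jbuild -> not (has_high_byte s)
      | Dune -> true)

(* Toy stand-in for "the parser reads [s] back as a single atom". *)
let parser_recognizes_as_atom (_ : syntax) s =
  s <> "" && not (String.contains s ' ')

(* Return every (syntax, input) pair on which the two disagree. *)
let disagreements inputs =
  List.concat
    (List.map
       (fun syntax ->
          List.filter
            (fun s -> is_valid syntax s <> parser_recognizes_as_atom syntax s)
            inputs
          |> List.map (fun s -> (string_of_syntax syntax, s)))
       [ Jbuild; Dune ])
```

Looping over both syntaxes is the point here: a check that only exercised the Dune validator would return an empty list for these toy definitions.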
The camomile failure is because |
src/usexp/usexp.mli
Outdated
@@ -7,20 +7,18 @@ module Atom : sig
  (** Acceptable atoms are composed of chars in the range ['!' .. '~'] excluding
      [' ' '"' '(' ')' ';' '\\'], and must be nonempty. *)
This comment is no longer valid BTW, we can just remove it.
e3acf92
to
5771a15
Compare
test/unit-tests/sexp.mlt
Outdated
Different
 {jbuild =
   Ok
    [<printer pp_sexp_ast raised an exception: Invalid_argument("atom '\\%{foo}' cannot be in dune syntax")>];
Maybe we should do #remove_printer pp_sexp_ast before these tests; it took me a while to understand what was going on here.
Hmm, I think we should print the atoms unambiguously instead. Otherwise it sucks that we can't pp some of these ASTs for debugging.
The last commit prints Sexp ast in a clearer way for debugging. Atoms are expanded out using an |
Atoms can now be constructed and pretty printed with a syntax = Jbuild | Dune. The syntax controls validation that will be used to make sure we are printing something/reading valid
Signed-off-by: Rudi Grinberg <rudi.grinberg@gmail.com>
And make the tests reflect back Invalid_argument
28100dd
to
53d9c64
Compare
Atoms can now be constructed and pretty-printed with a syntax = Jbuild | Dune.
The syntax controls the validation used to make sure that whatever we print
or read is valid in that syntax.
An issue here is that I'm hard coding the dune syntax in a couple of places where we should probably be flexible - like the sub systems file. But it seems to work.