Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit doc string to plain string and custom literals that start with doc #11970

Closed
wants to merge 1 commit into from

Conversation

Keno
Copy link
Member

@Keno Keno commented Jul 1, 2015

Fixes #11968. cc @one-more-minute and @JeffBezanson

@pao pao added the docsystem The documentation building system label Jul 1, 2015
@ScottPJones
Copy link
Contributor

I don't think it should be doing anything at all with plain strings either, unless they are "void" (not used in any expression). I think this will cause breakage, for example if somebody is returning a string from a method.

@Keno
Copy link
Member Author

Keno commented Jul 2, 2015

@one-more-minute Thoughts?

@StefanKarpinski
Copy link
Member

@Keno, what's the idea here? Why limit them like this? The main motivation I can think of is wanting the $ to be for LaTeX in docstrings instead of interpolation.

@Keno
Copy link
Member Author

Keno commented Jul 3, 2015

Because otherwise using custom string literals becomes an absolute nightmare for anything other than documentation. There are a number good reasons to use custom string literals that are not for documentation (in particular Cxx.jl, but I'm sure there are others). $ can still be used for latex, you'll just have to call the macro doctex instead of tex for that. Alternatively, I'd be have with an alternative syntax that does have the doc properties ``` maybe, but I'm not sure you want to open that can of worms.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 3, 2015

I know this would be hard to implement in the parser (as it is now) but and we evaluate the string macro first and only treat it as doc string if it is of certain types? (and let all doc string types inherit that type)

@yuyichao
Copy link
Contributor

yuyichao commented Jul 3, 2015

IIRC, @one-more-minute was showing using text_str and some other string macros as doc strings, which IMHO is quite useful. (edit: and they don't start with doc but returns a object that makes sense to be a doc string)

@Keno
Copy link
Member Author

Keno commented Jul 3, 2015

There's no problem in adding an doc* version for any of those macros, the only problem would be the three extra characters. Alternatively, we could leave the parsing but change the doc macro to require registration for your string macro and if it's not one of the registered ones, just emit the expression as normal.

@Keno
Copy link
Member Author

Keno commented Jul 3, 2015

(though that would still leave the repl pasting issue, which I guess I'm ok with).

@yuyichao
Copy link
Contributor

yuyichao commented Jul 3, 2015

Alternatively, we could leave the parsing but change the doc macro to require registration for your string macro and if it's not one of the registered ones, just emit the expression as normal.

I like this one (if implementing it is not overly complicated). I was just thinking about registering the type of the expension but registering the macro sounds better.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 3, 2015

(though that would still leave the repl pasting issue, which I guess I'm ok with).

Why? (If the @doc macro is just no-op in that case.)

@yuyichao
Copy link
Contributor

yuyichao commented Jul 3, 2015

Actually thinking about it more, starting with doc is basically "registering itself" and the approach here doesn't sounds too bad anymore, especially because it is not that easy (AFAIK) to figure out where a macro is coming from.

I would slightly prefer limiting it to doc or starting with doc_ to prevent accidental clash and to make the name of the doc version more readable but the idea here sounds good to me.

@hayd
Copy link
Member

hayd commented Jul 3, 2015

+1 to registering rather than checking for containing/starting with "doc",

Part of the reason was for this to allow the rst macro (or other doc system). Having to write doc_rst or doc_md everywhere seems unnecessarily long-winded (noting the push back on writing doc"...").

An isdoc on the type / macro (return true for plain strings, md, rst etc.) seems very reasonable.

@MichaelHatherly
Copy link
Member

+1 for this.

How about only having @doc_str (and plain strings) and specifying the type/format with a postfix argument?

doc"""
...
"""rst
f(x) = x

@doc_str could then dispatch to some user-extensible method such as parsedoc(::Format"rst", ...), in a similar way to writemime and @MIME_str, for parsing the docstring.

@StefanKarpinski
Copy link
Member

+1 to that, @MichaelHatherly

@Keno
Copy link
Member Author

Keno commented Jul 4, 2015

Yes, that sounds like a good solution.

@hayd
Copy link
Member

hayd commented Jul 4, 2015

specifying the type/format with a postfix argument?

This seems awful to me. Please can we illicit more feedback before rolling with this.

@MichaelHatherly
Copy link
Member

This seems awful to me.

@hayd, any particular reasons why that would be an awful approach?

@tkelman
Copy link
Contributor

tkelman commented Jul 4, 2015

Do we have support for postfix modifiers on custom string literals? How does that work, or does it need to be implemented in the parser first?

@MichaelHatherly
Copy link
Member

Do we have support for postfix modifiers on custom string literals?

In the same way as r"" handles them I guess?

macro doc_str(text, format...)
    format = isempty(format) ? "md" : format[1]
    ...
end

@yuyichao
Copy link
Contributor

yuyichao commented Jul 4, 2015

Do we have support for postfix modifiers on custom string literals?

I didn't know that we do until I tried it out either.

julia> Meta.show_sexpr(:(a"b"c))
(:macrocall, symbol("@a_str"), "b", "c")

I think this is what makes the suffix a great idea.

@MichaelHatherly
Copy link
Member

With master and this branch the parser isn't transforming @doc_str when it has a suffix:

julia> """
       module M

       doc"..."foo
       f(x) =x

       end
       """ |> parse
:(module M
    eval(x) = begin  # none, line 1:
            top(Core).eval(M,x)
        end
    eval(m,x) = begin  # none, line 1:
            top(Core).eval(m,x)
        end # line 3:
    @doc_str "..." "foo" # line 4:
    f(x) = begin  # none, line 4:
            x
        end
    end)

Is that just an oversight?

@yuyichao
Copy link
Contributor

yuyichao commented Jul 4, 2015

Is that just an oversight?

The parsing should be fixed by this. However, the suffix seems to be used for "flavor" in the current @doc_str. Is it necessary and can it be specified in a different way?

diff --git a/src/julia-parser.scm b/src/julia-parser.scm
index d20c92d..12f2ca8 100644
--- a/src/julia-parser.scm
+++ b/src/julia-parser.scm
@@ -2068,7 +2068,8 @@

 (define (any-string-literal? e)
   (or (simple-string-literal? e)
-      (and (length= e 3) (eq? (car e) 'macrocall)
+      (and (or (length= e 3) (length= e 4))
+           (eq? (car e) 'macrocall)
           (simple-string-literal? (caddr e))
           (let ((mname (string (cadr e))))
             (equal? (string.sub mname (string.dec mname (length mname) 4))
julia> Meta.show_sexpr(quote
       baremodule A
       doc"""
       <doc>
       """md
       f() = 1
       end
       end)
(:block,
  (:line, 2, :none),
  (:module, false, :A, (:block,
      :( # line 3:),
      (:macrocall, (:., :Base, (:quote, #QuoteNode
            symbol("@doc")
          )), (:macrocall, symbol("@doc_str"), "<doc>\n", "md"), (:(=), (:call, :f), (:block,
            (:line, 6, :none),
            1
          )))
    ))
)

@hayd
Copy link
Member

hayd commented Jul 5, 2015

The reason I don't like it, is that it's noisy (and a little because the noise is both at the start and the end, when you expect it at the beginning - if you have a long docstring this'll be especially annoying). This noise was an original concern when writing doc".." everywhere, SO plain string literals were decided to work and parse as markdown.

If you're postfixing, to label the format/flavour, will it work without the doc?

"""
docstring with ``rst``
"""rst
function f() end

@yuyichao
Copy link
Contributor

yuyichao commented Jul 5, 2015

If you're postfixing, to label the format/flavour, will it work without the doc?

No because it won't be parsed as string macro

The reason I don't like it, is that it's noisy (and a little because the noise is both at the start and the end, when you expect it at the beginning - if you have a long docstring this'll be especially annoying).

Maybe a little but shouldn't be too bad. At least this is still standard julia syntax.

Also since plain string works as md, I expect most people should just use that and non-md format would only be needed in special cases (very long ones for module maby?)

@MichaelHatherly
Copy link
Member

is that it's noisy

doc_rst"..." is also noisy. The lack of an underscore in doc"..."rst seems less noisy to me. For long docstrings the format is going to be clear, hopefully, based solely on the syntax used without a need to look at the suffix.

Is it necessary and can it be specified in a different way?

It's not necessary if we only ever support markdown. Other than these three ways:

  • with different string macros as is currently allowed
  • internally in the docstring text
  • with a suffix

I'm not sure there's any other syntax that could be used to specify the format.

@kmsquire
Copy link
Member

kmsquire commented Jul 5, 2015

Just a question, if we're already parsing plain text, how much more work would it be to also parse

rst"""
*ReStrucuturedText*
"""

and

md"""
Markdown
========
"""

?

Could these even be changed into doc"""..."""rst and doc"""..."""md (via macro or parser)?

@yuyichao
Copy link
Contributor

yuyichao commented Jul 5, 2015

@kmsquire I don't quite understand your question.

We already parse

rst"""
*ReStructuredText*
"""
f() = 1

And similar stuff as doc string. However, this means that string macros that should never be used as doc string (@cxx_str for @Keno I guess?) will also be parsed as doc string if they appears in the right position (especially since they may have side effects).

With my patch above #11970 (comment), we would also parse doc"..."rst as doc string. The only issue would be @doc_str uses the second parameter as the flavor so we might need to change that to support other format.

There's two ways I can think about right now.

  1. Use doc"<...>"md_github as md"<...>"github and similarly for other flavors then we don't need to worry about conflicting with other formats
  2. Keep using doc"<...>"github as md"<...>"github, i.e. just treat different flavors of md as a different format. This might not be introducing too much conflict either....

@kmsquire
Copy link
Member

kmsquire commented Jul 5, 2015

Got it--thanks for the explanation, @yuyichao!

@MikeInnes
Copy link
Member

I'm definitely sympathetic to Keno's use case here and, since we're pushing for markdown docs anyway, I'm not vehemently opposed to making custom doc strings a little noisier. That said, it's worth considering the other option of making string macro code, as opposed to docs, more explicit.

Both of the following should work already:

sql"CREATE TABLE foo";

or

begin
cxx"""
blah blah blah
"""
end

The first one especially doesn't seem so bad, and it might be a good thing that these are very clearly running as code.

Going the other way, the other thing we could do is set doc syntax for the current module, e.g.

module Foo
@docsyntax rst

"""
foo bar
"""
foo() = ...

end

The @docsyntax macro tells the doc parser to dispatch to @doc_rst so that all doc strings (including unadorned ones) are parsed as RST, and this work on a per-module/per-file basis. This actually makes a lot of sense to me, since it's not like you're going to be mixing formats within a file all the time.

@MichaelHatherly
Copy link
Member

@docsyntax

That's a nice approach, +1.

since it's not like you're going to be mixing formats within a file all the time

Disallowing that might be a good thing.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 5, 2015

Not sure if I agree with disallowing it but setting module/file level syntax should make most cases less noisy.

@hayd
Copy link
Member

hayd commented Jul 5, 2015

Is another option was to have some isdoc function on string macros (by default false), which if true allows it to be parsed as a docstring? That way cxx".." won't parse as a docstring, but rst".." will (if isdoc(rst_str) == true or similar). ?

@MikeInnes
Copy link
Member

That's not ideal because then the outcome of parsing depends on hidden state, and you can't fully parse a Julia program without also evaluating it. It's probably not a disaster but keeping the parser simple is really helpful for tooling. c.f. this Perl disaster.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 5, 2015

It's not really necessary to change the result of parsing. I suppose what we can do is

  1. Parse the doc string as is (possibly allow suffix in string macro)

    Edit: and only allow plan string and @doc_str string as doc string.

  2. Record the module level default format as well as all doc strings in the metadata, e.g. for @doc "<...>" obj the record for obj would be str="<...>", fmt=DocFormat{}, for @doc doc"<...>"rst obj the record for obj would be str="<...>", fmt="rst".

The help system then use these info to call the corresponding conversion function.

This way the parsing should not depend on any magical state. On possible concern is that whoever read the code would have to find the place where the global doc format is defined and it would be slightly harder to move doc string arround (still possible by explicitly specify the format as suffix).

@Keno
Copy link
Member Author

Keno commented Jul 5, 2015

How about instead of parsing everything to @doc str expr, we parse string prefixes to @doc_md str expr. That way we can maintain scoping and behavior can be user defined.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 5, 2015

How about instead of parsing everything to @doc str expr, we parse string prefixes to @doc_md str expr. That way we can maintain scoping and behavior can be user defined.

What about doc string without the @doc. Turn doc"<...>"md expr into @doc_md "<...>" expr? I feel like that's not really better than just parse the suffix.

@Keno
Copy link
Member Author

Keno commented Jul 5, 2015

What I meant was to translate md"str" expr into @doc_md "str" expr in the parser instead of (as currently is the case) @doc md"str" expr

@JeffBezanson
Copy link
Member

I would be ok with only doing the special parsing for plain strings and doc" " strings. Alternate formats can use @doc directly. Honestly, I think allowing many formats is not so important, and if necessary can be handled purely as a presentation issue. For example we could use syntax like

"""format:rst
f(x)
"""
f(x) = x

@StefanKarpinski
Copy link
Member

I have to agree that I don't find supporting multiple documentation markup formats to be very important. In fact, I'd say that it's a positive thing to force people as much as possible to use a common format and make sure that one format is good enough for all realistic use cases. I know some people disagree, but maybe it's my inner dictatorial tendencies showing themselves. Make a good decision and go with it.

@ssfrr
Copy link
Contributor

ssfrr commented Jul 11, 2015

BOFL: Benevolent Oligarchy For Life

@hayd
Copy link
Member

hayd commented Jul 13, 2015

I guess there's a pragmatic argument: people may already have documentation in other formats kicking about... like julia's own documentation (that's in rst).

@JeffBezanson
Copy link
Member

@one-more-minute Does that sound ok?

@MikeInnes
Copy link
Member

Absolutely, I'm very much on board with what everyone else has said. Since it looks like everyone agrees I'll merge. Whoops, scratch that, looks like it needs a rebase.

@JeffBezanson
Copy link
Member

Ok, since it's such a small change I'll just re-do this.

@DilumAluthge DilumAluthge deleted the kf/limitdoc branch March 25, 2021 22:12
@DilumAluthge DilumAluthge restored the kf/limitdoc branch March 25, 2021 22:12
@DilumAluthge DilumAluthge deleted the kf/limitdoc branch March 25, 2021 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docsystem The documentation building system
Projects
None yet
Development

Successfully merging this pull request may close these issues.