RFC: Documentation System #8588

MikeInnes · 2014-10-05T18:12:02Z

Ok, so this has ended up evolving into a full doc system. Functions and macros can be documented with

@doc """
Description here
""" ->
function foo() ...

(prettier syntax forthcoming)

Documenting multiple methods of a function will concatenate the doc strings together, in the order the methods were originally defined.

By default @doc treats unadorned strings as equivalent to md"" (i.e. markdown syntax) but you can actually put anything before the arrow to associate it with the function.

Docs can be queried with e.g.

doc(Docs.doc)
@doc Docs.doc
@doc @doc # For macros
@doc text""

(and you can try these out on this branch)

Original: Here's a first pass at getting Markdown.jl into Base. This will allow us to display Julia's doc strings in the terminal and rich environments; see the readme for some examples.

MikeInnes · 2014-10-05T19:06:22Z

Ok, I just implemented a tiny metadata system, so that you can now do this:

At the moment, the actual metadata part (docs.jl) is all but trivial. This is by design; it's the simplest thing that works. My thinking is that once this is in, we can start using it and iterate from there. I believe @MichaelHatherly has some good ideas for future developments. Yeah that's not really true anymore

johnmyleswhite · 2014-10-05T19:10:51Z

This is awesome. :)

MichaelHatherly · 2014-10-05T19:36:23Z

Looks good! It might be worth having const META = Dict{Module, Dict}() to avoid clashes between keys once we allow documenting macros (which would use a Symbol as their key presumably). But that's not really relevant right now, since this doesn't seem to support documenting macros?

IainNZ · 2014-10-05T21:27:12Z

MikeInnes · 2014-10-05T23:02:17Z

@MichaelHatherly You're right, no support for macros just yet, although that's definitely something to look into next.

Unfortunately, while we can do hacks with symbols etc., I really think the only way to do it robustly is to have a way to reference a macro as a first-class object. Otherwise you have to hack around trying to figure out which module the symbol you're looking at came from. I'll open an issue on this at some point.

JeffBezanson · 2014-10-05T23:16:01Z

There is an expander function for each macro that you can access by
manually looking up its @ name in its module.
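
(A minimal sketch of that lookup, assuming a current Julia release rather than the 2014 branch: a macro's expander is an ordinary binding whose name starts with @, so getfield on the defining module returns it as a function.)

```julia
# Look up the expander for Base's @time macro by its "@"-prefixed name.
# `expander` is just an illustrative variable name.
expander = getfield(Base, Symbol("@time"))

# The expander is a regular function object, so it can serve as a dictionary key.
println(expander isa Function)
```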

MichaelHatherly · 2014-10-06T04:46:25Z

So macros are kind of first-class? That's cool! There's also the question of documenting globals -- is there anything other than symbols we could use to reference those?

staticfloat · 2014-10-06T05:30:36Z

ViralBShah · 2014-10-06T07:08:12Z

+Inf

tonyhffong · 2014-10-06T09:24:20Z

Sweet.

MikeInnes · 2014-10-06T11:17:01Z

@JeffBezanson so you can – that's awesome! I'll integrate this.

Turns out this works if you just evaluate the symbol, too – I can't believe I never tried that before.

Documenting globals is probably a lot less important, but if there's a simple way to ask "what module is this symbol imported from, if any?" I could integrate it reasonably easily.

If not, perhaps I should just knock something up with jl_module_usings and names. Although @MichaelHatherly's suggestion of having a reference would be ideal.

quinnj · 2014-10-06T11:36:39Z

Here's a probably-too-early feature request. Can the function/macro/type body source code following the @doc be automatically inserted as a meta item in its docs? One could then see the source code of pretty much any object by doing @source f(x) or something equivalent. This would drastically simplify a long-outstanding issue's (#2625) implementation, and an object's source code can be seen as very much a part of its documentation. This could perhaps be automatic for all objects, with or without an explicit @doc, as a sort of default documentation.

MikeInnes · 2014-10-06T12:02:53Z

@quinnj I could probably knock something up if it's useful, although it wouldn't be well integrated with the documentation system at this stage since there's no support for documenting specific methods. Though I've had some ideas for method-specific documentation which I'll have a go at implementing.

quinnj · 2014-10-06T12:15:37Z

What's the difficulty with method-specific docs? Knowing what to store as the key for the method? Or how to call it?

MikeInnes · 2014-10-06T12:26:46Z

(1) is knowing what key to store for the method. I'm just going to try using the type signature. AFAIK method objects themselves are replaced, which means they can't be used as a key (or redefining functions interactively would do horrible things).

(2) is knowing what to do when help(f) is called. What's the correct order of methods? How are doc objects combined? My plan is just to call catdoc(meta...) with metadata in the order the methods were defined, and provide a default implementation for Markdown objects (or default to an array).

That said, unless I can get this working fairly easily I think anything which doesn't actively block this being useful should be considered OT for the sake of this PR. We can live without method-specific docs for a while, but we really don't want this to be paralyzed any further. I'll give it a go though, anyway.
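
(A minimal sketch of points (1) and (2), written for current Julia and not taken from the PR's implementation: docs are keyed by a method's argument-type signature, re-documenting a signature overwrites in place, and lookup concatenates in first-definition order. DocStore, adddoc! and getdoc are hypothetical names; catdoc follows the name used above, but joining strings with a blank line is an assumption.)

```julia
# Hypothetical store for one function's docs, keyed by argument-type signature.
struct DocStore
    order::Vector{Any}        # signatures, in the order they were first documented
    docs::Dict{Any,String}    # signature => doc text
end
DocStore() = DocStore(Any[], Dict{Any,String}())

# Add or replace the doc for one signature; redefinition keeps the original position.
function adddoc!(store::DocStore, sig::Type, text::AbstractString)
    haskey(store.docs, sig) || push!(store.order, sig)
    store.docs[sig] = text
    return store
end

# Assumed default combination rule: plain strings joined by a blank line.
catdoc(parts::AbstractString...) = join(parts, "\n\n")

# Combine all method docs in first-definition order.
getdoc(store::DocStore) = catdoc((store.docs[s] for s in store.order)...)

store = DocStore()
adddoc!(store, Tuple{Int}, "foo for integers")
adddoc!(store, Tuple{String}, "foo for strings")
adddoc!(store, Tuple{Int}, "foo for integers (revised)")  # interactive redefinition
println(getdoc(store))
```

Keying on the signature rather than on a method object means that redefining a method interactively updates its doc instead of appending a duplicate.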

MichaelHatherly · 2014-10-06T13:35:03Z

I'd not come across issues with Method objects being replaced -- or perhaps just hadn't noticed :) I'll keep an eye out for any issues related to that in Docile.

A nice way I found for displaying docs for specific method signatures is to have @help behave like @which, so @help foobar(1, "2") would display docs for method foobar(::Int, ::String).
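
(A small sketch of the @which-style resolution this relies on, using the which function from current Julia; foobar and its bodies are made up for illustration. The resolved method's signature is the kind of key a per-method doc lookup could use.)

```julia
# Two illustrative methods of a made-up function.
foobar(x::Int, y::String) = "int/string method"
foobar(x::Float64) = "float method"

# Resolve the method that a call like foobar(1, "2") would dispatch to.
m = which(foobar, (Int, String))

# The method's signature identifies it uniquely among foobar's methods.
println(m.sig)
```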

MikeInnes · 2014-10-06T13:42:54Z

Ok, just implemented macro support, as well as having the @doc macro do querying, e.g. @doc @time.

@MichaelHatherly Truth be told I haven't looked into it properly, but I'll take a look now. And for querying method docs, I was just thinking the same thing when implementing @doc ;) If I can get methods working I'll throw that in too.

MikeInnes · 2014-10-06T14:38:58Z

Yes, you're right, Method objects aren't themselves replaced. Only problem is your current way of figuring out which method was defined doesn't allow docs to be redefined for a specific method, but I'm pretty sure I can work around that.

MichaelHatherly · 2014-10-06T14:55:37Z

Yeah, it's not totally ideal. Other things can be overwritten; I think it's just Methods that can't be. Probably due to dealing with optional arguments in signatures, since they produce more than one Method object per definition.
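
(To illustrate the optional-argument point in current Julia: one definition with a default value shows up as two methods. greet is a made-up example.)

```julia
# A single definition with one optional argument...
greet(name, punctuation = "!") = string("Hello, ", name, punctuation)

# ...is lowered to two methods: greet(name) and greet(name, punctuation).
println(length(methods(greet)))
```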

MikeInnes · 2014-10-06T16:21:30Z

@MichaelHatherly That's a very good point, I almost forgot about that. Let me know what you think of this implementation.

Supporting methods turned out not to be so hard, so I've added that on. I'll update the original PR text.

hayd · 2014-10-22T22:28:24Z

In principle you can attach any data you want as documentation,

Where does this sit with @stevengj's additional metadata proposal? In #8514 (comment):

"a text/plain comment"
f(x) = x

md"A *Markdown* comment."
g(x) = x

doc md"A *Markdown* comment with metadata" { :section => "Math", :subsection => "Special functions" }
besselj(m,x) = ...

const specfuns = { :section => "Math", :subsection => "Special functions" }
doc md"Another *Markdown* comment with predefined metadata." specfuns
bessely(m,x) = ...

MikeInnes · 2014-10-22T22:37:07Z

There's no syntactic support for that kind of metadata yet, but you can always define an object that displays a docstring and stores custom metadata. That's the advantage of storing arbitrary objects, of course. We can always add syntax later if that becomes a commonly used feature, though for now I think it's best to get the 99% use case working well.

stevengj · 2014-10-23T14:15:58Z

I agree with @one-more-minute. I no longer think we want a special syntax for metadata. Since we support arbitrary documentation objects, we can just do:

@doc MetaDoc(md"....docs....", :author=>"Me", :section=>"Foo functions", ...)

where MetaDoc is a wrapper around another documentation object that passes through writemime requests. This is hardly any more typing than any "sugar" variant.
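
(A rough sketch of such a wrapper for current Julia, where the display hook is show(io, mime, x) rather than the 2014-era writemime. The field names and the pair-accepting constructor are assumptions, and a plain string stands in for the md"..." object.)

```julia
# Hypothetical wrapper: any documentation object plus free-form metadata.
struct MetaDoc{T}
    doc::T
    meta::Dict{Symbol,Any}
end

# Accept metadata as trailing pairs, mirroring the call shape proposed above.
MetaDoc(doc, kv::Pair{Symbol}...) = MetaDoc(doc, Dict{Symbol,Any}(kv...))

# Pass display requests straight through to the wrapped documentation object.
Base.show(io::IO, mime::MIME"text/plain", d::MetaDoc) = show(io, mime, d.doc)

d = MetaDoc("....docs....", :author => "Me", :section => "Foo functions")
println(d.meta[:section])
```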

stevengj · 2014-10-23T14:27:09Z

Regarding the interpolation, how does that relate to the discussion in MichaelHatherly/Docile.jl#29? We want to easily allow documentation that includes $ (both for LaTeX equations and for Julia code samples), and having a special flavor of Markdown that parses $ as interpolated Julia code seems at odds with this.

stevengj · 2014-10-23T14:30:49Z

base/markdown/Markdown.jl

+license(pkg::Module; flavor = github) = license(string(pkg), flavor = flavor)
+
+function mdexpr(s, flavor = :julia)
+  md = parse(s, flavor = symbol(flavor))


Since md_str calls mdexpr, and mdexpr calls parse, this sounds like it will parse all Markdown documentation strings when a module is loaded. As discussed in MichaelHatherly/Docile.jl#29, however, we almost certainly want something that initially stores just the raw text/markdown string and parses lazily.
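
(A minimal sketch of the lazy approach, for current Julia: keep the raw string and parse on first request. LazyDoc, parsed! and toyparser are hypothetical names, and the toy parser just splits on whitespace so that the example does not depend on any Markdown package.)

```julia
# Hypothetical lazy doc: holds the raw text and caches the parsed form.
mutable struct LazyDoc
    raw::String
    parsed::Any   # `nothing` until the doc is first requested
end
LazyDoc(raw::AbstractString) = LazyDoc(String(raw), nothing)

# Parse on first access only; later calls return the cached result.
function parsed!(d::LazyDoc, parser)
    if d.parsed === nothing
        d.parsed = parser(d.raw)
    end
    return d.parsed
end

# Stand-in parser that counts how often it runs.
calls = Ref(0)
toyparser(s) = (calls[] += 1; split(s))

d = LazyDoc("A *Markdown* comment.")
parsed!(d, toyparser)
parsed!(d, toyparser)
println(calls[])  # the parser ran once despite two requests
```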

MikeInnes · 2014-10-23

Is Markdown.jl performing poorly enough to impact load times for you? I'd love to see the benchmarks that led you to that conclusion, since I'm not able to reproduce.

Base currently contains 1296 doc strings with an average of 4 non-blank lines. Let's be optimistic and imagine that Markdown.jl's elegant output inspires a cascade of flowing prose from Julians everywhere, such that Base's docstring count quadruples overnight and each one is ~10 lines long (including code samples, headers etc. of course).

On my machine, parsing 100,000 such docstrings takes all of 7.43 seconds – in other words, I can parse 13,460 doc strings per second. You'll have to forgive me for not optimising at all yet, but it's not too shabby either. By our estimation above it would add about 0.37 seconds to Base's load time.

Last time I checked (just now, that is) loading Base's 70,912 lines of code takes two minutes, which means the overhead in the worst (best?) case is 0.31%. To an extent these things are subjective, of course, but that strikes me as a premature optimisation. I'd much rather spend my time making Markdown.jl itself faster.

Again, though, I'd love to see your benchmarks, just to make sure I haven't missed anything.
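
(The back-of-envelope estimate above can be re-run as follows. The inputs are the figures quoted in this comment; the variable names are illustrative. With these inputs the result is roughly 0.39 seconds and 0.32%, in line with the ~0.37 s and 0.31% quoted.)

```julia
# Figures quoted above.
parse_rate   = 100_000 / 7.43   # docstrings parsed per second
current_docs = 1296             # docstrings in Base today
future_docs  = 4 * current_docs # the optimistic "quadruples overnight" scenario
base_load_s  = 2 * 60           # Base load time in seconds

# Extra load time from parsing every docstring eagerly, and its share of the total.
extra_s      = future_docs / parse_rate
overhead_pct = 100 * extra_s / base_load_s

println(round(parse_rate))
println(round(extra_s, digits = 2))
println(round(overhead_pct, digits = 2))
```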

stevengj · 2014-10-23

@MichaelHatherly did some benchmarks for Docile, and he was saying that parsing added significantly to load times.... I'm not sure why his benchmarks differ from yours.

MikeInnes · 2014-10-24

Though I'm open to being corrected, I don't remember Michael expressing any concern about parse times or the need to parse lazily. So I'm not sure where that's come from, but it sounds like it's not something to worry about.

stevengj · 2014-10-24

In the abovementioned issue, Michael wrote:

Docile 0.1.0 parsed all the docstrings into Markdown.Blocks. This led to a slow down in the load time of modules I was testing on since Docile had to import Markdown.jl and then parse each string as well. This won't be a problem once pre-compilation is available, but for now it's a bit annoying.

0.2.0 now just captures the strings and leaves the parsing and rendering business to Lexicon. This has made loading modules that use @doc quite a bit faster (just anecdotal evidence though, I've not got any numbers regarding this).

MichaelHatherly · 2014-10-24

Since all the Markdown.jl code will be in Base and thus pre-compiled, this shouldn't really be much of a concern. It was the package load times rather than any parsing that was taking too long for my liking. As far as I can tell the parsing itself is plenty fast enough.

stevengj · 2014-10-24

Great, good to hear.

MichaelHatherly · 2014-10-23T15:31:25Z

+1 for a MetaDoc rather than special syntax when it's needed. Docile's current metadata syntax can be a bit subtle since the Dict has to start on the same line as the docstring ends, ie.

@doc """
...
"""
[
    :key => "value"
] ->
...

doesn't work since the [ isn't on the same line as """. Rather avoid that by using a single MetaDoc object. And as @stevengj says, it isn't much more to type either.

astrieanna · 2014-10-23T16:16:10Z

Is translating package documentation into other languages something this approach is going to (or could) support? (i.e. the message on julia-users today, asking about translating Distributions.jl docs into Spanish.)

johnmyleswhite · 2014-10-23T16:18:30Z

I kind of think we should leave i18n for another round. Doing it right is a truly massive effort (and requires a whole extra layer of indirection). Historically, the attempts to translate the Julia docs have gone pretty badly.

StefanKarpinski · 2014-10-23T16:32:48Z

I'm afraid that without a commercially backed effort to internationalize, translated docs are doomed to be worse than useless – indeed, actively misleading.

ivarne · 2014-10-23T16:48:27Z

I think this proposal makes a strong foundation that can be used to support internationalized documentation. As long as we have a type with an appropriate writemime method, accessing translations is a matter of generating a key and looking it up in an external file (or parsing a docstring that contains multiple languages, but that seems like too much clutter in a source file).

Unfortunately @stevengj insists that everything you'd want to do to configure the output from writemime should go in the MIME object, but I guess we could do worse than a MIME"text/plain{languages:[no, es, en]}".

johnmyleswhite · 2014-10-23T17:00:05Z

I'm afraid that without a commercially backed effort to internationalize, translated docs are doomed to be worse than useless – indeed, actively misleading.

Agreed.

mlubin · 2014-10-23T17:06:56Z

One could say the same thing about commercially backed numerical libraries and programming languages ;)
I wouldn't turn away volunteer translators. All it takes are a couple good volunteers to get documentation into good shape.

johnmyleswhite · 2014-10-23T17:10:06Z

I wouldn't turn away volunteer translators. All it takes are a couple good volunteers to get documentation into good shape.

I don't think this is the right perspective. In OSS, doing coding work once is often sufficient. But a translation needs to be kept current, which means you need a translator who's committed to doing work on a frequent basis for the next couple of years. Basically, you need an employee, not a consultant.

hayd · 2014-10-23T17:14:00Z

There are collaborative solutions to this problem however e.g. https://www.transifex.com/projects/p/discourse-org/ (which is free for open source projects).

ivarne · 2014-10-23T17:17:16Z

I think the biggest problem with volunteer translators is that they often don't have the motivation to translate everything. As long as we don't have a nice integration with a service that provides some gamification, it will be too boring and hard to collaborate on the translation, so Stefan's comment will be correct.

timholy · 2014-10-23T17:18:23Z

Desktop environments like KDE do what appears to be a pretty reasonable job of volunteer-organized translation. But it requires some serious release discipline (the "message freezes" in particular): https://techbase.kde.org/Schedules/KDE4/4.14_Release_Schedule.

stevengj · 2014-10-23T17:46:19Z

There's a difference here between documentation for a GUI program or other end-user software and technical documentation for a programming language/library. The former are often translated by volunteer efforts, but the documentation changes less rapidly and the problem is less severe if it is slightly out of date. The latter, being technical docs, really have to be accurate and up to date, and are rarely translated as far as I can tell. (Even Python does not seem to translate its core manual. Even large companies like Microsoft and Apple do not seem to translate their technical API docs.)

Since all we are talking about here is technical API documentation, it seems very unlikely that we will ever want to translate it. (Which is not to say that there should never be Julia tutorials in other languages, but that's a very different type of thing.)

waldyrious · 2014-10-23T17:51:43Z

I've used quite a few collaborative translation platforms over the past few years (as a contributor), so I think I can offer some insight about a couple of those I see as the main contenders:

http://translatewiki.net -- decent user interface, wiki backend (so every change is recorded, which isn't the case in most other platforms AFAIK) and a nice obsolescence marker (so we can filter by obsolete messages, and see the diff of what changed in the source string so that updating the translations is easier). But it's slightly less user-friendly than some of the top players (see below), and I hear the setup process for the project owners isn't as streamlined either (which makes sense since it's not a commercial venture with a dedicated support team).
http://transifex.com -- very user-friendly, and free for FOSS projects as noted above. Does have some UX issues but IIRC they're minor. I can provide a more detailed overview if needed.
http://crowdin.com -- also free for FOSS projects, and extremely usable (the best user experience I've had so far, though not 100% fault-free).

I think the biggest problem with volunteer translators is that they often don't have the motivation to translate everything. As long as we don't have a nice integration with a service that provides some gamification, it will be too boring and hard to collaborate on the translation

Exactly. Two additional points worth noting:

Translation is often a great way to allow interested newcomers to contribute to the project and get their feet wet, while also learning about how the internals work.
Long texts, such as the manual, will be much less pleasing to translate, but content that can be easily broken down --such as API docs-- will be perfect for progressive translation and a game-like, inherently motivating translation experience.

a translation needs to be kept current, which means you need a translator who's committed to doing work on a frequent basis for the next couple of years. Basically, you need an employee, not a consultant.

Not at all. If a collaborative platform is used, the users need not be the same ones over time. In fact, most platforms nowadays provide support for collaboratively developing a language-specific glossary, so consistency is maintained even if each user only translates a few strings.

MikeInnes · 2014-10-23T18:14:52Z

@astrieanna There's nothing stopping internationalised docs being implemented as a library on top of this. That might be the best thing, so that packages can support it if they want without adding complexity to Base.

@stevengj $ for interpolation is no more at odds with $$ for latex than * for italic is at odds with ** for bold. Likewise, we can still use *, [, and many other meaningful characters literally within code just fine...

Regardless, interpolation isn't officially supported as of this PR, so we can bikeshed that later.

hayd · 2014-10-23T18:27:51Z

There's nothing stopping internationalised docs being implemented as a library on top of this.

This is probably the best solution, assuming this can be made feasible. IMO it's incredibly wasteful to internationalize continuously (between releases); you only want to document a specific release and only require a re-translation if the doc has been updated since the previous release. Being a package means it can be worked on after the release is tagged.

prcastro · 2014-10-23T19:02:40Z

I used Transifex a while ago, and it was a very good experience. It could work really well with Julia Docs. Also, providing easy i18n support to new packages would be welcome.

MikeInnes · 2014-10-23T19:51:36Z

Superseded by #8791

nalimilan · 2014-10-28T13:52:48Z

As a belated reaction to the question of whether translations are of interest: I maintain an R GUI package, which I personally translate into French for French users. Unfortunately, in R the docs cannot be translated at all (only the GUI), which is quite an issue for me. In this kind of situation supporting translations makes sense, even if in the general case we don't consider it useful to translate the APIs of most packages.

johnmyleswhite · 2014-10-28T15:11:54Z

I still think worrying about i18n now adds too much complexity to a problem we haven't even solved the base case of yet.

hayd · 2014-10-28T17:15:39Z

+1 it's a separate issue and (translations) must be outside Base anyway.

IMO from @one-more-minute's comment it sounds like it will be technically feasible (at least, I have an idea how this may work, a few in fact), and will make a great package - (infrastructure for) translating Base and even packages (at a tagged release). However we can't even think about that until this is stabilised in Base.

nalimilan · 2014-10-28T21:59:12Z

Yes, I wasn't implying that this should be tackled right now. Just adding a point to an already long discussion about having this support at some point far in the future.

add markdown.jl

5a00df3

MikeInnes changed the title ~~Add Markdown.jl~~ WIP: Add Markdown.jl Oct 5, 2014

implement Docs module

04b4a92

MikeInnes changed the title ~~WIP: Add Markdown.jl~~ RFC: Documentation System Oct 5, 2014

use isexpr and implement @doc

31fa78f

that wasn't right

b0ac0e5

support for macros

ac6f179

MikeInnes added 2 commits October 6, 2014 16:13

implement trackmethod

ce6c749

support for documenting methods

18cfa4d

store the source code

bddae91

stevengj reviewed Oct 23, 2014

MikeInnes closed this Oct 23, 2014

This was referenced Nov 7, 2014

Fix help for .+ in the REPL #8922

Closed

formatting conventions for function documentation #8966

Closed

ViralBShah mentioned this pull request Nov 23, 2014

Some way to link comments to Julia constructs #762

Closed
