RFC: Documentation System #8588

Closed
wants to merge 19 commits into from

MikeInnes
Member

Ok, so this has ended up evolving into a full doc system. Functions and macros can be documented with

@doc """
Description here
""" ->
function foo() ...

(prettier syntax forthcoming)

Documenting multiple methods of a function will concatenate the doc strings together, in the order the methods were originally defined.

By default @doc treats unadorned strings as equivalent to md"" (i.e. markdown syntax) but you can actually put anything before the arrow to associate it with the function.

Docs can be queried with e.g.

doc(Docs.doc)
@doc Docs.doc
@doc @doc # For macros
@doc text""

(and you can try these out on this branch)


Original: Here's a first pass at getting Markdown.jl into Base. This will allow us to display Julia's doc strings in the terminal and rich environments; see the readme for some examples.

@MikeInnes MikeInnes changed the title Add Markdown.jl WIP: Add Markdown.jl Oct 5, 2014
@MikeInnes
Member Author

Ok, I just implemented a tiny metadata system, so that you can now do this:
[Screenshot, 2014-10-05 20:05:46]
At the moment, the actual metadata part (docs.jl) is all but trivial. This is by design; it's the simplest thing that works. My thinking is that once this is in, we can start using it and iterate from there. I believe @MichaelHatherly has some good ideas for future developments.

Yeah, that's not really true anymore.

@MikeInnes MikeInnes changed the title WIP: Add Markdown.jl RFC: Documentation System Oct 5, 2014
@johnmyleswhite
Member

This is awesome. :)

@MichaelHatherly
Member

Looks good! It might be worth having const META = Dict{Module, Dict}() to avoid clashes between keys once we allow documenting macros (which would presumably use a Symbol as their key). But that's not really relevant right now, since this doesn't seem to support documenting macros?
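To make the per-module idea concrete, here is a minimal sketch. It is not the code in this PR: the helper names `setdoc!` and `getdoc` and the macro name `@mymacro` are hypothetical, and only the nested `Dict` layout comes from the suggestion above.

```julia
# Hypothetical registry: one inner table per module, so a macro keyed by a
# Symbol in one module cannot clash with the same Symbol in another module.
const META = Dict{Module, Dict{Any, Any}}()

# Store `doc` under `key` in the table belonging to `mod`,
# creating that table on first use.
function setdoc!(mod::Module, key, doc)
    table = get!(() -> Dict{Any, Any}(), META, mod)
    table[key] = doc
    return doc
end

# Look up docs; returns `nothing` when the module or key is unknown.
getdoc(mod::Module, key) = get(get(META, mod, Dict{Any, Any}()), key, nothing)

# The same Symbol key used under two different modules stays separate.
setdoc!(Base, Symbol("@mymacro"), "docs stored under Base")
setdoc!(Core, Symbol("@mymacro"), "docs stored under Core")

println(getdoc(Base, Symbol("@mymacro")))
println(getdoc(Core, Symbol("@mymacro")))
```

Functions could still be keyed by the function object itself; the module level only matters for keys such as Symbols that are not unique on their own.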

@IainNZ
Member

IainNZ commented Oct 5, 2014

Monolith

@MikeInnes
Member Author

@MichaelHatherly You're right, no support for macros just yet, although that's definitely something to look into next.

Unfortunately, while we can do hacks with symbols etc., I really think the only way to do it robustly is to have a way to reference a macro as a first-class object. Otherwise you have to hack around trying to figure out which module the symbol you're looking at came from. I'll open an issue on this at some point.

@JeffBezanson
Member

There is an expander function for each macro that you can access by
manually looking up its @ name in its module.
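As a small illustration of that lookup (a sketch in current Julia syntax, which may differ from what was available when this was written), the expander for `@time` can be fetched from `Base` by its `@`-prefixed name:

```julia
# A macro is bound in its module under a name that starts with '@'.
# Looking that name up by hand returns the expander function.
expander = getfield(Base, Symbol("@time"))

println(expander isa Function)

# Evaluating the bare symbol in a module that can see the macro
# resolves to the same object.
same = Core.eval(@__MODULE__, Symbol("@time")) === expander
println(same)
```

Because the expander is an ordinary object, it can serve as a dictionary key for macro docs in the same way a function does.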

@MichaelHatherly
Member

So macros are kind of first-class? That's cool! There's also the question of documenting globals -- is there anything other than symbols we could use to reference those?

@staticfloat
Member

@ViralBShah
Member

+Inf

@tonyhffong

Sweet.

@MikeInnes
Member Author

@JeffBezanson so you can – that's awesome! I'll integrate this.

Turns out this works if you just evaluate the symbol, too – I can't believe I never tried that before.

Documenting globals is probably a lot less important, but if there's a simple way to ask "what module is this symbol imported from, if any?" I could integrate it reasonably easily.

If not, perhaps I should just knock something up with jl_module_usings and names. Although @MichaelHatherly's suggestion of having a reference would be ideal.

@quinnj
Member

quinnj commented Oct 6, 2014

Here's a probably-too-early feature request. Can the function/macro/type body source code following the @doc be automatically inserted as a meta item in its docs? One could then see the source code of pretty much any object by doing @source f(x) or something equivalent. This would drastically simplify the implementation of a long-outstanding issue (#2625), and an object's source code can be seen as very much a part of its documentation. This could perhaps be automatic for all objects, with or without an explicit @doc, as a sort of default documentation.

@MikeInnes
Member Author

@quinnj I could probably knock something up if it's useful, although it wouldn't be well integrated with the documentation system at this stage since there's no support for documenting specific methods. Though I've had some ideas for method-specific documentation which I'll have a go at implementing.

@quinnj
Member

quinnj commented Oct 6, 2014

What's the difficulty with method-specific docs? Knowing what to store as the key for the method? Or how to call it?

@MikeInnes
Member Author

(1) is knowing what key to store for the method. I'm just going to try using the type signature. AFAIK method objects themselves are replaced, which means they can't be used as a key (or redefining functions interactively would do horrible things).

(2) is knowing what to do when help(f) is called. What's the correct order of methods? How are doc objects combined? My plan is just to call catdoc(meta...) with metadata in the order the methods were defined, and provide a default implementation for Markdown objects (or default to an array).

That said, unless I can get this working fairly easily I think anything which doesn't actively block this being useful should be considered OT for the sake of this PR. We can live without method-specific docs for a while, but we really don't want this to be paralyzed any further. I'll give it a go though, anyway.
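The plan in (1) and (2) could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `FuncDocs`, `adddoc!` and `fulldoc` are hypothetical names, docs are plain strings, and only `catdoc` and the "key by type signature, combine in definition order" idea come from the comment above.

```julia
# Hypothetical per-function doc store, keyed by method signature.
struct FuncDocs
    order::Vector{Type}           # signatures in first-definition order
    docs::Dict{Type, String}      # signature => doc string
end
FuncDocs() = FuncDocs(Type[], Dict{Type, String}())

# Adding docs for a new signature appends it to the order;
# re-documenting an existing signature replaces the text in place.
function adddoc!(fd::FuncDocs, sig::Type, doc::AbstractString)
    haskey(fd.docs, sig) || push!(fd.order, sig)
    fd.docs[sig] = doc
    return fd
end

# Assumed default combination for string docs: join with blank lines.
catdoc(docs::AbstractString...) = join(docs, "\n\n")

# Combine every method's docs in the order the methods were defined.
fulldoc(fd::FuncDocs) = catdoc((fd.docs[sig] for sig in fd.order)...)

fd = FuncDocs()
adddoc!(fd, Tuple{Int}, "foo for integers")
adddoc!(fd, Tuple{String}, "foo for strings")
adddoc!(fd, Tuple{Int}, "foo for integers (revised)")  # redefinition keeps position
println(fulldoc(fd))
```

Keying by signature rather than by method object means that redefining a method interactively overwrites its docs instead of accumulating stale entries.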

@MichaelHatherly
Member

I'd not come across issues with Method objects being replaced -- or perhaps just hadn't noticed :) I'll keep an eye out for any issues related to that in Docile.

A nice way I found for displaying docs for specific method signatures is to have @help behave like @which, so @help foobar(1, "2") would display docs for method foobar(::Int, ::String).
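A rough sketch of how such a lookup might work, written in current Julia syntax. The `methoddocs` table and the doc text are made up for illustration; `which` and the method's `sig` field are the only real API used.

```julia
foobar(x::Int, y::String) = "int/string method"
foobar(x::Float64) = "float method"

# Hypothetical doc table keyed by the full method signature.
methoddocs = Dict{Type, String}(
    Tuple{typeof(foobar), Int, String} => "Docs for foobar(::Int, ::String).",
    Tuple{typeof(foobar), Float64}     => "Docs for foobar(::Float64).",
)

# `which` picks the method that would be called for these argument types,
# just as @which does for a call expression.
m = which(foobar, Tuple{Int, String})
println(methoddocs[m.sig])
```

A `@help foobar(1, "2")` macro would only need to compute the argument types from the call expression before doing the same lookup.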

@MikeInnes
Member Author

Ok, just implemented macro support, as well as having the @doc macro do querying, e.g. @doc @time.

@MichaelHatherly Truth be told I haven't looked into it properly, but I'll take a look now. And for querying method docs, I was just thinking the same thing when implementing @doc ;) If I can get methods working I'll throw that in too.

@MikeInnes
Member Author

Yes, you're right, Method objects aren't themselves replaced. Only problem is your current way of figuring out which method was defined doesn't allow docs to be redefined for a specific method, but I'm pretty sure I can work around that.

@MichaelHatherly
Member

Yeah, it's not totally ideal. Other things can be overwritten; I think it's just Methods that can't. Probably due to dealing with optional arguments in signatures, since they produce more than one Method object per definition.
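The point about optional arguments is easy to check. In this small example (the function name `withdefault` is arbitrary), a single definition with one default value yields two methods:

```julia
# One definition with a default argument...
withdefault(x, y = 1) = x + y

# ...creates two Method objects: withdefault(x) and withdefault(x, y).
nmethods = length(methods(withdefault))
println(nmethods)
```

So a doc system that maps one docstring to "the method just defined" has to cope with several Method objects appearing at once.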

@MikeInnes
Member Author

@MichaelHatherly That's a very good point, I almost forgot about that. Let me know what you think of this implementation.

Supporting methods turned out not to be so hard, so I've added that on. I'll update the original PR text.

@hayd
Member

hayd commented Oct 22, 2014

In principle you can attach any data you want as documentation,

Where does this sit with @stevengj's additional metadata proposal? In #8514 (comment):

"a text/plain comment"
f(x) = x

md"A *Markdown* comment."
g(x) = x

doc md"A *Markdown* comment with metadata" { :section => "Math", :subsection => "Special functions" }
besselj(m,x) = ...

const specfuns = { :section => "Math", :subsection => "Special functions" }
doc md"Another *Markdown* comment with predefined metadata." specfuns
bessely(m,x) = ...

@MikeInnes
Member Author

There's no syntactic support for that kind of metadata yet, but you can always define an object that displays a docstring and stores custom metadata. That's the advantage of storing arbitrary objects, of course. We can always add syntax later if that becomes a commonly used feature, though for now I think it's best to get the 99% use case working well.

@stevengj
Member

I agree with @one-more-minute. I no longer think we want a special syntax for metadata. Since we support arbitrary documentation objects, we can just do:

@doc MetaDoc(md"....docs....", :author=>"Me", :section=>"Foo functions", ...)

where MetaDoc is a wrapper around another documentation object that passes through writemime requests. This is hardly any more typing than any "sugar" variant.
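For illustration, a wrapper along these lines might look like the sketch below. `MetaDoc` does not exist in this PR, so everything here is an assumption, and it is written against current Julia, where display goes through `show(io, mime, x)` rather than `writemime`.

```julia
using Markdown

# Hypothetical wrapper: any documentation object plus free-form metadata.
struct MetaDoc{T}
    doc::T
    meta::Dict{Symbol, Any}
end

# Convenience constructor taking `:key => value` pairs.
MetaDoc(doc, pairs::Pair{Symbol}...) = MetaDoc(doc, Dict{Symbol, Any}(pairs...))

# Pass display requests through to the wrapped object. Only text/plain is
# forwarded here; other MIME types would need their own one-line methods.
Base.show(io::IO, mime::MIME"text/plain", d::MetaDoc) = show(io, mime, d.doc)

d = MetaDoc(md"Some *Markdown* docs.", :author => "Me", :section => "Foo functions")
println(d.meta[:section])
show(stdout, MIME("text/plain"), d)
```

Tools that care about metadata can read `d.meta`, while anything that only displays docs never needs to know the wrapper is there.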

@stevengj
Member

Regarding the interpolation, how does that relate to the discussion in MichaelHatherly/Docile.jl#29? We want to easily allow documentation that includes $ (both for LaTeX equations and for Julia code samples), and having a special flavor of Markdown that parses $ as interpolated Julia code seems at odds with this.

license(pkg::Module; flavor = github) = license(string(pkg), flavor = flavor)

function mdexpr(s, flavor = :julia)
    md = parse(s, flavor = symbol(flavor))
Member

Since md_str calls mdexpr, and mdexpr calls parse, this sounds like it will parse all Markdown documentation strings when a module is loaded. As discussed in MichaelHatherly/Docile.jl#29, however, we almost certainly want something that initially stores just the raw text/markdown string and parses lazily.
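A lazy variant could be sketched like this. It is only an illustration of "store the raw string, parse on first use": the `LazyDoc` type and `parsed` function are hypothetical, and `Markdown.parse` is the current standard-library parser rather than the code under review.

```julia
using Markdown

# Hypothetical doc object that keeps the raw text and caches the parse.
mutable struct LazyDoc
    raw::String
    parsed::Union{Nothing, Markdown.MD}
end
LazyDoc(raw::AbstractString) = LazyDoc(String(raw), nothing)

# Parse on first access only; later calls reuse the cached result.
function parsed(d::LazyDoc)
    if d.parsed === nothing
        d.parsed = Markdown.parse(d.raw)
    end
    return d.parsed
end

lazy = LazyDoc("# Title\n\nSome *text*.")
println(lazy.parsed === nothing)   # nothing has been parsed at load time
first_result = parsed(lazy)
println(parsed(lazy) === first_result)
```

With this shape, loading a module costs only string storage, and the parser runs when docs are actually displayed.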

Member Author

Is Markdown.jl performing poorly enough to impact load times for you? I'd love to see the benchmarks that led you to that conclusion, since I'm not able to reproduce.

Base currently contains 1296 doc strings with an average of 4 non-blank lines. Let's be optimistic and imagine that Markdown.jl's elegant output inspires a cascade of flowing prose from Julians everywhere, such that Base's docstring count quadruples overnight and each one is ~10 lines long (including code samples, headers etc. of course).

On my machine, parsing 100,000 such docstrings takes all of 7.43 seconds – in other words, I can parse 13,460 doc strings per second. You'll have to forgive me for not optimising at all yet, but it's not too shabby either. By our estimation above it would add about 0.37 seconds to Base's load time.

Last time I checked (just now, that is) loading Base's 70,912 lines of code takes two minutes, which means the overhead in the worst (best?) case is 0.31%. To an extent these things are subjective, of course, but that strikes me as a premature optimisation. I'd much rather spend my time making Markdown.jl itself faster.
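The estimate can be reproduced from the figures quoted above. This is just the arithmetic, using the exact quadrupled count (4 × 1296 = 5184); the result lands slightly above the rounded 0.37 s and stays well under half a percent of load time.

```julia
docstrings_now   = 1296                  # doc strings currently in Base
docstrings_later = 4 * docstrings_now    # the "quadruples overnight" scenario
rate             = 100_000 / 7.43        # doc strings parsed per second

parse_seconds    = docstrings_later / rate
load_seconds     = 2 * 60                # Base load time quoted above
overhead_percent = 100 * parse_seconds / load_seconds

println(round(rate))
println(round(parse_seconds, digits = 2))
println(round(overhead_percent, digits = 2))
```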

Again, though, I'd love to see your benchmarks, just to make sure I haven't missed anything.

Member

@MichaelHatherly did some benchmarks for Docile, and he was saying that parsing added significantly to load times.... I'm not sure why his benchmarks differ from yours.

Member Author

Though I'm open to being corrected, I don't remember Michael expressing any concern about parse times or the need to parse lazily. So I'm not sure where that's come from, but it sounds like it's not something to worry about.

Member

In the abovementioned issue, Michael wrote:

Docile 0.1.0 parsed all the docstrings into Markdown.Blocks. This led to a slow down in the load time of modules I was testing on since Docile had to import Markdown.jl and then parse each string as well. This won't be a problem once pre-compilation is available, but for now it's a bit annoying.

0.2.0 now just captures the strings and leaves the parsing and rendering business to Lexicon. This has made loading modules that use @doc quite a bit faster (just anecdotal evidence though, I've not got any numbers regarding this).

Member

Since all the Markdown.jl code will be in Base and thus pre-compiled, this shouldn't really be much of a concern. It was the package load times rather than any parsing that was taking too long for my liking. As far as I can tell the parsing itself is plenty fast enough.

Member

Great, good to hear.

@MichaelHatherly
Member

+1 for a MetaDoc rather than special syntax when it's needed. Docile's current metadata syntax can be a bit subtle since the Dict has to start on the same line as the docstring ends, ie.

@doc """
...
"""
[
    :key => "value"
] ->
...

doesn't work since the [ isn't on the same line as """. Rather avoid that by using a single MetaDoc object. And as @stevengj says, it isn't much more to type either.

@astrieanna
Contributor

Is translating package documentation into other languages something this approach is going to (or could) support? (i.e. the message on julia-users today, asking about translating Distributions.jl docs into Spanish.)

@johnmyleswhite
Member

I kind of think we should leave i18n for another round. Doing it right is a truly massive effort (and requires a whole extra layer of indirection). Historically, the attempts to translate the Julia docs have gone pretty badly.

@StefanKarpinski
Member

I'm afraid that without a commercially backed effort to internationalize, translated docs are doomed to be worse than useless – indeed, actively misleading.

@ivarne
Member

ivarne commented Oct 23, 2014

I think this proposal makes a strong foundation that can be used to support internationalized documentation. As long as we have a type with an appropriate writemime method, accessing translations is a matter of generating a key and looking it up in an external file (or parsing a docstring that contains multiple languages, but that seems like too much clutter in a source file).

Unfortunately @stevengj insists that everything you'd want to do to configure the output from writemime should go in the MIME object, but I guess we could do worse than a MIME"text/plain{languages:[no, es, en]}".

@johnmyleswhite
Member

I'm afraid that without a commercially backed effort to internationalize, translated docs are doomed to be worse than useless – indeed, actively misleading.

Agreed.

@mlubin
Member

mlubin commented Oct 23, 2014

One could say the same thing about commercially backed numerical libraries and programming languages ;)
I wouldn't turn away volunteer translators. All it takes are a couple good volunteers to get documentation into good shape.

@johnmyleswhite
Member

I wouldn't turn away volunteer translators. All it takes are a couple good volunteers to get documentation into good shape.

I don't think this is the right perspective. In OSS, doing coding work once is often sufficient. But a translation needs to be kept current, which means you need a translator who's committed to doing work on a frequent basis for the next couple of years. Basically, you need an employee, not a consultant.

@hayd
Member

hayd commented Oct 23, 2014

There are collaborative solutions to this problem however e.g. https://www.transifex.com/projects/p/discourse-org/ (which is free for open source projects).

@ivarne
Member

ivarne commented Oct 23, 2014

I think the biggest problem with volunteer translators is that they often don't have the motivation to translate everything. As long as we don't have a nice integration with a service that provides some gamification, it will be too boring and hard to collaborate on the translation, so Stefan's comment will be correct.

@timholy
Member

timholy commented Oct 23, 2014

Desktop environments like KDE do what appears to be a pretty reasonable job of volunteer-organized translation. But it requires some serious release discipline (the "message freezes" in particular): https://techbase.kde.org/Schedules/KDE4/4.14_Release_Schedule.

@stevengj
Member

There's a difference here between documentation for a GUI program or other end-user software and technical documentation for a programming language/library. The former are often translated by volunteer efforts, but the documentation changes less rapidly and the problem is less severe if it is slightly out of date. The latter, being technical docs, really have to be accurate and up to date, and are rarely translated as far as I can tell. (Even Python does not seem to translate its core manual. Even large companies like Microsoft and Apple do not seem to translate their technical API docs.)

Since all we are talking about here is technical API documentation, it seems very unlikely that we will ever want to translate it. (Which is not to say that there should never be Julia tutorials in other languages, but that's a very different type of thing.)

@waldyrious
Contributor

I've used quite a few collaborative translation platforms over the past few years (as a contributor), so I think I can offer some insight about a couple of those I see as the main contenders:

  • http://translatewiki.net -- decent user interface, wiki backend (so every change is recorded, which isn't the case in most other platforms AFAIK) and a nice obsolescence marker (so we can filter by obsolete messages, and see the diff of what changed in the source string so that updating the translations is easier). But it's slightly less user-friendly than some of the top players (see below), and I hear the setup process for the project owners isn't as streamlined either (which makes sense since it's not a commercial venture with a dedicated support team)
  • http://transifex.com -- very user-friendly, and free for FOSS projects as noted above. Does have some UX issues but IIRC they're minor. I can provide a more detailed overview if needed.
  • http://crowdin.com -- also free for FOSS projects, and extremely usable (the best user experience I've had so far, though not 100% fault-free).

I think the biggest problem with volunteer translators is that they often don't have the motivation to translate everything. As long as we don't have a nice integration with a service that provides some gamification, it will be too boring and hard to collaborate on the translation

Exactly. Two additional points worth noting:

  • Translation is often a great way to allow interested newcomers to contribute to the project and get their feet wet, while also learning about how the internals work.
  • Long texts, such as the manual, will be much less pleasing to translate, but content that can be easily broken down --such as API docs-- will be perfect for progressive translation and a game-like, inherently motivating translation experience.

a translation needs to be kept current, which means you need a translator who's committed to doing work on a frequent basis for the next couple of years. Basically, you need an employee, not a consultant.

Not at all. If a collaborative platform is used, the users need not be the same ones over time. In fact, most platforms nowadays provide support for collaboratively developing a language-specific glossary, so consistency is maintained even if each user only translates a few strings.

@MikeInnes
Member Author

@astrieanna There's nothing stopping internationalised docs being implemented as a library on top of this. That might be the best thing, so that packages can support it if they want without adding complexity to Base.

@stevengj $ for interpolation is no more at odds with $$ for latex than * for italic is at odds with ** for bold. Likewise, we can still use *, [, and many other meaningful characters literally within code just fine...

Regardless, interpolation isn't officially supported as of this PR, so we can bikeshed that later.

@hayd
Member

hayd commented Oct 23, 2014

There's nothing stopping internationalised docs being implemented as a library on top of this.

This is probably the best solution, assuming this can be made feasible. IMO it's incredibly wasteful to internationalize continuously (between releases); you only want to document a specific release, and only require a re-translation if the doc has been updated since the previous release. Being a package means it can be worked on after the release is tagged.

@prcastro
Contributor

I used Transifex a while ago, and it was a very good experience. It could work really well with the Julia docs. Also, providing easy i18n support to new packages would be welcome.

@MikeInnes
Member Author

Superseded by #8791

@MikeInnes MikeInnes closed this Oct 23, 2014
@nalimilan
Member

As a belated reaction to the question of whether translations are worthwhile: I maintain an R GUI package which I personally translate into French for French users, and unfortunately in R the docs cannot be translated at all (only the GUI), which is quite an issue for me. In this kind of situation supporting translations makes sense, even if in the general case we don't consider it useful to translate the APIs of most packages.

@johnmyleswhite
Member

I still think worrying about i18n now adds too much complexity to a problem we haven't even solved the base case of yet.

@hayd
Member

hayd commented Oct 28, 2014

+1 it's a separate issue and (translations) must be outside Base anyway.

IMO from @one-more-minute's comment it sounds like it will be technically feasible (at least, I have an idea how this may work, a few in fact), and will make a great package - (infrastructure for) translating Base and even packages (at a tagged release). However we can't even think about that until this is stabilised in Base.

@nalimilan
Member

Yes, I wasn't implying that this should be tackled right now. Just adding a point to an already long discussion about having this support at some point far in the future.

Labels
docs This change adds or pertains to documentation