Embedded command documentation needs to be moved outside the C source files #507
Comments
For what it's worth, the fish shell has decided to switch from Doxygen to Sphinx, something I strongly recommend for this project. The ancient Troff markup language is all but unknown to anyone younger than myself (50+ years old). Not to mention that Troff markup is awful by modern sensibilities and there are few tools to translate it to other formats.
Another example is the I'm tempted to mark this as a blocking bug for the next ksh release. It should be impossible for the man page and "internal" documentation (e.g., "a_builtin --help") to disagree. People modifying the code shouldn't have to keep two distinct sources of truth in sync. There should be only one source of truth regarding the documented flags and behavior of builtin commands. |
The fact that the ast The
which means one possibility is to generate sections of the
Or the output of The |
@siteshwar and I discussed this issue in the ksh93/users Gitter room today. What follows are the comments I made. Doxygen is a really awful tool for writing user documentation. The fish shell project has had multiple issues related to writing documentation in Doxygen. Among other problems is the lexicon_filter.in script, which no one understands. Using Doxygen makes things hard that should be trivial, and it is really hard to know if you've got the markup right. They have also had problems with the HTML and man pages looking different. Moving the command documentation out of the source code has two goals: 1) a single source of truth, and 2) switching from the ancient and frankly awful nroff markup to something more sensible that is easy to turn into man pages, HTML, and whatever else makes sense. A lesser goal is making it easier to edit the command docs by removing all the C syntax surrounding the text. Embedding the docs so `cmd --help` works is a great idea. It just shouldn't have been done by embedding it in the source code. It should be embedded at build time by merging plain text files that contain the documentation.
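The build-time merge described above could be sketched roughly as follows. This is only an illustration, not how ksh or fish actually does it: the function names, the `*.txt` layout, the `help_<name>` symbols, and the `builtin_help` table are all hypothetical, and the sketch assumes each file stem is a valid C identifier.

```python
from pathlib import Path


def c_string_literal(text: str) -> str:
    """Render text as adjacent C string literals, one per input line."""
    escapes = {"\\": "\\\\", '"': '\\"', "\t": "\\t"}
    lines = []
    for line in text.splitlines():
        escaped = "".join(escapes.get(ch, ch) for ch in line)
        lines.append(f'    "{escaped}\\n"')
    return "\n".join(lines) if lines else '    ""'


def generate_c_source(doc_dir: Path) -> str:
    """Build a C translation unit with one help string per *.txt file.

    Assumption: every file stem (e.g. "hist") is a valid C identifier.
    """
    out = ["/* Generated file: do not edit. */", ""]
    names = []
    for path in sorted(doc_dir.glob("*.txt")):
        name = path.stem
        names.append(name)
        out.append(f"static const char help_{name}[] =")
        out.append(c_string_literal(path.read_text(encoding="utf-8")) + ";")
        out.append("")
    # Hypothetical lookup table that a `cmd --help` handler could scan by name.
    out.append("const struct { const char *name; const char *text; } builtin_help[] = {")
    for name in names:
        out.append(f'    {{"{name}", help_{name}}},')
    out.append("    {0, 0},")
    out.append("};")
    return "\n".join(out) + "\n"
```

A build step would write the returned string to a generated `.c` file, so the plain text files remain the only place the help text is edited.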
I do not know if this has been suggested yet or not, but can we get the on-line documentation out of the regular default run-time code altogether? That is: not included at source-edit time and NOT included at build time either, but rather stored somewhere in the distribution directory tree (for example under /usr/man/ or /usr/share/man or /usr/share/ksh/man/ or /usr/lib/ksh/help/, et cetera). Getting as small a run-time memory footprint as possible for KSH was important years ago for very tight embedded memory budgets. I think this is still a good goal even now. For myself, I feel that the source documentation should never have been included in the regular run-time object from day one, but somehow it got in there in its current edit-time format (hard to read and hard to maintain). Storing the help text separately can also be thought of as using a plugin of a sort. That is, it is loaded at run-time, but only when called for. I make extremely extensive use of run-time plugins (object files) nowadays, as many other projects in the world also do. Storing the help-man text separately can be thought of as an extension of this plugin idea. I and a gazillion others have included a feature such as `$ cmd --help` for decades now, and we always stored the help-man text in separate files within the run-time distribution tree. I have even done this for my own KSH built-ins. In fact, wherever I have implemented this feature for programs, I have stored the help documentation separately, for all of the 40 years I have been programming this particular feature! Is it possible that we can do the same for KSH going forward (that is: store the help-man text in a separate file, accessed at run-time when called for)? Thanks for any consideration.
It is possible to store the documentation external to the binary. The fish shell shows how to do it. It supports This should be done in steps:
Okay, I think it would be good to provide this ability to have the documentation external to the binary, as long as the existing My main concern though, is will it still be able to work with relative paths because |
I am sure that something can be worked out. KSH already does some "magic path" searching for other things. Having implemented this sort of thing myself in the past, I search both a (somewhat standard) set of relative paths and a set of relatively standard absolute system paths for the program-specific storage directory in question (in this case one that would contain the doc files). For example, the first doc files found under ${PATH}/../doc/ksh/ would do. But a whole set of both relative and absolute paths could also be searched. For those still without enough imagination, reference both my and Kurtis Rader's previous posts on this topic for search-path suggestions. Not to intentionally complicate things too much, but I actually add a level of indirection to this whole process as well. I first search a set of relative and absolute paths for a possible configuration file that itself contains a list of possible relative and absolute search paths, and failing that I search a default list of relative and absolute search paths. I won't explain it here, but I also allow for arbitrarily different search paths for each individual program or KSH builtin, though that is likely not needed for most purposes. For those who may say, "but this is a lot of code?" the answer is, it only has to be written once (pretty much for all time). For a reasonably good and experienced programmer (with maybe a little OOD and knowledge of containers and such), this is not a big problem (rather trivial in the scheme of things). But only a minimal solution is needed for starters. If someone is thinking that searching for doc or help files might be a long and time-consuming process, the correct answer is "who the hell cares." It is not something that matters from a performance perspective. Sorry if I was a bit too abrupt just now. :-)
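A minimal version of that search might look like the sketch below. It only illustrates the idea: the `.txt` suffix, the function names, and the fallback directories in `SYSTEM_DOC_DIRS` are assumptions made for the example (the directories are borrowed from the suggestions earlier in this thread), not an existing ksh layout. The configuration-file indirection is left out.

```python
import os
from pathlib import Path
from typing import Iterable, Iterator, Optional

# Hypothetical absolute fallbacks; not an established ksh convention.
SYSTEM_DOC_DIRS = ("/usr/share/ksh/help", "/usr/lib/ksh/help")


def candidate_dirs(path_var: str,
                   system_dirs: Iterable[str] = SYSTEM_DOC_DIRS) -> Iterator[Path]:
    """Yield <PATH entry>/../doc/ksh for each PATH entry, then the system dirs."""
    for entry in path_var.split(os.pathsep):
        if entry:
            yield Path(entry).parent / "doc" / "ksh"
    for directory in system_dirs:
        yield Path(directory)


def find_help_file(command: str, path_var: str,
                   system_dirs: Iterable[str] = SYSTEM_DOC_DIRS) -> Optional[Path]:
    """Return the first <command>.txt found in the candidate dirs, else None."""
    for directory in candidate_dirs(path_var, system_dirs):
        candidate = directory / f"{command}.txt"
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_help_file("hist", os.environ.get("PATH", ""))` only touches the filesystem when help is actually requested, which is the "loaded only when called for" behavior discussed above.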
I hit this issue again in #945 and #948 and I think it's the right time to start taking steps to switch to a newer documentation system. I have discussed this issue before with @krader1961 and we agreed on switching to |
@dannyweldon mentioned the ability to embed documentation inside scripts/functions that can be used by builtin AFAICT the ability to use DocOpt like strings with the builtin Note that
Heck if I can tell how that results from the text in var For comparison the fish shell has an open issue to implement a DocOpt like capability which is more than six years old. The lead on that project promised, 2 years ago, to implement it within six months. That never happened and didn't appear like it would ever happen. So I implemented argparse in just a couple of weeks. Now every fish function/script can implement option parsing in a manner 100% compatible with the bog standard |
When I say "builtin getopts command is undocumented" I mean in the output of "man ksh" which is all that most people will read. Few people are going to think to run |
FWIW, I noticed that the clang documentation is generated using Sphinx. For example, see the bottom of this page: https://clang.llvm.org/docs/ClangFormat.html
Did you know the |
I've started working on moving the documentation out of the source code. That revealed several problems. The main one being the "docopt" text format that is recognized by |
I've moved almost all of the builtin command "docopt" style documentation and flag definition text used by the AST |
I just noticed, while refactoring the src/cmd/ksh93/bltins/bg_fg_disown.c module, this bogosity:
Notice that |
I have been chipping away at converting builtin commands from the DocOpt style documentation and flag definition used by the AST |
Also, this work has brought to light more instances where the documentation for a builtin command in the |
I just found another "gotcha". It looks like P.S., I really dislike the non-POSIX behavior that continues to scan for flags after the first non-flag argument. Thus, I like the AST |
Converting the
What does that do? It dynamically defines an option that maps a single letter option like As we see over and over again in this project this is too clever by half for something that is not performance sensitive and should be coded in a straightforward manner. |
Note that my previous comment applies to the ksh93u+ release. That |
The flag parsing by the Not even using P.S., This comment also affects the |
Implement an alternative to the legacy AST optget() function. This looks and behaves much like the getopt_long() function provided by GNU and BSD based platforms but with two extensions supported by AST optget(). First, integers represented as short flags; e.g., `-123`. Second, short flags that are prefixed by `+` rather than `-`. It also deviates from getopt_long() by not supporting some legacy behaviors of that API which we don't need. This is related to GitHub issue #507 because having such a function is required for parsing the `set`, `typeset`, and `ksh` command args. That is because those commands/programs require supporting short flags like `+o abc` to mean the opposite of `-o abc`. Something that is effectively impossible using the bog standard getopt_long() function. Handling numeric args that otherwise look like an invalid short flag can be done with getopt_long_only() but this implementation makes it much easier. It also avoids the risk of recognizing `-abc` as equivalent to `--abc`, since that wasn't supported by the legacy AST optget() function. Related #507
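To make the two extensions concrete, here is a small Python sketch of the parsing behavior described in this commit message. It is not the C implementation from the commit: the `parse_flags` name, the getopt-style `optstring` convention, and the use of `"#"` to report a numeric flag are all assumptions chosen for illustration, and long options are not handled at all.

```python
def parse_flags(args, optstring):
    """Parse short flags that may start with '-' or '+'.

    optstring uses the getopt convention: a letter followed by ':' takes a value.
    Returns (parsed, rest) where parsed holds (flag, enabled, value) tuples;
    enabled is False for '+' flags. A '-123' style argument is reported under
    the hypothetical flag name '#'.
    """
    takes_arg = {}
    i = 0
    while i < len(optstring):
        has_arg = i + 1 < len(optstring) and optstring[i + 1] == ":"
        takes_arg[optstring[i]] = has_arg
        i += 2 if has_arg else 1

    parsed = []
    idx = 0
    while idx < len(args):
        arg = args[idx]
        if arg == "--":  # explicit end of flags
            idx += 1
            break
        if len(arg) < 2 or arg[0] not in "+-":
            break  # first non-flag argument ends flag parsing
        enabled = arg[0] == "-"
        body = arg[1:]
        idx += 1
        if enabled and body.isascii() and body.isdigit():
            parsed.append(("#", True, body))  # integer written as a short flag
            continue
        pos = 0
        while pos < len(body):
            flag = body[pos]
            pos += 1
            if flag not in takes_arg:
                raise ValueError(f"unknown flag: {arg[0]}{flag}")
            if not takes_arg[flag]:
                parsed.append((flag, enabled, None))
                continue
            if pos < len(body):  # value attached, e.g. -oabc
                value = body[pos:]
            elif idx < len(args):  # value is the next argument
                value = args[idx]
                idx += 1
            else:
                raise ValueError(f"flag {arg[0]}{flag} requires a value")
            parsed.append((flag, enabled, value))
            break
    return parsed, args[idx:]
```

With `optstring` set to `"xo:"`, the arguments `+o abc` come back as `("o", False, "abc")` and `-o abc` as `("o", True, "abc")`, so a caller can treat the `+` form as the opposite of the `-` form.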
Most of these fixes are for typos and extra whitespace at the end of lines. These are the notable changes: - Fixed a compatibility issue with how asterisks are displayed using certain fonts. Bug report: att#764 - Fixed a bug in the man page that caused searches for the '|' character to fail. Bug report: att#871 - Removed a duplicate description of 'set -B' from the man page. Bug report: att#789 - Added documentation for options missing from the ksh man page (applies to 'hist -N', 'sleep -s', 'whence -q' and many of ulimit's options). Bug reports: att#948 att#503 (comment) att#507 (comment) - Applied the following ksh2020 man page fixes: att#351 att#352 - Fixed a minor GCC -Wformat warning in procopen.c by changing NiL to NULL.
See issue #503 for an example where the information in `man ksh` differs from `some_command --help`. In that issue it is with respect to the `hist` command but there are plenty of others. It is bad enough that having two sources for command documentation inevitably leads to such discrepancies. Embedding the command help text in the C source also makes it hard for a human to read and edit.

Contrast this state of affairs with how the `fish` shell does it. It supports both `builtin_command --help` and `man builtin_command`. And due to how the help text is managed it is guaranteed they will always be in agreement. This project should be able to do the same thing.

P.S., If and when this is resolved it would be a good idea to switch the markup language from the ancient troff style to Markdown.