-
Notifications
You must be signed in to change notification settings - Fork 346
Allow building all docs with a local repo #659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We current offer two ways to build the docs:
1. Build an arbitrary book with `--doc` and its companion arguments like
`--resource` and `--asciidoctor`.
2. Build *all* of the books that Elastic publishes with `--all`.
Building an arbitrary book is nice and quick because it is have very
little overhead on top of the time it takes to build the actual book.
But it isn't always the best test because the command that you use to
build the book might not exactly match the invocation that `--all` uses
to build the book. Furthermore, building a single book is not usually
enough! When you are testing documentation changes that you've made
locally you often don't know all of the books that your changes might
effect. The Elasticsearch repository is used in about a dozen books in
combination with a hand full of other repositories.
Building all of the docs is quite slow and you might run into errors
that you didn't cause. It is exactly what we need to build the docs for
the public web site but it isn't right for testing some changes that you
make locally. It also doesn't line up properly with local development as
it only knows how to pull repos from the Elastic organization at github.
This change adds a "third way" to build test the docs that splits the
difference between the two options. It uses the same mechanisms that we
usually use to build all of the docs but allows you to substitute local
directories in place of some branch of some repo:
```
./build_docs --all --target_repo git@github.com:elastic/built-docs.git \
--keep_hash \
--sub_dir elasticsearch:master:~/Code/elasticsearch
```
This is nice because it:
1. Rebuilds all of the books that your local directory is involved with
2. Uses standard switches when building each of those books
3. Does not rebuild books that don't use your local directory
That third thing is actually caused by that new `--keep_hash` flag. It
causes the build's up-to-date checking to use the same hash that was
used the last time the book was built. `--keep_hash` has uses beyond
being paired with `--sub_dir`. In particular, it is useful when testing
what happens when you switch a book to Asciidoctor. You can switch the
book to Asciidoctor in `conf.yaml` and rebuild with `--keep_hash` and
the build will *only* have the asciidoctor change in it.
This also drop `--no_fetch` because `--keep_hash` is better in almost
every way.
The goal of `--sub_dir` isn't so much that people will execute it
locally, but that we can execute it on CI. I'm sure folks will execute
it locally but it pretty heavy because, just like a normal `--all` build
it has to:
1. Clone the `built-docs` repo or fetch updates if it is already cloned.
2. Check out the `built-docs` repo. This is truly slow because it has
many, many files in it.
3. Clone all of the repos used to build every book or fetch updates if
they are already cloned.
I suspect in time we can fix some of these. But for now, this is a heavy
thing.
build_docs
Outdated
| ]) | ||
| build_docs_args.append( | ||
| '/doc/' + repo_name + path.replace(repo_root, '')) | ||
| return '/doc/' + repo_name + path.replace(repo_root, '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped automatically adding it to the list because we don't want to do that for --sub_dir.
| user => $Opts->{user}, | ||
| url => $Opts->{target_repo}, | ||
| reference => $reference_dir, | ||
| keep_hash => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't keep_hash on the target repo because the target_repo contains the list of hashes to keep!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's worth a review comment, is it worth a code comment?
| $pm->start($name) and next; | ||
| eval { | ||
| $repo->update_from_remote() unless $Opts->{no_fetch}; | ||
| $repo->update_from_remote(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I drop this feature because --keep_hash is better in most cases.
integtest/Makefile
Outdated
| .PHONY: simple_all_expected_files | ||
| simple_all_expected_files: | ||
| # Test the simplest possible `--all` invocation | ||
| $(call BUILD_MINIMAL_ALL,/tmp/simple_all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to these call invocations because they don't bump into make's broken up-to-date checking for directories.
| current: master | ||
| branches: [ master ] | ||
| index: minimal.asciidoc | ||
| index: index.asciidoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index.asciidoc allows me to reuse this file in other tests.
| } | ||
| $new .= '|asciidoctor' if $asciidoctor; | ||
|
|
||
| return $old ne $new if $self->keep_hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're keeping the hash then we don't need to check to see if any of the files that we're using changed so we can just return here.
| } | ||
|
|
||
| #=================================== | ||
| sub tree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tree subroutine was not used.
| my ($msg) = ( $log =~ /^\w+\s+([^\n]+)/ ); | ||
| my $msg; | ||
| if ( $sha eq 'local' ) { | ||
| $msg = 'local changes'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is worth looking at the substitution branch to try and find a commit message here. We don't even know that the substitution directory is in a git repo....
| #=================================== | ||
| my ( $self, $destination ) = @_; | ||
|
|
||
| die 'sub_dir not supported with checkout_to' if %{ $self->{sub_dirs}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is only used by the target_repo so we don't need to support sub_dirs. In fact, it doesn't make sense to have it!
|
This seems like it helps with some of the issues described in #249 I initially thought that it could help with the scenario where we need to know which books to rebuild when we make changes to shared files like https://github.com/elastic/docs/blob/master/shared/attributes.asciidoc. On closer reading, however, I don't think that's right--these parameters wouldn't catch that scenario and we'll still need to forcefully rebuild all. |
I think that is true.
I think it catches the "Check cross-doc links and perform other post-build verification on specific books using the already-built versions of everything else." bit. I don't think it covers the rest though! |
|
@Jarpy, could you have a look at this one when you get a chance please? |
|
Woops, there are a lot of conflicts! I'll see about fixing those as soon as I finish what I've got locally. |
integtest/Makefile
Outdated
| # First build a repository to use as the source. | ||
| # REPLACE the minimal documentation in the source repo | ||
| cp ../README.asciidoc /tmp/source/index.asciidoc | ||
| cd /tmp/source && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this ever be run in parallel with itself? Might need a proper tmpdir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be difficult to get it to run in parallel with itself but I may as well make a proper tmpdir. You are right about that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess Docker kinda takes care of that, doesn't it? I may have been thinking too old-school.
| mount_docs_repo_and_dockerify_path(resource_dir, resource_dir) | ||
| build_docs_args.append(mount_docs_repo_and_dockerify_path( | ||
| resource_dir, resource_dir)) | ||
| elif arg == '--sub_dir': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you add features, all the manual parsing of argv is going to cost you more and more. argparse isn't great, but is there any value in considering it over full DIY?
(Normally, I would suggest Click, but I know you can't do packages.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've considered it on and off for a while. I think the value that DIY gets me is being able to pass through arguments that I don't know about. I only rewrite the arguments that have anything to do with paths. The cost of DIY is that I need this everywhere. At this point I think it is net simpler, but I'd be receptive to any argument to the contrary. Like, I wonder if it'd be helpful to force me to list all of the supported arguments in the python file because we plan to drop the perl file one day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I've considered it"
Cool. I trust your judgement.
| user => $Opts->{user}, | ||
| url => $Opts->{target_repo}, | ||
| reference => $reference_dir, | ||
| keep_hash => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's worth a review comment, is it worth a code comment?
We current offer two ways to build the docs:
--docand its companion arguments like--resourceand--asciidoctorand--chunk.--all.Building an arbitrary book is nice and quick because it is have very
little overhead on top of the time it takes to build the actual book.
But it isn't always the best test because the command that you use to
build the book might not exactly match the invocation that
--allusesto build the book. Furthermore, building a single book is not usually
enough! When you are testing documentation changes that you've made
locally you often don't know all of the books that your changes might
effect. The Elasticsearch repository is used in about a dozen books in
combination with a hand full of other repositories. Also! If you want
to build any books that require multiple repositories then you have to
set all of them up appropriately.
Building all of the docs doesn't have any of those problems but it isn't
wasn't compatible with building from the local filesystem. This change
fixes that by adding a
--sub_diroption to "substitute" and "directory"in place of some branch of some repository when building all of the docs.
This overrides the standard up-to-date checks to force all books being
built with these subsitutions to be rebuilt every time. It looks like:
This is nice because it:
That third thing is actually caused by that new
--keep_hashflag. Itcauses the build's up-to-date checking to use the same hash that was
used the last time the book was built.
--keep_hashhas uses beyondbeing paired with
--sub_dir. In particular, it is useful when testingwhat happens when you switch a book to Asciidoctor. You can switch the
book to Asciidoctor in
conf.yamland rebuild with--keep_hashandthe build will only have the asciidoctor change in it.
This also drops
--no_fetchbecause--keep_hashis better in almostevery way.
The goal of
--sub_dirisn't so much that people will execute itlocally, but that we can execute it on CI. I'm sure folks will execute
it locally but it pretty heavy because, just like a normal
--allbuildit has to:
built-docsrepo or fetch updates if it is already cloned.many, many files in it.
built-docsrepo. This is truly slow because it hasthey are already cloned.
I suspect in time we can fix some of these. But for now, this is a heavy
thing.