[CI] Support code-snippet extraction and docs-updating #3948
The issue with this approach is that we're now going to have 11+ additional sources of contributors messing up submodules. In practice I think most contributions are in the text and not code snippets, but the introduction of that many submodules means we're probably going to have to commit to cleaning up people's branches for them. Submodules are difficult to work with and easy to screw up. |
Contributors can now simply run As @svrnm pointed out during the last Comms meeting, the issue isn't how many submodules we have, it's whether we have submodules at all. Since we're pretty much stuck with submodules, having a few more won't be an extra burden; and besides, IMHO As maintainers the question we need to ask is whether we want #1635 solved in the repo or not. If we do, then this PR offers one of the best solutions I know of. /cc @theletterf |
Big fan of that, thank you for tackling this @chalin!
First: +1 for Yeah, in my experience it's not one submodule that contributors screw up, it's "all or nothing". That said, I agree that adding another 11+ submodules may come with additional challenges, e.g. how much this slows down the build process. But this is an issue to worry about later, since we are starting with one repo for now... Also, I am wondering how much contributors need to interact with that. So, if I fix some typos or words in the docs, do I even need to install and use that tooling? Or, even better, could we package all of that in a GitHub workflow that regularly fetches the code repos, executes that code and then raises the appropriate PR?
I want this to be solved: we are software developers creating software for software developers to make it easier for them to fix problems in their code. So there are rightfully some high expectations for our stuff (including the docs) to work. Also, with our docs still growing faster than we can onboard new docs contributors, we need all the tooling we can get to keep us afloat. One organizational note: |
For this:
I'd say the more submodules, the more ways things can get out of date, so it's also a matter of how many. I'm happy that we have |
@svrnm wrote:
Great idea! Let's track that via:
Initially, no interaction at all.
No.
Yes.
Right. We're trying to solve an inherently challenging problem. I propose that we approach this incrementally. First, let's agree on the base tooling (even if it changes later) so that we can start with something concrete. Then find a SIG willing to have us sync their code. Then incrementally bring in the automation, adjusting as we go along. My proposal is that contributors not have to interact with this at all in the beginning. We maintainers/approvers would manually run the code-excerpt script when needed -- e.g., to handle situations like #3949 -- and commit changes. Eventually we'll need to educate contributors that code changes need to happen at the source. There are ways to mitigate this (that I can share at our next Comms meeting), but IMHO we should stay focused on the first steps. @cartermp wrote:
You're right, of course: the number of submodules does have an incremental impact.
That's the plan.
Yup, and we've been trying to do our best to keep all stakeholders as happy as possible (#2448, #3149). There are inherent complexities in the problem we're trying to manage (supporting the build, checking and deployment of a non-trivial tech-docs website with parts spread across repos) beyond "just writing docs". E.g., submodules can be a pain to deal with, but most of the time, IMHO, they help cleanly solve the problem of keeping content across repos in sync. |
For the sake of consistency, can we wrap this in a short code that is only there to replace those
+1 for this approach, in the worst case(!) we can pull back and revert that tooling. We should gain some experience with it and see if it adds more value vs more issues and then make a judgement. Maybe we need to onboard 2 to 3 SIGs to make a final decision if we want to go all in or not. Let's assume we take 1-2 months to add one more SIG we might be able to go from "beta" to rolling this out in ~3-4 months. WDYT?
If we wrap all that code-updating into a workflow or into something only maintainers touch, I wonder if we should pick some "conditional" submodules, i.e. they are not part of .gitmodules by default and a maintainer (or a bot) needs to run a script first to add and load them, do the script updates, run a script to unload them and send the PR. It adds some extra burden on us / the workflow but it removes the necessity for regular contributors to interact with even more submodules |
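The "conditional" submodule flow described above could look roughly like the sketch below. It is only an illustration under assumptions: the submodule URL and path are not given in this thread and are placeholders, the `run` helper is hypothetical, and `DRY_RUN` defaults to `1` so that the script prints the commands instead of executing them. Sending the PR is left out.

```shell
#!/bin/sh
set -eu

# Assumptions for illustration only: neither the URL nor the path appears in this thread.
SUBMODULE_URL="https://github.com/open-telemetry/opentelemetry-go"
SUBMODULE_PATH="content-modules/opentelemetry-go"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

sync_excerpts() {
  run git submodule add "$SUBMODULE_URL" "$SUBMODULE_PATH"  # add and load the submodule
  run npm run code-excerpts                                  # refresh the excerpts in the docs
  run git submodule deinit -f "$SUBMODULE_PATH"              # unload the submodule
  run git rm -f "$SUBMODULE_PATH"                            # drop it from the index and .gitmodules
}

plan="$(sync_excerpts)"
echo "$plan"
```

With this shape, regular contributors never see the extra submodule in `.gitmodules`; only the maintainer or bot running the script does, at the cost of a fresh clone on each run.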
@svrnm wrote:
Eventually, but for now I'd like to focus on using the tools rather than customizing them. I have further improvements in mind, but again, I'd rather we gain experience with existing tooling for starters.
👍🏻
Sure, but with the magic and ease of WDYT? |
+1
If we start with one repo (=> Go), I have no concern with adding this one more submodule for now. Let's make it work with Go, and then keep it going for a few weeks and see how many things break. |
So one thing that this has caused me to think about is adding an action for a "please fix all my shit". So it fixes formatting, the refcache, runs |
Like a fix all? It does not have all the fixes yet, but updating that workflow file has become much easier now. So that should hopefully help a lot. Another thing we could do is add a workflow that checks CI results and lets people know that they can trigger those actions, or that they should not worry about them too much and that we can fix that for them before merging. |
I prefer this approach given it's more "context-aware" and proactive about the contributor's issue/flow, rather than "oh, read the doc" |
lgtm other than one concern/question
@@ -43,6 +43,9 @@
"check:text": "npm run _check:text -- ",
"check": "npm run seq -- $(npm run -s _list:check:*)",
"clean": "make clean",
"code-excerpts": "rm -Rf tmp/excerpts/* && npm run seq -- code-excerpts:get code-excerpts:update-docs",
Just to make sure: the other tmp references are ../tmp and this one is just tmp, is that expected?
Yes, this is expected, but thanks for noticing: the other commands do a cd tools first, hence the leading ../.
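The path difference discussed in this review thread comes down to which directory each script runs from. A minimal sketch, using a throwaway directory layout (the temporary root and the `stale.txt` file name are made up for illustration), showing that `tmp/excerpts` seen from the repo root and `../tmp/excerpts` seen after a `cd tools` name the same directory:

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the repo root; only the tmp/ and tools/ names come from the thread.
root="$(mktemp -d)"
mkdir -p "$root/tools" "$root/tmp/excerpts"
touch "$root/tmp/excerpts/stale.txt"

# A script that runs from the repo root refers to tmp/...
from_root="$(cd "$root" && ls tmp/excerpts)"

# A script that first does `cd tools` has to climb back up with ../tmp/...
from_tools="$(cd "$root/tools" && ls ../tmp/excerpts)"

echo "from root:  $from_root"
echo "from tools: $from_tools"

# Clean up the throwaway layout.
rm -rf "$root"
```

Both listings show the same file, which is why the new script can use a bare `tmp/` prefix while the older ones keep `../tmp/`.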
As discussed yesterday during the SIG call we want to proceed with this.
@chalin update and merge this PR whenever you are ready |
awesome, thanks @chalin! Now let's see how this works in the future. |
@svrnm - I'm OOO next week, but will dive into using this again (probably for the demo) when I'm back (FYI) |
What's this PR about?
Notes: I only marked three code blocks for syncing. Of these:
How it works
<?code-excerpt ...?> directives before code blocks to have the code snippet synced. That's it, the tools do the rest once configured.
How can I try this myself? (For maintainers/approvers only)
tools/README.md: install the Dart SDK if you don't have it already.
npm run code-excerpts
This does some preprocessing of the code excerpts and then syncs the docs -- both substeps are independent. Once primed, that full prep and sync takes less than 6 seconds on my machine, and that's mostly tooling overhead. The tools scale very nicely in the number of code excerpts and pages processed.
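The shape of the combined `code-excerpts` script (clear the scratch directory, then run the two substeps in order) can be sketched as follows. This is an illustration under assumptions, not the repo's implementation: `run_seq` is a hypothetical stand-in for the `seq` script, whose actual behavior is not shown in this PR, and `step` stands in for `npm run <name>` so that the sketch runs without the repo or the Dart SDK.

```shell
#!/bin/sh
set -u

# Hypothetical stand-in for `npm run <name>`: it only records and prints the step name.
ran=""
step() {
  ran="${ran:+$ran }$1"
  echo "running $1"
}

# Assumed behavior of a seq-style helper: run each step in order, stop at the first failure.
run_seq() {
  for name in "$@"; do
    step "$name" || return 1
  done
}

# Mirror the order used by the `code-excerpts` script: clean, then get, then update the docs.
scratch="$(mktemp -d)"
mkdir -p "$scratch/tmp/excerpts"
touch "$scratch/tmp/excerpts/old-excerpt.txt"
rm -Rf "$scratch/tmp/excerpts/"*
run_seq code-excerpts:get code-excerpts:update-docs
```

Since the two substeps are described as independent, running `npm run code-excerpts:get` or `npm run code-excerpts:update-docs` on its own should presumably also work in the real repo.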
PR details
examples
tools
code-excerpts, code-excerpts:get, and code-excerpts:update-docs
opentelemetry-go as a submodule source to have access to example/dice code
For later