[build] Add DocFX updater script by titusfortner · Pull Request #16980 · SeleniumHQ/selenium

titusfortner · 2026-01-22T22:05:47Z

User description

Making sure dependencies we're pinning generally have a way to auto-update.

💥 What does this PR do?

Adds automated DocFX version updating for .NET documentation builds:

Add scripts/update_docfx.py to automatically fetch the latest DocFX release from GitHub
Integrate updater into dotnet/update-deps.sh
Bump DocFX to latest version

🔧 Implementation Notes

The script fetches release information from the GitHub API and updates dotnet/private/docfx_repo.bzl
with the latest version and SHA256 hash.
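
The hashing step described above can be sketched as follows. This is a minimal illustration, not the script's actual code: `sha256_of_stream` is a hypothetical name, and an in-memory `io.BytesIO` stands in for the downloaded package so the example runs without network access.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a file-like object in fixed-size chunks to keep memory use flat."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a downloaded .nupkg body (assumption: any readable byte stream works).
payload = io.BytesIO(b"example package bytes")
print(sha256_of_stream(payload, chunk_size=8))
```

Reading in chunks means the package never has to be held in memory in full, which matters more for large archives than for a small index file.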

🔄 Types of changes

New feature (non-breaking change which adds functionality and tests!)

PR Type

Enhancement

Description

Add automated DocFX version updater script for NuGet packages
Fetch latest DocFX release and compute SHA256 hash automatically
Integrate updater into dotnet/update-deps.sh build workflow
Bump DocFX from 2.78.2 to 2.78.4

Diagram Walkthrough

flowchart LR
  A["NuGet API"] -->|fetch versions| B["update_docfx.py"]
  B -->|compute SHA256| C["docfx_repo.bzl"]
  D["update-deps.sh"] -->|invoke| B
  B -->|update| C

File Walkthrough

Relevant files

Enhancement

update_docfx.py: `DocFX version updater script implementation` (scripts/update_docfx.py, +141/-0)
- New Python script to fetch latest DocFX version from NuGet API
- Computes SHA256 hash of downloaded nupkg file
- Supports explicit version selection and prerelease filtering
- Generates updated `docfx_repo.bzl` with version and hash

Dependencies

docfx_repo.bzl: `Update DocFX to version 2.78.4` (dotnet/private/docfx_repo.bzl, +2/-2)
- Bump DocFX version from 2.78.2 to 2.78.4
- Update SHA256 hash to match new version

Configuration changes

update-deps.sh: `Integrate DocFX updater into build workflow` (dotnet/update-deps.sh, +2/-0)
- Add invocation of `bazel run //scripts:update_docfx` at end of script
- Integrates DocFX updater into automated dependency update workflow

BUILD.bazel: `Add build target for DocFX updater` (scripts/BUILD.bazel, +8/-0)
- Add new `py_binary` target for `update_docfx` script
- Declare dependency on `packaging` library for version parsing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

qodo-code-review · 2026-01-22T22:06:18Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
⚪	Supply chain update

Description: The new updater script performs network fetches and derives the version automatically from remote NuGet index data (or user input) before downloading and hashing the corresponding package. This introduces a potential supply-chain risk if the upstream metadata or package source is compromised and the script is run automatically (e.g., via `dotnet/update-deps.sh`).

update_docfx.py [21-135]

Referred Code

    def fetch_json(url):
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read())


    def choose_version(versions, allow_prerelease, explicit_version=None):
        if explicit_version:
            return explicit_version

        parsed = []
        for v in versions:
            try:
                pv = Version(v)
            except InvalidVersion:
                continue
            if not allow_prerelease and pv.is_prerelease:
                continue
            parsed.append((pv, v))

        if not parsed:
            # Fall back to any parseable version.
    ... (clipped 94 lines)
Ticket Compliance
⚪	🎫 No ticket provided
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Comprehensive Audit Trails
	Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance.
	Status: Passed

	Generic: Meaningful Naming and Self-Documenting Code
	Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting.
	Status: Passed

	Generic: Secure Error Handling
	Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging.
	Status: Passed

	Generic: Secure Logging Practices
	Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data.
	Status: Passed
⚪	Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation.

Status: Network error handling: The new script makes external HTTP requests via `urllib.request.urlopen()` without timeouts or contextual error handling, so transient failures may raise unhandled exceptions with limited actionable context.

Referred Code

    def fetch_json(url):
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read())


    def choose_version(versions, allow_prerelease, explicit_version=None):
        if explicit_version:
            return explicit_version

        parsed = []
        for v in versions:
            try:
                pv = Version(v)
            except InvalidVersion:
                continue
            if not allow_prerelease and pv.is_prerelease:
                continue
            parsed.append((pv, v))

        if not parsed:
            # Fall back to any parseable version.
    ... (clipped 21 lines)
	Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities.

Status: Output path validation: The `--output` argument is written directly via `output_path.write_text(...)` without validation or safeguards, which could overwrite arbitrary files if the script is run with an unexpected path.

Referred Code

    parser.add_argument(
        "--output",
        default="dotnet/private/docfx_repo.bzl",
        help="Output file path (default: dotnet/private/docfx_repo.bzl)",
    )
    args = parser.parse_args()

    index = fetch_json(NUGET_INDEX_URL)
    versions = index.get("versions", [])
    if not versions:
        raise ValueError("NuGet index returned no versions for DocFX")

    version = choose_version(versions, args.allow_prerelease, args.version)
    nupkg_url = NUGET_NUPKG_URL.format(version=version)
    sha256 = sha256_of_url(nupkg_url)

    output_path = Path(args.output)
    if not output_path.is_absolute():
        workspace_dir = os.environ.get("BUILD_WORKSPACE_DIRECTORY")
        if workspace_dir:
            output_path = Path(workspace_dir) / output_path
    ... (clipped 2 lines)
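
One way to address the output path finding is to resolve the path against the workspace and refuse anything that lands outside it. The sketch below is an assumption about how that could look, not code from the PR; `resolve_output_path` is a hypothetical helper, and `workspace_dir` stands in for the value of `BUILD_WORKSPACE_DIRECTORY`.

```python
from pathlib import Path


def resolve_output_path(output, workspace_dir):
    """Resolve `output` against `workspace_dir` and refuse paths outside it."""
    if not workspace_dir:
        raise ValueError("workspace_dir is required to resolve the output path")

    root = Path(workspace_dir).resolve()
    candidate = Path(output)
    if not candidate.is_absolute():
        candidate = root / candidate
    # resolve() collapses ".." segments so the containment check sees the real target.
    candidate = candidate.resolve()

    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"Refusing to write outside the workspace: {candidate}") from None
    return candidate
```

Whether an updater script that is only run by maintainers needs this guard is a judgment call; the check mainly protects against a mistyped `--output`.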

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2026-01-22T22:07:45Z

PR Code Suggestions ✨

Latest suggestions up to 406d8a8

Category	Suggestion	Impact
Possible issue	Verify download status and cleanup	Medium

In `sha256_of_url`, verify the HTTP status is 200 before hashing and use a `try...finally` block to ensure `release_conn()` is always called.

scripts/update_docfx.py [53-59]

     def sha256_of_url(url):
         digest = hashlib.sha256()
    -    r = http.request("GET", url, preload_content=False)
    -    for chunk in r.stream(1024 * 1024):
    -        digest.update(chunk)
    -    r.release_conn()
    -    return digest.hexdigest()
    +    r = http.request(
    +        "GET",
    +        url,
    +        preload_content=False,
    +        timeout=urllib3.Timeout(connect=5.0, read=60.0),
    +    )
    +    try:
    +        if r.status != 200:
    +            raise RuntimeError(f"Failed to download {url} (HTTP {r.status})")
    +        for chunk in r.stream(1024 * 1024):
    +            digest.update(chunk)
    +        return digest.hexdigest()
    +    finally:
    +        r.release_conn()

Suggestion importance[1-10]: 8

Why: This suggestion fixes a potential bug where an incorrect SHA256 hash of an error page could be generated and written to a build file, and also prevents connection leaks.
	Validate HTTP status for JSON	Medium

Add HTTP status code validation and a timeout to the `fetch_json` function to handle network failures gracefully.

scripts/update_docfx.py [23-25]

     def fetch_json(url):
    -    r = http.request("GET", url)
    -    return json.loads(r.data)
    +    r = http.request("GET", url, timeout=urllib3.Timeout(connect=5.0, read=30.0))
    +    if r.status != 200:
    +        raise RuntimeError(f"Failed to fetch JSON from {url} (HTTP {r.status})")
    +    try:
    +        return json.loads(r.data)
    +    except json.JSONDecodeError as e:
    +        raise RuntimeError(f"Invalid JSON from {url} (HTTP {r.status})") from e

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies that the script lacks robustness by not checking HTTP status codes, which could lead to unhelpful errors if the NuGet API fails.
	Make file writes atomic	Low

Implement an atomic write operation by writing to a temporary file first and then replacing the original file to prevent corruption.

scripts/update_docfx.py [132]

    -output_path.write_text(render_docfx_repo(version, sha256))
    +contents = render_docfx_repo(version, sha256)
    +tmp_path = output_path.with_suffix(output_path.suffix + ".tmp")
    +tmp_path.write_text(contents, encoding="utf-8")
    +tmp_path.replace(output_path)

Suggestion importance[1-10]: 6

Why: The suggestion improves the script's robustness by proposing an atomic file write, which prevents file corruption if the script is interrupted.
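
The atomic-write suggestion can also be sketched with a uniquely named temporary file, which avoids clobbering a stale `.tmp` file left by an earlier run. This is an illustrative variant rather than the code proposed in the review; `write_text_atomically` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def write_text_atomically(path, contents):
    """Write `contents` to a temp file beside `path`, then swap it into place."""
    path = Path(path)
    # Create the temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(contents)
        # os.replace overwrites the destination in a single rename step.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if writing or replacing failed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Readers of the target file then see either the old contents or the new contents, never a partially written file.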

Previous suggestions

✅ Suggestions up to commit 894dde4

Category	Suggestion	Impact
High-level	Modify bzl file instead of overwriting	Medium

Instead of overwriting `docfx_repo.bzl` with a hardcoded template, the script should read the file and use regular expressions to replace only the `version` and `sha256` values. This preserves other content and makes the update process more robust.

Examples: scripts/update_docfx.py [65-100]

    def render_docfx_repo(version, sha256):
        return f'''\
    """Repository rule to download the docfx NuGet package."""

    _BUILD = """
    package(default_visibility = ["//visibility:public"])
    exports_files(glob(["*/"]))
    filegroup(name = "docfx_dll", srcs = ["tools/net8.0/any/docfx.dll"])
    """
    ... (clipped 26 lines)

Solution Walkthrough:

Before:

    # scripts/update_docfx.py
    def render_docfx_repo(version, sha256):
        # Hardcoded template for the entire file
        return f'''\
    ...
    def _docfx_extension_impl(module_ctx):
        docfx_repo(
            name = "docfx",
            version = "{version}",
            sha256 = "{sha256}",
        )
    ...
    '''

    def main():
        # ... fetch version and sha
        # Overwrites the entire file
        output_path.write_text(render_docfx_repo(version, sha256))

After:

    # scripts/update_docfx.py
    import re

    def update_bzl_file(path, version, sha256):
        content = path.read_text()
        # [^"]* matches the whole quoted value; \g<1> keeps the group reference
        # unambiguous when the inserted value starts with a digit.
        content = re.sub(
            r'(version = ")[^"]*(")',
            f'\\g<1>{version}\\g<2>',
            content
        )
        content = re.sub(
            r'(sha256 = ")[^"]*(")',
            f'\\g<1>{sha256}\\g<2>',
            content
        )
        path.write_text(content)

    def main():
        # ... fetch version and sha
        update_bzl_file(output_path, version, sha256)

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies a significant design flaw where overwriting `docfx_repo.bzl` is brittle; modifying only the necessary values would make the script more robust and maintainable.
Possible issue	✅ ~~Fix version selection to respect prerelease flag~~	Medium

Suggestion Impact: The commit removed the fallback that parsed "any version" when no stable versions were found, and replaced it with conditional ValueErrors that respect the allow_prerelease flag (stable-only error vs parseable-versions error). This prevents unintended prerelease selection.

code diff:

     def choose_version(versions, allow_prerelease, explicit_version=None):
         if explicit_version:
    +        if explicit_version not in versions:
    +            raise ValueError(f"Requested DocFX version {explicit_version!r} not found in NuGet index")
             return explicit_version

         parsed = []
    @@ -38,27 +42,20 @@
             parsed.append((pv, v))

         if not parsed:
    -        # Fall back to any parseable version.
    -        for v in versions:
    -            try:
    -                parsed.append((Version(v), v))
    -            except InvalidVersion:
    -                continue
    -
    -    if not parsed:
    -        raise ValueError("No parseable DocFX versions found in NuGet index")
    +        if allow_prerelease:
    +            raise ValueError("No parseable DocFX versions found in NuGet index")
    +        else:
    +            raise ValueError("No stable DocFX versions found. Use --allow-prerelease to include prereleases.")

         return max(parsed, key=lambda item: item[0])[1]

In the `choose_version` function, remove the fallback logic that ignores the `allow_prerelease` flag to prevent unintentional selection of prerelease versions. Instead, raise an error if no suitable versions are found.

scripts/update_docfx.py [26-51]

     def choose_version(versions, allow_prerelease, explicit_version=None):
         if explicit_version:
             return explicit_version

         parsed = []
         for v in versions:
             try:
                 pv = Version(v)
             except InvalidVersion:
                 continue
             if not allow_prerelease and pv.is_prerelease:
                 continue
             parsed.append((pv, v))

         if not parsed:
    -        # Fall back to any parseable version.
    -        for v in versions:
    -            try:
    -                parsed.append((Version(v), v))
    -            except InvalidVersion:
    -                continue
    -
    -    if not parsed:
    -        raise ValueError("No parseable DocFX versions found in NuGet index")
    +        if allow_prerelease:
    +            raise ValueError("No parseable DocFX versions found in NuGet index")
    +        else:
    +            raise ValueError("No stable DocFX versions found in NuGet index. Use --allow-prerelease to include them.")

         return max(parsed, key=lambda item: item[0])[1]

`[Suggestion processed]`

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies a logic flaw where the `allow_prerelease` flag is ignored in the fallback path, potentially leading to an unintended prerelease version being selected.
Possible issue	✅ ~~Check explicit version validity~~	Medium

Suggestion Impact: The commit added an explicit_version validity check in choose_version(): if the requested version is not present in the NuGet index versions list, it raises a ValueError with a clear message before returning.

code diff:

     def choose_version(versions, allow_prerelease, explicit_version=None):
         if explicit_version:
    +        if explicit_version not in versions:
    +            raise ValueError(f"Requested DocFX version {explicit_version!r} not found in NuGet index")
             return explicit_version

Before returning an `explicit_version`, validate that it exists in the list of available `versions` from the NuGet index to fail early with a clear error.

scripts/update_docfx.py [27-28]

     if explicit_version:
    -    return explicit_version
    +    if explicit_version in versions:
    +        return explicit_version
    +    else:
    +        raise ValueError(f"Explicit version {explicit_version!r} not found in NuGet index")

`[Suggestion processed]`

Suggestion importance[1-10]: 7

Why: This is a valuable suggestion for improving robustness by validating user input early, which provides a clearer error message than letting the script fail later during the download phase.
General	Handle HTTP errors during file download	Low

In `sha256_of_url`, add a `try...except` block to handle potential `HTTPError` exceptions during file download and provide a more informative error message.

scripts/update_docfx.py [54-62]

     def sha256_of_url(url):
         digest = hashlib.sha256()
    -    with urllib.request.urlopen(url) as response:
    -        while True:
    -            chunk = response.read(1024 * 1024)
    -            if not chunk:
    -                break
    -            digest.update(chunk)
    +    try:
    +        with urllib.request.urlopen(url) as response:
    +            while True:
    +                chunk = response.read(1024 * 1024)
    +                if not chunk:
    +                    break
    +                digest.update(chunk)
    +    except urllib.error.HTTPError as e:
    +        raise ValueError(f"Failed to download from {url}: {e}") from e
         return digest.hexdigest()

Suggestion importance[1-10]: 6

Why: This suggestion improves the script's robustness by adding error handling for network download failures, which provides better feedback to the user than an unhandled exception.
General	Handle HTTP errors gracefully	Low

In the `fetch_json` function, wrap the `urlopen` call in a `try...except` block to handle network failures and raise a more informative error.

scripts/update_docfx.py [21-23]

     def fetch_json(url):
    -    with urllib.request.urlopen(url) as response:
    -        return json.loads(response.read())
    +    try:
    +        with urllib.request.urlopen(url) as response:
    +            return json.loads(response.read())
    +    except Exception as e:
    +        raise RuntimeError(f"Failed to fetch JSON from {url}: {e}") from e

Suggestion importance[1-10]: 6

Why: This suggestion improves the script's reliability by handling network errors when fetching the NuGet index, preventing the script from crashing and providing a clear error message.
Learned best practice	✅ ~~Validate CLI/env inputs before use~~	Low

Suggestion Impact: The commit added stricter validation around explicit DocFX version selection by rejecting an explicit_version that is not present in the NuGet index, and improved error handling when no suitable (stable vs prerelease) versions are available. However, it did not implement the suggested CLI/env trimming/validation for --output or BUILD_WORKSPACE_DIRECTORY handling.

code diff:

     def choose_version(versions, allow_prerelease, explicit_version=None):
         if explicit_version:
    +        if explicit_version not in versions:
    +            raise ValueError(f"Requested DocFX version {explicit_version!r} not found in NuGet index")
             return explicit_version

         parsed = []
    @@ -38,27 +42,20 @@
             parsed.append((pv, v))

         if not parsed:
    -        # Fall back to any parseable version.
    -        for v in versions:
    -            try:
    -                parsed.append((Version(v), v))
    -            except InvalidVersion:
    -                continue
    -
    -    if not parsed:
    -        raise ValueError("No parseable DocFX versions found in NuGet index")
    +        if allow_prerelease:
    +            raise ValueError("No parseable DocFX versions found in NuGet index")
    +        else:
    +            raise ValueError("No stable DocFX versions found. Use --allow-prerelease to include prereleases.")

Trim and validate `--version`/`--output` and require/validate `BUILD_WORKSPACE_DIRECTORY` (or explicitly define a fallback) so the script doesn't write to an unexpected relative path or accept invalid versions.

scripts/update_docfx.py [126-135]

    -version = choose_version(versions, args.allow_prerelease, args.version)
    +explicit_version = args.version.strip() if args.version else None
    +if explicit_version:
    +    try:
    +        Version(explicit_version)
    +    except InvalidVersion as e:
    +        raise ValueError(f"Invalid --version: {explicit_version}") from e
    +
    +version = choose_version(versions, args.allow_prerelease, explicit_version)
     nupkg_url = NUGET_NUPKG_URL.format(version=version)
     sha256 = sha256_of_url(nupkg_url)

    -output_path = Path(args.output)
    +output_arg = (args.output or "").strip()
    +if not output_arg:
    +    raise ValueError("--output must be a non-empty path")
    +output_path = Path(output_arg)
     if not output_path.is_absolute():
    -    workspace_dir = os.environ.get("BUILD_WORKSPACE_DIRECTORY")
    -    if workspace_dir:
    -        output_path = Path(workspace_dir) / output_path
    +    workspace_dir = (os.environ.get("BUILD_WORKSPACE_DIRECTORY") or "").strip()
    +    if not workspace_dir:
    +        raise EnvironmentError("BUILD_WORKSPACE_DIRECTORY is required when --output is a relative path")
    +    output_path = Path(workspace_dir) / output_path

     output_path.write_text(render_docfx_repo(version, sha256))

`[Suggestion processed]`

Suggestion importance[1-10]: 6

Why: Relevant best practice - Add explicit validation and availability guards at integration boundaries (e.g., environment variables, CLI inputs, network calls) before use.
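
The in-place update proposed in the high-level suggestion above (replace only the `version` and `sha256` values) can be sketched as below. This is an assumption-laden illustration, not the merged implementation: `replace_quoted_value` and `update_pins` are hypothetical names, and the sketch assumes each key appears once as `key = "value"`. A callable replacement is used so that a value starting with a digit, such as `2.78.4`, cannot be misread as part of a group reference.

```python
import re


def replace_quoted_value(content, key, value):
    """Replace the first `key = "..."` assignment, leaving other text untouched."""
    pattern = re.compile(r'(\b' + re.escape(key) + r'\s*=\s*")[^"]*(")')
    new_content, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(2), content, count=1
    )
    if count != 1:
        # Failing loudly is safer than silently writing an unchanged file.
        raise ValueError(f"No {key!r} assignment found to update")
    return new_content


def update_pins(content, version, sha256):
    """Update both pinned values in the text of a .bzl file."""
    content = replace_quoted_value(content, "version", version)
    return replace_quoted_value(content, "sha256", sha256)


example = 'docfx_repo(\n    name = "docfx",\n    version = "2.78.2",\n    sha256 = "old",\n)\n'
print(update_pins(example, "2.78.4", "new"))
```

The trade-off against regenerating the file from a template is that hand edits elsewhere in the file survive, at the cost of depending on the exact `key = "value"` layout.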

Copilot

Pull request overview

This PR adds automated DocFX version updating for .NET documentation builds. The script fetches the latest DocFX release from NuGet (not GitHub as stated in the description), computes its SHA256 hash, and updates the Bazel repository rule configuration file.

Changes:

Add scripts/update_docfx.py to automatically fetch and update DocFX versions from NuGet
Add py_binary target in scripts/BUILD.bazel for the new script
Integrate the updater into dotnet/update-deps.sh workflow
Update DocFX from version 2.78.2 to 2.78.4 with corresponding SHA256

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File	Description
scripts/update_docfx.py	New Python script that fetches DocFX version info from NuGet API and generates updated Bazel configuration
scripts/BUILD.bazel	Adds py_binary target for update_docfx with packaging dependency
dotnet/update-deps.sh	Integrates DocFX updater into the .NET dependency update workflow
dotnet/private/docfx_repo.bzl	Updates DocFX version from 2.78.2 to 2.78.4 with new SHA256 hash

scripts/update_docfx.py

- Validate explicit version exists in NuGet index before use
- Remove fallback that ignored --allow-prerelease flag

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Validate explicit version exists in NuGet index before use
- Remove fallback that ignored --allow-prerelease flag
- Switch from urllib.request to urllib3 for consistency with other scripts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
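
The version-selection behaviour described in these commit messages can be sketched as follows. The PR's script relies on the `packaging` library for version parsing; to stay dependency-free, this sketch swaps in a simplified regex parser, which is an assumption: it treats any `-suffix` as a prerelease marker and compares suffixes as plain strings. `parse_version` is a hypothetical helper.

```python
import re

# Assumption: versions look like "2.78.4" or "2.79.0-beta.1".
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:-(.+))?$")


def parse_version(text):
    """Return (numbers, is_prerelease, suffix), or None if unparseable."""
    match = _VERSION.match(text)
    if not match:
        return None
    numbers = tuple(int(part) for part in match.group(1).split("."))
    suffix = match.group(2)
    return numbers, suffix is not None, suffix or ""


def choose_version(versions, allow_prerelease, explicit_version=None):
    if explicit_version:
        # Fail early if the requested version is not published.
        if explicit_version not in versions:
            raise ValueError(f"Requested version {explicit_version!r} not found in index")
        return explicit_version

    candidates = []
    for raw in versions:
        parsed = parse_version(raw)
        if parsed is None:
            continue
        numbers, is_prerelease, suffix = parsed
        if is_prerelease and not allow_prerelease:
            continue
        # A stable release sorts above a prerelease with the same numbers.
        candidates.append(((numbers, not is_prerelease, suffix), raw))

    if not candidates:
        # No silent fallback: the error depends on what the caller allowed.
        if allow_prerelease:
            raise ValueError("No parseable versions found in index")
        raise ValueError("No stable versions found; allow prereleases to include them")

    return max(candidates, key=lambda item: item[0])[1]
```

For example, given `["2.78.2", "2.78.4", "2.79.0-beta.1"]`, the sketch picks `2.78.4` by default and `2.79.0-beta.1` only when prereleases are allowed.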

qodo-code-review · 2026-01-23T00:34:45Z

Persistent suggestions updated to latest commit 406d8a8

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

* Add DocFX updater script
* Validate explicit version exists in NuGet index before use
* Switch from urllib.request to urllib3 for consistency with other scripts

titusfortner and others added 4 commits January 22, 2026 15:44

Add DocFX updater script

a705586

Bump DocFX artifact pin

9616d25

Run DocFX updater in dotnet/update-deps.sh

8fe0598

Fix import order in update_docfx.py

894dde4

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

titusfortner requested a review from Copilot January 22, 2026 22:05

selenium-ci added the C-dotnet (.NET Bindings) and B-build (Includes scripting, bazel and CI integrations) labels Jan 22, 2026

titusfortner changed the title ~~Add DocFX updater script~~ [build] Add DocFX updater script Jan 22, 2026

Copilot started reviewing on behalf of titusfortner January 22, 2026 22:06 View session

qodo-code-review bot added the Review effort 2/5 label Jan 22, 2026

Copilot AI reviewed Jan 22, 2026


titusfortner and others added 2 commits January 22, 2026 18:20

Fix version selection in update_docfx.py

3ce148d

- Validate explicit version exists in NuGet index before use
- Remove fallback that ignored --allow-prerelease flag

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

titusfortner requested a review from Copilot January 23, 2026 00:33

Copilot started reviewing on behalf of titusfortner January 23, 2026 00:33 View session

Copilot AI reviewed Jan 23, 2026


titusfortner merged commit e232599 into trunk Jan 23, 2026
28 of 29 checks passed

titusfortner deleted the docfx_updater branch January 23, 2026 00:54

titusfortner added a commit that referenced this pull request Jan 23, 2026

[build] Add DocFX updater script (#16980)

5c1dbb3

* Add DocFX updater script
* Validate explicit version exists in NuGet index before use
* Switch from urllib.request to urllib3 for consistency with other scripts

titusfortner merged 6 commits into trunk from docfx_updater

3 participants
