@kaxil kaxil commented Dec 20, 2025

Preview: https://airflow.staged.apache.org/docs/apache-airflow/3.1.5/

I have been frustrated by Sphinx search for a long, long time. So after adding dark mode (and other related changes in https://github.com/apache/airflow-site), this was next on my list!

This PR/commit introduces a fast, fully client-side search experience for the Apache Airflow documentation, powered by Pagefind. The new search is keyboard-accessible (Cmd+K / Ctrl+K), works offline, and requires no external services.

Search indexes are generated automatically at documentation build time and loaded entirely in the browser, enabling sub-50 ms queries even on large docs.

I initially kept the Sphinx search as a backup, but it is now completely replaced.

What’s included

New Sphinx extension: pagefind_search

Located in devel-common/src/sphinx_exts/pagefind_search/:

  • __init__.py: Extension setup with configuration values and event handlers
  • builder.py: Automatic index building with graceful fallback
  • static/css/pagefind.css: Search modal and button styling with dark mode support
  • static/js/search.js: Search functionality with keyboard shortcuts
  • templates/search-modal.html: Search modal HTML template

Features

  • Keyboard shortcut (Cmd+K/Ctrl+K) opens search modal
  • Arrow key navigation through results
  • Works offline (no external services)
  • Automatic indexing during documentation build
  • Dark mode support
  • Sub-50ms search performance
  • Configurable content indexing via conf.py
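
The "automatic indexing" with "graceful fallback" could look roughly like the sketch below. It assumes the index is built by shelling out to the Pagefind CLI (`pagefind --site <dir>`); the PR may well use a different mechanism, and the `build_index` function name is hypothetical.

```python
from __future__ import annotations

import logging
import shutil
import subprocess
from pathlib import Path

logger = logging.getLogger(__name__)


def build_index(site_dir: Path, executable: str = "pagefind") -> bool:
    """Run the Pagefind CLI over built HTML; return False instead of failing."""
    binary = shutil.which(executable)
    if binary is None:
        # Graceful fallback: the docs build continues without a search index.
        logger.warning("%s not found on PATH; skipping search index", executable)
        return False
    try:
        subprocess.run([binary, "--site", str(site_dir)], check=True)
    except (subprocess.CalledProcessError, OSError) as err:
        logger.warning("Pagefind indexing failed: %s", err)
        return False
    return True
```

Returning a boolean rather than raising keeps a missing or failing indexer from breaking the documentation build itself.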

User Experience

Users can now:

  • Press Cmd+K (or Ctrl+K) from any documentation page to search
  • Navigate results with arrow keys, press Enter to select, and Esc to close
  • Search immediately, without network requests
  • See the page title, breadcrumb, and excerpt for each result

Configuration

Available in conf.py:

  • pagefind_enabled: Toggle search indexing
  • pagefind_verbose: Enable build logging
  • pagefind_root_selector: Define searchable content area
  • pagefind_exclude_selectors: Exclude navigation, headers, footers
  • pagefind_custom_records: Index non-HTML content (PDFs, etc.)
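
As a rough illustration, these options might be set in conf.py as follows. All values are assumptions chosen for the example, and the shape of a `pagefind_custom_records` entry in particular is hypothetical; check the extension source for the real defaults and keys.

```python
# conf.py (excerpt): illustrative values only, not the project's settings
extensions = ["pagefind_search"]  # assumes the extension is importable by this name

pagefind_enabled = True          # toggle search indexing
pagefind_verbose = False         # enable build logging
pagefind_root_selector = "main"  # assumed CSS selector for the searchable area
pagefind_exclude_selectors = ["nav", "header", "footer"]  # assumed selectors

# Hypothetical record shape for non-HTML content such as PDFs.
pagefind_custom_records = [
    {
        "url": "/docs/example.pdf",
        "title": "Example PDF",
        "content": "Text extracted from the PDF",
    },
]
```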

Ranking Optimization

I have also spent a lot of time tuning the search knobs below. The extension now uses optimized ranking parameters in search.js, which in my testing produced better results:

  • termFrequency: 1.0 - Standard term occurrence weighting
  • termSaturation: 0.7 - Moderate saturation to prevent over-rewarding repetition
  • termSimilarity: 7.5 - Maximum boost for exact phrase matches and similar terms
  • pageLength: 0 - No penalty for longer pages (important for reference documentation)
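
To build intuition for the saturation and page-length knobs, here is a textbook BM25-style term score in Python. It is not Pagefind's actual implementation: mapping `termSaturation` to BM25's k1 and `pageLength` to b is an assumption made for illustration, and the inverse-document-frequency factor is omitted.

```python
def term_score(
    tf: float,
    page_len: float,
    avg_len: float,
    saturation: float = 0.7,
    page_length: float = 0.0,
) -> float:
    """BM25-style score for one term on one page (IDF omitted)."""
    # With page_length == 0 this factor is 1, so page size has no effect.
    length_norm = 1.0 - page_length + page_length * (page_len / avg_len)
    # Each extra occurrence adds less; the score is capped at saturation + 1.
    return tf * (saturation + 1.0) / (tf + saturation * length_norm)


if __name__ == "__main__":
    for tf in (1, 2, 5, 10):
        print(tf, round(term_score(tf, page_len=5000, avg_len=1000), 3))
```

Under these assumptions, a lower saturation value flattens the reward for repeating a term, and a page-length weight of 0 means a long reference page is scored the same as a short one with the same term count.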

closes apache/airflow-site#666

@choo121600

Wow, it looks really cool 😲

Dev-iL commented Dec 20, 2025

Great work, Kaxil!

potiuk commented Dec 20, 2025

NICE!

kaxil commented Dec 20, 2025

I completely replaced Sphinx search based on the mailing list feedback. This is how it looks now:

[4 images]

kaxil added a commit to apache/airflow-site-archive that referenced this pull request Dec 20, 2025
@kaxil kaxil merged commit d0bd2df into apache:main Dec 20, 2025
126 checks passed
@kaxil kaxil deleted the indexed-search branch December 20, 2025 13:01
kaxil added a commit to astronomer/airflow that referenced this pull request Dec 20, 2025
(cherry picked from commit d0bd2df)
@kaxil kaxil added this to the Airflow 3.1.6 milestone Dec 20, 2025
kaxil added a commit to apache/airflow-site-archive that referenced this pull request Dec 20, 2025
kaxil added a commit to apache/airflow-site-archive that referenced this pull request Dec 20, 2025
potiuk pushed a commit to potiuk/airflow that referenced this pull request Dec 20, 2025
…che#59658)

(cherry picked from commit d0bd2df)

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

potiuk commented Dec 20, 2025

Backport in #59669

potiuk added a commit to potiuk/airflow that referenced this pull request Dec 20, 2025
Follow-up after apache#59658: there are a few more places where canary
builds used `pip check`, and until the inconsistent `pip` behaviour
is fixed (pypa/pip#13709) we should use
`uv pip check` instead.
kaxil added a commit that referenced this pull request Dec 20, 2025
(cherry picked from commit d0bd2df)
kaxil pushed a commit that referenced this pull request Dec 20, 2025
kaxil pushed a commit to astronomer/airflow that referenced this pull request Dec 20, 2025
(cherry picked from commit 33e6d3e)
@amoghrajesh

Looks really cool!

potiuk pushed a commit that referenced this pull request Dec 28, 2025
(cherry picked from commit d0bd2df)
potiuk pushed a commit that referenced this pull request Dec 28, 2025
(cherry picked from commit d0bd2df)
Subham-KRLX pushed a commit to Subham-KRLX/airflow that referenced this pull request Jan 2, 2026
ephraimbuddy pushed a commit that referenced this pull request Jan 6, 2026
(cherry picked from commit d0bd2df)
ephraimbuddy pushed a commit that referenced this pull request Jan 6, 2026
(cherry picked from commit 33e6d3e)
@ephraimbuddy ephraimbuddy added the type:doc-only Changelog: Doc Only label Jan 6, 2026
stegololz pushed a commit to stegololz/airflow that referenced this pull request Jan 9, 2026
stegololz pushed a commit to stegololz/airflow that referenced this pull request Jan 9, 2026