Skip to content

Dry run Elasticsearch indexing #21

Dry run Elasticsearch indexing

Dry run Elasticsearch indexing #21

name: Dry run Elasticsearch indexing
# **What it does**: Tests to index records into a local Elasticsearch
# **Why we have it**: To make sure the indexing code works.
# **Who does it impact**: Docs engineering.
on:
merge_group:
pull_request:
paths:
- 'script/search/**'
- 'package*.json'
- .github/workflows/dry-run-elasticsearch-indexing.yml
permissions:
contents: read
jobs:
dry-run-elasticsearch-indexing:
# Avoid github/docs and forks of it
if: github.repository == 'github/docs-internal'
runs-on: ubuntu-20.04-xl
steps:
- uses: getong/elasticsearch-action@95b501ab0c83dee0aac7c39b7cea3723bef14954
with:
elasticsearch version: '8.2.0'
host port: 9200
container port: 9200
host node port: 9300
node port: 9300
discovery type: 'single-node'
- name: Checkout
uses: actions/checkout@dcd71f646680f2efd8db4afa5ad64fdcba30e748
with:
lfs: 'true'
- name: Check out LFS objects
run: git lfs checkout
- name: Setup node
uses: actions/setup-node@1f8c6b94b26d0feae1e387ca63ccbdc44d27b561
with:
node-version: 16.15.x
cache: npm
- name: Install dependencies
run: npm ci
- name: Cache nextjs build
uses: actions/cache@48af2dc4a9e8278b89d7fa154b955c30c6aaab09
with:
path: .next/cache
key: ${{ runner.os }}-nextjs-${{ hashFiles('package*.json') }}
- name: Run build scripts
run: npm run build
- name: Start the server in the background
env:
ENABLE_DEV_LOGGING: false
run: |
npm run sync-search-server > /tmp/stdout.log 2> /tmp/stderr.log &
# first sleep to give it a chance to start
sleep 6
curl --retry-connrefused --retry 4 -I http://localhost:4002/
- if: ${{ failure() }}
name: Debug server outputs on errors
run: |
echo "____STDOUT____"
cat /tmp/stdout.log
echo "____STDERR____"
cat /tmp/stderr.log
- name: Scrape records into a temp directory
env:
# If a reusable, or anything in the `data/*` directory is deleted
# you might get a
#
# RenderError: Can't find the key 'site.data.reusables...' in the scope
#
# But that'll get fixed in the next translation pipeline. For now,
# let's just accept an empty string instead.
THROW_ON_EMPTY: false
run: |
mkdir /tmp/records
npm run sync-search-indices -- \
--language en \
--version dotcom \
--out-directory /tmp/records \
--no-compression --no-lunr-index
ls -lh /tmp/records
# Serves two purposes;
# 1. Be confident that the Elasticsearch server start-up worked at all
# 2. Sometimes Elasticsearch will bind to the port but still not
# technically be ready. By using `curl --retry` we can know it's
# also genuinely ready to use.
- name: Ping Elasticsearch
run: curl --retry-connrefused --retry 5 -I http://localhost:9200/
- name: Index some
env:
ELASTICSEARCH_URL: 'http://localhost:9200'
run: |
./script/search/index-elasticsearch.js --verbose \
-l en \
-V dotcom -- /tmp/records
- name: Show created indexes and aliases
run: |
curl http://localhost:9200/_cat/indices?v
curl http://localhost:9200/_cat/aliases?v