Improve the DVC image diff workflow to support side-by-side comparison of modified images #1219

Merged (4 commits) on Apr 21, 2021
49 changes: 36 additions & 13 deletions .github/workflows/dvc-diff.yml
@@ -24,6 +24,9 @@ jobs:
- name: Setup continuous machine learning (CML)
uses: iterative/setup-cml@v1.0.0

- name: Pull image data from cloud storage
run: dvc pull --remote upstream

# Produce the markdown diff report, which should look like:
# ## Summary of changed images
#
@@ -32,32 +35,52 @@ jobs:
# | Status | Path |
# |----------|-------------------------------------|
# | added | pygmt/tests/baseline/test_image.png |
- name: Put list of images that were added or changed into report
# | deleted | pygmt/tests/baseline/test_image2.png |
# | modified | pygmt/tests/baseline/test_image3.png |
- name: Generate the image diff report
env:
repo_token: ${{ secrets.GITHUB_TOKEN }}
id: image-diff
run: |
echo -e "## Summary of changed images\n" > report.md
echo -e "This is an auto-generated report of images that have changed on the DVC remote\n" >> report.md
dvc diff --show-md master HEAD >> report.md
cat report.md

- name: Pull image data from cloud storage
run: dvc pull --remote upstream

- name: Put image diff(s) into report
env:
repo_token: ${{ secrets.GITHUB_TOKEN }}
id: image-diff
run: |
# Get just the filename of the changed image from the report
awk 'NF==5 && NR>=7 {print $4}' report.md > diff_files.txt
# Get just the filenames of the added and modified images from the report
awk 'NF==5 && NR>=7 && $2=="added" {print $4}' report.md > added_files.txt
awk 'NF==5 && NR>=7 && $2=="modified" {print $4}' report.md > modified_files.txt

# Append each image to the markdown report
echo -e "## Image diff(s)\n" >> report.md
echo -e "<details>\n" >> report.md

# Added images
echo -e "### Added images\n" >> report.md
while IFS= read -r line; do
echo -e "- $line \n" >> report.md
cml-publish --title $line --md "$line" >> report.md < /dev/null
done < diff_files.txt
done < added_files.txt

# Modified images
echo -e "### Modified images\n" >> report.md
# Upload new images
while IFS= read -r line; do
cml-publish --title $line --md "$line" >> modified_images_new.md < /dev/null
done < modified_files.txt

# Pull images in the master branch from cloud storage
git checkout master
dvc pull --remote upstream
# Upload old images
while IFS= read -r line; do
cml-publish --title $line --md "$line" >> modified_images_old.md < /dev/null
done < modified_files.txt

# Append image report for modified images
echo -e "| Path | Old | New |" >> report.md
echo -e "|---|---|---|" >> report.md
paste modified_files.txt modified_images_old.md modified_images_new.md -d"|" |
awk -F"|" 'function basename(file) {sub(".*/", "", file); return file} {printf("| %s | %s | %s |\n", basename($1), $2, $3)}' >> report.md

echo -e "</details>\n" >> report.md
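The two `awk` filters in this change depend on the shape of the markdown table that `dvc diff --show-md` appends to report.md. A data row such as `| added | path |` splits into five whitespace-separated fields, with the status in `$2` and the path in `$4`. The `NR>=7` condition skips the heading, the intro sentence, and the table header, assuming the table starts on line 5 of the report. Here is a minimal sketch of that filtering on a hand-written sample report. The image paths and the temporary `workdir` are hypothetical and chosen only for illustration.

```shell
# Hypothetical scratch directory and sample paths, for illustration only.
workdir=$(mktemp -d)

# Recreate the layout of report.md: heading, blank line, intro, blank line,
# table header, separator, then one row per changed file.
printf '%s\n' \
  '## Summary of changed images' \
  '' \
  'This is an auto-generated report of images that have changed on the DVC remote' \
  '' \
  '| Status   | Path                |' \
  '|----------|---------------------|' \
  '| added    | baseline/test_a.png |' \
  '| deleted  | baseline/test_b.png |' \
  '| modified | baseline/test_c.png |' > "$workdir/report.md"

# Same filters as in the workflow: keep 5-field rows from line 7 onwards
# whose status column matches, and print only the path column.
awk 'NF==5 && NR>=7 && $2=="added" {print $4}' "$workdir/report.md" > "$workdir/added_files.txt"
awk 'NF==5 && NR>=7 && $2=="modified" {print $4}' "$workdir/report.md" > "$workdir/modified_files.txt"

cat "$workdir/added_files.txt" "$workdir/modified_files.txt"
```

Deleted images are dropped on purpose: there is nothing to publish for them on the new side. Note that a path containing spaces would change the field count and be skipped by the `NF==5` test.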

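The side-by-side table for modified images is built by joining three line-aligned files with `paste` and reshaping each joined line with `awk`. The sketch below reproduces that step in isolation. It assumes each of the old and new files holds exactly one markdown image link per modified path, in the same order as the path list. The links use a placeholder `example.invalid` host and are not real `cml-publish` output.

```shell
# Hypothetical scratch directory and placeholder image links, for illustration only.
workdir=$(mktemp -d)

printf '%s\n' 'baseline/test_c.png' 'baseline/test_d.png' > "$workdir/modified_files.txt"
printf '%s\n' \
  '![c](https://example.invalid/old_c.png)' \
  '![d](https://example.invalid/old_d.png)' > "$workdir/modified_images_old.md"
printf '%s\n' \
  '![c](https://example.invalid/new_c.png)' \
  '![d](https://example.invalid/new_d.png)' > "$workdir/modified_images_new.md"

{
  echo "| Path | Old | New |"
  echo "|---|---|---|"
  # Join line N of each file with "|" as the delimiter, then emit one
  # markdown table row per image, showing only the file name of the path.
  paste -d"|" "$workdir/modified_files.txt" \
              "$workdir/modified_images_old.md" \
              "$workdir/modified_images_new.md" |
    awk -F"|" 'function basename(file) {sub(".*/", "", file); return file}
               {printf("| %s | %s | %s |\n", basename($1), $2, $3)}'
} > "$workdir/table.md"

cat "$workdir/table.md"
```

The join is purely positional, so the three files must have the same number of lines. If either link file ends up shorter than the path list, rows will pair a path with the wrong image or an empty cell.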