CU-8695hghww backwards compatibility workflow #478

Merged (9 commits) on Nov 1, 2024
2 changes: 2 additions & 0 deletions .github/workflows/main.yml
@@ -42,6 +42,8 @@ jobs:
          timeout 25m python -m unittest ${second_half_nl[@]}
      - name: Regression
        run: source tests/resources/regression/run_regression.sh
      - name: Model backwards compatibility
        run: source tests/resources/model_compatibility/check_backwards_compatibility.sh
      - name: Get the latest release version
        id: get_latest_release
        uses: actions/github-script@v6
@@ -0,0 +1,51 @@
# CONSTANTS / shouldn't change
REGRESSION_MODULE="medcat.utils.regression.regression_checker"
REGRESSION_OPTIONS="--strictness STRICTEST --require-fully-correct"

# CHANGEABLES
# target models
DL_LINK="https://cogstack-medcat-example-models.s3.eu-west-2.amazonaws.com/medcat-example-models/all_fake_medcat_models.zip"
ZIP_FILE_NAME="all_fake_medcat_models.zip"
# target regression set
REGRESSION_TEST_SET="tests/resources/regression/testing/test_model_regresssion.yml"
# folder to house models under test
MODEL_FOLDER="fake_models"

# START WORK

echo "Downloading models"
wget "$DL_LINK"
# Create folder if it doesn't exist
if [ ! -d "$MODEL_FOLDER" ]; then
    mkdir "$MODEL_FOLDER"
    CREATED=1
@tomolopolis (Member) commented on Oct 31, 2024:

pretty defensive? would it matter if this dir is removed between runs? Maybe just do:

mkdir -p $MODEL_FOLDER
unzip $ZIP_FILE_NAME -d $MODEL_FOLDER
...
rm -r $MODEL_FOLDER

Reply from the PR author (Collaborator):

The idea was that if the model folder existed before, then it was "meant to be there" and may have things in it that are expected. And I didn't want to remove other stuff that's (potentially) in there at the end.

With that said, you're probably right - there shouldn't really be a reason for that to be the case.
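
To make the trade-off in this thread concrete, here is a minimal sketch that runs both cleanup strategies in a throwaway sandbox. The directory and file names (`work`, `simple`, `guarded`, `other.txt`) and the lowercase `created` variable are hypothetical and not part of the PR; the guarded half only mirrors the shape of the CREATED flag in the script.

```shell
# Throwaway sandbox; every name here is hypothetical.
work=$(mktemp -d)
mkdir "$work/simple" "$work/guarded"
touch "$work/simple/other.txt" "$work/guarded/other.txt"   # pre-existing content

# Simpler approach: create if needed, remove unconditionally at the end.
mkdir -p "$work/simple"          # succeeds even though the folder already exists
rm -r "$work/simple"             # the pre-existing other.txt is deleted as well

# Guarded approach: remember whether this run created the folder.
if [ ! -d "$work/guarded" ]; then
    mkdir "$work/guarded"
    created=1
else
    created=0
fi
if [ "$created" -eq 1 ]; then
    rm -r "$work/guarded"
fi

ls "$work"                       # only "guarded" is left, with other.txt intact
```

On a fresh CI runner the folder should never exist beforehand, so both strategies would likely behave the same there; the difference only shows up when the script is run in a working copy that already contains the folder.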

else
    # mark to NOT remove if folder already existed
    CREATED=0
fi
echo "Uncompressing files"
unzip "$ZIP_FILE_NAME" -d "$MODEL_FOLDER"
echo "Cleaning up the overall zip"
rm "$ZIP_FILE_NAME"
# iterate with a glob rather than parsing the output of ls
for model_path in "$MODEL_FOLDER"/*.zip; do
    if [ -f "$model_path" ]; then
        echo "Processing $model_path"
        python -m $REGRESSION_MODULE \
            "$model_path" \
            $REGRESSION_TEST_SET \
            $REGRESSION_OPTIONS
        # this is a sanity check - needs to run after the regression so that the folder has been created
        grep "MedCAT Version" "${model_path%.*}/model_card.json"
        # clean up here so we don't leave both the .zip'ed model
        # and the unpacked folder behind and fill the disk
        echo "Cleaning up at: ${model_path%.*}"
        rm -rf "${model_path%.*}"*
    else
        echo "No files found matching the pattern: $model_path"
    fi
done

# Remove the folder if it was created by the script
if [ "$CREATED" -eq 1 ]; then
    rm -r "$MODEL_FOLDER"
fi
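
The loop in this script depends on two shell behaviours that are easy to misread: `${model_path%.*}` strips only the shortest trailing `.something`, and a `*.zip` glob that matches nothing is left as a literal string unless `nullglob` is set, which is why the `-f` check matters. A minimal sketch of both, assuming bash with default options; the example path and the `unpacked`, `empty_dir`, `candidate`, and `count` names are hypothetical:

```shell
# Hypothetical model path with dots in the version part of the name.
model_path="fake_models/medcat_model_v1.2.3.zip"

# %.* removes the shortest suffix matching ".*", so only ".zip" is stripped.
unpacked="${model_path%.*}"
echo "$unpacked"          # fake_models/medcat_model_v1.2.3

# A glob with no matches stays a literal pattern, so the loop body runs once
# with a path that does not exist; the -f test filters it out.
empty_dir=$(mktemp -d)
count=0
for candidate in "$empty_dir"/*.zip; do
    if [ -f "$candidate" ]; then
        count=$((count + 1))
    fi
done
echo "$count"             # 0
rmdir "$empty_dir"
```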