Improve viewFeedback output parsing, error robustness #1913

damianhxy · 2023-06-02T10:01:43Z

Summary

Summary generated by Reviewpad on 02 Jun 23 11:14 UTC

This pull request modifies app/controllers/assessments_controller.rb and app/views/assessments directory. It adds a TODO comment in a method and refactors another method to use rstrip instead of chomp. Moreover, it modifies app/views/assessments/_results_panel.html.erb to handle score values that cannot be parsed and app/views/assessments/_semantic.html.erb to check if @scoreHash is nil before accessing it.

Description

Strip trailing lines before parsing score and feedback on viewFeedback page
Rename valid_json? to valid_json_hash?, add check that we parsed a hash
Cast scores to float, add error handling
Hide Score section for semantic feedback if score JSON could not be parsed

Motivation and Context

Issue 1: If the Autograder output has trailing newlines, semantic feedback options / score JSON are not parsed.
Consequence: Semantic feedback is not triggered when it should be, and results panel does not show all scores (it falls back to showing the current problem's score)

Issue 2: valid_json? does not check that we parsed a hash
Consequence: A scalar like 2 or true would pass the check, and we get an error when we try to perform hash operations on it afterwards.

Issue 3: Insufficient casting / error handling for scores
Consequence: Scores in strings are not opportunistically cast to floats. Scores of non-numeric types prevent the page from loading.

Fixes #1718
Fixes #1877
Incorporates cg2v@f112dba (with a bugfix)

How Has This Been Tested?

Import the following assessment: randomlab_20230602.tar.zip

For each of the following tests, you can submit any file, it simply assigns 6 random scores from 0 to 99.

For each test, the new output code will be provided. Update driver.sh and recompile autograde.tar accordingly (e.g. expand autograde.tar, edit random/driver.sh, run gtar -cvf autograde.tar random), and upload it to the assessment. gtar is gnu-tar, available via brew.

Test 1: Normal Case

Before + After

Test 2: Semantic UI + Trailing new lines

echo "{\"_presentation\": \"semantic\"}"
echo "{\"scores\": {\"Problem 1\": $rand1, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"
echo ""

Before

After

Test 3: Scalars

echo "2"
echo "{\"scores\": {\"Problem 1\": $rand1, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before

After

Test 4: Score JSON with strings

echo "{\"scores\": {\"Problem 1\": \"$rand1\", \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before

After

Test 5: Score JSON with non-numeric

echo "{\"scores\": {\"Problem 1\": true, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before

After

Test 6: Invalid Score JSON

echo "{\"scores\": {\"Problem 1\": invalid, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before + After

Test 7: Score JSON with non-numeric + Semantic UI

echo "{\"_presentation\": \"semantic\"}"
echo "{\"scores\": {\"Problem 1\": true, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before

After

Test 8: Invalid Score JSON + Semantic UI

echo "{\"_presentation\": \"semantic\"}"
echo "{\"scores\": {\"Problem 1\": invalid, \"Problem 2\": $rand2, \"Problem 3\": $rand3, \"Problem 4\": $rand4, \"Problem 5\": $rand5, \"Problem 6\": $rand6}}"

Before

After

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

I have run rubocop for style check. If you haven't, run overcommit --install && overcommit --sign to use pre-commit hook for linting
My change requires a change to the documentation, which is located at Autolab Docs
I have updated the documentation accordingly, included in this PR

Other issues / help required

Currently, the viewFeedback page re-parses the score JSON, which could display the wrong score if the parseAutoresult or modifySubmissionScores hooks were used by the lab author.

In another PR, we ought to make use of the scores directly, rather than parsing them again.

damianhxy · 2023-06-02T11:05:03Z

@cg2v This PR incorporates / addresses two of the patches you apply to CMU prod, I was wondering if you had any thoughts on this?

For your patch on JSON scalars, it seems like the check does not work since hash is not defined. Here, I check the return value of JSON.parse instead.

For your patch on not parsing semantic feedback unless long, it is no longer needed since this PR performs the rstrip beforehand, and no longer calls chomp after indexing into lines.

Also, if there are any other patches you would like to merge upstream, I'd be happy to take a look at them! Some of them look really useful, such as the one preventing the scheduler race condition, workaround for visual run, and filtering out of sensitive variables.

cg2v · 2023-06-02T16:56:24Z

#1463 changed the context in valid_json? since I first wrote that and I didn't notice. I am surprised that it has not caused any exceptions. This change is fine.

The problem with the short or empty output wasn't that nil was being fed to chomp, but that the array offset lines.length-2 could be negative

damianhxy · 2023-06-02T17:02:23Z

The problem with the short or empty output wasn't that nil was being fed to chomp, but that the array offset lines.length-2 could be negative

From my understanding, in ruby, nil is returned for an out of range access, so a negative offset should be fine

20wildmanj

Went through testing scheme, everything works according to guide, LGTM

Sidenote: seems like when true gets fed into autograding output, although the parse failure error happens, the score given for that problem is 1, not 0 (probably b/c true casts to 1?)

damianhxy · 2023-06-14T17:48:02Z

Sidenote: seems like when true gets fed into autograding output, although the parse failure error happens, the score given for that problem is 1, not 0 (probably b/c true casts to 1?)

Discussed offline - basically a side effect of #1914. Since we are re-parsing scores instead of making use of the final problem score, this is due to differences between parseScore in assessments_controller (cast to float using to_f) and saveAutograde in assessment_autograde_core (saved directly to db, which apparently interprets it as 1).

* rstrip feedback in pareScore and parseFeedback * Validate that feedback is a hash * Cast scores to floats, add error messages * Hide scores if score JSON can not be parsed * Handle booleans in score hash * Add TODO * Update comment * Remove unnecessary TypeError * Handle non-numeric score in result sidebar (cherry picked from commit 5e33614)

damianhxy added 6 commits June 2, 2023 16:40

rstrip feedback in pareScore and parseFeedback

1097765

Validate that feedback is a hash

1690470

Cast scores to floats, add error messages

60d4785

Hide scores if score JSON can not be parsed

53ab915

Handle booleans in score hash

1526b70

Add TODO

a1d48ff

reviewpad bot added the small Pull request is small label Jun 2, 2023

damianhxy marked this pull request as ready for review June 2, 2023 10:59

reviewpad bot requested a review from evanyeyeye June 2, 2023 11:00

reviewpad bot added the waiting-for-review label Jun 2, 2023

damianhxy added 3 commits June 2, 2023 19:13

Update comment

28b2522

Remove unnecessary TypeError

17473a0

Handle non-numeric score in result sidebar

0caf31d

damianhxy requested a review from jlge June 5, 2023 17:20

damianhxy changed the title ~~Improve viewFeedback page error handling~~ Improve viewFeedback output parsing, error robustness Jun 8, 2023

20wildmanj approved these changes Jun 14, 2023

View reviewed changes

reviewpad bot removed the waiting-for-review label Jun 14, 2023

damianhxy added this pull request to the merge queue Jun 14, 2023

Merged via the queue into master with commit 5e33614 Jun 14, 2023

damianhxy deleted the semantic-feedback-fixes branch June 14, 2023 17:53

damianhxy removed request for evanyeyeye and jlge June 15, 2023 09:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve viewFeedback output parsing, error robustness #1913

Improve viewFeedback output parsing, error robustness #1913

damianhxy commented Jun 2, 2023 •

edited

Loading

damianhxy commented Jun 2, 2023 •

edited

Loading

cg2v commented Jun 2, 2023

damianhxy commented Jun 2, 2023

20wildmanj left a comment

damianhxy commented Jun 14, 2023

Improve viewFeedback output parsing, error robustness #1913

Improve viewFeedback output parsing, error robustness #1913

Conversation

damianhxy commented Jun 2, 2023 • edited Loading

Summary

Summary generated by Reviewpad on 02 Jun 23 11:14 UTC

Description

Motivation and Context

How Has This Been Tested?

Test 1: Normal Case

Test 2: Semantic UI + Trailing new lines

Test 3: Scalars

Test 4: Score JSON with strings

Test 5: Score JSON with non-numeric

Test 6: Invalid Score JSON

Test 7: Score JSON with non-numeric + Semantic UI

Test 8: Invalid Score JSON + Semantic UI

Types of changes

Checklist:

Other issues / help required

damianhxy commented Jun 2, 2023 • edited Loading

cg2v commented Jun 2, 2023

damianhxy commented Jun 2, 2023

20wildmanj left a comment

Choose a reason for hiding this comment

damianhxy commented Jun 14, 2023

damianhxy commented Jun 2, 2023 •

edited

Loading

damianhxy commented Jun 2, 2023 •

edited

Loading