Support non-ascii feedback #1384

fanpu · 2021-09-20T16:30:20Z

Description

See #975. The issue recently resurfaced in one of the courses at CMU.

Motivation and Context

It is confusing for students when no feedback is returned.

How Has This Been Tested?

Test with the following file:

#include <stdio.h>
int main() {
  // non-ascii semicolon
  puts("hello")；
}

This causes the compiler to emit an error message that contains non-ASCII characters. With this PR, Autolab can process that output and return the feedback successfully.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation, which is located at Autolab Docs
I have updated the documentation accordingly, included in this PR

Other issues / help required

If unsure, feel free to submit first and we'll help you along.

xinyis991105

Tested by submitting hello.c with a non-ASCII character. Autograding didn't hang. The output is as expected, indicating the error. Thanks for the quick fix!

cg2v · 2021-09-21T15:08:29Z

Why are both changes needed? It seems that either change alone would allow output that's encoded as UTF-8 to be written to the file. It's not obvious to me what the force_encoding change will do to output that has stray binary characters in it (I've seen this in proxylab logs)

fanpu · 2021-09-21T16:32:53Z

With just the "wb" change, the following exception is still thrown from the write call:

Exception in autograde_done: Encoding::UndefinedConversionError ("\xEF" from ASCII-8BIT to UTF-8)
Completed 204 No Content in 159ms (ActiveRecord: 4.1ms)

You're right, though, that just the force_encoding change would be sufficient.
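The conversion failure can be reproduced outside Autolab. The following is a minimal sketch, assuming the feedback arrives as a string tagged ASCII-8BIT. The variable names are illustrative and not taken from the Autolab code, and the sketch triggers the conversion with String#encode rather than through the file write mentioned above.

```ruby
# UTF-8 bytes of "hello" followed by a fullwidth semicolon (U+FF1B = EF BC 9B),
# tagged as binary the way raw autograder output might be (assumption).
raw = "hello\xEF\xBC\x9B".b

# Transcoding binary data to UTF-8 fails on the first byte >= 0x80.
error = begin
  raw.encode("UTF-8")
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# force_encoding only relabels the bytes; nothing is converted,
# so no exception is raised.
fixed = raw.dup.force_encoding("UTF-8")

puts error.class
puts fixed.valid_encoding?
```

Because force_encoding never inspects the bytes, it also succeeds on input that is not valid UTF-8. Checking valid_encoding? afterwards is what distinguishes the two cases.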

fanpu · 2021-09-21T16:39:44Z

> It's not obvious to me what the force_encoding change will do to output that has stray binary characters in it (I've seen this in proxylab logs)

I'll keep that in mind. But Autolab currently cannot return anything at all in that case, so even if students do see gibberish UTF-8 in their logs (I'm assuming this comes from bugs, or from logging binary content that their proxy is serving), I think that's fine.

cg2v · 2021-09-24T18:31:32Z

I think I'm going to go with this instead:

    feedback.force_encoding("UTF-8")
    if not feedback.valid_encoding?
       feedback.force_encoding("Windows-1252")
       hexify = Proc.new {|c| "\\x" + c.bytes[0].to_s(16) }
       feedback = feedback.encode("UTF-8", invalid: :replace, fallback: hexify)

This will replace nulls and other non-ISO characters with \xNN, but not all 8-bit chars.

The question is do people prefer seeing

\x8C\x8D\x8E\x8F\x90\x91\x92\x93\x94\x95\x96\x97\x98
or
Œ\x8dŽ\x8f\x90‘’“”•–—˜

Different situations might want different results, but we can't really tell from the content.

fanpu · 2021-09-24T18:37:18Z

In most cases I would say the former. I'll submit another PR with your fix, which I agree is better. Thanks so much!

cg2v · 2021-09-24T18:46:21Z

If you want all hex, then change feedback.force_encoding("Windows-1252") to feedback.force_encoding("ASCII-8BIT").

cg2v · 2021-09-24T18:54:38Z

Also note that I had Python on the brain when I wrote that, and it is missing an end.
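For reference, here is a runnable sketch of the snippet proposed above with the missing end restored. The method name sanitize_feedback and the fallback_encoding keyword are hypothetical, introduced only for illustration. The body follows the proposed logic, except that it works on a copy instead of mutating its argument.

```ruby
# Hypothetical helper; name and keyword argument are assumptions for this sketch.
def sanitize_feedback(feedback, fallback_encoding: "Windows-1252")
  # First try to interpret the bytes as UTF-8 (on a copy, to leave the input alone).
  text = feedback.dup.force_encoding("UTF-8")
  return text if text.valid_encoding?

  # Otherwise reinterpret the same bytes in a single-byte encoding.
  text.force_encoding(fallback_encoding)

  # Bytes with no mapping in that encoding are rendered as \xNN (lowercase hex).
  hexify = proc { |c| "\\x" + c.bytes[0].to_s(16) }
  text.encode("UTF-8", invalid: :replace, fallback: hexify)
end

# 0x8C maps to a character in Windows-1252; 0x8D has no mapping there.
puts sanitize_feedback("ok \x8C\x8D".b)
```

Note that once any byte sequence is invalid UTF-8, the whole string is reinterpreted in the fallback encoding, so valid multi-byte UTF-8 characters elsewhere in the same feedback will come out as several single-byte characters each. Passing "ASCII-8BIT" as the fallback is the all-hex variant described in the comment above.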

Support working with non-utf8 feedback

b0ead16

fanpu mentioned this pull request Sep 20, 2021

Do not fail on receiving non-ascii from Tango #975

Closed

xinyis991105 self-requested a review September 20, 2021 23:32

xinyis991105 approved these changes Sep 20, 2021

fanpu merged commit 13bbbdb into master Sep 20, 2021

fanpu deleted the feedback-file-nonascii-fix branch September 20, 2021 23:36

fanpu mentioned this pull request Sep 27, 2021

Better output for non-ascii autograder feedback #1392

Merged

5 tasks

ugogon pushed a commit to ugogon/Autolab that referenced this pull request Oct 13, 2021

Support working with non-utf8 feedback (autolab#1384)

e5e5db7
