unclear semantics of debug_mode in /ad, /pub route #453

Open
jacob-lee opened this issue Oct 29, 2020 · 11 comments

@jacob-lee
Collaborator

The /ad, /pub route includes a confusing bit of code:

mode = request.args['mode']
if hit_id[:5] == "debug":
    debug_mode = True
else:
    debug_mode = False

The confusion here is that request.args['mode'] is set to 'debug' when a task is launched in debug mode (do_debug), so pulling the flag out of the hit_id seems redundant. But the difference has consequences. In test_repeat_experiment_fail, the request to /ad is:

request = "&".join([
            "assignmentId=%s" % self.assignment_id,
            "workerId=%s" % self.worker_id,
            "hitId=%s" % self.hit_id,
            "mode=debug"])

Note that there is no debug prefix in the hitId. This means that the /ad, /pub route sets debug_mode = False despite mode=debug being in the request args. Note also that if you force it to run in debug mode (by prefixing the hitId with debug), the test fails, because it doesn't raise the 1010 experiment error that it expects.

So two questions:

  1. Is there an intended difference between mode=debug and debug_mode=True? Or is this an artifact of development history?
  2. When debugging code locally, should requesting the ad after completing/submitting raise the 1010? Is that the intended behavior?
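
To make the discrepancy concrete, here is a minimal sketch of the two signals involved (the helper function is purely illustrative, not part of the codebase):

def derive_debug_signals(args):
    # Signal 1: what do_debug puts in the query string.
    from_mode_arg = args.get('mode') == 'debug'
    # Signal 2: what the /ad, /pub route actually checks today.
    from_hit_id = args.get('hitId', '').startswith('debug')
    return from_mode_arg, from_hit_id

In test_repeat_experiment_fail the first signal is True and the second is False, which is exactly the mismatch described above.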
@deargle
Collaborator

deargle commented Oct 29, 2020 via email

@jacob-lee
Collaborator Author

Just FYI, I am working on an alternative implementation of the route. It currently fails several tests, but that is because my code uses request.args['mode'] directly to test whether it is in debug mode.

Currently that implementation is as follows:

@app.route('/ad', methods=['GET'])
@app.route('/pub', methods=['GET'])
@nocache
def advertisement():
    if not check_browser(request.user_agent.string):
        app.logger.warning(f'Browser type not allowed: {request}')
        raise ExperimentError('browser_type_not_allowed')

    if not all(k in request.args for k in ['hitId', 'assignmentId', 'mode']):
        app.logger.error(f'Request args to /ad incomplete: {request.args}')
        raise ExperimentError('ad_args_not_set')

    hit_id = request.args['hitId']
    assignment_id = request.args['assignmentId']
    mode = request.args['mode']
    worker_id = request.args['workerId'] if 'workerId' in request.args else None
    allow_repeats = CONFIG.getboolean('Task Parameters', 'allow_repeats')

    # Just looking at the ad
    if assignment_id == 'ASSIGNMENT_ID_NOT_AVAILABLE':
        return render_template('ad.html', pop_sub_url='')

    # In debug mode we always just show the ad
    if mode == 'debug':
        return render_template(
            'ad.html',
            pop_sub_url=f'consent?hitId={hit_id}'
                        f'&assignmentId={assignment_id}'
                        f'&workerId={worker_id}'
                        f'&mode={mode}'
        )

    # Short-circuit attempted repeaters when repeating disallowed
    if not allow_repeats:
        try:
            nrecords = Participant.query. \
                filter(Participant.assignmentid != assignment_id). \
                filter(Participant.workerid == worker_id). \
                count()
        except sqlalchemy.exc.SQLAlchemyError:
            app.logger.error('Error counting number records.', exc_info=True)
            raise ExperimentError('unknown_error')
        if nrecords > 0:
            raise ExperimentError('already_did_exp_hit')

    # Anything past this point satisfies all of the following:
    #   - NOT just-looking
    #   - NOT debug
    #   - (allow-repeats OR (NOT allow-repeats AND num-records == 0))

    try:
        result = Participant.query.\
            filter(Participant.assignmentid == assignment_id). \
            filter(Participant.workerid == worker_id). \
            filter(Participant.hitid == hit_id). \
            one_or_none()
    except (sqlalchemy.exc.SQLAlchemyError,
            sqlalchemy.orm.exc.MultipleResultsFound):
        app.logger.error('Combination of assignment_id, worker_id, '
                         'and hit_id not unique.')
        raise ExperimentError('hit_assign_appears_in_database_more_than_once')

    if result:
        if result.status == STARTED or result.status == QUITEARLY:
            raise ExperimentError('already_started_exp_mturk')
        elif result.status == COMPLETED or result.status == SUBMITTED:
            return render_template(
                'thanks-mturksubmit.html',
                using_sandbox=(mode == "sandbox"),
                hitid=hit_id,
                assignmentid=assignment_id,
                workerid=worker_id
            )
        else:
            app.logger.error(
                f'Assignment {assignment_id} exists for hitid '
                f'{hit_id} and worker {worker_id} but status '
                f'{result.status} is unexpected for an ad request.')
            raise ExperimentError('status_incorrectly_set')

    # Just show the damn ad already
    return render_template(
        'ad.html',
        pop_sub_url=f'consent?hitId={hit_id}'
                    f'&assignmentId={assignment_id}'
                    f'&workerId={worker_id}'
                    f'&mode={mode}'
    )
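
For what it's worth, the debug branch of this draft can be exercised with Flask's test client (app is the Flask application that registers the route above; the query-string values are placeholders):

with app.test_client() as client:
    response = client.get('/ad', query_string={
        'hitId': 'ABC123',            # no debug prefix needed in this draft
        'assignmentId': 'DEF456',
        'workerId': 'GHI789',
        'mode': 'debug',              # the only debug signal this draft checks
    })
    assert response.status_code == 200  # ad.html should be rendered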

@jacob-lee
Collaborator Author

The existing code, and the code above, have race-condition issues that may need to be resolved. I haven't decided how serious a concern they are, or figured out how to prevent them.
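
One possible mitigation, sketched under the assumption that Participant is a plain SQLAlchemy declarative model, is to push the uniqueness guarantee down to the database so that two concurrent requests cannot both slip past the application-level check (the model and constraint names below are illustrative):

from sqlalchemy import Column, Integer, String, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class ParticipantSketch(Base):
    """Illustration only; the real Participant model has more columns."""
    __tablename__ = 'participants_sketch'
    id = Column(Integer, primary_key=True)
    hitid = Column(String(128))
    assignmentid = Column(String(128))
    workerid = Column(String(128))
    # Let the database, not the route, enforce one row per (hit, assignment, worker).
    __table_args__ = (
        UniqueConstraint('hitid', 'assignmentid', 'workerid',
                         name='uq_hit_assignment_worker'),
    )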

@deargle
Collaborator

deargle commented Oct 29, 2020 via email

@deargle
Collaborator

deargle commented Oct 29, 2020 via email

@jacob-lee
Collaborator Author

Sure. It's a particularly critical and gnarly bit of code. I'm taking this one slow.

Part of the problem, though, is that the intended behavior is implicit in the code, so it can be difficult to tell whether existing behavior is a bug or what was desired (e.g. the debug_mode issue).

@deargle
Collaborator

deargle commented Oct 30, 2020 via email

@jacob-lee
Collaborator Author

jacob-lee commented Oct 30, 2020

I am floored.

Ok. So, right now, the /exp route sets the ad server location to /complete. But it used to do this:

    if use_psiturk_ad_server and (mode == 'sandbox' or mode == 'live'):
        # If everything goes ok here relatively safe to assume we can lookup
        # the ad.
        ad_id = get_ad_via_hitid(hit_id)
        if ad_id != "error":
            if mode == "sandbox":
                ad_server_location = 'https://sandbox.ad.psiturk.org/complete/'\
                    + str(ad_id)
            elif mode == "live":
                ad_server_location = 'https://ad.psiturk.org/complete/' +\
                    str(ad_id)
        else:
            raise ExperimentError('hit_not_registered_with_ad_server')
    else:
        ad_server_location = '/complete'

which sounds like exactly what you described.

The revision of the /ad route that I'm working on only checks against mode == debug, so nothing changes there (though I guess I'd prefer that we make this a documented use case explicitly supported in the code).

Do you think that no repeats should be allowed when debugging? I've personally found that to be an annoyance.
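
If repeats should stay allowed while debugging, the draft above already has that effect via the early return for mode == 'debug'; making the policy explicit could look something like this (purely illustrative, reusing the names from the draft):

# Only enforce the no-repeat rule outside of debug mode.
enforce_no_repeats = (not allow_repeats) and mode != 'debug'
if enforce_no_repeats:
    nrecords = Participant.query. \
        filter(Participant.assignmentid != assignment_id). \
        filter(Participant.workerid == worker_id). \
        count()
    if nrecords > 0:
        raise ExperimentError('already_did_exp_hit')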

@jacob-lee
Collaborator Author

mode == custom could direct to a custom completion route.
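
A hypothetical sketch of how that could slot into the /exp branch quoted above; the custom_completion_url config key is invented here purely for illustration:

if mode == 'sandbox':
    ad_server_location = 'https://sandbox.ad.psiturk.org/complete/' + str(ad_id)
elif mode == 'live':
    ad_server_location = 'https://ad.psiturk.org/complete/' + str(ad_id)
elif mode == 'custom':
    # Hypothetical: send completion to a user-configured route.
    ad_server_location = CONFIG.get('Task Parameters', 'custom_completion_url')
else:
    ad_server_location = '/complete'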

@jacob-lee
Collaborator Author

jacob-lee commented Oct 30, 2020

In start_exp there is this bit of code:

        nrecords = 0
        for record in matches:
            other_assignment = False
            if record.assignmentid != assignment_id:
                other_assignment = True
            else:
                nrecords += 1

As written, this is equivalent to:

other_assignment = matches[-1].assignmentid != assignment_id
nrecords = sum([m.assignmentid == assignment_id for m in matches])

I mention this because it seems like other_assignment = False should precede the for-loop.
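
Presumably the intent was something like the following, so that other_assignment reflects any mismatching record rather than only the last one iterated:

other_assignment = False
nrecords = 0
for record in matches:
    if record.assignmentid != assignment_id:
        other_assignment = True
    else:
        nrecords += 1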

@jacob-lee
Collaborator Author

The /exp route has a race condition between checking the number of matching records and inserting a new user into the database. It probably wouldn't happen under normal circumstances, but ...
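
One way to close that window, assuming a database-level unique constraint like the one sketched earlier, is to attempt the insert and treat a constraint violation as the "already exists" case. db_session and the error key below mirror names used elsewhere in the codebase and are only illustrative:

import sqlalchemy

participant = Participant(hitid=hit_id, assignmentid=assignment_id,
                          workerid=worker_id)
db_session.add(participant)
try:
    db_session.commit()
except sqlalchemy.exc.IntegrityError:
    db_session.rollback()
    raise ExperimentError('already_started_exp_mturk')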
