This repository has been archived by the owner on May 6, 2024. It is now read-only.

Optionally allow user provisioning failures #3398

Conversation

haikuginger
Contributor

@haikuginger haikuginger commented Sep 28, 2016

When using the user management playbook in the context of an OpenEdX instance with persistent databases stored on another server, we discovered that we may run into errors when managing users if the user's email address has been changed since the initial provisioning. Since this is expected, we'd like to optionally set a variable while provisioning that determines whether we fail at provisioning if user account creation fails.

In other words, if we have a persistent database that has previously been fully provisioned, then a failure of the management command is likely due to an acceptable reason, and we can safely ignore it.

Dependencies: None

Partner information: 3rd party-hosted open edX instance

Testing instructions:

  1. On a devstack, pull this version of configuration.

  2. Run the following command:

    ansible-playbook -i "localhost," manage_edxapp_users_and_groups.yml --extra-vars '{"django_users":[{"username":"staff","email":"fake-staff@example.com"}, {"username":"honor","email":"honor@example.com"}], "django_groups":[], "deployment_settings":"devstack"}' -c local

  3. Observe that the playbook fails (ends with the message: "FATAL: all hosts have already failed -- aborting").

  4. Run the following command:

    ansible-playbook -i "localhost," manage_edxapp_users_and_groups.yml --extra-vars '{"django_users":[{"username":"staff","email":"fake-staff@example.com"}, {"username":"honor","email":"honor@example.com"}], "django_groups":[], "deployment_settings":"devstack", "ignore_user_creation_errors":"yes"}' -c local

  5. Observe that there is an "...ignoring" note under the failed play, and that the final report does not show any failures.
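As an alternative to the inline JSON in steps 2 and 4, the same variables could be kept in a YAML file and passed with `--extra-vars @file`. This is only a sketch: the file name `user_vars.yml` is an assumption for illustration and is not part of this PR.

```yaml
# user_vars.yml (hypothetical file name, not part of this PR)
django_users:
  - username: staff
    email: fake-staff@example.com
  - username: honor
    email: honor@example.com
django_groups: []
deployment_settings: devstack
# Set to "no" (or remove this line) to reproduce the failure from step 3.
ignore_user_creation_errors: yes
```

It would then be run as:

    ansible-playbook -i "localhost," manage_edxapp_users_and_groups.yml --extra-vars @user_vars.yml -c local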

Author notes and concerns:

  1. The previous version of the playbook was statically tied to settings=aws. To make testing this change in a devstack feasible, it was necessary to add a variable to change that to have an arbitrary value (e.g., settings=devstack or settings=openstack).
  2. While it would have been nice to simply set a variable in ignore_errors on a single play, this does not appear to be possible with the current version of Ansible. As a result, it was necessary to create an additional play that fails if the user creation play failed AND ignore_user_creation_errors isn't True.

Reviewers

@openedx-webhooks

Thanks for the pull request, @haikuginger! It looks like you're a member of a company that does contract work for edX. If you're doing this work as part of a paid contract with edX, you should talk to edX about who will review this pull request. If this work is not part of a paid contract with edX, then you should ensure that there is an OSPR issue to track this work in JIRA, so that we don't lose track of your pull request.

To automatically create an OSPR issue for this pull request, just visit this link: https://openedx-webhooks.herokuapp.com/github/process_pr?number=3398&repo=edx%2Fconfiguration

@openedx-webhooks

Thanks for the pull request, @haikuginger! I've created OSPR-1477 to keep track of it in JIRA. JIRA is a place for product owners to prioritize feature reviews by the engineering development teams.

Feel free to add as much of the following information to the ticket:

  • supporting documentation
  • edx-code email threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will still be done via the GitHub pull request interface. As a reminder, our process documentation is here.

If you like, you can add yourself to the AUTHORS file for this repo, though that isn't required. Please see the CONTRIBUTING file for more information.

@openedx-webhooks openedx-webhooks added needs triage open-source-contribution PR author is not from Axim or 2U labels Sep 28, 2016
Contributor

@pomegranited pomegranited left a comment


@haikuginger Your change works as described, but I don't like the duplication of logic required to work around the ignore_errors variable bug in ansible.

Please see my alternative suggestion. I'm open to debate on whether it clarifies and de-risks the behaviour or not.

Also, there's an unnecessary --extra-vars variable defined in your PR description; please remove it:

"ignore_user_creation_failures":true

            {% if item.get('initial_password_hash') %}--initial-password-hash {{ item.initial_password_hash | quote }}{% endif %}
          with_items: django_users
          ignore_errors: yes
          when: ignore_user_creation_errors | bool
Contributor


I really don't like the duplication of logic here; it seems to be asking for trouble down the road.

I found an alternative way of throwing conditional errors. Do you think this way is less confusing?

    - name: Manage users
      shell: >
        {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings|default('aws') }}
        manage_user {{ item.username | quote }} {{ item.email | quote }}
        {% if item.get('groups', []) | length %}--groups {{ item.groups | default([]) | map('quote') | join(' ') }}{% endif %}
        {% if item.get('remove') %}--remove{% endif %}
        {% if item.get('superuser') %}--superuser{% endif %}
        {% if item.get('staff') %}--staff{% endif %}
        {% if item.get('unusable_password') %}--unusable-password{% endif %}
        {% if item.get('initial_password_hash') %}--initial-password-hash {{ item.initial_password_hash | quote }}{% endif %}
      register: manage_users_result
      ignore_errors: True
      with_items: django_users

    - name: "Manage users fails on error unless {{ ignore_user_creation_errors }}"
      fail: msg="manage_user failed for {{ item.item.username }}"
      when:
      - item|failed
      - not ignore_user_creation_errors | bool
      with_items: manage_users_result.results

This would merit a comment explaining that the ignore_errors setting won't take a variable, and which Ansible version fixes the issue.

Contributor Author


I like it. Reimplementing that way.

      tasks:
        - name: Manage groups
          shell: >
    -       {{ python_path }} {{ manage_path }} lms --settings=aws
    +       {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings|default('aws') }}
Contributor

@pomegranited pomegranited Sep 29, 2016


Please:

  • Add deployment_settings to the list of vars above.
  • Reference the existing EDXAPP_SETTINGS variable, falling back to aws as default.
  • Remove the default value from each usage in the tasks list.
    vars:
      python_path: /edx/bin/python.edxapp
      manage_path: /edx/bin/manage.edxapp
      ignore_user_creation_errors: no
      deployment_settings: "{{ EDXAPP_SETTINGS | default('aws') }}"
...
      shell: >
          {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings }}
...

Contributor Author


It's important to note that this is a standalone playbook, rather than a role that's called as part of any standard EdX playbook. So while I'm happy to change to using that variable as opposed to an alternative, I'm not sure it actually increases ease of use on its own, aside from being more "canonical".

        - name: Manage users
          shell: >
    -       {{ python_path }} {{ manage_path }} lms --settings=aws
    +       {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings|default('aws') }}
Contributor


See above comment re deployment_settings, and remove default value.

@openedx-webhooks openedx-webhooks added waiting on author PR author needs to resolve review requests, answer questions, fix tests, etc. and removed needs triage labels Sep 29, 2016
@haikuginger
Contributor Author

@pomegranited, most of those changes have been made. I've confirmed that the ignore_errors variable issue does not occur as of Ansible 2.1.0, but have not investigated further on that front.
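For reference, a minimal sketch of the single-task form that the workaround stands in for. It assumes Ansible 2.1.0 or later, going by the confirmation above, and it abbreviates the `manage_user` arguments, so it is an illustration rather than the playbook's actual task:

```yaml
# Sketch only: assumes an Ansible version (2.1.0+ per this thread) in which
# ignore_errors can be driven by a variable.
- name: Manage users
  shell: >
    {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings }}
    manage_user {{ item.username | quote }} {{ item.email | quote }}
  # Optional flags (--groups, --staff, etc.) omitted for brevity.
  ignore_errors: "{{ ignore_user_creation_errors | bool }}"
  with_items: "{{ django_users }}"
```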

@pomegranited
Contributor

Thanks for making those changes @haikuginger! 👍

One minor thing left with the PR instructions - "notes and concerns" point #2 needs updating with the new workaround.

  • I tested this on my devstack, using the PR instructions.
  • I read through the code.
  • I checked for accessibility issues - accessibility is not affected.
  • Includes documentation - explains the bug in ansible it's working around.

@gsong
Contributor

gsong commented Oct 3, 2016

@haikuginger Did you already make the changes in the PR instructions per @pomegranited? If so, I'll move this one along the review track.

@haikuginger
Contributor Author

@gsong, yes, this should be ready to go.

@openedx-webhooks openedx-webhooks added awaiting prioritization and removed waiting on author PR author needs to resolve review requests, answer questions, fix tests, etc. labels Oct 3, 2016
@gsong
Contributor

gsong commented Oct 18, 2016

@edx/devops Is this something for you to review in one of your sprints?

@maxrothman
Contributor

@fredsmith let's try to get the corresponding OSPR ticket into a sprint.

@gsong
Contributor

gsong commented Oct 27, 2016

@maxrothman @fredsmith Any updates on https://openedx.atlassian.net/browse/OSPR-1477? Let @haikuginger or me know if there's anything that needs to be done.

@gsong
Contributor

gsong commented Nov 15, 2016

@edx/devops bump

@maxrothman
Contributor

Out of curiosity, have you tested a combination of register and failed_when rather than ignore_errors and a separate task?

@haikuginger
Contributor Author

@maxrothman, we discussed that option internally, and came to the conclusion that, because this is running a manage.py command, parsing the message and allowing the failure only for the specific relevant error messages would add unnecessary complexity.

@feanil
Contributor

feanil commented Nov 17, 2016

👍 looks good to me. Go ahead and merge if it looks good to @maxrothman

@maxrothman
Contributor

maxrothman commented Nov 17, 2016

I'm 👍 , but I'm surprised you can't just move the when in the second task into a failed_when on the first.
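A rough sketch of what that suggestion could look like, with the condition folded into `failed_when` on the original task. It is an illustration under assumptions, not the merged diff (which isn't shown in this thread): it abbreviates the `manage_user` arguments and assumes the registered `shell` result exposes `rc` for each item.

```yaml
- name: Manage users
  shell: >
    {{ python_path }} {{ manage_path }} lms --settings={{ deployment_settings }}
    manage_user {{ item.username | quote }} {{ item.email | quote }}
  # Optional flags (--groups, --staff, etc.) omitted for brevity.
  register: manage_users_result
  # Fail only when the command exited non-zero AND errors are not being ignored.
  failed_when: manage_users_result.rc != 0 and not (ignore_user_creation_errors | bool)
  with_items: "{{ django_users }}"
```

One difference from the ignore_errors approach: with `ignore_user_creation_errors` set, a non-zero exit would no longer be reported as a failure at all, rather than as a failure followed by "...ignoring".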

@haikuginger
Contributor Author

@maxrothman, I may have misunderstood your intent. I'll double-check that method before merging to be sure.

@openedx-webhooks openedx-webhooks added waiting on author PR author needs to resolve review requests, answer questions, fix tests, etc. and removed awaiting prioritization labels Nov 18, 2016
@feanil
Contributor

feanil commented Nov 23, 2016

@haikuginger any reason this hasn't merged yet?

@haikuginger
Contributor Author

@feanil, sorry, it's been a busy week, and I haven't gotten around to doing the investigation into @maxrothman's question. I can squash and merge as-is if you'd like.

@feanil
Contributor

feanil commented Nov 23, 2016

@haikuginger up to you. If you plan on making this better, then by all means take your time. I just wanted to make sure it was still active.

@haikuginger haikuginger force-pushed the haikuginger/allow-user-provisioning-failures branch from 0300988 to 19a468b Compare November 28, 2016 22:03
@haikuginger
Contributor Author

@maxrothman, I took a second look, and your suggestion paid off to the tune of several removed lines of code. I've tested this new version, and it works fine; if you and @feanil want to make another pass, that'd be great.

@maxrothman
Contributor

Awesome! 👍

Contributor

@feanil feanil left a comment


Nice!

@haikuginger haikuginger force-pushed the haikuginger/allow-user-provisioning-failures branch from 19a468b to 102c731 Compare November 30, 2016 15:43
@haikuginger
Contributor Author

Squashed from 19a468b.

@haikuginger haikuginger merged commit 7628a31 into openedx-unsupported:master Nov 30, 2016