These acceptance tests are based on the NERC Operational Use Cases.
Reference for NERC OpenStack: https://nerc-project.github.io/nerc-docs/get-started/user-onboarding-on-NERC/
Acceptance testers will require access to the following applications:
- OpenShift admin access (cluster-admins, nerc-org-admins, nerc-ops groups in OpenShift) to access the Observability dashboards and cluster logging.
- ColdFront admin access, because most OpenShift verification steps and some of the ColdFront verification steps (such as deleting a user) require admin access.
- VPN access to XDMoD, to view the reports for OpenShift resources.
- Request a new account
- As a new user, I should be able to create an account for myself in the NERC.
- Criteria
- A prospective user follows the steps documented in https://nerc-project.github.io/nerc-docs/get-started/create-a-user-portal-account/ to create an account on the NERC.
- Acceptance tests:
- Check that an OpenShift User exists, with access to the project allocations:
- Create a new user account following the How to Create a User Account documentation.
- The user must accept the Acceptable Use Notice during sign-up. The notice is shown to all users, and an account cannot be created without accepting it.
- For a given username of `nerc-test-account`, for example, check that the username is listed in the `oc` CLI:
    $ oc get user/nerc-test-account
    NAME                UID                                    FULL NAME   IDENTITIES
    nerc-test-account   db6324f8-e3df-4543-a40c-3fecb91b5204
- For a given username of `nerc-test-account` and a namespace of `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the user's RoleBindings exist:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name               Namespace
      ----  ----               ---------
      User  nerc-test-account
- Verify that the given username has a RoleBinding with a `Role ref` of `edit` for the project's namespaces.
- Try logging into OpenShift as the user to test the Keycloak authentication.
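The RoleBinding check above can also be scripted. The following is a minimal sketch, not part of the NERC tooling: `has_subject` and `check_edit_binding` are hypothetical helper names, and it assumes the `edit` RoleBinding lists users directly under `subjects`. It reads the subject names with `oc get ... -o jsonpath` rather than parsing the `describe` output.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not NERC tooling.

# has_subject NAME
# Reads space-separated subject names on stdin and succeeds only if
# NAME matches one of them exactly.
has_subject() {
  tr ' ' '\n' | grep -Fxq -- "$1"
}

# check_edit_binding NAMESPACE USERNAME
# Fetches the subject names of rolebinding/edit in NAMESPACE and checks
# that USERNAME is among them. Requires a logged-in oc session.
# Note: subject names are compared regardless of kind (User, Group, ...).
check_edit_binding() {
  oc -n "$1" get rolebinding/edit -o jsonpath='{.subjects[*].name}' \
    | has_subject "$2"
}
```

For example, `check_edit_binding 01234567-89ab-cdef-0123-456789abcdef nerc-test-account && echo bound` prints `bound` only when the user is a subject of the binding.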
- Remove a user
- As an administrator, I should be able to remove a user from their projects and allocations.
- Criteria
- A NERC admin can follow the steps documented in What is NERC's ColdFront? to remove a user from a project on the NERC. A NERC admin can also use a private runbook to disable or deactivate a user in Keycloak.
- The user can no longer access their project.
- Acceptance tests:
- A NERC admin can follow the steps documented in What is NERC's ColdFront? to remove a user from a project on the NERC.
- Check that the user no longer has access to the project in OpenShift.
- For a given username of `nerc-test-account`, for example, click here to check the user's RoleBindings.
- Verify that the given username no longer has a RoleBinding with a `Role ref` of `edit` for the project's namespaces.
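When a user belongs to several projects, the removal check can be repeated per namespace. The sketch below is one possible way to do that, under the assumption that the `describe` output has the layout shown elsewhere in this document; `list_subjects` and `still_bound_in` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not NERC tooling.

# list_subjects
# Reads `oc describe rolebinding/...` text on stdin and prints one
# "KIND NAME" pair per row of the Subjects table.
list_subjects() {
  awk '
    /^Subjects:/ { in_subjects = 1; next }        # table starts after this line
    in_subjects && NF >= 2 && $1 != "Kind" && $1 !~ /^-+$/ { print $1, $2 }
  '
}

# still_bound_in USERNAME NAMESPACE...
# Prints every namespace whose rolebinding/edit still lists USERNAME.
# Requires a logged-in oc session. No output means the user is gone.
still_bound_in() {
  local user="$1"
  shift
  local ns
  for ns in "$@"; do
    if oc -n "$ns" describe rolebinding/edit 2>/dev/null \
        | list_subjects | grep -Fxq -- "User $user"; then
      echo "$ns"
    fi
  done
}
```

Running `still_bound_in nerc-test-account 01234567-89ab-cdef-0123-456789abcdef` should print nothing once the user has been removed from the project.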
- Add/Remove PI privilege to a user
- For any user account, the administrator should be able to add or remove PI status associated with that account. A user may be a PI on multiple projects, but a project can have only 1 PI.
- Criteria
- User creates a new account as described in section 1. See the ColdFront documentation here.
- User fills out the PI request form.
- To approve the request, a NERC admin assigns the user the PI role in Keycloak's user management.
- NERC admin approves the request and responds with a ticket reply.
- To remove the PI role, reverse the role assignment step above (remove the PI role from the user in Keycloak) and email the user to inform them of the change.
- Acceptance tests:
- Create the new user account in ColdFront.
- Fill out the PI request form.
- Assign the user the PI role in Keycloak.
- Check that an OpenShift User exists, with access to the project allocations:
- Approve the user's request in ColdFront.
- Because ColdFront users and managers have the same `edit` access on the project, there is no difference in OpenShift roles between a user and a manager.
- For a given username of `nerc-test-account` and a namespace of `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the user's RoleBindings exist:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name               Namespace
      ----  ----               ---------
      User  nerc-test-account
- Verify that the given username has a RoleBinding with a `Role ref` of `edit` for the project's namespaces.
- See the Adding User to Manager Role documentation to also remove the PI role from a user.
- Click on the edit icon next to the user's name on the Project Detail page.
- Then toggle the "Role" from Manager to User.
- Approve the request to remove the PI role from the user in ColdFront.
- Because ColdFront users and managers have the same `edit` access on the project, there is no difference in OpenShift roles between a user and a manager.
- For a given username of `nerc-test-account` and a namespace of `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the user's RoleBindings exist:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name               Namespace
      ----  ----               ---------
      User  nerc-test-account
- Verify that the given username has a RoleBinding with a `Role ref` of `edit` for the project's namespaces.
- Add a new project
- As a user who is a PI, I should be able to create a project.
- Criteria
- User has previously been set as a PI.
- User logs in to ColdFront and requests a project by going to Home → Projects and clicking `Create Project`. See the ColdFront documentation here.
- User requests a resource allocation on the `OpenShift` resource for the project created above.
- Administrator approves the request.
- Upon approval, a project is created in OpenShift for this allocation. If the user does not already exist in OpenShift, the user is created. The new project has the following attributes:
- Project name prefixed by a random 6-character hex string
- Project ID / namespace as a UUID
- Requested quota attributes.
- The user will be able to authenticate using Keycloak and their institutional login.
- Acceptance tests:
- Set up the user as a PI in ColdFront; see `Add/Remove PI privilege to a user` above.
- Log into ColdFront as the user and request a project. See the ColdFront documentation here.
- As the user, request an OpenShift resource allocation in ColdFront.
- The PI must accept the End User License Agreement for each new resource allocation request. (See the image below and the "Placeholder for EULA" text box for OpenStack. It is displayed only to the PI requesting the allocation, at the moment of the request, and not to other users who may be added later.)
- Log into ColdFront as an admin and approve the request.
- Validate the project was created with the requested quota:
- For a given project named `012345myproject`, for example, check that the project is listed in the `oc` CLI:
    $ oc get project/012345myproject
    NAME              DISPLAY NAME   STATUS
    012345myproject                  Active
- For a given namespace named `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the namespace is listed in the `oc` CLI:
    $ oc get namespace/01234567-89ab-cdef-0123-456789abcdef
    NAME                                   STATUS   AGE
    01234567-89ab-cdef-0123-456789abcdef   Active   14m
- You can explore quotas from within the Observability dashboard. For a given project named `012345myproject` and a resource of `limits.cpu` or `limits.memory`, for example, click here to check the `value` for type=hard (max limit) and type=used (current value).
- Check that an OpenShift User exists, with access to the project allocations:
- For a given username of `nerc-test-account`, for example, check that the username is listed in the `oc` CLI:
    $ oc get user/nerc-test-account
    NAME                UID                                    FULL NAME   IDENTITIES
    nerc-test-account   db6324f8-e3df-4543-a40c-3fecb91b5204
- For a given username of `nerc-test-account` and a namespace of `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the user's RoleBindings exist:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name               Namespace
      ----  ----               ---------
      User  nerc-test-account
- Verify that the given username has a RoleBinding with a `Role ref` of `edit` for the project's namespaces.
- Try logging into OpenShift as the user to test the Keycloak authentication.
- Deactivate a project or resource allocation
- As an administrator, I should be able to archive any project or resource allocation and release the resources associated with it back to the pool.
- Criteria
- A ColdFront admin can navigate to the project. See the ColdFront documentation here.
- They can archive a project and expire all associated allocations by navigating to the project and clicking `archive project`.
- They can navigate to an allocation, set the status to `Denied`, and update the allocation.
- Disabling an allocation deletes the associated OpenShift namespace. This differs from OpenStack, where disabling an allocation simply disables the project.
- Acceptance tests:
- Validate that the project allocations have been removed from the project users and managers.
- As an admin in ColdFront, archive the project.
- For a given username of `nerc-test-account` and a namespace of `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the user's RoleBindings have been removed:
- As an admin in ColdFront, set the allocation status to `Denied`, and update the allocation.
- Check that the user is no longer listed as a subject of the RoleBinding:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name  Namespace
      ----  ----  ---------
- Validate the project was deleted, as well as the namespaces:
- As an admin in ColdFront, disable the allocation for the project.
- For a given project named `012345myproject`, for example, check that the project is no longer listed in the `oc` CLI:
    $ oc get project/012345myproject
    NAME   DISPLAY NAME   STATUS
- For a given namespace named `01234567-89ab-cdef-0123-456789abcdef`, for example, check that the namespace is no longer listed in the `oc` CLI:
    $ oc get namespace/01234567-89ab-cdef-0123-456789abcdef
    NAME   STATUS   AGE
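Namespace deletion is asynchronous, so the namespace may still be listed for a short time after the allocation is disabled. The sketch below is one way to poll until it disappears; `wait_until_gone` is a hypothetical helper, and the timeout and interval values in the usage example are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative, not NERC tooling.

# wait_until_gone TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it fails (taken to mean "resource not found") or
# until roughly TIMEOUT_SECONDS have elapsed. Succeeds only if COMMAND failed.
# Caveat: any failure counts as "gone", including an expired oc login.
wait_until_gone() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  while "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}
```

For example, `wait_until_gone 120 5 oc get namespace/01234567-89ab-cdef-0123-456789abcdef` returns success once the namespace can no longer be fetched, and failure if it is still present after about two minutes.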
- Manage a project as a PI.
- As a PI, I should be able to manage and share my project with others on the team, but no one except the administrator should be able to remove the project.
- Criteria
- A PI can add Keycloak users to a ColdFront project under the `users` section in the given project (https://nerc-project.github.io/nerc-docs/get-started/get-an-allocation/#adding-and-removing-user-from-the-project).
- From here, a PI can set the user to a particular role.
- The `manager` role has an `edit` role to the project, and is the one that lets users create and remove allocations by delegating PI role/responsibilities in ColdFront.
- The `user` role also has an `edit` role to the project, but cannot create and remove allocations.
- Acceptance tests:
- Because ColdFront gives the same `edit` role to a Manager and a User, you can expect all users and PIs in a project to share the same role. For a given namespace named `01234567-89ab-cdef-0123-456789abcdef`, a given user named `nerc-test-account`, and the given role `edit`, check that the project contains a RoleBinding with a `Role ref` of `edit`, a `Subject kind` of `User`, and a `Subject name` of `nerc-test-account` in the `oc` CLI:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
- After making any changes to user roles, check that the project contains a RoleBinding with a `Role ref` of `edit`, a `Subject kind` of `User`, and a `Subject name` of `nerc-test-account` for all Users and PIs in the `oc` CLI:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
- Set and modify quotas for projects
- As an administrator of the cluster, I should be able to set and modify compute, storage and object counts quotas for any project.
- Criteria
- To modify attributes, an allocation change request can be submitted by navigating to the active allocation. See the ColdFront documentation here.
- From here, an admin can approve the request and a call to the acct-mgt service will be made.
- For setting attributes, adding a new allocation attribute triggers a call to the acct-mgt service endpoint `/projects/{project_id}/quota`.
- Acceptance tests:
- As a ColdFront admin, make a request to change an allocation's attributes.
- As a ColdFront admin, approve the request.
- You can explore quotas from within the Observability dashboard. For a given project named `012345myproject` and a resource of `limits.cpu` or `limits.memory`, for example, click here to check that the `value` has been updated for type=hard (max limit) and type=used (current value).
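As a command-line complement to the dashboard check above, the hard CPU limit can be read with `oc` and compared against the value requested in ColdFront. This is a sketch under assumptions: it assumes the quota is materialized as a ResourceQuota object in the project namespace (this document does not say how acct-mgt stores it), and `cpu_to_millicores` and `hard_cpu_limits` are hypothetical helper names. CPU quantities may be reported as `2` or `2000m`, so the sketch normalizes them to millicores before comparing.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not NERC tooling.

# cpu_to_millicores QUANTITY
# Converts a Kubernetes CPU quantity such as "2", "0.5", or "500m"
# to an integer number of millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v v="$1" 'BEGIN { printf "%d\n", v * 1000 + 0.5 }' ;;
  esac
}

# hard_cpu_limits NAMESPACE
# Prints the hard limits.cpu value of each ResourceQuota in NAMESPACE,
# one per line. Requires a logged-in oc session.
hard_cpu_limits() {
  oc -n "$1" get resourcequota \
    -o jsonpath='{range .items[*]}{.status.hard.limits\.cpu}{"\n"}{end}'
}
```

A comparison could then look like `[ "$(cpu_to_millicores "$(hard_cpu_limits "$ns" | head -n 1)")" -eq 2000 ]`, where `2000` stands in for the value requested in ColdFront.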
- View and Manage Role bindings
- As an administrator of the cluster, I should be able to create, view and manage role bindings for the users in the cluster.
- Criteria
- After a user is added, an admin can go to the user `actions` tab and set their role to `manager` or `user`.
- Acceptance tests:
- For a given username of `nerc-test-account`, a namespace of `01234567-89ab-cdef-0123-456789abcdef`, and a role of `edit`, for example, check that the user's RoleBindings exist:
- As a ColdFront admin, set the user's role to `manager`.
- Check the user's role bindings:
    $ oc -n 01234567-89ab-cdef-0123-456789abcdef describe rolebinding/edit
    Name:         edit
    Labels:       <none>
    Annotations:  <none>
    Role:
      Kind:  ClusterRole
      Name:  edit
    Subjects:
      Kind  Name               Namespace
      ----  ----               ---------
      User  nerc-test-account
- Because ColdFront users and managers have the same `edit` access on the project, there is no difference in OpenShift roles between a user and a manager.
- Online Documentation
- As an administrator or a user, I should be able to perform any routine operations by referring to online documentation which lists the steps that need to be taken to complete an operation.
- Criteria
- Administrator/user accesses documentation at:
- Administrator accesses XDMoD documentation at:
- Administrator accesses ColdFront documentation at:
- Acceptance tests:
- Verify the administrator/user has access to the main documentation.
- Verify the administrator has access to the XDMoD documentation.
- Verify the administrator has access to the ColdFront documentation.
- Add and track new hardware
- As an administrator of the cluster, I should be able to add nodes to the cluster. I should also be able to view all the nodes and their status.
- Criteria
- Netbox
- Acceptance tests:
- Add new nodes to the cluster.
- Click here to view the ACM Observability Grafana dashboards. These dashboards provide insights into Control Plane Health, Optimization, Capacity, Utilization and more. You can change the timespan in the top right to show results in terms of minutes, hours, days, months or years.
- Track faulty hardware
- As an administrator of the cluster, I should be able to view and track the list of faulty nodes that need to be replaced.
- Criteria
- Nagios or refer to notes in Netbox
- Acceptance tests:
- Track faulty nodes that need to be replaced.
- Click here to view the ACM Observability Grafana dashboards. These dashboards provide insights into Control Plane Health, Optimization, Capacity, Utilization and more. You can change the timespan in the top right to show results in terms of minutes, hours, days, months or years.
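Alongside Nagios, Netbox, and the dashboards, a quick list of nodes that are not fully ready can be pulled from the `oc` CLI. This is only a sketch: `not_ready_nodes` is a hypothetical helper, and a node that is not `Ready` is not necessarily faulty hardware (it may simply be cordoned for maintenance).

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative, not NERC tooling.

# not_ready_nodes
# Reads `oc get nodes --no-headers` output on stdin and prints
# "NAME STATUS" for every node whose STATUS is not exactly "Ready"
# (for example NotReady or Ready,SchedulingDisabled).
not_ready_nodes() {
  awk 'NF >= 2 && $2 != "Ready" { print $1, $2 }'
}
```

With a logged-in session, `oc get nodes --no-headers | not_ready_nodes` prints nothing when every node reports `Ready`.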
- Establish OpenShift cluster upgrade process
- As an administrator of the cluster, I should be able to follow a set of documented instructions to help me in upgrading to newer versions of OpenShift. The rule book should also establish the process to mitigate any issues that might arise during the upgrade.
- Criteria
- See the official OpenShift updating clusters documentation for the version of OpenShift to which you wish to upgrade.
- Follow the instructions to update the cluster.
- Acceptance tests:
- Verify the administrator has access to the OpenShift updating clusters documentation.
- You can explore cluster versions and upgrades from within the Observability dashboard. For a given cluster named `nerc-ocp-prod` upgraded from version `4.10.13` to version `4.10.15`, for example, click here to check the `from_version` of the `cluster` type record and the `version` of the `completed` type record to ensure the versions are what you expected.
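The same version check can be made directly against the cluster's ClusterVersion resource. The sketch below assumes a logged-in `oc` session on the upgraded cluster; `cluster_update_history` and `latest_completed_version` are hypothetical helper names. The update history is listed newest first, and each entry has a state of `Completed` or `Partial`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not NERC tooling.

# cluster_update_history
# Prints the ClusterVersion update history as "VERSION<TAB>STATE" lines,
# newest first. Requires a logged-in oc session.
cluster_update_history() {
  oc get clusterversion version \
    -o jsonpath='{range .status.history[*]}{.version}{"\t"}{.state}{"\n"}{end}'
}

# latest_completed_version
# Reads "VERSION<TAB>STATE" lines on stdin and prints the first VERSION
# whose STATE is "Completed".
latest_completed_version() {
  awk -F '\t' '$2 == "Completed" { print $1; exit }'
}
```

After the example upgrade described above, `cluster_update_history | latest_completed_version` would be expected to print `4.10.15`.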
- Generate and share operations alerts
- As an administrator of the cluster, I should be able to generate and share operation alerts in OpenShift using the monitoring tools available in the NERC cluster.
- Criteria
- Adding alerts to Slack involves creating a pull request with the alert name, summary, description, expression, for, and labels in the `thanos-ruler-custom-rules` ConfigMap. It also requires adding the corresponding alert names to the matchers in the `open-cluster-management-observability-alertmanager-config` ExternalSecret. Click here to see an example pull request of adding custom alerts to Slack.
- Acceptance tests:
- Create a pull request that adds custom alerts to Slack, similar to this PR.
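To illustrate how the fields listed in the criteria fit together, the sketch below prints a rule group in the standard Prometheus alerting-rule format. It is an assumption-laden example, not the content of the NERC pull request: `render_alert_rule` is a hypothetical helper, the group name `custom-alerts` and the `severity` label are placeholders, and how the rule group is embedded in the `thanos-ruler-custom-rules` ConfigMap should be taken from the example PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative, not NERC tooling.

# render_alert_rule NAME EXPR FOR SEVERITY SUMMARY DESCRIPTION
# Prints a Prometheus-style rule group containing a single alert.
# Block scalars (|-) are used so that quotes and braces in the
# arguments do not need YAML escaping; arguments must be single-line.
render_alert_rule() {
  cat <<EOF
groups:
  - name: custom-alerts
    rules:
      - alert: $1
        expr: |-
          $2
        for: $3
        labels:
          severity: $4
        annotations:
          summary: |-
            $5
          description: |-
            $6
EOF
}
```

For example, `render_alert_rule ExampleAlert 'up == 0' 5m warning 'Target down' 'A scrape target has been down for 5 minutes.'` prints a rule whose alert name would then also be added to the Alertmanager matchers.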
- Logging
- As an administrator of the cluster, I should be able to track all the events in the cluster using the logging system in OpenShift.
- Criteria
- Click here to visit the Multi Cluster Logging.
- You can easily filter by recent date, or date range in the past.
- You can easily filter by content, namespaces, pods, and containers.
- You can also filter by log levels: critical, error, warning, info, debug, trace, unknown.
- Click "Show Query" to add more advanced filters like cluster ID:
- Here are the logs for the infra cluster. You can also append the following to your log query to filter on infra cluster logs:
    | openshift_cluster_id="b3c6e302-f119-4adb-bc48-e04c6aa2eaa5"
- Here are the logs for the prod cluster. You can also append the following to your log query to filter on prod cluster logs:
    | openshift_cluster_id="fcb727d6-3e61-4d23-913d-756cf41c7982"
- NERC Admins have access to application logs.
- Infrastructure and audit logs have always been reserved to cluster admins in OpenShift Logging (even on the old stack with Elasticsearch). LokiStack is best configured for admin access via a group (currently we support three dedicated names: cluster-admin, dedicated-admin, and the standard group for kubeadmin). These groups require a ClusterRoleBinding to the cluster-admin ClusterRole.
- Acceptance tests:
- Explore the logs as described and ensure you are finding the logs you are looking for.
- Add any dashboards and alerts you wish to test.
- Log archiving and rollover could run the Ceph Storage out of space. Check on log storage space consumed vs. available using these OpenShift metrics:
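The cluster ID filters listed in the criteria can be appended to any log query. The sketch below shows the composition only; `with_cluster_filter` is a hypothetical helper, and the base query `{log_type="application"} | json` is an assumed starting point rather than one taken from this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name is illustrative, not NERC tooling.

# with_cluster_filter QUERY CLUSTER_ID
# Appends an openshift_cluster_id label filter to a log query.
with_cluster_filter() {
  printf '%s | openshift_cluster_id="%s"\n' "$1" "$2"
}

# Example: restrict an assumed application-log query to the infra cluster.
with_cluster_filter '{log_type="application"} | json' \
  'b3c6e302-f119-4adb-bc48-e04c6aa2eaa5'
```

The resulting string can be pasted into the "Show Query" box described above.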
- Monitoring and logging for the infrastructure hardware and software that is not OpenShift (for example Grafana)
- As a NERC administrator, I should be able to monitor the status of any infrastructure software or hardware that supports operations for the NERC OpenShift environment, even if it is not itself part of OpenShift.
- Criteria
- You can access many metrics for pods of a particular application like grafana or loki.
- Acceptance tests:
- See the available logs and metrics:
- Available logs:
- Available metrics:
- There are metrics that are available in OpenShift that may not be available in Observability. Here is an example of how to query the OpenShift Data Foundation Ceph Storage Percent Used metric.
- Track/report usage of the cluster
- As an administrator of the cluster, I should be able to view daily, weekly and monthly reports of the cluster infrastructure utilization.
- Criteria
- Administrator logs into the associated XDMoD instance and views reports.
- Acceptance tests:
- As an administrator, check that the XDMoD utilization of OpenShift resources matches the cpu and memory reported in ACM Observability:
- As an admin in XDMoD, view the reports for OpenShift resources.
- Click here to view the ACM Observability Grafana dashboards. These dashboards provide insights into Control Plane Health, Optimization, Capacity, Utilization and more. You can change the timespan in the top right to show results in terms of minutes, hours, days, months or years.
- Track/report usage of the project
- As a user and the owner of a project, I should be able to view daily, weekly and monthly reports of the infrastructure utilization by the projects I own.
- Criteria
- User logs into the associated XDMoD instance and views reports for projects they own.
- User cannot view reports for projects they do not own. We still need to look into restricting the view to only the projects they own.
- Acceptance tests:
- As an administrator, check that the XDMoD utilization of project cpu and memory matches the cpu and memory reported for projects in ACM Observability:
- As an admin in XDMoD, view the reports for OpenShift projects.
- Click here to show the top 5 projects by CPU usage at each point in time.
- Not applicable at this time.
- Access operational logs for at least 30 days
- As an administrator of the cluster, I should be able to access operational logs, error messages, alerts, and other relevant data used to investigate and resolve operational issues when they are created. I should also be able to access this information in place in the operations environment for at least 30 days following its creation.
- Criteria
- The administrator can access logs, error messages, alerts, and other relevant data used to investigate and resolve operational issues in the logging Grafana and Observability Grafana instances for at least 30 days.
- Acceptance tests:
- See the `Monitoring` and `Reporting` sections above for information about the logs, alerts, and dashboards.
- We are not yet able to configure retention of Loki logs to 30 days, because the `RetentionStreamSpec` feature is not yet released in the latest Loki Operator. We plan to enable this retention feature when it becomes available. Here is the issue where we are tracking log retention.
- Access audit logs for at least 90 days
- As an administrator of the cluster, I should be able to access operational information that is specifically useful for security audits and investigations (e.g. records of privilege escalations, certificate changes, etc.) when it is created and for 90 days thereafter.
- Criteria
- Access operational information that is specifically useful for security audits and investigations when it is created and for 90 days thereafter.
- Acceptance tests:
- See the `Monitoring` and `Reporting` sections above for information about the logs, alerts, and dashboards.
- We are not yet able to configure retention of Loki logs to 30 days, because the `RetentionStreamSpec` feature is not yet released in the latest Loki Operator. We plan to enable this retention feature when it becomes available. Here is the issue where we are tracking log retention.
- Operational data should be archived and stored securely monthly
- All operational data should be archived and stored securely outside the operations environment monthly. This operations data will eventually be provided to researchers after appropriate procedures have been established for protecting any sensitive data and controlling researcher access to the data. Current operations use cases do not call for deleting any archived data. (Defining procedures for allowing researchers access to the archived data is outside the scope of this document.)
- Criteria
- All operational data should be archived and stored securely outside the operations environment monthly.
- Acceptance tests:
- The NERC team will be meeting to discuss the approach and acceptance tests for this use case.
- Operations data is archived and then removed
- Once operations data is archived, it can be removed from the operations environment.
- Criteria
- Once operations data is archived, it can be removed from the operations environment.
- Acceptance tests:
- The NERC team will be meeting to discuss the approach and acceptance tests for this use case.