Skip to content

Commit

Permalink
Agent Installer Development Environment
Browse files Browse the repository at this point in the history
  • Loading branch information
lranjbar committed Mar 30, 2022
1 parent 53bf422 commit 5a5fa47
Showing 1 changed file with 234 additions and 0 deletions.
234 changes: 234 additions & 0 deletions enhancements/agent-installer/development-environment.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,234 @@
---
title: ephemeral-agent-installer-e2e-tests
authors:
- "@lranjbar"
reviewers:
- "@andfasano"
- "@bfournie"
- "@hardys"
- "@pawanpinjarkar"
- "@rwsu"
- "@zaneb"
approvers:
- "@dhellmann"
- "@celebdor"
api-approvers: # in case of new or modified APIs or API extensions (CRD, aggregated api servers, webhooks, finalizers)
- N/A
creation-date: 2022-02-10
last-updated: 2022-03-29
tracking-link: # link to the tracking ticket (for example: Jira Feature or Epic ticket) that corresponds to this enhancement
- https://issues.redhat.com/browse/AGENT-20
---

# Ephemeral Agent Installer Development Infrastructure

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Operational readiness criteria is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

Create a development environment for the ephemeral agent based installer image in the
[dev-scripts](https://github.com/openshift-metal3/dev-scripts) repository and refactor
components as needed for the new project. In addition we will add base case e2e test flows
integrated into Openshift CI to get the project started.

## Motivation

The new agent based installer project will need a standard development environment for
the team to use. We will need an e2e testing framework for validation in Openshift CI.
In this enhancement we will explain our desire to reuse existing frameworks for the new
project while limiting the scope of modifications to them to the new project's requirements.

In this space we have two existing frameworks that could be used for this new project. These
frameworks are:
* [dev-scripts](https://github.com/openshift-metal3/dev-scripts): Created by the Metal Platform
(IPI) team to set up environments to install OCP both for local development and in Openshift CI.
* [assisted-test-infra](https://github.com/openshift/assisted-test-infra): Created by the Assisted
Installer team to run full e2e test flows.

Looking at existing e2e test flow steps from these two frameworks it was determined that
in dev-scripts test flows the beginning steps are very similar to the development
flows we desire for the agent installer. In contrast the assisted-test-infra test flows
set up extra things in the development environment that is undesirable for the agent installer.

### Goals

* Using this development environment in the new ephemeral agent based installer image project.
* Re-factoring parts of dev-scripts that are needed for this ephemeral agent based installer.
* Convergence of the underlying scripts and environment setup code between Metal Platform (IPI)
team and the new Agent Installer teams.

### Non-Goals

* Re-factoring dev-scripts and assisted-test-infra for reasons not related to the new
ephemeral agent based installer image project.
* Overall convergence between all the installer development teams of scripts and e2e test flows.

## Proposal

Create an development environment for the ephemeral agent based installer image in the
[dev-scripts](https://github.com/openshift-metal3/dev-scripts) repository and refactor
components as needed for the new project.

### User Stories

* As a developer, I need to set up an environment so that I can simulate the installation of a
users first cluster using the ephemeral based agent installer image.

* As a developer, I need to launch the built ephemeral agent based installer image so that I can test
and validate this image during development.

* As a developer, I need to check cluster installation progress of the Openshift cluster being
installed by agent installer and when this installation is complete to validate the Openshift
cluster.

* As a developer, I need to validate the kubeconfig is available so I can use it to access the
installed cluster.

* As a developer, I want to extract logs from agent installer and the Openshift cluster under
install to troubleshoot and debug.

### API Extensions

Not applicable.

### Risks and Mitigations

The direction of this proposal is now to extend and refactor parts of dev-scripts
for the agent-based installer.

** There will be now a third team using dev-scripts for slightly different purposes.
Which means the risk of breaking things in the central framework now is larger and
has a larger "splash zone."

We can mitigate this for the new ephemeral agent installer by adding e2e tests into the
dev-scripts repository that run the agent flows. The purpose of this would be to find
breakages in the tests across the different teams.

** The dev-scripts repository clones the [metal-dev-env](https://github.com/metal3-io/metal3-dev-env)
repository for most of its Ansible roles. Which means there is a risk of changes in this
repository can break things in our repository as well. We have less control over this repository
than others inside Openshift.

This is partially mitigated already because the repository is cloned at a specific
Git SHA. To further this we can add a test in dev-scripts that runs maybe once or twice
a week that clones the HEAD of this repository. We would be alerted for breakages more
frequently and we can mitigate them on a case-by-case basis.


## Design Details

### General Strategy:

* Using the scope of the agent installer project we will identify possible refactors to
existing scripts and new functionality required by this new project.
* New functionality we will write in Ansible, unless it is highly coupled with existing
shell scripts.
* We will create Anisble roles to wrap the existing code making it easier to use within
a pure Ansible solution
* The existing bash scripts will be left as is. Refactoring them will be done piece by
piece as needed.

### Phase 1 (Crawl): Add development scripts into dev-scripts for Agent Installer

1) Add development scripts into dev-scripts for the new agent installer project using
existing conventions.
2) Add Ansible roles and tasks to wrap existing shell scripts to make them easier
to integrate with an Ansible solution.
3) Add base e2e test flow to run the ephemeral agent installer image in Openshift CI.
4) Write documentation for new developers joining the agent installer team.

In this phase it should also become more clear what scripts and code are overlapping
between the two teams.

### Phase 2 (Walk): Define common environment default configurations for Agent Installer

1) Define an Ansible role that will pipe in values to existing shell scripts and manage
the environment variables needed for these scripts.
2) Add in more configuration examples like
[config_example.sh](https://github.com/openshift-metal3/dev-scripts/blob/master/config_example.sh)
for dev-scripts that define common configurations for OCP installs.
3) Use the above examples to define default configurations for VMs (HA, Compact, SNO)
and networks (IPV4, IPV6, dualstack) in Ansible.
4) Identify minor refactors to existing scripts deemed helpful for the two projects and
move them into Ansible.

### Phase 3 (Run): Update existing e2e IPI flows to use the refactored Ansible in dev-scripts

1) Update Agent Installer flows to use the new Ansible roles to configure the flows instead
of using the environment variables directly.
2) Update the e2e IPI flows to use this new Ansible configuration roles as needed.

### Open Questions [optional]

### Test Plan

* Test locally while creating the development environment.
* On-board the base e2e flow into Openshift CI for agent installer.

### Graduation Criteria

Not applicable.

#### Removing a deprecated feature

Not applicable.

### Upgrade / Downgrade Strategy

Not applicable.

### Version Skew Strategy

* Currently the dev-scripts repository is not versioned. This will stay the same.

* However for the e2e test scenarios themselves it makes sense to version these.
Since the ephemeral agent based installer is targeted to be released with OCP it would
be easiest to version the e2e framework the same as OCP x.y. This is the strategy
that openshift-tests takes.

### Operational Aspects of API Extensions

Not applicable.

#### Failure Modes

Not applicable.

#### Support Procedures

Not applicable.

## Implementation History

## Drawbacks

The biggest argument against it is that by using dev-scripts as the base is we don't
inherit the work that was done previously in validating Assisted Installer in
assisted-test-infra.

## Alternatives

The choices for making a new development environment for the agent installer project were:

1) Make an entirely new framework
2) Extend assisted-test-infra for agent-installer
3) Extend dev-scripts for agent-installer (Chosen)

Making a new framework (Option #1) was not chosen due to the amount of progress
that would be lost by starting over. In existing frameworks we have solved a lot
of problems for example: image registry mirroring, proxies, upgrades, etc. Redoing
this work is not trivial. The risks were deemed likely to happen with this approach.

As far as extending assisted-test-infra (Option #2) for agent installer was not pursued.
This is because the flow of the e2e tests in this framework first setup a minikube cluster.
From the perspective of the agent-installer this is not needed. The beginning of the
e2e test flow used in dev-scripts is significantly closer to the flow we desire for the
agent installer.

0 comments on commit 5a5fa47

Please sign in to comment.