forked from abhishek-g-suresh/docs-csm
MTL-1695 Overhaul 1.3 Documentation (Cray-HPE#1554)
MTL-1695 Consolidate the install path and define a new flow that incorporates the automation calls for CSI input files. Ensure cables and SHCDs are checked before deploying NCNs and again right after. Use a new command-line context convention; make the code snippets copy-pasteable. Add README.md symbolic links for rendering index.md on GitHub where index.md exists. Lint extra white space. Lint headers; all headers use hyphens to allow native matching to header references while also matching to anchor refs.
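
The README.md symbolic links mentioned in the commit message could be generated along the following lines. This is a minimal sketch, not the script used in the commit: the function name `link_readmes` is hypothetical, and it assumes a filesystem that supports symbolic links.

```python
from __future__ import annotations

import os
from pathlib import Path


def link_readmes(root: Path) -> list[Path]:
    """Create README.md -> index.md symlinks where index.md exists without a README.md.

    Hypothetical helper for illustration; returns the links it created, sorted.
    """
    created = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "index.md" in filenames and "README.md" not in filenames:
            link = Path(dirpath) / "README.md"
            # A relative target keeps the link valid wherever the repository is cloned.
            link.symlink_to("index.md")
            created.append(link)
    return sorted(created)
```

Running the function a second time creates nothing, because every directory with an index.md then already has a README.md.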
Showing 755 changed files with 10,323 additions and 14,158 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,7 @@ | ||
# docs-csm pull request review team | ||
* @Cray-HPE/docs-csm-reviewers | ||
|
||
* @Cray-HPE/docs-csm-reviewers | ||
|
||
background/ncn_* @Cray-HPE/metal | ||
install/* @Cray-HPE/metal | ||
install/livecd @Cray-HPE/metal | ||
operations/network @Cray-HPE/management-network |
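
To make the ownership rules above easier to inspect, the following minimal sketch parses CODEOWNERS text into (pattern, owners) pairs. The function name `parse_codeowners` is hypothetical. The sketch skips only blank lines and full-line comments, and it does not attempt path matching. GitHub gives precedence to the last matching pattern, so rule order is preserved.

```python
from __future__ import annotations


def parse_codeowners(text: str) -> list[tuple[str, list[str]]]:
    """Return (pattern, owners) pairs in file order (hypothetical helper)."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


# The post-change file content shown in the diff above.
EXAMPLE = """\
# docs-csm pull request review team
* @Cray-HPE/docs-csm-reviewers

background/ncn_* @Cray-HPE/metal
install/* @Cray-HPE/metal
install/livecd @Cray-HPE/metal
operations/network @Cray-HPE/management-network
"""

rules = parse_codeowners(EXAMPLE)
for pattern, owners in rules:
    print(pattern, "->", ", ".join(owners))
```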
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Description | ||
|
||
<!--- Describe what this change is and what it is for. --> | ||
|
||
# Checklist Before Merging | ||
|
||
<!--- An empty checkbox is two brackets with a space in between; a checked checkbox is two brackets with an x in between. --> | ||
<!--- unchecked checkbox: [ ] --> | ||
<!--- checked checkbox: [x] --> | ||
<!--- invalid checkbox: [] --> | ||
|
||
- [ ] If I added any command snippets, the steps they belong to follow the prompt conventions (see [example][1]). | ||
- [ ] If I added a new directory, I also updated `.github/CODEOWNERS` with the corresponding team in [Cray-HPE][2]. | ||
- [ ] My commits or Pull-Request Title contain my JIRA information, or I don't have a JIRA. | ||
|
||
[1]: https://github.com/Cray-HPE/docs-csm/blob/MTL-1695/introduction/documentation_conventions.md#using-prompts | ||
[2]: https://github.com/Cray-HPE/teams |
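
The checkbox rules stated in the template comments (`[ ]` unchecked, `[x]` checked, `[]` invalid) can be checked mechanically. The sketch below is one possible way to do it and is not part of the commit. The name `classify_checkbox` is hypothetical, and treating an uppercase `X` as checked is an assumption.

```python
import re

# A list marker, then brackets holding at most one character, then whitespace or end of line.
CHECKBOX = re.compile(r"^\s*[-*+] \[([^\]]?)\](?=\s|$)")


def classify_checkbox(line: str) -> str:
    """Classify a line as 'unchecked', 'checked', 'invalid', or 'none' (hypothetical helper)."""
    match = CHECKBOX.match(line)
    if match is None:
        return "none"  # not a checkbox list item
    inner = match.group(1)
    if inner == " ":
        return "unchecked"
    if inner in ("x", "X"):  # assumption: uppercase X also counts as checked
        return "checked"
    return "invalid"  # for example "[]" with nothing between the brackets
```

For example, `classify_checkbox("- [] My commits contain my JIRA information")` reports `"invalid"`, the case the template warns about.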
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
1.13.15 | ||
1.14.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,130 @@ | ||
# Cray System Management (CSM) - README | ||
|
||
The documentation included here describes how to install or upgrade the Cray System Management (CSM) | ||
software and related supporting operational procedures. CSM software is the foundation upon which | ||
other software product streams for the HPE Cray EX system depend. | ||
|
||
This documentation is in Markdown format. Although much of it can be viewed with any text editor, | ||
a richer experience will come from using a tool which can render the Markdown to show different font | ||
sizes, the use of bold and italics formatting, inclusion of diagrams and screen shots as image files, | ||
and to follow navigational links within a topic file and to other files. | ||
|
||
There are many tools which can render the Markdown format to get these advantages. Any Internet search | ||
for Markdown tools will provide a long list of these tools. Some of the tools are better than others | ||
at displaying the images and allowing you to follow the navigational links. | ||
|
||
The exploration of the CSM documentation begins with | ||
the [Cray System Management Documentation](index.md) which introduces | ||
topics related to CSM software installation, upgrade, and operational use. Notice that the | ||
previous sentence had a link to the index.md file for the Cray System Management Documentation. | ||
If the link does not work, then a better Markdown viewer is needed. | ||
# Cray System Management Documentation | ||
|
||
## Scope and Audience | ||
|
||
The documentation included here describes the Cray System Management (CSM) software, how to install | ||
or upgrade CSM software, and related supporting operational procedures to manage an HPE Cray EX system. | ||
CSM software is the foundation upon which other software product streams for the HPE Cray EX system depend. | ||
|
||
The CSM installation prepares and deploys a distributed system across a group of management | ||
nodes organized into a Kubernetes cluster which uses Ceph for utility storage. These nodes | ||
perform their function as Kubernetes master nodes, Kubernetes worker nodes, or utility storage | ||
nodes with the Ceph storage. | ||
|
||
System services on these nodes are provided as containerized micro-services packaged for deployment | ||
via Helm charts. Kubernetes orchestrates these services and schedules them on Kubernetes worker | ||
nodes with horizontal scaling. Horizontal scaling increases or decreases the number of service instances as | ||
demand for them varies, such as when booting many compute nodes or application nodes. | ||
|
||
This information is intended for system installers, system administrators, and network administrators | ||
of the system. It assumes some familiarity with standard Linux and open source tools, such as shell | ||
scripts, revision control with git, configuration management with Ansible, YAML, JSON, and TOML file formats, etc. | ||
|
||
## Table of Contents | ||
|
||
1. [Introduction to CSM Installation](introduction/README.md) | ||
|
||
This chapter provides an introduction to using the CSM software to manage the HPE Cray EX system. It | ||
also describes the scenarios for installation and upgrade of CSM software, how product stream updates | ||
for CSM are delivered, the operational activities done after installation for ongoing management | ||
of the HPE Cray EX system, differences between the previous release and this release, and conventions | ||
used in this documentation. | ||
|
||
1. [Bare-Metal Steps](operations/bare_metal/Bare-Metal.md) | ||
|
||
This chapter outlines how to set up default credentials for River BMCs and | ||
ServerTech PDUs, which must be done before the initial installation of | ||
CSM, in order to enable HSM software to interact with River Redfish BMCs | ||
and PDUs. | ||
|
||
1. [Update CSM Product Stream](update_product_stream/README.md) | ||
|
||
This chapter explains how to get the CSM product release, get any patches, update to the latest | ||
documentation, and check for any Field Notices or Hotfixes. | ||
|
||
1. [Install CSM](install/README.md) | ||
|
||
This chapter provides an ordered list of procedures which can be used for CSM software installation or reinstallation, | ||
and indicates when to do operational tasks as part of the installation workflow. Updating software is covered in another chapter. | ||
Installation of the CSM product stream has many steps in multiple procedures which should be done in a | ||
specific order. Information about the HPE Cray EX system and the site is used to prepare the configuration | ||
payload. The initial node used to bootstrap the installation process is called the PIT node because the | ||
Pre-Install Toolkit is installed there. Once the management network switches have been configured, the other | ||
management nodes can be deployed with an operating system and the software to create a Kubernetes cluster | ||
utilizing Ceph storage. The CSM services provide essential software infrastructure including the API gateway | ||
and many micro-services with REST APIs for managing the system. Once administrative access has been configured, | ||
the installation of CSM software and nodes can be validated with health checks before doing operational tasks | ||
like the check and update of firmware on system components or the preparation of compute nodes. | ||
|
||
1. [Upgrade CSM](upgrade/README.md) | ||
|
||
This chapter provides an ordered list of procedures which can be used to update CSM software, and indicates when | ||
to do operational tasks as part of the software upgrade workflow. There are procedures to prepare the | ||
HPE Cray system for the upgrade, and update the management network, the management nodes, and the CSM services. | ||
After the upgrade of CSM software, the CSM health checks are used to validate the system before doing any other | ||
operational tasks like the check and update of firmware on system components. | ||
|
||
1. [CSM Operational Activities](operations/README.md) | ||
|
||
This chapter provides an unordered set of administrative procedures required to operate an HPE Cray EX system with CSM software, grouped into several major areas: | ||
* CSM Product Management | ||
* Artifact Management | ||
* Boot Orchestration | ||
* Compute Rolling Upgrade | ||
* Configuration Management | ||
* Console Management | ||
* Firmware Management | ||
* Hardware State Manager | ||
* Image Management | ||
* Kubernetes | ||
* Network Management | ||
* Node Management | ||
* Package Repository Management | ||
* Power Management | ||
* Resiliency | ||
* River Endpoint Discovery Service | ||
* Security And Authentication | ||
* System Configuration Service | ||
* System Layout Service | ||
* System Management Health | ||
* UAS User And Admin Topics | ||
* Utility Storage | ||
* Validate CSM Health | ||
|
||
1. [CSM Troubleshooting Information](troubleshooting/README.md) | ||
|
||
This chapter provides information about some known issues in the system and tips for troubleshooting Kubernetes. | ||
|
||
1. [CSM Background Information](background/README.md) | ||
|
||
This chapter provides background information about the NCNs (non-compute nodes) which function as | ||
management nodes for the HPE Cray EX system. This information is not normally needed to install | ||
or upgrade software, but provides background which might be helpful for troubleshooting an installation. | ||
|
||
1. [Glossary](glossary.md) | ||
|
||
This chapter provides explanations of terms and acronyms used throughout the rest of this documentation. | ||
|
||
## Copyright and License | ||
|
||
MIT License | ||
|
||
(C) Copyright [2020-2022] Hewlett Packard Enterprise Development LP | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a | ||
copy of this software and associated documentation files (the "Software"), | ||
to deal in the Software without restriction, including without limitation | ||
the rights to use, copy, modify, merge, publish, distribute, sublicense, | ||
and/or sell copies of the Software, and to permit persons to whom the | ||
Software is furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included | ||
in all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL | ||
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR | ||
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, | ||
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR | ||
OTHER DEALINGS IN THE SOFTWARE. |
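
The table of contents in this README links to each chapter by relative path, such as `introduction/README.md`. A small check like the one below can confirm that those targets exist in a checkout. It is a rough sketch and not part of the commit: the function name `broken_links` is hypothetical, and the regular expression handles only simple inline `[text](target)` links.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target), capturing the target.
LINK = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def broken_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    text = markdown_file.read_text(encoding="utf-8")
    missing = []
    for target in LINK.findall(text):
        # Skip external URLs, mail links, and same-page anchors.
        if "://" in target or target.startswith(("#", "mailto:")):
            continue
        path = target.split("#", 1)[0]  # drop any trailing anchor
        if not (markdown_file.parent / path).exists():
            missing.append(target)
    return missing
```

Calling `broken_links(Path("README.md"))` from the repository root would list any chapter link whose file is absent.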