docs: new announce_outage document
Now referenced both by the "release" and "upgrade" documentation.
praiskup committed Dec 6, 2023
1 parent cf10ec3 commit bb3da7e
Showing 4 changed files with 106 additions and 57 deletions.
59 changes: 6 additions & 53 deletions doc/how_to_release_copr.rst
@@ -93,7 +93,7 @@ Check that .repo files correctly points to ``@copr/copr``. And run on batcave01.
.. note::

If there is a new version of copr-rpmbuild, follow the
:ref:`terminate_os_vms` and :ref:`terminate_resalloc_vms` instructions.
:ref:`terminate_resalloc_vms` instructions.

Make sure expected versions of Copr packages are installed on the dev
instances::
@@ -215,31 +215,8 @@ notes against Copr git repository.
Schedule and announce the outage
................................

.. warning::

Schedule outage even if it has to happen in the next 5 minutes!

Get familiar with the `Fedora Outage SOP <https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/outage/>`_.
In general, please follow these steps:

1. Prepare the infrastructure ticket similar to `this old one <https://pagure.io/fedora-infrastructure/issue/10854>`_.

2. Send an email to the `copr-devel`_ mailing list announcing the upcoming
release. We usually copy-paste the text of the infrastructure ticket created in
the previous step. Don't forget to put a link to the ticket at the end of the
email. See the `example <https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/message/FVVX3Y7IVRTFW3NYVBTWX3AK3BHNRATX/>`_.

3. Send ``op #fedora-buildsys MyIrcNick`` message to ``ChanServ`` on
libera.chat to get the OP rights, and then adjust the channel title so it
starts with message similar to::

Planned outage 2022-08-17 20:00 UTC - https://pagure.io/fedora-infrastructure/issue/10854

4. Create a new "planned" `Fedora Status SOP`_ entry.
5. Create warning banner on Copr homepage::

copr-frontend warning-banner --outage_time "2022-12-31 13:00-16:00 UTC" --ticket 1234

See the dedicated document :ref:`announcing_fedora_copr_outage`, namely the
"planned" outage section.

Release window
--------------
@@ -248,16 +225,10 @@ If all the pre-release preparations were done meticulously and everything
was tested properly, the release window shouldn't take more than ten
minutes. That is, if nothing goes terribly sideways...


Let users know
--------------

1. Change the "planned" `Fedora Status SOP`_ entry into an "ongoing" entry.

2. Announce on ``#fedora-buildsys``: change the title like
``s/Planned outage .../OUTAGE NOW .../`` and send a message like
``WARNING: The scheduled outage just begins!``.

See :ref:`announcing_fedora_copr_outage` again, this time the "ongoing" outage section.

Production infra tags
---------------------
@@ -371,24 +342,8 @@ If schema was modified you should generate new Schema documentation.
Announce the end of the release
...............................

1. Remove the "Outage" note from the ``#fedora-buildsys`` title.

2. Send a message on ``#fedora-buildsys`` that the outage is over!

3. Send an email to the `copr-devel`_ mailing list. If there is an important change,
you can send an email to the Fedora devel mailing list too. Mention the link to the
"Highlights from XXXX-XX-XX release" documentation page.

4. Propose a new "highlights" post for the `Fedora Copr Blog`_,
see `the example
<https://github.com/fedora-copr/fedora-copr.github.io/pull/55/files>`_.

5. Close the Fedora Infra ticket.

6. Change the "ongoing" `Fedora Status SOP`_ entry into a "resolved" one.

7. Remove the warning banner from frontend page using
``copr-frontend warning-banner --remove``
See the dedicated document :ref:`announcing_fedora_copr_outage`, namely the
"resolved" outage section.


Release packages to PyPI
@@ -446,6 +401,4 @@ Fix this document to make it easy for the release nanny of the next release to u

.. _`Copr release directory`: https://releases.pagure.org/copr/copr
.. _`copr-devel`: https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/
.. _`Fedora Status SOP`: https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/status-fedora/
.. _`example stg infra repo`: https://kojipkgs.fedoraproject.org/repos-dist/f36-infra-stg/
.. _`Fedora Copr Blog`: https://fedora-copr.github.io/
19 changes: 15 additions & 4 deletions doc/how_to_upgrade_persistent_instances.rst
@@ -32,6 +32,11 @@ be done post-upgrade.
Avoid conducting the pre-upgrade too far in advance of the actual upgrade.
Ideally, perform this phase a couple of hours or a day before.

Announce the outage
-------------------

See the dedicated document :ref:`announcing_fedora_copr_outage`, namely the
"planned" outage section.

Preparation
-----------
@@ -197,6 +202,14 @@ Outage window
When initiating this section, aim for time efficiency as the services will be
down and inaccessible to users.

Let users know
--------------

See :ref:`announcing_fedora_copr_outage` again, this time the "ongoing" outage section.

Move IPs and Volumes to the New Instances
-----------------------------------------

.. warning::
Prepare to follow the instructions provided during the playbook run. You'll
need to perform manual steps such as DB backups, consistency checks, etc.
@@ -285,10 +298,8 @@ option, click ``Actions``, navigate to ``Instance settings`` and then to
Final steps
-----------

Remember to announce on `fedora devel`_ and `copr devel`_ mailing lists as well
as in the ``#fedora-buildsys`` channel that everything is functional again.

Close the infrastructure ticket to complete the upgrade process.
See the dedicated document :ref:`announcing_fedora_copr_outage`, namely the
"resolved" outage section.

.. _`Fedora Infra OpenStack`: https://fedorainfracloud.org
.. _`OpenStack images dashboard`: https://fedorainfracloud.org/dashboard/project/images/
84 changes: 84 additions & 0 deletions doc/maintenance/announce_outage.rst
@@ -0,0 +1,84 @@
.. _announcing_fedora_copr_outage:

Fedora Copr Outages
===================

This document is primarily intended for planning outages due to future
infrastructure updates. However, in the event of any incidents or accidents
such as networking issues, IBM Cloud problems, Fedora Rawhide repository issues,
or any other matters that affect users, it's advisable to refer to this document
(possibly jumping directly to the "Ongoing outage" section).

.. warning::

Schedule an outage even if it needs to occur within the next 5 minutes!

Please familiarize yourself with the `Fedora Outage SOP`_. But in general,
follow the steps outlined in this document.

Planned outage
--------------

1. Prepare the infrastructure ticket similar to `this old one <https://pagure.io/fedora-infrastructure/issue/10854>`_.

2. Send an email to the `copr-devel`_ mailing list announcing the upcoming
release. We usually copy-paste the text of the infrastructure ticket created in
the previous step. Don't forget to put a link to the ticket at the end of the
email. See the `example <https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/message/FVVX3Y7IVRTFW3NYVBTWX3AK3BHNRATX/>`_.

3. Send ``op #fedora-buildsys MyIrcNick`` message to ``ChanServ`` on
libera.chat to get the OP rights, and then adjust the channel title so it
starts with message similar to::

Planned outage 2022-08-17 20:00 UTC - https://pagure.io/fedora-infrastructure/issue/10854

4. Ditto for the Matrix channel. TODO: we need to get admin access there.

5. Create a new "planned" `Fedora Status SOP`_ entry.

6. Create warning banner on Copr homepage::

copr-frontend warning-banner --outage_time "2022-12-31 13:00-16:00 UTC" --ticket 1234
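
The topic line from step 3 and the banner command from step 6 carry the same
date and ticket, so it can help to generate both from one place. Below is a
minimal Python sketch, not part of Copr itself: the helper names
``channel_topic`` and ``banner_command`` are hypothetical, the start time is
assumed to be given in UTC, and the outage window is assumed not to cross
midnight. Only the ``--outage_time`` and ``--ticket`` options shown above are
used.

```python
import shlex
from datetime import datetime, timedelta

# Ticket URL pattern copied from the example topic in step 3.
TICKET_URL = "https://pagure.io/fedora-infrastructure/issue/{}"


def channel_topic(start, ticket):
    """Build the topic prefix for a planned outage (hypothetical helper)."""
    when = start.strftime("%Y-%m-%d %H:%M")
    return "Planned outage {} UTC - {}".format(when, TICKET_URL.format(ticket))


def banner_command(start, duration, ticket):
    """Build the warning-banner command line (hypothetical helper).

    Assumes `start` is already in UTC and the window ends on the same day.
    """
    end = start + duration
    window = "{} {}-{} UTC".format(
        start.strftime("%Y-%m-%d"),
        start.strftime("%H:%M"),
        end.strftime("%H:%M"),
    )
    # shlex.join quotes the window so the line can be pasted into a shell.
    return shlex.join([
        "copr-frontend", "warning-banner",
        "--outage_time", window,
        "--ticket", str(ticket),
    ])


if __name__ == "__main__":
    start = datetime(2022, 12, 31, 13, 0)
    print(channel_topic(start, 1234))
    print(banner_command(start, timedelta(hours=3), 1234))
```

The sketch only prints the strings; setting the topic and running the command
on the frontend machine remain manual steps.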


Ongoing outage
--------------

When the outage begins to cause real effects:

1. Change the "planned" `Fedora Status SOP`_ entry into an "ongoing" entry.

2. Announce on ``#fedora-buildsys``: change the title like
``s/Planned outage .../OUTAGE NOW .../`` and send a message like
``WARNING: The scheduled outage just begins!``.

3. Announce on Matrix.
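
The title substitution from step 2 can be sketched in Python as follows. This
is only an illustration: ``mark_ongoing`` is a hypothetical helper, and it
assumes the topic starts with the ``Planned outage`` prefix used in the
"Planned outage" section.

```python
import re


def mark_ongoing(topic):
    """Turn a leading 'Planned outage' into 'OUTAGE NOW' (hypothetical helper).

    Topics that do not start with the planned-outage prefix are returned
    unchanged.
    """
    return re.sub(r"^Planned outage", "OUTAGE NOW", topic, count=1)


if __name__ == "__main__":
    planned = (
        "Planned outage 2022-08-17 20:00 UTC - "
        "https://pagure.io/fedora-infrastructure/issue/10854"
    )
    print(mark_ongoing(planned))
```

Anchoring the pattern at the start of the topic keeps the rest of the channel
title untouched.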


Resolved outage
---------------

1. Remove the "Outage" note from the ``#fedora-buildsys`` and Matrix titles.

2. Send a message on ``#fedora-buildsys`` that the outage is over!

3. Send an email to the `copr-devel`_ mailing list. If there is an important change,
you can send an email to the Fedora devel mailing list too. Mention the link to the
"Highlights from XXXX-XX-XX release" documentation page.

4. Propose a new "highlights" post for the `Fedora Copr Blog`_,
see `the example
<https://github.com/fedora-copr/fedora-copr.github.io/pull/55/files>`_.

5. Close the Fedora Infra ticket.

6. Change the "ongoing" `Fedora Status SOP`_ entry into a "resolved" one.

7. Remove the warning banner from frontend page using
``copr-frontend warning-banner --remove``


.. _`copr-devel`: https://lists.fedoraproject.org/archives/list/copr-devel@lists.fedorahosted.org/
.. _`Fedora Outage SOP`: https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/outage/
.. _`Fedora Status SOP`: https://docs.fedoraproject.org/en-US/infra/sysadmin_guide/status-fedora/
.. _`Fedora Copr Blog`: https://fedora-copr.github.io/
1 change: 1 addition & 0 deletions doc/maintenance_documentation.rst
@@ -15,6 +15,7 @@ This section contains information about maintenance topics. You may also be inte
How to manage active chroots <how_to_manage_chroots>
How to rename chroots <how_to_rename_chroot>
Fedora Copr hypervisors <maintenance/hypervisors>
Outage announcements <maintenance/announce_outage>


.. toctree::