Merge pull request #62 from hpc-unibe-ch/61-streamline-documentation-prior-to-ubelix9

Streamlined documentation (outdated/obsolete)
grvlbit authored Jan 12, 2024
2 parents 0b4697b + 5c78e21 commit b13a282
Showing 26 changed files with 191 additions and 870 deletions.
33 changes: 0 additions & 33 deletions docs/file-system/quota.md
@@ -59,36 +59,3 @@ Furthermore, the SCRATCH quota is presented starting with `SCR_`, where `SCR_usr
- HOME and SCRATCH: values presented are actual values directly gathered from the file system

Note: the coloring of the relative values is green (<70%), yellow (70% < x < 90%), red (>90%).

## Advanced quota method

The following `mmlsquota` commands present actual values from the file system.
For `$HOME`:

```Bash
$ mmlsquota -u $USER rs_gpfs:svc_homefs
Block Limits | File Limits
Filesystem Fileset type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
rs_gpfs svc_homefs USR 444181792 1073741824 1073741824 6072144 none | 815985 1000000 1000000 2462 none
```

The `--block-size` option specifies the unit {K, M, G, T} in which the number of blocks is displayed:

```Bash
mmlsquota --block-size=G -j workspace1 rs_gpfs
Block Limits | File Limits
Filesystem type GB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
rs_gpfs FILESET 57 10240 11264 0 none | 5 10000000 11000000 0 none
```

The output shows the quotas for a workspace called `workspace1`. The quotas are set to a soft limit of 10240 GB and a hard limit of 11264 GB. 57 GB is currently allocated to the workspace. An in_doubt value greater than zero means that the quota system has not yet been updated as to whether the space that is in doubt is still available or not.

If the user/workspace exceeds the soft limit, the grace period is set to one week. If usage is not reduced below the soft limit during that time, the quota system interprets the soft limit as the hard limit and no further allocation is allowed. The user(s) can reset this condition by reducing usage enough to fall below the soft limit. The maximum amount of disk space the workspace/user can accumulate during the grace period is defined by the hard limit. The same information is also displayed for the file limits (number of files).
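
As an illustration, the block usage and soft limit can be pulled out of this output with a short pipeline. This is a minimal sketch that assumes the two header lines and column order shown above; `workspace1` is only an example fileset name:

```Bash
# Print current block usage and soft limit (in GB) for an example workspace fileset
mmlsquota --block-size=G -j workspace1 rs_gpfs \
  | awk 'NR > 2 && NF { print "used:", $3, "GB  soft limit:", $4, "GB" }'
```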

## Request Higher Quota Limits

!!! types info "Clean up before asking"
Make sure to clean up your directories before requesting additional storage space.

There will be no quota increase for HOME directories. Additional storage for workspaces can be requested by the workspace owner or the deputy, see [Workspace Management](../hpc-workspaces/management.md#additional-storage).

File renamed without changes.
26 changes: 3 additions & 23 deletions docs/general/costs_investments.md
@@ -17,39 +17,19 @@ used to buy additional resources for the cluster/storage will be purchased with
the additional budget.

!!! tip "Get in touch with us!"
If you are interested in any of the below investment opportunities, get in
If you are interested in our investment opportunities, get in
touch with us by starting a service request at the [Service
Portal](https://serviceportal.unibe.ch/sp).

## CPUs

For CPUs we do not have formal investment opportunities as of yet. We are
working on a business model that allows fair investment in this area. Please
get in touch with us if you are interested in getting higher/broader privileges
on CPU partitions.

## GPUs

We provide well over 100 GPUs to our users. In contrast to CPUs, we work with
preemption on the GPU partition. For your chosen investment you gain the
privilege to preempt other users on the number of GPUs you invested in.
That means, whenever there are no free GPUs and you start your job, other
users' jobs are terminated so that yours can start almost immediately.

As the number of GPUs is limited, there may be none available; in that case,
new cards will be ordered for you as soon as possible. Nevertheless, please
plan ahead as much as possible, as GPU availability is scarce and it may take
months until we can get new cards.

## Disk Storage
## Disk Storage Costs

### Workspaces

Every **research group** has a **10TB** free-of-charge quota. This can be used
within one or more Workspaces. The amount used per Workspace is set at
application time and can be changed later within this limit.

Additional storage can be purchased for CHF 50 per TB and year. On the
Additional storage can be purchased. On the
application or modification form, an upper quota limit can be set.
Only the actual usage is accounted for. Therefore, the actual usage is monitored
twice a day. The average value of all data points is used for accounting and
86 changes: 21 additions & 65 deletions docs/general/faq.md
@@ -1,7 +1,5 @@
# FAQ

## Description

This page provides a collection of frequently asked questions.

## File system
@@ -11,10 +9,9 @@ If you reached your quota, you will get a strange warning about not being able to

1. Decluttering: Check for unnecessary data. This could be:

- unused application packages, e.g. Python(2) packages in `$HOME/.local/lib/python*/site-packages/*`
- temporary computational data, like already post processed output files
- duplicated data
- ...
- unused application packages, e.g. Python(2) packages in `$HOME/.local/lib/python*/site-packages/*`
- temporary computational data, like already post processed output files
- duplicated data

2. Pack and archive: The HPC storage is a high-performance parallel storage and not meant to be an archive. Data not used in the short to mid-term should be packed and moved to an archive storage, as sketched below.
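
For example, a minimal sketch for spotting large directories and packing a finished project before moving it off the HPC storage; the paths and archive name are placeholders:

```Bash
# Show the largest directories directly below $HOME, sorted by size
du -h --max-depth=1 "$HOME" | sort -h | tail -n 20

# Pack a finished project into a compressed archive before moving it
# to an archive system (project_2023 is a placeholder)
tar -czf project_2023.tar.gz project_2023/
```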

@@ -31,64 +28,7 @@ HPC Workspaces are managed by the group manager/leader and if applicable a deput
### I need to share data with my colleagues. What can I do?
HPC Workspaces are meant to host shared data. See [HPC Workspaces](../hpc-workspaces/workspaces.md)

<!-- ## Where should I put my data?
A coarse classification may be:
| data type | suggested target |
| :--- | :--- |
| private configuration data, e.g. SSH keys | HOME |
| temporary (weeks to month) application input/output data | SCRATCH |
| persistent application input/results, meant to be shared (some-when) | Workspace |
| applications, meant to be shared (some-when) | Workspace | -->

### Where can I get a Workspace?
A research group manager needs to **create** the Workspace, since extensions may incur charges.

If you want to **join an existing** Workspace, ask the Workspace manager or its deputy to add you.
See [HPC Workspaces](../hpc-workspaces/workspaces.md)

### How much does a Workspace cost?
Workspaces themselves are free of charge. Every research group has 10TB of disk space free of charge, which can be used in multiple Workspaces.
If necessary, additional storage can be purchased per Workspace, where only the actual usage will be charged, see [Workspace Management](../hpc-workspaces/management.md#additional-storage).


### What if our 10TB free of charge research group quota is full?
Your research group manager or a registered deputy can apply for additional quota. The actual quota used will be charged.

### Why can I not submit jobs anymore?
After joining an HPC Workspace, the private SLURM account gets deactivated and a Workspace account needs to be specified.
This can be done by loading the Workspace module, see [Workspace environment](../hpc-workspaces/environment.md):

```Bash
module load Workspace
```

Otherwise Slurm will present the following error message:
```Bash
sbatch: error: AssocGrpSubmitJobsLimit
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
```
With this method we aim to distribute our resources in a fairer manner. HPC resources, including compute power, should be distributed among registered research groups. We can only relate users to research groups by utilizing Workspace information.

## Software issues
### Why is my private conda installation broken after migration?
Unfortunately, Anaconda hard-codes absolute paths into almost all files (including scripts and binary files).
A proper migration process may have included `conda pack`.
There is a way to access your old environments and create new ones with the same specification:
```Bash
export CONDA_ENVS_PATH=${HOME}/anaconda3/envs ## or where you had your old envs
module load Anaconda3
eval "$(conda shell.bash hook)"
conda info --envs
conda activate oldEnvName ## choose your old environment name
conda list --explicit > spec-list.txt
unset CONDA_ENVS_PATH
conda create --name myEnvName --file spec-list.txt # select a name
```
Please also note that there is a system-wide Anaconda installation, so there is no need for your own separate one.
Finally, after recreating your environments, please delete all old Anaconda installations and environments. These are not only large but also consist of a huge number of files.

### Why is the system complaining about not finding an existing module?

@@ -122,8 +62,6 @@ When loading `foss/2021a`, the `zlib/.1.2.11-GCCcore-10.3.0` should get loaded,
Please take this as an indication that you accidentally mixed different toolchains; rethink your procedure and stay within the same toolchain and toolchain version.

## Environment issues
### I am using zsh, but some commands and tools fail, what can I do?
There are known caveats with Lmod (the module system) and Bash scripts in zsh environments: Bash scripts do not source any system or user files by default. To initialize the (module) environment properly, you need to set `export BASH_ENV=/etc/bashrc` in your zsh profile (`.zshrc`), for example as shown below.
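
A minimal sketch of the workaround, assuming your zsh profile lives at `~/.zshrc`:

```Bash
# Make non-interactive Bash scripts source the system bashrc,
# so the module environment is initialized correctly under zsh
echo 'export BASH_ENV=/etc/bashrc' >> ~/.zshrc
```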

### I modified my bashrc, but it's not doing what I expect. How can I debug that bash script?
The bashrc can be debugged like any other Bash script, e.g. using Bash's trace mode as sketched below:
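
One possible sketch; the exact command may differ from the full answer in the documentation:

```Bash
# Trace every command while sourcing the file in a fresh shell
bash -x -c 'source ~/.bashrc'

# Alternatively, enable tracing inside the file itself while debugging
# (put near the top of ~/.bashrc and remove when done)
# set -x
```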
@@ -163,6 +101,24 @@ The job is not allowed to start because you have reached the maximum of allowed
**(ReqNodeNotAvail, UnavailableNodes:...)**
Some node required by the job is currently not available. The node may currently be in use, reserved for another job, in an advanced reservation, `DOWN`, `DRAINED`, or not responding. **Most probably there is an active reservation for all nodes due to an upcoming maintenance downtime (see the output of** `scontrol show reservation`) **and your job is not able to finish before the start of the downtime. This is another reason why you should specify the duration of a job (`--time`) as accurately as possible. Your job will start after the downtime has finished.** You can list all active reservations using `scontrol show reservation`.
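
As an illustration, a short sketch for checking upcoming reservations and requesting a tighter walltime; the 4-hour limit and `job.sh` are placeholders:

```Bash
# Show active and upcoming reservations (e.g. maintenance downtimes)
scontrol show reservation

# Request only the walltime the job really needs, so it can still be
# scheduled before the downtime starts
sbatch --time=04:00:00 job.sh
```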

### Why can I not submit jobs anymore?
After joining an HPC Workspace, the private SLURM account gets deactivated and a Workspace account needs to be specified.
This can be done by loading the Workspace module, see [Workspace environment](../hpc-workspaces/environment.md):

```Bash
module load Workspace
```

Otherwise Slurm will present the following error message:
```Bash
sbatch: error: AssocGrpSubmitJobsLimit
sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
```
With this method we aim to distribute our resources in a fairer manner. HPC resources, including compute power, should be distributed among registered research groups. We can only relate users to research groups by utilizing Workspace information.

### Why can't I submit further jobs?

!!! types note ""
14 changes: 4 additions & 10 deletions docs/halloffame.md → docs/general/halloffame.md
Expand Up @@ -3,7 +3,7 @@
If you previously used UBELIX to do your computational work, acknowledged this
in your publication, and want your publication listed here, please drop us a note via [https://serviceportal.unibe.ch/hpc](https://serviceportal.unibe.ch/hpc).
If you are wondering how you can acknowledge the usage of UBELIX in your
publication, have a look at the [homepage](index.md) of this documentation, where
publication, have a look at the [homepage](../index.md) of this documentation, where
you will find a text recommendation acknowledging the use of our cluster.

## Papers and Articles
@@ -79,9 +79,9 @@ Leichtle A, Fiedler G et al. | Pancreatic carcinoma, pancreatitis, and healthy c

## Posters

![Poster J. T. Casanova et al, 2023](images/casanova_2023_iap.png "J. T. Casanova et al., Computational approach to anti-Kasha photochemistry of Pt-dithiolene complexes, 2023"){: style="max-width: 100%"}
![Poster J. T. Casanova et al, 2023](../images/casanova_2023_iap.png "J. T. Casanova et al., Computational approach to anti-Kasha photochemistry of Pt-dithiolene complexes, 2023"){: style="max-width: 100%"}

![Poster Schwab et al, 2016](images/hof_schwab_2016_ncsml.png "Schwab et al., Computational neuroscience: Validation and reliability of directed dynamic networks of the brain, 2016"){: style="max-width: 100%"}
![Poster Schwab et al, 2016](../images/hof_schwab_2016_ncsml.png "Schwab et al., Computational neuroscience: Validation and reliability of directed dynamic networks of the brain, 2016"){: style="max-width: 100%"}

## Newspapers

@@ -92,10 +92,4 @@ Berner Forscher entdecken neue Klimazustände, in denen Leben möglich ist. | De
## Create an Entry

If you used UBELIX for your publication, please have your entry added to the list.
Open a ticket or create a pull request, see [Documentation Update](general/support.md).
The format of the entry is markdown:

```
<first author>, <last author> | <title> | [Details](<Boris link>) | [Direct Link](<DOI link>)
```
where authors are given as last name and first initial. Subscripts can be created using `<sub>2</sub>`.
Please open a ticket with the details of your publication.
40 changes: 28 additions & 12 deletions docs/general/news.md
@@ -1,17 +1,33 @@
# News

11-08-2022:
12.01.2024:

- added two additional login nodes to the cluster
- The user documentation has been streamlined and updated with recent information
- The UBELIX9 testing system, previewing the next-generation OS, is now available for all users.

In our ongoing commitment to providing a secure and efficient computing environment, we are migrating the HPC system from CentOS 7 to Rocky Linux 9.

We are pleased to inform you that a part of our infrastructure has been migrated and is ready to be tested by you. To get you started, please take the time to read this information thoroughly.

As part of the migration, we have implemented general software and security updates to ensure a secure and optimized computing environment. Please consult the manual pages (i.e., `man <command>`) to review the latest command syntax.

The list of software modules managed by the UBELIX team and accessible via the module commands has been updated. Please note that old software versions may have been discontinued in favor of more recent versions. Additionally, the Vital-IT and UBELIX software stacks have been merged. Explore the enhanced range of modules to benefit from the latest tools and applications available on UBELIX using the `module spider` command.
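
For instance, a quick way to browse the merged software stack; `Python` is only an example package name:

```Bash
# List all available modules, then search for a specific package
module spider
module spider Python
```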

While we have taken measures to minimize user impact, it is crucial to be aware of potential adjustments needed on your end. Most importantly, please verify that your workflows, scripts, and applications are compatible with the new environment.

It is important to note that there may be a need to recompile your executables for compatibility with the new system. Existing Python environments are expected to remain functional unless special libraries such as TensorFlow with GPU support are used. These may require a fresh installation.

Additionally, older software modules that are no longer managed by the UBELIX team may need to be installed by users if required. Instructions for custom software module installations can be found in the documentation section on EasyBuild.

The testing system is kept simple; therefore, only the default Quality of Service (QOS) is available for now. Investor resources have not been migrated yet and are still fully accessible on the old system. Existing job scripts that use the debug, long, gpu_preempt, and invest QOS need to be updated. Investors are encouraged to reach out to us if they wish to proceed with the migration of their resources.

To access the new system, please log in to submit02: `ssh <username>@submit02.unibe.ch`

Note that the graphical monitoring ([https://ubelix.unibe.ch/](https://ubelix.unibe.ch/)) does not cover the new testing environment yet. Please use the `squeue --me` command to query your job status on the new system, as sketched below. More details on the monitoring of the new system will follow.
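
A minimal sketch of the workflow described above; `<username>` is a placeholder for your Campus Account:

```Bash
# Log in to the testing system and check the status of your own jobs
ssh <username>@submit02.unibe.ch
squeue --me
```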

If you encounter any issues, we are ready to assist you. Feel free to reach out via [https://serviceportal.unibe.ch/hpc](https://serviceportal.unibe.ch/hpc). Please make sure to specify that your problem is related to the UBELIX testing environment and provide as much information as possible.

We appreciate your attention to these details and your cooperation as we work together to ensure a smooth transition to Rocky Linux 9.

04-05-2021:
Happy computing!

- major SLURM partition restructure, see [Slurm partitions](../slurm/partitions.md). Job scripts may need to be adapted.
- HPC Workspace officially in production [HPC Workspace Overview](../hpc-workspaces/workspaces.md)
- Kernel, CUDA driver, SLURM, and Spectrum Scale update
- in June: HOME quota fixed to 1TB, removal of GPFS institute shared directories

17-02-2021:

- Home migration: User HOMEs started to get migrated to the newer Spectrum Scale System storage
- HPC Workspaces: Beta Phase of custom group shared file spaces with tools and Slurm accounting
