@cerlane cerlane commented Aug 25, 2025

Start again using a branch from #231

preview available: https://docs.tds.cscs.ch/241

@bcumming bcumming changed the title from "Fix/feedbacks" to "Add docs for GPU saturation tool" Aug 27, 2025

@bcumming bcumming left a comment

Thanks for the contribution!

I have some suggested changes, and I have tried to add some extra information that might have been missing in earlier reviews.

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
&& apt-get install -y wget rsync rclone vim git htop nvtop nano \

Is `nano` needed here?

```
As you can see from the above example, gssr can easily be installed with a `RUN pip install gssr` command.

Once your `ContainerFile` is ready, you can build it on any Alps platform with the following commands to create a container with the label `mycontainer`.

The docs have a guide on how to build containers on Alps that you could link to.

For more information about building containers on Alps, see our [Podman guide][ref-building-containers].

https://docs.cscs.ch/contributing/#internal-links


## Create CSCS configuration for Container

The next step is to tell the CSCS Container Engine where your container is and how you would like to run it. To do so, create a `{label}.toml` file in your `$HOME/.edf` directory.

You can use the existing documentation for the EDF file format, to make your life easier.
Find sections to link to here: https://docs.cscs.ch/software/container-engine/
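
As an illustration of what such a file could contain, here is a minimal sketch of `$HOME/.edf/mycontainer.toml`. The `image`, `mounts` and `workdir` keys are assumed from the Container Engine documentation linked above, and every path is a hypothetical placeholder that does not come from this PR.

```toml
# Hypothetical EDF sketch: all paths are placeholders, adjust them to your setup.
# Assumed keys (verify against the Container Engine docs): image, mounts, workdir.
image = "/path/to/mycontainer.sqsh"            # container image to run (placeholder path)
mounts = ["/path/on/host:/path/in/container"]  # host:container bind mounts
workdir = "/path/in/container"                 # working directory inside the container
```

Linking to the existing EDF reference, rather than repeating key descriptions in the new page, would keep the two documents from drifting apart.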

* [Quickstart Guide][ref-gssr-quickstart]
* [Container Guide][ref-gssr-containers]

This tool produces time-series plots and heatmaps of the profiled metric values. Here is an example of one set of plots generated by the tool for the Megatron-LLM application from EPFL.

The guidance on including images has been updated:
https://docs.cscs.ch/contributing/#screenshots

To follow up: the images are attractive and suggest that the tool is capable of providing diverse feedback.

Maybe you could add brief documentation on the types of feedback provided, and use the images to illustrate them?

@msimberg msimberg commented Sep 5, 2025

@cerlane what's the status of this PR? Do you think you'll have time to implement some of the suggested changes or would you like any help in getting it merged?

cerlane and others added 4 commits September 9, 2025 11:14
Co-authored-by: Ben Cumming <bcumming@cscs.ch>
Co-authored-by: Ben Cumming <bcumming@cscs.ch>
Co-authored-by: Ben Cumming <bcumming@cscs.ch>
github-actions bot commented Sep 9, 2025

preview available: https://docs.tds.cscs.ch/241

@bcumming bcumming marked this pull request as draft September 29, 2025 08:01