Commit
Merge branch 'master' into prometheus-metrics-update
jaywonchung authored Jan 14, 2025
2 parents 5d99d99 + c5e09ed commit a2c672f
Showing 17 changed files with 1,868 additions and 216 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/check_homepage_build.yaml
@@ -28,7 +28,7 @@ jobs:
           python-version: 3.9
           cache: 'pip'
       - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
-      - uses: actions/cache@v3
+      - uses: actions/cache@v4
         with:
           key: mkdocs-material-${{ env.cache_id }}
           path: .cache
2 changes: 1 addition & 1 deletion .github/workflows/deploy_homepage.yaml
@@ -27,7 +27,7 @@ jobs:
           python-version: 3.9
           cache: 'pip'
       - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
-      - uses: actions/cache@v3
+      - uses: actions/cache@v4
         with:
           key: mkdocs-material-${{ env.cache_id }}
           path: .cache
2 changes: 1 addition & 1 deletion .github/workflows/zeusd_fmt_lint_test.yaml
@@ -29,7 +29,7 @@ jobs:
       - name: Install the Rust toolchain
         run: rustup toolchain install stable --profile minimal
       - name: Cache dependencies
-        uses: actions/cache@v2
+        uses: actions/cache@v4
         with:
           path: |
             ~/.cargo/bin/
2 changes: 1 addition & 1 deletion README.md
@@ -15,7 +15,7 @@
 ---
 **Project News**
 
-- \[2024/08\] Perseus, an optimizer for large model training, was accepted to SOSP'24! [Preprint](https://arxiv.org/abs/2312.06902) | [Blog](https://ml.energy/zeus/research_overview/perseus) | [Optimizer](https://ml.energy/zeus/optimize/pipeline_frequency_optimizer)
+- \[2024/08\] Perseus, an optimizer for large model training, was accepted to SOSP'24! [Paper](https://dl.acm.org/doi/10.1145/3694715.3695970) | [Blog](https://ml.energy/zeus/research_overview/perseus) | [Optimizer](https://ml.energy/zeus/optimize/pipeline_frequency_optimizer)
 - \[2024/07\] Added AMD GPU, CPU, and DRAM energy measurement support, and preliminary JAX support!
 - \[2024/05\] Zeus is now a PyTorch ecosystem project. Read the PyTorch blog post [here](https://pytorch.org/blog/zeus/)!
 - \[2024/02\] Zeus was selected as a [2024 Mozilla Technology Fund awardee](https://foundation.mozilla.org/en/blog/open-source-AI-for-environmental-justice/)!
3 changes: 1 addition & 2 deletions docs/research_overview/index.md
@@ -5,5 +5,4 @@ Even more research is ongoing, and Zeus will continue to expand and get better a
 
 1. Zeus (NSDI, 2023): [Paper](https://www.usenix.org/conference/nsdi23/presentation/you) | [Blog](zeus.md) | [Slides](https://www.usenix.org/system/files/nsdi23_slides_chung.pdf)
 1. Chase (ICLRW, 2023): [Paper](https://arxiv.org/abs/2303.02508)
-1. Perseus (SOSP, 2024): [Preprint](https://arxiv.org/abs/2312.06902) | [Blog](perseus.md)
-
+1. Perseus (SOSP, 2024): [Paper](https://dl.acm.org/doi/10.1145/3694715.3695970) | [Blog](perseus.md)
2 changes: 1 addition & 1 deletion docs/research_overview/perseus.md
@@ -8,7 +8,7 @@ description: Reducing Energy Bloat in Large Model Training
 
 SOSP '24
 
-[**Paper**](https://arxiv.org/abs/2312.06902)
+[**Paper**](https://dl.acm.org/doi/10.1145/3694715.3695970) | [**ArXiv**](https://arxiv.org/abs/2312.06902)
 </div>
 
 <div class="critic-dark" markdown>
30 changes: 30 additions & 0 deletions examples/carbon_emission_monitor/README.md
@@ -0,0 +1,30 @@
# Integrating Zeus with HuggingFace 🤗

This example demonstrates how to integrate Zeus's `CarbonEmissionMonitor` with HuggingFace Transformers:

`run_clm.py`: Transformers [`Trainer`](https://huggingface.co/docs/transformers/main_classes/trainer) for **causal language modeling** (i.e., pre-training)

## Dependencies

To run the `Trainer` integration script (`run_clm.py`):
```sh
pip install -r requirements.txt
```

## `CarbonEmissionMonitor`

[`CarbonEmissionMonitor`](https://ml.energy/zeus/reference/monitor/carbon/#zeus.monitor.carbon.CarbonEmissionMonitor): Measures the GPU time, energy consumption, and carbon emission of arbitrary code blocks.
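As a minimal sketch of what that looks like in code: the `begin_window`/`end_window` pattern below is assumed from Zeus's `ZeusMonitor` API, and the window name `"train"` is arbitrary; see the linked reference for the authoritative signature.

```python
# Hedged sketch: assumes CarbonEmissionMonitor follows the same
# begin_window()/end_window() measurement-window pattern as ZeusMonitor.
def measure_training_block():
    try:
        from zeus.monitor.carbon import CarbonEmissionMonitor
    except ImportError:
        # zeus-ml is not installed (`pip install zeus-ml`); nothing to measure.
        return None

    monitor = CarbonEmissionMonitor()
    monitor.begin_window("train")
    # ... the code block to measure (e.g., one training epoch) goes here ...
    # The returned measurement covers the GPU time, energy, and carbon
    # emission of everything between begin_window() and end_window().
    return monitor.end_window("train")


if __name__ == "__main__":
    print(measure_training_block())
```

In the `Trainer` integration below, this windowing is handled for you by `run_clm.py`; the sketch is only meant to show the shape of the API for arbitrary code blocks.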

## Running the Example

By default, `Trainer` will make use of all available GPUs. If you would like to use only a subset of the GPUs, specify the `CUDA_VISIBLE_DEVICES` environment variable, which Zeus will also automatically respect.

```bash
python run_clm.py \
--model_name_or_path gpt2 \
--dataset_name wikitext \
--dataset_config_name wikitext-2-raw-v1 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 8 \
--do_train \
--do_eval \
  --output_dir /tmp/test-clm
```
9 changes: 9 additions & 0 deletions examples/carbon_emission_monitor/requirements.txt
@@ -0,0 +1,9 @@
zeus-ml
accelerate >= 0.12.0
torch >= 1.3
datasets >= 1.8.0
sentencepiece != 0.1.92
protobuf
evaluate
scikit-learn
transformers>=4.37.2