@@ -13,6 +13,22 @@ You can also do pip install mlc-scripts and then use `mlcr` commands for downloa
 - DeepSeek-R1 model is automatically downloaded as part of setup
 - Checkpoint conversion is done transparently when needed.
 
+**Using MLC R2 Downloader**
+
+Download the model using the MLC R2 Downloader:
+
+```bash
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
+  https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+```
+
+To specify a custom download directory, use the `-d` flag:
+```bash
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
+  -d /path/to/download/directory \
+  https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+```
+
 ## Dataset Download
 
 The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livecodebench(code_generation_lite). They are covered by the following licenses:
@@ -23,24 +39,17 @@ The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livec
 - MMLU-Pro: [MIT](https://opensource.org/license/mit)
 - livecodebench(code_generation_lite): [CC](https://creativecommons.org/share-your-work/cclicenses/)
 
-### Preprocessed
-
-**Using MLCFlow Automation**
-
-```
-mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
-```
+### Preprocessed & Calibration
 
-**Using Native method**
+**Using MLC R2 Downloader**
 
 Download the preprocessed dataset using the MLCommons downloader:
 
 ```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
-  https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d ./ https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
 ```
 
-This will download the dataset file `mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`.
+This will download both the full preprocessed dataset (`mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`) and the calibration dataset (`mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`).
 
 To specify a custom download directory, use the `-d` flag:
 ```bash
@@ -49,30 +58,20 @@ bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/he
   https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
 ```
 
-### Calibration
+### Preprocessed
 
 **Using MLCFlow Automation**
 
 ```
-mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
+mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
 ```
 
-**Using Native method**
+### Calibration
 
-Download the calibration dataset using the MLCommons downloader:
+**Using MLCFlow Automation**
 
-```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
-  https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
 ```
-
-This will download the calibration dataset file `mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`.
-
-To specify a custom download directory, use the `-d` flag:
-```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
-  -d /path/to/download/directory \
-  https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
 ```
 
 ## Docker
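The "Preprocessed & Calibration" text in this change says a single download fetches both `mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl` and `mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`. A minimal sketch for confirming that both files arrived might look like the following. The `check_datasets` function is hypothetical and not part of the README or the downloader, and it assumes the downloader writes both files directly into the directory passed to `-d`, which the diff does not confirm.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the README or the downloader script).
# Assumption: both .pkl files sit directly inside the download directory.
check_datasets() {
  local dir="${1:-.}"   # directory to inspect; defaults to the current one
  local missing=0
  local f
  for f in \
    mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl \
    mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl; do
    # -s is true only when the file exists and is non-empty
    if [ -s "$dir/$f" ]; then
      echo "found: $dir/$f"
    else
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 when both files are present, 1 otherwise
  return "$missing"
}
```

For example, `check_datasets /path/to/download/directory` returns a non-zero status if either file is absent or empty, so it can gate a later benchmark step in a script.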