From d63b862139dd3da85764730c688f025929cdae39 Mon Sep 17 00:00:00 2001
From: Manfei <41607353+ManfeiBai@users.noreply.github.com>
Date: Sun, 5 Nov 2023 23:54:22 -0800
Subject: [PATCH] Update TORCH_XLA_USER_GUIDE.md

---
 TORCH_XLA_USER_GUIDE.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/TORCH_XLA_USER_GUIDE.md b/TORCH_XLA_USER_GUIDE.md
index 10c561f85..55d156285 100644
--- a/TORCH_XLA_USER_GUIDE.md
+++ b/TORCH_XLA_USER_GUIDE.md
@@ -91,11 +91,11 @@ gcloud compute tpus tpu-vm scp params_70b.json ${TPU_NAME}:params.json --zone ${
 gcloud compute tpus tpu-vm ssh ${TPU_NAME} --zone ${ZONE} --project ${PROJECT_ID} --worker=all --command="cd $HOME/llama && PJRT_DEVICE=TPU XLA_FLAGS=--xla_dump_to=/tmp/dir_name PROFILE_LOGDIR=/tmp/home/ python3.8 example_text_completion.py --ckpt_dir . --tokenizer_path $HOME/llama/t5_tokenizer/spiece.model --max_seq_len 2048 --max_gen_len 1000 --max_batch_size 2 --mp True --dynamo True"
 ```
 
-## Commands to Run Llama2 using XLA:GPU (e.g. L4 or H100)
+## Commands to Run Llama2 using XLA:GPU (e.g. L4 or H100) without Quantization
 
-`example_text_completion.py` can also be ran on GPUs with XLA:GPU. To do that, you need different wheels than the above such
+`example_text_completion.py` can also be run on GPUs with XLA:GPU without quantization. To do that, you need different wheels than the above such
 that you have XLA:GPU support. Please refer to [pytorch/xla](https://github.com/pytorch/xla#wheel) repo to download
-a suitable GPU nightly wheel for your environment.
+a suitable GPU nightly wheel (dated 2023/04/22) for your environment.
 
 After that, you can run the following the command:
 ```