tensor shape match error #14

Closed

walegahaha123 opened this issue Oct 31, 2024 · 1 comment

walegahaha123 commented Oct 31, 2024

The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1

The error only appears when I change gradient_checkpointing to True; with it set to False there is no error, but training then uses a lot more memory.
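For context, a minimal sketch of how I understand gradient checkpointing is normally toggled on a Hugging Face model (plain transformers API, not the repo's own training path; the path below is just the directory from my config). The deprecation warning about `_set_gradient_checkpointing` that shows up in the log below may be related:

```python
# Hypothetical sketch, not the HLLM code path. "../item_pretrain" is the
# model directory from my config above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("../item_pretrain")

# Plain enable call, as used by default.
model.gradient_checkpointing_enable()

# transformers 4.41 also accepts an explicit non-reentrant variant; the
# deprecation warning in the log below suggests these kwargs are silently
# ignored when the modeling file still defines _set_gradient_checkpointing.
model.gradient_checkpointing_enable(
    gradient_checkpointing_kwargs={"use_reentrant": False}
)
```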

Environment:
cuda==11.7 (limited by machine)
python==3.9
pytorch==2.0.1 (limited by CUDA version)
deepspeed==0.14.2
transformers==4.41.1
lightning==2.3.0 (lightning==2.4.0 requires torch<4.0,>=2.1.0)
wheel==0.44.0
flash-attn==2.6.3 (not enabled, limited by machine)
fbgemm-gpu==0.5.0
sentencepiece==0.2.0
pandas==2.2.3
colorlog==6.9.0
tensorboardX==2.6.2.2
tensorflow_cpu==2.8.0
colorama==0.4.6
torch_geometric==2.5.3
scikit-learn==1.5.2
protobuf==3.20
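For reference, a quick way to confirm these versions at runtime (plain Python, nothing repo-specific):

```python
# Print the versions of the key packages listed above.
import torch
import transformers
import deepspeed
import lightning

print("torch", torch.__version__, "built for CUDA", torch.version.cuda,
      "| cuda available:", torch.cuda.is_available())
print("transformers", transformers.__version__)
print("deepspeed", deepspeed.__version__)
print("lightning", lightning.__version__)
```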

ERROR LOG:

++ date +%FT%T
+ start_time=2024-10-31T18:29:49
+ [[ '' == '' ]]
+ [[ '' == '' ]]
+ nnodes=1
+ node_rank=0
+ master_port=12345
++ nvidia-smi --list-gpus
++ wc -l
+ nproc_per_node=1
+ torchrun --master_port=12345 --node_rank=0 --nproc_per_node=1 --nnodes=1 run.py --config_file overall/LLM_deepspeed.yaml HLLM/HLLM.yaml --MAX_ITEM_LIST_LENGTH 3 --epochs 5 --optim_args.learning_rate 1e-4 --MAX_TEXT_LENGTH 3 --train_batch_size 2
    [2024-10-31 18:29:55,303] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
    [WARNING] async_io requires the dev libaio .so object and headers but these were not found.
    [WARNING] async_io: please install the libaio-devel package with yum
    [WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
    [WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
    [WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
    [WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
    [WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
    31 Oct 18:30 INFO Update text_path to /data/home/xconnorwang/HLLM/information/Pixel200K.csv
    31 Oct 18:30 INFO Loading <class 'REC.data.dataload.Data'> from scratch with self.data_split = None.
    31 Oct 18:30 INFO Interaction feature loaded successfully from [../dataset/Pixel200K.csv].
    31 Oct 18:30 INFO self.user_num = 200001 self.item_num = 96283
    31 Oct 18:30 INFO self.inter_feat['item_id'].isna().any() = False self.inter_feat['user_id'].isna().any() = False
    31 Oct 18:30 INFO build Pixel200K dataload
    31 Oct 18:30 INFO Use random sample True for mask id
    31 Oct 18:30 INFO Text path: /data/home/xconnorwang/HLLM/information/Pixel200K.csv
    31 Oct 18:30 INFO Text keys: ['title', 'tag', 'description']
    31 Oct 18:30 INFO Item prompt: Compress the following sentence into embedding:
    31 Oct 18:30 INFO Text Item num: 96281
    31 Oct 18:30 INFO [Training]: train_batch_size = [2]
    31 Oct 18:30 INFO [Evaluation]: eval_batch_size = [3]
    /data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 11 worker processes in total. Our suggested max number of worker in current system is 10, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
    warnings.warn(_create_warning_msg(
    len(train_loader) = 408043
    31 Oct 18:30 INFO create item llm
    31 Oct 18:30 INFO ******* create LLM ../item_pretrain *******
    31 Oct 18:30 INFO hf_config: LlamaConfig {
    "_name_or_path": "../item_pretrain",
    "architectures": [
    "LlamaForCausalLM"
    ],
    "attention_bias": false,
    "attention_dropout": 0.0,
    "bos_token_id": 0,
    "eos_token_id": 0,
    "hidden_act": "silu",
    "hidden_size": 576,
    "initializer_range": 0.02,
    "intermediate_size": 1536,
    "max_position_embeddings": 2048,
    "mlp_bias": false,
    "model_type": "llama",
    "num_attention_heads": 9,
    "num_hidden_layers": 30,
    "num_key_value_heads": 3,
    "pretraining_tp": 1,
    "rms_norm_eps": 1e-05,
    "rope_scaling": null,
    "rope_theta": 10000.0,
    "tie_word_embeddings": true,
    "torch_dtype": "bfloat16",
    "transformers_version": "4.41.1",
    "use_cache": true,
    "vocab_size": 49152
    }

31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
31 Oct 18:30 INFO create user llm
31 Oct 18:30 INFO ******* create LLM ../user_pretrain *******
31 Oct 18:30 INFO hf_config: LlamaConfig {
"_name_or_path": "../user_pretrain",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 0,
"eos_token_id": 0,
"hidden_act": "silu",
"hidden_size": 576,
"initializer_range": 0.02,
"intermediate_size": 1536,
"max_position_embeddings": 2048,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 9,
"num_hidden_layers": 30,
"num_key_value_heads": 3,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 10000.0,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.41.1",
"use_cache": true,
"vocab_size": 49152
}

31 Oct 18:30 INFO xxxxx starting loading checkpoint
31 Oct 18:30 INFO Using flash attention False for llama
31 Oct 18:30 INFO Init True for llama
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
31 Oct 18:30 INFO nce thres setting to 0.99
31 Oct 18:30 INFO item_emb_tokens torch.Size([1, 1, 576]) True
31 Oct 18:30 INFO logit_scale torch.Size([]) True
31 Oct 18:30 INFO item_llm.model.embed_tokens.weight torch.Size([49152, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO item_llm.model.layers.0.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO item_llm.model.layers.0.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO item_llm.model.layers.1.self_attn.v_proj.weight torch.Size([192, 576]) True
...
31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.26.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.26.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.27.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.27.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.28.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.28.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.q_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.k_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.v_proj.weight torch.Size([192, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.self_attn.o_proj.weight torch.Size([576, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.gate_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.up_proj.weight torch.Size([1536, 576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.mlp.down_proj.weight torch.Size([576, 1536]) True
31 Oct 18:30 INFO user_llm.model.layers.29.input_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.layers.29.post_attention_layernorm.weight torch.Size([576]) True
31 Oct 18:30 INFO user_llm.model.norm.weight torch.Size([576]) True
31 Oct 18:30 INFO
World_Size = 1

31 Oct 18:30 INFO
General Hyper Parameters:
seed = 2020
state = INFO
use_text = True
reproducibility = True
checkpoint_dir = saved
show_progress = True
log_wandb = False
data_path = ../dataset/
strategy = deepspeed
precision = bf16-mixed
model = HLLM

Training Hyper Parameters:
epochs = 5
train_batch_size = 2
optim_args = {'learning_rate': 0.0001, 'weight_decay': 0.01}
eval_step = 1
stopping_step = 5

Evaluation Hyper Parameters:
eval_batch_size = 3
topk = [5, 10, 50, 200]
metrics = ['Recall', 'NDCG']
valid_metric = NDCG@200
metric_decimal_place = 7
eval_type = EvaluatorType.RANKING
valid_metric_bigger = True

Dataset Hyper Parameters:
MAX_ITEM_LIST_LENGTH = 3
MAX_TEXT_LENGTH = 3
text_keys = ['title', 'tag', 'description']
item_prompt = Compress the following sentence into embedding:

Other Hyper Parameters:
wandb_project = REC
text_path = /data/home/xconnorwang/HLLM/information/Pixel200K.csv
item_emb_token_n = 1
loss = nce
scheduler_args = {'type': 'cosine', 'warmup': 0.1}
stage = 3
gradient_checkpointing = True
zero3_init_flag = False
item_pretrain_dir = ../item_pretrain
item_llm_init = True
user_pretrain_dir = ../user_pretrain
user_llm_init = True
use_ft_flash_attn = False
MODEL_INPUT_TYPE = InputType.SEQ
device = cuda:0

31 Oct 18:30 INFO Pixel200K
The number of users: 200001
Average actions of users: 19.82828
The number of items: 96283
Average actions of items: 41.187927130720176
The number of inters: 3965656
The sparsity of the dataset: 99.9794063532928%
31 Oct 18:30 INFO HLLM(
(item_llm): LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(49152, 576)
(layers): ModuleList(
(0-29): 30 x LlamaDecoderLayer(
(self_attn): LlamaAttention(
(q_proj): Linear(in_features=576, out_features=576, bias=False)
(k_proj): Linear(in_features=576, out_features=192, bias=False)
(v_proj): Linear(in_features=576, out_features=192, bias=False)
(o_proj): Linear(in_features=576, out_features=576, bias=False)
(rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
(gate_proj): Linear(in_features=576, out_features=1536, bias=False)
(up_proj): Linear(in_features=576, out_features=1536, bias=False)
(down_proj): Linear(in_features=1536, out_features=576, bias=False)
(act_fn): SiLUActivation()
)
(input_layernorm): LlamaRMSNorm()
(post_attention_layernorm): LlamaRMSNorm()
)
)
(norm): LlamaRMSNorm()
)
(lm_head): Linear(in_features=576, out_features=49152, bias=False)
)
(user_llm): LlamaForCausalLM(
(model): LlamaModel(
(embed_tokens): Embedding(49152, 576)
(layers): ModuleList(
(0-29): 30 x LlamaDecoderLayer(
(self_attn): LlamaAttention(
(q_proj): Linear(in_features=576, out_features=576, bias=False)
(k_proj): Linear(in_features=576, out_features=192, bias=False)
(v_proj): Linear(in_features=576, out_features=192, bias=False)
(o_proj): Linear(in_features=576, out_features=576, bias=False)
(rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
(gate_proj): Linear(in_features=576, out_features=1536, bias=False)
(up_proj): Linear(in_features=576, out_features=1536, bias=False)
(down_proj): Linear(in_features=1536, out_features=576, bias=False)
(act_fn): SiLUActivation()
)
(input_layernorm): LlamaRMSNorm()
(post_attention_layernorm): LlamaRMSNorm()
)
)
(norm): LlamaRMSNorm()
)
(lm_head): Linear(in_features=576, out_features=49152, bias=False)
)
)
Trainable parameters: 269030593.0
31 Oct 18:30 INFO Use consine scheduler with 204021.5 warmup 2040215 total steps
31 Oct 18:30 INFO Use deepspeed strategy
initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/1
Enabling DeepSpeed BF16. Model parameters and inputs will be cast to bfloat16.
31 Oct 18:30 INFO Added key: store_based_barrier_key:2 to store for rank: 0
31 Oct 18:30 INFO Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 1 nodes.
Parameter Offload: Total persistent parameters: 70849 in 124 params

Train [ 0/ 5]: 0%| | 0/408043 [00:00<?, ?it/s]/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 11 worker processes in total. Our suggested max number of worker in current system is 10, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
elem_shape: torch.Size([4]) elem: tensor([35434, 40551, 37030, 59065])
batch_len: 2 batch: [tensor([35434, 40551, 37030, 59065]), tensor([12579, 4861, 4391, 11309])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: elem_shape: torch.Size([4]) elem: tensor([87871, 81222, 81198, 22876])
batch_len: 2 batch: [tensor([87871, 81222, 81198, 22876]), tensor([48319, 60427, 57998, 62150])]
elem_shape: torch.Size([3]) elem: tensor([64465, 17314, 64780, 42073])
batch_len: 2 batch: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([64465, 17314, 64780, 42073]), tensor([30084, 36569, 75655, 9662])]
[tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4]) elem_shape: elem: torch.Size([4]) elem: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
tensor([[2021, 5, 11, 22, 21, 22],
[2021, 6, 26, 3, 21, 25],
[2021, 6, 26, 19, 6, 11],
[2021, 7, 2, 3, 9, 1]])
batch_len: 2 batch: tensor([ 425, 14391, 328, 17107])
batch_len: 2 batch: tensor([ 847, 39123, 15991, 22791])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4])[tensor([ 425, 14391, 328, 17107]), tensor([ 391, 13496, 15739, 5500])]
elem: [tensor([ 847, 39123, 15991, 22791]), tensor([41123, 14041, 23611, 73814])]
tensor([51018, 50191, 91965, 58533])
batch_len: 2 batch: [tensor([[2021, 5, 11, 22, 21, 22],
[2021, 6, 26, 3, 21, 25],
[2021, 6, 26, 19, 6, 11],
[2021, 7, 2, 3, 9, 1]]), tensor([[2019, 4, 2, 16, 6, 14],
[2019, 5, 13, 15, 55, 25],
[2019, 5, 27, 15, 25, 55],
[2020, 3, 13, 6, 15, 16]])]
[tensor([51018, 50191, 91965, 58533]), tensor([44203, 41980, 80114, 74846])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4])elem_shape: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem: elem_shape: torch.Size([4])torch.Size([4]) [tensor([1, 1, 1]), tensor([1, 1, 1])]elem: elem:
tensor([75806, 61616, 23817, 95456])
batch_len: 2 batch: tensor([71534, 69682, 37869, 58014])
batch_len: 2 /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
batch: elem_shape: elem_shape: torch.Size([4, 6]) elem: torch.Size([4]) elem: [tensor([75806, 61616, 23817, 95456]), tensor([53752, 39814, 26241, 94833])]
[tensor([71534, 69682, 37869, 58014]), tensor([67918, 40640, 76791, 7968])]
tensor([23374, 8848, 3948, 36802])
batch_len: 2tensor([[2020, 8, 23, 14, 40, 54],
[2021, 5, 2, 8, 51, 43],
[2021, 6, 26, 13, 45, 19],
[2021, 7, 23, 7, 49, 36]])
batch: batch_len: 2 batch: elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
tensor([ 2969, 25609, 13490, 14848])batch_len:
batch_len: 2 2 batch: batch: [tensor([23374, 8848, 3948, 36802]), tensor([46118, 11192, 20427, 6691])]
elem_shape: torch.Size([4]) elem: [tensor([ 2969, 25609, 13490, 14848]), tensor([78124, 39314, 69903, 52117])]
[tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
[tensor([[2020, 8, 23, 14, 40, 54],
[2021, 5, 2, 8, 51, 43],
[2021, 6, 26, 13, 45, 19],
[2021, 7, 23, 7, 49, 36]]), tensor([[2020, 7, 25, 16, 56, 10],
[2020, 8, 31, 10, 45, 23],
[2020, 8, 31, 11, 10, 33],
[2020, 9, 9, 8, 37, 37]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
tensor([18921, 20968, 23234, 15771])
batch_len: 2 batch: tensor([[2020, 5, 16, 9, 20, 33],
[2020, 8, 12, 12, 42, 53],
[2020, 8, 25, 23, 54, 8],
[2020, 12, 9, 8, 38, 4]])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4])elem_shape: elem: torch.Size([4]) elem: [tensor([18921, 20968, 23234, 15771]), tensor([41684, 16594, 21227, 22578])]
elem_shape: torch.Size([4]) elem: tensor([ 3645, 46545, 23431, 92630])
batch_len: 2 tensor([ 1778, 14200, 60825, 10431])batch:
batch_len: 2 batch: [tensor([ 3645, 46545, 23431, 92630]), tensor([21302, 48795, 47488, 34331])]
[tensor([ 1778, 14200, 60825, 10431]), tensor([82015, 85877, 42269, 53227])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
[tensor([[2020, 5, 16, 9, 20, 33],
[2020, 8, 12, 12, 42, 53],
[2020, 8, 25, 23, 54, 8],
[2020, 12, 9, 8, 38, 4]]), tensor([[2021, 1, 2, 1, 55, 10],
[2021, 1, 4, 12, 57, 38],
[2021, 1, 10, 8, 48, 46],
[2021, 1, 25, 8, 36, 14]])]
elem_shape: torch.Size([3]) elem: tensor([75506, 5080, 325, 80292])
elem_shape: batch_len: 2 torch.Size([3])batch: elem: tensor([1, 1, 1])/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)

batch_len: 2 batch: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([75506, 5080, 325, 80292]), tensor([ 8350, 15011, 15842, 12424])]
[tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: tensor([62601, 73106, 10840, 84653])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: [tensor([62601, 73106, 10840, 84653]), tensor([48662, 22713, 69375, 59])]/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)

tensor([[2021, 7, 30, 13, 3, 55],
[2021, 8, 2, 7, 9, 44],
[2021, 10, 7, 3, 53, 40],
[2021, 12, 18, 7, 53, 28]])
batch_len: 2 batch: tensor([[2020, 10, 15, 10, 0, 6],
[2020, 11, 4, 20, 23, 30],
[2020, 11, 9, 20, 0, 49],
[2020, 11, 25, 7, 34, 25]])
batch_len: 2elem_shape: batch: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
2 elem_shape: batch: torch.Size([4]) elem: tensor([36270, 52981, 82544, 71318])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
[tensor([36270, 52981, 82544, 71318]), tensor([58128, 32419, 83234, 4544])]/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)

elem_shape: torch.Size([4, 6]) elem: [tensor([[2020, 10, 15, 10, 0, 6],
[2020, 11, 4, 20, 23, 30],
[2020, 11, 9, 20, 0, 49],
[2020, 11, 25, 7, 34, 25]]), tensor([[2022, 1, 10, 13, 26, 1],
[2022, 1, 16, 11, 39, 38],
[2022, 1, 16, 12, 32, 54],
[2022, 1, 16, 12, 35, 23]])]
[tensor([[2021, 7, 30, 13, 3, 55],
[2021, 8, 2, 7, 9, 44],
[2021, 10, 7, 3, 53, 40],
[2021, 12, 18, 7, 53, 28]]), tensor([[2022, 3, 7, 17, 25, 50],
[2022, 3, 18, 17, 32, 55],
[2022, 3, 24, 12, 58, 29],
[2022, 4, 17, 13, 59, 9]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: tensor([[2019, 10, 10, 11, 54, 1],
[2019, 10, 28, 9, 41, 15],
[2019, 11, 1, 12, 55, 38],
[2019, 11, 2, 7, 20, 17]])2
batch: batch_len: 2 elem_shape: batch: torch.Size([4]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: tensor([11709, 58322, 49677, 18878])
batch_len: 2 batch: [tensor([[2019, 10, 10, 11, 54, 1],
[2019, 10, 28, 9, 41, 15],
[2019, 11, 1, 12, 55, 38],
[2019, 11, 2, 7, 20, 17]]), tensor([[2021, 10, 21, 10, 57, 1],
[2021, 10, 28, 4, 41, 13],
[2021, 10, 31, 14, 23, 40],
[2021, 11, 3, 14, 10, 39]])]
tensor([[2022, 1, 9, 4, 24, 57],
[2022, 1, 11, 15, 1, 33],
[2022, 1, 13, 17, 10, 21],
[2022, 2, 24, 16, 4, 35]])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
[tensor([11709, 58322, 49677, 18878]), tensor([72797, 27520, 313, 9565])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
[tensor([[2022, 1, 9, 4, 24, 57],
[2022, 1, 11, 15, 1, 33],
[2022, 1, 13, 17, 10, 21],
[2022, 2, 24, 16, 4, 35]]), tensor([[2021, 9, 11, 8, 53, 52],
[2021, 10, 1, 7, 4, 45],
[2021, 11, 1, 6, 57, 6],
[2021, 11, 2, 12, 50, 55]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
tensor([23121, 94384, 64729, 63419])
batch_len: 2 batch: [tensor([23121, 94384, 64729, 63419]), tensor([68192, 22562, 91147, 53581])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: tensor([[2017, 6, 15, 15, 24, 56],
[2018, 9, 2, 12, 31, 42],
[2018, 12, 5, 13, 24, 18],
[2018, 12, 30, 11, 41, 17]])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: [tensor([[2017, 6, 15, 15, 24, 56],
[2018, 9, 2, 12, 31, 42],
[2018, 12, 5, 13, 24, 18],
[2018, 12, 30, 11, 41, 17]]), tensor([[2021, 10, 16, 21, 55, 55],
[2021, 10, 17, 11, 25, 49],
[2021, 10, 22, 12, 27, 43],
[2021, 10, 24, 10, 50, 51]])]
tensor([[2021, 8, 13, 1, 4, 58],
[2021, 8, 29, 5, 0, 33],
[2021, 8, 29, 8, 9, 38],
[2021, 9, 15, 4, 46, 37]])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
[tensor([[2021, 8, 13, 1, 4, 58],
[2021, 8, 29, 5, 0, 33],
[2021, 8, 29, 8, 9, 38],
[2021, 9, 15, 4, 46, 37]]), tensor([[2021, 5, 8, 11, 38, 37],
[2021, 6, 6, 12, 34, 21],
[2021, 9, 4, 12, 5, 10],
[2021, 9, 16, 15, 41, 53]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: tensor([38585, 10559, 37311, 1179])
batch_len: 2 batch: [tensor([38585, 10559, 37311, 1179]), tensor([23749, 14811, 19801, 8735])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: tensor([67086, 15614, 41876, 65866])
batch_len: 2 batch: [tensor([67086, 15614, 41876, 65866]), tensor([78779, 84830, 13453, 29267])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: tensor([[2018, 11, 25, 8, 42, 38],
[2019, 8, 20, 8, 15, 18],
[2019, 9, 23, 9, 21, 23],
[2019, 10, 15, 4, 48, 33]])
batch_len: 2 batch: [tensor([[2018, 11, 25, 8, 42, 38],
[2019, 8, 20, 8, 15, 18],
[2019, 9, 23, 9, 21, 23],
[2019, 10, 15, 4, 48, 33]]), tensor([[2021, 12, 18, 7, 8, 29],
[2022, 1, 23, 2, 26, 47],
[2022, 2, 15, 8, 20, 36],
[2022, 2, 28, 7, 13, 14]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: elem_shape: torch.Size([4]) torch.Size([4])elem: elem: tensor([ 6662, 4673, 11970, 3575])
tensor([38929, 55609, 36272, 39413])batch_len:
2batch_len: batch: 2 batch: [tensor([38929, 55609, 36272, 39413]), tensor([10936, 63753, 14848, 14360])]
[tensor([ 6662, 4673, 11970, 3575]), tensor([44828, 36185, 87193, 63846])]
elem_shape: elem_shape: torch.Size([4]) torch.Size([4])elem: elem: tensor([32135, 93929, 34764, 69886])
batch_len: 2tensor([45581, 24052, 52267, 23602])
batch: batch_len: 2 batch: [tensor([32135, 93929, 34764, 69886]), tensor([54510, 83382, 65314, 46423])]
[tensor([45581, 24052, 52267, 23602]), tensor([37862, 78032, 23472, 36266])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4, 6]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4, 6])elem_shape: elem: torch.Size([4]) elem: tensor([25029, 50735, 72231, 82790])
tensor([[2020, 6, 22, 4, 8, 58],
[2020, 6, 24, 10, 35, 59],
[2021, 4, 25, 13, 5, 40],
[2021, 6, 30, 23, 12, 22]])batch_len:
2batch_len: elem_shape: batch: 2 batch: torch.Size([4]) elem: tensor([[2020, 4, 5, 0, 51, 10],
[2020, 4, 25, 5, 5, 49],
[2020, 4, 30, 14, 29, 13],
[2020, 5, 4, 13, 33, 40]])
tensor([ 4706, 45075, 33534, 35891])batch_len:
2batch_len: [tensor([25029, 50735, 72231, 82790]), tensor([36757, 13822, 39746, 12628])]batch: 2
batch: elem_shape: torch.Size([4]) elem: [tensor([ 4706, 45075, 33534, 35891]), tensor([14289, 39310, 54955, 22678])]
tensor([ 8623, 49238, 74673, 30410])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: elem_shape: tensor([85332, 83015, 13114, 16238])[tensor([ 8623, 49238, 74673, 30410]), tensor([ 5683, 50161, 52544, 12805])]

batch_len: 2 batch: [tensor([[2020, 6, 22, 4, 8, 58],
[2020, 6, 24, 10, 35, 59],
[2021, 4, 25, 13, 5, 40],
[2021, 6, 30, 23, 12, 22]]), tensor([[2020, 12, 13, 14, 34, 9],
[2020, 12, 15, 10, 28, 9],
[2020, 12, 16, 2, 21, 40],
[2020, 12, 16, 11, 30, 10]])]torch.Size([4])
elem: elem_shape: [tensor([[2020, 4, 5, 0, 51, 10],
[2020, 4, 25, 5, 5, 49],
[2020, 4, 30, 14, 29, 13],
[2020, 5, 4, 13, 33, 40]]), tensor([[2021, 3, 8, 14, 38, 4],
[2021, 4, 3, 13, 49, 1],
[2021, 4, 21, 14, 32, 33],
[2021, 5, 2, 11, 25, 25]])]torch.Size([3])
elem: [tensor([85332, 83015, 13114, 16238]), tensor([94158, 39764, 91330, 69704])]
tensor([ 6869, 8025, 14091, 70717])tensor([1, 1, 1])

batch_len: batch_len: 2 2batch: batch: elem_shape: torch.Size([3]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
tensor([1, 1, 1])
batch_len: 2 batch: elem_shape: [tensor([ 6869, 8025, 14091, 70717]), tensor([15109, 31352, 9331, 28637])]
torch.Size([4, 6]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4]) elem: elem_shape: torch.Size([4, 6]) elem: tensor([41009, 62101, 75279, 8355])
batch_len: tensor([[2020, 9, 20, 16, 18, 6],
[2020, 12, 11, 11, 11, 37],
[2021, 6, 13, 14, 34, 1],
[2021, 7, 13, 10, 17, 45]])
batch_len: 2 2batch: batch: [tensor([41009, 62101, 75279, 8355]), tensor([10073, 35383, 38809, 62264])]tensor([[2021, 6, 11, 9, 18, 6],
[2021, 6, 29, 7, 45, 39],
[2021, 7, 1, 5, 32, 32],
[2021, 7, 8, 0, 9, 53]])

batch_len: 2 batch: elem_shape: torch.Size([3]) elem: [tensor([[2020, 9, 20, 16, 18, 6],
[2020, 12, 11, 11, 11, 37],
[2021, 6, 13, 14, 34, 1],
[2021, 7, 13, 10, 17, 45]]), tensor([[2020, 4, 25, 6, 12, 24],
[2020, 5, 10, 15, 33, 41],
[2020, 8, 2, 17, 29, 52],
[2021, 1, 30, 13, 57, 19]])]
tensor([1, 1, 1])
batch_len: 2 batch: [tensor([[2021, 6, 11, 9, 18, 6],
[2021, 6, 29, 7, 45, 39],
[2021, 7, 1, 5, 32, 32],
[2021, 7, 8, 0, 9, 53]]), tensor([[2020, 4, 5, 8, 51, 56],
[2020, 4, 9, 3, 39, 43],
[2020, 4, 13, 7, 54, 40],
[2020, 4, 15, 8, 2, 15]])]
elem_shape: torch.Size([4]) elem: elem_shape: [tensor([1, 1, 1]), tensor([1, 1, 1])]
torch.Size([4]) elem: elem_shape: torch.Size([4, 6]) elem: tensor([29506, 3031, 2450, 16728])
batch_len: 2tensor([27363, 17546, 49356, 12892])
batch: batch_len: 2 batch: [tensor([27363, 17546, 49356, 12892]), tensor([58303, 86109, 48931, 5863])]
[tensor([29506, 3031, 2450, 16728]), tensor([ 4410, 19712, 7912, 2837])]
tensor([[2020, 3, 22, 6, 8, 58],
[2020, 3, 23, 15, 57, 24],
[2020, 4, 1, 17, 19, 52],
[2020, 4, 23, 8, 40, 44]])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: elem_shape: torch.Size([4]) elem: tensor([41751, 69507, 17961, 37770])
batch_len: 2 batch: tensor([40460, 86051, 1461, 24607])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: [tensor([41751, 69507, 17961, 37770]), tensor([61301, 94601, 69568, 52381])]
[tensor([40460, 86051, 1461, 24607]), tensor([69028, 81627, 80154, 46609])]
tensor([10081, 38232, 20732, 2753])
batch_len: 2 batch: elem_shape: torch.Size([3]) elem: elem_shape: torch.Size([3]) elem: [tensor([10081, 38232, 20732, 2753]), tensor([69776, 64923, 40659, 53832])]tensor([1, 1, 1])

batch_len: [tensor([[2020, 3, 22, 6, 8, 58],
[2020, 3, 23, 15, 57, 24],
[2020, 4, 1, 17, 19, 52],
[2020, 4, 23, 8, 40, 44]]), tensor([[2021, 12, 10, 14, 1, 52],
[2022, 1, 21, 2, 55, 46],
[2022, 1, 30, 9, 57, 52],
[2022, 4, 8, 5, 5, 32]])]2
batch: tensor([1, 1, 1])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
tensor([80933, 54961, 71171, 20766])
batch_len: 2 batch: elem_shape: torch.Size([4, 6])[tensor([1, 1, 1]), tensor([1, 1, 1])] elem:
elem_shape: torch.Size([4, 6]) elem: [tensor([80933, 54961, 71171, 20766]), tensor([27471, 12016, 60008, 31466])]
tensor([[2021, 4, 16, 18, 52, 24],
[2021, 4, 24, 4, 1, 15],
[2021, 4, 28, 12, 34, 40],
[2021, 5, 27, 14, 53, 32]])
batch_len: 2 batch: elem_shape: torch.Size([3]) elem: tensor([[2022, 1, 19, 8, 21, 4],
[2022, 1, 25, 13, 21, 57],
[2022, 1, 25, 13, 26, 8],
[2022, 1, 25, 14, 37, 19]])
batch_len: 2 batch: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4, 6]) elem: [tensor([[2021, 4, 16, 18, 52, 24],
[2021, 4, 24, 4, 1, 15],
[2021, 4, 28, 12, 34, 40],
[2021, 5, 27, 14, 53, 32]]), tensor([[2022, 2, 11, 10, 37, 8],
[2022, 2, 14, 9, 58, 15],
[2022, 2, 15, 10, 35, 35],
[2022, 2, 25, 11, 35, 51]])]
[tensor([[2022, 1, 19, 8, 21, 4],
[2022, 1, 25, 13, 21, 57],
[2022, 1, 25, 13, 26, 8],
[2022, 1, 25, 14, 37, 19]]), tensor([[2021, 8, 8, 4, 48, 52],
[2021, 8, 9, 11, 59, 3],
[2021, 8, 11, 2, 36, 48],
[2021, 8, 15, 9, 18, 21]])]
tensor([[2019, 12, 21, 0, 20, 2],
[2019, 12, 24, 4, 56, 0],
[2020, 6, 3, 14, 42, 35],
[2020, 6, 7, 12, 34, 55]])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: [tensor([[2019, 12, 21, 0, 20, 2],
[2019, 12, 24, 4, 56, 0],
[2020, 6, 3, 14, 42, 35],
[2020, 6, 7, 12, 34, 55]]), tensor([[2021, 8, 5, 16, 34, 44],
[2021, 8, 10, 10, 42, 13],
[2021, 8, 15, 5, 23, 11],
[2021, 8, 25, 5, 38, 34]])]
tensor([25996, 21510, 15073, 22007])
batch_len: 2 batch: [tensor([25996, 21510, 15073, 22007]), tensor([20878, 33688, 13673, 23473])]
elem_shape: torch.Size([4]) elem: tensor([54102, 80143, 42441, 14827])
batch_len: 2 batch: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:43: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = elem.storage()._new_shared(numel)
[tensor([54102, 80143, 42441, 14827]), tensor([64859, 75260, 51319, 27606])]
elem_shape: torch.Size([4]) elem: /data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [8], which does not match the required output shape [2, 4]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: tensor([89791, 90644, 92446, 54050])
batch_len: 2 batch: tensor([65655, 37454, 50466, 48762])
batch_len: 2 batch: [tensor([89791, 90644, 92446, 54050]), tensor([56544, 83106, 41060, 1607])]
[tensor([65655, 37454, 50466, 48762]), tensor([60052, 200, 61482, 89754])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: elem_shape: torch.Size([3]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
tensor([1, 1, 1])
batch_len: 2 batch: elem_shape: torch.Size([4, 6]) elem: [tensor([1, 1, 1]), tensor([1, 1, 1])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [6], which does not match the required output shape [2, 3]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4, 6]) elem: tensor([[2021, 6, 4, 10, 56, 45],
[2021, 6, 9, 4, 12, 42],
[2021, 6, 13, 8, 18, 31],
[2021, 7, 10, 15, 46, 59]])
batch_len: 2 batch: tensor([[2021, 5, 26, 8, 51, 21],
[2021, 6, 2, 5, 45, 4],
[2021, 6, 2, 10, 47, 17],
[2021, 6, 14, 23, 51, 24]])
batch_len: 2 batch: [tensor([[2021, 6, 4, 10, 56, 45],
[2021, 6, 9, 4, 12, 42],
[2021, 6, 13, 8, 18, 31],
[2021, 7, 10, 15, 46, 59]]), tensor([[2018, 11, 16, 15, 22, 34],
[2019, 4, 6, 18, 8, 30],
[2019, 4, 19, 9, 34, 45],
[2019, 4, 30, 13, 9, 4]])]
[tensor([[2021, 5, 26, 8, 51, 21],
[2021, 6, 2, 5, 45, 4],
[2021, 6, 2, 10, 47, 17],
[2021, 6, 14, 23, 51, 24]]), tensor([[2020, 8, 29, 12, 27, 53],
[2020, 9, 4, 3, 29, 46],
[2020, 9, 16, 12, 10, 11],
[2020, 9, 16, 12, 15, 40]])]
/data/home/xconnorwang/HLLM/code/REC/data/dataset/collate_fn.py:45: UserWarning: An output with one or more elements was resized since it had shape [48], which does not match the required output shape [2, 4, 6]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:26.)
return torch.stack(batch, 0, out=out)
elem_shape: torch.Size([4]) elem: tensor([54514, 27056, 33162, 20521])
batch_len: 2 batch: elem_shape: torch.Size([4]) elem: [tensor([54514, 27056, 33162, 20521]), tensor([70133, 32499, 25117, 42046])]
elem_shape: torch.Size([4]) tensor([49495, 10004, 10945, 30683])elem:
batch_len: 2 batch: tensor([95307, 57247, 49990, 17365])
batch_len: 2 batch: [tensor([95307, 57247, 49990, 17365]), tensor([75868, 79781, 39034, 12100])]
[tensor([49495, 10004, 10945, 30683]), tensor([ 2930, 15228, 46649, 36206])]
elem_shape: torch.Size([3]) elem: elem_shape: torch.Size([4]) elem: tensor([1, 1, 1])
batch_len: 2 batch: tensor([75048, 81460, 72950, 55131])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
[tensor([75048, 81460, 72950, 55131]), tensor([76877, 71803, 95754, 64344])]
elem_shape: torch.Size([4, 6]) elem: elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
tensor([[2020, 9, 25, 5, 37, 22],
[2020, 12, 16, 14, 21, 19],
[2021, 4, 7, 11, 43, 34],
[2021, 4, 12, 10, 31, 36]])
batch_len: 2 batch: elem_shape: torch.Size([4, 6]) elem: tensor([[2021, 2, 26, 14, 46, 20],
[2021, 4, 5, 12, 24, 38],
[2021, 4, 5, 13, 3, 52],
[2021, 5, 15, 17, 1, 36]])
batch_len: 2 batch: [tensor([[2020, 9, 25, 5, 37, 22],
[2020, 12, 16, 14, 21, 19],
[2021, 4, 7, 11, 43, 34],
[2021, 4, 12, 10, 31, 36]]), tensor([[2019, 2, 10, 7, 8, 58],
[2019, 2, 10, 7, 26, 39],
[2019, 4, 3, 10, 11, 41],
[2019, 4, 11, 9, 40, 53]])]
[tensor([[2021, 2, 26, 14, 46, 20],
[2021, 4, 5, 12, 24, 38],
[2021, 4, 5, 13, 3, 52],
[2021, 5, 15, 17, 1, 36]]), tensor([[2021, 8, 30, 10, 47, 40],
[2021, 9, 28, 12, 1, 47],
[2021, 9, 29, 16, 14, 55],
[2021, 10, 1, 7, 18, 49]])]
elem_shape: torch.Size([4]) elem: tensor([45360, 536, 48045, 26275])
batch_len: 2 batch: [tensor([45360, 536, 48045, 26275]), tensor([20411, 11698, 21909, 15291])]
elem_shape: torch.Size([4]) elem: tensor([79237, 53705, 87849, 26910])
batch_len: 2 batch: [tensor([79237, 53705, 87849, 26910]), tensor([13761, 19517, 61800, 59291])]
elem_shape: torch.Size([3]) elem: tensor([1, 1, 1])
batch_len: 2 batch: [tensor([1, 1, 1]), tensor([1, 1, 1])]
elem_shape: torch.Size([4, 6]) elem: tensor([[2022, 2, 11, 1, 20, 13],
[2022, 2, 11, 1, 23, 9],
[2022, 2, 11, 2, 3, 3],
[2022, 2, 11, 6, 13, 31]])
batch_len: 2 batch: [tensor([[2022, 2, 11, 1, 20, 13],
[2022, 2, 11, 1, 23, 9],
[2022, 2, 11, 2, 3, 3],
[2022, 2, 11, 6, 13, 31]]), tensor([[2020, 7, 18, 15, 24, 36],
[2020, 8, 2, 16, 48, 4],
[2020, 10, 10, 17, 13, 9],
[2020, 11, 24, 6, 57, 41]])]
Traceback (most recent call last):
File "/data/home/xconnorwang/HLLM/code/run.py", line 139, in
run_loop(local_rank=local_rank, config_file=config_file, extra_args=extra_args)
File "/data/home/xconnorwang/HLLM/code/run.py", line 110, in run_loop
best_valid_score, best_valid_result = trainer.fit(
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 342, in fit
train_loss = self._train_epoch(train_data, epoch_idx, show_progress=show_progress)
File "/data/home/xconnorwang/HLLM/code/REC/trainer/trainer.py", line 198, in _train_epoch
self.lite.backward(losses)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/fabric.py", line 446, in backward
self._strategy.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/strategies/strategy.py", line 188, in backward
self.precision.backward(tensor, module, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/lightning/fabric/plugins/precision/deepspeed.py", line 91, in backward
model.backward(tensor, *args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1976, in backward
self.optimizer.backward(loss, retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/zero/stage3.py", line 2213, in backward
self.loss_scaler.backward(loss.float(), retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/deepspeed/runtime/fp16/loss_scaler.py", line 63, in backward
scaled_loss.backward(retain_graph=retain_graph)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/autograd/init.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: The size of tensor a (0) must match the size of tensor b (1536) at non-singleton dimension 1

Train [ 0/ 5]: 0%| | 0/408043 [00:05<?, ?it/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1073205) of binary: /usr/local/python3/bin/python3.9
Traceback (most recent call last):
File "/data/home/xconnorwang/.local/bin/torchrun", line 8, in
sys.exit(main())
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/home/xconnorwang/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

run.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2024-10-31_18:30:56
host : VM-143-114-tencentos
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 1073205)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

+ echo 'start_time: 2024-10-31T18:29:49'
start_time: 2024-10-31T18:29:49