diff --git "a/llm/finetune/albert/Albert\347\232\20420newspaper\345\276\256\350\260\203.md" "b/llm/finetune/albert/Albert\347\232\20420newspaper\345\276\256\350\260\203.md" new file mode 100644 index 000000000..d55be8fb6 --- /dev/null +++ "b/llm/finetune/albert/Albert\347\232\20420newspaper\345\276\256\350\260\203.md" @@ -0,0 +1,112 @@ +# Albert的20Newspaper微调 + +## 硬件 + +资源规格:NPU: 1*Ascend-D910B(显存: 64GB), CPU: 24, 内存: 192GB + +智算中心:武汉智算中心 + +镜像:mindspore_2_5_py311_cann8 + +torch训练硬件资源规格:Nvidia 3090 + +## 模型与数据集 + +模型:"albert/albert-base-v1" + +数据集:"SetFit/20_newsgroups" + +## 训练与评估损失 + +由于训练的损失过长,只取最后十五个loss展示 + +### mindspore+mindNLP + +| Epoch | Loss | Eval Loss | +| ----- | ------ | --------- | +| 2.9 | 1.5166 | | +| 2.91 | 1.3991 | | +| 2.92 | 1.4307 | | +| 2.93 | 1.3694 | | +| 2.93 | 1.3242 | | +| 2.94 | 1.4505 | | +| 2.95 | 1.4278 | | +| 2.95 | 1.3563 | | +| 2.96 | 1.4091 | | +| 2.97 | 1.5412 | | +| 2.98 | 1.2831 | | +| 2.98 | 1.4771 | | +| 2.99 | 1.3773 | | +| 3.0 | 1.2446 | | +| 3.0 | | 1.5597 | + +### Pytorch+transformers + +| Epoch | Loss | Eval Loss | +| ----- | ------ | --------- | +| 2.26 | 1.1111 | | +| 2.32 | 1.1717 | | +| 2.37 | 1.1374 | | +| 2.43 | 1.1496 | | +| 2.49 | 1.1221 | | +| 2.54 | 1.0484 | | +| 2.6 | 1.1230 | | +| 2.66 | 1.0793 | | +| 2.71 | 1.1685 | | +| 2.77 | 1.0825 | | +| 2.82 | 1.1835 | | +| 2.88 | 1.0519 | | +| 2.94 | 1.0824 | | +| 2.99 | 1.1310 | | +| 3.0 | | 1.2418 | + +## 对话分类测试 + +问题来自评估数据集,正确标签如表格 + +* 问题输入: + + | 序号 | text | text的正确标签 | + | ---- | ------------------------------------------------------------ | --------------------- | + | 1 | I am a little confused on all of the models of the 88-89 bonnevilles.I have heard of the LE SE LSE SSE SSEI. Could someone tell me thedifferences are far as features or performance. I am also curious toknow what the book value is for prefereably the 89 model. And how muchless than book value can you usually get them for. In other words howmuch are they in demand this time of year. I have heard that the mid-springearly summer is the best time to buy. | rec.autos | + | 2 | I\'m not familiar at all with the format of these X-Face:thingies, butafter seeing them in some folks\' headers, I\'ve *got* to *see* them (andmaybe make one of my own)!I\'ve got dpg-viewon my Linux box (which displays uncompressed X-Faces)and I\'ve managed to compile [un]compface too... but now that I\'m *looking*for them, I can\'t seem to find any X-Face:\'s in anyones news headers! :-(Could you, would you, please send me your X-Face:headerI know* I\'ll probably get a little swamped, but I can handle it.\t...I hope. | comp.windows.x | + | 3 | In a word, yes. | alt.atheism | + | 4 | They were attacking the Iraqis to drive them out of Kuwait,a country whose citizens have close blood and business tiesto Saudi citizens. And me thinks if the US had not helped outthe Iraqis would have swallowed Saudi Arabia, too (or at least the eastern oilfields). 
And no Muslim country was doingmuch of anything to help liberate Kuwait and protect SaudiArabia; indeed, in some masses of citizens were demonstratingin favor of that butcher Saddam (who killed lotsa Muslims),just because he was killing, raping, and looting relativelyrich Muslims and also thumbing his nose at the West.So how would have *you* defended Saudi Arabia and rolledback the Iraqi invasion, were you in charge of Saudi Arabia???I think that it is a very good idea to not have governments have anofficial religion (de facto or de jure), because with human naturelike it is, the ambitious and not the pious will always be theones who rise to power. There are just too many people in thisworld (or any country) for the citizens to really know if a leader is really devout or if he is just a slick operator.You make it sound like these guys are angels, Ilyess. (In yourclarinet posting you edited out some stuff; was it the following???)Friday's New York Times reported that this group definitely ismore conservative than even Sheikh Baz and his followers (whothink that the House of Saud does not rule the country conservativelyenough). The NYT reported that, besides complaining that thegovernment was not conservative enough, they have:\t- asserted that the (approx. 500,000) Shiites in the Kingdom\t are apostates, a charge that under Saudi (and Islamic) law\t brings the death penalty. \t Diplomatic guy (Sheikh bin Jibrin), isn't he Ilyess?\t- called for severe punishment of the 40 or so women who\t drove in public a while back to protest the ban on\t women driving. The guy from the group who said this,\t Abdelhamoud al-Toweijri, said that these women should\t be fired from their jobs, jailed, and branded as\t prostitutes.\t Is this what you want to see happen, Ilyess? I've\t heard many Muslims say that the ban on women driving\t has no basis in the Qur'an, the ahadith, etc.\t Yet these folks not only like the ban, they want\t these women falsely called prostitutes? \t If I were you, I'd choose my heroes wisely,\t Ilyess, not just reflexively rally behind\t anyone who hates anyone you hate.\t- say that women should not be allowed to work.\t- say that TV and radio are too immoral in the Kingdom.Now, the House of Saud is neither my least nor my most favorite governmenton earth; I think they restrict religious and political reedom a lot, amongother things. I just think that the most likely replacementsfor them are going to be a lot worse for the citizens of the country.But I think the House of Saud is feeling the heat lately. In thelast six months or so I've read there have been stepped up harassingby the muttawain (religious police---*not* government) of Western womennot fully veiled (something stupid for women to do, IMO, because itsends the wrong signals about your morality). And I've read thatthey've cracked down on the few, home-based expartiate religiousgatherings, and even posted rewards in (government-owned) newspapersoffering money for anyone who turns in a group of expartiates whodare worship in their homes or any other secret place. So thegovernment has grown even more intolerant to try to take some ofthe wind out of the sails of the more-conservative opposition.As unislamic as some of these things are, they're just a smalltaste of what would happen if these guys overthrow the House ofSaud, like they're trying to in the long run.Is this really what you (and Rached and others in the generalwest-is-evil-zionists-rule-hate-west-or-you-are-a-puppet crowd)want, Ilyess? 
| talk.politics.mideast |
+
+* mindnlp predictions before fine-tuning:
+
+  | No. | Prediction  | Correct?  |
+  | ---- | ----------- | --------- |
+  | 1 | alt.atheism | Incorrect |
+  | 2 | alt.atheism | Incorrect |
+  | 3 | alt.atheism | Correct |
+  | 4 | alt.atheism | Incorrect |
+
+* mindnlp predictions after fine-tuning:
+
+  | No. | Prediction            | Correct?  |
+  | ---- | --------------------- | --------- |
+  | 1 | misc.forsale | Incorrect |
+  | 2 | comp.windows.x | Correct |
+  | 3 | talk.politics.misc | Incorrect |
+  | 4 | talk.politics.mideast | Correct |
+
+* torch predictions before fine-tuning:
+
+  | No. | Prediction | Correct?  |
+  | ---- | --------- | --------- |
+  | 1 | sci.space | Incorrect |
+  | 2 | sci.space | Incorrect |
+  | 3 | sci.space | Incorrect |
+  | 4 | sci.space | Incorrect |
+
+* torch predictions after fine-tuning:
+
+  | No. | Prediction            | Correct?  |
+  | ---- | --------------------- | --------- |
+  | 1 | rec.autos | Correct |
+  | 2 | comp.windows.x | Correct |
+  | 3 | talk.religion.misc | Incorrect |
+  | 4 | talk.politics.mideast | Correct |
\ No newline at end of file
diff --git a/llm/finetune/albert/mindNLPAlbert.py b/llm/finetune/albert/mindNLPAlbert.py
new file mode 100644
index 000000000..423e1d7ee
--- /dev/null
+++ b/llm/finetune/albert/mindNLPAlbert.py
@@ -0,0 +1,160 @@
+import os
+import mindspore
+import numpy as np
+import mindspore.dataset as ds
+from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification
+from mindnlp.engine import Trainer, TrainingArguments
+from datasets import load_dataset
+
+mindspore.set_context(device_target='Ascend', device_id=0, pynative_synchronize=True)
+# Load the pretrained model and tokenizer (weights fetched through the HF mirror)
+os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
+model_name = "albert/albert-base-v1"
+tokenizer = AlbertTokenizer.from_pretrained(model_name)
+model = AlbertForSequenceClassification.from_pretrained(model_name, num_labels=20)
+labels = [
+    "alt.atheism",
+    "comp.graphics",
+    "comp.os.ms-windows.misc",
+    "comp.sys.ibm.pc.hardware",
+    "comp.sys.mac.hardware",
+    "comp.windows.x",
+    "misc.forsale",
+    "rec.autos",
+    "rec.motorcycles",
+    "rec.sport.baseball",
+    "rec.sport.hockey",
+    "sci.crypt",
+    "sci.electronics",
+    "sci.med",
+    "sci.space",
+    "soc.religion.christian",
+    "talk.politics.guns",
+    "talk.politics.mideast",
+    "talk.politics.misc",
+    "talk.religion.misc"
+]
+# Inference helper: classify one post and compare against its true label
+def predict(text, tokenizer, model, true_label=None):
+    # Encode the input text
+    inputs = tokenizer(text, return_tensors="ms", padding=True, truncation=True, max_length=512)
+    # Run the model
+    outputs = model(**inputs)
+    logits = outputs.logits
+
+    # Map the highest logit to its label name
+    predicted_class_id = mindspore.mint.argmax(logits, dim=-1).item()
+    predicted_label = labels[predicted_class_id]
+
+    # Check the prediction against the ground-truth label
+    is_correct = "Correct" if true_label is not None and predicted_label == true_label else "Incorrect"
+    return predicted_label, is_correct
+# Test samples (with ground-truth labels)
+test_data = [
+    {"text": "I am a little confused on all of the models of the 88-89 bonnevilles.I have heard of the LE SE LSE SSE SSEI. Could someone tell me thedifferences are far as features or performance. I am also curious toknow what the book value is for prefereably the 89 model. And how muchless than book value can you usually get them for. In other words howmuch are they in demand this time of year. I have heard that the mid-springearly summer is the best time to buy."
+     , "true_label": "rec.autos"},
+    {"text": "I\'m not familiar at all with the format of these X-Face:thingies, butafter seeing them in some folks\' headers, I\'ve *got* to *see* them (andmaybe make one of my own)!I\'ve got dpg-viewon my Linux box (which displays uncompressed X-Faces)and I\'ve managed to compile [un]compface too... 
but now that I\'m *looking*for them, I can\'t seem to find any X-Face:\'s in anyones news headers! :-(Could you, would you, please send me your X-Face:headerI know* I\'ll probably get a little swamped, but I can handle it.\t...I hope." + , "true_label": "comp.windows.x"}, + {"text": "In a word, yes." + , "true_label": "alt.atheism"}, + {"text": "They were attacking the Iraqis to drive them out of Kuwait,a country whose citizens have close blood and business tiesto Saudi citizens. And me thinks if the US had not helped outthe Iraqis would have swallowed Saudi Arabia, too (or at least the eastern oilfields). And no Muslim country was doingmuch of anything to help liberate Kuwait and protect SaudiArabia; indeed, in some masses of citizens were demonstratingin favor of that butcher Saddam (who killed lotsa Muslims),just because he was killing, raping, and looting relativelyrich Muslims and also thumbing his nose at the West.So how would have *you* defended Saudi Arabia and rolledback the Iraqi invasion, were you in charge of Saudi Arabia???I think that it is a very good idea to not have governments have anofficial religion (de facto or de jure), because with human naturelike it is, the ambitious and not the pious will always be theones who rise to power. There are just too many people in thisworld (or any country) for the citizens to really know if a leader is really devout or if he is just a slick operator.You make it sound like these guys are angels, Ilyess. (In yourclarinet posting you edited out some stuff; was it the following???)Friday's New York Times reported that this group definitely ismore conservative than even Sheikh Baz and his followers (whothink that the House of Saud does not rule the country conservativelyenough). The NYT reported that, besides complaining that thegovernment was not conservative enough, they have:\t- asserted that the (approx. 500,000) Shiites in the Kingdom\t are apostates, a charge that under Saudi (and Islamic) law\t brings the death penalty. \t Diplomatic guy (Sheikh bin Jibrin), isn't he Ilyess?\t- called for severe punishment of the 40 or so women who\t drove in public a while back to protest the ban on\t women driving. The guy from the group who said this,\t Abdelhamoud al-Toweijri, said that these women should\t be fired from their jobs, jailed, and branded as\t prostitutes.\t Is this what you want to see happen, Ilyess? I've\t heard many Muslims say that the ban on women driving\t has no basis in the Qur'an, the ahadith, etc.\t Yet these folks not only like the ban, they want\t these women falsely called prostitutes? \t If I were you, I'd choose my heroes wisely,\t Ilyess, not just reflexively rally behind\t anyone who hates anyone you hate.\t- say that women should not be allowed to work.\t- say that TV and radio are too immoral in the Kingdom.Now, the House of Saud is neither my least nor my most favorite governmenton earth; I think they restrict religious and political reedom a lot, amongother things. I just think that the most likely replacementsfor them are going to be a lot worse for the citizens of the country.But I think the House of Saud is feeling the heat lately. In thelast six months or so I've read there have been stepped up harassingby the muttawain (religious police---*not* government) of Western womennot fully veiled (something stupid for women to do, IMO, because itsends the wrong signals about your morality). 
And I've read thatthey've cracked down on the few, home-based expartiate religiousgatherings, and even posted rewards in (government-owned) newspapersoffering money for anyone who turns in a group of expartiates whodare worship in their homes or any other secret place. So thegovernment has grown even more intolerant to try to take some ofthe wind out of the sails of the more-conservative opposition.As unislamic as some of these things are, they're just a smalltaste of what would happen if these guys overthrow the House ofSaud, like they're trying to in the long run.Is this really what you (and Rached and others in the generalwest-is-evil-zionists-rule-hate-west-or-you-are-a-puppet crowd)want, Ilyess?"
+     , "true_label": "talk.politics.mideast"}
+]
+# Run the four test posts through the base model before fine-tuning
+for data in test_data:
+    text = data["text"]
+    true_label = data["true_label"]
+    predicted_label, is_correct = predict(text, tokenizer, model, true_label)
+    # print(f"Text: {text}")
+    print(f"True Label: {true_label}")
+    print(f"Predicted Label: {predicted_label}")
+    print(f"Prediction: {is_correct}\n")
+# Load the dataset
+dataset = load_dataset("SetFit/20_newsgroups", trust_remote_code=True)
+print("dataset:", dataset)
+# Preprocessing: tokenize each post to a fixed length
+def preprocess_function(examples):
+    return tokenizer(examples['text'], padding="max_length", truncation=True, max_length=512)
+
+# Tokenize the whole dataset
+encoded_dataset = dataset.map(preprocess_function, batched=True)
+# Split into train and eval sets
+train_dataset = encoded_dataset['train']
+eval_dataset = encoded_dataset['test']
+print("encoded_dataset:", encoded_dataset)
+# print("train_dataset:", train_dataset)
+# print("eval_dataset:", eval_dataset)
+# print("eval_dataset[0]:", eval_dataset[0])
+def data_generator(dataset):
+    for item in dataset:
+        yield (
+            np.array(item["input_ids"], dtype=np.int32),  # input_ids
+            np.array(item["attention_mask"], dtype=np.int32),  # attention_mask
+            np.array(item["label"], dtype=np.int32)  # label
+        )
+# Convert the train/eval splits to MindSpore datasets. Note: the label column
+# must be named "labels" to match the model's forward signature.
+def create_mindspore_dataset(dataset, shuffle=True):
+    return ds.GeneratorDataset(
+        source=lambda: data_generator(dataset),  # wrap the generator in a lambda so it can be re-created
+        column_names=["input_ids", "attention_mask", "labels"],
+        shuffle=shuffle
+    )
+train_dataset = create_mindspore_dataset(train_dataset, shuffle=True)
+eval_dataset = create_mindspore_dataset(eval_dataset, shuffle=False)
+print(train_dataset.create_dict_iterator())
+
+# Training arguments
+training_args = TrainingArguments(
+    output_dir='./results',             # output directory
+    evaluation_strategy="epoch",        # evaluate at the end of each epoch
+    learning_rate=2e-5,                 # learning rate
+    per_device_train_batch_size=8,      # train batch size per device
+    per_device_eval_batch_size=8,       # eval batch size per device
+    num_train_epochs=3,                 # number of training epochs
+    weight_decay=0.01,                  # weight decay
+    logging_dir='./logs',               # log directory
+    logging_steps=10,                   # log every 10 steps
+    save_strategy="epoch",              # save a checkpoint at the end of each epoch
+    save_total_limit=2,                 # keep at most 2 checkpoints
+    load_best_model_at_end=True,        # load the best model when training ends
+)
+# Initialize the Trainer
+trainer = Trainer(
+    model=model,
+    args=training_args,
+    train_dataset=train_dataset,
+    eval_dataset=eval_dataset,
+    tokenizer=tokenizer
+)
+# Start training
+trainer.train()
+eval_results = trainer.evaluate()
+print(f"Evaluation results: {eval_results}")
+# Save the fine-tuned model, then reload it for testing
+model.save_pretrained("./fine-tuned-albert-20newsgroups")
+tokenizer.save_pretrained("./fine-tuned-albert-20newsgroups")
+fine_tuned_model = AlbertForSequenceClassification.from_pretrained("./fine-tuned-albert-20newsgroups")
+fine_tuned_tokenizer = AlbertTokenizer.from_pretrained("./fine-tuned-albert-20newsgroups")
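+# Sanity check (an illustrative sketch, not part of the original run): predict()
+# assumes the labels list above lines up with the dataset's integer `label` ids.
+# The SetFit/20_newsgroups splits also carry a `label_text` column (per the
+# dataset card), so the mapping can be spot-checked directly, e.g.:
+# sample = dataset["train"][0]
+# assert labels[sample["label"]] == sample["label_text"]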
+# Test the fine-tuned model on the same four posts; test_data is reused here
+# instead of redefining an identical test_texts list.
+for data in test_data:
+    text = data["text"]
+    true_label = data["true_label"]
+    predicted_label, is_correct = predict(text, fine_tuned_tokenizer, fine_tuned_model, true_label)
+    print(f"Text: {text}")
+    print(f"True Label: {true_label}")
+    print(f"Predicted Label: {predicted_label}")
+    print(f"Prediction: {is_correct}")
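+# Beyond a four-post spot check (an illustrative sketch, not part of the logged
+# run): compute accuracy over the whole raw test split, reusing the same
+# tokenizer settings as predict().
+def evaluate_accuracy(texts, label_ids, batch_size=8):
+    correct = 0
+    for start in range(0, len(texts), batch_size):
+        batch = list(texts[start:start + batch_size])
+        inputs = fine_tuned_tokenizer(batch, return_tensors="ms", padding=True,
+                                      truncation=True, max_length=512)
+        logits = fine_tuned_model(**inputs).logits
+        preds = mindspore.mint.argmax(logits, dim=-1).asnumpy()
+        correct += sum(int(p == t) for p, t in zip(preds, label_ids[start:start + batch_size]))
+    return correct / len(texts)
+
+# Example usage:
+# print("test accuracy:", evaluate_accuracy(dataset["test"]["text"], dataset["test"]["label"]))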
diff --git a/llm/finetune/albert/mindnlplog.txt b/llm/finetune/albert/mindnlplog.txt
new file mode 100644
index 000000000..b91f443a9
--- /dev/null
+++ b/llm/finetune/albert/mindnlplog.txt
@@ -0,0 +1,994 @@
+(MindSpore) [ma-user work]$pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/2.4.10/MindSpore/unified/aarch64/mindspore-2.4.10-cp39-cp39-linux_aarch64.whl --trusted-host ms-release.obs.cn-north-4.myhuaweicloud.com -i https://pypi.tuna.tsinghua.edu.cn/simple
+Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
+Collecting mindspore==2.4.10
+  Downloading https://ms-release.obs.cn-north-4.myhuaweicloud.com/2.4.10/MindSpore/unified/aarch64/mindspore-2.4.10-cp39-cp39-linux_aarch64.whl (336.3 MB)
+     ━━━━━━━━━━━━━━━━━━━━━━━━ 336.3/336.3 MB 8.3 MB/s eta 0:00:00
+Requirement already satisfied: numpy<2.0.0,>=1.20.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (1.26.1)
+Requirement already satisfied: protobuf>=3.13.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (3.20.3)
+Requirement already satisfied: asttokens>=2.0.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (2.4.1)
+Requirement already satisfied: pillow>=6.2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (11.1.0)
+Requirement already satisfied: scipy>=1.5.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (1.11.3)
+Requirement already satisfied: packaging>=20.0 in 
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (23.2) +Requirement already satisfied: psutil>=5.6.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (5.9.5) +Requirement already satisfied: astunparse>=1.6.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (1.6.3) +Requirement already satisfied: safetensors>=0.4.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4.10) (0.5.3) +Requirement already satisfied: six>=1.12.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from asttokens>=2.0.4->mindspore==2.4.10) (1.16.0) +Requirement already satisfied: wheel<1.0,>=0.23.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from astunparse>=1.6.3->mindspore==2.4.10) (0.41.2) +DEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063 +Installing collected packages: mindspore + Attempting uninstall: mindspore + Found existing installation: mindspore 2.3.0 + Uninstalling mindspore-2.3.0: + Successfully uninstalled mindspore-2.3.0 +Successfully installed mindspore-2.4.10 +(MindSpore) [ma-user work]$python mindNLPAlbert.py Traceback (most recent call last): File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/mindformers.py", line 13, in + from mindformers.experimental.model import LlamaForCausalLM # pylint: disable=import-error +ModuleNotFoundError: No module named 'mindformers.experimental' + +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line 3, in + from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/__init__.py", line 47, in + from mindnlp import transformers + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/__init__.py", line 16, in + from . import models, pipelines + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/__init__.py", line 19, in + from . import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/__init__.py", line 16, in + from . 
import tokenization_albert, tokenization_albert_fast, configuration_albert, modeling_albert + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/modeling_albert.py", line 46, in + from ...modeling_utils import PreTrainedModel + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/modeling_utils.py", line 74, in + from ..accelerate import infer_auto_device_map + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/__init__.py", line 2, in + from .utils import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/__init__.py", line 43, in + from .mindformers import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/mindformers.py", line 19, in + raise ValueError('cannot found `mindformers.experimental`, please install dev version by\n' +ValueError: cannot found `mindformers.experimental`, please install dev version by +`pip install git+https://gitee.com/mindspore/mindformers` +or remove mindformers by +`pip uninstall mindformers` +(MindSpore) [ma-user work]$pip install git+https://gitee.com/mindspore/mindformers +Looking in indexes: http://100.125.0.76:32021/repository/pypi/simple +Collecting git+https://gitee.com/mindspore/mindformers + Cloning https://gitee.com/mindspore/mindformers to /tmp/pip-req-build-banwfptc + Running command git clone --filter=blob:none --quiet https://gitee.com/mindspore/mindformers /tmp/pip-req-build-banwfptc + Resolved https://gitee.com/mindspore/mindformers to commit e7b83ea0ad6254eb647eb8a1e2182c4540fe3b36 + Preparing metadata (setup.py) ... done +Requirement already satisfied: setuptools in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (68.2.2) +Collecting sentencepiece>=0.2.0 (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/a3/69/e96ef68261fa5b82379fdedb325ceaf1d353c6e839ec346d8244e0da5f2f/sentencepiece-0.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB) + ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 62.2 MB/s eta 0:00:00 +Requirement already satisfied: ftfy>=6.1.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (6.1.1) +Requirement already satisfied: regex>=2022.10.31 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (2023.10.3) +Requirement already satisfied: tqdm>=4.65.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (4.67.1) +Requirement already satisfied: pyyaml>=6.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (6.0.1) +Requirement already satisfied: jieba>=0.42.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (0.42.1) +Requirement already satisfied: rouge_chinese>=1.0.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (1.0.3) +Requirement already satisfied: nltk>=2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (3.8.1) +Collecting mindpet==1.0.4 (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/05/7c/3266e061b7dd74c17ce7556dde55456cedb9a931959998d2ff30c2bd4e51/mindpet-1.0.4-py3-none-any.whl (83 kB) + ━━━━━━━━━━━━━━━━━━━━━━━━━ 83.9/83.9 kB 24.3 
MB/s eta 0:00:00 +Requirement already satisfied: opencv-python-headless in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (4.8.1.78) +Collecting pyarrow==12.0.1 (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/8b/14/dbda2f416906090824e5b58134ebef504065798bbcc98c929ce712be80ed/pyarrow-12.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (36.4 MB) + ━━━━━━━━━━━━━━━━━━━━━━━━━ 36.4/36.4 MB 62.0 MB/s eta 0:00:00 +Collecting tokenizers==0.15.0 (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/14/cf/883acc48862589f9d54c239a9108728db5b75cd6c0949b92c72aae8e044c/tokenizers-0.15.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.8 MB) + ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8/3.8 MB 78.5 MB/s eta 0:00:00 +Requirement already satisfied: astunparse>=1.6.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (1.6.3) +Requirement already satisfied: numpy<2.0.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (1.26.1) +Collecting datasets==2.18.0 (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/95/fc/661a7f06e8b7d48fcbd3f55423b7ff1ac3ce59526f146fda87a1e1788ee4/datasets-2.18.0-py3-none-any.whl (510 kB) + ━━━━━━━━━━━━━━━━━━━━━━━ 510.5/510.5 kB 64.7 MB/s eta 0:00:00 +Collecting tiktoken (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/33/35/2792b7dcb8b150d2767322637513c73a3e80833c19212efea80b31087894/tiktoken-0.9.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.1 MB) + ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 70.0 MB/s eta 0:00:00 +Requirement already satisfied: jinja2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (3.1.2) +Collecting setproctitle (from mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/14/0c/a1e1a0554c1261a754eeadef03149115c10e59c1514e254e8532d5639fd5/setproctitle-1.3.5-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (31 kB) +Requirement already satisfied: safetensors in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (0.5.3) +Requirement already satisfied: mindspore~=2.4.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindformers==1.3.2) (2.4.10) +Requirement already satisfied: filelock in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (3.12.4) +Collecting pyarrow-hotfix (from datasets==2.18.0->mindformers==1.3.2) + Downloading http://100.125.0.76:32021/repository/pypi/packages/e4/f4/9ec2222f5f5f8ea04f66f184caafd991a39c8782e31f5b0266f101cb68ca/pyarrow_hotfix-0.6-py3-none-any.whl (7.9 kB) +Requirement already satisfied: dill<0.3.9,>=0.3.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (0.3.8) +Requirement already satisfied: pandas in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (2.1.2) +Requirement already satisfied: requests>=2.19.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (2.32.3) +Requirement already satisfied: xxhash in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) 
(3.5.0) +Requirement already satisfied: multiprocess in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (0.70.16) +Requirement already satisfied: fsspec<=2024.2.0,>=2023.1.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from fsspec[http]<=2024.2.0,>=2023.1.0->datasets==2.18.0->mindformers==1.3.2) (2023.10.0) +Requirement already satisfied: aiohttp in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (3.11.13) +Requirement already satisfied: huggingface-hub>=0.19.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (0.29.2) +Requirement already satisfied: packaging in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets==2.18.0->mindformers==1.3.2) (23.2) +Requirement already satisfied: click in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindpet==1.0.4->mindformers==1.3.2) (8.1.7) +Requirement already satisfied: wheel<1.0,>=0.23.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from astunparse>=1.6.3->mindformers==1.3.2) (0.41.2) +Requirement already satisfied: six<2.0,>=1.6.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from astunparse>=1.6.3->mindformers==1.3.2) (1.16.0) +Requirement already satisfied: wcwidth>=0.2.5 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from ftfy>=6.1.1->mindformers==1.3.2) (0.2.8) +Requirement already satisfied: protobuf>=3.13.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore~=2.4.1->mindformers==1.3.2) (3.20.3) +Requirement already satisfied: asttokens>=2.0.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore~=2.4.1->mindformers==1.3.2) (2.4.1) +Requirement already satisfied: pillow>=6.2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore~=2.4.1->mindformers==1.3.2) (11.1.0) +Requirement already satisfied: scipy>=1.5.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore~=2.4.1->mindformers==1.3.2) (1.11.3) +Requirement already satisfied: psutil>=5.6.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore~=2.4.1->mindformers==1.3.2) (5.9.5) +Requirement already satisfied: joblib in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from nltk>=2.0->mindformers==1.3.2) (1.3.2) +Requirement already satisfied: MarkupSafe>=2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from jinja2->mindformers==1.3.2) (2.1.3) +Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (2.5.0) +Requirement already satisfied: aiosignal>=1.1.2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (1.3.2) +Requirement already satisfied: async-timeout<6.0,>=4.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (5.0.1) +Requirement already satisfied: attrs>=17.3.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (23.1.0) +Requirement already satisfied: frozenlist>=1.1.1 in 
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (1.5.0) +Requirement already satisfied: multidict<7.0,>=4.5 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (6.1.0) +Requirement already satisfied: propcache>=0.2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (0.3.0) +Requirement already satisfied: yarl<2.0,>=1.17.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from aiohttp->datasets==2.18.0->mindformers==1.3.2) (1.18.3) +Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from huggingface-hub>=0.19.4->datasets==2.18.0->mindformers==1.3.2) (4.8.0) +Requirement already satisfied: charset-normalizer<4,>=2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests>=2.19.0->datasets==2.18.0->mindformers==1.3.2) (3.3.1) +Requirement already satisfied: idna<4,>=2.5 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests>=2.19.0->datasets==2.18.0->mindformers==1.3.2) (3.4) +Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests>=2.19.0->datasets==2.18.0->mindformers==1.3.2) (2.0.7) +Requirement already satisfied: certifi>=2017.4.17 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests>=2.19.0->datasets==2.18.0->mindformers==1.3.2) (2023.7.22) +Requirement already satisfied: python-dateutil>=2.8.2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets==2.18.0->mindformers==1.3.2) (2.8.2) +Requirement already satisfied: pytz>=2020.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets==2.18.0->mindformers==1.3.2) (2023.3.post1) +Requirement already satisfied: tzdata>=2022.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets==2.18.0->mindformers==1.3.2) (2023.3) +Building wheels for collected packages: mindformers + Building wheel for mindformers (setup.py) ... done + Created wheel for mindformers: filename=mindformers-1.3.2-py3-none-any.whl size=1823754 sha256=40a152181e7d8abf527f172238844247ba7955a8dee33b1cef2f02bbc995e9a6 + Stored in directory: /tmp/pip-ephem-wheel-cache-kcb8b1mg/wheels/40/94/52/9835458d6a1da05e7e6184cbfcfc44a841d1408c431ae04f01 +Successfully built mindformers +DEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number. 
Discussion can be found at https://github.com/pypa/pip/issues/12063 +Installing collected packages: sentencepiece, setproctitle, pyarrow-hotfix, pyarrow, mindpet, tiktoken, tokenizers, datasets, mindformers + Attempting uninstall: sentencepiece + Found existing installation: sentencepiece 0.1.99 + Uninstalling sentencepiece-0.1.99: + Successfully uninstalled sentencepiece-0.1.99 + Attempting uninstall: pyarrow + Found existing installation: pyarrow 19.0.1 + Uninstalling pyarrow-19.0.1: + Successfully uninstalled pyarrow-19.0.1 + Attempting uninstall: mindpet + Found existing installation: mindpet 1.0.2 + Uninstalling mindpet-1.0.2: + Successfully uninstalled mindpet-1.0.2 + Attempting uninstall: tokenizers + Found existing installation: tokenizers 0.19.1 + Uninstalling tokenizers-0.19.1: + Successfully uninstalled tokenizers-0.19.1 + Attempting uninstall: datasets + Found existing installation: datasets 3.3.2 + Uninstalling datasets-3.3.2: + Successfully uninstalled datasets-3.3.2 + Attempting uninstall: mindformers + Found existing installation: mindformers 0.8.0 + Uninstalling mindformers-0.8.0: + Successfully uninstalled mindformers-0.8.0 +ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. +mindnlp 0.4.0 requires tokenizers==0.19.1, but you have tokenizers 0.15.0 which is incompatible. +Successfully installed datasets-2.18.0 mindformers-1.3.2 mindpet-1.0.4 pyarrow-12.0.1 pyarrow-hotfix-0.6 sentencepiece-0.2.0 setproctitle-1.3.5 tiktoken-0.9.0 tokenizers-0.15.0 +(MindSpore) [ma-user work]$python mindNLPAlbert.py Traceback (most recent call last): + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/mindformers.py", line 17, in + from mindformers.experimental.parallel_core.pynative import get_optimizer # pylint: disable=import-error +ImportError: cannot import name 'get_optimizer' from 'mindformers.experimental.parallel_core.pynative' (/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindformers/experimental/parallel_core/pynative/__init__.py) + +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line 3, in + from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/__init__.py", line 47, in + from mindnlp import transformers + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/__init__.py", line 16, in + from . import models, pipelines + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/__init__.py", line 19, in + from . import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/__init__.py", line 16, in + from . 
import tokenization_albert, tokenization_albert_fast, configuration_albert, modeling_albert + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/modeling_albert.py", line 46, in + from ...modeling_utils import PreTrainedModel + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/modeling_utils.py", line 74, in + from ..accelerate import infer_auto_device_map + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/__init__.py", line 2, in + from .utils import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/__init__.py", line 43, in + from .mindformers import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/accelerate/utils/mindformers.py", line 19, in + raise ValueError('cannot found `mindformers.experimental`, please install dev version by\n' +ValueError: cannot found `mindformers.experimental`, please install dev version by +`pip install git+https://gitee.com/mindspore/mindformers` +or remove mindformers by +`pip uninstall mindformers` +(MindSpore) [ma-user work]$pip uninstall mindformers +Found existing installation: mindformers 1.3.2 +Uninstalling mindformers-1.3.2: + Would remove: + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/README.md + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/codellama/finetune_codellama_34b_16p.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/codellama/finetune_codellama_34b_32p.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/codellama/predict_codellama_34b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/codellama/pretrain_codellama_34b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/cogvlm2/finetune_cogvlm2_video_llama3_chat_13b_lora.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/cogvlm2/predict_cogvlm2_image_llama3_chat_19b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/cogvlm2/predict_cogvlm2_video_llama3_chat_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/convert_config/run_convert.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/convert_config/run_reversed_convert.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/general/run_general_task.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/finetune_glm2_6b_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/lora_glm2_6b_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/predict_glm2_6b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_finetune_2k_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_finetune_2k_800_32G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_finetune_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_finetune_800_32G.yaml + 
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_finetune_eval.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_lora_2k_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_lora_2k_800_32G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_lora_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_lora_800_32G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm2/run_glm2_6b_lora_eval.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/finetune_glm3_6b_bf16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/predict_glm3_6b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/run_glm3_6b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/run_glm3_6b_finetune_2k_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/run_glm3_6b_finetune_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm3/run_glm3_6b_multiturn_finetune_800T_A2_64G.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm4/finetune_glm4_9b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/glm4/predict_glm4_9b_chat.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/finetune_gpt2_small_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/finetune_gpt2_small_lora_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/finetune_gpt2_small_txtcls_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/predict_gpt2_small_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/pretrain_gpt2_13b_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/pretrain_gpt2_small_fp16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_13b_910b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_52b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_lora.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_txtcls.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_xl.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/gpt2/run_gpt2_xl_lora.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_13b_bf16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_70b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_70b_bf16_32p.yaml + 
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_70b_bf16_64p.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_7b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_7b_bf16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_7b_prefixtuning.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/finetune_llama2_7b_ptuning2.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/lora_llama2_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/lora_llama2_7b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_13b_ptq.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_13b_rtn.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_13b_smooth_quant.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_70b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_70b_rtn.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_70b_smooth_quant.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_7b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_7b_prefixtuning.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_7b_ptuning2.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/predict_llama2_7b_slora.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_13b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_13b_auto_parallel.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_13b_bf16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_70b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_70b_auto_parallel.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_70b_bf16_32p.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_70b_bf16_64p.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_7b.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_7b_auto_parallel.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/llama2/pretrain_llama2_7b_bf16.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/configs/whisper/finetune_whisper_large_v3.yaml + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindformers-1.3.2.dist-info/* + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindformers/* +Proceed (Y/n)? 
y + Successfully uninstalled mindformers-1.3.2 +(MindSpore) [ma-user work]$python mindNLPAlbert.py +Building prefix dict from the default dictionary ... +Dumping model to file cache /tmp/jieba.cache +Loading model cost 1.324 seconds. +Prefix dict has been built successfully. +Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line 3, in + from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/__init__.py", line 47, in + from mindnlp import transformers + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/__init__.py", line 16, in + from . import models, pipelines + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/__init__.py", line 19, in + from . import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/__init__.py", line 15, in + from . import configuration_rag, modeling_rag, retrieval_rag, tokenization_rag + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/modeling_rag.py", line 29, in + from .retrieval_rag import RagRetriever + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/retrieval_rag.py", line 32, in + from datasets import Dataset, load_dataset, load_from_disk + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/__init__.py", line 18, in + from .arrow_dataset import Dataset + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 66, in + from . import config + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/config.py", line 135, in + importlib.import_module("soundfile").__libsndfile_version__ + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/importlib/__init__.py", line 127, in import_module + return _bootstrap._gcd_import(name[level:], package, level) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/soundfile.py", line 17, in + from _soundfile import ffi as _ffi + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/_soundfile.py", line 2, in + import _cffi_backend +ModuleNotFoundError: No module named '_cffi_backend' +(MindSpore) [ma-user work]$ pip install cffi +Looking in indexes: http://100.125.0.76:32021/repository/pypi/simple +Requirement already satisfied: cffi in /home/ma-user/modelarts-dev/ma-cli (1.15.0) +Requirement already satisfied: pycparser in /home/ma-user/modelarts-dev/ma-cli (from cffi) (2.21) +DEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063 +(MindSpore) [ma-user work]$python mindNLPAlbert.py +Building prefix dict from the default dictionary ... +Loading model from cache /tmp/jieba.cache +Loading model cost 1.258 seconds. +Prefix dict has been built successfully. 
+Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line 3, in <module> + from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/__init__.py", line 47, in <module> + from mindnlp import transformers + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/__init__.py", line 16, in <module> + from . import models, pipelines + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/__init__.py", line 19, in <module> + from . import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/__init__.py", line 15, in <module> + from . import configuration_rag, modeling_rag, retrieval_rag, tokenization_rag + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/modeling_rag.py", line 29, in <module> + from .retrieval_rag import RagRetriever + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/retrieval_rag.py", line 32, in <module> + from datasets import Dataset, load_dataset, load_from_disk + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/__init__.py", line 18, in <module> + from .arrow_dataset import Dataset + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 66, in <module> + from . import config + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/config.py", line 135, in <module> + importlib.import_module("soundfile").__libsndfile_version__ + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/importlib/__init__.py", line 127, in import_module + return _bootstrap._gcd_import(name[level:], package, level) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/soundfile.py", line 17, in <module> + from _soundfile import ffi as _ffi + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/_soundfile.py", line 2, in <module> + import _cffi_backend +ModuleNotFoundError: No module named '_cffi_backend' +(MindSpore) [ma-user work]$ pip uninstall cffi +Found existing installation: cffi 1.15.0 +Uninstalling cffi-1.15.0: + Would remove: + /home/ma-user/modelarts-dev/ma-cli/_cffi_backend.cpython-37m-aarch64-linux-gnu.so + /home/ma-user/modelarts-dev/ma-cli/cffi-1.15.0.dist-info/* + /home/ma-user/modelarts-dev/ma-cli/cffi.libs/libffi-2a6f5b63.so.8.1.0 + /home/ma-user/modelarts-dev/ma-cli/cffi/* +Proceed (Y/n)? y + Successfully uninstalled cffi-1.15.0 +(MindSpore) [ma-user work]$ pip install cffi +Looking in indexes: http://100.125.0.76:32021/repository/pypi/simple +Collecting cffi + Downloading http://100.125.0.76:32021/repository/pypi/packages/42/7a/9d086fab7c66bd7c4d0f27c57a1b6b068ced810afc498cc8c49e0088661c/cffi-1.17.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (447 kB) + ━━━━━━━━━━━━━━━━━━━━━━━ 447.2/447.2 kB 58.3 MB/s eta 0:00:00 +Requirement already satisfied: pycparser in /home/ma-user/modelarts-dev/ma-cli (from cffi) (2.21) +DEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number.
Discussion can be found at https://github.com/pypa/pip/issues/12063 +Installing collected packages: cffi +Successfully installed cffi-1.17.1 +(MindSpore) [ma-user work]$python mindNLPAlbert.py +Building prefix dict from the default dictionary ... +Loading model from cache /tmp/jieba.cache +Loading model cost 1.269 seconds. +Prefix dict has been built successfully. +Traceback (most recent call last): + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/soundfile.py", line 161, in <module> + import _soundfile_data # ImportError if this doesn't exist +ModuleNotFoundError: No module named '_soundfile_data' + +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/soundfile.py", line 170, in <module> + raise OSError('sndfile library not found using ctypes.util.find_library') +OSError: sndfile library not found using ctypes.util.find_library + +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line 3, in <module> + from mindnlp.transformers import AutoTokenizer,AlbertTokenizer, AlbertForSequenceClassification + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/__init__.py", line 47, in <module> + from mindnlp import transformers + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/__init__.py", line 16, in <module> + from . import models, pipelines + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/__init__.py", line 19, in <module> + from . import ( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/__init__.py", line 15, in <module> + from . import configuration_rag, modeling_rag, retrieval_rag, tokenization_rag + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/modeling_rag.py", line 29, in <module> + from .retrieval_rag import RagRetriever + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/rag/retrieval_rag.py", line 32, in <module> + from datasets import Dataset, load_dataset, load_from_disk + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/__init__.py", line 18, in <module> + from .arrow_dataset import Dataset + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 66, in <module> + from . import config + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/datasets/config.py", line 135, in <module> + importlib.import_module("soundfile").__libsndfile_version__ + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/importlib/__init__.py", line 127, in import_module + return _bootstrap._gcd_import(name[level:], package, level) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg/soundfile.py", line 192, in <module> + _snd = _ffi.dlopen(_explicit_libname) +OSError: cannot load library 'libsndfile.so': libsndfile.so: cannot open shared object file: No such file or directory +(MindSpore) [ma-user work]$yum install libsndfile1 +Error: This command has to be run under the root user. +(MindSpore) [ma-user work]$sudo yum install libsndfile1 +Last metadata expiration check: 498 days, 11:12:19 ago on Fri Oct 27 11:23:05 2023.
+No match for argument: libsndfile1 +Error: Unable to find a match +(MindSpore) [ma-user work]$pip uninstall soundfile +Found existing installation: soundfile 0.12.1 +Uninstalling soundfile-0.12.1: + Would remove: + /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/soundfile-0.12.1-py3.9.egg +Proceed (Y/n)? y + Successfully uninstalled soundfile-0.12.1 +(MindSpore) [ma-user work]$python mindNLPAlbert.py +Building prefix dict from the default dictionary ... +Loading model from cache /tmp/jieba.cache +Loading model cost 1.265 seconds. +Prefix dict has been built successfully. +100%|██████████████████████████| 25.0/25.0 [00:00<00:00, 111kB/s] +100%|█████████████████████████| 742k/742k [00:00<00:00, 1.20MB/s] +1.25MB [00:00, 2.56MB/s] +684B [00:00, 2.16MB/s] +/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/tokenization_utils_base.py:1526: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted, and will be then set to `False` by default. + warnings.warn( +100%|████████████████████████| 45.2M/45.2M [00:50<00:00, 942kB/s] +Some weights of AlbertForSequenceClassification were not initialized from the model checkpoint at albert/albert-base-v1 and are newly initialized: ['classifier.bias', 'classifier.weight'] +You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. +Text: I am a little confused on all of the models of the 88-89 bonnevilles.I have heard of the LE SE LSE SSE SSEI. Could someone tell me thedifferences are far as features or performance. I am also curious toknow what the book value is for prefereably the 89 model. And how muchless than book value can you usually get them for. In other words howmuch are they in demand this time of year. I have heard that the mid-springearly summer is the best time to buy. +True Label: rec.autos +Predicted Label: alt.atheism +Prediction: Incorrect + +Text: I'm not familiar at all with the format of these X-Face:thingies, butafter seeing them in some folks' headers, I've *got* to *see* them (andmaybe make one of my own)!I've got dpg-viewon my Linux box (which displays uncompressed X-Faces)and I've managed to compile [un]compface too... but now that I'm *looking*for them, I can't seem to find any X-Face:'s in anyones news headers! :-(Could you, would you, please send me your X-Face:headerI know* I'll probably get a little swamped, but I can handle it. ...I hope. +True Label: comp.windows.x +Predicted Label: alt.atheism +Prediction: Incorrect + +Text: In a word, yes. +True Label: alt.atheism +Predicted Label: alt.atheism +Prediction: Correct + +Text: They were attacking the Iraqis to drive them out of Kuwait,a country whose citizens have close blood and business tiesto Saudi citizens. And me thinks if the US had not helped outthe Iraqis would have swallowed Saudi Arabia, too (or at least the eastern oilfields). 
And no Muslim country was doingmuch of anything to help liberate Kuwait and protect SaudiArabia; indeed, in some masses of citizens were demonstratingin favor of that butcher Saddam (who killed lotsa Muslims),just because he was killing, raping, and looting relativelyrich Muslims and also thumbing his nose at the West.So how would have *you* defended Saudi Arabia and rolledback the Iraqi invasion, were you in charge of Saudi Arabia???I think that it is a very good idea to not have governments have anofficial religion (de facto or de jure), because with human naturelike it is, the ambitious and not the pious will always be theones who rise to power. There are just too many people in thisworld (or any country) for the citizens to really know if a leader is really devout or if he is just a slick operator.You make it sound like these guys are angels, Ilyess. (In yourclarinet posting you edited out some stuff; was it the following???)Friday's New York Times reported that this group definitely ismore conservative than even Sheikh Baz and his followers (whothink that the House of Saud does not rule the country conservativelyenough). The NYT reported that, besides complaining that thegovernment was not conservative enough, they have: - asserted that the (approx. 500,000) Shiites in the Kingdom are apostates, a charge that under Saudi (and Islamic) law brings the death penalty. Diplomatic guy (Sheikh bin Jibrin), isn't he Ilyess? - called for severe punishment of the 40 or so women who drove in public a while back to protest the ban on women driving. The guy from the group who said this, Abdelhamoud al-Toweijri, said that these women should be fired from their jobs, jailed, and branded as prostitutes. Is this what you want to see happen, Ilyess? I've heard many Muslims say that the ban on women driving has no basis in the Qur'an, the ahadith, etc. Yet these folks not only like the ban, they want these women falsely called prostitutes? If I were you, I'd choose my heroes wisely, Ilyess, not just reflexively rally behind anyone who hates anyone you hate. - say that women should not be allowed to work. - say that TV and radio are too immoral in the Kingdom.Now, the House of Saud is neither my least nor my most favorite governmenton earth; I think they restrict religious and political reedom a lot, amongother things. I just think that the most likely replacementsfor them are going to be a lot worse for the citizens of the country.But I think the House of Saud is feeling the heat lately. In thelast six months or so I've read there have been stepped up harassingby the muttawain (religious police---*not* government) of Western womennot fully veiled (something stupid for women to do, IMO, because itsends the wrong signals about your morality). And I've read thatthey've cracked down on the few, home-based expartiate religiousgatherings, and even posted rewards in (government-owned) newspapersoffering money for anyone who turns in a group of expartiates whodare worship in their homes or any other secret place. So thegovernment has grown even more intolerant to try to take some ofthe wind out of the sails of the more-conservative opposition.As unislamic as some of these things are, they're just a smalltaste of what would happen if these guys overthrow the House ofSaud, like they're trying to in the long run.Is this really what you (and Rached and others in the generalwest-is-evil-zionists-rule-hate-west-or-you-are-a-puppet crowd)want, Ilyess? 
+True Label: talk.politics.mideast +Predicted Label: alt.atheism +Prediction: Incorrect + +Downloading readme: 734B [00:00, 2.15kB/s] +Repo card metadata block was not found. Setting CardData to empty. +Downloading data: 100%|█████| 14.8M/14.8M [00:08<00:00, 1.83MB/s] +Downloading data: 100%|█████| 8.91M/8.91M [00:03<00:00, 2.47MB/s] +Generating train split: 11314 examples [00:00, 127929.31 examples/s] +Generating test split: 7532 examples [00:00, 200991.85 examples/s] +dataset: DatasetDict({ + train: Dataset({ + features: ['text', 'label', 'label_text'], + num_rows: 11314 + }) + test: Dataset({ + features: ['text', 'label', 'label_text'], + num_rows: 7532 + }) +}) +Repo card metadata block was not found. Setting CardData to empty. +dataset: DatasetDict({ + train: Dataset({ + features: ['text', 'label', 'label_text'], + num_rows: 11314 + }) + test: Dataset({ + features: ['text', 'label', 'label_text'], + num_rows: 7532 + }) +}) +encoded_dataset: DatasetDict({ + train: Dataset({ + features: ['text', 'label', 'label_text', 'input_ids', 'token_type_ids', 'attention_mask'], + num_rows: 11314 + }) + test: Dataset({ + features: ['text', 'label', 'label_text', 'input_ids', 'token_type_ids', 'attention_mask'], + num_rows: 7532 + }) +}) + + 0%| | 0/4245 [00:00<?, ?it/s] +Traceback (most recent call last): + File "/home/ma-user/work/mindNLPAlbert.py", line ..., in <module> + trainer.train() + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py", line 755, in train + return inner_training_loop( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py", line 1107, in _inner_training_loop + tr_loss_step, grads = self.training_step(model, inputs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py", line 1382, in training_step + loss, grads = self.grad_fn(inputs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py", line 642, in after_grad + return grad_(fn_, weights)(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/common/api.py", line 188, in wrapper + results = fn(*arg, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py", line 617, in after_grad + run_args, res = self._pynative_forward_run(fn, grad_, weights, *args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/composite/base.py", line 674, in _pynative_forward_run + outputs = fn(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py", line 1374, in forward + return self.compute_loss(model, inputs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/engine/trainer/base.py", line 1396, in compute_loss + outputs = model(**inputs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/core/nn/modules/module.py", line 391, in _wrapped_call_impl + return self._call_impl(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/core/nn/modules/module.py", line 402, in _call_impl + return forward_call(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/modeling_albert.py", line 1565, in forward + outputs = self.albert( + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/core/nn/modules/module.py", line 391, in _wrapped_call_impl
+ return self._call_impl(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/core/nn/modules/module.py", line 402, in _call_impl + return forward_call(*args, **kwargs) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindnlp/transformers/models/albert/modeling_albert.py", line 929, in forward + buffered_token_type_ids_expanded = buffered_token_type_ids.broadcast_to((batch_size, seq_length)) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/common/tensor.py", line 1584, in broadcast_to + return tensor_operator_registry.get('broadcast_to')(self, shape) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/auto_generate/gen_ops_def.py", line 1081, in broadcast_to + return broadcast_to_impl(input, shape) + File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/mindspore/ops/auto_generate/pyboost_inner_prim.py", line 137, in __call__ + return _convert_stub(super().__call__(input, shape)) + File "<string>", line 4, in <module> + {'eval_loss': 1.934226155281067, 'eval_runtime': 121.8348, 'eval_samples_per_second': 7.732, 'eval_steps_per_second': 0.969, 'epoch': 1.0} +{'loss': 1.8706, 'learning_rate': 1.330977620730271e-05, 'epoch': 1.0} +{'loss': 1.9159, 'learning_rate': 1.3262661955241463e-05, 'epoch': 1.01} +{'loss': 1.8223, 'learning_rate': 1.3215547703180213e-05, 'epoch': 1.02} +{'loss': 1.8048, 'learning_rate': 1.3168433451118964e-05, 'epoch': 1.02} +{'loss': 1.9046, 'learning_rate': 1.3121319199057716e-05, 'epoch': 1.03} +{'loss': 2.047, 'learning_rate': 1.3074204946996467e-05, 'epoch': 1.04} +{'loss': 1.794, 'learning_rate': 1.3027090694935219e-05, 'epoch': 1.05} +{'loss': 1.856, 'learning_rate': 1.2979976442873969e-05, 'epoch': 1.05} +{'loss': 1.9468, 'learning_rate': 1.2932862190812724e-05, 'epoch': 1.06} +{'loss': 1.8067, 'learning_rate': 1.2885747938751473e-05, 'epoch': 1.07} +{'loss': 1.9848, 'learning_rate': 1.2838633686690225e-05, 'epoch': 1.07} +{'loss': 1.9787, 'learning_rate': 1.2791519434628976e-05, 'epoch': 1.08} +{'loss': 1.7619, 'learning_rate': 1.2744405182567728e-05, 'epoch': 1.09} +{'loss': 1.9363, 'learning_rate': 1.2697290930506478e-05, 'epoch': 1.1} +{'loss': 1.8154, 'learning_rate': 1.2650176678445233e-05, 'epoch': 1.1} +{'loss': 1.7923, 'learning_rate': 1.2603062426383982e-05, 'epoch': 1.11} +{'loss': 2.0117, 'learning_rate': 1.2555948174322734e-05, 'epoch': 1.12} +{'loss': 1.9168, 'learning_rate': 1.2508833922261485e-05, 'epoch': 1.12} +{'loss': 1.8136, 'learning_rate': 1.2461719670200237e-05, 'epoch': 1.13} +{'loss': 1.9701, 'learning_rate': 1.2414605418138987e-05, 'epoch': 1.14} +{'loss': 1.7979, 'learning_rate': 1.2367491166077738e-05, 'epoch': 1.14} +{'loss': 1.5671, 'learning_rate': 1.2320376914016491e-05, 'epoch': 1.15} +{'loss': 1.8255, 'learning_rate': 1.2273262661955243e-05, 'epoch': 1.16} +{'loss': 1.8027, 'learning_rate': 1.2226148409893994e-05, 'epoch': 1.17} +{'loss': 1.8137, 'learning_rate': 1.2179034157832746e-05, 'epoch': 1.17} +{'loss': 1.748, 'learning_rate': 1.2131919905771496e-05, 'epoch': 1.18} +{'loss': 1.7853, 'learning_rate': 1.2084805653710247e-05, 'epoch': 1.19} +{'loss': 1.8007, 'learning_rate': 1.2037691401648999e-05, 'epoch': 1.19} +{'loss': 1.8309, 'learning_rate': 1.1990577149587752e-05, 'epoch': 1.2} +{'loss': 1.9758, 'learning_rate': 1.1943462897526503e-05, 'epoch': 1.21} +{'loss': 1.9097, 'learning_rate': 1.1896348645465255e-05, 'epoch': 1.22} +{'loss': 1.7768, 
'learning_rate': 1.1849234393404005e-05, 'epoch': 1.22} +{'loss': 1.8001, 'learning_rate': 1.1802120141342756e-05, 'epoch': 1.23} +{'loss': 1.7812, 'learning_rate': 1.1755005889281508e-05, 'epoch': 1.24} +{'loss': 1.7163, 'learning_rate': 1.1707891637220261e-05, 'epoch': 1.24} +{'loss': 1.81, 'learning_rate': 1.1660777385159012e-05, 'epoch': 1.25} +{'loss': 1.9075, 'learning_rate': 1.1613663133097764e-05, 'epoch': 1.26} +{'loss': 1.5918, 'learning_rate': 1.1566548881036514e-05, 'epoch': 1.27} +{'loss': 1.828, 'learning_rate': 1.1519434628975265e-05, 'epoch': 1.27} +{'loss': 1.8371, 'learning_rate': 1.1472320376914017e-05, 'epoch': 1.28} +{'loss': 1.8861, 'learning_rate': 1.1425206124852768e-05, 'epoch': 1.29} +{'loss': 1.6801, 'learning_rate': 1.1378091872791521e-05, 'epoch': 1.29} +{'loss': 1.898, 'learning_rate': 1.1330977620730273e-05, 'epoch': 1.3} +{'loss': 1.6902, 'learning_rate': 1.1283863368669023e-05, 'epoch': 1.31} +{'loss': 1.731, 'learning_rate': 1.1236749116607774e-05, 'epoch': 1.31} +{'loss': 1.7859, 'learning_rate': 1.1189634864546526e-05, 'epoch': 1.32} +{'loss': 1.6359, 'learning_rate': 1.1142520612485277e-05, 'epoch': 1.33} +{'loss': 1.9486, 'learning_rate': 1.1095406360424029e-05, 'epoch': 1.34} +{'loss': 1.9364, 'learning_rate': 1.1048292108362782e-05, 'epoch': 1.34} +{'loss': 1.8847, 'learning_rate': 1.1001177856301532e-05, 'epoch': 1.35} +{'loss': 1.7944, 'learning_rate': 1.0954063604240283e-05, 'epoch': 1.36} +{'loss': 1.7532, 'learning_rate': 1.0906949352179035e-05, 'epoch': 1.36} +{'loss': 1.7355, 'learning_rate': 1.0859835100117786e-05, 'epoch': 1.37} +{'loss': 1.6636, 'learning_rate': 1.0812720848056538e-05, 'epoch': 1.38} +{'loss': 1.7203, 'learning_rate': 1.0765606595995291e-05, 'epoch': 1.39} +{'loss': 1.66, 'learning_rate': 1.071849234393404e-05, 'epoch': 1.39} +{'loss': 1.6902, 'learning_rate': 1.0671378091872792e-05, 'epoch': 1.4} +{'loss': 1.5835, 'learning_rate': 1.0624263839811544e-05, 'epoch': 1.41} +{'loss': 1.6698, 'learning_rate': 1.0577149587750295e-05, 'epoch': 1.41} +{'loss': 1.8361, 'learning_rate': 1.0530035335689047e-05, 'epoch': 1.42} +{'loss': 1.7245, 'learning_rate': 1.0482921083627797e-05, 'epoch': 1.43} +{'loss': 1.821, 'learning_rate': 1.043580683156655e-05, 'epoch': 1.43} +{'loss': 1.7687, 'learning_rate': 1.0388692579505301e-05, 'epoch': 1.44} +{'loss': 1.7007, 'learning_rate': 1.0341578327444053e-05, 'epoch': 1.45} +{'loss': 1.5068, 'learning_rate': 1.0294464075382804e-05, 'epoch': 1.46} +{'loss': 1.64, 'learning_rate': 1.0247349823321556e-05, 'epoch': 1.46} +{'loss': 1.916, 'learning_rate': 1.0200235571260306e-05, 'epoch': 1.47} +{'loss': 1.6733, 'learning_rate': 1.0153121319199059e-05, 'epoch': 1.48} +{'loss': 1.8821, 'learning_rate': 1.010600706713781e-05, 'epoch': 1.48} +{'loss': 1.6846, 'learning_rate': 1.0058892815076562e-05, 'epoch': 1.49} +{'loss': 1.647, 'learning_rate': 1.0011778563015313e-05, 'epoch': 1.5} +{'loss': 1.7158, 'learning_rate': 9.964664310954065e-06, 'epoch': 1.51} +{'loss': 1.568, 'learning_rate': 9.917550058892816e-06, 'epoch': 1.51} +{'loss': 1.7618, 'learning_rate': 9.870435806831568e-06, 'epoch': 1.52} +{'loss': 1.6675, 'learning_rate': 9.82332155477032e-06, 'epoch': 1.53} +{'loss': 1.5742, 'learning_rate': 9.77620730270907e-06, 'epoch': 1.53} +{'loss': 1.7907, 'learning_rate': 9.729093050647822e-06, 'epoch': 1.54} +{'loss': 1.6222, 'learning_rate': 9.681978798586574e-06, 'epoch': 1.55} +{'loss': 1.5817, 'learning_rate': 9.634864546525324e-06, 'epoch': 1.55} +{'loss': 1.6249, 'learning_rate': 
9.587750294464077e-06, 'epoch': 1.56} +{'loss': 1.7486, 'learning_rate': 9.540636042402828e-06, 'epoch': 1.57} +{'loss': 1.7109, 'learning_rate': 9.493521790341578e-06, 'epoch': 1.58} +{'loss': 1.6359, 'learning_rate': 9.446407538280331e-06, 'epoch': 1.58} +{'loss': 1.7983, 'learning_rate': 9.399293286219083e-06, 'epoch': 1.59} +{'loss': 1.8121, 'learning_rate': 9.352179034157833e-06, 'epoch': 1.6} +{'loss': 1.5678, 'learning_rate': 9.305064782096584e-06, 'epoch': 1.6} +{'loss': 1.6605, 'learning_rate': 9.257950530035337e-06, 'epoch': 1.61} +{'loss': 1.6769, 'learning_rate': 9.210836277974087e-06, 'epoch': 1.62} +{'loss': 1.4976, 'learning_rate': 9.163722025912839e-06, 'epoch': 1.63} +{'loss': 1.6349, 'learning_rate': 9.116607773851592e-06, 'epoch': 1.63} +{'loss': 1.6199, 'learning_rate': 9.069493521790342e-06, 'epoch': 1.64} +{'loss': 1.6215, 'learning_rate': 9.022379269729093e-06, 'epoch': 1.65} +{'loss': 1.6123, 'learning_rate': 8.975265017667846e-06, 'epoch': 1.65} +{'loss': 1.5007, 'learning_rate': 8.928150765606596e-06, 'epoch': 1.66} +{'loss': 1.4372, 'learning_rate': 8.881036513545348e-06, 'epoch': 1.67} +{'loss': 1.7341, 'learning_rate': 8.8339222614841e-06, 'epoch': 1.67} +{'loss': 1.6567, 'learning_rate': 8.78680800942285e-06, 'epoch': 1.68} +{'loss': 1.5624, 'learning_rate': 8.739693757361602e-06, 'epoch': 1.69} +{'loss': 1.5854, 'learning_rate': 8.692579505300354e-06, 'epoch': 1.7} +{'loss': 1.5873, 'learning_rate': 8.645465253239105e-06, 'epoch': 1.7} +{'loss': 1.5115, 'learning_rate': 8.598351001177857e-06, 'epoch': 1.71} +{'loss': 1.6174, 'learning_rate': 8.551236749116608e-06, 'epoch': 1.72} +{'loss': 1.436, 'learning_rate': 8.50412249705536e-06, 'epoch': 1.72} +{'loss': 1.7209, 'learning_rate': 8.457008244994111e-06, 'epoch': 1.73} +{'loss': 1.6428, 'learning_rate': 8.409893992932863e-06, 'epoch': 1.74} +{'loss': 1.4833, 'learning_rate': 8.362779740871614e-06, 'epoch': 1.75} +{'loss': 1.7402, 'learning_rate': 8.315665488810366e-06, 'epoch': 1.75} +{'loss': 1.5804, 'learning_rate': 8.268551236749117e-06, 'epoch': 1.76} +{'loss': 1.6905, 'learning_rate': 8.221436984687869e-06, 'epoch': 1.77} +{'loss': 1.7076, 'learning_rate': 8.17432273262662e-06, 'epoch': 1.77} +{'loss': 1.6038, 'learning_rate': 8.127208480565372e-06, 'epoch': 1.78} +{'loss': 1.5418, 'learning_rate': 8.080094228504123e-06, 'epoch': 1.79} +{'loss': 1.6592, 'learning_rate': 8.032979976442875e-06, 'epoch': 1.8} +{'loss': 1.4188, 'learning_rate': 7.985865724381626e-06, 'epoch': 1.8} +{'loss': 1.5273, 'learning_rate': 7.938751472320378e-06, 'epoch': 1.81} +{'loss': 1.5125, 'learning_rate': 7.89163722025913e-06, 'epoch': 1.82} +{'loss': 1.6153, 'learning_rate': 7.84452296819788e-06, 'epoch': 1.82} +{'loss': 1.5599, 'learning_rate': 7.797408716136632e-06, 'epoch': 1.83} +{'loss': 1.5748, 'learning_rate': 7.750294464075384e-06, 'epoch': 1.84} +{'loss': 1.6111, 'learning_rate': 7.703180212014135e-06, 'epoch': 1.84} +{'loss': 1.5648, 'learning_rate': 7.656065959952887e-06, 'epoch': 1.85} +{'loss': 1.595, 'learning_rate': 7.6089517078916374e-06, 'epoch': 1.86} +{'loss': 1.4616, 'learning_rate': 7.56183745583039e-06, 'epoch': 1.87} +{'loss': 1.3282, 'learning_rate': 7.514723203769141e-06, 'epoch': 1.87} +{'loss': 1.5174, 'learning_rate': 7.467608951707892e-06, 'epoch': 1.88} +{'loss': 1.5724, 'learning_rate': 7.420494699646644e-06, 'epoch': 1.89} +{'loss': 1.5629, 'learning_rate': 7.373380447585396e-06, 'epoch': 1.89} +{'loss': 1.5239, 'learning_rate': 7.3262661955241465e-06, 'epoch': 1.9} +{'loss': 1.7579, 
'learning_rate': 7.279151943462898e-06, 'epoch': 1.91} +{'loss': 1.4297, 'learning_rate': 7.23203769140165e-06, 'epoch': 1.92} +{'loss': 1.6632, 'learning_rate': 7.184923439340401e-06, 'epoch': 1.92} +{'loss': 1.6051, 'learning_rate': 7.1378091872791525e-06, 'epoch': 1.93} +{'loss': 1.4817, 'learning_rate': 7.090694935217905e-06, 'epoch': 1.94} +{'loss': 1.566, 'learning_rate': 7.0435806831566555e-06, 'epoch': 1.94} +{'loss': 1.4524, 'learning_rate': 6.996466431095407e-06, 'epoch': 1.95} +{'loss': 1.5737, 'learning_rate': 6.949352179034159e-06, 'epoch': 1.96} +{'loss': 1.6593, 'learning_rate': 6.90223792697291e-06, 'epoch': 1.96} +{'loss': 1.4787, 'learning_rate': 6.8551236749116615e-06, 'epoch': 1.97} +{'loss': 1.5985, 'learning_rate': 6.808009422850412e-06, 'epoch': 1.98} +{'loss': 1.5656, 'learning_rate': 6.7608951707891645e-06, 'epoch': 1.99} +{'loss': 1.5111, 'learning_rate': 6.713780918727916e-06, 'epoch': 1.99} +{'loss': 1.4442, 'learning_rate': 6.666666666666667e-06, 'epoch': 2.0} +{'eval_loss': 1.64035165309906, 'eval_runtime': 96.028, 'eval_samples_per_second': 9.81, 'eval_steps_per_second': 1.229, 'epoch': 2.0} +{'loss': 1.4239, 'learning_rate': 6.619552414605419e-06, 'epoch': 2.01} +{'loss': 1.5397, 'learning_rate': 6.5724381625441705e-06, 'epoch': 2.01} +{'loss': 1.5361, 'learning_rate': 6.525323910482921e-06, 'epoch': 2.02} +{'loss': 1.5718, 'learning_rate': 6.4782096584216735e-06, 'epoch': 2.03} +{'loss': 1.5439, 'learning_rate': 6.431095406360425e-06, 'epoch': 2.04} +{'loss': 1.479, 'learning_rate': 6.383981154299176e-06, 'epoch': 2.04} +{'loss': 1.4308, 'learning_rate': 6.336866902237927e-06, 'epoch': 2.05} +{'loss': 1.6827, 'learning_rate': 6.2897526501766795e-06, 'epoch': 2.06} +{'loss': 1.4118, 'learning_rate': 6.24263839811543e-06, 'epoch': 2.06} +{'loss': 1.485, 'learning_rate': 6.195524146054182e-06, 'epoch': 2.07} +{'loss': 1.6287, 'learning_rate': 6.148409893992934e-06, 'epoch': 2.08} +{'loss': 1.4601, 'learning_rate': 6.101295641931685e-06, 'epoch': 2.08} +{'loss': 1.5838, 'learning_rate': 6.054181389870436e-06, 'epoch': 2.09} +{'loss': 1.4472, 'learning_rate': 6.0070671378091885e-06, 'epoch': 2.1} +{'loss': 1.3862, 'learning_rate': 5.959952885747939e-06, 'epoch': 2.11} +{'loss': 1.6133, 'learning_rate': 5.912838633686691e-06, 'epoch': 2.11} +{'loss': 1.6568, 'learning_rate': 5.865724381625441e-06, 'epoch': 2.12} +{'loss': 1.326, 'learning_rate': 5.818610129564194e-06, 'epoch': 2.13} +{'loss': 1.6088, 'learning_rate': 5.771495877502945e-06, 'epoch': 2.13} +{'loss': 1.5843, 'learning_rate': 5.724381625441696e-06, 'epoch': 2.14} +{'loss': 1.462, 'learning_rate': 5.677267373380448e-06, 'epoch': 2.15} +{'loss': 1.5061, 'learning_rate': 5.6301531213192e-06, 'epoch': 2.16} +{'loss': 1.3882, 'learning_rate': 5.58303886925795e-06, 'epoch': 2.16} +{'loss': 1.5251, 'learning_rate': 5.535924617196703e-06, 'epoch': 2.17} +{'loss': 1.4952, 'learning_rate': 5.488810365135454e-06, 'epoch': 2.18} +{'loss': 1.4892, 'learning_rate': 5.441696113074205e-06, 'epoch': 2.18} +{'loss': 1.6042, 'learning_rate': 5.394581861012956e-06, 'epoch': 2.19} +{'loss': 1.4758, 'learning_rate': 5.347467608951709e-06, 'epoch': 2.2} +{'loss': 1.5867, 'learning_rate': 5.300353356890459e-06, 'epoch': 2.2} +{'loss': 1.5673, 'learning_rate': 5.253239104829211e-06, 'epoch': 2.21} +{'loss': 1.6257, 'learning_rate': 5.206124852767963e-06, 'epoch': 2.22} +{'loss': 1.4875, 'learning_rate': 5.159010600706714e-06, 'epoch': 2.23} +{'loss': 1.5995, 'learning_rate': 5.111896348645465e-06, 'epoch': 2.23} +{'loss': 
1.484, 'learning_rate': 5.064782096584218e-06, 'epoch': 2.24} +{'loss': 1.5324, 'learning_rate': 5.017667844522968e-06, 'epoch': 2.25} +{'loss': 1.4282, 'learning_rate': 4.97055359246172e-06, 'epoch': 2.25} +{'loss': 1.51, 'learning_rate': 4.923439340400471e-06, 'epoch': 2.26} +{'loss': 1.4225, 'learning_rate': 4.876325088339223e-06, 'epoch': 2.27} +{'loss': 1.5825, 'learning_rate': 4.829210836277974e-06, 'epoch': 2.28} +{'loss': 1.6643, 'learning_rate': 4.782096584216726e-06, 'epoch': 2.28} +{'loss': 1.3955, 'learning_rate': 4.734982332155477e-06, 'epoch': 2.29} +{'loss': 1.4971, 'learning_rate': 4.687868080094229e-06, 'epoch': 2.3} +{'loss': 1.6081, 'learning_rate': 4.64075382803298e-06, 'epoch': 2.3} +{'loss': 1.4412, 'learning_rate': 4.593639575971732e-06, 'epoch': 2.31} +{'loss': 1.4726, 'learning_rate': 4.546525323910483e-06, 'epoch': 2.32} +{'loss': 1.3144, 'learning_rate': 4.499411071849235e-06, 'epoch': 2.33} +{'loss': 1.5704, 'learning_rate': 4.452296819787986e-06, 'epoch': 2.33} +{'loss': 1.6508, 'learning_rate': 4.405182567726738e-06, 'epoch': 2.34} +{'loss': 1.7181, 'learning_rate': 4.358068315665489e-06, 'epoch': 2.35} +{'loss': 1.5822, 'learning_rate': 4.310954063604241e-06, 'epoch': 2.35} +{'loss': 1.5158, 'learning_rate': 4.2638398115429916e-06, 'epoch': 2.36} +{'loss': 1.5237, 'learning_rate': 4.216725559481744e-06, 'epoch': 2.37} +{'loss': 1.2985, 'learning_rate': 4.1696113074204954e-06, 'epoch': 2.37} +{'loss': 1.4391, 'learning_rate': 4.122497055359246e-06, 'epoch': 2.38} +{'loss': 1.3929, 'learning_rate': 4.0753828032979984e-06, 'epoch': 2.39} +{'loss': 1.3628, 'learning_rate': 4.028268551236749e-06, 'epoch': 2.4} +{'loss': 1.4164, 'learning_rate': 3.981154299175501e-06, 'epoch': 2.4} +{'loss': 1.3494, 'learning_rate': 3.934040047114253e-06, 'epoch': 2.41} +{'loss': 1.4616, 'learning_rate': 3.886925795053004e-06, 'epoch': 2.42} +{'loss': 1.397, 'learning_rate': 3.839811542991755e-06, 'epoch': 2.42} +{'loss': 1.4567, 'learning_rate': 3.7926972909305066e-06, 'epoch': 2.43} +{'loss': 1.4427, 'learning_rate': 3.745583038869258e-06, 'epoch': 2.44} +{'loss': 1.4795, 'learning_rate': 3.69846878680801e-06, 'epoch': 2.45} +{'loss': 1.3985, 'learning_rate': 3.651354534746761e-06, 'epoch': 2.45} +{'loss': 1.2249, 'learning_rate': 3.6042402826855126e-06, 'epoch': 2.46} +{'loss': 1.6071, 'learning_rate': 3.5571260306242637e-06, 'epoch': 2.47} +{'loss': 1.4251, 'learning_rate': 3.5100117785630156e-06, 'epoch': 2.47} +{'loss': 1.4917, 'learning_rate': 3.462897526501767e-06, 'epoch': 2.48} +{'loss': 1.2912, 'learning_rate': 3.415783274440518e-06, 'epoch': 2.49} +{'loss': 1.4848, 'learning_rate': 3.36866902237927e-06, 'epoch': 2.49} +{'loss': 1.3776, 'learning_rate': 3.321554770318021e-06, 'epoch': 2.5} +{'loss': 1.3616, 'learning_rate': 3.2744405182567727e-06, 'epoch': 2.51} +{'loss': 1.4622, 'learning_rate': 3.2273262661955246e-06, 'epoch': 2.52} +{'loss': 1.3345, 'learning_rate': 3.1802120141342757e-06, 'epoch': 2.52} +{'loss': 1.5207, 'learning_rate': 3.133097762073027e-06, 'epoch': 2.53} +{'loss': 1.4474, 'learning_rate': 3.0859835100117787e-06, 'epoch': 2.54} +{'loss': 1.4421, 'learning_rate': 3.0388692579505302e-06, 'epoch': 2.54} +{'loss': 1.2, 'learning_rate': 2.9917550058892817e-06, 'epoch': 2.55} +{'loss': 1.5218, 'learning_rate': 2.9446407538280332e-06, 'epoch': 2.56} +{'loss': 1.4568, 'learning_rate': 2.8975265017667847e-06, 'epoch': 2.57} +{'loss': 1.568, 'learning_rate': 2.8504122497055362e-06, 'epoch': 2.57} +{'loss': 1.4935, 'learning_rate': 2.8032979976442877e-06, 
'epoch': 2.58} +{'loss': 1.371, 'learning_rate': 2.7561837455830392e-06, 'epoch': 2.59} +{'loss': 1.5419, 'learning_rate': 2.7090694935217903e-06, 'epoch': 2.59} +{'loss': 1.4674, 'learning_rate': 2.6619552414605422e-06, 'epoch': 2.6} +{'loss': 1.4457, 'learning_rate': 2.6148409893992937e-06, 'epoch': 2.61} +{'loss': 1.34, 'learning_rate': 2.567726737338045e-06, 'epoch': 2.61} +{'loss': 1.3917, 'learning_rate': 2.5206124852767967e-06, 'epoch': 2.62} +{'loss': 1.3072, 'learning_rate': 2.473498233215548e-06, 'epoch': 2.63} +{'loss': 1.5005, 'learning_rate': 2.4263839811542993e-06, 'epoch': 2.64} +{'loss': 1.4246, 'learning_rate': 2.379269729093051e-06, 'epoch': 2.64} +{'loss': 1.2906, 'learning_rate': 2.3321554770318023e-06, 'epoch': 2.65} +{'loss': 1.3543, 'learning_rate': 2.285041224970554e-06, 'epoch': 2.66} +{'loss': 1.2792, 'learning_rate': 2.2379269729093053e-06, 'epoch': 2.66} +{'loss': 1.3097, 'learning_rate': 2.190812720848057e-06, 'epoch': 2.67} +{'loss': 1.442, 'learning_rate': 2.1436984687868083e-06, 'epoch': 2.68} +{'loss': 1.3148, 'learning_rate': 2.0965842167255594e-06, 'epoch': 2.69} +{'loss': 1.4021, 'learning_rate': 2.0494699646643113e-06, 'epoch': 2.69} +{'loss': 1.4178, 'learning_rate': 2.002355712603063e-06, 'epoch': 2.7} +{'loss': 1.2951, 'learning_rate': 1.955241460541814e-06, 'epoch': 2.71} +{'loss': 1.4155, 'learning_rate': 1.9081272084805654e-06, 'epoch': 2.71} +{'loss': 1.3039, 'learning_rate': 1.861012956419317e-06, 'epoch': 2.72} +{'loss': 1.3348, 'learning_rate': 1.8138987043580686e-06, 'epoch': 2.73} +{'loss': 1.5269, 'learning_rate': 1.76678445229682e-06, 'epoch': 2.73} +{'loss': 1.456, 'learning_rate': 1.7196702002355714e-06, 'epoch': 2.74} +{'loss': 1.291, 'learning_rate': 1.6725559481743227e-06, 'epoch': 2.75} +{'loss': 1.4496, 'learning_rate': 1.6254416961130742e-06, 'epoch': 2.76} +{'loss': 1.4898, 'learning_rate': 1.578327444051826e-06, 'epoch': 2.76} +{'loss': 1.4579, 'learning_rate': 1.5312131919905772e-06, 'epoch': 2.77} +{'loss': 1.5109, 'learning_rate': 1.4840989399293287e-06, 'epoch': 2.78} +{'loss': 1.393, 'learning_rate': 1.4369846878680802e-06, 'epoch': 2.78} +{'loss': 1.3775, 'learning_rate': 1.3898704358068315e-06, 'epoch': 2.79} +{'loss': 1.4783, 'learning_rate': 1.3427561837455832e-06, 'epoch': 2.8} +{'loss': 1.2595, 'learning_rate': 1.2956419316843347e-06, 'epoch': 2.81} +{'loss': 1.3582, 'learning_rate': 1.248527679623086e-06, 'epoch': 2.81} +{'loss': 1.2581, 'learning_rate': 1.2014134275618375e-06, 'epoch': 2.82} +{'loss': 1.6671, 'learning_rate': 1.154299175500589e-06, 'epoch': 2.83} +{'loss': 1.2716, 'learning_rate': 1.1071849234393405e-06, 'epoch': 2.83} +{'loss': 1.2645, 'learning_rate': 1.060070671378092e-06, 'epoch': 2.84} +{'loss': 1.5729, 'learning_rate': 1.0129564193168433e-06, 'epoch': 2.85} +{'loss': 1.4087, 'learning_rate': 9.65842167255595e-07, 'epoch': 2.86} +{'loss': 1.2857, 'learning_rate': 9.187279151943463e-07, 'epoch': 2.86} +{'loss': 1.282, 'learning_rate': 8.716136631330977e-07, 'epoch': 2.87} +{'loss': 1.2097, 'learning_rate': 8.244994110718493e-07, 'epoch': 2.88} +{'loss': 1.3122, 'learning_rate': 7.773851590106007e-07, 'epoch': 2.88} +{'loss': 1.3597, 'learning_rate': 7.302709069493522e-07, 'epoch': 2.89} +{'loss': 1.3839, 'learning_rate': 6.831566548881037e-07, 'epoch': 2.9} +{'loss': 1.5166, 'learning_rate': 6.360424028268551e-07, 'epoch': 2.9} +{'loss': 1.3991, 'learning_rate': 5.889281507656066e-07, 'epoch': 2.91} +{'loss': 1.4307, 'learning_rate': 5.418138987043581e-07, 'epoch': 2.92} +{'loss': 1.3694, 
'learning_rate': 4.946996466431095e-07, 'epoch': 2.93} +{'loss': 1.3242, 'learning_rate': 4.4758539458186104e-07, 'epoch': 2.93} +{'loss': 1.4505, 'learning_rate': 4.004711425206125e-07, 'epoch': 2.94} +{'loss': 1.4278, 'learning_rate': 3.53356890459364e-07, 'epoch': 2.95} +{'loss': 1.3563, 'learning_rate': 3.0624263839811545e-07, 'epoch': 2.95} +{'loss': 1.4091, 'learning_rate': 2.5912838633686695e-07, 'epoch': 2.96} +{'loss': 1.5412, 'learning_rate': 2.120141342756184e-07, 'epoch': 2.97} +{'loss': 1.2831, 'learning_rate': 1.6489988221436985e-07, 'epoch': 2.98} +{'loss': 1.4771, 'learning_rate': 1.1778563015312134e-07, 'epoch': 2.98} +{'loss': 1.3773, 'learning_rate': 7.06713780918728e-08, 'epoch': 2.99} +{'loss': 1.2446, 'learning_rate': 2.3557126030624265e-08, 'epoch': 3.0} +{'eval_loss': 1.5596660375595093, 'eval_runtime': 98.0142, 'eval_samples_per_second': 9.611, 'eval_steps_per_second': 1.204, 'epoch': 3.0} +{'train_runtime': 2262.6642, 'train_samples_per_second': 15.009, 'train_steps_per_second': 1.876, 'train_loss': 1.8279571971286732, 'epoch': 3.0} +100%|████████████████████████| 4245/4245 [37:42<00:00, 1.88it/s] +100%|██████████████████████████| 942/942 [01:37<00:00, 9.63it/s] +Evaluation results: {'eval_loss': 1.5596660375595093, 'eval_runtime': 98.2249, 'eval_samples_per_second': 9.59, 'eval_steps_per_second': 1.201, 'epoch': 3.0} +Text: I am a little confused on all of the models of the 88-89 bonnevilles.I have heard of the LE SE LSE SSE SSEI. Could someone tell me thedifferences are far as features or performance. I am also curious toknow what the book value is for prefereably the 89 model. And how muchless than book value can you usually get them for. In other words howmuch are they in demand this time of year. I have heard that the mid-springearly summer is the best time to buy. +True Label: rec.autos +Predicted Label: misc.forsale +Prediction: Incorrect +Text: I'm not familiar at all with the format of these X-Face:thingies, butafter seeing them in some folks' headers, I've *got* to *see* them (andmaybe make one of my own)!I've got dpg-viewon my Linux box (which displays uncompressed X-Faces)and I've managed to compile [un]compface too... but now that I'm *looking*for them, I can't seem to find any X-Face:'s in anyones news headers! :-(Could you, would you, please send me your X-Face:headerI know* I'll probably get a little swamped, but I can handle it. ...I hope. +True Label: comp.windows.x +Predicted Label: comp.windows.x +Prediction: Correct +Text: In a word, yes. +True Label: alt.atheism +Predicted Label: talk.politics.misc +Prediction: Incorrect +Text: They were attacking the Iraqis to drive them out of Kuwait,a country whose citizens have close blood and business tiesto Saudi citizens. And me thinks if the US had not helped outthe Iraqis would have swallowed Saudi Arabia, too (or at least the eastern oilfields). And no Muslim country was doingmuch of anything to help liberate Kuwait and protect SaudiArabia; indeed, in some masses of citizens were demonstratingin favor of that butcher Saddam (who killed lotsa Muslims),just because he was killing, raping, and looting relativelyrich Muslims and also thumbing his nose at the West.So how would have *you* defended Saudi Arabia and rolledback the Iraqi invasion, were you in charge of Saudi Arabia???I think that it is a very good idea to not have governments have anofficial religion (de facto or de jure), because with human naturelike it is, the ambitious and not the pious will always be theones who rise to power. 
There are just too many people in thisworld (or any country) for the citizens to really know if a leader is really devout or if he is just a slick operator.You make it sound like these guys are angels, Ilyess. (In yourclarinet posting you edited out some stuff; was it the following???)Friday's New York Times reported that this group definitely ismore conservative than even Sheikh Baz and his followers (whothink that the House of Saud does not rule the country conservativelyenough). The NYT reported that, besides complaining that thegovernment was not conservative enough, they have: - asserted that the (approx. 500,000) Shiites in the Kingdom are apostates, a charge that under Saudi (and Islamic) law brings the death penalty. Diplomatic guy (Sheikh bin Jibrin), isn't he Ilyess? - called for severe punishment of the 40 or so women who drove in public a while back to protest the ban on women driving. The guy from the group who said this, Abdelhamoud al-Toweijri, said that these women should be fired from their jobs, jailed, and branded as prostitutes. Is this what you want to see happen, Ilyess? I've heard many Muslims say that the ban on women driving has no basis in the Qur'an, the ahadith, etc. Yet these folks not only like the ban, they want these women falsely called prostitutes? If I were you, I'd choose my heroes wisely, Ilyess, not just reflexively rally behind anyone who hates anyone you hate. - say that women should not be allowed to work. - say that TV and radio are too immoral in the Kingdom.Now, the House of Saud is neither my least nor my most favorite governmenton earth; I think they restrict religious and political reedom a lot, amongother things. I just think that the most likely replacementsfor them are going to be a lot worse for the citizens of the country.But I think the House of Saud is feeling the heat lately. In thelast six months or so I've read there have been stepped up harassingby the muttawain (religious police---*not* government) of Western womennot fully veiled (something stupid for women to do, IMO, because itsends the wrong signals about your morality). And I've read thatthey've cracked down on the few, home-based expartiate religiousgatherings, and even posted rewards in (government-owned) newspapersoffering money for anyone who turns in a group of expartiates whodare worship in their homes or any other secret place. So thegovernment has grown even more intolerant to try to take some ofthe wind out of the sails of the more-conservative opposition.As unislamic as some of these things are, they're just a smalltaste of what would happen if these guys overthrow the House ofSaud, like they're trying to in the long run.Is this really what you (and Rached and others in the generalwest-is-evil-zionists-rule-hate-west-or-you-are-a-puppet crowd)want, Ilyess? 
+True Label: talk.politics.mideast +Predicted Label: talk.politics.mideast +Prediction: Correct diff --git a/llm/finetune/bigbird_pagesus/README.md b/llm/finetune/bigbird_pagesus/README.md new file mode 100644 index 000000000..0a90dacd7 --- /dev/null +++ b/llm/finetune/bigbird_pagesus/README.md @@ -0,0 +1,25 @@ +# bigbird_pegasus fine-tuning comparison +## train loss + +Comparison of the training loss during fine-tuning + +| epoch | mindnlp+mindspore | transformer+torch(4060) |transformer+torch(4060,another time) | +| ----- | ----------------- | ------------------------- |------------------------- | +| 1 | 2.0958 | 8.7301 |5.4650 | +| 2 | 1.969 | 8.1557 |4.6890 | +| 3 | 1.8755 | 7.7516 |4.2572 | +| 4 | 1.8264 | 7.5017 |4.0263 | +| 5 | 1.7349 | 7.2614 |3.9444 | +| 6 | 1.678 | 7.0559 |3.8428 | +| 7 | 1.6937 | 6.8405 |3.7187 | +| 8 | 1.654 | 6.7297 |3.7192 | +| 9 | 1.6365 | 6.7136 |3.5434 | +| 10 | 1.7003 | 6.6279 |3.5881 | + +## eval loss + +Comparison of the evaluation loss + +| epoch | mindnlp+mindspore | transformer+torch(4060) | transformer+torch(4060,another time) | +| ----- | ------------------ | ------------------------- |------------------------- | +| 1 | 2.1257965564727783 | 6.3235931396484375 |4.264792442321777 | \ No newline at end of file diff --git a/llm/finetune/bigbird_pagesus/mindNLPDatatricksAuto.ipynb b/llm/finetune/bigbird_pagesus/mindNLPDatatricksAuto.ipynb new file mode 100644 index 000000000..6b0d0cbcb --- /dev/null +++ b/llm/finetune/bigbird_pagesus/mindNLPDatatricksAuto.ipynb @@ -0,0 +1,1095 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "e6aa517b", + "metadata": {}, + "source": [ + "# MindNLP bigbird_pegasus fine-tuning\n", + "Base model: google/bigbird-pegasus-large-arxiv\n", + "Tokenizer: google/bigbird-pegasus-large-arxiv\n", + "Fine-tuning dataset: databricks/databricks-dolly-15k\n", + "Hardware: Ascend910B1\n", + "Environment\n", + "| Software | Version |\n", + "| ----------- | --------------------------- |\n", + "| MindSpore | MindSpore 2.4.0 |\n", + "| MindNLP | MindNLP 0.4.1 |\n", + "| CANN | 8.0 |\n", + "| Python | Python 3.9 |\n", + "| OS platform | Ubuntu 5.4.0-42-generic |\n", + "\n", + "## Introduction\n", + "BigBird-Pegasus is a hybrid model that combines the strengths of BigBird and Pegasus and is designed for processing long text sequences. BigBird is a Transformer-based model that handles long sequences through a sparse attention mechanism, which lowers the computational complexity. Pegasus is a model built for text summarization; its self-supervised pre-training task (GSG, Gap Sentence Generation) improves summary generation. BigBird-Pegasus combines BigBird's long-sequence processing with Pegasus's summarization ability, making it well suited to long-document summarization tasks such as academic papers and long reports.\n", + "Databricks Dolly 15k is a high-quality instruction-tuning dataset released by Databricks. It contains roughly 15,000 human-written instruction-response pairs for training and evaluating conversational models, and it was built specifically for fine-tuning NLP models.\n", + "## train loss\n", + "\n", + "Comparison of the training loss during fine-tuning\n", + "\n", + "| epoch | mindnlp+mindspore | transformer+torch(4060) |\n", + "| ----- | ----------------- | ------------------------- |\n", + "| 1 | 2.0958 | 8.7301 |\n", + "| 2 | 1.969 | 8.1557 |\n", + "| 3 | 1.8755 | 7.7516 |\n", + "| 4 | 1.8264 | 7.5017 |\n", + "| 5 | 1.7349 | 7.2614 |\n", + "| 6 | 1.678 | 7.0559 |\n", + "| 7 | 1.6937 | 6.8405 |\n", + "| 8 | 1.654 | 6.7297 |\n", + "| 9 | 1.6365 | 6.7136 |\n", + "| 10 | 1.7003 | 6.6279 |\n", + "\n", + "## eval loss\n", + "\n", + "Comparison of the evaluation loss\n", + "\n", + "| epoch | mindnlp+mindspore | transformer+torch(4060) |\n", + "| ----- | ------------------ | ------------------------- |\n", + "| 1 | 2.1257965564727783 | 6.3235931396484375 |\n", + "\n", + "**First, run the following script to set up the environment**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8361c5cf", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Looking in indexes: http://mirrors.aliyun.com/pypi/simple/\n", + "Collecting mindnlp\n", + " Downloading 
http://mirrors.aliyun.com/pypi/packages/0f/a8/5a072852d28a51417b5e330b32e6ae5f26b491ef01a15ba968e77f785e69/mindnlp-0.4.0-py3-none-any.whl (8.4 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m8.4/8.4 MB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: mindspore>=2.2.14 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (2.3.0)\n", + "Requirement already satisfied: tqdm in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (4.65.0)\n", + "Requirement already satisfied: requests in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (2.31.0)\n", + "Collecting datasets (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/4c/37/22ef7675bef4ffe9577b937ddca2e22791534cbbe11c30714972a91532dc/datasets-3.3.2-py3-none-any.whl (485 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m485.4/485.4 kB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting evaluate (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/a2/e7/cbca9e2d2590eb9b5aa8f7ebabe1beb1498f9462d2ecede5c9fd9735faaf/evaluate-0.4.3-py3-none-any.whl (84 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m84.0/84.0 kB\u001b[0m \u001b[31m1.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting tokenizers==0.19.1 (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/ba/26/139bd2371228a0e203da7b3e3eddcb02f45b2b7edd91df00e342e4b55e13/tokenizers-0.19.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.6/3.6 MB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hCollecting safetensors (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/5d/9a/add3e6fef267658075c5a41573c26d42d80c935cdc992384dfae435feaef/safetensors-0.5.3-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (459 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m459.5/459.5 kB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: sentencepiece in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (0.1.99)\n", + "Requirement already satisfied: regex in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (2023.10.3)\n", + "Collecting addict (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl (3.8 kB)\n", + "Requirement already satisfied: ml-dtypes in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (0.2.0)\n", + "Collecting pyctcdecode (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/a5/8a/93e2118411ae5e861d4f4ce65578c62e85d0f1d9cb389bd63bd57130604e/pyctcdecode-0.5.0-py2.py3-none-any.whl (39 kB)\n", + "Requirement already satisfied: jieba in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindnlp) (0.42.1)\n", + 
"Collecting pytest==7.2.0 (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/67/68/a5eb36c3a8540594b6035e6cdae40c1ef1b6a2bfacbecc3d1a544583c078/pytest-7.2.0-py3-none-any.whl (316 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m316.8/316.8 kB\u001b[0m \u001b[31m866.1 kB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting pillow>=10.0.0 (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/0c/55/f182db572b28bd833b8e806f933f782ceb2df64c40e4d8bd3d4226a46eca/pillow-11.1.0-cp39-cp39-manylinux_2_28_aarch64.whl (4.4 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.4/4.4 MB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: attrs>=19.2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pytest==7.2.0->mindnlp) (23.1.0)\n", + "Collecting iniconfig (from pytest==7.2.0->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/ef/a6/62565a6e1cf69e10f5727360368e451d4b7f58beeac6173dc9db836a5b46/iniconfig-2.0.0-py3-none-any.whl (5.9 kB)\n", + "Requirement already satisfied: packaging in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pytest==7.2.0->mindnlp) (23.2)\n", + "Collecting pluggy<2.0,>=0.12 (from pytest==7.2.0->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/88/5f/e351af9a41f866ac3f1fac4ca0613908d9a41741cfcf2228f4ad853b697d/pluggy-1.5.0-py3-none-any.whl (20 kB)\n", + "Requirement already satisfied: exceptiongroup>=1.0.0rc8 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pytest==7.2.0->mindnlp) (1.1.3)\n", + "Collecting tomli>=1.0.0 (from pytest==7.2.0->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/6e/c2/61d3e0f47e2b74ef40a68b9e6ad5984f6241a942f7cd3bbfbdbd03861ea9/tomli-2.2.1-py3-none-any.whl (14 kB)\n", + "Requirement already satisfied: huggingface-hub<1.0,>=0.16.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from tokenizers==0.19.1->mindnlp) (0.18.0)\n", + "Requirement already satisfied: numpy>=1.17.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (1.23.5)\n", + "Requirement already satisfied: protobuf>=3.13.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (3.20.3)\n", + "Requirement already satisfied: asttokens>=2.0.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (2.4.1)\n", + "Requirement already satisfied: scipy>=1.5.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (1.11.3)\n", + "Requirement already satisfied: psutil>=5.6.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (5.9.5)\n", + "Requirement already satisfied: astunparse>=1.6.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore>=2.2.14->mindnlp) (1.6.3)\n", + "Requirement already satisfied: filelock in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets->mindnlp) (3.13.1)\n", + "Collecting pyarrow>=15.0.0 (from datasets->mindnlp)\n", + " Downloading 
http://mirrors.aliyun.com/pypi/packages/f2/87/4ef05a088b18082cde4950bdfca752dd31effb3ec201b8026e4816d0f3fa/pyarrow-19.0.1-cp39-cp39-manylinux_2_28_aarch64.whl (40.5 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m40.5/40.5 MB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hCollecting dill<0.3.9,>=0.3.0 (from datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/c9/7a/cef76fd8438a42f96db64ddaa85280485a9c395e7df3db8158cfec1eee34/dill-0.3.8-py3-none-any.whl (116 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m116.3/116.3 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: pandas in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets->mindnlp) (2.1.2)\n", + "Collecting requests (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/f9/9b/335f9764261e915ed497fcdeb11df5dfd6f7bf257d4a6a2a686d80da4d54/requests-2.32.3-py3-none-any.whl (64 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m64.9/64.9 kB\u001b[0m \u001b[31m1.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting tqdm (from mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/d0/30/dc54f88dd4a2b5dc8a0279bdd7270e735851848b762aeb1c1184ed1f6b14/tqdm-4.67.1-py3-none-any.whl (78 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m78.5/78.5 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting xxhash (from datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/b4/92/9ac297e3487818f429bcf369c1c6a097edf5b56ed6fc1feff4c1882e87ef/xxhash-3.5.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (220 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m220.6/220.6 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting multiprocess<0.70.17 (from datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/da/d9/f7f9379981e39b8c2511c9e0326d212accacb82f12fbfdc1aa2ce2a7b2b6/multiprocess-0.70.16-py39-none-any.whl (133 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m133.4/133.4 kB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: fsspec<=2024.12.0,>=2023.1.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from fsspec[http]<=2024.12.0,>=2023.1.0->datasets->mindnlp) (2023.10.0)\n", + "Collecting aiohttp (from datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/1d/d5/ab9ad5242c7920e224cbdc1c9bec62a79f75884049ccb86edb64225e4c0f/aiohttp-3.11.13-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hCollecting huggingface-hub<1.0,>=0.16.4 (from tokenizers==0.19.1->mindnlp)\n", + " Downloading 
http://mirrors.aliyun.com/pypi/packages/ae/05/75b90de9093de0aadafc868bb2fa7c57651fd8f45384adf39bd77f63980d/huggingface_hub-0.29.1-py3-none-any.whl (468 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m468.0/468.0 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: pyyaml>=5.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from datasets->mindnlp) (6.0.1)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests->mindnlp) (3.3.2)\n", + "Requirement already satisfied: idna<4,>=2.5 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests->mindnlp) (3.4)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests->mindnlp) (2.0.7)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from requests->mindnlp) (2023.7.22)\n", + "Collecting pygtrie<3.0,>=2.1 (from pyctcdecode->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/ec/cd/bd196b2cf014afb1009de8b0f05ecd54011d881944e62763f3c1b1e8ef37/pygtrie-2.5.0-py3-none-any.whl (25 kB)\n", + "Collecting hypothesis<7,>=6.14 (from pyctcdecode->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/3e/15/234573ed76ab2b065c562c72b25ade28ed9d46d0efd347a8599a384521a1/hypothesis-6.127.5-py3-none-any.whl (483 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m483.4/483.4 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: six>=1.12.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from asttokens>=2.0.4->mindspore>=2.2.14->mindnlp) (1.16.0)\n", + "Requirement already satisfied: wheel<1.0,>=0.23.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from astunparse>=1.6.3->mindspore>=2.2.14->mindnlp) (0.41.3)\n", + "Collecting aiohappyeyeballs>=2.3.0 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/44/4c/03fb05f56551828ec67ceb3665e5dc51638042d204983a03b0a1541475b6/aiohappyeyeballs-2.4.6-py3-none-any.whl (14 kB)\n", + "Collecting aiosignal>=1.1.2 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/ec/6a/bc7e17a3e87a2985d3e8f4da4cd0f481060eb78fb08596c42be62c90a4d9/aiosignal-1.3.2-py2.py3-none-any.whl (7.6 kB)\n", + "Collecting async-timeout<6.0,>=4.0 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/fe/ba/e2081de779ca30d473f21f5b30e0e737c438205440784c7dfc81efc2b029/async_timeout-5.0.1-py3-none-any.whl (6.2 kB)\n", + "Collecting frozenlist>=1.1.1 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/08/04/e2fddc92135276e07addbc1cf413acffa0c2d848b3e54cacf684e146df49/frozenlist-1.5.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (241 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m241.8/241.8 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting multidict<7.0,>=4.5 (from aiohttp->datasets->mindnlp)\n", + " Downloading 
http://mirrors.aliyun.com/pypi/packages/89/87/d451d45aab9e422cb0fb2f7720c31a4c1d3012c740483c37f642eba568fb/multidict-6.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (126 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m126.2/126.2 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting propcache>=0.2.0 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/e6/65/09b1bacf723721e36a84034ff0a4d64d13c7ddb92cfefe9c0b861886f814/propcache-0.3.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (208 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m208.1/208.1 kB\u001b[0m \u001b[31m1.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hCollecting yarl<2.0,>=1.17.0 (from aiohttp->datasets->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/0f/4f/438c9fd668954779e48f08c0688ee25e0673380a21bb1e8ccc56de5b55d7/yarl-1.18.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (317 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m317.3/317.3 kB\u001b[0m \u001b[31m1.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0ma \u001b[36m0:00:01\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: typing-extensions>=3.7.4.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from huggingface-hub<1.0,>=0.16.4->tokenizers==0.19.1->mindnlp) (4.8.0)\n", + "Collecting sortedcontainers<3.0.0,>=2.1.0 (from hypothesis<7,>=6.14->pyctcdecode->mindnlp)\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/32/46/9cb0e58b2deb7f82b84065f37f3bffeb12413f947f9388e4cac22c4621ce/sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets->mindnlp) (2.8.2)\n", + "Requirement already satisfied: pytz>=2020.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets->mindnlp) (2023.3.post1)\n", + "Requirement already satisfied: tzdata>=2022.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from pandas->datasets->mindnlp) (2023.3)\n", + "\u001b[33mDEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number. 
Discussion can be found at https://github.com/pypa/pip/issues/12063\u001b[0m\u001b[33m\n", + "\u001b[0mInstalling collected packages: sortedcontainers, pygtrie, addict, xxhash, tqdm, tomli, safetensors, requests, pyarrow, propcache, pluggy, pillow, multidict, iniconfig, hypothesis, frozenlist, dill, async-timeout, aiohappyeyeballs, yarl, pytest, pyctcdecode, multiprocess, huggingface-hub, aiosignal, tokenizers, aiohttp, datasets, evaluate, mindnlp\n", + " Attempting uninstall: tqdm\n", + " Found existing installation: tqdm 4.65.0\n", + " Uninstalling tqdm-4.65.0:\n", + " Successfully uninstalled tqdm-4.65.0\n", + " Attempting uninstall: requests\n", + " Found existing installation: requests 2.31.0\n", + " Uninstalling requests-2.31.0:\n", + " Successfully uninstalled requests-2.31.0\n", + " Attempting uninstall: pillow\n", + " Found existing installation: Pillow 9.0.1\n", + " Uninstalling Pillow-9.0.1:\n", + " Successfully uninstalled Pillow-9.0.1\n", + " Attempting uninstall: huggingface-hub\n", + " Found existing installation: huggingface-hub 0.18.0\n", + " Uninstalling huggingface-hub-0.18.0:\n", + " Successfully uninstalled huggingface-hub-0.18.0\n", + "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", + "gradio 3.50.2 requires pillow<11.0,>=8.0, but you have pillow 11.1.0 which is incompatible.\n", + "imageio 2.31.6 requires pillow<10.1.0,>=8.3.2, but you have pillow 11.1.0 which is incompatible.\n", + "mindtorch 0.3.0 requires tqdm==4.65.0, but you have tqdm 4.67.1 which is incompatible.\u001b[0m\u001b[31m\n", + "\u001b[0mSuccessfully installed addict-2.4.0 aiohappyeyeballs-2.4.6 aiohttp-3.11.13 aiosignal-1.3.2 async-timeout-5.0.1 datasets-3.3.2 dill-0.3.8 evaluate-0.4.3 frozenlist-1.5.0 huggingface-hub-0.29.1 hypothesis-6.127.5 iniconfig-2.0.0 mindnlp-0.4.0 multidict-6.1.0 multiprocess-0.70.16 pillow-11.1.0 pluggy-1.5.0 propcache-0.3.0 pyarrow-19.0.1 pyctcdecode-0.5.0 pygtrie-2.5.0 pytest-7.2.0 requests-2.32.3 safetensors-0.5.3 sortedcontainers-2.4.0 tokenizers-0.19.1 tomli-2.2.1 tqdm-4.67.1 xxhash-3.5.0 yarl-1.18.3\n", + "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. 
It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n", + "\u001b[0mLooking in indexes: http://mirrors.aliyun.com/pypi/simple/\n", + "Collecting mindspore==2.4\n", + " Downloading http://mirrors.aliyun.com/pypi/packages/1b/e4/87dc1ae146f0715fa0ae9c04aab4cb44d07d971cb643c9460d0050d6a031/mindspore-2.4.0-cp39-none-any.whl (333.7 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m333.7/333.7 MB\u001b[0m \u001b[31m1.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m00:06\u001b[0m\n", + "\u001b[?25hRequirement already satisfied: numpy<2.0.0,>=1.20.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (1.23.5)\n", + "Requirement already satisfied: protobuf>=3.13.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (3.20.3)\n", + "Requirement already satisfied: asttokens>=2.0.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (2.4.1)\n", + "Requirement already satisfied: pillow>=6.2.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (11.1.0)\n", + "Requirement already satisfied: scipy>=1.5.4 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (1.11.3)\n", + "Requirement already satisfied: packaging>=20.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (23.2)\n", + "Requirement already satisfied: psutil>=5.6.1 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (5.9.5)\n", + "Requirement already satisfied: astunparse>=1.6.3 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (1.6.3)\n", + "Requirement already satisfied: safetensors>=0.4.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from mindspore==2.4) (0.5.3)\n", + "Requirement already satisfied: six>=1.12.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from asttokens>=2.0.4->mindspore==2.4) (1.16.0)\n", + "Requirement already satisfied: wheel<1.0,>=0.23.0 in /home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages (from astunparse>=1.6.3->mindspore==2.4) (0.41.3)\n", + "\u001b[33mDEPRECATION: moxing-framework 2.1.16.2ae09d45 has a non-standard version number. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of moxing-framework or contact the author to suggest that they release a version with a conforming version number. Discussion can be found at https://github.com/pypa/pip/issues/12063\u001b[0m\u001b[33m\n", + "\u001b[0mInstalling collected packages: mindspore\n", + " Attempting uninstall: mindspore\n", + " Found existing installation: mindspore 2.3.0\n", + " Uninstalling mindspore-2.3.0:\n", + " Successfully uninstalled mindspore-2.3.0\n", + "Successfully installed mindspore-2.4.0\n", + "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. 
It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
+ "\u001b[0m" + ] + } + ], + "source": [
+ "# On the Ascend 910B1 environment, the following additional setup is required\n",
+ "# !pip install mindnlp\n",
+ "# !pip install mindspore==2.4\n",
+ "# !export LD_PRELOAD=$LD_PRELOAD:/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/torch.libs/libgomp-74ff64e9.so.1.0.0\n",
+ "# !yum install libsndfile" + ] + }, + { + "cell_type": "markdown", + "id": "d780a67a", + "metadata": {}, + "source": [
+ "## Import libraries\n",
+ "Note that several different tokenizer classes were imported and tested here.\n",
+ "The MindSpore execution environment must be set to Ascend." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "d127981e", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.592 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclmdlBundleGetModelId failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclmdlBundleGetModelId\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.674 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclmdlBundleLoadFromMem failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclmdlBundleLoadFromMem\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.715 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclmdlBundleUnload failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclmdlBundleUnload\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.909 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclrtGetMemUceInfo failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclrtGetMemUceInfo\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.926 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclrtDeviceTaskAbort failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclrtDeviceTaskAbort\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.325.941 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol aclrtMemUceRepair failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so: undefined symbol: aclrtMemUceRepair\n",
+ "[WARNING] GE_ADPT(37,ffff8709e010,python):2025-03-04-11:16:41.327.779 [mindspore/ccsrc/utils/dlopen_macro.h:163] DlsymAscend] Dynamically load symbol acltdtCleanChannel failed, result = /usr/local/Ascend/ascend-toolkit/latest/lib64/libacl_tdt_channel.so: undefined symbol: acltdtCleanChannel\n",
+ "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:41.550.830 [mindspore/run_check/_check_version.py:327] MindSpore version 2.4.0 and Ascend AI software package (Ascend Data Center Solution)version 7.2 does not match, the version of software package expect one of ['7.3', '7.5']. Please refer to the match info on: https://www.mindspore.cn/install\n",
+ "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:41.554.596 [mindspore/run_check/_check_version.py:396] Can not find the tbe operator implementation(need by mindspore-ascend). Please check whether the Environment Variable PYTHONPATH is set. 
For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:44.300.46 [mindspore/run_check/_check_version.py:345] MindSpore version 2.4.0 and \"te\" wheel package version 7.2 does not match. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:44.341.43 [mindspore/run_check/_check_version.py:352] MindSpore version 2.4.0 and \"hccl\" wheel package version 7.2 does not match. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:44.358.17 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 3\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:45.385.67 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 2\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:46.419.87 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 1\n", + "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", + " from .autonotebook import tqdm as notebook_tqdm\n", + "Building prefix dict from the default dictionary ...\n", + "Dumping model to file cache /tmp/jieba.cache\n", + "Loading model cost 1.375 seconds.\n", + "Prefix dict has been built successfully.\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:55.820.621 [mindspore/run_check/_check_version.py:327] MindSpore version 2.4.0 and Ascend AI software package (Ascend Data Center Solution)version 7.2 does not match, the version of software package expect one of ['7.3', '7.5']. Please refer to the match info on: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:55.827.619 [mindspore/run_check/_check_version.py:396] Can not find the tbe operator implementation(need by mindspore-ascend). Please check whether the Environment Variable PYTHONPATH is set. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:55.828.586 [mindspore/run_check/_check_version.py:345] MindSpore version 2.4.0 and \"te\" wheel package version 7.2 does not match. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:55.829.144 [mindspore/run_check/_check_version.py:352] MindSpore version 2.4.0 and \"hccl\" wheel package version 7.2 does not match. 
For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:55.829.808 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 3\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:56.831.621 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 2\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:57.834.664 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 1\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:58.839.664 [mindspore/run_check/_check_version.py:327] MindSpore version 2.4.0 and Ascend AI software package (Ascend Data Center Solution)version 7.2 does not match, the version of software package expect one of ['7.3', '7.5']. Please refer to the match info on: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:58.843.964 [mindspore/run_check/_check_version.py:396] Can not find the tbe operator implementation(need by mindspore-ascend). Please check whether the Environment Variable PYTHONPATH is set. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:58.845.048 [mindspore/run_check/_check_version.py:345] MindSpore version 2.4.0 and \"te\" wheel package version 7.2 does not match. For details, refer to the installation guidelines: https://www.mindspore.cn/install\n", + "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:58.845.711 [mindspore/run_check/_check_version.py:352] MindSpore version 2.4.0 and \"hccl\" wheel package version 7.2 does not match. 
For details, refer to the installation guidelines: https://www.mindspore.cn/install\n",
+ "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:58.846.365 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 3\n",
+ "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:16:59.848.213 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 2\n",
+ "[WARNING] ME(37:281472947314704,MainProcess):2025-03-04-11:17:00.851.249 [mindspore/run_check/_check_version.py:366] Please pay attention to the above warning, countdown: 1\n" + ] + } + ], + "source": [
+ "import os\n",
+ "from mindnlp.transformers import (\n",
+ "    BigBirdPegasusForCausalLM,\n",
+ "    PegasusTokenizer,\n",
+ "    AutoTokenizer\n",
+ ")\n",
+ "from datasets import load_dataset, DatasetDict\n",
+ "from mindspore.dataset import GeneratorDataset\n",
+ "from mindnlp.engine import Trainer, TrainingArguments\n",
+ "import mindspore as ms\n",
+ "# Set the execution mode and target device\n",
+ "ms.set_context(mode=ms.PYNATIVE_MODE, device_target=\"Ascend\")" + ] + }, + { + "cell_type": "markdown", + "id": "dbcec2d3", + "metadata": {}, + "source": [
+ "## Prepare the dataset\n",
+ "To allow fast, repeated fine-tuning runs, the dataset is processed once and then saved locally. Note that BigBirdPegasusForCausalLM is used here, i.e. a causal language model, so the dataset has to be converted into plain prompt text accordingly. A hypothetical example of the resulting prompt layout is shown right after the processing cell below." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "caec8504", + "metadata": {}, + "outputs": [], + "source": [
+ "# Path where the processed dataset is cached\n",
+ "dataset_path = \"./processed_dataset\"\n",
+ "# Reuse the processed dataset if it already exists on disk\n",
+ "if os.path.exists(dataset_path):\n",
+ "    dataset = DatasetDict.load_from_disk(dataset_path)\n",
+ "    train_dataset = dataset[\"train\"]\n",
+ "    eval_dataset = dataset[\"eval\"]\n",
+ "else:\n",
+ "    # Otherwise, load and process the raw dataset\n",
+ "    dataset = load_dataset(\"databricks/databricks-dolly-15k\")\n",
+ "    print(dataset)\n",
+ "\n",
+ "    def format_prompt(sample):\n",
+ "        instruction = f\"### Instruction\\n{sample['instruction']}\"\n",
+ "        context = f\"### Context\\n{sample['context']}\" if len(sample[\"context\"]) > 0 else None\n",
+ "        response = f\"### Answer\\n{sample['response']}\"\n",
+ "        prompt = \"\\n\\n\".join([i for i in [instruction, context, response] if i is not None])\n",
+ "        sample[\"prompt\"] = prompt\n",
+ "        return sample\n",
+ "\n",
+ "    dataset = dataset.map(format_prompt)\n",
+ "    dataset = dataset.remove_columns(['instruction', 'context', 'response', 'category'])\n",
+ "    train_dataset = dataset[\"train\"].select(range(0, 40))\n",
+ "    eval_dataset = dataset[\"train\"].select(range(40, 50))\n",
+ "    # print(train_dataset)\n",
+ "    # print(eval_dataset)\n",
+ "    # print(train_dataset[0])\n",
+ "    # Save the processed dataset for later runs\n",
+ "    dataset = DatasetDict({\"train\": train_dataset, \"eval\": eval_dataset})\n",
+ "    dataset.save_to_disk(dataset_path)" + ] + },
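+ { + "cell_type": "markdown", + "id": "prompt-demo-md", + "metadata": {}, + "source": [
+ "To make the prompt layout concrete, the next cell runs `format_prompt` on a small hand-written sample. This cell is an illustrative sketch: the sample is hypothetical (it is not taken from the dataset), and it assumes the processing branch above actually ran, so that `format_prompt` is defined." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "prompt-demo-code", + "metadata": {}, + "outputs": [], + "source": [
+ "# Hypothetical sample, purely to illustrate the layout produced by format_prompt.\n",
+ "demo = {\n",
+ "    \"instruction\": \"What is a language model?\",\n",
+ "    \"context\": \"\",\n",
+ "    \"response\": \"A language model assigns probabilities to sequences of tokens.\",\n",
+ "}\n",
+ "print(format_prompt(demo)[\"prompt\"])\n",
+ "# Expected output (the Context block is dropped because context is empty):\n",
+ "# ### Instruction\n",
+ "# What is a language model?\n",
+ "#\n",
+ "# ### Answer\n",
+ "# A language model assigns probabilities to sequences of tokens." + ] + },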
\n", + " warnings.warn(\n", + "BigBirdPegasusForCausalLM has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`.`PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.\n", + " - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).\n", + " - If you are not the owner of the model architecture class, please contact the model code owner to update it.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[MS_ALLOC_CONF]Runtime config: enable_vmm:True vmm_align_size:2MB\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "[WARNING] DEVICE(37,fffd60ebb0e0,python):2025-03-04-11:17:17.714.431 [mindspore/ccsrc/transform/acl_ir/op_api_convert.h:114] GetOpApiFunc] Dlsym aclSetAclOpExecutorRepeatable failed!\n", + "[WARNING] KERNEL(37,fffd60ebb0e0,python):2025-03-04-11:17:17.714.567 [mindspore/ccsrc/transform/acl_ir/op_api_cache.h:54] SetExecutorRepeatable] The aclSetAclOpExecutorRepeatable is unavailable, which results in aclnn cache miss.\n", + "[WARNING] DEVICE(37,fffd5abce0e0,python):2025-03-04-11:17:17.732.921 [mindspore/ccsrc/transform/acl_ir/op_api_convert.h:114] GetOpApiFunc] Dlsym aclDestroyAclOpExecutor failed!\n" + ] + } + ], + "source": [ + "model_name = \"google/bigbird-pegasus-large-arxiv\"\n", + "tokenizer_name = \"google/bigbird-pegasus-large-arxiv\"\n", + "tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)\n", + "# tokenizer = PegasusTokenizer.from_pretrained(tokenizer_name)\n", + "tokenizer.pad_token = tokenizer.eos_token \n", + "model = BigBirdPegasusForCausalLM.from_pretrained(model_name)" + ] + }, + { + "cell_type": "markdown", + "id": "bbda48b5", + "metadata": {}, + "source": [ + "## 将数据集预处理为训练格式\n", + "这里在mindnlp中没有找到类似transformer中DataCollatorForLanguageModeling的工具,所以需要自己编写padding和truncation。\n", + "这里输出了处理过的数据集与torch的进行对比,保证获得的数据集是一样的。" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "fe44b259", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "train_dataset: \n", + "eval_dataset: \n", + "{'input_ids': Tensor(shape=[256], dtype=Int64, value= [ 110, 63444, 26323, 463, 117, 114, 110, 84040, 5551, 41676, 152, 110, 63444, 30058, 222, 22600, 108, 114, 110, 84040, 5551, 41676, 117, 142, \n", + " 8091, 41676, 120, 117, 263, 112, 37525, 523, 108, 120, 117, 108, 112, 1910, 523, 190, 203, 31059, 2274, 143, 544, 1613, 113, 109, \n", + " 12091, 250, 10008, 44069, 143, 10209, 116, 158, 113, 523, 138, 129, 53136, 141, 109, 41676, 134, 291, 10269, 107, 182, 117, 114, 711, \n", + " 113, 109, 41676, 1001, 131, 116, 4224, 113, 67669, 7775, 122, 30671, 143, 84040, 2928, 250, 10879, 108, 895, 44069, 143, 6388, 158, 11213, \n", + " 114, 1934, 28593, 197, 6306, 44069, 143, 11753, 250, 139, 31757, 113, 695, 523, 190, 1613, 141, 114, 41676, 1358, 6381, 15121, 12455, 112, \n", + " 10796, 120, 695, 523, 13333, 113, 114, 3173, 113, 291, 1613, 107, 110, 63444, 13641, 202, 110, 84040, 5551, 41676, 117, 142, 8091, 41676, \n", + " 120, 37525, 116, 109, 523, 131, 116, 291, 44069, 134, 291, 10269, 107, 434, 695, 523, 117, 66437, 224, 114, 110, 84040, 5551, 41676, \n", + " 126, 138, 1910, 190, 109, 291, 1613, 113, 109, 12091, 107, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", 
+ " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), 'attention_mask': Tensor(shape=[256], dtype=Int64, value= [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \n", + " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \n", + " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \n", + " 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), 'labels': Tensor(shape=[256], dtype=Int64, value= [ 110, 63444, 26323, 463, 117, 114, 110, 84040, 5551, 41676, 152, 110, 63444, 30058, 222, 22600, 108, 114, 110, 84040, 5551, 41676, 117, 142, \n", + " 8091, 41676, 120, 117, 263, 112, 37525, 523, 108, 120, 117, 108, 112, 1910, 523, 190, 203, 31059, 2274, 143, 544, 1613, 113, 109, \n", + " 12091, 250, 10008, 44069, 143, 10209, 116, 158, 113, 523, 138, 129, 53136, 141, 109, 41676, 134, 291, 10269, 107, 182, 117, 114, 711, \n", + " 113, 109, 41676, 1001, 131, 116, 4224, 113, 67669, 7775, 122, 30671, 143, 84040, 2928, 250, 10879, 108, 895, 44069, 143, 6388, 158, 11213, \n", + " 114, 1934, 28593, 197, 6306, 44069, 143, 11753, 250, 139, 31757, 113, 695, 523, 190, 1613, 141, 114, 41676, 1358, 6381, 15121, 12455, 112, \n", + " 10796, 120, 695, 523, 13333, 113, 114, 3173, 113, 291, 1613, 107, 110, 63444, 13641, 202, 110, 84040, 5551, 41676, 117, 142, 8091, 41676, \n", + " 120, 37525, 116, 109, 523, 131, 116, 291, 44069, 134, 291, 10269, 107, 434, 695, 523, 117, 66437, 224, 114, 110, 84040, 5551, 41676, \n", + " 126, 138, 1910, 190, 109, 291, 1613, 113, 109, 12091, 107, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \n", + " 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])}\n" + ] + } + ], + "source": [ + "class TextDataset:\n", + " def __init__(self, data):\n", + " self.data = data\n", + " # 这里就是个padding和truncation截断的操作\n", + " def __getitem__(self, index):\n", + " index = int(index)\n", + " text = self.data[index][\"prompt\"]\n", + " inputs = tokenizer(text, padding='max_length', max_length=256, truncation=True)\n", + " return (\n", + " inputs[\"input_ids\"], \n", + " inputs[\"attention_mask\"],\n", + " inputs[\"input_ids\"] # 添加labels\n", + " )\n", + "\n", + " def __len__(self):\n", + " return len(self.data)\n", + "train_dataset = GeneratorDataset(\n", + " TextDataset(train_dataset),\n", + " column_names=[\"input_ids\", \"attention_mask\", \"labels\"], # 添加labels\n", + " shuffle=True\n", + ")\n", + "eval_dataset = GeneratorDataset(\n", + " TextDataset(eval_dataset),\n", + " column_names=[\"input_ids\", \"attention_mask\", \"labels\"], # 添加labels\n", + " shuffle=False\n", + ")\n", + "print(\"train_dataset:\", train_dataset)\n", + "print(\"eval_dataset:\", eval_dataset)\n", + "for data 
+ { + "cell_type": "markdown", + "id": "8e3ddebb", + "metadata": {}, + "source": [
+ "## Configure the Trainer and train\n",
+ "The training arguments here must match the ones used for the torch run; the loss recorded during this run is then compared against the torch results." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "d3fe864b", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + " 0%| | 0/100 [00:00