
#fix qwen2 abnormal loss caused by SoftmaxCrossEntropyWithLogits on 910A/B #2034


Open
wants to merge 4 commits into base: 0.4

Conversation

xuhangscut
Contributor

@xuhangscut xuhangscut commented May 6, 2025

Using the SoftmaxCrossEntropyWithLogits function, the training loss curve looks like this:
[image: training loss curve with SoftmaxCrossEntropyWithLogits]
It should instead follow the trend of PyTorch's CrossEntropyLoss, like this:
[image: reference loss curve with CrossEntropyLoss in PyTorch]
As the charts show, the loss decreases less significantly with SoftmaxCrossEntropyWithLogits, and the model's ability measured in testing is weaker.
This PR keeps the SoftmaxCrossEntropyWithLogits code path for use on Orange Pi.

Experiment environment:
python 3.9.10
mindspore 2.5.0
Ascend 910A/B
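To illustrate the difference between the two loss APIs discussed above, here is a minimal NumPy sketch (not the PR's actual MindSpore code; function names and the padding scenario are hypothetical for illustration). SoftmaxCrossEntropyWithLogits-style losses take one-hot targets and score every position, while a CrossEntropyLoss-style API takes class indices and can skip padding via an `ignore_index`, which changes the reported loss curve on padded sequences:

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    m = logits.max(axis=-1, keepdims=True)
    return logits - m - np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))

def ce_onehot(logits, onehot):
    # One-hot formulation (SoftmaxCrossEntropyWithLogits style):
    # every position contributes, padding included.
    return -(onehot * log_softmax(logits)).sum(axis=-1)

def ce_index(logits, labels, ignore_index=-100):
    # Class-index formulation (CrossEntropyLoss style):
    # positions labeled ignore_index contribute zero loss.
    logp = log_softmax(logits)
    mask = labels != ignore_index
    safe = np.where(mask, labels, 0)
    per_token = -np.take_along_axis(logp, safe[:, None], axis=-1).squeeze(-1)
    return per_token * mask

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 6)).astype(np.float32)
labels = np.array([1, 3, -100, 5])           # third token is padding
onehot = np.eye(6)[np.where(labels == -100, 0, labels)]

valid = labels != -100
# On valid tokens the two formulations agree; they differ only in
# how padded positions are treated.
assert np.allclose(ce_onehot(logits, onehot)[valid],
                   ce_index(logits, labels)[valid])
```

On the valid tokens the two losses are mathematically identical, so any divergence in the training curve comes from how masked/padded positions (and kernel-level numerics on the target hardware) are handled, not from the cross-entropy definition itself.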
