Add __float2half_rn for cuda compute capabilities less than 53 #4489

Merged 2 commits into apache:master on Dec 10, 2019


reminisce (Contributor) commented Dec 10, 2019

Description

Converting float32 constants to float16 invokes __float2half_rn. That function is officially defined in cuda_fp16.h, which is included for CUDA compute capabilities >= 53. This PR adds the same function to the CUDA codegen for compute capabilities < 53.

Tested on sm_37 and sm_70.
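
For readers unfamiliar with what a round-to-nearest float32-to-float16 conversion involves, here is a minimal host-side C++ sketch. It is not the code added by this PR: the name float_to_half_bits_rn is hypothetical, it returns the raw 16-bit pattern rather than a half value, and it assumes IEEE 754 binary32/binary16 layouts with ties rounded to even.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): convert a float32 to the bit
// pattern of the nearest float16, rounding ties to even.
inline uint16_t float_to_half_bits_rn(float f) {
  uint32_t x;
  std::memcpy(&x, &f, sizeof(x));  // reinterpret the float bits safely
  const uint32_t sign = (x >> 16) & 0x8000u;
  const uint32_t abs = x & 0x7fffffffu;

  if (abs > 0x7f800000u) {  // NaN maps to a quiet NaN
    return static_cast<uint16_t>(sign | 0x7e00u);
  }
  if (abs >= 0x47800000u) {  // magnitude >= 2^16, or infinity
    return static_cast<uint16_t>(sign | 0x7c00u);
  }
  if (abs >= 0x38800000u) {  // result is a normal half (>= 2^-14)
    // Rebias the exponent from 127 to 15 and drop 13 mantissa bits.
    uint32_t h = (abs - 0x38000000u) >> 13;
    const uint32_t rem = abs & 0x1fffu;
    // Round to nearest, ties to even; a carry may overflow into infinity.
    if (rem > 0x1000u || (rem == 0x1000u && (h & 1u))) ++h;
    return static_cast<uint16_t>(sign | h);
  }

  const uint32_t e = abs >> 23;  // biased float32 exponent
  if (e < 102u) {                // magnitude below 2^-25 rounds to zero
    return static_cast<uint16_t>(sign);
  }
  // Result is a subnormal half: shift the 24-bit significand into units
  // of 2^-24, again rounding ties to even.
  const uint32_t m = (abs & 0x007fffffu) | 0x00800000u;
  const uint32_t shift = 126u - e;  // between 14 and 24
  uint32_t h = m >> shift;
  const uint32_t rem = m & ((1u << shift) - 1u);
  const uint32_t half = 1u << (shift - 1u);
  if (rem > half || (rem == half && (h & 1u))) ++h;
  return static_cast<uint16_t>(sign | h);
}
```

A device-side version would typically wrap the same bit manipulation in a __device__ function; that part is left out here so the sketch compiles with a plain C++ compiler.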

Thanks to @yzhliu for the pointers.

@yzhliu yzhliu merged commit e47bc1d into apache:master Dec 10, 2019
yzhliu (Member) commented Dec 10, 2019

Thanks @reminisce

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Dec 13, 2019
zxy844288792 pushed a commit to neo-ai/tvm that referenced this pull request Dec 13, 2019