Add __float2half_rn for cuda compute capabilities less than 53 #4489

reminisce · 2019-12-10T06:12:22Z

Description

When float32 constants are converted to float16, __float2half_rn is invoked. This function is officially defined in cuda_fp16.h and is included only for CUDA compute capabilities >= 53. This PR adds the same function to the CUDA codegen for compute capabilities < 53.

Tested on sm_37 and sm_70.

Thanks to @yzhliu for the pointers.
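For readers unfamiliar with what a software stand-in for the conversion described above has to do, here is a minimal host-side C++ sketch of a float32 to float16 conversion with round-to-nearest-even, the rounding mode the _rn suffix denotes. This is not the code added by the PR. The helper name float_to_half_bits_rn and the choice to return raw 16-bit patterns are assumptions made for illustration only; a device-side version would presumably wrap the same bit manipulation in a __device__ function that returns a half.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): convert an IEEE-754 binary32 value
// to the bit pattern of the nearest binary16 value, rounding ties to even.
inline std::uint16_t float_to_half_bits_rn(float value) {
  std::uint32_t x;
  std::memcpy(&x, &value, sizeof(x));  // reinterpret the float's bits safely

  const std::uint32_t sign = (x >> 16) & 0x8000u;  // sign moved to bit 15
  const std::uint32_t abs = x & 0x7FFFFFFFu;       // magnitude bits only

  // NaN maps to a quiet NaN; infinity stays infinity.
  if (abs > 0x7F800000u) return static_cast<std::uint16_t>(sign | 0x7E00u);
  // Infinity, or anything at or above 65520, rounds to half infinity.
  if (abs >= 0x477FF000u) return static_cast<std::uint16_t>(sign | 0x7C00u);

  // Normal half range: magnitude of at least 2^-14.
  if (abs >= 0x38800000u) {
    std::uint32_t v = abs - 0x38000000u;   // rebias exponent from 127 to 15
    const std::uint32_t lsb = (v >> 13) & 1u;
    v += 0x0FFFu + lsb;                    // round to nearest, ties to even
    return static_cast<std::uint16_t>(sign | (v >> 13));
  }

  // At or below 2^-25 the nearest half value is (signed) zero.
  if (abs <= 0x33000000u) return static_cast<std::uint16_t>(sign);

  // Subnormal half range: result is an integer multiple of 2^-24.
  const std::uint32_t shift = 126u - (abs >> 23);               // 14..24 here
  const std::uint32_t man = (abs & 0x007FFFFFu) | 0x00800000u;  // implicit 1
  std::uint32_t m = man >> shift;
  const std::uint32_t rem = man & ((1u << shift) - 1u);
  const std::uint32_t half = 1u << (shift - 1u);
  if (rem > half || (rem == half && (m & 1u))) ++m;  // ties to even
  return static_cast<std::uint16_t>(sign | m);
}

// Small usage example with well-known float16 bit patterns.
inline void check_float_to_half_examples() {
  assert(float_to_half_bits_rn(1.0f) == 0x3C00);
  assert(float_to_half_bits_rn(-2.0f) == 0xC000);
  assert(float_to_half_bits_rn(65504.0f) == 0x7BFF);  // largest finite half
}
```

Working on raw bit patterns keeps the sketch independent of any half type, so it compiles with an ordinary host compiler and can be checked without a GPU.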

yzhliu · 2019-12-10T22:06:01Z

Thanks @reminisce

reminisce added 2 commits December 9, 2019 21:57

Fix

baacd41

clean up

f52ef87

yzhliu approved these changes Dec 10, 2019

yzhliu merged commit e47bc1d into apache:master Dec 10, 2019

yzhliu added the status: accepted label Dec 10, 2019

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Dec 13, 2019

Add __float2half_rn for cuda compute capabilities less than 53 (apach…

b6174e8

…e#4489) * Fix * clean up

zxy844288792 pushed a commit to neo-ai/tvm that referenced this pull request Dec 13, 2019

Add __float2half_rn for cuda compute capabilities less than 53 (apach…

5440cc8

…e#4489) * Fix * clean up

zhiics mentioned this pull request Sep 15, 2020

TVM v0.7 Release Note Candidate #6486

Closed
