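In the following cases, the sign of the zero returned on the GPU differs between float32 and float64. For an input of -0., both sin and expm1 are expected to return -0. (the IEEE 754 convention, which the Python array API standard also lists as a special case), so the float32 sin result and the float64 expm1 result appear to be incorrect.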
>>> import dpctl.tensor as dpt
>>> dpt.sin(dpt.asarray(-0., dtype='f4', device='gpu'))
usm_ndarray(0., dtype=float32)
>>> dpt.sin(dpt.asarray(-0., dtype='f8', device='gpu'))
usm_ndarray(-0.)
>>> dpt.expm1(dpt.asarray(-0., dtype='f4', device='gpu'))
usm_ndarray(-0., dtype=float32)
>>> dpt.expm1(dpt.asarray(-0., dtype='f8', device='gpu'))
usm_ndarray(0.)
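
Because -0. == 0. compares equal, an ordinary equality check will not catch this kind of mismatch; the sign bit has to be inspected explicitly. The snippet below is a minimal host-side sketch using only the Python standard library, not dpctl. The helper name `is_negative_zero` is hypothetical, and the reference values assume the host math library follows the IEEE 754 signed-zero convention.

```python
import math


def is_negative_zero(x: float) -> bool:
    """Return True only for -0.0.

    `x == 0.0` alone cannot distinguish -0.0 from 0.0, so the sign is
    read separately with math.copysign.
    """
    return x == 0.0 and math.copysign(1.0, x) < 0.0


# Host (CPU) reference values for the two functions in this report.
# Assumption: the platform libm preserves the sign of a zero argument.
ref_sin = math.sin(-0.0)
ref_expm1 = math.expm1(-0.0)

print("sin(-0.) is -0.:", is_negative_zero(ref_sin))
print("expm1(-0.) is -0.:", is_negative_zero(ref_expm1))
```

To apply the same check to a device result, the value would first need to be copied to the host (for example with `dpt.asnumpy`) and converted to a Python float.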