Is support planned for NEFF and its runtime? I'm currently writing a Triton DLR backend, and a unified backend entrypoint to the Neuron runtime when INF1 instances are specified would be very nice.
I know Neuron uses a TVM frontend, so I understand it may be best to simply pick one path: either use the raw TVM runtime exposed by DLR, or compile the model via Neo, targeted at INF1 using Neuron. However, Neuron's use of a TVM frontend is something of a black box, and it doesn't allow passing TVM artifacts (.so, etc.) directly to neuron-cc. This limits use cases such as classical ML models compiled to TVM via HummingbirdML.
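To make the "unified backend entrypoint" idea a bit more concrete, here is a rough sketch of the dispatch I have in mind. Everything in it is hypothetical: `ModelArtifact`, `select_runtime`, and the runtime labels are illustrative names only, not part of DLR, Triton, or the Neuron SDK. The only assumptions are that a NEFF artifact can be recognized by a `.neff` extension and that the target instance type is known when the model is loaded.

```python
from dataclasses import dataclass

# Hypothetical runtime labels, for illustration only.
TVM_RUNTIME = "tvm"
NEURON_RUNTIME = "neuron"


@dataclass(frozen=True)
class ModelArtifact:
    """A compiled model on disk (hypothetical type)."""
    path: str


def select_runtime(artifact: ModelArtifact, instance_type: str) -> str:
    """Pick a runtime from the artifact type and the target instance type.

    Assumption: NEFF artifacts end in ".neff" and only run on INF1;
    anything else (e.g. a TVM .so) goes to the TVM runtime.
    """
    is_inf1 = instance_type.lower().startswith("inf1")
    is_neff = artifact.path.lower().endswith(".neff")

    if is_neff and not is_inf1:
        raise ValueError(
            f"{artifact.path} is a NEFF artifact but {instance_type} is not INF1"
        )
    if is_neff:
        return NEURON_RUNTIME
    return TVM_RUNTIME
```

With something like this, a backend could call `select_runtime(ModelArtifact("model.neff"), "inf1.xlarge")` once at load time and route to the matching runtime, while TVM artifacts such as a HummingbirdML-compiled `.so` would keep going through the existing TVM path.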