- There are no optimizations other than memory access coalescing. I am a CUDA beginner and would welcome optimization suggestions : )
- Install
pip install -e .
Make sure the version of nvcc on your PATH is compatible with your current PyTorch version (a minor version difference seems to be OK).
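If the build fails, it can help to compare the two versions directly. The snippet below is a minimal sketch, not part of this repo: the helper names (`parse_nvcc_release`, `parse_cuda_version`, `same_major`, `check_toolchain`) are made up for illustration, and treating "same major version" as compatible is an assumption based on the note above, not a guarantee from PyTorch.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_cuda_version(version_string):
    """Turn a string such as "12.1" (the form of torch.version.cuda) into (12, 1)."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def same_major(nvcc_version, torch_cuda_version):
    """Assumption: versions are treated as compatible when the major numbers match."""
    return nvcc_version[0] == torch_cuda_version[0]


def check_toolchain():
    """Print the nvcc and PyTorch CUDA versions and whether their majors match."""
    import torch  # imported here so the helpers above work without PyTorch

    if torch.version.cuda is None:
        print("This PyTorch build has no CUDA support.")
        return False

    nvcc_path = shutil.which("nvcc")
    if nvcc_path is None:
        print("nvcc was not found on PATH.")
        return False

    output = subprocess.run(
        [nvcc_path, "--version"], capture_output=True, text=True, check=True
    ).stdout
    nvcc_version = parse_nvcc_release(output)
    if nvcc_version is None:
        print("Could not parse the nvcc version.")
        return False

    torch_cuda_version = parse_cuda_version(torch.version.cuda)
    print("nvcc:", nvcc_version, "PyTorch CUDA:", torch_cuda_version)
    return same_major(nvcc_version, torch_cuda_version)
```

Calling `check_toolchain()` from a Python prompt in the same environment prints both versions and returns `False` when nvcc is missing or the major versions differ.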
- Run
- Run the test on MNIST:
python cheby_test.py
- Run the benchmark (code from KAN-benchmarking):
python benchmark.py --method all --reps 100 --just-cuda