autograph v0.1.1

@charles-r-earp charles-r-earp released this 12 Dec 03:24

Profiling

Currently requires a nightly toolchain and the "profile" feature. Set the AUTOGRAPH_PROFILE environment variable to 1 or True to produce a table of statistics for the compute passes that are executed.

AUTOGRAPH_PROFILE=1 cargo +nightly run --features profile
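As an illustration of how a switch like this can be read, here is a minimal sketch. It is not autograph's actual code: the function names `profile_enabled_from` and `profile_enabled` are hypothetical, and treating "True" case-insensitively is an assumption (the notes only list 1 and True).

```rust
use std::env;

/// Hypothetical helper: decides whether profiling is on from the raw
/// value of the variable. "1" and "True" enable it, as in the notes;
/// accepting other capitalizations of "true" is an assumption.
fn profile_enabled_from(value: Option<&str>) -> bool {
    match value {
        Some(v) => v == "1" || v.eq_ignore_ascii_case("true"),
        // Unset variable: profiling stays off.
        None => false,
    }
}

/// Hypothetical helper: reads AUTOGRAPH_PROFILE from the process environment.
fn profile_enabled() -> bool {
    profile_enabled_from(env::var("AUTOGRAPH_PROFILE").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the rule easy to check without touching process state.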

Rust GEMM

Improved performance on Neural Network MNIST example (Lenet5) by 5x.

  • Implemented in Rust for u32, i32, f32
    • bf16 not yet implemented
  • Unrolled loops with crunchy
  • Work per thread (1x1, 2x2, 4x4) micro tiles
  • SplitK variant (256) for small m or n and large k
    • Atomically accumulates with multiple work groups

Tensor

  • Added Tensor::ones method.

Neural Networks

  • Allowed SGD learning_rate = 1.0
  • MeanPool
  • Fixed correctness issues
    • Cross Entropy Loss
    • Sum
    • Test accuracy improved to ~99% on Neural Network MNIST example (Lenet5)
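For checking a cross entropy implementation against known values, a small CPU reference can help. The sketch below is not autograph's implementation; the function name `cross_entropy` is hypothetical, and it assumes raw logits for a single sample with an integer class label, using the standard log-sum-exp trick for numerical stability.

```rust
/// Hypothetical reference: cross entropy of one sample, computed from raw
/// logits as -log(softmax(logits)[target]).
fn cross_entropy(logits: &[f32], target: usize) -> f32 {
    assert!(target < logits.len(), "target class out of range");
    // Subtracting the maximum keeps exp() from overflowing.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let sum_exp: f32 = logits.iter().map(|&x| (x - max).exp()).sum();
    // log(sum_j exp(x_j - max)) - (x_target - max)
    sum_exp.ln() - (logits[target] - max)
}
```

Two easy sanity checks: uniform logits over 4 classes should give a loss of ln 4, and a very large logit on the correct class should give a finite loss close to 0.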

Examples

  • Added shuffling of training batches

Benchmark

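Added a Neural Network Benchmark to compare performance with other libraries. Per epoch, training is now ~2.7x slower than tch (NVIDIA GeForce GTX 1060 with Max-Q Design) with similar test accuracy. autograph also needs more epochs to reach its best accuracy (69 vs 32), so total time to best accuracy is longer still.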

+-----------+------------+---------------+-----------------------+----------------------------------+
| Library   | Best Epoch | Best Accuracy | Time To Best Accuracy | Mean Epoch Time to Best Accuracy |
+===========+============+===============+=======================+==================================+
| autograph | 69         | 99.04%        | 127.38s               | 1.85s                            |
+-----------+------------+---------------+-----------------------+----------------------------------+
| tch       | 32         | 99.12%        | 22.03s                | 688.31ms                         |
+-----------+------------+---------------+-----------------------+----------------------------------+
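The ratios implied by the table can be recomputed with a few lines. This is just arithmetic on the numbers above; the helper names `slowdown` and `mean_epoch_secs` are hypothetical.

```rust
/// How many times slower `ours_secs` is than `theirs_secs`.
fn slowdown(ours_secs: f64, theirs_secs: f64) -> f64 {
    ours_secs / theirs_secs
}

/// Mean epoch time, given the total time taken to reach the best epoch.
fn mean_epoch_secs(total_secs: f64, epochs: u32) -> f64 {
    total_secs / f64::from(epochs)
}
```

`slowdown(1.85, 0.68831)` gives roughly 2.69 for mean epoch time, while `slowdown(127.38, 22.03)` gives roughly 5.78 for time to best accuracy. `mean_epoch_secs(127.38, 69)` and `mean_epoch_secs(22.03, 32)` reproduce the last column to within rounding.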