Awesome State-Space Resources for ML

Contributions are welcome! Please read the contribution guidelines before contributing.

Table of Contents

Tutorials

Blogposts

  1. S4 Series
  2. The Annotated S4
  3. The Annotated S4D
  4. The Annotated Mamba [code]
  5. Mamba: The Easy Way
  6. Mamba: The Hard Way
  7. A Visual Guide to Mamba and State Space Models
  8. State Space Models: A Modern Approach
  9. Mamba No. 5 (A Little Bit Of...)
  10. Mamba: SSM, Theory, and Implementation in Keras and TensorFlow

Videos

  1. Efficiently Modeling Long Sequences with Structured State Spaces
  2. Do we need Attention? A Mamba Primer
  3. Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
  4. MAMBA from Scratch
  5. Yannic Kilcher's Video

Surveys (Structured State Space Models)

  1. Modeling Sequences with Structured State Spaces
  2. State Space Model for New-Generation Network Alternative to Transformers
  3. Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
  4. A Survey on Visual Mamba

Books (Classical State Space Models)

  1. Linear State-Space Control Systems
  2. Principles of System Identification: Theory and Practice

Foundation

  1. Mamba: Linear-Time Sequence Modeling with Selective State Spaces [code]
  2. Structured state-space models are deep Wiener models
  3. State-space Models with Layer-wise Nonlinearity are Universal Approximators with Exponential Decaying Memory
  4. Repeat After Me: Transformers are Better than State Space Models at Copying
  5. Theoretical Foundations of Deep Selective State-Space Models
  6. The Hidden Attention of Mamba Models
  7. The Expressive Capacity of State Space Models: A Formal Language Perspective
  8. Simplifying and Understanding State Space Models with Diagonal Linear RNNs

Architecture

  1. Jamba: A Hybrid Transformer-Mamba Language Model
  2. Jamba-1.5: Hybrid Transformer-Mamba Models at Scale
  3. Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [code]
  4. S5: Simplified State Space Layers for Sequence Modeling (ICLR 2023) [code]
  5. Long range language modeling via gated state spaces (ICLR 2023)
  6. Pretraining Without Attention [code]
  7. MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts [code]
  8. LOCOST: State-Space Models for Long Document Abstractive Summarization [code]
  9. BlackMamba: Mixture of Experts for State-Space Models [code]
  10. DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models [code]
  11. ZigMa: Zigzag Mamba Diffusion Model (ECCV 2024) [code] [website]
  12. Block-State Transformers
  13. Efficient Long Sequence Modeling via State Space Augmented Transformer
  14. S7: Selective and Simplified State Space Layers for Sequence Modeling

Language

  1. Hungry Hungry Hippos: Towards Language Modeling with State Space Models (ICLR 2023) [code]
  2. Long range language modeling via gated state spaces (ICLR 2023) [code]
  3. Mamba: Linear-Time Sequence Modeling with Selective State Spaces [code]
  4. MambaByte: Token-free Selective State Space Model [code]
  5. Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
  6. Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models [code]

Audio

  1. It's Raw! Audio Generation with State-Space Models (ICML 2022) [code]
  2. Augmenting conformers with structured state space models for online speech recognition
  3. Diagonal State Space Augmented Transformers for Speech Recognition
  4. Structured State Space Decoder for Speech Recognition and Synthesis
  5. Spiking Structured State Space Model for Monaural Speech Enhancement
  6. A Neural State-Space Model Approach to Efficient Speech Separation
  7. Multi-Head State Space Model for Speech Recognition
  8. Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation [code]
  9. SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model [code]
  10. Audio Mamba: Bidirectional State Space Model for Audio Representation Learning [code]
  11. Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis [code]

Vision

  1. S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces (NeurIPS 2022)
  2. Long movie clip classification with state-space video models (ECCV 2022) [code]
  3. Efficient Movie Scene Detection using State-Space Transformers (CVPR 2023)
  4. Selective Structured State-Spaces for Long-Form Video Understanding (CVPR 2023)
  5. 2-D SSM: A General Spatial Layer for Visual Transformers [code]
  6. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model [code]
  7. VMamba: Visual State Space Model [code]
  8. U-shaped Vision Mamba for Single Image Dehazing [code]
  9. Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning [code]
  10. Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation [code]
  11. LocalMamba: Visual State Space Model with Windowed Selective Scan [code]
  12. Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM [code]
  13. A Survey on Visual Mamba
  14. SUM: Saliency Unification through Mamba for Visual Attention Modeling [code]
  15. State Space Models for Event Cameras (CVPR 2024 Spotlight) [code]

Time-Series

  1. Deep State Space Models for Time Series Forecasting (NeurIPS 2018)
  2. FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting (NeurIPS 2022)
  3. Effectively modeling time series with simple discrete state spaces (ICLR 2023)
  4. Deep Latent State Space Models for Time-Series Generation (ICML 2023)
  5. Generative AI for End-to-End Limit Order Book Modelling (ICAIF 2023)
  6. On the Performance of Legendre State-Space Models in Short-Term Time Series Forecasting (CCECE 2023)
  7. Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
  8. Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models

Medical

  1. Structured State Space Models for Multiple Instance Learning in Digital Pathology
  2. Modeling Multivariate Biosignals with Graph Neural Networks and Structured State Space
  3. Diffusion-based conditional ECG generation with structured state space models
  4. Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models
  5. fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models
  6. Vivim: a Video Vision Mamba for Medical Video Object Segmentation [code]
  7. MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration [code]
  8. SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation [code]
  9. U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation [code]
  10. nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model
  11. VM-UNet: Vision Mamba UNet for Medical Image Segmentation
  12. MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation
  13. ViM-UNet: Vision Mamba for Biomedical Segmentation (MIDL 2024)
  14. I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [code]
  15. BioMamba: A Pre-trained Biomedical Language Representation Model Leveraging Mamba

Tabular

  1. MambaTab: A Simple Yet Effective Approach for Handling Tabular Data

Reinforcement Learning

  1. Decision S4: Efficient Sequence-Based RL via State Spaces Layers (ICLR 2023)
  2. Structured State Space Models for In-Context Reinforcement Learning (NeurIPS 2023)
  3. Mastering Memory Tasks with World Models (ICLR 2024 oral)

SSM Parameterization and Initialization

  1. Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers (NeurIPS 2021)
  2. Efficiently Modeling Long Sequences with Structured State Spaces (ICLR 2022)
  3. On the Parameterization and Initialization of Diagonal State Space Models (NeurIPS 2022)
  4. Diagonal State Spaces are as Effective as Structured State Spaces (NeurIPS 2022) [code]
  5. How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections (ICLR 2023)
  6. Robustifying State-space Models for Long Sequences via Approximate Diagonalization
  7. StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
  8. Spectral State Space Models
  9. From Generalization Analysis to Optimization Designs for State Space Models (ICML 2024)

Miscellaneous

  1. Variational learning for switching state-space models (Neural Computation 2000)
  2. Liquid structural state-space models (ICLR 2023)
  3. Resurrecting Recurrent Neural Networks for Long Sequences (ICML 2023)
  4. Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets (ICLR 2023)
  5. Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors
  6. Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks (NeurIPS 2019)

Contributions

🎉 Thank you for considering contributing to our Awesome State Space Models for Machine Learning repository! 🚀

Contribute in 3 Steps:

  1. Fork the Repo: Fork this repo to your GitHub account.

  2. Edit Content: Contribute by adding new resources or improving existing content in the README.md file.

  3. Create a Pull Request: Open a pull request (PR) from your branch to the main repository.

Guidelines

  • Follow the existing structure and formatting.
  • Ensure added resources are relevant to State Space Models in Machine Learning.
  • Verify that links work correctly.
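A quick offline check can catch obviously broken links before you open a PR. The sketch below is only an illustration, not part of this repository's tooling: the function names and the `example.org` URLs are hypothetical, and it assumes entries use inline Markdown links of the form `[title](url)`. It checks URL syntax only; confirming that a page actually loads would additionally require an HTTP request, which is left out here.

```python
import re
from urllib.parse import urlparse

# Matches inline Markdown links of the form [text](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(markdown_text):
    """Return (text, url) pairs for every inline Markdown link."""
    return LINK_RE.findall(markdown_text)


def is_well_formed(url):
    """Syntactic check only: http(s) scheme plus a host, no network access."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def find_malformed_links(markdown_text):
    """List absolute links whose URL fails the syntactic check.

    Relative paths and in-page anchors (no "://") are skipped.
    """
    return [
        (text, url)
        for text, url in extract_links(markdown_text)
        if "://" in url and not is_well_formed(url)
    ]


# Hypothetical entries; the second has a typo in its scheme.
sample = (
    "1. [The Annotated S4](https://example.org/annotated-s4)\n"
    "2. [Broken entry](htps://example.org/typo)\n"
)
print(find_malformed_links(sample))
```

To run it against your edited file, pass the contents of README.md to `find_malformed_links` in place of `sample`.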

Reporting Issues

If you encounter issues or have suggestions, open an issue on the GitHub repository.

Your contributions make this repository awesome! Thank you! 🙌

License

This project is licensed under the MIT License.