✨ SoK: Decentralized AI (DeAI)

The centralization of Artificial Intelligence (AI) poses significant challenges, including single points of failure, inherent biases, data privacy concerns, and scalability issues. These problems are especially prevalent in closed-source large language models (LLMs), where user data is collected and used without transparency. To mitigate these issues, blockchain-based decentralized AI (DeAI) has emerged as a promising solution. DeAI combines the strengths of both blockchain and AI technologies to enhance the transparency, security, decentralization, and trustworthiness of AI systems. However, a comprehensive understanding of state-of-the-art DeAI development, particularly for active industry solutions, is still lacking.

In this work, we present a Systematization of Knowledge (SoK) for blockchain-based DeAI solutions. We propose a taxonomy to classify existing DeAI protocols based on the model lifecycle. Based on this taxonomy, we provide a structured way to clarify the landscape of DeAI protocols and identify their similarities and differences. We analyze the functionalities of blockchain in DeAI, investigating how blockchain features contribute to enhancing the security, transparency, and trustworthiness of AI processes, while also ensuring fair incentives for AI data and model contributors. In addition, we identify key insights and research gaps in developing DeAI protocols, highlighting several critical avenues for future research.

This repo contains the list of papers and protocols investigated in our SoK.

Figure 1: Comparison of different machine learning paradigms: (A) Standalone Learning, (B) Centralized Learning, (C) Distributed Learning (Data Parallelism), (D) Centralized Federated Learning, (E) Decentralized Federated Learning (Ring All-Reduce), and (F) Decentralized Learning.
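To make the contrast between paradigms (D) and (E)/(F) concrete, here is a minimal Python sketch in which a "model" is just a parameter vector; the names `fed_avg` and `gossip_round` and the ring topology are illustrative assumptions, not code from any protocol listed below:

```python
import numpy as np

def fed_avg(client_params, client_sizes):
    """Centralized FL (paradigm D): a server averages client updates,
    weighting each client by its local dataset size."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * p for w, p in zip(weights, client_params))

def gossip_round(params, topology):
    """Decentralized FL (paradigms E/F): no server; each node averages
    its own parameters with those of its neighbors in a mixing topology."""
    return [np.mean([params[j] for j in topology[i]] + [params[i]], axis=0)
            for i in range(len(params))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = [rng.normal(size=4) for _ in range(4)]

    # Paradigm D: one coordinator aggregates everything.
    print("FedAvg:", fed_avg(params, client_sizes=[100, 50, 200, 150]))

    # Paradigms E/F: each node mixes only with its ring neighbors.
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    print("Gossip:", gossip_round(params, ring)[0])
```

In (D) the coordinator is a single point of failure, while in (E)/(F) every node talks only to its neighbors, which is the property many of the DeAI protocols below aim to incentivize and verify on-chain.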

Figure 2: A DeAI model lifecycle consists of four phases: 1. task proposing, 2. pre-training, 3. on-training, and 4. post-training.

📚 Table of Contents (ToC)

0. Overview of DeAI Projects

DeAI projects are classified along the model lifecycle (task creation, data preparation, compute, training, model inference, model marketplace, agents) and by cross-cutting properties (incentive mechanism, enhanced security, permission control, data storage, public reference, auditability, AI assets tokenization, decentralization[^2], staking). Each project's security guarantee is summarized below:

| Project | Security Guarantee |
| --- | --- |
| Vana | ZKP |
| Fraction AI | Reputation |
| Ocean | On-chain Consensus |
| Numbers | Proof of Stake |
| The Graph | On-chain Consensus |
| Synternet | Proof of Delivery/Consumption |
| OriginTrail | Proof of Knowledge |
| ZeroGravity | Proof of Random Access |
| Grass | ZKP + Reputation |
| OORT Storage | Proof of Honesty |
| KIP | On-chain Consensus |
| Filecoin | Proof-of-Replication/Spacetime |
| IO.NET | Reward + Slash |
| NetMind | Proof of Authority |
| Render Network | Reputation + Proof of Render |
| Akash | Tendermint Consensus |
| Nosana | On-chain Consensus |
| Inferix | Proof of Rendering |
| OctaSpace | On-chain Consensus |
| DeepBrain Chain | Delegated Proof of Stake |
| OpSec | Delegated Proof of Stake |
| Gensyn | Proof of Learning |
| Lilypad | Mediators + On-chain Consensus |
| Bittensor | Yuma Consensus |
| FLock.io | FLock Consensus |
| Numerai | On-chain Consensus |
| Commune AI | Yuma Consensus |
| Modulus | zkML |
| Hyperspace | Fraud Proof |
| Sertn | ZKP + FHE[^3] + MPC |
| ORA | opML |
| Ritual | On-chain Consensus |
| Allora | CometBFT |
| Fetch.AI | Proof of Stake |
| Arbius | Proof of Useful Work |
| Theoriq | Proof of Contribution/Collaboration |
| Delysium | On-chain Consensus |
| OpenServ | On-chain Consensus |
| Autonolas | Tendermint Consensus |
| ELNA | On-chain Consensus |
| OpenAgents | On-chain Consensus |
| SingularityNET | Multi-Party Escrow |
| SaharaAI | Proof of Stake |
| Shinkai | ZKP + MPC |
| Balance DAO | Proof of Stake |
| Immutable Labs | Green Proof of Work |
| Prime Intellect[^4] | Centralized Server |

[^2]: Decentralization: we mark most existing DeAI solutions as "partially" decentralized because they retain centralized or off-chain components.

[^3]: FHE: Fully Homomorphic Encryption.

[^4]: Prime Intellect: we also include this project, which aims to build DeAI without leveraging blockchain.
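Several of the guarantees above pair staking with rewards and slashing (e.g., IO.NET's "Reward + Slash"). The following is a rough, hypothetical sketch of the accounting such schemes imply; the class name `StakeRegistry`, the reward rate, and the slash fraction are all assumptions, not parameters of any listed protocol:

```python
from dataclasses import dataclass, field

@dataclass
class StakeRegistry:
    """Illustrative stake-and-slash accounting, not any specific protocol."""
    reward_rate: float = 0.05    # assumed reward per accepted task
    slash_fraction: float = 0.5  # assumed penalty for a faulty result
    stakes: dict = field(default_factory=dict)

    def deposit(self, node: str, amount: float) -> None:
        """A node locks stake before it may serve tasks."""
        self.stakes[node] = self.stakes.get(node, 0.0) + amount

    def settle(self, node: str, result_valid: bool) -> float:
        """Reward honest work; slash a fraction of stake when verification fails."""
        stake = self.stakes[node]
        if result_valid:
            reward = stake * self.reward_rate
            self.stakes[node] = stake + reward
            return reward
        penalty = stake * self.slash_fraction
        self.stakes[node] = stake - penalty
        return -penalty

if __name__ == "__main__":
    reg = StakeRegistry()
    reg.deposit("node-a", 100.0)
    print(reg.settle("node-a", result_valid=True))   # +5.0 reward
    print(reg.settle("node-a", result_valid=False))  # -52.5 slashed
```

Real protocols differ mainly in how `result_valid` is established, whether via fraud proofs, ZKPs, opML, or committee consensus, as the Security Guarantee column above indicates.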

1. Task Proposing

1.1 Academic Work

  • Trusted ai in multiagent systems: An overview of privacy and security for distributed learning. C. Ma et al. Proceedings of the IEEE, vol. 111, no. 9, pp. 1097–1132, 2023. [paper]

  • Applications of distributed machine learning for the internet-of-things: A comprehensive survey. M. Le et al. IEEE Communications Surveys & Tutorials, 2024. [paper]

  • Reinforcement learning: A survey. L. P. Kaelbling et al. Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996. [paper]

  • Markov games as a framework for multi-agent reinforcement learning. M. L. Littman. Machine Learning Proceedings 1994, Elsevier, 1994, pp. 157–163. [paper]

  • UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers. S. Hu et al. International Conference on Learning Representations, 2021. [paper]

2. Pre-Training

2.1. Data Preparation

2.1.1. Industry
2.1.2. Academic Work
  • Data preprocessing in data mining. S. García, J. Luengo, and F. Herrera. 2016. [paper]

  • An introduction to variable and feature selection. I. Guyon and A. Elisseeff. Journal of Machine Learning Research, vol. 3, pp. 1157–1182, 2003. [paper]

  • On the opportunities and risks of foundation models. R. Bommasani et al. arXiv preprint arXiv:2108.07258, 2021. [paper]

  • Accelerators and specialized hardware for deep learning. T. Ben-Nun and T. Hoefler. Communications of the ACM, vol. 62, no. 12, pp. 34–44, 2019. [paper]

  • Language models are unsupervised multitask learners. A. Radford et al. OpenAI Blog, vol. 1, p. 9, 2019. [paper] [code]

  • Language models are few-shot learners. T. B. Brown et al. arXiv preprint arXiv:2005.14165, 2020. [paper]

  • Exploring the limits of transfer learning with a unified text-to-text transformer. C. Raffel et al. Journal of Machine Learning Research, vol. 21, pp. 1–67, 2020. [paper]

  • PaLM: Scaling language modeling with pathways. A. Chowdhery et al., 2022. [paper]

  • LLaMA: Open and Efficient Foundation Language Models. H. Touvron et al., 2023. [paper]

  • Will we run out of data? Limits of LLM scaling based on human-generated data. P. Villalobos et al., 2022. [paper]

  • Blockchain versus federated learning: A comparative analysis for privacy-preserving applications. M. J. M. Chowdhury et al. 2022. [paper]

2.2. Compute

2.2.1. Industry
2.2.2. Academic Work
  • The backpropagation algorithm. R. Rojas. Neural Networks: A Systematic Introduction, pp. 149–182, 1996. [chapter]

  • In-datacenter performance analysis of a tensor processing unit. N. P. Jouppi et al. Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017. [paper]

  • AI and compute. D. Amodei and D. Hernandez. 2018. Available at: [OpenAI blog]

  • AI is outpacing Moore’s law. ZME Science, 2019. [article]

  • Denoising diffusion probabilistic models. J. Ho et al. Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851, 2020. [paper]

  • The EU General Data Protection Regulation (GDPR): A Practical Guide. P. Voigt and A. Von dem Bussche. 1st ed., Cham: Springer International Publishing, 2017. [book]

  • Truebit: A scalable verification solution for blockchains. J. Teutsch and C. Reitwießner. 2018. [paper]

  • Language models are few-shot learners. T. B. Brown et al., 2020. [paper]

3. On-training

3.1. Industry

3.2. Academic Work

  • The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. M. Brundage et al. arXiv preprint arXiv:1802.07228, 2018. [paper]

  • Large scale distributed deep networks. J. Dean et al. In Advances in Neural Information Processing Systems, vol. 25, 2012, pp. 1223–1231. [paper]

  • Imagenet: A large-scale hierarchical image database. J. Deng et al. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255. [paper]

  • Health insurance portability and accountability act of 1996. U.S. Congress. Public Law 104–191, 1996. [article]

  • Language models are few-shot learners. T. B. Brown et al. In Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 1877–1901. [paper]

4. Post-training

4.1. Model Inference
4.1.1. Industry
4.1.2. Academic Work
  • Machine Learning. T. M. Mitchell. McGraw Hill, 1997. [book]

  • Efficient processing of deep neural networks: A tutorial and survey. V. Sze et al. Proceedings of the IEEE, vol. 105, no. 12, pp. 2295–2329, 2017. [paper]

  • Blockchain for IoT security and privacy: The case study of a smart home. A. Dorri et al. In 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). IEEE, 2017, pp. 618–623. [paper]

4.2. Application
4.2.1. Industry
4.2.2. Academic Work
  • Artificial Intelligence: A Modern Approach. S. Russell and P. Norvig. 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2010. [book]

  • Multiagent systems: A survey from a machine learning perspective. P. Stone and M. Veloso. Autonomous Robots, vol. 8, no. 3, pp. 345–383, 2000. [paper]

  • Grand challenges in AI: Representation learning, reasoning, and common sense. O. Vinyals et al. DeepMind Research, 2019. [blog]

4.3. Marketplace
4.3.1. Industry
