Add support of Falcon models (7b, 40b, 180b) to DeepSpeed-FastGen #4790

Merged on Dec 12, 2023 (6 commits)

@arashb arashb commented Dec 8, 2023

This PR adds support for the Falcon 7b, 40b, and 180b models. Note that the 40b and 180b models use a different model architecture than the 7b variant.

Falcon-7B:

Huggingface output with prompt "DeepSpeed is":

a new AI-powered video compression technology that can compress video files by up to 40 times without losing any quality.
DeepSpeed is a new AI-powered video compression technology that can compress video files by up to 40 times without losing any quality.
The new technology is the result of a collaboration between Google and the University of Washington.
The researchers behind the technology have been working on it for the past three years.
They have been able to compress videos by up to 40 times without losing any quality.
The technology is based on a new algorithm that uses deep learning to compress videos.
The algorithm is able

DeepSpeed-FastGen output with prompt "DeepSpeed is" with 2-way sharding:

a new AI-powered video compression technology that can compress video files by up to 40 times without losing any quality.
DeepSpeed is a new AI-powered video compression technology that can compress video files by up to 40 times without losing any quality.
The new technology is the result of a collaboration between Google and the University of Washington.
The researchers behind the technology have been working on it for the past three years.
They have been able to compress videos by up to 40 times without losing any quality.
The technology is based on a new algorithm that uses deep learning to compress videos.
The algorithm is able

Falcon-40B:

Huggingface output with prompt "DeepSpeed is":

an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source

DeepSpeed-FastGen output with prompt "DeepSpeed is" with 2-way sharding:

an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source deep learning library for PyTorch that enables state-of-the-art large-scale deep learning training.
DeepSpeed is an open-source

Falcon-180B:

Huggingface output with prompt "DeepSpeed is":

a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy

DeepSpeed-FastGen output with prompt "DeepSpeed is" with 8-way sharding:

a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy to use, efficient, and scalable.
DeepSpeed is a deep learning optimization library that helps developers and researchers train models with billions of parameters. It is designed to be easy

@mrwyattii mrwyattii merged commit a7900bc into master Dec 12, 2023
9 checks passed
@mrwyattii mrwyattii deleted the arashb/falcon branch December 12, 2023 21:31
@RezaYazdaniAminabadi

Nice job, @arashb :)
