Merge pull request #6 from vllm-project/cleanup-sparseml
SparseML Cleanup
markurtz authored Jun 25, 2024
2 parents db3ea08 + 401ef68 commit 92cd10c
Showing 21 changed files with 145 additions and 223 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -15,7 +15,7 @@ A clear and concise description of what you expected to happen.
Include all relevant environment information:
1. OS [e.g. Ubuntu 18.04]:
2. Python version [e.g. 3.7]:
3. SparseML version or commit hash [e.g. 0.1.0, `f7245c8`]:
3. LLM Compressor version or commit hash [e.g. 0.1.0, `f7245c8`]:
4. ML framework version(s) [e.g. torch 1.7.1]:
5. Other Python package versions [e.g. SparseZoo, DeepSparse, numpy, ONNX]:
6. Other relevant environment information [e.g. hardware, CUDA version]:
103 changes: 35 additions & 68 deletions CONTRIBUTING.md
@@ -1,86 +1,53 @@
<!--
Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
# Contributing to LLM Compressor

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
Thank you for your interest in contributing to LLM Compressor!
Our community is open to everyone and welcomes all kinds of contributions, no matter how small or large.
There are several ways you can contribute to the project:

http://www.apache.org/licenses/LICENSE-2.0
- Identify and report any issues or bugs.
- Request or add new compression methods or research.
- Suggest or implement new features.

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
However, remember that contributions aren't just about code.
We believe in the power of community support; thus, answering queries, assisting others, and enhancing the documentation are highly regarded and beneficial contributions.

TODO: update for upstream push
Finally, one of the most impactful ways to support us is by raising awareness about LLM Compressor and the vLLM community.
Talk about it in your blog posts, highlighting how it's driving your incredible projects.
Express your support on Twitter if vLLM aids you, or simply offer your appreciation by starring our repository.

# Contributing to SparseML
## Setup for development

If you’re reading this, hopefully we have piqued your interest to take the next step. Join us and help make SparseML even better! As a contributor, here are some community guidelines we would like you to follow:
### Install from source

- [Code of Conduct](#code-of-conduct)
- [Ways to Contribute](#ways-to-contribute)
- [Bugs and Feature Requests](#bugs-and-feature-requests)
- [Question or Problem](#question-or-problem)
- [Developing SparseML](DEVELOPING.md)
```bash
pip install -e ./[dev]
```

## Code of Conduct
### Code Styling and Formatting checks

Help us keep the software inclusive. Please read and follow our [Code of Conduct](https://github.com/neuralmagic/sparseml/blob/main/CODE_OF_CONDUCT.md) in order to promote an environment that is friendly, fair, respectful, and safe. We want to inspire collaboration, innovation, and fun!
```bash
make style
make quality
```

## Ways to Contribute
### Testing

Whether you’re a newbie, dabbler, or expert, we appreciate you jumping in.
```bash
make test
```

### Contributing Code
## Contributing Guidelines

- Make pull requests for addressing bugs, open issues, and documentation
- Neural Magic as the maintainer will do reviews and final merge
### Issue Reporting

### Reporting In
If you encounter a bug or have a feature request, please check our issues page first to see if someone else has already reported it.
If not, please file a new issue, providing as much relevant information as possible.

- See something, say something: bugs, documentation
- Propose new feature requests to Neural Magic
### Pull Requests & Code Reviews

### Helping Others
Please check the PR checklist in the [PR template](.github/PULL_REQUEST_TEMPLATE.md) for a detailed guide on contributing.

- Answer open discussion topics
- Spread the word about SparseML
- Teach and empower others. This is the way!
### Thank You

## Bugs and Feature Requests

Please search through existing issues and requests first to avoid duplicates. Neural Magic will work with you further to take next steps.

- Go to: [GitHub Issues](https://github.com/vllm-project/llm-compressor/issues)

For bugs, include:

- brief summary
- OS/Environment details
- steps to reproduce (s.t.r.)
- code snippets, screenshots/casts, log content, sample models
- add the GitHub label "bug" to your post

For feature requests, include:

- problem you’re trying to solve
- community benefits
- other relevant details to support your proposal
- add the GitHub label "enhancement" to your post

For documentation edits, include:

- current state, proposed state
- if applicable, screenshots/casts
- add the GitHub label "documentation" to your post

## Question or Problem

Sign up or log in to our [**Neural Magic Community Slack**](https://neuralmagic.com/community/). We are growing the community member by member and happy to see you there. Don’t forget to search through existing discussions to avoid duplication! Thanks!

## Developing SparseML

Made it this far? Review [Developing SparseML](DEVELOPING.md) to get started.
Finally, thank you for taking the time to read these guidelines and for your interest in contributing to LLM Compressor.
Your contributions make LLM Compressor a great tool for everyone!
50 changes: 8 additions & 42 deletions DEVELOPING.md
@@ -1,25 +1,7 @@
<!--
Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
# Developing LLM Compressor

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

TODO: update for upstream push

# Developing SparseML

SparseML is developed and tested using Python 3.8-3.11.
To develop SparseML, you will also need the development dependencies and to follow the styling guidelines.
LLM Compressor is developed and tested using Python 3.8-3.11.
To develop LLM Compressor, you will also need the development dependencies and to follow the styling guidelines.

Here are some details to get started.

@@ -33,17 +15,7 @@ cd llm-compressor
python3 -m pip install -e "./[dev]"
```

This will clone the SparseML repo, install it, and install the development dependencies.

To develop framework specific features, you will also need the relevant framework packages.
Those can be installed by adding the framework name to the install extras. Frameworks include
`torch`, `keras`, and `tensorflow_v1`. For example:
```bash
python3 -m pip install -e "./[dev,torch]"
```

Note: Running all PyTorch tests using `make test TARGETS=torch` also requires `torchvision`
and `onnxruntime`; install all of these dependencies using `python3 -m pip install -e "./[dev, torch, torchvision, onnxruntime]"`
This will clone the LLM Compressor repo, install it, and install the development dependencies.

**Code Styling and Formatting checks**

@@ -52,22 +24,16 @@ make style
make quality
```

This will run automatic code styling using `black` and `isort` and test that the
This will run automatic code styling using `ruff`, `flake8`, `black`, and `isort` to test that the
repository's code matches its standards.

**EXAMPLE: test changes locally**

```bash
make test TARGETS=<CSV of frameworks to run>
make test
```

This will run the targeted SparseML unit tests for the frameworks specified.
The targets should be specified, because not all framework dependencies can be installed to run all tests.

To run just PyTorch tests, run
```bash
make test TARGETS=pytorch
```
This will run the LLM Compressor unit tests.

File any errors found before making changes as an Issue, and fix any errors introduced by your changes before submitting a Pull Request.

@@ -92,7 +58,7 @@ File any errors found before making changes as an Issue, and fix any errors introduced by your changes before submitting a Pull Request.
3. Add a remote to keep up with upstream changes.

```bash
git remote add upstream https://github.com/neuralmagic/sparseml.git
git remote add upstream https://github.com/vllm-project/llm-compressor.git
```

If you already have a copy, fetch upstream changes.
22 changes: 3 additions & 19 deletions examples/finetuning/configure_fsdp.md
@@ -1,31 +1,15 @@
<!--
Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Configuring FSDP for Sparse Finetuning

An example FSDP configuration file, `example_fsdp_config.yaml`, is provided in this
folder. It can be used out of the box by editting the `num_processes` parameter to
folder. It can be used out of the box by editing the `num_processes` parameter to
fit the number of GPUs on your machine.

You can also customize your own config file by running the following command
```
accelerate config
```

An FSDP config file can be passed to the SparseML finetuning script like this:
An FSDP config file can be passed to the LLM Compressor finetuning script like this:
```
accelerate launch --config_file example_fsdp_config.yaml --no_python sparseml.transformers.text_generation.finetune
accelerate launch --config_file example_fsdp_config.yaml --no_python llmcompressor.transformers.text_generation.finetune
```
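For orientation, an `accelerate` FSDP config of this kind typically contains fields like the sketch below. This is an illustrative assumption only, not the actual contents of `example_fsdp_config.yaml`; consult the file in this folder for the real values.

```yaml
# Illustrative sketch of an accelerate FSDP config -- values are assumptions.
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
num_processes: 4          # set to the number of GPUs on your machine
mixed_precision: bf16
fsdp_config:
  fsdp_sharding_strategy: 1                      # FULL_SHARD
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
```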
2 changes: 1 addition & 1 deletion examples/quantization/llama7b_one_shot_quantization.md
@@ -1,7 +1,7 @@
# Creating a Quantized Llama Model in One Shot

Quantizing a model to a lower precision can save on both memory and speed at inference time.
This example demonstrates how to use the SparseML API to quantize a Llama model from 16 bits
This example demonstrates how to use the LLM Compressor API to quantize a Llama model from 16 bits
to 4 bits and save it to a compressed-tensors format for inference with vLLM.
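As a back-of-the-envelope illustration of the arithmetic behind weight quantization (independent of the LLM Compressor API; the function names below are hypothetical), a symmetric 4-bit scheme maps each weight to one of 16 integer levels and stores a single float scale per group:

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7  # one float scale per weight group
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]

weights = [0.12, -0.05, 0.33, -0.21]
q, scale = quantize_4bit(weights)
restored = dequantize_4bit(q, scale)
# Each quantized value fits in 4 bits, versus 16 bits per original weight,
# at the cost of a small rounding error bounded by scale / 2.
```

This is only the core idea; real one-shot methods additionally use calibration data to choose scales that minimize output error rather than per-group max-abs.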

## Step 1: Select a model and dataset
@@ -11,7 +11,7 @@
model_stub, torch_dtype=torch.bfloat16, device_map="auto"
)

# uses SparseML's built-in preprocessing for ultra chat
# uses LLM Compressor's built-in preprocessing for ultra chat
dataset = "ultrachat-200k"

# save location of quantized model
