
feat(converter): fix llama.cpp build and conversion paths #39

Merged: 3 commits into main from gguf-converter on Nov 6, 2024

Conversation

@leonvanbokhorst (Owner) commented on Nov 6, 2024

  • Update binary name from 'quantize' to 'llama-quantize'
  • Add proper path handling for convert_hf_to_gguf.py
  • Fix missing convert_script_path attribute
  • Improve error handling and debug messages
  • Add Metal support for Apple Silicon

This commit ensures proper building and execution of the model conversion
pipeline on Apple Silicon (M-series) machines, with correct path resolution
for all required binaries and scripts.
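
For context, a minimal sketch of what such build and path resolution can look like. This is illustrative only: the function name setup_llamacpp and its structure are assumptions, not the PR's actual code, and the explicit Metal flag name has varied across llama.cpp releases.

    import platform
    import subprocess
    from pathlib import Path

    def setup_llamacpp(base_path: Path) -> tuple[Path, Path]:
        """Clone and build llama.cpp; return paths to the quantize binary and convert script."""
        llamacpp = base_path / "llama.cpp"
        if not llamacpp.exists():
            subprocess.run(
                ["git", "clone", "https://github.com/ggerganov/llama.cpp"],
                cwd=base_path,
                check=True,
            )
        # On Apple Silicon the Makefile enables Metal by default; the explicit
        # flag name has changed across releases (LLAMA_METAL, later GGML_METAL).
        subprocess.run(["make"], cwd=llamacpp, check=True)
        # Upstream renamed the binary from 'quantize' to 'llama-quantize'.
        quantize_binary = llamacpp / "llama-quantize"
        # The HF-to-GGUF converter script lives at the repository root.
        convert_script = llamacpp / "convert_hf_to_gguf.py"
        return quantize_binary, convert_script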

Summary by Sourcery

Fix the build and conversion paths for llama.cpp, including updating the binary name to 'llama-quantize', adding Metal support for Apple Silicon, and improving error handling and path resolution for the conversion script.

New Features:

  • Add Metal support for Apple Silicon in the llama.cpp build process.

Bug Fixes:

  • Fix path handling for convert_hf_to_gguf.py script to ensure correct execution.

Enhancements:

  • Update binary name from 'quantize' to 'llama-quantize' for clarity and consistency.
  • Improve error handling and debug messages in the model conversion pipeline.

@sourcery-ai bot (Contributor) commented on Nov 6, 2024

Reviewer's Guide by Sourcery

This PR implements a model conversion pipeline for LLaMA models, specifically targeting Apple Silicon compatibility. The implementation includes a new LLMConverter class that handles the end-to-end process of downloading, converting, quantizing, and uploading LLaMA models to the Hugging Face Hub, with proper Metal support for Apple Silicon machines.

Class diagram for the new LLMConverter class

classDiagram
    class LLMConverter {
        - String model_id
        - String model_name
        - List~String~ quantization_methods
        - String hf_token
        - String username
        - Path base_path
        - Path convert_script_path
        + __init__(model_id: String, quantization_methods: List~String~, hf_token: String, username: String)
        + setup_llamacpp() void
        + download_model() void
        + convert_to_fp16() Path
        + quantize_model(fp16_path: Path) void
        + upload_to_hub() void
        + run() void
    }
    note for LLMConverter "Handles downloading, converting, quantizing, and uploading LLaMA models"
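
For orientation, a hedged usage sketch based on this diagram. The constructor arguments follow the diagram's signature, and the configuration values are the ones shown in the review comments below; neither is verified against the actual script.

    import os

    converter = LLMConverter(
        model_id="leonvanbokhorst/Llama-3.2-1B-Instruct-Complaint",
        quantization_methods=["q4_k_m", "q5_k_m"],
        hf_token=os.getenv("HF_TOKEN"),
        username="leonvanbokhorst",
    )
    # setup_llamacpp -> download_model -> convert_to_fp16 -> quantize_model -> upload_to_hub
    converter.run()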

File-Level Changes

Implementation of the LLaMA model conversion pipeline (src/12_llm_gguf_conversion.py)
  • Created LLMConverter class to manage the conversion workflow
  • Added Metal support compilation flags for Apple Silicon
  • Implemented model download from Hugging Face
  • Added FP16 conversion functionality
  • Implemented model quantization with multiple methods
  • Added Hugging Face Hub upload capability

Enhanced build and execution environment setup (src/12_llm_gguf_conversion.py)
  • Updated the quantize binary name to 'llama-quantize'
  • Added proper path resolution for convert_hf_to_gguf.py
  • Implemented comprehensive error handling and debug messages
  • Added requirements installation from llama.cpp

Version control configuration update (.gitignore)
  • Modified .gitignore settings


@leonvanbokhorst merged commit d5edec6 into main on Nov 6, 2024 (1 check passed).
@leonvanbokhorst deleted the gguf-converter branch on Nov 6, 2024 at 06:42.
@sourcery-ai bot (Contributor) left a comment:

Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider moving the hardcoded configuration values (MODEL_ID, QUANTIZATION_METHODS, USERNAME) from main() to a configuration file or environment variables for better reusability and maintainability.
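One way to realize that suggestion, sketched with illustrative variable names rather than the script's actual configuration code:

    import os

    MODEL_ID = os.environ.get("MODEL_ID", "leonvanbokhorst/Llama-3.2-1B-Instruct-Complaint")
    QUANTIZATION_METHODS = os.environ.get("QUANTIZATION_METHODS", "q4_k_m,q5_k_m").split(",")
    USERNAME = os.environ.get("HF_USERNAME", "leonvanbokhorst")
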
Here's what I looked at during the review
  • 🟡 General issues: 5 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good


print("Conversion completed successfully!")

except subprocess.CalledProcessError as e:
print(f"Error during execution: {e}", file=sys.stderr)

suggestion (bug_risk): Consider adding cleanup of temporary files in error paths

When errors occur, any partially downloaded models or incomplete conversions should be cleaned up to avoid leaving invalid artifacts and wasting disk space.

Suggested change:

-        print(f"Error during execution: {e}", file=sys.stderr)
+        if os.path.exists(output_path):
+            os.remove(output_path)
+        print(f"Error during execution: {e}", file=sys.stderr)

# Configuration
MODEL_ID = "leonvanbokhorst/Llama-3.2-1B-Instruct-Complaint"
QUANTIZATION_METHODS = ["q4_k_m", "q5_k_m"]
HF_TOKEN = os.getenv("HF_TOKEN")

issue: Add validation for HF_TOKEN environment variable

The HF_TOKEN should be checked for None and raise a clear error if not set, rather than failing later with a potentially confusing error message.
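
A minimal sketch of the suggested fail-fast check (the error message wording is illustrative):

    HF_TOKEN = os.getenv("HF_TOKEN")
    if HF_TOKEN is None:
        raise RuntimeError(
            "HF_TOKEN is not set; create a token at "
            "https://huggingface.co/settings/tokens and export it before running."
        )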

    ):
        self.model_id = model_id
        self.model_name = model_id.split("/")[-1]
        self.quantization_methods = [m.strip() for m in quantization_methods]

suggestion (performance): Validate quantization methods before starting the conversion process

Consider maintaining a list of valid quantization methods and validating against it early to fail fast before downloading large models.

Suggested change:

-        self.quantization_methods = [m.strip() for m in quantization_methods]
+        VALID_QUANTIZATION_METHODS = {"q2_k", "q3_k_l", "q3_k_m", "q3_k_s", "q4_0", "q4_1", "q4_k_m", "q4_k_s", "q5_0", "q5_1", "q5_k_m", "q5_k_s", "q6_k", "q8_0"}
+        invalid_methods = set(m.strip() for m in quantization_methods) - VALID_QUANTIZATION_METHODS
+        if invalid_methods:
+            raise ValueError(f"Invalid quantization methods: {invalid_methods}")
+        self.quantization_methods = [m.strip() for m in quantization_methods]


    def download_model(self) -> None:
        """Download model from Hugging Face."""
        subprocess.run(["git", "lfs", "install"], check=True)

suggestion: Add check for git-lfs availability

Check if git-lfs is installed and provide a clear error message if it's not available.

Suggested change:

-        subprocess.run(["git", "lfs", "install"], check=True)
+        try:
+            subprocess.run(["git", "lfs", "install"], check=True)
+        except (FileNotFoundError, subprocess.CalledProcessError) as exc:
+            # FileNotFoundError if git itself is missing; CalledProcessError
+            # if 'lfs' is not a known git subcommand (git-lfs not installed).
+            raise RuntimeError("git-lfs is not installed. Please install git-lfs before proceeding.") from exc

print(f"Installing requirements from: {requirements_file}")
subprocess.run(["pip", "install", "-r", str(requirements_file)], check=True)

def download_model(self) -> None:

suggestion: Add disk space verification before starting downloads

Check available disk space against expected model size requirements to avoid failed downloads or conversions due to insufficient space.

    def download_model(self) -> None:
        """Download model from Hugging Face."""
        # Requires `import shutil` at module level.
        free_space = shutil.disk_usage("/").free
        required_space = 15 * 1024 * 1024 * 1024  # 15GB minimum
        if free_space < required_space:
            raise RuntimeError(f"Insufficient disk space. Need {required_space/1024**3:.1f}GB, have {free_space/1024**3:.1f}GB free")

from huggingface_hub import create_repo, HfApi


class LLMConverter:

issue (complexity): Consider restructuring the monolithic LLMConverter class into focused utility functions.

The code would be simpler and more maintainable if restructured into focused utility functions rather than a monolithic class. This allows independent testing and reuse of components. Here's an example:

def setup_llamacpp(base_path: Path) -> tuple[Path, Path]:
    """Setup and build llama.cpp, returning paths to key executables."""
    llamacpp_path = base_path / "llama.cpp"
    # ... build steps ...
    return llamacpp_path / "llama-quantize", llamacpp_path / "convert_hf_to_gguf.py"

def download_model(model_id: str) -> Path:
    """Download model and return its path."""
    model_name = model_id.split("/")[-1]
    if not Path(model_name).exists():
        subprocess.run(["git", "clone", f"https://huggingface.co/{model_id}"], check=True)
    return Path(model_name)

def convert_and_quantize(
    model_path: Path,
    convert_script: Path,
    quantize_binary: Path,
    methods: List[str]
) -> List[Path]:
    """Convert to fp16 and quantize, returning paths to generated files."""
    # Conversion and quantization logic
    return quantized_paths

def upload_files(
    repo_name: str,
    username: str,
    folder_path: Path,
    token: str
) -> None:
    """Upload generated files to HF Hub."""
    api = HfApi()
    # ... upload logic ...

def convert_model(
    model_id: str,
    quantization_methods: List[str],
    username: str,
    hf_token: str
) -> None:
    """Main orchestration function."""
    base_path = Path.cwd()
    quantize_binary, convert_script = setup_llamacpp(base_path)
    model_path = download_model(model_id)
    quantized_files = convert_and_quantize(
        model_path, convert_script, quantize_binary, quantization_methods
    )
    upload_files(f"{model_path.name}-GGUF", username, model_path, hf_token)

This approach:

  • Makes each component independently testable
  • Removes shared state and reduces coupling
  • Allows reuse of individual components
  • Makes the flow of data explicit through function parameters
  • Maintains the same functionality but with clearer boundaries

    def run(self):
        """Execute the full conversion pipeline."""
        try:
            print("Setting up llama.cpp...")

issue (code-quality): Extract code out into method (extract-method)
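
A sketch of what that extract-method refactor could look like, using the method names from the class diagram above; the helper name _execute_pipeline is an assumption.

    def run(self) -> None:
        """Execute the full conversion pipeline; run() only handles control flow."""
        try:
            self._execute_pipeline()
        except subprocess.CalledProcessError as e:
            print(f"Error during execution: {e}", file=sys.stderr)
            raise

    def _execute_pipeline(self) -> None:
        """The pipeline steps, extracted from run()."""
        print("Setting up llama.cpp...")
        self.setup_llamacpp()
        self.download_model()
        fp16_path = self.convert_to_fp16()
        self.quantize_model(fp16_path)
        self.upload_to_hub()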
