Add YAML configuration support to llama-cli with --config flag #2

devin-ai-integration · 2025-09-12T12:21:30Z

Make sure to read the contributing guidelines before submitting a PR

Summary

This PR adds comprehensive YAML configuration file support to llama-cli with a new --config flag. Users can now specify model parameters, sampling settings, and other options via YAML files instead of lengthy command lines, while maintaining full backward compatibility.

Link to Devin run: https://app.devin.ai/sessions/05f5de10f98f4d50a16fb685814bec88
Requested by: Jaime Mizrachi (@jaime-leo)

Key Features

--config <path> flag: Load parameters from YAML configuration files
Proper precedence: Command-line flags > YAML config > defaults
Strict validation: Unknown YAML keys are rejected with helpful error messages
Smart path resolution: Relative paths in YAML resolve relative to the config file directory
Comprehensive coverage: Supports model, sampling, speculative decoding, and other parameter groups

Example Usage

# Use YAML config only
llama-cli --config configs/minimal.yaml

# Override YAML values with flags  
llama-cli --config configs/minimal.yaml -n 32 --temp 0.8

Example minimal.yaml:

model:
  path: models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
n_ctx: 256
sampling:
  seed: 42
  temp: 0.0
prompt: "Hello from YAML"
n_predict: 16
simple_io: true

Implementation Details

Architecture

common/config.cpp: YAML parser with validation and path resolution
Pre-scan approach: --config is processed before normal argument parsing to ensure proper precedence
yaml-cpp dependency: Added via FetchContent with system package fallback

Testing

Unit tests: Config parsing, error handling, path resolution (test-config-yaml.cpp)
Integration tests: YAML-only, YAML+overrides, parity verification against equivalent flags
CI workflow: Dedicated lightweight workflow with tiny model for testing

Files Changed

common/config.{h,cpp} - YAML configuration loader (new)
common/CMakeLists.txt - yaml-cpp dependency
common/arg.cpp - --config flag and pre-scan logic
tests/test-config-yaml.cpp - Unit tests (new)
tests/CMakeLists.txt - Integration test registration
tests/test-yaml-parity.sh - Parity verification script (new)
configs/{minimal,override}.yaml - Example configurations (new)
.github/workflows/config.yml - CI workflow for config tests (new)
README.md - Usage documentation and examples

Review Checklist

High Priority:

Verify pre-scan logic in arg.cpp doesn't interfere with existing argument parsing edge cases
Check path resolution logic works correctly across platforms (Windows/Unix path separators)
Confirm YAML key validation list in config.cpp covers all relevant common_params fields
Test error handling for malformed YAML files and edge cases

Medium Priority:

Verify CMake FetchContent integration works reliably across build environments
Ensure integration tests are deterministic and don't depend on external model availability
Check that precedence logic handles complex flag combinations correctly
Review performance impact of YAML parsing on startup time

Low Priority:

Validate example configs match documented usage patterns
Confirm CI workflow is appropriately scoped and efficient

- Add common/config.h and common/config.cpp for YAML configuration loading - Implement common_load_yaml_config() with validation and path resolution - Add yaml-cpp dependency via FetchContent in common/CMakeLists.txt - Support nested config structure (model, sampling, speculative, vocoder) - Reject unknown keys with descriptive error messages - Resolve relative paths relative to YAML file directory Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Add --config flag to argument parser with pre-scan before flag parsing - Implement YAML loading before common_params_parse_ex to ensure proper precedence - Flags override YAML values, YAML overrides defaults - Add --config option to usage help and argument definitions Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Add test-config-yaml.cpp with unit tests for config parsing and error cases - Add three CTest integration tests: yaml-only, yaml-plus-overrides, parity - Add test-yaml-parity.sh script for comparing YAML vs flags output - Gate integration tests on model file existence to avoid CI failures - Use absolute paths in parity test to handle CTest working directory Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Add configs/minimal.yaml with basic model and sampling configuration - Add configs/override.yaml with different settings for override testing - Use relative paths that resolve correctly from configs/ directory - Include simple_io and conversation mode settings for deterministic testing Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Add YAML config section with usage examples and precedence rules - Document --config flag and example configurations - Show minimal.yaml and override.yaml usage patterns - Explain flags > yaml > defaults precedence - Document relative path resolution behavior Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Add config.yml workflow to test YAML configuration functionality - Download tiny model artifact for integration testing - Build with tests enabled and run YAML-specific test suite - Isolated workflow to avoid impacting main CI performance Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

devin-ai-integration · 2025-09-12T12:21:33Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

- Add -DLLAMA_CURL=OFF to test-yaml-config workflow - Remove trailing whitespace from workflow, config.cpp, arg.cpp - Add yaml-cpp build flags to disable tests/tools/contrib - Fix spacing consistency in config.cpp Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Update CI workflow to use existing tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf model - Update all config files and tests to reference the existing model - Scope yaml-cpp dependency to LLAMA_BUILD_TOOLS only with compile guards - Suppress all warnings for yaml-cpp to avoid -Werror failures - This resolves the 404 model download and cross-platform build issues Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

…; run unit+integration tests - Use git-lfs to download stories15M-q4_0.gguf from ggml-org/models - Download model before CMake configure so if(EXISTS ...) condition works - Update all config files and tests to use consistent model path - Run comprehensive YAML config test suite in CI Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

- Replace git-lfs clone of entire ggml-org/models repo (10.91 GiB) - Use direct wget download of stories15M-q4_0.gguf (19MB only) - Prevents 'No space left on device' errors in GitHub Actions Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

…ONFIG_YAML guards - Wrap --config option definition with LLAMA_ENABLE_CONFIG_YAML in arg.cpp - Guard all YAML-dependent code sections in config.cpp and config.h - Ensures yaml-cpp is only compiled when LLAMA_BUILD_TOOLS=ON - Prevents platform builds that don't need tools from pulling yaml-cpp dependencies Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

…gml-org#16038) Initalizing RESERVED_NAME in is_reserved_name() is not thread safe and leads to corrupted memory when used from multiple threads as can be seen in the asan trace below. This fixes the initialization to make it thread-safe. #0 0x000100abd018 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) __hash_table:1565 #1 0x000100ab0320 in SchemaConverter::visit(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) json-schema-to-grammar.cpp:802 #2 0x000100aafc48 in std::__1::__function::__func<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2, std::__1::allocator<build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&)::$_2>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319 #3 0x000100a2c938 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&), std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0::operator()(common_grammar_builder const&) const::'lambda'(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>, void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)>::operator()(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&) function.h:319 #4 0x000100a139f8 in foreach_function(nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&, std::__1::function<void (nlohmann::json_abi_v3_12_0::basic_json<nlohmann::json_abi_v3_12_0::ordered_map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool, long long, unsigned long long, double, std::__1::allocator, nlohmann::json_abi_v3_12_0::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char>>, void> const&)> const&) chat.cpp:762 #5 0x000100a2a7f4 in std::__1::__function::__func<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0, std::__1::allocator<common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool)::$_0>, void (common_grammar_builder const&)>::operator()(common_grammar_builder const&) function.h:319 #6 0x000100aa98f4 in build_grammar(std::__1::function<void (common_grammar_builder const&)> const&, common_grammar_options const&) json-schema-to-grammar.cpp:982 #7 0x0001009c9314 in common_chat_params_init_llama_3_x(minja::chat_template const&, templates_params const&, bool) chat.cpp:1110 #8 0x0001009b8afc in common_chat_templates_apply_jinja(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:1992 #9 0x0001009b533c in common_chat_templates_apply(common_chat_templates const*, common_chat_templates_inputs const&) chat.cpp:2074 #10 0x000100810120 in llamacpp_apply_chat_template+0x724 (predict_oai-98384e17fb94e863:arm64+0x100090120) ... ==45482==Register values: x[0] = 0x00006020004147f8 x[1] = 0x00006080000013c8 x[2] = 0x0000000000000000 x[3] = 0x0000604006289738 x[4] = 0x0000000000000002 x[5] = 0x0000000000000001 x[6] = 0x04034000004b4000 x[7] = 0x0000000000000001 x[8] = 0xbebebebebebebebe x[9] = 0x17d7d7d7d7d7d7d7 x[10] = 0x00000c04000828ff x[11] = 0x0000000000000001 x[12] = 0x000000002018d383 x[13] = 0x0000000000000000 x[14] = 0xfa0000000000fafa x[15] = 0x000010700001ffff x[16] = 0x000000019dc012c0 x[17] = 0x00000001021284f8 x[18] = 0x0000000000000000 x[19] = 0x00000001700acdc0 x[20] = 0x0000000000000002 x[21] = 0x000000002018d384 x[22] = 0x16dd16fd2e731151 x[23] = 0x0000007000020000 x[24] = 0x0000000100c69c08 x[25] = 0x0000000100c69c20 x[26] = 0x00006080000013c7 x[27] = 0x0000000100c69c00 x[28] = 0x00000001700acd60 fp = 0x00000001700aceb0 lr = 0x0000000100abce30 sp = 0x00000001700acd60 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV __hash_table:1565 in std::__1::pair<std::__1::__hash_iterator<std::__1::__hash_node<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, void*>*>, bool> std::__1::__hash_table<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>::__emplace_unique_key_args<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) Thread T5 created by T0 here: #0 0x0001020b99d4 in pthread_create+0x5c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x359d4) #1 0x000100873910 in std::sys::pal::unix::thread::Thread::new::h77254fdd87a28e05+0x118 (predict_oai-98384e17fb94e863:arm64+0x1000f3910) #2 0x0001007c7a1c in test::run_test::haeb3c2bcd5ed6cf6+0x76c (predict_oai-98384e17fb94e863:arm64+0x100047a1c) #3 0x0001007aedb0 in test::console::run_tests_console::he9d142d704f3a986+0x149c (predict_oai-98384e17fb94e863:arm64+0x10002edb0) #4 0x0001007c5758 in test::test_main::hf86a5e20735245b9+0x118 (predict_oai-98384e17fb94e863:arm64+0x100045758) #5 0x0001007c5da0 in test::test_main_static::h61ee9c8fd30abca0+0x54 (predict_oai-98384e17fb94e863:arm64+0x100045da0) ... ==45482==ABORTING

devin-ai-integration bot and others added 6 commits September 12, 2025 12:19

github-actions bot added testing devops labels Sep 12, 2025

devin-ai-integration bot and others added 6 commits September 12, 2025 13:31

fix(editorconfig): remove trailing whitespace in CMakeLists.txt

3b6d738

Co-Authored-By: Jaime Mizrachi <jaime@cognition.ai>

This was referenced Oct 22, 2025

Compile bug: Linker crash with tagged pointer truncation on Android 15 (Termux) #49

Open

Eval bug: HIP gfx908 (MI100) cublass error when prompt is too long. #154

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add YAML configuration support to llama-cli with --config flag #2

Add YAML configuration support to llama-cli with --config flag #2

Uh oh!

devin-ai-integration bot commented Sep 12, 2025

Uh oh!

devin-ai-integration bot commented Sep 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Add YAML configuration support to llama-cli with --config flag #2

Are you sure you want to change the base?

Add YAML configuration support to llama-cli with --config flag #2

Uh oh!

Conversation

devin-ai-integration bot commented Sep 12, 2025

Summary

Key Features

Example Usage

Implementation Details

Architecture

Testing

Files Changed

Review Checklist

Uh oh!

devin-ai-integration bot commented Sep 12, 2025

🤖 Devin AI Engineer

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants