torchtune training fails to validate dataset #1849

@booxter

System Info

.

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

When I try to use torchtune for post-training, it no longer works and fails with:

AttributeError: 'DatasetWithACL' object has no attribute 'dataset_schema'

This happens during dataset validation, in validate_input_dataset_schema (common/validator.py, line 50):

Error executing endpoint route='/v1/post-training/supervised-fine-tune' method='post'
Traceback (most recent call last):
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/distribution/server/server.py", line 201, in endpoint
    return await maybe_await(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/distribution/server/server.py", line 161, in maybe_await
    return await value
           ^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/post_training.py", line 89, in supervised_fine_tune
    await recipe.setup()
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py", line 198, in setup
    self._training_sampler, self._training_dataloader = await self._setup_data(
                                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py", line 342, in _setup_data
    await validate_input_dataset_schema(
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/common/validator.py", line 50, in validate_input_dataset_schema
    if not dataset_def.dataset_schema or len(dataset_def.dataset_schema) == 0:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/venv/lib64/python3.11/site-packages/pydantic/main.py", line 984, in __getattr__
    raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
AttributeError: 'DatasetWithACL' object has no attribute 'dataset_schema'
severity=<LogSeverity.ERROR: 'error'>

This happens because the dataset API was changed and the dataset_schema field was removed in #1573, while the torchtune validator still reads that field.
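
For illustration, here is a minimal sketch of the failure mode, not the actual llama-stack code. The DatasetWithACL class below is a stand-in dataclass with hypothetical fields (identifier, metadata), and has_dataset_schema is a hypothetical helper. It only shows why unconditional access to dataset_schema raises and how a guarded lookup behaves. A real fix would likely need to validate against whatever the new dataset API from #1573 exposes, which this sketch does not model.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DatasetWithACL:
    """Stand-in for the real model (assumption): it has no dataset_schema field."""

    identifier: str  # hypothetical field, for illustration only
    metadata: dict[str, Any] = field(default_factory=dict)  # hypothetical field


def has_dataset_schema(dataset_def: Any) -> bool:
    """Return True only if a non-empty dataset_schema attribute is present."""
    schema = getattr(dataset_def, "dataset_schema", None)
    return bool(schema)


dataset = DatasetWithACL(identifier="my-dataset")

# Direct attribute access, as the validator does, raises AttributeError
# because the field no longer exists on the object.
try:
    dataset.dataset_schema
    direct_access_failed = False
    error_message = ""
except AttributeError as exc:
    direct_access_failed = True
    error_message = str(exc)

# The guarded lookup reports the missing field without raising.
schema_present = has_dataset_schema(dataset)
```

A guard like this only turns the crash into a clean "no schema" result. It does not restore schema validation.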

Error logs

.

Expected behavior

.

Labels: bug
