add validation script
update
change token count function
reorganize cells
Add unit tests
Add a printout for CPT
update question
Add questions
Fix lints
update format
update
nb source
Remove license insert for validation notebook
Add validation utils
Minor cleanups (mosaicml/llm-foundry#858)
nits
logger
add log
lint
update utils/__init__.py to include extra validation functions
update notebook
update
update
Read UC delta table (mosaicml/llm-foundry#773)
initial commit
use databricks-sql to read delta table and convert to json
update
update
update
add mocked unittest
Fix lints
update
update
restructure code
Add timer for optimizing
Add db-connect
add wrapper
update
add install dbconnect
update
update
patch dbconnect to allow multiple return formats
update
add arrow
use compression
clean up
Add cluster rt check
Fix lints
remove patch.py for CI
update
update
update
update
fix tests
fix lint
update
update
Add more tests
update
update
update
change to download_json
update
fix lints
Add decompressed option for arrow
format json to jsonl
Add comments
Make cf_collect_type global option
fix comments
fix lints
fix comments
Fix lints
change to use workspaceclient
Add CPT support
Rewire method assignment logic
Fix bug in stripping https
Add tests for rewired method assignment logic
Fix lints
Fix lints
Removed logger set_level
Remove pyspark. It conflicts with databricks-connect
Update the comment
skip cluster version check when cluster_id is serverless
Add use_serverless flag
update tests with use_serverless flag
Fix lints
Add download remote function to util
update
Remove fused layernorm, deprecated in composer (mosaicml/llm-foundry#859)
update
update
update
update
update
update
update
update
update
Remove hardcoded combined.jsonl with a flag (mosaicml/llm-foundry#861)
Remove hardcoded combined.jsonl with a flag
update
change output_json_path to output_json_folder
Bump to turbo v8 (mosaicml/llm-foundry#828)
Add dask and dataframe_to_mds
update
update
update
update
Add notebook
update
update
remove script and tests, keep notebook
update
update
update
update
Always initialize dist (mosaicml/llm-foundry#864)
fix dev
lint
remove gpu
updated notebook
remove scripts keep notebook
update notebook. rephrase.
update
Add response tokens
update
update
Disable MDSWrite, return token counts
Change plot settings
update notebook
update
update notebook
update
update notebook
update pip install link
Change done file location
Create the dest folder
update notebook
update