the problem about mosaicml-streaming #106

Closed
sysusicily opened this issue May 11, 2023 · 1 comment · Fixed by #110

Comments

@sysusicily

After installing with pip install -e ".[gpu]", I get an import error related to mosaicml-streaming:
#-------------------------------------------------------------------------------------------
root@7730f5bd29fa:/home/mosaicml/llm-foundry# pip list|grep stream
mosaicml-streaming 0.2.1
root@7730f5bd29fa:/home/mosaicml/llm-foundry#
root@7730f5bd29fa:/home/mosaicml/llm-foundry# python scripts/data_prep/convert_dataset_hf.py --dataset c4 --data_subset en --out_root ./my-copy-c4 --splits train_small val_small --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text '<|endoftext|>'
Traceback (most recent call last):
  File "/home/mosaicml/llm-foundry/llmfoundry/__init__.py", line 8, in <module>
    from llmfoundry.data import (ConcatTokensDataset,
  File "/home/mosaicml/llm-foundry/llmfoundry/data/__init__.py", line 5, in <module>
    from llmfoundry.data.denoising import (MixtureOfDenoisersCollator,
  File "/home/mosaicml/llm-foundry/llmfoundry/data/denoising.py", line 19, in <module>
    from llmfoundry.data.text_data import StreamingTextDataset
  File "/home/mosaicml/llm-foundry/llmfoundry/data/text_data.py", line 15, in <module>
    from streaming import Stream, StreamingDataset
ImportError: cannot import name 'Stream' from 'streaming' (/usr/lib/python3/dist-packages/streaming/__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/mosaicml/llm-foundry/scripts/data_prep/convert_dataset_hf.py", line 19, in <module>
    from llmfoundry.data.datasets import ConcatTokensDataset, NoConcatDataset
  File "/home/mosaicml/llm-foundry/llmfoundry/__init__.py", line 32, in <module>
    raise ImportError(
ImportError: Please make sure to pip install . to get the requirements for the LLM example.
root@7730f5bd29fa:/home/mosaicml/llm-foundry#

@hanlint
Collaborator

hanlint commented May 11, 2023

Hello @sysusicily, the Stream functionality was introduced in v0.4.0 (Release Notes), but you are on v0.2.1. Can you try:

pip install mosaicml-streaming==0.4.1

We will update our install requirements to have this minimum dependency, thank you!
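
To confirm which release is actually installed before and after upgrading, a small check along these lines may help. This is only a sketch: the helper names (parse_version, meets_minimum, installed_streaming_version) are made up for illustration, the 0.4.0 threshold comes from the reply above, and the parsing assumes plain numeric versions such as 0.2.1 (no rc or dev suffixes).

```python
from importlib import metadata


def parse_version(text):
    """Turn a simple release string like '0.4.1' into a 3-tuple of ints.

    Assumption: the version is purely numeric (no 'rc1' or '.dev0' suffix).
    """
    parts = [int(part) for part in text.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad '0.4' to (0, 4, 0)
    return tuple(parts)


def meets_minimum(installed, minimum="0.4.0"):
    """Return True when the installed release is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_streaming_version(dist_name="mosaicml-streaming"):
    """Look up the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_streaming_version()
    if version is None:
        print("mosaicml-streaming is not installed")
    elif meets_minimum(version):
        print(f"mosaicml-streaming {version} should provide Stream")
    else:
        print(f"mosaicml-streaming {version} is older than 0.4.0; upgrade it")
```

Run it with the same python interpreter that runs convert_dataset_hf.py. The traceback above shows streaming being imported from /usr/lib/python3/dist-packages, so if the reported version does not change after the upgrade, pip may be installing into a different environment than the one the script uses.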

bmosaicml pushed a commit that referenced this issue Jun 6, 2023
* mmllllm

* factor out fsdp to common, support enc-dec

Moves FSDP HF utils to common so they can be used in UL2, and adds support for encoder-decoder models including t5, mt5, t0pp, bart, pegasus, marian, prophetnet

* delete m2l4m yaml