Q and A not working for Youtube #1643

Closed
anushaharish538 opened this issue May 23, 2024 · 7 comments

@anushaharish538

Hi,

When I add a YouTube video URL and click Ingest, it reports docs:1 chunks:3, but when I ask any question about the video, it connects to the internet to answer. It works fine for PDFs.

Regards,
Anusha

@pseudotensor
Collaborator

Review the document viewer and check what is in the database. Nominally, for a YouTube video you will get both 10 frames and the audio transcription.

What does the console say when you upload a video? It should show lots of details if you run in --verbose mode.

@pseudotensor added the type/question label May 23, 2024
@anushaharish538
Author

anushaharish538 commented May 23, 2024

@pseudotensor Thank you for the reply. I'm running on CPU. Will the YouTube chat work on CPU?

I'm getting the following error.
Traceback (most recent call last):
  File "/home/anushaharish538/as/h2ogpt/src/gpt_langchain.py", line 4034, in file_to_doc
    docs1c = model_loaders['asr'].load(from_youtube=True)
  File "/home/anushaharish538/as/h2ogpt/src/audio_langchain.py", line 387, in load
    self.load_model()
  File "/home/anushaharish538/as/h2ogpt/src/audio_langchain.py", line 368, in load_model
    self.model = OpenAIWhisperParserLocal(device=self.device,
  File "/home/anushaharish538/as/h2ogpt/src/audio_langchain.py", line 183, in __init__
    self.pipe = pipeline(
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 906, in pipeline
    framework, model = infer_framework_load_model(
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/pipelines/base.py", line 283, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3709, in from_pretrained
    check_tied_parameters_on_same_device(tied_params, device_map)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 598, in check_tied_parameters_on_same_device
    tie_param_devices[param] = _get_param_device(param, device_map)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 580, in _get_param_device
    return _get_param_device(parent_param, device_map)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 580, in _get_param_device
    return _get_param_device(parent_param, device_map)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 580, in _get_param_device
    return _get_param_device(parent_param, device_map)
  [Previous line repeated 1 more time]
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 575, in _get_param_device
    return device_map[param]
TypeError: 'set' object is not subscriptable
ASR: 'set' object is not subscriptable: None
Failed to ingest https://www.youtube.com/watch?v=U1JMy1LTSu8 due to Traceback (most recent call last):
  File "/home/anushaharish538/as/h2ogpt/src/gpt_langchain.py", line 4853, in path_to_doc1
    res = file_to_doc(file,
  File "/home/anushaharish538/as/h2ogpt/src/gpt_langchain.py", line 4081, in file_to_doc
    raise ValueError("%s had no valid text and no meta data was parsed: %s" % (file, str(e)))
ValueError: https://www.youtube.com/watch?v=U1JMy1LTSu8 had no valid text and no meta data was parsed: 'set' object is not subscriptable
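
Editorial note on the error above: the last frame of the traceback indexes device_map as if it were a dict (`return device_map[param]`), and the TypeError says it received a set instead. The sketch below reproduces only that mechanism in plain Python. The names lookup_device, good_map, and bad_map are hypothetical, and nothing here is a claim about where in h2oGPT or transformers the set is actually created.

```python
# Illustrative only: mimics the lookup in the final traceback frame,
# `return device_map[param]`. The helper name is hypothetical.
def lookup_device(device_map, param):
    """Return the device assigned to `param` via a plain subscript lookup."""
    return device_map[param]


# A dict maps parameter-name prefixes to devices.
good_map = {"": "cpu"}
print(lookup_device(good_map, ""))

# Braces with a comma instead of a colon build a set, not a dict,
# and a set cannot be indexed with [].
bad_map = {"", "cpu"}
try:
    lookup_device(bad_map, "")
except TypeError as err:
    print(f"TypeError: {err}")
```

If a device map is being passed somewhere in the ASR loading path, checking its type (dict versus set) just before the pipeline call would be one way to narrow this down.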

@pseudotensor
Collaborator

Try setting --asr_gpu=False

@anushaharish538
Author

Thank you for the help. I tried that, but I still get the same issue.
I'm using OpenAI:
OPENAI_API_KEY=Key python generate.py --base_model=gpt-4o --prompt_type=mixtral --inference_server=openai_chat --score_model=None --append_sources_to_answer=True --langchain_mode=UserData --asr_gpu=False

@anushaharish538
Author

@pseudotensor I think it's an issue with the database. Can you please confirm? Running python src/make_db.py gives:
Exceptions: 0/0 []
Traceback (most recent call last):
  File "/home/anushaharish538/as/h2ogpt/src/make_db.py", line 403, in <module>
    H2O_Fire(make_db_main)
  File "/home/anushaharish538/as/h2ogpt/src/utils.py", line 73, in H2O_Fire
    fire.Fire(component=component, command=args)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/anushaharish538/as/h2ogpt/src/make_db.py", line 389, in make_db_main
    assert len(sources) > 0 or not fail_if_no_sources, "No sources found"
AssertionError: No sources found

The issue occurs only with YouTube. PDFs work fine.

@anushaharish538
Author

@pseudotensor Can you please tell me whether the YouTube chat works on CPU?

Thank you.

@pseudotensor
Collaborator

asr_gpu is already set to False when there are no GPUs, so it's not that.
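
Editorial note: a minimal sketch of the behaviour described in this reply, assuming the option is simply forced off when no GPU is detected. The function resolve_asr_gpu and its n_gpus argument are hypothetical and are not taken from the h2oGPT source.

```python
def resolve_asr_gpu(asr_gpu: bool, n_gpus: int) -> bool:
    """Hypothetical helper: keep ASR on the GPU only if one is available."""
    return bool(asr_gpu and n_gpus > 0)


# On a CPU-only machine the requested value makes no difference,
# which is why passing --asr_gpu=False would not change the outcome.
print(resolve_asr_gpu(True, n_gpus=0))
print(resolve_asr_gpu(False, n_gpus=0))
```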
