Youtube ingestion doesn't work #1648

Closed
anushaharish538 opened this issue May 28, 2024 · 3 comments

Comments


anushaharish538 commented May 28, 2024

Hi,
YouTube ingestion doesn't work. It converts the video to audio, but then:
[download] Destination: /tmp/gradio/6ce6b3f2-2e2a-4a1f-81e8-a58785d61295/What Is LangChain? - LangChain + ChatGPT Overview.m4a
[download] 100% of 5.92MiB in 00:00:00 at 39.03MiB/s
[FixupM4a] Correcting container of "/tmp/gradio/6ce6b3f2-2e2a-4a1f-81e8-a58785d61295/What Is LangChain? - LangChain + ChatGPT Overview.m4a"
[ExtractAudio] Not converting audio /tmp/gradio/6ce6b3f2-2e2a-4a1f-81e8-a58785d61295/What Is LangChain? - LangChain + ChatGPT Overview.m4a; file is already in target format m4a
Transcribing part /tmp/gradio/6ce6b3f2-2e2a-4a1f-81e8-a58785d61295/What Is LangChain? - LangChain + ChatGPT Overview.m4a!
Due to a bug fix in huggingface/transformers#28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English. This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.

It runs for a long time and then gets killed automatically after a while.
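As an aside, the warning in the log above suggests pinning the language to skip the detection pass. A minimal sketch of how that could look with the transformers ASR pipeline (the helper function and the pipeline wiring here are illustrative assumptions, not h2ogpt's actual code):

```python
def whisper_generate_kwargs(language=None, task="transcribe"):
    """Build generate_kwargs for a multilingual Whisper pipeline.

    With language=None the model first detects the language (the new
    default after huggingface/transformers#28687); passing "en" forces
    English output unconditionally.
    """
    kwargs = {"task": task}
    if language is not None:
        kwargs["language"] = language
    return kwargs


if __name__ == "__main__":
    # Heavy: downloads the checkpoint; CPU inference is slow.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition",
                   model="openai/whisper-medium", device=-1)  # -1 = CPU
    out = asr("audio.m4a",
              generate_kwargs=whisper_generate_kwargs(language="en"))
    print(out["text"])
```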

@pseudotensor I think it's an issue with the database. Can you please confirm? Running `python src/make_db.py` gives:

Exceptions: 0/0 []
Traceback (most recent call last):
  File "/home/anushaharish538/as/h2ogpt/src/make_db.py", line 403, in <module>
    H2O_Fire(make_db_main)
  File "/home/anushaharish538/as/h2ogpt/src/utils.py", line 73, in H2O_Fire
    fire.Fire(component=component, command=args)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/anushaharish538/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/anushaharish538/as/h2ogpt/src/make_db.py", line 389, in make_db_main
    assert len(sources) > 0 or not fail_if_no_sources, "No sources found"
AssertionError: No sources found
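For context, the failing line in the traceback is a simple guard; a stripped-down reproduction (the real make_db_main takes many more arguments):

```python
def check_sources(sources, fail_if_no_sources=True):
    """Simplified version of the guard at src/make_db.py line 389:
    ingestion must yield at least one document, unless failures are
    explicitly tolerated via fail_if_no_sources=False."""
    assert len(sources) > 0 or not fail_if_no_sources, "No sources found"
    return sources
```

So the traceback only means the YouTube ingestion step produced zero documents before the database was built; it does not point at a database corruption by itself.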

The issue is only with YouTube; PDFs work fine.

Collaborator

pseudotensor commented May 28, 2024

I fixed CPU mode: #1643

However, yes, CPU ASR is very slow. I tried the same video you used on CPU vs. GPU, and CPU ASR runs for a while on my 8-core i9 system, maybe 2 minutes.

So I don't think ASR is good on CPU.

However, after that, things actually hang.

Logs are like this for me using CPU:

WARNING! Model override. Using model:  openai/whisper-medium
Using the following model:  openai/whisper-medium
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
No optimum, not using BetterTransformer: Transformers now supports natively BetterTransformer optimizations (torch.nn.functional.scaled_dot_product_attention) for the model type whisper. As such, there is no need to use `model.to_bettertransformers()` or `BetterTransformer.transform(model)` from the Optimum library. Please upgrade to transformers>=4.36 and torch>=2.1.1 to use it. Details: https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-and-memory-efficient-attention-through-pytorchs-scaleddotproductattention.
[youtube] Extracting URL: https://www.youtube.com/watch?v=_v_fgW2SkkQ
[youtube] _v_fgW2SkkQ: Downloading webpage
[youtube] _v_fgW2SkkQ: Downloading ios player API JSON
[youtube] _v_fgW2SkkQ: Downloading m3u8 information
[info] _v_fgW2SkkQ: Downloading 1 format(s): 140
[download] Destination: /tmp/gradio/78817303-11f7-43fc-b98e-89f67d47cff5/What Is LangChain? - LangChain + ChatGPT Overview.m4a
[download] 100% of    5.92MiB in 00:00:00 at 32.78MiB/s  
[FixupM4a] Correcting container of "/tmp/gradio/78817303-11f7-43fc-b98e-89f67d47cff5/What Is LangChain? - LangChain + ChatGPT Overview.m4a"
[ExtractAudio] Not converting audio /tmp/gradio/78817303-11f7-43fc-b98e-89f67d47cff5/What Is LangChain? - LangChain + ChatGPT Overview.m4a; file is already in target format m4a
Transcribing part /tmp/gradio/78817303-11f7-43fc-b98e-89f67d47cff5/What Is LangChain? - LangChain + ChatGPT Overview.m4a!
Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.

INFO:eta.core.utils: 100% |████████████| 1/1 [2.8s elapsed, 0s remaining, 0.4 videos/s]
INFO:eta.core.utils: 100% |███████████| 1/1 [29.2ms elapsed, 0s remaining, 34.5 samples/s]
INFO:fiftyone.core.metadata:Computing metadata...
INFO:eta.core.utils: 100% |███████████| 1/1 [51.5ms elapsed, 0s remaining, 19.4 samples/s]
INFO:eta.core.utils: 100% |███████| 384/384 [5.3ms elapsed, 0s remaining, 72.3K samples/s]
INFO:eta.core.utils: 100% |███████| 384/384 [5.7ms elapsed, 0s remaining, 67.2K samples/s]
INFO:fiftyone.core.video:Setting 384 frame filepaths on the input collection that exist on disk but are not recorded on the dataset
INFO:fiftyone.core.video:Sampling video frames...
INFO:eta.core.utils: 100% |███████████| 1/1 [4.5s elapsed, 0s remaining, 0.2 samples/s]
INFO:fiftyone.brain.internal.core.utils:Computing embeddings...
INFO:eta.core.utils: 100% |███████| 384/384 [1.4m elapsed, 0s remaining, 5.0 samples/s]
INFO:fiftyone.brain.similarity:Computing unique samples...
INFO:fiftyone.brain.internal.core.sklearn:Generating index for 384 embeddings...
INFO:fiftyone.brain.internal.core.sklearn:Index complete
INFO:fiftyone.brain.similarity:threshold: 1.000000, kept: 3, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.500000, kept: 6, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.250000, kept: 12, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.375000, kept: 7, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.312500, kept: 9, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.281250, kept: 11, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.296875, kept: 9, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.289062, kept: 9, target: 10
INFO:fiftyone.brain.similarity:threshold: 0.285156, kept: 10, target: 10
INFO:fiftyone.brain.similarity:Uniqueness computation complete
WARNING:fiftyone.core.collections:Directory '/tmp/gradio/extraction_61cafcb3-c183-4557-b943-f27e15cf5982' already exists; export will be merged with existing files
INFO:eta.core.utils: 100% |█████████| 10/10 [9.5ms elapsed, 0s remaining, 1.1K samples/s]
0it [00:00, ?it/s]
No acceptable contours found. (message repeated 8 times)
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torchvision/models/_utils.py:135: UserWarning: Using 'weights' as positional parameter(s) is deprecated since 0.13 and may be removed in the future. Please use keyword parameter(s) instead.
  warnings.warn(
/home/jon/miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=None`.
  warnings.warn(msg)

@pseudotensor
Collaborator

Seems to be stuck in DocTR for me.

I should disable DocTR when in CPU mode.
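The fix described could be as simple as gating the DocTR OCR stage on device availability; a hypothetical sketch (the flag name and "auto" convention are made up, not h2ogpt's real options):

```python
def resolve_doctr(enable_doctr, have_gpu):
    """Decide whether to run the DocTR OCR stage.

    DocTR is prohibitively slow on CPU, so "auto" only enables it when
    a GPU is available; explicit True/False always wins.
    """
    if enable_doctr == "auto":
        return have_gpu
    return bool(enable_doctr)
```

For example, `resolve_doctr("auto", have_gpu=False)` skips DocTR on a CPU-only box, while a user can still force it with `enable_doctr=True`.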

@pseudotensor
Collaborator


Should work now; at least it no longer hangs.
