Downloading issue #9

Open
chaoqunwangcs opened this issue Mar 25, 2024 · 11 comments

Comments

@chaoqunwangcs

Thanks for the great work!
I would like to download the YouTube data with the given script, but the download command "youtube-dl -f {args} {url}" is not working; the error message is "ERROR: Unable to extract uploader id". Could you provide another download script, or is there something I am missing? One solution is to use the 'yt-dlp' package, but I am unsure which download arguments to use, such as the resolution and ext, since they affect the image quality and the dataset size.

@YTEP-ZHI
Collaborator

Hi @chaoqunwangcs, thanks for your feedback. This issue arises when YouTubers delete their videos or make them private, which makes the video links invalid. One solution is to skip these missing videos, and we will update the code soon.

@chaoqunwangcs
Author

Thanks for your reply. However, the issue is not caused by YouTubers deleting their videos or making them private, since the 'yt-dlp' package can download the video successfully. To match your dataset, which was downloaded with the 'youtube-dl' package, I'd like to ask for details such as the resolution and ext, since they affect the image quality and the dataset size.

@GihhArwtw
Contributor

GihhArwtw commented Mar 26, 2024

Hi @chaoqunwangcs, could you please give more information (video_id or url) about the video that leads to ERROR: Unable to extract uploader id? That'll help us handle the issue faster.

As for resolution, most videos are downloaded at 1080p (720p for those without a 1080p version). As for ext, all videos are in either mp4 or webm format. You can refer to https://github.com/OpenDriveLab/DriveAGI/blob/main/opendv/configs/download.json#L4 for some details.

I'll try to make a download script for yt-dlp asap.
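
For readers who want to approximate the "1080p, otherwise 720p" policy described above, here is a minimal sketch of how such a command line could be assembled. It is not the project's script: the actual arguments live in the linked download.json and are not reproduced here. The function name `build_download_command`, the output layout, and the exact format selector are assumptions made for illustration. The `-f` and `-o` options and the `bestvideo[height<=N]+bestaudio/best[height<=N]` selector syntax are accepted by both youtube-dl and yt-dlp.

```python
import shlex


def build_download_command(url, out_dir, tool="yt-dlp", max_height=1080):
    """Return an argv list for youtube-dl or yt-dlp (both accept -f and -o)."""
    # Prefer separate video and audio streams capped at max_height; otherwise
    # fall back to the best single-file format under the same cap. With a cap
    # of 1080, a video that only offers 720p is fetched at 720p.
    fmt = f"bestvideo[height<={max_height}]+bestaudio/best[height<={max_height}]"
    # %(id)s and %(ext)s are output-template fields that the tool fills in.
    out_template = f"{out_dir}/%(id)s.%(ext)s"
    return [tool, "-f", fmt, "-o", out_template, url]


cmd = build_download_command(
    "https://www.youtube.com/watch?v=--I-TdCe2_g", "OpenDV-YouTube/videos"
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Raising `max_height` (for example to 2160) would request higher-resolution streams where they exist, at the cost of disk space.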

@chaoqunwangcs
Author

Thanks for your reply. To reproduce the failure, just run the command 'youtube-dl https://www.youtube.com/watch?v=--I-TdCe2_g' (this is just the first video) with the latest 'youtube-dl' package (installed with "pip install youtube-dl"). Besides, many videos are available at higher resolutions such as 4K (3840×2160). Have you ever measured the dataset size at the highest resolution?

@GihhArwtw
Contributor

  1. It seems that the command works fine on our server. Maybe the issue has something to do with the network conditions?
    Since the yt-dlp package works fine on your end, I think I can add another download script that uses yt-dlp.

  2. Though many videos support 2K or 4K resolution, we still download their 1080p versions, since the processed data would otherwise take up a lot of disk space. But of course you can download videos at a higher resolution if you need to.

@GihhArwtw
Contributor

Hi @chaoqunwangcs, I just updated the download script.

To download videos using yt-dlp, you just need to change the method setting in configs/download.json to yt-dlp. Please let us know if there are any further problems.
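
The config change described above can be made by hand in a text editor. For anyone scripting it, the following is a small sketch under stated assumptions: that configs/download.json is a JSON object and that the setting is stored under a top-level key named "method" (the key name is inferred from the comment above, not checked against the repository). The helper `set_download_method` is hypothetical.

```python
import json
from pathlib import Path

# The two download tools discussed in this thread.
SUPPORTED_METHODS = ("youtube-dl", "yt-dlp")


def set_download_method(config_path, method):
    """Rewrite a JSON config so that its "method" entry names the given tool."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported download method: {method!r}")
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["method"] = method  # assumed key name; all other entries are kept
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

A call such as `set_download_method("configs/download.json", "yt-dlp")` would then switch the tool while leaving the remaining settings untouched.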

@makolon

makolon commented Apr 9, 2024

Hi @GihhArwtw,

I encountered an issue while trying to download videos using yt-dlp. After installing yt-dlp via the pip command and configuring it as the download method, I received the following error and warning:

$ python scripts/youtube_download.py >> download_output.txt
  0%|                                                                       | 0/2139 [00:00<?, ?it/s]
WARNING: [youtube] Skipping player responses from android clients (got player responses for video "aQvGIIdgFDM" instead of "9fZl32pIdCM")
ERROR: [youtube] 9fZl32pIdCM: Video unavailable. This video is no longer available because the YouTube account associated with this video has been terminated.

~~~

"""
Traceback (most recent call last):
  File "/root/DriveAGI/opendv/scripts/youtube_download.py", line 36, in single_download
    raise Exception("ERROR: Video unavailable or network error.")
Exception: ERROR: Video unavailable or network error.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/root/DriveAGI/opendv/scripts/youtube_download.py", line 39, in single_download
    with open(CONFIGS.exception_file, "a") as f:
AttributeError: 'EasyDict' object has no attribute 'exception_file'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/DriveAGI/opendv/scripts/youtube_download.py", line 102, in <module>
    multiple_download(video_list, configs)
  File "/root/DriveAGI/opendv/scripts/youtube_download.py", line 56, in multiple_download
    for _ in tqdm(p.imap(single_download, video_list), total=video_count):
  File "/usr/local/lib/python3.10/dist-packages/tqdm/std.py", line 1182, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
AttributeError: 'EasyDict' object has no attribute 'exception_file'

Could you please help me resolve this issue? Any assistance would be greatly appreciated.

Thank you!
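
The final AttributeError in the traceback above comes from the error handler itself: it reads `CONFIGS.exception_file`, which the loaded config does not define, so one unavailable video ends up aborting the whole pool. One defensive pattern is sketched below. It is not the fix that was later committed to the repository; `record_failure` and the fallback file name are hypothetical, and `SimpleNamespace` stands in for EasyDict, which likewise raises AttributeError on a missing attribute.

```python
from types import SimpleNamespace


def record_failure(configs, video_id, reason):
    """Append a failed video ID to a log file, even if no log path is configured."""
    # getattr with a default avoids the AttributeError seen in the traceback
    # when the config has no "exception_file" entry. The fallback file name
    # here is hypothetical.
    log_path = getattr(configs, "exception_file", "download_exceptions.txt")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{video_id}\t{reason}\n")
    return log_path


# Stand-in for a config object that lacks the "exception_file" setting.
configs_without_log = SimpleNamespace(method="yt-dlp")
```

Calling `record_failure(configs_without_log, "9fZl32pIdCM", "Video unavailable")` would append one line to the fallback file instead of raising, so the remaining downloads can continue.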

@makolon

makolon commented Apr 9, 2024

Thanks for your reply!!!
I initially thought that the process was stuck at the WARNING, but it actually continued!
However, the output now contains errors about files that could not be renamed. Is it safe to ignore these errors?

ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/USA_Thrill/B1rC5Ni8Dgk.webm.part' -> 'OpenDV-YouTube/videos/USA_Thrill/B1rC5Ni8Dgk.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/J_Utah/Arz8k37-9F4.webm.part' -> 'OpenDV-YouTube/videos/J_Utah/Arz8k37-9F4.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Relaxing_Walks/BWAbBu7uNdA.webm.part' -> 'OpenDV-YouTube/videos/Relaxing_Walks/BWAbBu7uNdA.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Zhejiang_Street_Scenes/A8DZaOwQQ8U.webm.part' -> 'OpenDV-YouTube/videos/Zhejiang_Street_Scenes/A8DZaOwQQ8U.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Relaxing_Walks/A_cPQJt-id4.webm.part' -> 'OpenDV-YouTube/videos/Relaxing_Walks/A_cPQJt-id4.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/The_Driving_Channel/AoXyxEi09CI.webm.part' -> 'OpenDV-YouTube/videos/The_Driving_Channel/AoXyxEi09CI.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Relaxing_Scenes_-_Driving/CBtT-zVekxg.webm.part' -> 'OpenDV-YouTube/videos/Relaxing_Scenes_-_Driving/CBtT-zVekxg.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Relaxing_Walks/BDXHtLqFUf4.webm.part' -> 'OpenDV-YouTube/videos/Relaxing_Walks/BDXHtLqFUf4.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/J_Utah/A0F1ZKqmavc.webm.part' -> 'OpenDV-YouTube/videos/J_Utah/A0F1ZKqmavc.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Driving_in_China/BN2hmwUJH9o.webm.part' -> 'OpenDV-YouTube/videos/Driving_in_China/BN2hmwUJH9o.webm'
ERROR: Unable to rename file: [Errno 2] No such file or directory: 'OpenDV-YouTube/videos/Wheels_Around_The_World/BHm7wQ8Y_cU.webm.part' -> 'OpenDV-YouTube/videos/Wheels_Around_The_World/BHm7wQ8Y_cU.webm'

@GihhArwtw
Contributor

GihhArwtw commented Apr 9, 2024

Hi @makolon,
I'll look into it. Maybe you could try running python with sudo and see if the errors still occur. I'm not sure whether the problem has something to do with the directory permissions on your end.

Also, I'll fix the EasyDict bug today. I don't know how I missed it before, probably because the download process in my tests could continue despite the error message 😂

@GihhArwtw
Contributor

Hi @makolon, I just fixed the EasyDict bug.
Still, I could not reproduce the error you reported in your later comment. But I think it is not safe to ignore it. Both youtube-dl and yt-dlp need to rename *.part (the temporary file) to *.<EXT> when they finish downloading.
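
Because a failed rename means the finished file may never have been written, one way to see whether the errors mattered is to check afterwards which videos actually exist on disk. The sketch below assumes the `videos/<channel>/<video_id>.<ext>` layout visible in the error messages and the mp4/webm extensions mentioned earlier in the thread. `missing_videos` and its `expected` argument are hypothetical names, not part of the project's scripts.

```python
from pathlib import Path

# The two container formats mentioned earlier in the thread.
VIDEO_EXTS = (".mp4", ".webm")


def missing_videos(video_root, expected):
    """Return the (channel, video_id) pairs that have no finished video file."""
    root = Path(video_root)
    missing = []
    for channel, video_id in expected:
        # A leftover "<id>.webm.part" does not count: only the final name does.
        done = any((root / channel / f"{video_id}{ext}").exists() for ext in VIDEO_EXTS)
        if not done:
            missing.append((channel, video_id))
    return missing
```

For example, `missing_videos("OpenDV-YouTube/videos", [("J_Utah", "Arz8k37-9F4")])` returns an empty list only if that video was fully written; any pair it returns is a candidate for re-downloading.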

@makolon

makolon commented Apr 9, 2024

hi @GihhArwtw!
Thank you! After pulling the revised code and running it again, it seems like the download was successful!
