milahu/opensubtitles-scraper

Latest commit: 118a7d3 · Nov 30, 2024

opensubtitles-scraper

scrape subtitles from opensubtitles.org

result

torrent RSS feed: opensubtitles.org.dump.torrent.rss

unreleased subs are stored in github.com/milahu/opensubtitles-scraper-new-subs

usage

run get-subs.py to get subtitles for a movie:

~/src/opensubtitles-scraper/get-subs.py Scary.Movie.2000.mp4

video_path Scary.Movie.2000.mp4
video_filename Scary.Movie.2000.mp4
video_parsed MatchesDict([('title', 'Scary Movie'), ('year', 2000), ('container', 'mp4'), ('mimetype', 'video/mp4'), ('type', 'movie')])
output 'Scary.Movie.2000.en.00018286.sub' from 'Scary_eng.txt' (us-ascii)
output 'Scary.Movie.2000.en.00018615.sub' from 'Scary Movie.txt' (us-ascii)
output 'Scary.Movie.2000.en.00106539.sub' from 'Scary Movie - ENG.txt' (us-ascii)
output 'Scary.Movie.2000.en.00117707.sub' from 'scream_english.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.00203573.sub' from 'Scary Movie - ENG.txt' (us-ascii)
output 'Scary.Movie.2000.en.00204203.sub' from 'Scary Movie_engl.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.03112243.srt' from 'Scary Movie 1 (2000).en.bug-fixed.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03142326.srt' from 'kns-sm.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03279944.srt' from 'Scary Movie 1 iNT DvD RiP- WaCkOs.srt' (Windows-1252)
output 'Scary.Movie.2000.en.03318665.srt' from 'rvlt-scarymovie.srt' (us-ascii)
output 'Scary.Movie.2000.en.03552139.srt' from 'Scary Movie.srt' (us-ascii)
output 'Scary.Movie.2000.en.03686957.sub' from 'Scary.Movie.(2000).DVDRIP.Divx.DOMiNION.sub' (iso-8859-1)
output 'Scary.Movie.2000.en.04867080.srt' from 'Scary.Movie.2000.BrRip.720p.x264.YIFY-eng.srt' (Windows-1252)
output 'Scary.Movie.2000.en.05115082.srt' from 'Scary Movie 1.[2000].UNRATED.DVDRIP.XVID.[Eng]-DUQA®.srt' (Windows-1252)
...
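the output names in the log above appear to follow the pattern `<video name without extension>.<language>.<8-digit zero-padded subtitle number>.<extension>`. the following is a minimal sketch of that naming pattern, inferred only from the sample output. the function `output_name` and its parameters are hypothetical and are not taken from get-subs.py. the extension is passed in explicitly, because the log shows that it is not always the extension of the source file (for example 'Scary_eng.txt' becomes a .sub file).

```python
from pathlib import Path


def output_name(video_path: str, lang: str, sub_id: int, ext: str) -> str:
    """Build an output name like the ones in the sample log.

    Hypothetical helper: the pattern is inferred from the log,
    not copied from get-subs.py.
    """
    # "Scary.Movie.2000.mp4" -> "Scary.Movie.2000"
    stem = Path(video_path).stem
    # pad the subtitle number with zeros to 8 digits
    return f"{stem}.{lang}.{sub_id:08d}.{ext}"


print(output_name("Scary.Movie.2000.mp4", "en", 18286, "sub"))
# prints: Scary.Movie.2000.en.00018286.sub
```

with fixed-width numbers, the files sort by subtitle number in a plain directory listing.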

subtitles server

to run your own subtitles server, expose get-subs.py as a CGI script on an HTTP server. see docs/lighttpd.conf for the lighttpd config
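for readers unfamiliar with CGI: the HTTP server starts the script once per request, passes the request in environment variables such as `QUERY_STRING`, and sends whatever the script writes to stdout (headers, a blank line, then the body) back to the client. the following is a generic sketch of that contract using only the standard library. it does not show the actual interface of get-subs.py: the `movie` query parameter and the function `build_response` are assumptions made for illustration.

```python
import os
from urllib.parse import parse_qs


def build_response(environ) -> str:
    """Return a complete CGI response (headers + body) as one string."""
    # the server passes the part after "?" in QUERY_STRING
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "movie" is a hypothetical parameter name, not the real API
    movie = query.get("movie", [""])[0]
    if not movie:
        return (
            "Status: 400 Bad Request\r\n"
            "Content-Type: text/plain; charset=utf-8\r\n"
            "\r\n"
            "missing parameter: movie\n"
        )
    # a real script would look up and return subtitles here
    return (
        "Status: 200 OK\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        "\r\n"
        f"would look up subtitles for {movie}\n"
    )


if __name__ == "__main__":
    # when run by the HTTP server, os.environ holds the request
    print(build_response(os.environ), end="")
```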

based on

offline version of opensubtitles

useful for subtitle-fetchers like

scraping

opensubtitles.org is protected by Cloudflare, so i'm using a scraping proxy (zenrows.com). with max_concurrency = 10 in fetch-subs.py, one request takes about 0.2 seconds.
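a concurrency limit like max_concurrency is commonly implemented with a semaphore: at most that many requests are in flight at once, and the rest wait their turn. the following is a minimal sketch of that technique with asyncio. it is not the code of fetch-subs.py: the functions `fetch_one` and `fetch_all` are hypothetical, and the HTTP request through the proxy is replaced by a short sleep so the example runs offline.

```python
import asyncio

max_concurrency = 10  # the value quoted above for fetch-subs.py


async def fetch_one(sub_id, semaphore, stats):
    # wait here until fewer than max_concurrency requests are running
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        # placeholder for the real HTTP request via the scraping proxy
        await asyncio.sleep(0.01)
        stats["active"] -= 1
        return sub_id


async def fetch_all(sub_ids):
    semaphore = asyncio.Semaphore(max_concurrency)
    stats = {"active": 0, "peak": 0}
    # gather returns results in the same order as the input
    results = await asyncio.gather(
        *(fetch_one(sub_id, semaphore, stats) for sub_id in sub_ids)
    )
    return results, stats["peak"]


results, peak = asyncio.run(fetch_all(range(1, 51)))
print(len(results), "requests done, peak concurrency:", peak)
```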

videos: