Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for SubRip #232

Merged
merged 10 commits into from
Jan 17, 2022
Merged

Add support for SubRip #232

merged 10 commits into from
Jan 17, 2022

Conversation

joksas
Copy link
Contributor

@joksas joksas commented Jan 6, 2022

The syntax is described here.

MIME types text/x-subrip and text/x-srt were suggested here, while the type application/srt is motivated by this.

@coveralls
Copy link

coveralls commented Jan 6, 2022

Coverage Status

Coverage remained the same at 96.407% when pulling 1de8108 on joksas:srt into db2464c on gabriel-vasile:master.

@joksas
Copy link
Contributor Author

joksas commented Jan 6, 2022

Huh, that's strange -- it works on Linux, but not on Windows. I'll take a look at that.

In Windows, a newline is represented as `\r\n`.
@joksas
Copy link
Contributor Author

joksas commented Jan 6, 2022

Alright, it works now.

Copy link
Owner

@gabriel-vasile gabriel-vasile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
I added a benchmark for .srt detection with regex vs. lines parsing.
On my machine parsing lines is almost 10 times faster and I think this is enough to justify not using regex for this.

cd internal/magic
go test -bench=.

goos: linux
goarch: amd64
pkg: github.com/gabriel-vasile/mimetype/internal/magic
cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
BenchmarkRegexSrt-8   	  160696	      7496 ns/op
BenchmarkParseSrt-8   	 1370484	       824.4 ns/op
PASS
ok  	github.com/gabriel-vasile/mimetype/internal/magic	4.368s

I didn't test it on windows, but it should work too. I'm not sure there are no other bugs. Tell me if you see anything wrong with it.

@joksas
Copy link
Contributor Author

joksas commented Jan 11, 2022

Hi, thank you a lot for this!

Yes, I agree that it makes sense to use parsing. I only made the checks on timestamps a bit more restrictive.

Here are the benchmarks before I made the changes (i.e. your version):

cpu: AMD Ryzen 5 2600 Six-Core Processor
BenchmarkRegexSrt-12           46358         25832 ns/op
BenchmarkParseSrt-12          397768          2912 ns/op

and after my changes:

cpu: AMD Ryzen 5 2600 Six-Core Processor
BenchmarkRegexSrt-12           45734         25546 ns/op
BenchmarkParseSrt-12          394863          3043 ns/op

So the increase in time is only marginal (if statistically significant at all).

Anyway, sorry for merging too early, I just didn't expect that the function would work with Go 1.17.6 (my local version) and fail with Go 1.16.12 (GitHub Actions version)... I did confirm that this is the case by installing the latter version locally. I'll try to find out what's happening.

@joksas
Copy link
Contributor Author

joksas commented Jan 11, 2022

Alright, it seems that comma as a decimal separator for fractional seconds was only introduced in this commit. I'll think about the best way around this.

@joksas
Copy link
Contributor Author

joksas commented Jan 12, 2022

It now works with Go versions less than 1.17 as well.

As a side note, would it make sense to standardise the package's minimum Go version defined in go.mod and the version used by GitHub actions? The former is 1.12 and the latter is 1.16. I don't think it matters that much what specific version is used, only that it is consistent throughout the project. Tests passing in 1.16 does not necessarily imply that they will pass in 1.12.

Copy link
Owner

@gabriel-vasile gabriel-vasile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
Code looks good, nice catch finding the comma issue.
I left one last suggestion.

internal/magic/text.go Show resolved Hide resolved
Co-authored-by: Gabriel Vasile <gabriel.vasile@email.com>
@joksas
Copy link
Contributor Author

joksas commented Jan 16, 2022

The suggestion makes sense, I've committed it -- thank you.

Copy link
Owner

@gabriel-vasile gabriel-vasile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last last problem. I thought about it, forgot to write it down and only noticed it now, at the last review.

application/srt is not a valid MIME type because it's in the standards tree but it is not registered at IANA.
buzzsprout.com cannot just decide to use application/srt by themselves. They need to follow the registration process with IANA or use one of the vendor tree or the unregistered tree.

The other MIME types (text/x-subrip and text/x-srt) are ok. They are experimental (because of the x- part) and don't need registration with IANA to be used, but application/srt shouldn't be here.

@joksas
Copy link
Contributor Author

joksas commented Jan 17, 2022

Got it. I've removed application/srt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants