-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#4306 - Extract tags from SubRip subtitles, add support for alignment #4582
Conversation
…gnment tags based on SSA v4+
@@ -154,6 +155,27 @@ public void testDecodeNoEndTimecodes() throws IOException { | |||
.isEqualTo("Or to the end of the media."); | |||
} | |||
|
|||
@Test | |||
public void testDecodeCueWithTag() throws IOException{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test doesn't appear to be asserting anything other than that the tags were removed (or not) from the text. Shouldn't the test also be asserting that they were applied successfully (e.g. that the subtitles in the cues have the right alignment parameters)
assertThat(subtitle.getCues(subtitle.getEventTime(4)).get(0).text.toString()) | ||
.isEqualTo("This is the third subtitle."); | ||
|
||
// Based on the SSA v4+ specs the curly bracket must be followed by a backslash, so this is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you link to this doc (just in this thread, not in the source). The doc I found has a bunch of examples where this is always true, but it's also always true that there's no whitespace inside the brackets either. Hence it seems strange to forbid { \\an2}
but not {\ an2}
. Unless you have counter examples, I think we should just require no whitespace in the bracketed region. Which I think you can do by using [^\s]
instead of .
in the Pattern
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the docs that can be downloaded from here:
moodub.free.fr/video/ass-specs.doc
And it says: "All override codes are always preceded by a backslash "
So that's why I didn't allow a space after the bracket. Regarding spaces after the backslash I just followed how the VLC handles these cases.
We sure can require no whitespace in the bracketed region, is up to you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guess we can allow space, if there's an existing example of doing this (as there seems to be)!
Spanned text = Html.fromHtml(textBuilder.toString()); | ||
cues.add(new Cue(text)); | ||
// Extract tags | ||
SubtitleTagResult tagResult = extractTags(textBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given you wouldn't expect a tag to be separated over multiple lines, it seems cleaner to do this in the while loop above. You could replace the last line in the while loop with something like:
textBuilder.append(processLine(currentLine, tags));
Where tags
is an Arraylist
to which any new tags are added, and processLine
is responsible for trimming the line and extracting tags. Note passing the list in as an argument also avoids having to define SubtitleTagResult
.
You'd then make the tags
list before the start of the while loop to pass in (you could even re-use the same single instance if you want, as is done for the StringBuilder
, as long as you remember to clear it.
Cue cue = null; | ||
|
||
// Check if tags are present | ||
if (tagResult.tags.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - You can probably remove this. The loop will be a no-op if tags has length zero.
// Check if tags are present | ||
if (tagResult.tags.length > 0) { | ||
|
||
boolean alignTagFound = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you just break;
from the loop when you find the first align tag, and avoid the need for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I could, but I intended to wrote like this to be generic and to be able to extend the loop with other tag's "apply" logic. Note that this for loop isn't only for align tags (for now it is), instead goes over all the tags and applies them, I suppose it will have support for more tags in the future and it that case breaking after the first align tag would be incorrect behavior.
Taking these into consideration, if you still would like to just "break", then I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our preference (for small blocks of code; this doesn't really apply to larger scale design choices for obvious reasons) is to write it in the simplest way for current usage, and change it as and when the usage changes. So I think using break is still preferable for now.
if (alignTagFound) continue; | ||
alignTagFound = true; | ||
|
||
AlignmentResult alignmentResult = getAlignmentValues(tag); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you change getAlignmentValues(tag)
to be buildCue(Spanned text, String alignmentTag)
then you can get rid of AlignmentResult
.
@ojw28 waiting for your opinion on the |
Replied to the |
@szaboa - I think there's still an open comment on |
Yes, there is that and the break thing. I am on vacation right now, I will resolve those on the next Monday (first of October) when I am back, sorry for the delays! |
@ojw28 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks. I put a few final comments inline.
break; | ||
} | ||
|
||
return new Cue(text, Layout.Alignment.ALIGN_NORMAL, line, Cue.LINE_TYPE_FRACTION, lineAnchor, position, positionAnchor, Cue.DIMEN_UNSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use null for text alignment? Cue
implies that's what you should use unless you know better.
line = DEFAULT_END_FRACTION; | ||
position = DEFAULT_START_FRACTION; | ||
positionAnchor = Cue.ANCHOR_TYPE_START; | ||
lineAnchor = Cue.ANCHOR_TYPE_END; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to order the assignments as line
,lineAnchor
,position
,positionAnchor
so the line and position ones are grouped. Ignorable, but it might be even clearer if split into two switch blocks with three cases each, like:
float position;
@Cue.AnchorType int positionAnchor;
switch (alignmentTag) {
case ALIGN_BOTTOM_LEFT:
case ALIGN_MID_LEFT:
case ALIGN_TOP_LEFT:
position = DEFAULT_START_FRACTION;
positionAnchor = Cue.ANCHOR_TYPE_START;
case ALIGN_BOTTOM_MID:
case ALIGN_MID_MID:
case ALIGN_TOP_MID:
...
case ALIGN_BOTTOM_RIGHT:
case ALIGN_MID_RIGHT:
case ALIGN_TOP_RIGHT:
...
}
... (ditto for line and lineAnchor) ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, much nicer!
*/ | ||
private Cue buildCue(Spanned text, String alignmentTag) { | ||
// Default values used for positioning the subtitle in case of align tags | ||
float line = DEFAULT_END_FRACTION, position = DEFAULT_MID_FRACTION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to assign initial values to any of these (it's better not to, since that'll make the compiler ensure they're initialized along all code paths below). Add default:
to the last case in the switch statement if needed to make the compiler happy.
…signments, setting Cue's textAlignment to def value - null
Pushed the changes according to the comments. |
Thanks! We made some further improvements during internal review, which were merged in 7849a5e. Let us know if we accidentally broken anything :). |
It works fine and looks much better now, actually I am feeling that I did poor job after seeing your changes :) Anyway, based on this I will try to pick up a better coding style in my next pull request. |
Not at all; this feature wouldn't have happened without you ;). Thanks for the contribution! |
#4306 - Extract tags from SubRip subtitles, add support for alignment
Extracted valid tags from srt subtitles (so those won't be rendered anymore) and added support for different alignments tags (based on SSA v4+ specification) such as {\an2} for example.