Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correction ms_to_frames/frames_to_ms/ms_to_str #57

Open
moi15moi opened this issue Nov 6, 2022 · 21 comments
Open

Correction ms_to_frames/frames_to_ms/ms_to_str #57

moi15moi opened this issue Nov 6, 2022 · 21 comments
Assignees

Comments

@moi15moi
Copy link
Contributor

moi15moi commented Nov 6, 2022

Currently ms_to_frames and frames_to_ms does not works correctly. They can be imprecise.

I did a pullrequest on the PyonFx repos and I think it could be a good idea to do something similar with pysubs2: CoffeeStraw/PyonFX#46

I recommand you to look at these two file of my PR. These are the only one that matters for pysubs2.

  • pyonfx/convert.py
  • pyonfx/timestamps.py

In brief, here is all the method I propose to change:

@tkarabela
Copy link
Owner

Hi @moi15moi, thanks for reaching out!

I see that Aegisub does proper rounding for the millisecond to centisecond conversion: https://github.com/Aegisub/Aegisub/blob/6f546951b4f004da16ce19ba638bf3eedefb9f31/libaegisub/include/libaegisub/ass/time.h#L32

While pysubs2 currently always rounds down: https://github.com/tkarabela/pysubs2/blob/master/pysubs2/substation.py#L147

I agree that we should round like Aegisub does when writing SSA/ASS timestapms, which is responsibility of SubstationFormat.ms_to_timestamp().

As for the general routines in time.py, I'm not sure they have anything to do with it? The library generally works with milliseconds, and SubStation is the only format that needs centiseconds, IIRC.

@moi15moi
Copy link
Contributor Author

moi15moi commented Nov 6, 2022

As for the general routines in time.py, I'm not sure they have anything to do with it? The library generally works with milliseconds, and SubStation is the only format that needs centiseconds, IIRC.

If you are talking about ms_to_frames/frames_to_ms, here is an example.

import bisect
from enum import Enum
from fractions import Fraction
import sys
from typing import Tuple, Union, List


def pysubs2_frames_to_ms(frames: int, fps: float) -> int:
    return int(round(frames * (1000 / fps)))


def pysubs2_ms_to_frames(ms: int, fps: float) -> int:
    return int(round((ms / 1000) * fps))


def create_timestamps_from_fps(
    fps: Union[int, float, Fraction], n_frames: int
) -> List[int]:

    timestamps = [round(frame * 1000 / fps) for frame in range(n_frames)]
    return timestamps


class TimeType(Enum):
    EXACT = "EXACT"
    START = "START"
    END = "END"


def ms_to_frames(
    timestamps: List[int], ms: int, time_type: TimeType = TimeType.EXACT
) -> int:

    if not isinstance(time_type, TimeType):
        raise TypeError(f'Unknown time_type "{time_type}" provided.')

    if time_type == TimeType.START:
        return ms_to_frames(timestamps, ms - 1) + 1
    elif time_type == TimeType.END:
        return ms_to_frames(timestamps, ms - 1)

    denominator, numerator, last = get_den_num_last(timestamps)
    if ms < 0:
        return int(int(ms * numerator / denominator - 999) / 1000)
    elif ms > timestamps[-1]:
        return int(
            int((ms * numerator - last + denominator - 1) / denominator) / 1000
        ) + (len(timestamps) - 1)

    return bisect.bisect_right(timestamps, ms) - 1


@staticmethod
def frames_to_ms(
    timestamps: List[int], frame: int, time_type: TimeType = TimeType.EXACT
) -> int:

    if not isinstance(time_type, TimeType):
        raise TypeError(f'Unknown time_type "{time_type}" provided.')

    if time_type == TimeType.START:
        prev_frame = frames_to_ms(timestamps, frame - 1)
        curr_frame = frames_to_ms(timestamps, frame)
        return prev_frame + int((curr_frame - prev_frame + 1) / 2)
    elif time_type == TimeType.END:
        curr_frame = frames_to_ms(timestamps, frame)
        next_frame = frames_to_ms(timestamps, frame + 1)
        return curr_frame + int((next_frame - curr_frame + 1) / 2)

    denominator, numerator, last = get_den_num_last(timestamps)
    if frame < 0:
        return int(frame * denominator * 1000 / numerator)
    elif frame > (len(timestamps) - 1):
        frames_past_end = frame - len(timestamps) + 1
        return int(
            (frames_past_end * 1000 * denominator + last + int(numerator / 2))
            / numerator
        )

    return timestamps[frame]


def get_den_num_last(timestamps: List[int]) -> Tuple[int, int, int]:

    denominator = 1000000000
    numerator = int((len(timestamps) - 1) * denominator * 1000 / timestamps[-1])
    last = (len(timestamps) - 1) * denominator * 1000

    return denominator, numerator, last


def main():
    fps = Fraction(24000, 1001)
    timestamps = create_timestamps_from_fps(fps, 1000)

    print(f"TimeType.START: {frames_to_ms(timestamps, 500, TimeType.START)}")
    print(f"TimeType.EXACT: {frames_to_ms(timestamps, 500, TimeType.EXACT)}")
    print(f"TimeType.END: {frames_to_ms(timestamps, 500, TimeType.END)}")
    
    print(f"pysubs2_frames_to_ms: {pysubs2_frames_to_ms(500, fps)}")


if __name__ == "__main__":
    sys.exit(main())

It print

TimeType.START: 20833
TimeType.EXACT: 20854
TimeType.END: 20875
pysubs2_frames_to_ms: 20854

Like you can see, pysubs2_frames_to_ms return the same value as TimeType.EXACT.

But, there is 3 type of timing. Start, Exact, End.

The exact is the timestamps of the frame.
The start is the timestamps of the needed frame and the previous frame divided by two.
The end is the timestamps of the needed frame and the next frame divided by two.

Example:
You want to have a subtitle that only appear at the frame 500.

Like any line, you need to set an start time and a end time.

If you set the start_time and a end_time with the TimeType.EXACT, the subs won't appear since it would have a duration of 0ms.
This is why you need to use the TimeType.START and the TimeType.END.

Edit:
I just did some test for srt file with mpv.
Srt does not work at all like .ass subs.
This means that ms_to_frames/frames_to_ms between ass format and srt format are totally different.

Ass

  • Start: Exact ts of previous frame ceil to have centiseconds (warning: if the previous frame, exact time is 0:00:21.980, we need to ceil it to 0:00:21.99)
  • End: Exact ts of next frame floor to have centiseconds

Srt

  • Start = Exact ts of current frame
  • End = Exact ts of next frame

From what I understand, here is how it works with srt

def ms_to_frames_for_srt(
    timestamps: List[int], ms: int, time_type: TimeType = TimeType.EXACT
) -> int:

    if not isinstance(time_type, TimeType):
        raise TypeError(f'Unknown time_type "{time_type}" provided.')

    if ms < 0:
        return ValueError("You cannot specify an time under 0")

    if time_type in (TimeType.START, TimeType.EXACT):
        return bisect.bisect_right(timestamps, ms) - 1
    elif time_type == TimeType.END:
        return bisect.bisect_right(timestamps, ms - 1) - 1


def frames_to_ms_for_srt(
    timestamps: List[int], frame: int, time_type: TimeType = TimeType.EXACT
) -> int:

    if not isinstance(time_type, TimeType):
        raise TypeError(f'Unknown time_type "{time_type}" provided.')

    if frame < 0:
        return ValueError("You cannot specify a frame under 0")
    elif frame > (len(timestamps) - 1):
        return ValueError("You cannot specify an image above what the video contains")


    if time_type == TimeType.START:
        return timestamps[frame]
    elif time_type == TimeType.END:
        return timestamps[frame + 1]

    return timestamps[frame]

@tkarabela
Copy link
Owner

I think I understand better what you mean now!

To be honest, I've only ever used the library for batch shifting, format conversion, copying styles around, etc. - never for frame-perfect typesetting or karaoke where exact interpretation of timestamps would be relevant. On the other hand, this seems to be the main focus of the https://github.com/CoffeeStraw/PyonFX library you mentioned.

In my experience, subtitle formats (except perhaps OpenVTT and similar "official" stuff) are very poorly defined, so it ends up being dependent on the exact player/rendering library used for playback. You noted there is difference between ASS and SRT in mpv.

While I can see the use for a thing like ms_to_frames_for_srt(..., time_type=TimeType.START), I'm not really sure where it fits in the current API, and I'm certainly not the person to write documentation and tutorial on how to use it... So I'm not sure where to go with this :)

@moi15moi
Copy link
Contributor Author

So I'm not sure where to go with this :)
When we use pysubs2 to convert any format to an another format, in my opinion, it should be frame perfect.

Yes, there are a lot of subtitle format where the spec aren't well defined, but if pysubs2 respect what mpv does, i think it is better than nothing.

tkarabela added a commit that referenced this issue Nov 27, 2022
@moi15moi
Copy link
Contributor Author

moi15moi commented Dec 6, 2022

@tkarabela
I thought about it and found a way to achieve accurate time conversion of subtitles regardless of format.
I think it would even be possible to implement it in pysubs2.

Here is what I have done: https://gist.github.com/moi15moi/9b68589e6df0a76efbd01693c6c6f2ff

In my experience, subtitle formats (except perhaps OpenVTT and similar "official" stuff) are very poorly defined, so it ends up being dependent on the exact player/rendering library used for playback. You noted there is difference between ASS and SRT in mpv.

There was something I had misunderstood. In reality, any format use the same logic.

@moi15moi
Copy link
Contributor Author

@tkarabela What do you think about my previous message?

@tkarabela
Copy link
Owner

Hi @moi15moi , thank you for the code :) I'm somewhat busy at the moment, I'll probably get to it over the weekend.

@tkarabela
Copy link
Owner

@moi15moi I've looked at the code and it looks good to me (thanks for including the docstrings and tests!). As the next step I suggest that I create a new branch in this repo, put your code in there and integrate it with the rest of the library, and then we can make a review and decide if it's ready for release or more work is needed.

If all goes well I think it could be released before end of the year :)

@moi15moi
Copy link
Contributor Author

moi15moi commented Jan 4, 2023

I don't really know where to add it properly.
I thought about time.py, but since the new class I proposed contain frames_to_ms and ms_to_frames method, we need to change all the code where they are used.

I think think they are only used here:

Problem/Question

  1. In the make_time method, I need the subtitle format and the TimeType to call frames_to_ms.
    If I add those parameter, it break a lot of function.
    See this commit to see why it break: moi15moi@01e202e

  2. Should we replace the parameter fps by an Timestamps?

@tkarabela
Copy link
Owner

I was thinking about something like this:

import pysubs2
import os.path


def snap_to_frames(subs: pysubs2.SSAFile, format_: str, ts: pysubs2.Timestamps):
    for e in subs:
        old_start_ms = e.start
        old_end_ms = e.end
        start_frames = ts.ms_to_frames(old_start_ms, format_, pysubs2.TimeType.START)  # not sure if we want START or END in these
        end_frames = ts.ms_to_frames(old_end_ms, format_, pysubs2.TimeType.END)
        new_start_ms = ts.frames_to_ms(start_frames, pysubs2.TimeType.START)
        new_end_ms = ts.frames_to_ms(end_frames, pysubs2.TimeType.END)
        e.start = new_start_ms
        e.end = new_end_ms


# this code would be incorporated into pysubs2.SSAFile.save()
def new_save(subs: pysubs2.SSAFile, path: str, ts: Optional[pysubs2.Timestapms]):
    subs = subs.copy()

    # when saving, we already know the target format
    ext = os.path.splitext(path)[1].lower()
    format_ = pysubs2.formats.get_format_identifier(ext)

    # new part: snap subtitles to frames according to timestamps
    if ts:
        snap_to_frames(subs, format_, ts)
        
    # save subtitles as usual
    subs.save(path)


ts = pysubs2.Timestamps.from_video_file("video.mkv")
subs = pysubs2.load("subs.srt")

new_save(subs, "subs.ass", timestamps=ts)

This would make sure that timestamps are properly "snapped" to timestamps in the output file, while requiring little modification to other code. Would that be useful?

I see that going further and allowing something like:

ts = pysubs2.Timestamps.from_video_file("video.mkv")
subs = pysubs2.load("subs.srt")
subs.shift(frames=5, timestamps=ts)  # <---

...is problematic if the frames/ms conversion needs to know output format, since we don't know that in pysubs2. The SSAFile class is based on the assumption that the subtitles are something like .ass with millisecond timestamps. While what your Timestamp class would ideally require is a subtitle class that uses frames instead of milliseconds, and is fixed to a given output format.

In the command line interface (python -m pysubs2 ...) we can do better since that requires specifying output format.

On a more general note, what are the use-cases for what Timestamps class provides?

  • karaoke / effects where one wants to work with frame precision?
  • converting eg. SRT -> ASS for VFR video?
  • time shifting subtitles for VFR video?

I'm thinking about how to best support it, given that the library wasn't designed for it: for the karaoke/fx use-case which is going to involve a lot of custom scripting anyway, I think even providing the Timestamps class as-is would be useful. For more casual use like converting subtitle formats, it could be put in the CLI and perhaps as a special method like snap_to_frames() above, with mention in the tutorial on how to use it.

P.S. Sorry for the late reply, I didn't get around to do it during the holidays.

@moi15moi
Copy link
Contributor Author

This would make sure that timestamps are properly "snapped" to timestamps in the output file, while requiring little modification to other code. Would that be useful?

Not really, the timing does not need to be "snapped" to the frame when saving it, so I don't see the utility to snap_to_frames.

While what your Timestamp class would ideally require is a subtitle class that uses frames instead of milliseconds, and is fixed to a given output format.

The timestamps class has no use if the subtitle timing are already in frame!

On a more general note, what are the use-cases for what Timestamps class provides?

  • karaoke / effects where one wants to work with frame precision?

Exactly, but it isn't necessarily only usefull for karaoke/effect.
With keyframe and this new "module", we can verify if our timing isn't too bad by checking that our timing does not overlap the keyframe.
We may need to provide a method to get the keyframe, but this is an another subject.

  • converting eg. SRT -> ASS for VFR video?

This method allow to have a perfect timing when converting an .srt to an .ass file.
Currently, there may be a 1 frame difference between the original file (srt) and the converted one (.ass) on any kind of video (CFR and VFR).
But, when converting an .ass to an .srt, there isn't some kind of "error".
It can only happen when converting an subtitle format that has an higher timing precision (ex: milliseconds) to an subtitle format that has an lower timing precision (ex: centiseconds)

  • time shifting subtitles for VFR video?

Not necessarily a VFR video. It is also usefull with CFR video since the current algorithm does not always work (if you need example, ask it and I will provide you some).

There may be some other use case, but these are clearly the most important one.

@tkarabela
Copy link
Owner

Thanks, I see. If you have any examples, that would be helpful :) Feel free to link them here or send me an e-mail to tkarabela at seznam dot cz.

The conversion from SRT -> ASS should be improved as of version 1.6.0 of the library after this commit: 2725c0a

@moi15moi
Copy link
Contributor Author

moi15moi commented Jan 14, 2023

The conversion from SRT -> ASS should be improved as of version 1.6.0 of the library after this commit: 2725c0a

This isn't really a magic method that make the conversion working.

Before explaining why it does not, here is an reminder of possible value an time can have for an particular frame:

  • Start time: ]Previous image, Current image]
  • End time: ]Current image, Next image]

Let's suppose we have those timestamps (these are the timestamps for an 24000/1001 fps video with 6 images).

Image 0: 0 ms
Image 1: 42 ms
Image 2: 83 ms
Image 3: 125 ms
Image 4: 167 ms
Image 5: 209 ms

Possible start time for the image 3: ]83, 125] = [84, 125]

  • If we have 84 ms and we floor it, it give 80 ms which is the image 2, so we can't use it.
  • If we have 84 ms and we round it, it give 80 ms which is the image 2, so we can't use it.
  • If we have 84 ms and we ceil it, it give 90 ms we can use it because it is in the range we found.
    But, if we have 125 ms and we ceil it, it give 130 ms which is the image 4, so we can't use it.

So, this is a proof that we can't only use a rounding method to convert srt to ass. This is why we need to have the timestamps information to know how to properly convert the milliseconds into centiseconds while conversing the exact frame timing.

Why is the current shifting method with frame does not work well?

I think the best is to give you an example: Example bad shift frame.zip
The input file is the srt file and the output is the ass file.
Like you can see, the subtitle start at the frame 2.
If I shift it by one frame, it appear on the frame 4 (but it shoud have been the frame 3), so there is a problem

Even if you floor or ceil the result, there will always be edge case that will cause problem.

Example why floor does not work.
Start time is 84 ms which correspond to frame 3.
The method shift return me 126 ms, but if I floor it, it give me 120 ms which is also frame 3.

Example why rounding does not work.
Start time is 83 ms which correspond to frame 2.
The method shift return me 125 ms, but if I round it, it give me 130 ms which is also frame 4.

Example why ceil does not work.
Start time is 83 ms which correspond to frame 2.
The method shift return me 125 ms, but if I ceil it, it give me 130 ms which is also frame 4.

So, how to make the shifting works?
With what I proposed, you need to call ms_to_frame, then shifting the frame and finally call the method frame_to_ms.

Why frames_to_ms need to know the output format?

I just realised we don't need it since I take an average of the 2 extreme value (for start time average of Previous image and Current image, for end time average of Current image and Next image) and you round the result when saving the .ass file.
This make the rounding working with any kind of subtitle format and it preserve the frame accuracy.
I know this may seem to contradict what I said earlier when I said So, this is a proof that we can't only use a rounding method to convert srt to ass, but since in this paticular case we know we will always have the "middle" frame value, rounding is a solution that always works

So, I removed the parameter "format" in the method frames_to_ms in my last commit.
In brief, this mean rounding need to be implemented into all format that cannot represent milliseconds.

Remaining problem

I would need to have the TimeType in the method make_time
https://github.com/moi15moi/pysubs2/blob/f9ea3c5863e795d37ce6e93f9fc2db9e8dab94b8/pysubs2/time.py#L44

@tkarabela
Copy link
Owner

Remaining problem

I would need to have the TimeType in the method make_time https://github.com/moi15moi/pysubs2/blob/f9ea3c5863e795d37ce6e93f9fc2db9e8dab94b8/pysubs2/time.py#L44

I agree, we need an additional parameter with default value TimeType.START (so that output of make_time(frame=1234, fps=23.976) stays the same as in the current version).

Let's suppose we have those timestamps (these are the timestamps for an 24000/1001 fps video with 6 images).

Image 0: 0 ms
Image 1: 42 ms
Image 2: 83 ms
Image 3: 125 ms
Image 4: 167 ms
Image 5: 209 ms

Possible start time for the image 3: ]83, 125] = [84, 125]

* If we have 84 ms and we **floor** it, it give 80 ms which is the image 2, so we can't use it.

* If we have 84 ms and we **round** it, it give 80 ms which is the image 2, so we can't use it.

* If we have 84 ms and we **ceil** it, it give 90 ms we can use it because it is in the range we found.
  But, if we have 125 ms and we ceil it, it give 130 ms which is the image 4, so we can't use it.

So, this is a proof that we can't only use a rounding method to convert srt to ass. This is why we need to have the timestamps information to know how to properly convert the milliseconds into centiseconds while conversing the exact frame timing.

I think I finally get it ^^; I've made a test .mp4 file with frame information both in the video and in SRT subtitles, which match perfectly. Just converting this SRT file to ASS using the current version of the library messes up the timing and many subtitles are 1 frame off, as you explain.

This undesirable behaviour should be fixed by the snap_to_frames() function I suggested above.

The test data is here: pysubs2_timestamps_test.zip

So, how to make the shifting works?
With what I proposed, you need to call ms_to_frame, then shifting the frame and finally call the method frame_to_ms.

Let's have a look what may be happening:

  1. CFR video, subs.shift(frames=...)
  • you suggest: call ms_to_frame, add frames, frame_to_ms
  • I suggest: just add duration in milliseconds, as we do now (to shift by 1234 frames, do e.start += make_time(frame=1234, fps=23.976), etc.) and handle the frame-aware rounding when saving, using snap_to_frames()
  1. CFR video, subs.shift(s=...)
  • I think this makes sense as-is, ie. just add whatever the user specifies to the stored milliseconds
  • in reality, one can only achieve shift in discrete steps (+1 frame, etc.), so that shift by ms=100 is really a shift by ms=83 or ms=125 (to use your example)
  • I would like to handle this when saving and not sooner, if possible
  1. VFR video, subs.shift(frames=...)
  • IMHO this doesn't make sense and ideally should raise an exception
  1. VFR video, subs.shift(s=...)
  • I think this makes sense as-is, ie. just add whatever the user specifies to the stored milliseconds

As I understand it, doing the ms_to_frame -> frame_to_ms round-trip (which is what my snap_to_frames does) only needs to happen when we want to know the timestamps in reduced precision, ie. when saving the file to something like ASS.

I just realised we don't need it since I take an average of the 2 extreme value (for start time average of Previous image and Current image, for end time average of Current image and Next image) and you round the result when saving the .ass file.

Yes, this is a good trick. One consequence is that this kind of "rounding" will in practice subtly shift most timestamps (eg. a subtitle that used to start at 0:00:00.00 will now be something like 0:00:00.02), which may be somewhat confusing. The previous format-aware method would effectively just round the last digit up or down for timestamps where it is relevant.

@moi15moi
Copy link
Contributor Author

The test data is here: pysubs2_timestamps_test.zip

The file test_video.mp4 is a bit strange since it doesn't round the timestamps.
Ex: for the image 1, it should be 42 ms, not 41 ms.
You can try it by making the srt file start at 42 ms and you will see that it is still displayed on the image 1.

But, if it helped you understand the current issues, that's great.

  1. CFR video, subs.shift(frames=...)
  • I suggest: just add duration in milliseconds, as we do now (to shift by 1234 frames, do e.start += make_time(frame=1234, fps=23.976), etc.) and handle the frame-aware rounding when saving, using snap_to_frames()

This will cause the same problem has the current shifting since the e.start/end can be anything.
Even if e.start/end is snapped to the frame, it does not always works.

Example where the fps = 93, the end_time correspond to the frame 37 and we want to shift the end_time by 1 frame.

from pysubs2 import Timestamps, TimeType, make_time

def round_timing(ass_ms):
    """
    From https://github.com/Aegisub/Aegisub/blob/6f546951b4f004da16ce19ba638bf3eedefb9f31/libaegisub/include/libaegisub/ass/time.h#L32
    """
    return (ass_ms + 5) - (ass_ms + 5) % 10

fps = 93
timestamps = Timestamps.from_fps(fps)

end_frame = 37
# end_ms is snapped to the frame
end_ms = timestamps.frames_to_ms(end_frame, TimeType.END)

# I do like you said
end_ms += make_time(frames=1, fps=fps)

# Since .ass does not handle milliseconds, I "round" it
end_ms = round_timing(end_ms)
print(f"Expected result: {end_frame + 1}")
print(f"Result: {timestamps.ms_to_frames(end_ms, TimeType.END)}")

Output (Like you can see, it doesn't work)

Expected result: 38
Result: 39

In my opinion, this is why the method ms_to_frame -> frame_to_ms -> ms_to_frame is better.

2. CFR video, subs.shift(s=...)

  • I think this makes sense as-is, ie. just add whatever the user specifies to the stored milliseconds

I am 100% agree. We should not "invoke" anything relied with Timestamps.

3. VFR video, subs.shift(frames=...)

  • IMHO this doesn't make sense and ideally should raise an exception

I don't see why it should raise en exception.
Sometimes, I shift subtitles that are timed with a VFR video in frame and I am clearly not the only one.

4. VFR video, subs.shift(s=...)

  • I think this makes sense as-is, ie. just add whatever the user specifies to the stored milliseconds

I am 100% agree. We should not "invoke" anything relied with Timestamps.

As I understand it, doing the ms_to_frame -> frame_to_ms round-trip (which is what my snap_to_frames does) only needs to happen when we want to know the timestamps in reduced precision, ie. when saving the file to something like ASS.

If you don't consider the VFR video and a possibility an user shifted the subtitle by few ms, then I think you are right, but I am not 100$ sure.

@tkarabela
Copy link
Owner

This will cause the same problem has the current shifting since the e.start/end can be anything. Even if e.start/end is snapped to the frame, it does not always works.

Example where the fps = 93, the end_time correspond to the frame 37 and we want to shift the end_time by 1 frame.

from pysubs2 import Timestamps, TimeType, make_time

def round_timing(ass_ms):
    """
    From https://github.com/Aegisub/Aegisub/blob/6f546951b4f004da16ce19ba638bf3eedefb9f31/libaegisub/include/libaegisub/ass/time.h#L32
    """
    return (ass_ms + 5) - (ass_ms + 5) % 10

Sorry, by "snapping to frame" I didn't mean the simple rounding as you do in round_timing (that's already implemented, and I see it why it doesn't really help), I meant my earlier:

def snap_to_frames(subs: pysubs2.SSAFile, format_: str, ts: pysubs2.Timestamps):
    for e in subs:
        old_start_ms = e.start
        old_end_ms = e.end
        start_frames = ts.ms_to_frames(old_start_ms, format_, pysubs2.TimeType.START)  # not sure if we want START or END in these
        end_frames = ts.ms_to_frames(old_end_ms, format_, pysubs2.TimeType.END)
        new_start_ms = ts.frames_to_ms(start_frames, pysubs2.TimeType.START)
        new_end_ms = ts.frames_to_ms(end_frames, pysubs2.TimeType.END)
        e.start = new_start_ms
        e.end = new_end_ms

Curiously, in your example:

print(f"Expected result: {end_frame + 1}")
print(f"Result: {timestamps.ms_to_frames(end_ms, TimeType.END)}")

Expected result: 38
Result: 39

I actually do get both results as 38, this is with moi15moi@aab9981 and this change:

# TODO Correct this line
return timestamps.frames_to_ms(frames, TimeType.START)
  1. VFR video, subs.shift(frames=...)
  • IMHO this doesn't make sense and ideally should raise an exception

I don't see why it should raise en exception. Sometimes, I shift subtitles that are timed with a VFR video in frame and I am clearly not the only one.

I meant that it's a tricky operation: let's say your video has 30fps and 60fps segments, then by shifting subtitles by 30 frames, you shift by 1 second in the 30fps parts, but only by 1/2 second in the 60fps parts. subs.shift() affects all subtitles in the file. If the subtitles you're shifting are all in a particular fps region, then I can see it being useful, since that's basically the CFR case.

As I understand it, doing the ms_to_frame -> frame_to_ms round-trip (which is what my snap_to_frames does) only needs to happen when we want to know the timestamps in reduced precision, ie. when saving the file to something like ASS.

If you don't consider the VFR video and a possibility an user shifted the subtitle by few ms, then I think you are right, but I am not 100$ sure.

Yeah, I'm also not sure, that's mainly why I actually generated a video with matching subtitles, to demonstrate the issues and see if the code changes resolve them.

I did another one for your 93fps example (this time I rounded the ms, as you suggest) and... it just doesn't seem to match, at least in ffmpeg and VLC. The SRT and ASS subtitles are both 2-3 frames ahead of the video, even though the SRT timestamps should be perfect (rounding the ms or not doesn't fix this). pysubs2_timestamps_test2.zip

@moi15moi
Copy link
Contributor Author

moi15moi commented Jan 15, 2023

Sorry, by "snapping to frame" I didn't mean the simple rounding as you do in round_timing (that's already implemented, and I see it why it doesn't really help)

I understood that. I don't know what make you think I misunderstood it.
Just to be sure we understand each other, my previous example is snapped to the frame since I got the result via frames_to_ms and I "round" it because .ass does support milliseconds.

I actually do get both results as 38, this is with moi15moi@aab9981 and this change:

This is simply a coincidence. It make no sense to do that.
Try with the fps 25 and the end_frame 1608.
You will get those result.

Expected result: 1609
Result: 1608

I did another one for your 93fps example (this time I rounded the ms, as you suggest) and... it just doesn't seem to match, at least in ffmpeg and VLC. The SRT and ASS subtitles are both 2-3 frames ahead of the video, even though the SRT timestamps should be perfect (rounding the ms or not doesn't fix this). pysubs2_timestamps_test2.zip

The video is broken. There are a lot of frame that are duplicated.
Try to export it in mkv. From what I see, there is no problem with mkv. I don't know why, but ffmpeg seems to have an issue ^^'
Try this "corrected" version pysubs2_timestamps_test.zip

The frame 0 has 4 frame which make no sense:
image
image
image
image

@tkarabela
Copy link
Owner

Ah, you're right, ffmpeg messes up the .mp4 output somehow. With .mkv it's good.

I've tried with the first 23.976 video example:

import subprocess
import pysubs2

subs = pysubs2.load("test_video.srt")
ts = pysubs2.timestamps.Timestamps.from_video_file("test_video.mp4")

def snap_to_frames(subs: pysubs2.SSAFile, ts: pysubs2.Timestamps):
    for e in subs:
        old_start_ms = e.start
        old_end_ms = e.end
        start_frames = ts.ms_to_frames(old_start_ms, pysubs2.TimeType.START)  # not sure if we want START or END in these
        end_frames = ts.ms_to_frames(old_end_ms, pysubs2.TimeType.END)
        new_start_ms = ts.frames_to_ms(start_frames, pysubs2.TimeType.START)
        new_end_ms = ts.frames_to_ms(end_frames, pysubs2.TimeType.END)
        e.start = new_start_ms
        e.end = new_end_ms

subs.shift(frames=99, fps=23.976)
snap_to_frames(subs, ts)

subs.save("test_video-shift.ass")
subprocess.check_call(["ffmpeg", "-y", "-i", "test_video.mp4", "-vf", "subtitles=test_video-shift.ass", "test_video_with_ass-shift.mp4"])

With no shift, snap_to_frames() takes care of the difference between SRT and ASS, both show correctly on playback. When applying shift, it also seems to work well, but you suggest that's because the input subtitles are already snapped to frames, while this doesn't have to be the case necessarily.

Can you implement what you think is needed inside SSAFile.shift() in your branch?

@moi15moi
Copy link
Contributor Author

I've tried with the first 23.976 video example:

See the message where I presented the example where the fps = 93, the end_time correspond to the frame 37 and we want to shift the end_time by 1 frame.

I "proved" by showing an counterexample that even if you snap the subs to the frame, it does not always works.
If you need any more detail, ask it. I know it is an complicated subject

Can you implement what you think is needed inside SSAFile.shift() in your branch?

Sure, see my last commit

@moi15moi
Copy link
Contributor Author

Is there any problem?

@tkarabela
Copy link
Owner

Hi @moi15moi , I'm ill at the moment, I will continue when I'm able to. Your code looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants