-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reolink Bad SPS #102
Comments
the camera is a Reolink Video Doorbell WiFi |
This camera sends an "out-of-band" sequence parameter set (in the DESCRIBE response's SDP): Then the camera sends another SPS "in-band" (as part of the RTP stream): --- out-of-band sps
+++ in-band sps without last byte
@@ -1,38 +1,57 @@
SeqParameterSet {
profile_idc: ProfileIdc(
100,
),
constraint_flags: ConstraintFlags {
flag0: false,
flag1: false,
flag2: false,
flag3: false,
flag4: false,
flag5: false,
reserved_zero_two_bits: 0,
},
level_idc: 51,
seq_parameter_set_id: SeqParamSetId(
0,
),
chroma_info: ChromaInfo {
chroma_format: YUV420,
separate_colour_plane_flag: false,
bit_depth_luma_minus8: 0,
bit_depth_chroma_minus8: 0,
qpprime_y_zero_transform_bypass_flag: false,
scaling_matrix: SeqScalingMatrix,
},
log2_max_frame_num_minus4: 9,
pic_order_cnt: TypeZero {
log2_max_pic_order_cnt_lsb_minus4: 9,
},
max_num_ref_frames: 1,
gaps_in_frame_num_value_allowed_flag: true,
pic_width_in_mbs_minus1: 39,
pic_height_in_map_units_minus1: 29,
frame_mbs_flags: Frames,
direct_8x8_inference_flag: true,
frame_cropping: None,
- vui_parameters: None,
+ vui_parameters: Some(
+ VuiParameters {
+ aspect_ratio_info: None,
+ overscan_appropriate: Unspecified,
+ video_signal_type: None,
+ chroma_loc_info: None,
+ timing_info: Some(
+ TimingInfo {
+ num_units_in_tick: 1270,
+ time_scale: 25400,
+ fixed_frame_rate_flag: false,
+ },
+ ),
+ nal_hrd_parameters: None,
+ vcl_hrd_parameters: None,
+ low_delay_hrd_flag: None,
+ pic_struct_present_flag: false,
+ bitstream_restrictions: None,
+ },
+ ),
} so essentially the new SPS additionally declares a fixed frame rate of 10 fps (a tick every 1,270 / 25,400 seconds, 2 ticks per frame because H.264 supports interlaced video). That sounds perfectly plausible and, except for the extra byte, valid. Next question: which is wrong: Next question: what should we do about it? I'm unsure:
Thoughts? I'm also wondering what other clients do. Did VLC log any errors/warnings when talking to this camera? ffmpeg uses some "interesting" macros...but I think is actually checking that the rbsp trailing bits are as expected, just as #define fixed(width, name, value) do { \
av_unused uint32_t fixed_value = value; \
xu(width, name, fixed_value, value, value, 0, ); \
} while (0)
// ...
#define xu(width, name, var, range_min, range_max, subs, ...) do { \
uint32_t value; \
CHECK(ff_cbs_read_unsigned(ctx, rw, width, #name, \
SUBSCRIPTS(subs, __VA_ARGS__), \
&value, range_min, range_max)); \
var = value; \
} while (0)
// ...
static int FUNC(rbsp_trailing_bits)(CodedBitstreamContext *ctx, RWContext *rw)
{
int err;
fixed(1, rbsp_stop_one_bit, 1);
while (byte_alignment(rw) != 0)
fixed(1, rbsp_alignment_zero_bit, 0);
return 0;
}
// ...
static int FUNC(sps)(CodedBitstreamContext *ctx, RWContext *rw,
H264RawSPS *current)
{
int err, i;
// ...
flag(vui_parameters_present_flag);
if (current->vui_parameters_present_flag)
CHECK(FUNC(vui_parameters)(ctx, rw, ¤t->vui, current));
else
CHECK(FUNC(vui_parameters_default)(ctx, rw, ¤t->vui, current));
CHECK(FUNC(rbsp_trailing_bits)(ctx, rw));
return 0;
} |
Another point of comparison: deepch/vdk just stops when it has all the data it knows to look for, here. This program... package main
import (
"encoding/hex"
"fmt"
"github.com/deepch/vdk/codec/h264parser"
"log"
)
func main() {
raw, err := hex.DecodeString("67640033ac1514a0a03da1000004f6000063380404")
if err != nil {
log.Fatalf("hex string wouldn't parse: %v", err)
}
sps, err := h264parser.ParseSPS(raw)
if err != nil {
log.Fatalf("sps wouldn't parse: %v", err)
}
fmt.Printf("sps: %+v\n", sps)
} prints: |
Is there some reference H264 implementation we could try it with? Why just
This may not be the only camera with this issue though, and it's harder to find replacements for door bell cameras compared to normal ones.
We could add a check for the in-band SPSs to see if the extra byte has length 1 and equals
If it randomly receives an invalid SPS several hours after the stream started I'd prefer if it threw an error. |
bluenviron ignores extra bytes. https://github.com/bluenviron/mediacommon/blob/main/pkg/codecs/h264/sps.go#L702 package main
import (
"encoding/hex"
"fmt"
"log"
"github.com/bluenviron/mediacommon/pkg/codecs/h264"
)
func main() {
raw, err := hex.DecodeString("67640033ac1514a0a03da1000004f6000063380404")
if err != nil {
log.Fatalf("hex string wouldn't parse: %v", err)
}
var sps h264.SPS
err = sps.Unmarshal(raw)
if err != nil {
log.Fatalf("sps wouldn't parse: %v", err)
}
fmt.Printf("sps: %+v\n", sps)
fmt.Printf("vui: %+v\n", *sps.VUI)
} prints
|
Turns out yes. https://www.itu.int/rec/T-REC-H.264.2-201602-I/en has a zip file with a bunch of stuff in it. Some of it seems to be specifically tailored for stuff like Scalable Video Codec. The basic thing seems to be the
It will crash trying to interpret video frames that aren't in the file, but it seems to parse the bogus SPS first. It doesn't care about the trailing RBSP bits being correct; it just stops reading a given NAL when it finds what it wants. I guess that's three implementations so far that just stop reading. Should we do the same? 🤷
🤷 That level of reverse engineering is more than I can do in the time I have.
It certainly would be possible to define a policy where it only allows the invalid SPS say before the first frame of video. |
You're right that it may mask other bugs, so it might be better to add a special exception for just this camera. @dholroyd what do you think? |
I can see that there is an example of other syntax in the spec where it's expected to have additional data between
But I can't see this being needed in a normal So I agree that the example you describe seems wrong. How would it be if This design still falls short on the diagnostic front because the data structures can't represent having parsed 'half' of a Anyway, in this case, the error would contain the complete |
How would this work if the error is before some mandatory element? I'm leaning toward a more modest thing of just adding a Retina flag to ignore the trailing data part only. I want Retina to work with a variety of cameras, but as a general practice I prefer to understand each distinct issue to the extent I can, rather than doing something broader that might paper over some unrelated bug. |
I got another report in which the camera sent the SPS
Wouldn't work; in this case it's never received a valid SPS.
In this case it's two bytes [edit: or three, extra 0x00 bytes after the last bit set are unnecessary but explicitly allowed] or three and they don't match the previous two, so apparently not. |
Any updates? |
Another report for the same error with a reolink cx410
|
I could make a knob for this, but it seems like tolerating this is common among RTSP clients, and just doing it by default saves some plumbing to the codec within Retina and (for Moonfire) creating config for it. Fixes #102
This should be fixed on Retina's |
cx410 and the doorbell camera confirmed working! |
reolink_working_vlc_pcap.zip
The text was updated successfully, but these errors were encountered: