aomenc tune=vmaf result in SIGFPE, Arithmetic exception #630

Closed
eclipseo opened this issue Jul 17, 2020 · 14 comments

Comments

@eclipseo

I wanted to try tune=vmaf on some pictures, but the process crashes with SIGFPE. The backtrace seems to involve libvmaf.

What steps will reproduce the problem?

  1. /usr/bin/aomenc --tile-columns=4 --cpu-used=2 --full-still-picture-hdr --passes=2 --lossless=1 --tune=vmaf_with_preprocessing --vmaf-model-path=/usr/share/model/vmaf_v0.6.1.pkl -o AOM-VMAF_out/subset1/89_-Vézelay_Basilique_2/89-Vézelay_Basilique_2-lossless.webm /tmp/308324889-_Vézelay_Basilique_2.png.y4m

What is the expected output?
A valid WebM output, tuned for VMAF.

What do you see instead?
"aomenc" received signal SIGFPE, Arithmetic exception.

What version / commit were you testing with? (git describe can produce this
info if building from source). On what operating system?
Tested both aom 2.0.0 and the GIT tip.
libvmaf is 1.5.2
GCC 10.0.1
Fedora 32

Please provide any additional information below.

Backtrace from GDB:

Thread 2 "aomenc" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff4308640 (LWP 3101722)]
0x00007ffff74fe1e3 in adm_csf_den_scale_s (src=src@entry=0x7ffff4307ab0, orig_h=orig_h@entry=806, scale=scale@entry=0, w=w@entry=630, h=h@entry=403, 
    src_stride=src_stride@entry=2528, border_factor=border_factor@entry=0.10000000000000001) at ../src/feature/adm_tools.c:337
337                             accum_inner_d += val;
(gdb) bt
#0  0x00007ffff74fe1e3 in adm_csf_den_scale_s (src=src@entry=0x7ffff4307ab0, orig_h=orig_h@entry=806, scale=scale@entry=0, w=w@entry=630, 
    h=h@entry=403, src_stride=src_stride@entry=2528, border_factor=border_factor@entry=0.10000000000000001) at ../src/feature/adm_tools.c:337
#1  0x00007ffff74fd2d5 in compute_adm (ref=ref@entry=0x5555636889a0, dis=dis@entry=0x555565982f60, w=630, w@entry=1260, h=403, h@entry=806, 
    ref_stride=ref_stride@entry=5056, dis_stride=dis_stride@entry=5056, score=0x7ffff4307cc8, score_num=0x7ffff4307cd8, score_den=0x7ffff4307ce0, 
    scores=0x7ffff4307d90, border_factor=border_factor@entry=0.10000000000000001, adm_enhn_gain_limit=adm_enhn_gain_limit@entry=100)
    at ../src/feature/adm.c:183
#2  0x00007ffff743d330 in combo_threadfunc (vmaf_thread_data=<optimized out>) at ../src/combo.c:321
#3  0x00007ffff7a6a53a in start_thread () from /lib64/libpthread.so.0
#4  0x00007ffff7645283 in clone () from /lib64/libc.so.6
@li-zhi
Collaborator

li-zhi commented Jul 17, 2020 via email

@eclipseo
Author

I tried with release 1.5.2:

What version / commit were you testing with? (git describe can produce this
info if building from source). On what operating system?
Tested both aom 2.0.0 and the GIT tip.
libvmaf is 1.5.2
GCC 10.0.1
Fedora 32

@eclipseo
Author

(gdb) info locals
abs_csf_o_val_h = 9.37698559e-08
abs_csf_o_val_v = 0
abs_csf_o_val_d = 6.69693577e-16
src_h = <optimized out>
src_v = <optimized out>
src_d = <optimized out>
src_px_stride = 632
factor1 = <optimized out>
factor2 = <optimized out>
rfactor = {0.0173815377, 0.0173815377, 0.005890687}
accum_h = 384.66391
accum_v = 1189.8446
accum_d = 2.49340606
accum_inner_h = 0.65392971
accum_inner_v = 3.14131665
accum_inner_d = 0.00294110784
den_scale_h = <optimized out>
den_scale_v = <optimized out>
den_scale_d = <optimized out>
val = 0
left = 62
top = 39
right = 568
bottom = 364
i = 284
j = 303
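
One way to read this dump, offered as a sketch rather than a diagnosis: `val` is 0 while `abs_csf_o_val_d` is tiny but non-zero, which is what a single-precision underflow would look like. The code below assumes that `val` is the cube of the current coefficient and that the samples are 32-bit floats; neither is shown in this thread. It also assumes the border is `int(n * border_factor - 0.5)`, which happens to reproduce `left`, `top`, `right` and `bottom` above. Whether floating-point traps were actually enabled in this build is not established here.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def cube_f32(x: float) -> float:
    """Cube x, rounding to binary32 after each multiply (ASSUMED form of `val`)."""
    x = to_f32(x)
    return to_f32(to_f32(x * x) * x)


def border(n: int, border_factor: float) -> tuple[int, int]:
    """ASSUMED border rule: truncate n * border_factor - 0.5, mirror on the far side."""
    near = int(n * border_factor - 0.5)
    return near, n - near


# Values copied from the `info locals` dump and the backtrace arguments.
coeffs = {"h": 9.37698559e-08, "v": 0.0, "d": 6.69693577e-16}
cubes = {band: cube_f32(value) for band, value in coeffs.items()}

# A non-zero input whose cube rounds to exactly 0 has underflowed in binary32.
underflowed = {band: coeffs[band] != 0.0 and cubes[band] == 0.0 for band in coeffs}

left, right = border(630, 0.1)   # w = 630
top, bottom = border(403, 0.1)   # h = 403

# The crash position from the dump lies inside the bordered region, not on an edge.
crash_inside = top <= 284 < bottom and left <= 303 < right
```

Under these assumptions only the `d` band underflows, which lines up with the fault being reported at `accum_inner_d += val;`. If underflow traps are unmasked in the process, that would be enough to raise SIGFPE on ordinary content.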

@li-zhi
Collaborator

li-zhi commented Jul 17, 2020

I see. The problem already existed with 1.5.2.

Could you try it on 1.5.1 instead?

@eclipseo
Author

> I see. The problem already existed with 1.5.2.
>
> Could you try it on 1.5.1 instead?

I tried with 1.5.1, which is provided by Fedora, and with 1.5.2, which I compiled from source. In both cases the error is the same.

@li-zhi
Collaborator

li-zhi commented Jul 17, 2020

In this case, we'll have to start by reproducing the issue. I know libaom tune=vmaf keeps intermediate output. Would we be able to reproduce the error message from the intermediate video frames?

@eclipseo
Author

> In this case, we'll have to start by reproducing the issue. I know libaom tune=vmaf keeps intermediate output. Would we be able to reproduce the error message from the intermediate video frames?

In this case, I'm working on a single y4m video frame. My intent was to test avif.

@eclipseo
Author

I've tried with this video: https://media.xiph.org/video/derf/y4m/vidyo4_720p_60fps.y4m
Same error: the first pass works, but it crashes on the first frame of the second pass.

@li-zhi
Collaborator

li-zhi commented Jul 17, 2020

Were you able to grab the intermediate video frames generated in the second pass?

To be sure, you are encoding a single frame?

@eclipseo
Author

> Were you able to grab the intermediate video frames generated in the second pass?

Not sure how I should do that? I don't have any output from aomenc, just a segfault.

> To be sure, you are encoding a single frame?

That's what I was trying to do, but as mentioned above I have the same issue with a video file.

@li-zhi
Collaborator

li-zhi commented Jul 17, 2020

Have you had any successful runs using tune=vmaf on other clips? Wondering if this is a content-dependent problem or system configuration problem.

@eclipseo
Author

eclipseo commented Jul 17, 2020

> Have you had any successful runs using tune=vmaf on other clips? Wondering if this is a content-dependent problem or system configuration problem.

No success. I'll try recompiling the entire chain in a chroot, from aom to vmaf and ffmpeg. It's probably a config issue, since I seem to be the only one affected?

No luck with the re-build, still fails.

@eclipseo
Author

I have tried everything I can think of and I still have this issue:

  • installed on a pristine system
  • rebuilt with clang to see if the error was due to the compiler
  • compiled git tip of both aom and vmaf
  • tried other models
  • tried with no option other than tune=vmaf
  • recompiled with libvmaf as a static library (it is built shared by default on my system)
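
For anyone repeating the checks listed above, here is a minimal sketch of a harness that runs each variant and records how the process ended, so that a SIGFPE is distinguished from an ordinary non-zero exit. The flags are the ones from the original command; the input, output and label names are hypothetical placeholders, and `build_cases` only covers two of the variants.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Map a subprocess return code to 'ok', 'exit N', or a signal name."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as the negated signal number.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def run_case(label: str, cmd: list[str]) -> tuple[str, str]:
    """Run one encoder invocation and classify how it ended."""
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return label, describe_exit(proc.returncode)


def build_cases(aomenc: str, model: str, src: str) -> dict[str, list[str]]:
    """Two variants of the reported command; output file names are placeholders."""
    tune = ["--tune=vmaf_with_preprocessing", f"--vmaf-model-path={model}"]
    return {
        "as-reported": [
            aomenc, "--tile-columns=4", "--cpu-used=2", "--full-still-picture-hdr",
            "--passes=2", "--lossless=1", *tune, "-o", "as-reported.webm", src,
        ],
        "tune-only": [aomenc, *tune, "-o", "tune-only.webm", src],
    }
```

Calling `run_case(label, cmd)` for each entry returned by `build_cases("/usr/bin/aomenc", "/usr/share/model/vmaf_v0.6.1.pkl", "input.y4m")` gives one `(label, outcome)` pair per variant, where the outcome reads `SIGFPE` if the process died the way the report describes.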

@li-zhi
Collaborator

li-zhi commented Jul 21, 2020 via email

@li-zhi li-zhi closed this as completed Sep 2, 2020