This is fun and i like it! #49

Open
clort81 opened this issue Aug 27, 2021 · 46 comments

@clort81

clort81 commented Aug 27, 2021

What's the fastest sixel viewer?

@hackerb9
Owner

Good question! I'd love to have an answer to that on the lsix page.

I mostly use XTerm, the second slowest sixel interpreter I've ever seen, which is good enough for lsix. I can't recommend XTerm for general image viewing or animations -- although that's what I often use it for. mlterm and foot seem much faster.

For the Fastest Sixel Shooter in the West, I've seen some impressive sixel demos from @dankamongmen, so you may want to check out notcurses.

@j4james: any thoughts on a good sixel benchmark? It would be nice to have a good one for whatever test suite comes out of the vt340test.

I'm not sure if this will be helpful, but I wrote a gif viewer that measured frames per second by sending an inquiry to the terminal and presuming the image had finished rendering once it got a response.
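
The inquiry trick can be sketched roughly as follows. This is a minimal Python sketch, not sixvid's actual code: it assumes the inquiry is a Device Status Report (`ESC [ 6 n`), which terminals answer with a cursor position report (`ESC [ row ; col R`), and the helper names (`parse_cpr`, `write_all`, `show_frame_and_wait`) are hypothetical.

```python
import os
import re
import time

DSR_CURSOR_POSITION = b"\x1b[6n"                # "where is the cursor?" inquiry
CPR_REPLY = re.compile(rb"\x1b\[(\d+);(\d+)R")  # reply: ESC [ row ; col R


def parse_cpr(buf):
    """Return (row, col) if buf holds a complete cursor position report, else None."""
    m = CPR_REPLY.search(buf)
    return (int(m.group(1)), int(m.group(2))) if m else None


def write_all(fd, data):
    """os.write may write only part of the data; loop until everything is written."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def show_frame_and_wait(out_fd, in_fd, sixel_frame):
    """Send one frame plus the inquiry, block until the reply arrives, return seconds."""
    start = time.monotonic()
    write_all(out_fd, sixel_frame + DSR_CURSOR_POSITION)
    reply = b""
    while parse_cpr(reply) is None:
        chunk = os.read(in_fd, 64)
        if not chunk:  # EOF: the terminal went away
            raise EOFError("no reply from terminal")
        reply += chunk
    return time.monotonic() - start
```

On a real terminal the tty would also need to be in cbreak or raw mode (for example via `tty.setcbreak`) so the reply can be read without waiting for a newline. The measured time only shows that the terminal has processed the data up to the inquiry; whether the frame has actually reached the screen by then depends on the terminal.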

@dankamongmen

While I appreciate the kind words from @hackerb9, Notcurses's sixel implementation as of 2.3.x leaves some room for improvement. I'd say the current state-of-the-art is libsixel, which is (on many images) faster and more accurate than the homegrown Notcurses solution (with that said, Notcurses can handle some images that cause libsixel to fail). I don't expect this to remain the case for very long, but it's true at the moment. With regard to the Kitty graphics protocol, Notcurses is probably the most advanced.

I have some timing data here for ncplayer vs timg:
dankamongmen/notcurses#1857

Note that there are some design tradeoffs that can affect raw benchmarks. With Kitty, you can either sideload a file or transmit its contents directly; the former is obviously faster, but only works on the local machine. You can use or not use compression; it slows down your image encoding, but can speed up use over a network (though probably not if you already use compression with OpenSSH). With Sixel, you can share palettes among multiple images, which cuts overhead but is obviously only a win when there are multiple images available. There's then the issue of startup time -- a program with high startup time can see it amortized over multiple loads, but will look bad if invoked once per image. Finally, how much parallelism is used? High parallelism might look great when run by itself, but not look so good when the machine is otherwise loaded. Etc, etc, etc.
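
To illustrate the Kitty tradeoffs, here is a minimal Python sketch of the two transmission styles, based on my reading of the kitty graphics protocol rather than on Notcurses's code. The function names are hypothetical. `kitty_file` sends only a path (`t=f`) and assumes the file is a PNG; `kitty_direct` sends raw RGBA pixels in-band (`t=d`) in base64 chunks of at most 4096 characters, optionally zlib-compressed (`o=z`).

```python
import base64
import zlib

APC_START = "\x1b_G"   # kitty graphics commands are wrapped in APC ... ST
ST = "\x1b\\"


def kitty_file(path):
    """Sideload: send only the base64-encoded path; the terminal reads the PNG itself."""
    payload = base64.standard_b64encode(path.encode()).decode()
    return f"{APC_START}a=T,f=100,t=f;{payload}{ST}"


def kitty_direct(rgba, width, height, compress=False, chunk=4096):
    """Direct: send the pixel data in-band, split into chunks of base64 text."""
    keys = f"a=T,f=32,s={width},v={height},t=d"
    if compress:
        rgba = zlib.compress(rgba)
        keys += ",o=z"
    data = base64.standard_b64encode(rgba).decode()
    pieces = [data[i:i + chunk] for i in range(0, len(data), chunk)] or [""]
    out = []
    for n, piece in enumerate(pieces):
        more = 0 if n == len(pieces) - 1 else 1      # m=1 means more chunks follow
        control = f"{keys},m={more}" if n == 0 else f"m={more}"
        out.append(f"{APC_START}{control};{piece}{ST}")
    return "".join(out)
```

The sideloaded form is a few dozen bytes no matter how large the image is, which is why it wins locally and cannot work when the terminal is on another machine.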

With smaller images, it ought to be entirely possible to drive hundreds of frames per second.

@dankamongmen

it is interesting, for instance, to run ncls on a directory and also run lsix.

@dankamongmen

oh if you were asking about fastest display as opposed to encoding, Sixel is fastest on WezTerm and foot in my experience. Kitty is fast on both Kitty and WezTerm, the only two implementations of which i am aware.

@dankamongmen

@j4james: any thoughts on a good sixel benchmark? It would be nice to have a good one for whatever test suite comes out of the vt340test.

in addition to time performance, it would be awesome to compare quantized image with original. i've got a bug on this here: dankamongmen/notcurses#1724

@j4james

j4james commented Aug 28, 2021

I haven't really looked, but the only sixel benchmark I've come across was https://github.com/jerch/sixel-bench, which just measures the throughput of a video sequence that's been pre-encoded as sixel.

It's not really a good way to compare implementations though, because it doesn't account for the rendering frame rate. For example, if your renderer is the bottleneck, you can improve your throughput dramatically just by dropping 90% of the frames, but that's not what most people would consider an improvement.
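
To make the distinction concrete, here is a small Python sketch with made-up numbers (none of these figures come from sixel-bench, and the function names are hypothetical). It compares a terminal that draws every frame with one that drops 90% of them and finishes sooner.

```python
def throughput_mb_per_s(total_bytes, seconds):
    """What a throughput benchmark reports: bytes accepted per second."""
    return total_bytes / seconds / 1e6


def displayed_fps(frames_drawn, seconds):
    """What a viewer experiences: frames actually drawn per second."""
    return frames_drawn / seconds


FRAMES, TOTAL_BYTES = 1000, 100_000_000   # made-up clip: 1000 frames, 100 MB

# Terminal A draws all 1000 frames in 20 s.
honest = (throughput_mb_per_s(TOTAL_BYTES, 20.0), displayed_fps(FRAMES, 20.0))
# Terminal B draws only 100 of the frames and finishes in 4 s.
dropper = (throughput_mb_per_s(TOTAL_BYTES, 4.0), displayed_fps(FRAMES // 10, 4.0))

print(honest)   # (5.0, 50.0)
print(dropper)  # (25.0, 25.0)
```

In this example terminal B reports five times the throughput while showing half as many frames per second.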

That said, I do think it could still be useful for a terminal developer evaluating improvements in their own implementation, as long as they understand what they're actually measuring.

@dnkl

dnkl commented Aug 28, 2021

Note: To avoid false high throughput artefacts by aggressive prebuffering, the script waits for a cursor position report sent from the terminal after the Sixel data.

@j4james nice touch!

@hackerb9
Owner

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

| Terminal | Frames per Second |
| --- | --- |
| foot 1.6.4 | 169 FPS |
| mlterm 3.9.0 | 240 FPS |
| XTerm(366) | 223 FPS |

@hackerb9
Owner

Here's my sixvid gif viewer if you'd like to see relative speeds of sixel implementations on your own computers: https://github.com/hackerb9/sixvid

Usage: sixvid nyantocat.gif

(Hit the b key to toggle benchmarking mode).

@dankamongmen

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

| Terminal | Frames per Second |
| --- | --- |
| foot 1.6.4 | 169 FPS |
| mlterm 3.9.0 | 240 FPS |
| XTerm(366) | 223 FPS |

interesting. here's ncplayer -bpixel -d0 ../data/notcursesIII.mkv -t0 -q:

| term | version | pgeom | cgeom | times |
| --- | --- | --- | --- | --- |
| mlterm | 3.9.0 | 880x1406 | 80x61 | 1m5.142s 1m3.731s 1m3.631s |
| xterm | 368 | 880x1403 | 88x74 | 55.655s 55.433s 55.841s |
| alacritty | 0.13.1 | 880x1400 | 88x70 | 55.716s 55.902s 55.709s |

notcurses 2.3.17. were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

@dankamongmen

if we look beyond sixel:

| term | version | pgeom | cgeom | times |
| --- | --- | --- | --- | --- |
| kitty | 0.23.1 | 880x1440 | 88x70 | 33.179s 33.477s 33.385s |
| kitty | 0.19.3 | 880x1440 | 88x70 | 54.722s 54.473s 54.867s |

@dankamongmen

i put these stats up at https://nick-black.com/dankwiki/index.php?title=Notcurses#Pixel_blitters. @kovidgoyal, i know you like this kind of thing =]

@kovidgoyal

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

@j4james

j4james commented Aug 29, 2021

I couldn't get sixvid to work - the screen just went black. I don't know whether it's dependent on something I don't have installed. I didn't spend much time trying to figure it out, but will give it another go tomorrow. In the meantime, though, I have run sixel-bench on some of the terminals in my collection.

As I mentioned above, some will just drop frames (sometimes all frames), so when that's obviously the case I've just discounted them from the running (that includes Rxvt, St, and Yaft). I've also not included any Windows terminals, because I wanted to limit the competition to the same VM.

All of the ones below at least gave the appearance of rendering all the frames, so I think there's a reasonable chance they're competing fairly, but I wouldn't read too much into these results. Ordered from fastest to slowest:

| Rank | Terminal | Time |
| --- | --- | --- |
| 1 | VTE | 4s |
| 2 | MLTerm | 10s |
| 3 | Alacritty | 21s |
| 4 | XTerm | 23s |
| 5 | Contour | 27s |
| 6 | WezTerm | 159s |
| 7 | DomTerm | 525s |

I don't understand why WezTerm was so slow given the earlier praise it got from @dankamongmen - I did download the latest nightly to see if that made any difference, but no such luck. I don't know whether perhaps my use of a VM might affect some terminals more than others.

At any rate, VTE looks to me to be the winner by a long way. It's possible it gets that speed from not rendering all the frames, but it wasn't obviously doing anything like that. It just looked fast and smooth.

@hackerb9
Owner

were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

A good question and one I am gratified to be able to answer promptly: I do not know. It runs at the same speed when I unset the DISPLAY, if that means anything.

@hackerb9
Owner

I couldn't get sixvid to work - the screen just went black.

Huh. I wonder if ImageMagick is calling ffmpeg even for GIF decoding.

@dankamongmen

j4james: my praise for WezTerm is for its kitty implementation. I recall it being unremarkable with sixel, though not as poor as your results would suggest. if you built from source, did you use "cargo build --release"? unoptimized rust is very very slow.

that VTE number shocks me. I didn't even think VTE implemented sixel?

@dankamongmen

were you running foot in a standalone Wayland, or xwayland? i'm not sure the latter would be a particularly fair comparison.

A good question and one I am gratified to be able to answer promptly: I do not know. It runs at the same speed when I unset the DISPLAY, if that means anything.

so foot, AFAIK, is a wayland-only terminal. the others are mostly X, though a few can do both. I'm not sure how you would run foot in Xorg without running Xwayland, interesting.

@dankamongmen

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

correct, Notcurses eschews zee sideband

@dankamongmen

with that said, my sixel vs kitty numbers oughtn't be taken out of the Notcurses context. my sixel quantization algorithm, as I've mentioned, is not where I'd like it to be. with that said, the kitty 0.19.3 numbers are very comparable to the best sixel numbers, but the kitty 0.23.1 runs crush them. this is due to improvements in kitty's internals, improvements in the kitty protocol, and Notcurses taking advantage of those improvements.

with that said, I'm a big believer in the superiority of the kitty protocol in just about every sense, even (perhaps counter-intuitively) implementation complexity (no need to deal with heights that aren't multiples of 6, no need to avoid the bottom row, no need to quantize).
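
The "multiples of 6" point comes from how sixel packs pixels: each data character encodes a column of six vertically stacked pixels, so an image is sent as bands six pixels tall. The following is a minimal Python sketch of that, not Notcurses's encoder. The `solid_sixel` helper is hypothetical, handles a single solid color only, and skips quantization entirely. It shows how a final, partial band has to be handled when the height is not a multiple of 6.

```python
def solid_sixel(width, height, rgb=(100, 0, 0)):
    """Encode a width x height rectangle of one color as a sixel string.

    rgb components are percentages (0-100), as sixel color registers expect.
    """
    r, g, b = rgb
    out = ["\x1bPq"]                      # DCS q: start of sixel data
    out.append(f'"1;1;{width};{height}')  # raster attributes: 1:1 aspect, pixel size
    out.append(f"#0;2;{r};{g};{b}")       # define color register 0 as RGB
    bands = -(-height // 6)               # ceil(height / 6)
    for band in range(bands):
        rows = min(6, height - band * 6)  # the last band may be short
        mask = (1 << rows) - 1            # lowest bit is the top pixel of the band
        out.append(f"#0!{width}{chr(63 + mask)}")  # !n<char> repeats a column n times
        if band != bands - 1:
            out.append("-")               # graphics newline: move to the next band
    out.append("\x1b\\")                  # ST: end of sixel data
    return "".join(out)
```

For a 4x8 image this emits one full band (`~`, all six bits set) and one short band (`B`, only the top two bits set).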

@j4james

j4james commented Aug 29, 2021

if you built from source, did you use "cargo build --release"?

Nope, I just downloaded the nightly deb package from github and installed that. Many of the others were built from source, though, so that's definitely worth bearing in mind. A different compiler could easily make a difference to the performance. As I said, don't read too much into my results.

I didn't even think VTE implemented sixel?

Like XTerm, you've got to build it yourself with the appropriate option enabled. I don't think you'll find it in a released package anywhere.

no need to avoid the bottom row

Technically you shouldn't need to avoid the bottom row with sixel either - that's just a bug in most implementations.

@hackerb9
Owner

@j4james I've put in some sanity checks so hopefully you'll get an error message now instead of a blank screen.

@hackerb9
Owner

Technically you shouldn't need to avoid the bottom row with sixel either - that's just a bug in most implementations.

@j4james just schooled me on that one this week. 😳 I had been complaining that my VT340 doesn't go to the next line at the end of sixels. Turns out that's a feature.

@dankamongmen

I didn't even think VTE implemented sixel?
Like XTerm, you've got to build it yourself with the appropriate option enabled. I don't think you'll find it in a released package anywhere.

huh. that number seems very suspect, as i've found the total time to generally be dominated by transfer, not by actual rendering. i'm likewise surprised that mlterm could possibly score so high, as it's consistently been the slowest in my testing. very strange.

@dankamongmen

@j4james just schooled me on that one this week. I had been complaining that my VT340 doesn't go to the next line at the end of sixels. Turns out that's a feature.

oh absolutely, i'd love for that to be the case. i have a good hundred lines of code devoted to dealing with this annoyance. kitty meanwhile has C=1 which means DON'T SCROLL DAMNIT and is a fine thing.

@dankamongmen

oh absolutely, i'd love for that to be the case. i have a good hundred lines of code devoted to dealing with this annoyance. kitty meanwhile has C=1 which means DON'T SCROLL DAMNIT and is a fine thing.

meanwhile the linux framebuffer console keeps drawing kinda-but-not-completely distinct from text, so i have to do all my image scrolling manually there =]

@hackerb9
Owner

@dankamongmen wrote:

I've found the total time to generally be dominated by transfer, not by actual rendering.

Interesting. I added a --shm option to sixvid to create the temporary sixel files in /dev/shm/$USER (if the machine has that filesystem mounted). Unsurprisingly, it doesn't give much of a speed boost since the sixel files are cached in memory after the first play through.

@kovidgoyal

Yeah thanks, and am happy to see no surprises there, and this is without using side band transmission even, I think?

correct, Notcurses eschews zee sideband

Since, IIRC, times are dominated by encoding/transmission/decoding using the sideband should yield substantial improvements. Most well designed terminal emulators should not have rendering as a bottleneck.

@hackerb9
Owner

@dankamongmen writes:

I'm a big believer in the superiority of the kitty protocol in just about every sense, even (perhaps counter-intuitively) implementation complexity (no need to deal with heights that aren't multiples of 6, no need to avoid the bottom row, no need to quantize).

You'll find no argument. In fact, I think they may not even be comparable. It's not just that sixel is a protocol from 30 years ago. The kitty graphics protocol became a completely different type of thing once it added the notion of tagged images. Sixel splats bitmaps on the screen and forgets about them. A terminal that supports the kitty protocol must treat images as first class citizens, on the same level as text. That's a huge paradigm shift.

While I have some questions (how do sixel and kitty graphics interact? why can't text erase graphics? how do graphics interact with VT220 scrolling windows? do the images disappear as one would expect when switching text pages? does switching back make them reappear? why does loop=2 mean loop once, and loop=1 mean ∞?), I think kitty is the most probable future. @kovidgoyal has done a yeoman's job with the kitty graphics protocol and I look forward to it becoming standardized.

In the meantime, though, I'm having fun with sixels. ImageMagick understands them so I can do quick tests from the command line as I manipulate images or write tiny shell scripts like lsix. My preferred terminal, XTerm, has support builtin. I can browse the web using sixels in w3m. Even my humble DEC VT340 can display sixels, albeit at 9600 baud. Definitely not the future, but I like it.

@kovidgoyal

kovidgoyal commented Aug 29, 2021 via email

@hackerb9
Owner

hackerb9 commented Aug 29, 2021

So, I just tried my ancient sixvid gif viewer on the sixel capable terminals that are packaged with the current Debian GNU/Linux (11, bullseye) and got some results that were not what I expected.

| Terminal | Frames per Second |
| --- | --- |
| foot 1.6.4 | 169 FPS |
| mlterm 3.9.0 | 240 FPS |
| XTerm(366) | 223 FPS |

This time I tried my sixvid script on a different video source (live action instead of an animated GIF) and the speeds reversed, with foot being the fastest.

@hackerb9
Owner

hackerb9 commented Aug 29, 2021

If you mean the alternate screen, yes the main and alternate screen maintain their own independent image lists.

Similar, but not quite. I meant Page Memory, which is how sixel does double-buffering. You can choose which page to write on and which page to display and they don't have to be the same.

@dnkl

dnkl commented Aug 29, 2021

There's been some performance improvements to the sixel decoder in foot since 1.6.4. Still, sixvid with nyancat is much slower than I'd expect. This is something I'm going to want to look into.

(all benchmarks run on a lousy laptop)

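sixel-bench

| Terminal | Time | Throughput |
| --- | --- | --- |
| foot | 3.28s | 36.13 MB/s |
| mlterm | 4.0s | 29.62 MB/s |
| xterm | 61.14s | 1.94 MB/s |

sixvid (nyancat)

| Terminal | FPS |
| --- | --- |
| foot | 61 FPS |
| XTerm | 17 FPS |
| MLTerm | 118 FPS |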

@hackerb9
Owner

@dnkl wrote:

sixvid (nyancat)

| Terminal | FPS |
| --- | --- |
| foot | 61 FPS |
| XTerm | 17 FPS |
| MLTerm | 118 FPS |

Uh. What happened to your XTerm? 17 frames per second on nyantocat? I just pulled out an old 32-bit Pentium M laptop (circa 2005) to test and I get better FPS than that. (Not by much, mind you. But, still...) Surely your laptop isn't more than fifteen years old, right?

@dnkl

dnkl commented Aug 29, 2021

Uh. What happened to your XTerm?

Good question... turned out to be this:

xterm*maxGraphicSize: 10000x10000

With the default maxGraphicSize I get 110 FPS.

@dnkl

dnkl commented Aug 29, 2021

And with 1920x1080 (to match my monitor), I get just above 100 FPS.

@dnkl

dnkl commented Aug 29, 2021

I took a quick look at the nyancat issue, but I'm not really sure what's going on; none of the processes are using anywhere near 100% CPU.

My best guess atm is that sixvid is stalling on a full PTY pipe, since foot does not consume any PTY data while rendering. However, I would expect much higher CPU usage from foot if that was the case, as well as long rendering times. But most frames are rendered in less than 1ms.

I'll have to sit down one day and dig deeper into this. @hackerb9 thanks for what looks like a very interesting benchmark :D

@kovidgoyal

kovidgoyal commented Aug 29, 2021 via email

@j4james

j4james commented Aug 29, 2021

@j4james I've put in some sanity checks so hopefully you'll get an error message now instead of a blank screen.

Thanks - that helped a lot. It turns out I just didn't have ffmpeg installed on my test VM.

I'm not going to list specific numbers this time, because there was quite a lot of fluctuation in the frame rates, but the results were in the same ballpark as the sixel-bench test. VTE is still at the top, and WezTerm and DomTerm are still clearly at the bottom. The midfield were fairly close, but if I had to order them, I'd consider Alacritty and Contour the leaders of that group now, with Xterm and MLTerm bringing up the rear (which seems more in line with the results @dankamongmen was seeing).

That said, something that came up in the sixvid test that wasn't apparent with sixel-bench, is that Alacritty eats through memory like there's no tomorrow. After running sixvid for a minute or so it had chewed up several gigs of memory and died.

Sixvid also enabled me to get what I think was a fairer test of Rxvt, St, and Yaft, since they were now actually displaying all the frames (or at least appeared to be). The first two didn't do very well though - around the same speed as WezTerm - and Rxvt had the same memory-eating issue as Alacritty. However, Yaft's performance was fantastic - it seems about 50% faster than VTE even. It is a framebuffer terminal, so maybe that's a factor, and it's possible it's just not showing all the frames, but from a user point of view it looked great.

@hackerb9
Owner

hackerb9 commented Aug 31, 2021

there was quite a lot of fluctuation in the frame rates

Yeah, I saw that from some of the terminal emulators. I had expected that to only happen from videos with differing levels of compressibility, but not nyantocat. Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.
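
As a rough illustration of the difference between an overall reading and a fluctuating live one, here is a minimal Python sketch. It is not sixvid's actual calculation; the `FpsMeter` class and its `window` parameter are hypothetical. The overall figure divides all frames by the time since the meter was created (which would be after decoding and sixelizing), while the current figure only looks at the last few frames.

```python
import time
from collections import deque


class FpsMeter:
    def __init__(self, window=30, clock=time.monotonic):
        self.clock = clock
        self.start = clock()                 # create the meter once encoding is done
        self.frames = 0
        self.recent = deque(maxlen=window)   # timestamps of the latest frames

    def tick(self):
        """Record that one frame has been displayed."""
        self.frames += 1
        self.recent.append(self.clock())

    def overall(self):
        """Frames per second since the meter was created."""
        elapsed = self.clock() - self.start
        return self.frames / elapsed if elapsed > 0 else 0.0

    def current(self):
        """Frames per second over the last few frames only."""
        if len(self.recent) < 2:
            return 0.0
        span = self.recent[-1] - self.recent[0]
        return (len(self.recent) - 1) / span if span > 0 else 0.0
```

Calling `tick()` after each displayed frame and printing `overall()` on exit gives a single summary number, while `current()` responds quickly to changes because old frames fall out of the window.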

That said, something that came up in the sixvid test that wasn't apparent with sixel-bench, is that Alacritty eats through memory like there's no tomorrow. After running sixvid for a minute or so it had chewed up several gigs of memory and died.

Yipes!

Yaft's performance was fantastic - it seems about 50% faster than VTE even. It is a framebuffer terminal, so maybe that's a factor, and it's possible it's just not showing all the frames, but from a user point of view it looked great.

Wow. I had benchmarked yaft as plenty fast, but middling on my machine. Are you using any sort of special framebuffer in the VM that might affect the performance measurement? Yaft lacks a scroll back buffer, right?

@j4james

j4james commented Aug 31, 2021

Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.

That's brilliant. Thanks.

Are you using any sort of special framebuffer in the VM that might affect the performance measurement?

Not that I'm aware of. But maybe it's just that the GUI-based terminals are at a disadvantage running in the VM because they're not getting the video acceleration they would usually get. The results for WezTerm in particular seem hard to believe.

Also I know I can't run foot at all, because it depends on Wayland, which doesn't work in my VM. And perhaps if I did have Wayland that would give a performance boost to some terminals that are otherwise limited by X11.

Yaft lacks a scroll back buffer, right?

Yeah I think so. There's no concept of a scrollbar, and mouse wheel scrolling doesn't seem to do anything.

@hackerb9
Owner

hackerb9 commented Sep 5, 2021

Anyhow, I've added a final FPS reading when you quit the program which will give you an overall frames per second, starting from the point where the decoding and sixelizing finished.

That's brilliant. Thanks.

I've fixed the FPS calculation to take much less time to converge. Previously, you had to wait quite a while before it gave the correct answer.

Also I know I can't run foot at all, because it depends on Wayland, which doesn't work in my VM. And perhaps if I did have Wayland that would give a performance boost to some terminals that are otherwise limited by X11.

What is your VM setup? Perhaps I or someone else can replicate your results.

@j4james

j4james commented Sep 5, 2021

What is your VM setup? Perhaps I or someone else can replicate your results.

The VM is Hyper-V, running on Windows 10.0.18363.1500. It's got 2 virtual processors, and it's allocated 3GB of memory (but with the dynamic memory option enabled, so I think it can use more than that). The guest OS is Ubuntu 20.04, but it wasn't installed as that initially (it's been upgraded a couple of times). I probably should try doing a clean install from the VM "quick create" setup, but I'm not convinced that'll make any difference.

@hackerb9
Owner

hackerb9 commented Sep 5, 2021

The VM is Hyper-V, running on Windows 10.0.18363.1500.

I can't replicate Hyper-V as I don't have Microsoft Windows. Do you have an old junker computer or laptop you could install Ubuntu onto? I think that'd make a bigger difference than trying to reinstall within Hyper-V.

Alternately, you could use a Live USB stick to try Ubuntu on the bare hardware without affecting your Microsoft Hyper-V installation.

@j4james

j4james commented Sep 5, 2021

Do you have an old junker computer or laptop you could install Ubuntu onto?
Alternately, you could use a Live USB stick to try Ubuntu on the bare hardware without affecting your Microsoft Hyper-V installation.

Neither of those options is feasible for me, and I'm not that enthusiastic about performance testing anyway. I was just curious to get a general idea of the speed of other terminals to compare with my own crappy implementation. Personally I care more about the correctness of the Sixel emulation, and if I'm close to the middle of the range in performance, I'd consider that a win.

@hackerb9
Owner

hackerb9 commented Sep 5, 2021

I'm not that enthusiastic about performance testing anyway.

More than fair enough. I look forward to working with you more on correctness of sixel implementations.
