Kaleido hangs on repeated write_image calls. #42

Closed
36000 opened this issue Sep 18, 2020 · 5 comments · Fixed by #43
Labels
bug something broken

Comments

@36000

36000 commented Sep 18, 2020

I am trying to use kaleido to make a GIF by writing out a series of PNGs and then concatenating them together. The first few calls to write_image succeed, but at some point it just hangs. I have pickled my plotly figure and attached it to this post, along with code to reproduce the problem. I am using kaleido==0.0.3.post1, but I have tried many other pip-installable versions (0.0.1, 0.0.2, 0.0.3) with the same result. BTW, I really like this software! It makes it so much easier to write out plotly figures.

import pickle
import tempfile
from time import time
import numpy as np

# Load the attached pickled plotly figure
with open('example_fig.obj', 'rb') as ff:
    figure = pickle.load(ff)

tdir = tempfile.gettempdir()
n_frames = 60
zoom = 2.5
z_offset = 0.5

# Rotate the camera one full turn around the scene, exporting a PNG per frame
for i in range(n_frames):
    start = time()
    theta = (i * 6.28) / n_frames  # fraction of a full rotation (~2*pi)
    camera = dict(
        eye=dict(x=np.cos(theta) * zoom,
                 y=np.sin(theta) * zoom, z=z_offset)
    )
    figure.update_layout(scene_camera=camera)
    figure.write_image(tdir + f"/tgif{i}.png")
    print(time() - start)
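
For reference, the "concatenating them together" step could look roughly like the sketch below (this assumes the imageio package and an arbitrary output filename, neither of which is part of the report itself):

import imageio

# Assemble the PNGs written by the loop above into a single GIF
gif_frames = [imageio.imread(tdir + f"/tgif{i}.png") for i in range(n_frames)]
imageio.mimsave(tdir + "/rotation.gif", gif_frames, duration=1 / 30)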

example_fig.zip
This is the output before it stalls:

10.502809762954712
10.133813619613647
10.165985584259033
10.156563758850098
10.303324222564697
10.326725006103516
10.667879581451416
10.48402190208435
10.299865007400513
10.463427543640137
10.515920162200928
10.529618263244629
10.70783519744873
10.670512676239014
10.657939195632935
10.771514415740967
10.917421340942383
10.875455617904663
11.05155086517334
11.001903772354126
11.043968915939331
11.229110717773438
11.24083161354065
11.347798109054565
11.507253408432007
11.527276039123535
11.55422854423523
13.812325954437256
13.134565353393555
26.34061861038208
30.875459909439087
@jonmmease
Collaborator

Thanks for the report and example, @36000. I was able to reproduce the issue. Here is my output (after interrupting the kernel once it had been hanging for a minute or two):

12.307554006576538
10.964718103408813
10.78917670249939
10.571112632751465
10.875031471252441
10.775719165802002
10.85896372795105
11.100398302078247
11.008890390396118
11.293341398239136
11.152736186981201
11.235199689865112
11.43349838256836
11.315977096557617
11.50989055633545
11.462449789047241
11.620534181594849
11.627352714538574
11.686328887939453
11.837352275848389
12.046703577041626
12.051520586013794
12.280725002288818
12.206238508224487
12.196724653244019
12.240059614181519
12.327786684036255
12.407614469528198
12.416523456573486
17.26522159576416
22.575580835342407
22.482662439346313
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-13-12fa48e63b5c> in <module>
     11     )
     12     figure.update_layout(scene_camera=camera)
---> 13     figure.write_image(tdir + f"/tgif{i}.png")
     14     print(time()-start)

~/anaconda3/envs/kaleido_install/lib/python3.7/site-packages/plotly/basedatatypes.py in write_image(self, *args, **kwargs)
   3249         import plotly.io as pio
   3250 
-> 3251         return pio.write_image(self, *args, **kwargs)
   3252 
   3253     # Static helpers

~/anaconda3/envs/kaleido_install/lib/python3.7/site-packages/plotly/io/_kaleido.py in write_image(fig, file, format, scale, width, height, validate, engine)
    249         height=height,
    250         validate=validate,
--> 251         engine=engine,
    252     )
    253 

~/anaconda3/envs/kaleido_install/lib/python3.7/site-packages/plotly/io/_kaleido.py in to_image(fig, format, width, height, scale, validate, engine)
    129     fig_dict = validate_coerce_fig_to_dict(fig, validate)
    130     img_bytes = scope.transform(
--> 131         fig_dict, format=format, width=width, height=height, scale=scale
    132     )
    133 

~/anaconda3/envs/kaleido_install/lib/python3.7/site-packages/kaleido/scopes/plotly.py in transform(self, figure, format, width, height, scale)
    101         # response dict, including error codes.
    102         response = self._perform_transform(
--> 103             figure, format=format, width=width, height=height, scale=scale
    104         )
    105 

~/anaconda3/envs/kaleido_install/lib/python3.7/site-packages/kaleido/scopes/base.py in _perform_transform(self, data, **kwargs)
    216             self._proc.stdin.write("\n".encode('utf-8'))
    217             self._proc.stdin.flush()
--> 218             response = self._proc.stdout.readline()
    219 
    220         response_string = response.decode('utf-8')

KeyboardInterrupt: 

There was also an internally buffered error log, which can be retrieved like this:

import plotly.io as pio
scope = pio.kaleido.scope
print(scope._std_error.getvalue().decode())
<--- Last few GCs --->

[17066:0x359b00000000]   429018 ms: Scavenge 3589.3 (3616.1) -> 3589.4 (3616.3) MB, 5.9 / 0.0 ms  (average mu = 0.143, current mu = 0.000) allocation failure 
[17066:0x359b00000000]   429024 ms: Scavenge 3590.5 (3616.9) -> 3590.5 (3616.9) MB, 4.3 / 0.0 ms  (average mu = 0.143, current mu = 0.000) allocation failure 
[17066:0x359b00000000]   429029 ms: Scavenge 3590.5 (3616.9) -> 3590.4 (3617.7) MB, 5.2 / 0.0 ms  (average mu = 0.143, current mu = 0.000) allocation failure 


<--- JS stacktrace --->

[0919/053546.664013:FATAL:memory.cc(38)] Out of memory. size=0
#0 0x55ebe72ef8a9 base::debug::CollectStackTrace()
#1 0x55ebe725e203 base::debug::StackTrace::StackTrace()
#2 0x55ebe726e925 logging::LogMessage::~LogMessage()
#3 0x55ebe7288ab9 base::internal::OnNoMemoryInternal()
#4 0x55ebe6eb95d0 (anonymous namespace)::OnNoMemory()
#5 0x55ebe6eb9359 blink::BlinkGCOutOfMemory()
#6 0x55ebe63e4b00 v8::Utils::ReportOOMFailure()
#7 0x55ebe63e4ac5 v8::internal::V8::FatalProcessOutOfMemory()
#8 0x55ebe654ab85 v8::internal::Heap::FatalProcessOutOfMemory()
#9 0x55ebe654c1dc v8::internal::Heap::RecomputeLimits()
#10 0x55ebe6548867 v8::internal::Heap::PerformGarbageCollection()
#11 0x55ebe6546128 v8::internal::Heap::CollectGarbage()
#12 0x55ebe6546641 v8::internal::Heap::CollectAllAvailableGarbage()
#13 0x55ebe655189e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath()
#14 0x55ebe6522f60 v8::internal::FactoryBase<>::NewFixedDoubleArray()
#15 0x55ebe6688146 v8::internal::(anonymous namespace)::ElementsAccessorBase<>::GrowCapacity()
#16 0x55ebe67fa80f v8::internal::Runtime_GrowArrayElements()
#17 0x55ebe6d20638 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit
Task trace:
#0 0x55ebe78e8e6d IPC::(anonymous namespace)::ChannelAssociatedGroupController::Accept()
IPC message handler context: 0xC5F65154

Received signal 6
#0 0x55ebe72ef8a9 base::debug::CollectStackTrace()
#1 0x55ebe725e203 base::debug::StackTrace::StackTrace()
#2 0x55ebe72ef445 base::debug::(anonymous namespace)::StackDumpSignalHandler()
#3 0x7ff56341f540 (/usr/lib/x86_64-linux-gnu/libpthread-2.30.so+0x1553f)
#4 0x7ff56270d3eb gsignal
#5 0x7ff5626ec899 abort
#6 0x55ebe72ee3a5 base::debug::BreakDebugger()
#7 0x55ebe726edc2 logging::LogMessage::~LogMessage()
#8 0x55ebe7288ab9 base::internal::OnNoMemoryInternal()
#9 0x55ebe6eb95d0 (anonymous namespace)::OnNoMemory()
#10 0x55ebe6eb9359 blink::BlinkGCOutOfMemory()
#11 0x55ebe63e4b00 v8::Utils::ReportOOMFailure()
#12 0x55ebe63e4ac5 v8::internal::V8::FatalProcessOutOfMemory()
#13 0x55ebe654ab85 v8::internal::Heap::FatalProcessOutOfMemory()
#14 0x55ebe654c1dc v8::internal::Heap::RecomputeLimits()
#15 0x55ebe6548867 v8::internal::Heap::PerformGarbageCollection()
#16 0x55ebe6546128 v8::internal::Heap::CollectGarbage()
#17 0x55ebe6546641 v8::internal::Heap::CollectAllAvailableGarbage()
#18 0x55ebe655189e v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath()
#19 0x55ebe6522f60 v8::internal::FactoryBase<>::NewFixedDoubleArray()
#20 0x55ebe6688146 v8::internal::(anonymous namespace)::ElementsAccessorBase<>::GrowCapacity()
#21 0x55ebe67fa80f v8::internal::Runtime_GrowArrayElements()
#22 0x55ebe6d20638 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit
  r8: 0000000000000000  r9: 00007fffa1a3fe10 r10: 0000000000000008 r11: 0000000000000246
 r12: 00007fffa1a410d8 r13: 00007fffa1a400b0 r14: 00007fffa1a410e0 r15: aaaaaaaaaaaaaaaa
  di: 0000000000000002  si: 00007fffa1a3fe10  bp: 00007fffa1a40060  bx: 00007ff5622bb100
  dx: 0000000000000000  ax: 0000000000000000  cx: 00007ff56270d3eb  sp: 00007fffa1a3fe10
  ip: 00007ff56270d3eb efl: 0000000000000246 cgf: 002b000000000033 erf: 0000000000000000
 trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
Calling _exit(1). Core file will not be generated.

@jonmmease
Collaborator

So it looks like we have an object/memory leak in the JavaScript portion of kaleido that shows up here because of the large figure size. This will probably take some time to track down, so here's a workaround you can use in the meantime:

import pickle
import tempfile
from time import time
import numpy as np
import os
import plotly.io as pio
scope = pio.kaleido.scope

with open('example_fig.obj', 'rb') as ff:
    figure = pickle.load(ff)

# tdir = tempfile.gettempdir()
tdir = './out4'
os.makedirs(tdir)


n_frames=60
zoom=2.5
z_offset=0.5

for i in range(n_frames):
    start = time()
    theta = (i * 6.28) / n_frames
    camera = dict(
        eye=dict(x=np.cos(theta) * zoom,
                 y=np.sin(theta) * zoom, z=z_offset)
    )
    figure.update_layout(scene_camera=camera)
    figure.write_image(tdir + f"/tgif{i}.png")
    
    # Shutdown kaleido subprocess to free memory, it will
    # be started again on next image export request
    scope._shutdown_kaleido()

    print(time()-start)

The scope._shutdown_kaleido() call shuts down the kaleido subprocess so that it is started fresh on each export. This adds a couple of seconds to each call to write_image, but there won't be any memory build-up.

Hope this holds you over until we can track this down and get a fix out. Thanks again for the report.

@jonmmease jonmmease added the bug something broken label Sep 19, 2020
@36000
Author

36000 commented Sep 23, 2020

Do you have any tips or tricks for speeding up this GIF creation process? I don't know the details, but it seems like each time I call write_image, under the hood, the same figure is opened again in Chromium. Is there an easy way to open the figure once, then rotate and take screenshots within the browser?

@jonmmease
Collaborator

I don't know of anything different to do right now.

It would take some benchmarking to home in on what is actually taking so long right now. I expect that a lot of the time is being taken up by JSON serialization (Python to C++, then C++ to JavaScript), given the size of the figures.
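
As a rough check, a sketch like the one below (just plotly's to_json on the same figure; note it only measures the Python-side serialization, and the output path is arbitrary) would show how much of each export is spent on serialization alone:

from time import time
import plotly.io as pio

# Time the Python -> JSON step by itself
start = time()
fig_json = pio.to_json(figure)
print("to_json:", time() - start, "s,", len(fig_json) / 1e6, "MB of JSON")

# Time a full export through the kaleido subprocess for comparison
start = time()
figure.write_image("/tmp/bench.png")
print("write_image:", time() - start, "s")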

Do these figures use large numpy arrays? We've talked about optimizing this serialization path in the past, but haven't gotten it off the ground yet.

All that said, I could see a place for adding support to kaleido for batching image export requests. One option would be to support inputting figures that define frames (https://plotly.com/python/animations/) and exporting these to a GIF. For your case of rotating a camera, I think the frames could each be pretty small, basically just defining the camera orientation.
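
Just to illustrate what such an input might look like (this is only a sketch of the proposed format, not something kaleido accepts today), the rotating-camera case could be expressed as a figure whose frames carry nothing but the camera:

import numpy as np
import plotly.graph_objects as go

n_frames, zoom, z_offset = 60, 2.5, 0.5

# Each frame only overrides the scene camera; the (large) traces are defined once
frames = [
    go.Frame(layout=dict(scene=dict(camera=dict(
        eye=dict(x=np.cos(2 * np.pi * i / n_frames) * zoom,
                 y=np.sin(2 * np.pi * i / n_frames) * zoom,
                 z=z_offset)))))
    for i in range(n_frames)
]
animated = go.Figure(data=figure.data, layout=figure.layout, frames=frames)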

@36000
Author

36000 commented Sep 23, 2020

My inputs to the scatter3d traces are long numpy arrays; I think those are what make the figures so big. I like that last idea of adding support for inputting figures that define frames.
