This repository was archived by the owner on Dec 19, 2022. It is now read-only.

Cocalc project restarts. How to investigate? #21

Open
@Debilski

We are experimenting with CoCalc (a slightly slimmed image with fewer kernels and increased memory defaults) for remote teaching and pair programming. (It works pretty well!) I am currently seeing three different types of crashes and would like a hint on how to find out why each crash occurred, how I can fix it, and where to see the relevant logs.

  1. Python kernel crashes. This seems to occur when I allocate too much memory, for example in a NumPy array. The affected cell gets a red tag with the kernel killed message. All understandable, I can live with that. (Although I wouldn’t mind seeing this somewhere in the project admin or server admin logs.)

  2. The project pod sometimes gets killed. All I see is a Killed event in `kubectl get events`. It doesn’t happen very often, so it is not too bad, but I’d still like to get an idea why.

  3. The project restarts without notice. Sometimes this happens every 10 minutes while people are working on a project, so it doesn’t seem to be an idle timeout. (I figured it’s not the worst thing that can happen for teaching, as it clears all hidden variables and gives the student a clean state. ;) ) This is the nastiest problem, as the reason is very unclear to me and I wouldn’t know where to look (or which limit to increase).
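
For cases 2 and 3, one thing that could be checked is whether Kubernetes recorded a termination reason for the project container. Below is a minimal sketch, assuming the pod still exists and its JSON was obtained with `kubectl get pod <pod-name> -o json`. The function name `terminated_containers`, the container name `project`, and the sample data are made up for illustration; only the `status.containerStatuses[].lastState.terminated` fields come from the Kubernetes pod status. If the pod is deleted and recreated rather than its container restarted, this field will probably be empty and the sketch will not help.

```python
import json


def terminated_containers(pod):
    """Return (name, reason, exit code, restart count) for every container
    in the pod that has a recorded previous termination."""
    results = []
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for status in statuses:
        terminated = status.get("lastState", {}).get("terminated")
        if terminated:
            results.append(
                (
                    status["name"],
                    terminated.get("reason"),
                    terminated.get("exitCode"),
                    status.get("restartCount", 0),
                )
            )
    return results


# Hypothetical sample, shaped like the output of `kubectl get pod <pod-name> -o json`.
sample_pod = json.loads(
    """
    {
      "status": {
        "containerStatuses": [
          {
            "name": "project",
            "restartCount": 3,
            "lastState": {
              "terminated": {"reason": "OOMKilled", "exitCode": 137}
            }
          }
        ]
      }
    }
    """
)

for name, reason, exit_code, restarts in terminated_containers(sample_pod):
    print(f"{name}: reason={reason} exit_code={exit_code} restarts={restarts}")
```

A reason of `OOMKilled` (usually with exit code 137) would point to the container's memory limit; any other reason would need a different line of investigation.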

Any hints?
