Skip to content

Space leak caused by stan plugin #4515

@TeofilC

Description

@TeofilC

Your environment

Which OS do you use? NixOS Linux x86_64

Which version of GHC do you use and how did you install it? 9.8.2 from haskell.nix

How is your project built (alternative: link to the project)? cabal-install

Which LSP client (editor/plugin) do you use? emacs+lsp-mode

Which version of HLS do you use and how did you install it?

Have you configured HLS in any way (especially: a hie.yaml file)? Yeah just:

cradle:
  cabal:

Steps to reproduce

Run HLS with -hi profiling with the stan plugin enabled. You will notice that that the heap profile shows THUNKs from stan, and that lots of other closures follow the shape of this thunk, suggesting a space leak.

If you run without stan, this goes away.

Debug information

Image

This is the biggest closure size, note that it mirrors the stan closure:

Image

Activity

added
type: bugSomething isn't right: doesn't work as intended, documentation is missing/outdated, etc..
on Mar 11, 2025
fendor

fendor commented on Mar 11, 2025

@fendor
Collaborator

@0rphee ^ Maybe interesting to you

fendor

fendor commented on Mar 11, 2025

@fendor
Collaborator

Thanks for the bug report!

Is this a memory leak in stan or hls-stan-plugin? That's in stan, right?

TeofilC

TeofilC commented on Mar 11, 2025

@TeofilC
Author

Is this a memory leak in stan or hls-stan-plugin? That's in stan, right?

Yeah it's likely in stan, or at least the thunks in question seem to be coming from there. I guess it's less of a problem when run as a batch program.

tomjaguarpaw

tomjaguarpaw commented on Mar 14, 2025

@tomjaguarpaw
Member

How big is this space leak? I see "85536.82". Is that an amount of memory? What are the units? It can't be MB, could it be kB? If so that's worth fixing. If it's bytes then it doesn't seem worth fixing! EDIT: Oh, it can't be bytes because it's fractional :)

@Bodigrim spotted some potential memory use inefficiencies in stan: kowainik/stan#585. @TeofilC would you be willing to try his patch to see if it eliminates the space leak. If so I'm happy to merge his patch.

However, I don't see that the stan code obviously uses too much memory. There is an unsafeInterleaveIO which is supposed to avoid getSubDirsRecursive materializing the whole directory tree: https://github.com/kowainik/stan/blob/dc0a3a550dee53d51a018dda730d3d3631024dd7/src/Stan/Cabal.hs#L108-L117

If that works (and it may well not) I can only imagine large memory performance arising if there is a directory directly containing many files, a directory directly containing many directories, or a path many directories deep. "Many" here would have to be tens or hundreds of thousands I think, so see a noticeable space leak.

(But all this is just from eyebaling, not running the code, so please run it!)

Thanks for spotting and reporting this @TeofilC. Thanks to @Bodigrim for digging in.

TeofilC

TeofilC commented on Mar 14, 2025

@TeofilC
Author

How big is this space leak? I see "85536.82". Is that an amount of memory? What are the units? It can't be MB, could it be kB? If so that's worth fixing. If it's bytes then it doesn't seem worth fixing! EDIT: Oh, it can't be bytes because it's fractional :)

It is "Integrated Size (MiB s)", which is a bit of a confusing unit! Btw these were generated by eventlog2html

Wow that was quick. Thanks for taking a look @Bodigrim!

Unfortunately I'm still seeing this thunk even with that change (though I think the amount has been reduced somewhat):

Image
Note here that at its peak there are 7 million of these Maybe FilePath objects on the heap (I get this information if I hover on the graph on the left, but that's hard to capture with a screenshot).

Whereas:

❯ find -type f | wc -l
60764

So there is a factor of about 100 more of these than files in my tree.

I'm a bit suspicious that somehow this stan code is being called for each module?
Though from glancing at it, it feels like something that you only need to call once a session or maybe per package?

That's just some speculation though. I don't have a good understanding of how any of this works.

Note as well that while the Maybe Filepaths take up about 5MB, which isn't a lot. They have the same shape as the most common closure which takes up 1.18GB of 3.6GB of live bytes.

tomjaguarpaw

tomjaguarpaw commented on Apr 5, 2025

@tomjaguarpaw
Member

Unfortunately I'm still seeing this thunk even with that change (though I think the amount has been reduced somewhat):

Could you clarify "somewhat"? In your latest message I see "291.91" but the corresponding number in your first screenshot does not appear. Can we put some numbers on how much this reduces the space leak?

(My guess is that the code to find the cabal files is not actually the source of the space leak, since it uses unsafeInterleaveIO achieve streaming behaviour.)

tomjaguarpaw

tomjaguarpaw commented on Apr 5, 2025

@tomjaguarpaw
Member

If you provide instructions regarding how to reproduce what you're seeing, including the memory profile, then I'll have a closer look.

tomjaguarpaw

tomjaguarpaw commented on Apr 5, 2025

@tomjaguarpaw
Member

Some investigation:

Now I need some domain specific help: what is rules? How often is it run? It will traverse your entire current working directory tree each time it is called, which seems bad if it is called many times.

EDIT: Since this is a Shake thing I guess the correct way of saying it is "how often does the body of the rule fire" rather than how often is it called. Anyway, I hope the point is clear.


Possible resolution: obtain the path of the .cabal file some other way and set stanArgsCabalFilePath accordingly. That will cause findCabalFiles to not be run. Does some HLS expert know if that is possible?

TeofilC

TeofilC commented on Apr 5, 2025

@TeofilC
Author

Thanks for taking a look @tomjaguarpaw. I wasn't super scientific when gathering this data. I just opened up multiple large libraries from the work codebase and then refreshed some modules a few times. Because of this, I didn't gather exact numbers because there was a certain level of variance. I can try to compare again on Monday if that's helpful.

tomjaguarpaw

tomjaguarpaw commented on Apr 5, 2025

@tomjaguarpaw
Member

Thanks. This seems like a bad bug in Stan and I'm keen to fix it but since I'm not really an HLS user I will be able to help if and only if I can get a reliable repro case.

14 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    component: hls-stan-plugintype: bugSomething isn't right: doesn't work as intended, documentation is missing/outdated, etc..

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Participants

      @TeofilC@tomjaguarpaw@Bodigrim@fendor@0rphee

      Issue actions

        Space leak caused by stan plugin · Issue #4515 · haskell/haskell-language-server