Caching mechanism for MLGraph #807

@anssiko

[ Spun off from issue #780 ]

@reillyeon mentioned to me the idea of introducing a caching mechanism for MLGraph (e.g. save it for later use and avoid repeated graph compilation). Such a mechanism might help here.

I'd like to discuss this proposal from @reillyeon a bit more. I believe the group's current working assumption is that graph compilation could take a long time and caching would improve the user experience on subsequent visits to the same site (cross-origin is a harder, separate problem). Depending on the size of the model, underlying implementation and other factors, this could be a significant performance and UX improvement.
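To make the brainstorm concrete, here is a minimal sketch of the idea in TypeScript. It is not part of the WebNN API and not anything @reillyeon proposed in detail: `CompiledGraph`, `CompiledGraphCache`, and the `compile` callback are hypothetical names, and hashing the serialized model bytes together with the origin is only one assumed way to key the cache. It also keeps entries in memory only, so it shows the lookup logic rather than persistence across visits.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for a compiled MLGraph; NOT a WebNN type.
interface CompiledGraph {
  readonly id: string;
}

// Hypothetical, caller-supplied compile step (assumed to be the expensive part).
type Compile = (modelBytes: Uint8Array) => Promise<CompiledGraph>;

class CompiledGraphCache {
  private readonly entries = new Map<string, Promise<CompiledGraph>>();
  private readonly origin: string;
  private readonly compile: Compile;
  // Number of times the expensive compile step actually ran.
  compileCount = 0;

  constructor(origin: string, compile: Compile) {
    this.origin = origin;
    this.compile = compile;
  }

  // Assumption: key on origin + model content, so entries are never
  // shared across origins (cross-origin is the harder, separate problem).
  private key(modelBytes: Uint8Array): string {
    return createHash("sha256")
      .update(this.origin)
      .update("\0")
      .update(modelBytes)
      .digest("hex");
  }

  // Returns the cached compilation if present, otherwise compiles once.
  getOrCompile(modelBytes: Uint8Array): Promise<CompiledGraph> {
    const k = this.key(modelBytes);
    let entry = this.entries.get(k);
    if (entry === undefined) {
      this.compileCount += 1;
      entry = this.compile(modelBytes);
      this.entries.set(k, entry);
      // Drop failed compilations so a later call can retry.
      entry.catch(() => {
        this.entries.delete(k);
      });
    }
    return entry;
  }
}
```

Storing the promise rather than the resolved graph means concurrent requests for the same model share a single compilation. Whether a real mechanism would be exposed to script at all, or stay an implementation detail, is exactly the open question below.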

@reillyeon have you thought about this more since you came up with the idea? Known implementation blockers?

@bbernhar what could we learn from the WebGPU compilation caches for shaders and pipelines? I see some toggles in the Dawn code to control caching, and you've done work in this space, e.g. in https://issues.chromium.org/issues/41479574, suggesting you might have insights to share.

Also paging @huningxin @fdwr and @RafaelCintron for thoughts. Interested in all insights in the spirit of brainstorming.

A few additional questions:

Do we foresee the caching of compiled graphs to be purely an implementation detail?

Privacy impact? We already discuss caching-related timing attack vectors in the privacy considerations and reference the WebGPU compilation cache considerations. Depending on which way we go, we might want to revise these considerations.
