fix: High memory usage and potential race conditions when reading large JSON files #5331

@KJ7LNW

Description

Our current approach to reading JSON files uses far more memory than it needs to. Many places in the codebase call fs.readFile followed by JSON.parse, which:

  1. Buffers the entire file content into memory as a string
  2. Then builds a second in-memory representation when parsing that string into a JavaScript object
  3. Can race with concurrent writes to the same file, so a reader may see a partially written file
  4. Causes unnecessary garbage collection pressure because the memory footprint is roughly doubled
  5. Lacks proper error handling in many places
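
For reference, the pattern described in the list above looks roughly like the sketch below. The helper name readJsonBuffered is hypothetical and not taken from the codebase; the comments mark where each in-memory copy appears and where the race can surface.

```typescript
import { readFile } from "node:fs/promises";

// Hypothetical helper that illustrates the read-then-parse pattern.
export async function readJsonBuffered<T = unknown>(filePath: string): Promise<T> {
  // Copy 1: the whole file is held in memory as a single string.
  const text = await readFile(filePath, "utf8");

  // Copy 2: JSON.parse builds the full object graph while `text` is still alive.
  // No lock is held, so if a writer was mid-write, `text` may be truncated and
  // JSON.parse throws a SyntaxError that the caller has to handle.
  return JSON.parse(text) as T;
}
```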

Examples of large files affected

17M May  3 20:43 tasks/55f5ddc5-d6e5-4aa0-ad28-56d0e010ddf1/ui_messages.json
17M May  3 20:25 tasks/65324090-c9d4-4509-83c5-1932df106926/ui_messages.json
17M May  3 20:31 tasks/96276917-39ec-435f-bd8d-dedaba14d1d6/ui_messages.json
17M May  3 20:23 tasks/a98cfe47-6a3e-4922-aa14-3d241f52b674/ui_messages.json
17M May  3 20:41 tasks/99993cb3-6bdd-4d51-bd03-adeb233395d6/ui_messages.json
17M May  3 18:11 tasks/37653e9e-8481-4873-b053-4ab2a7907dec/ui_messages.json
17M May  3 20:28 tasks/74b42012-267d-4c4e-b948-2bd0e9207977/ui_messages.json
17M May  3 16:21 tasks/14b4426e-edec-4174-a3a3-c5acaba408bf/ui_messages.json
17M May  3 17:21 tasks/bb655ebc-0665-40ec-839f-fab9de6f3206/ui_messages.json
20M Jun 26 13:07 tasks/162dffeb-515f-4b90-b9b5-c6092834efa6/ui_messages.json

For a 20MB file, the current approach needs roughly 40MB at peak: about 20MB for the string and a comparable amount for the parsed object (the exact figure depends on the shape of the data and on how the engine stores strings). This is inefficient and may lead to out-of-memory errors when several large files are read at the same time.

Proposed solution

We should implement a safeReadJson utility function that:

  1. Streams the file content using a library like stream-json to parse JSON without buffering the entire file
  2. Provides proper locking to prevent race conditions with concurrent writes
  3. Has consistent error handling
  4. Supports extracting specific paths from JSON without loading the entire object
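
One way to picture points 1 and 4 is the hand-rolled sketch below. It is not stream-json and not the proposed implementation: it uses only Node built-ins, and it assumes the file holds a top-level JSON array (an assumption about files like ui_messages.json that this issue does not state). The names ArrayElementSplitter and streamJsonArray are hypothetical. In a real safeReadJson, a streaming library would take the place of the splitter.

```typescript
import { createReadStream } from "node:fs";

// Hypothetical helper: splits the text of a top-level JSON array into its
// elements, one chunk at a time, and parses each element on its own.
export class ArrayElementSplitter {
  private depth = 0; // nesting depth of [] and {} outside strings
  private inString = false; // currently inside a JSON string literal
  private escaped = false; // previous character was a backslash in a string
  private current = ""; // raw text of the element being accumulated

  // Feed one chunk of text; returns the elements completed by this chunk.
  push(chunk: string): unknown[] {
    const done: unknown[] = [];
    const flush = (): void => {
      const text = this.current.trim();
      this.current = "";
      if (text !== "") done.push(JSON.parse(text));
    };
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      if (this.inString) {
        this.current += ch;
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
      } else if (ch === '"') {
        this.inString = true;
        this.current += ch;
      } else if (ch === "[" || ch === "{") {
        this.depth++;
        if (this.depth > 1) this.current += ch; // skip the outer "["
      } else if (ch === "]" || ch === "}") {
        this.depth--;
        if (this.depth >= 1) this.current += ch;
        else flush(); // outer "]" closes the last element
      } else if (ch === "," && this.depth === 1) {
        flush(); // comma between top-level elements
      } else if (this.depth >= 1) {
        this.current += ch;
      }
    }
    return done;
  }
}

// Yields array elements one by one without loading the whole file.
export async function* streamJsonArray(filePath: string): AsyncGenerator<unknown> {
  const splitter = new ArrayElementSplitter();
  // With an encoding set, the stream decodes UTF-8 across chunk boundaries.
  const stream = createReadStream(filePath, { encoding: "utf8" });
  for await (const chunk of stream) {
    yield* splitter.push(chunk as string);
  }
}
```

A caller can iterate with `for await (const message of streamJsonArray(path))` and keep only the fields it needs, so only the current chunk and the elements completed within it are held in memory at once.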

This would be similar to our existing safeWriteJson utility but optimized for reading operations.

Once implemented, we should replace all instances of fs.readFile + JSON.parse with safeReadJson throughout the codebase to reduce memory usage and prevent potential race conditions.
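
As a rough idea of what a drop-in replacement could look like at a call site, here is a minimal sketch covering only the locking and error-handling points. It still buffers the file; a streaming parser would replace the readFile and JSON.parse calls. Everything here is an assumption rather than existing code: the JsonReadError class, the withFileLock helper, and the convention of a sibling ".lock" file, which only helps if writers take the same lock. This issue does not describe how safeWriteJson locks, and stale locks left by a crashed process are not handled.

```typescript
import { open, readFile, unlink } from "node:fs/promises";

type JsonReadErrorKind = "missing" | "invalid" | "locked" | "io";

// Hypothetical error type so every caller sees the same failure shape.
export class JsonReadError extends Error {
  constructor(
    readonly filePath: string,
    readonly kind: JsonReadErrorKind,
    readonly reason: unknown,
  ) {
    super(`Failed to read JSON from ${filePath} (${kind})`);
    this.name = "JsonReadError";
  }
}

// Hypothetical advisory lock: creates "<file>.lock" exclusively, runs `work`,
// then removes the lock file again.
export async function withFileLock<T>(
  filePath: string,
  work: () => Promise<T>,
  retries = 50,
  delayMs = 20,
): Promise<T> {
  const lockPath = `${filePath}.lock`;
  for (let attempt = 0; ; attempt++) {
    try {
      // "wx" fails with EEXIST if the lock file already exists.
      const handle = await open(lockPath, "wx");
      await handle.close();
      break;
    } catch (err) {
      const code = (err as { code?: string }).code;
      if (code !== "EEXIST") {
        throw new JsonReadError(filePath, code === "ENOENT" ? "missing" : "io", err);
      }
      if (attempt >= retries) throw new JsonReadError(filePath, "locked", err);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  try {
    return await work();
  } finally {
    await unlink(lockPath).catch(() => undefined);
  }
}

// Sketch of the reader: same call shape as readFile + JSON.parse, but with a
// lock held during the read and one error type for every failure.
export async function safeReadJson<T = unknown>(filePath: string): Promise<T> {
  return withFileLock(filePath, async () => {
    let text: string;
    try {
      text = await readFile(filePath, "utf8");
    } catch (err) {
      const code = (err as { code?: string }).code;
      throw new JsonReadError(filePath, code === "ENOENT" ? "missing" : "io", err);
    }
    try {
      return JSON.parse(text) as T;
    } catch (err) {
      throw new JsonReadError(filePath, "invalid", err);
    }
  });
}
```

With something like this in place, a call site that did `JSON.parse(await fs.readFile(p, "utf8"))` would become `await safeReadJson(p)` and could branch on `kind` instead of matching on error messages.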

Metadata

Assignees: No one assigned
Labels: Issue - Needs Scoping (Valid, but needs effort estimate or design input before work can start.); bug (Something isn't working)
Status: Done