
Idea for irept memory usage reduction. #7960

Open
@thomasspriggs

Description

Each reference-counted tree_nodet of irept essentially contains:

  • Reference count
  • An id / data field
  • A key value look-up (named_sub)
  • A linear collection (sub)
  • Cached Hash

I put some statistics-gathering code into the destructor of the tree_node class, and in the few examples I looked at a reasonable proportion of the nodes were leaf nodes, where both the named_sub and the sub fields are empty. Despite these collections being empty, we still allocate the complete bookkeeping structure for them, which uses 48 bytes of memory, when only the 4 bytes of the id value actually need to be stored. The hash look-up for a leaf node doesn't benefit from caching either, as the hash is the id in this case.
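For illustration, the kind of destructor instrumentation described above could look roughly like the following. This is a minimal sketch, not the code I actually used: toy_nodet, node_statst and node_stats() are hypothetical names, the field types are simplified stand-ins for the real tree_nodet, and std::shared_ptr stands in for irept's own reference counting.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical counters; not part of the real code base.
struct node_statst
{
  std::size_t total = 0;  // every node destroyed so far
  std::size_t leaves = 0; // destroyed nodes with empty named_sub and sub
};

inline node_statst &node_stats()
{
  static node_statst stats;
  return stats;
}

// Hypothetical, simplified stand-in for tree_nodet.
struct toy_nodet
{
  unsigned id = 0;
  std::vector<std::pair<unsigned, std::shared_ptr<toy_nodet>>> named_sub;
  std::vector<std::shared_ptr<toy_nodet>> sub;
  std::size_t hash_code = 0;

  ~toy_nodet()
  {
    // Count each node once, at the point it is actually deallocated.
    ++node_stats().total;
    if(named_sub.empty() && sub.empty())
      ++node_stats().leaves;
  }
};
```

Dividing node_stats().leaves by node_stats().total at program exit would give the proportion of leaf nodes for a given run.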

On x64 the pointer to tree_nodet is 8 bytes in size, and the data structure is 8-byte aligned. The alignment means the least significant bit of these pointers is guaranteed to be 0. Therefore we could repurpose this bit, and the memory of the pointer, for the storage of leaf nodes. For a leaf node, instead of using this memory to hold a pointer, we can set the least significant bit to 1 and write the value of the id to the 4 highest bytes. Using the 4 highest bytes keeps the id 4-byte aligned and allows forming pointers/references to it without performing bit shifts. I think this can be achieved relatively transparently to code using irept, by allocating the full data structure only when constructing an irept with non-empty named_sub / sub, or when a writable reference to one of these fields is requested.
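A minimal sketch of the tagging scheme, assuming 64-bit pointers and 8-byte aligned nodes. The names tagged_wordt and full_nodet are hypothetical and the node is reduced to a placeholder. This is not a working irept replacement, only the encoding and decoding of the shared 8-byte word:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical placeholder for the full node; only its alignment matters here.
struct alignas(8) full_nodet
{
  std::uint32_t ref_count = 1;
  std::uint32_t id = 0;
};

// One 64-bit word holding either a pointer to a full node (bit 0 clear)
// or an inline leaf id in the 4 highest bytes (bit 0 set).
class tagged_wordt
{
public:
  static tagged_wordt from_leaf_id(std::uint32_t id)
  {
    return tagged_wordt((static_cast<std::uint64_t>(id) << 32) | leaf_tag);
  }

  static tagged_wordt from_node(full_nodet *node)
  {
    const auto bits = reinterpret_cast<std::uintptr_t>(node);
    assert((bits & leaf_tag) == 0); // holds because nodes are 8-byte aligned
    return tagged_wordt(static_cast<std::uint64_t>(bits));
  }

  bool is_leaf() const
  {
    return (word & leaf_tag) != 0;
  }

  std::uint32_t leaf_id() const
  {
    assert(is_leaf());
    return static_cast<std::uint32_t>(word >> 32);
  }

  full_nodet *node() const
  {
    assert(!is_leaf());
    return reinterpret_cast<full_nodet *>(static_cast<std::uintptr_t>(word));
  }

private:
  explicit tagged_wordt(std::uint64_t w) : word(w)
  {
  }

  static constexpr std::uint64_t leaf_tag = 1;
  std::uint64_t word;
};

static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");
static_assert(sizeof(tagged_wordt) == 8, "same footprint as one pointer");
```

The sketch reads the id back with a shift to stay independent of byte order. Forming a reference directly to the upper 4 bytes, as described above, would additionally depend on the platform's memory layout and is not shown here.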

I think the scheme outlined above, dual-purposing the pointer memory for either a pointer or an id value, would save the allocation and de-allocation of the entire 48 byte tree_node data structure for nearly all leaf nodes in memory. I have not yet had the time to implement and debug a portable version of this idea, so I don't yet know what its performance characteristics would look like. However, I wanted to share the details of this idea for future reference.
