Repo Map Accuracy

The repository map is currently built as follows:

- Tree-sitter extracts a list of definition and reference identifiers per file
- These identifiers are grouped into two larger maps: identifier -> definer files and identifier -> referencing files
- A graph is computed from these maps (this step introduced duplication, as seen [here](https://github.com/Aider-AI/aider/issues/2341))
- The weight multipliers applied to this graph have fairly naive static values (based on the number of references and the number of definers)
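
The steps above can be sketched roughly as follows. This is a minimal illustration and not the actual implementation: the `build_edges` function, the `(file, identifier, kind)` tag tuples, and the choice of using the raw reference count as the edge weight are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_edges(tags):
    """Build weighted edges from tags (hypothetical helper, not the real code).

    tags: iterable of (file, identifier, kind) with kind in {"def", "ref"}.
    Returns a dict mapping (referencing_file, defining_file, identifier)
    to a static weight.
    """
    definers = defaultdict(set)     # identifier -> files that define it
    references = defaultdict(list)  # identifier -> files that reference it

    for file, ident, kind in tags:
        if kind == "def":
            definers[ident].add(file)
        elif kind == "ref":
            references[ident].append(file)

    edges = {}
    # Only identifiers that are both defined and referenced produce edges.
    for ident in definers.keys() & references.keys():
        ref_counts = Counter(references[ident])
        for referencer, num_refs in ref_counts.items():
            for definer in definers[ident]:
                # Assumed static weighting: the raw reference count.
                edges[(referencer, definer, ident)] = float(num_refs)
    return edges


tags = [
    ("a.py", "foo", "def"),
    ("b.py", "foo", "ref"),
    ("b.py", "foo", "ref"),
    ("c.py", "foo", "ref"),
    ("c.py", "bar", "ref"),  # never defined, so it yields no edge
]
edges = build_edges(tags)
```

Keying each edge on `(referencer, definer, identifier)` keeps one entry per combination, which is one possible way to avoid counting the same relationship twice.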

The gist is that replacing the static multipliers with equations should give a more dynamic proxy for "relevancy" during repo map construction, which in turn should give the LLM at least marginally better context to work from. My hypothesis is that logarithmic equations based on the overall number of definers (to detect rote/boilerplate code) and on the number of references (both the count of unique referencing files and the absolute reference count) will act as a better check and balance on how relevant or central identifiers and files are within a codebase.
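
One possible shape for such an equation is sketched below. It only illustrates the hypothesis: the function name `relevance_multiplier`, the specific formula, and the `0.5` damping constant are assumptions chosen for the example, not tuned or proposed values.

```python
import math


def relevance_multiplier(num_definers, num_ref_files, num_refs):
    """Hypothetical logarithmic multiplier for one identifier.

    num_definers:  number of files that define the identifier
    num_ref_files: number of unique files that reference it
    num_refs:      absolute number of references
    """
    # Many definers suggests rote/boilerplate code, so penalize it,
    # but only logarithmically.
    penalty = 1.0 + math.log(max(1, num_definers))

    # Reward spread across files more than raw reference volume.
    # The 0.5 factor is an arbitrary illustrative constant.
    bonus = math.log1p(num_ref_files) + 0.5 * math.log1p(num_refs)

    return (1.0 + bonus) / penalty


unique_and_central = relevance_multiplier(1, 5, 10)
boilerplate = relevance_multiplier(20, 5, 10)
```

Under these assumptions, an identifier defined once and referenced from several files scores higher than one with the same reference counts that is defined in twenty files, and both effects flatten out as the counts grow.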

Repo Map Accuracy #45