forked from Aider-AI/aider
Description
The repository map works as follows:
- Tree-sitter extracts a list of definition and reference identifiers per file
- These identifiers are grouped into larger maps of identifier -> definer files and identifier -> reference files
- A graph is computed from these maps (which introduced duplication, as seen here)
- The weight multipliers for this graph have fairly naive static values (number of references, number of definers)
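
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Tag` shape, the function names `build_maps` and `build_edges`, and the single `static_mul` parameter are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Tag(NamedTuple):
    """One extracted identifier (hypothetical shape, for illustration only)."""
    file: str
    name: str
    kind: str  # "def" or "ref"


def build_maps(tags):
    """Group tags into identifier -> definer files and identifier -> reference files."""
    defines = defaultdict(set)      # identifier -> set of files that define it
    references = defaultdict(list)  # identifier -> referencing files, repeats kept
    for tag in tags:
        if tag.kind == "def":
            defines[tag.name].add(tag.file)
        elif tag.kind == "ref":
            references[tag.name].append(tag.file)
    return defines, references


def build_edges(defines, references, static_mul=1.0):
    """Map (referencer, definer, identifier) -> weight using a static multiplier."""
    edges = {}
    for ident, definers in defines.items():
        ref_counts = Counter(references.get(ident, []))
        for referencer, num_refs in ref_counts.items():
            for definer in definers:
                # Assumed weighting: static multiplier times the raw reference count.
                edges[(referencer, definer, ident)] = static_mul * num_refs
    return edges


tags = [
    Tag("a.py", "parse", "def"),
    Tag("b.py", "parse", "ref"),
    Tag("b.py", "parse", "ref"),
    Tag("c.py", "parse", "ref"),
]
defines, references = build_maps(tags)
edges = build_edges(defines, references)
```

Note that an identifier defined in several files produces one edge per definer for every referencer, which is one way duplication can creep into the graph.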
The gist is that computing the multipliers from equations, rather than from static values, should give a more dynamic proxy for "relevancy" in the repo map construction, which should give the LLM at least marginally better context to work from. My hypothesis is that logarithmic equations based on the overall number of definers (to detect rote/boilerplate code) and on the number of references (both unique referencing files and absolute reference count) will act as a better check and balance on how relevant/central identifiers and files are within a codebase.
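
One possible shape for such a multiplier is sketched below. The function name `relevance_multiplier`, its parameters, and the specific formula are assumptions chosen only to illustrate the idea of logarithmic scaling; they are not taken from the repository and have not been tuned.

```python
import math


def relevance_multiplier(num_definers, num_ref_files, num_refs):
    """Hypothetical log-based multiplier for one identifier.

    num_definers:  files that define the identifier (must be >= 1)
    num_ref_files: unique files that reference it
    num_refs:      absolute number of references
    """
    if num_definers < 1:
        raise ValueError("an identifier needs at least one definer")
    # Many definers suggests rote/boilerplate code, so dampen the score.
    boilerplate_penalty = 1.0 / (1.0 + math.log(num_definers))
    # Breadth (unique files) and volume (raw count) both grow logarithmically,
    # so a handful of heavily repeated references cannot dominate.
    breadth = math.log1p(num_ref_files)
    volume = math.log1p(num_refs)
    return boilerplate_penalty * (breadth + volume)
```

With this form, an identifier defined once and referenced from many files scores higher than one defined in many files with the same reference counts, and each additional reference adds less than the one before it.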