Add support for project instruction file #46

Merged
merged 1 commit into from
Sep 7, 2024
59 changes: 56 additions & 3 deletions README.md
@@ -137,8 +137,10 @@ To enhance AI comprehension, the output file begins with an AI-oriented explanat
#### Plain Text Format (default)

```text
This file is a merged representation of the entire codebase, combining all repository files into a single document.

================================================================
REPOPACK OUTPUT FILE
File Summary
================================================================
(Metadata and usage AI instructions)

Expand Down Expand Up @@ -169,6 +171,11 @@ File: src/utils.js
// File contents here

(...remaining files)

================================================================
Instruction
================================================================
(Custom instructions from `output.instructionFilePath`)
```

#### XML Format
@@ -181,9 +188,11 @@ repopack --style xml
The XML format structures the content in a hierarchical manner:

```xml
<summary>
This file is a merged representation of the entire codebase, combining all repository files into a single document.

<file_summary>
(Metadata and usage AI instructions)
</summary>
</file_summary>

<repository_structure>
src/
@@ -201,6 +210,10 @@ src/

(...remaining files)
</repository_files>

<instruction>
(Custom instructions from `output.instructionFilePath`)
</instruction>
```

For those interested in the potential of XML tags in AI contexts:
Expand Down Expand Up @@ -279,6 +292,7 @@ Here's an explanation of the configuration options:
|`output.filePath`| The name of the output file | `"repopack-output.txt"` |
|`output.style`| The style of the output (`plain`, `xml`) |`"plain"`|
|`output.headerText`| Custom text to include in the file header |`null`|
|`output.instructionFilePath`| Path to a file containing detailed custom instructions |`null`|
|`output.removeComments`| Whether to remove comments from supported file types | `false` |
|`output.removeEmptyLines`| Whether to remove empty lines from the output | `false` |
|`output.showLineNumbers`| Whether to add line numbers to each line in the output |`false`|
Expand Down Expand Up @@ -349,6 +363,45 @@ This approach allows for flexible file exclusion configuration based on your pro

Note: Binary files are not included in the packed output by default, but their paths are listed in the "Repository Structure" section of the output file. This provides a complete overview of the repository structure while keeping the packed file efficient and text-based.

### Custom Instruction

The `output.instructionFilePath` option lets you specify a separate file containing detailed instructions or context about your project. AI systems can then take your project's specific context and requirements into account, potentially leading to more relevant and tailored analysis or suggestions.

Here's an example of how you might use this feature:

1. Create a file named `repopack-instruction.md` in your project root:

```markdown
# Coding Guidelines
- Follow the Airbnb JavaScript Style Guide
- Suggest splitting files into smaller, focused units when appropriate
- Add comments for non-obvious logic. Keep all text in English
- All new features should have corresponding unit tests

# Generate Comprehensive Output
- Include all content without abbreviation, unless specified otherwise
- Optimize for handling large codebases while maintaining output quality
```

2. In your `repopack.config.json`, add the `instructionFilePath` option:

```json5
{
  "output": {
    "instructionFilePath": "repopack-instruction.md",
    // other options...
  }
}
```

When Repopack generates the output, it will include the contents of `repopack-instruction.md` in a dedicated section.
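
To make the "dedicated section" concrete, here is a minimal sketch of how an instruction block could be appended to an already generated body in either style. It is not Repopack's implementation: the function name `appendInstructionSection`, the `OutputStyle` type, and the 64-character separator width are assumptions chosen to match the README samples above.

```typescript
// Hypothetical helper, not part of Repopack: appends the instruction text
// as the final section of an already generated output body.
type OutputStyle = 'plain' | 'xml';

// Assumed separator width; the README sample shows a row of '=' characters.
const PLAIN_SEPARATOR = '='.repeat(64);

const appendInstructionSection = (body: string, instruction: string, style: OutputStyle): string => {
  const trimmed = instruction.trim();
  // With no instruction text, the body is returned unchanged.
  if (trimmed === '') {
    return body;
  }

  const section =
    style === 'xml'
      ? `<instruction>\n${trimmed}\n</instruction>`
      : `${PLAIN_SEPARATOR}\nInstruction\n${PLAIN_SEPARATOR}\n${trimmed}`;

  // The section always goes last, separated from the body by one blank line.
  return `${body.trimEnd()}\n\n${section}\n`;
};
```

Skipping the section entirely for an empty instruction keeps the output identical for projects that do not set `instructionFilePath`.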

Note: The instruction content is appended at the end of the output file. This placement can be particularly effective for AI systems. Anthropic's documentation offers some insight into why:
https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips

> Put longform data at the top: Place your long documents and inputs (~20K+ tokens) near the top of your prompt, above your query, instructions, and examples. This can significantly improve Claude's performance across all models.
> Queries at the end can improve response quality by up to 30% in tests, especially with complex, multi-document inputs.
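
The ordering described in the quote can be sketched as a small prompt builder: long documents first, then instructions, then the query. The `PromptParts` shape and the `buildLongContextPrompt` name are hypothetical and only illustrate the ordering. They are not part of Repopack or of any Anthropic API.

```typescript
// Hypothetical prompt builder illustrating "long-form data first, query last".
interface PromptParts {
  documents: string[]; // long inputs, e.g. a packed repository file
  instructions: string; // project-specific guidance
  query: string; // the actual question, placed last
}

const buildLongContextPrompt = ({ documents, instructions, query }: PromptParts): string =>
  [...documents, instructions, query]
    .map((part) => part.trim())
    .filter((part) => part !== '') // drop empty parts so no stray blank sections appear
    .join('\n\n');
```

For example, passing the packed output as the only document and the contents of `repopack-instruction.md` as `instructions` yields a prompt whose last paragraph is the query.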

### Comment Removal

When `output.removeComments` is set to `true`, Repopack will attempt to remove comments from supported file types. This feature can help reduce the size of the output file and focus on the essential code content.
65 changes: 65 additions & 0 deletions package-lock.json

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions package.json
@@ -55,6 +55,7 @@
"cli-spinners": "^2.9.2",
"commander": "^11.1.0",
"globby": "^14.0.2",
"handlebars": "^4.7.8",
"iconv-lite": "^0.6.3",
"istextorbinary": "^9.5.0",
"jschardet": "^3.1.3",
9 changes: 9 additions & 0 deletions repopack-instruction.md
@@ -0,0 +1,9 @@
# Coding Guidelines
- Follow the Airbnb JavaScript Style Guide
- Suggest splitting files into smaller, focused units when appropriate
- Add comments for non-obvious logic. Keep all text in English
- All new features should have corresponding unit tests

# Generate Comprehensive Output
- Include all content without abbreviation, unless specified otherwise
- Optimize for handling large codebases while maintaining output quality
1 change: 1 addition & 0 deletions repopack.config.json
@@ -3,6 +3,7 @@
"filePath": "repopack-output.xml",
"style": "xml",
"headerText": "This repository contains the source code for the Repopack tool.\nRepopack is designed to pack repository contents into a single file,\nmaking it easier for AI systems to analyze and process the codebase.\n\nKey Features:\n- Configurable ignore patterns\n- Custom header text support\n- Efficient file processing and packing\n\nPlease refer to the README.md file for more detailed information on usage and configuration.\n",
"instructionFilePath": "repopack-instruction.md",
"removeComments": false,
"removeEmptyLines": false,
"topFilesLength": 5,
2 changes: 2 additions & 0 deletions src/config/configTypes.ts
@@ -5,6 +5,7 @@ interface RepopackConfigBase {
filePath?: string;
style?: RepopackOutputStyle;
headerText?: string;
instructionFilePath?: string;
removeComments?: boolean;
removeEmptyLines?: boolean;
topFilesLength?: number;
@@ -23,6 +24,7 @@ export type RepopackConfigDefault = RepopackConfigBase & {
filePath: string;
style: RepopackOutputStyle;
headerText?: string;
instructionFilePath?: string;
removeComments: boolean;
removeEmptyLines: boolean;
topFilesLength: number;
35 changes: 27 additions & 8 deletions src/core/output/outputGenerator.ts
@@ -1,16 +1,20 @@
import fs from 'node:fs/promises';
import path from 'node:path';
import type { RepopackConfigMerged } from '../../config/configTypes.js';
import { RepopackError } from '../../shared/errorHandler.js';
import { generateTreeString } from '../file/fileTreeGenerator.js';
import type { ProcessedFile } from '../file/fileTypes.js';
import type { OutputGeneratorContext } from './outputGeneratorTypes.js';
import { generatePlainStyle } from './plainStyleGenerator.js';
import { generateXmlStyle } from './xmlStyleGenerator.js';

export const generateOutput = async (
rootDir: string,
config: RepopackConfigMerged,
processedFiles: ProcessedFile[],
allFilePaths: string[],
): Promise<string> => {
const outputGeneratorContext = buildOutputGeneratorContext(config, allFilePaths, processedFiles);
const outputGeneratorContext = await buildOutputGeneratorContext(rootDir, config, allFilePaths, processedFiles);

let output: string;
switch (config.output.style) {
@@ -24,13 +28,28 @@ export const generateOutput = async (
return output;
};

export const buildOutputGeneratorContext = (
export const buildOutputGeneratorContext = async (
rootDir: string,
config: RepopackConfigMerged,
allFilePaths: string[],
processedFiles: ProcessedFile[],
): OutputGeneratorContext => ({
generationDate: new Date().toISOString(),
treeString: generateTreeString(allFilePaths),
processedFiles,
config,
});
): Promise<OutputGeneratorContext> => {
let repositoryInstruction = '';

if (config.output.instructionFilePath) {
const instructionPath = path.resolve(rootDir, config.output.instructionFilePath);
try {
repositoryInstruction = await fs.readFile(instructionPath, 'utf-8');
} catch {
throw new RepopackError(`Instruction file not found at ${instructionPath}`);
}
}

return {
generationDate: new Date().toISOString(),
treeString: generateTreeString(allFilePaths),
processedFiles,
config,
instruction: repositoryInstruction,
};
};
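
The `catch` block in the diff above reports every read failure as "not found", including cases such as a permission error. A possible refinement, sketched below, checks for Node's `ENOENT` error code before choosing the message. The helper names `isFileNotFound`, `describeReadFailure`, and `readInstruction` are hypothetical, and a plain `Error` stands in for `RepopackError` so the snippet is self-contained.

```typescript
import { readFile } from 'node:fs/promises';
import { resolve } from 'node:path';

// True when the error carries Node's ENOENT code (the path does not exist).
const isFileNotFound = (error: unknown): boolean =>
  typeof error === 'object' && error !== null && (error as { code?: unknown }).code === 'ENOENT';

// Builds a message that separates "missing file" from other read failures.
const describeReadFailure = (instructionPath: string, error: unknown): string =>
  isFileNotFound(error)
    ? `Instruction file not found at ${instructionPath}`
    : `Failed to read instruction file at ${instructionPath}: ${
        error instanceof Error ? error.message : String(error)
      }`;

// Hypothetical stand-in for the file-reading step of buildOutputGeneratorContext.
const readInstruction = async (rootDir: string, instructionFilePath?: string): Promise<string> => {
  // No path configured: the instruction is simply empty.
  if (!instructionFilePath) {
    return '';
  }

  const instructionPath = resolve(rootDir, instructionFilePath);
  try {
    return await readFile(instructionPath, 'utf-8');
  } catch (error) {
    // Assumption: the real code would throw RepopackError here instead.
    throw new Error(describeReadFailure(instructionPath, error));
  }
};
```

Keeping the message logic in a pure function makes it easy to unit test without touching the file system.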
1 change: 1 addition & 0 deletions src/core/output/outputGeneratorTypes.ts
@@ -6,4 +6,5 @@ export interface OutputGeneratorContext {
treeString: string;
processedFiles: ProcessedFile[];
config: RepopackConfigMerged;
instruction: string;
}
56 changes: 56 additions & 0 deletions src/core/output/outputStyleDecorator.ts
@@ -0,0 +1,56 @@
import type { RepopackConfigMerged } from '../../config/configTypes.js';

export const generateHeader = (generationDate: string): string => {
return `
This file is a merged representation of the entire codebase, combining all repository files into a single document.
Generated by Repopack on: ${generationDate}
`.trim();
};

export const generateSummaryPurpose = (): string => {
return `
This file contains a packed representation of the entire repository's contents.
It is designed to be easily consumable by AI systems for analysis, code review,
or other automated processes.
`.trim();
};

export const generateSummaryFileFormat = (): string => {
return `
The content is organized as follows:
1. This summary section
2. Repository information
3. Repository structure
`.trim();
};

export const generateSummaryUsageGuidelines = (config: RepopackConfigMerged, repositoryInstruction: string): string => {
return `
- This file should be treated as read-only. Any changes should be made to the
original repository files, not this packed version.
- When processing this file, use the file path to distinguish
between different files in the repository.
- Be aware that this file may contain sensitive information. Handle it with
the same level of security as you would the original repository.
${config.output.headerText ? '- Pay special attention to the Repository Description. These contain important context and guidelines specific to this project.' : ''}
${repositoryInstruction ? '- Pay special attention to the Repository Instruction. These contain important context and guidelines specific to this project.' : ''}
`.trim();
};

export const generateSummaryNotes = (config: RepopackConfigMerged): string => {
return `
- Some files may have been excluded based on .gitignore rules and Repopack's
configuration.
- Binary files are not included in this packed representation. Please refer to
the Repository Structure section for a complete list of file paths, including
binary files.
${config.output.removeComments ? '- Code comments have been removed.\n' : ''}
${config.output.showLineNumbers ? '- Line numbers have been added to the beginning of each line.\n' : ''}
`.trim();
};

export const generateSummaryAdditionalInfo = (): string => {
return `
For more information about Repopack, visit: https://github.com/yamadashy/repopack
`.trim();
};
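
As far as can be told from the diff, `generateSummaryNotes` and `generateSummaryUsageGuidelines` interpolate an empty string when a flag is off, and `.trim()` only strips the ends, so blank lines can remain between bullets. One alternative, sketched here with a hypothetical `buildSummaryNotes` function and a simplified `NotesOptions` shape (and with the bullet wording shortened), is to collect the lines in an array and filter out the disabled ones.

```typescript
// Simplified, hypothetical options shape; the real code takes RepopackConfigMerged.
interface NotesOptions {
  removeComments: boolean;
  showLineNumbers: boolean;
}

const buildSummaryNotes = (options: NotesOptions): string => {
  // Each optional bullet is either its text or `false` when the flag is off.
  const lines: Array<string | false> = [
    "- Some files may have been excluded based on .gitignore rules and Repopack's configuration.",
    '- Binary files are not included in this packed representation.',
    options.removeComments && '- Code comments have been removed.',
    options.showLineNumbers && '- Line numbers have been added to the beginning of each line.',
  ];

  // Dropping the `false` entries leaves no gaps between the remaining bullets.
  return lines.filter((line): line is string => line !== false).join('\n');
};
```

With both flags off this returns exactly two bullets, and enabling a flag adds one line with no blank line before or after it.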