feat: add stdin support for CLI using temp-file approach by mvance · Pull Request #3 · mvance/summarize

mvance · 2026-02-06T23:02:35Z

This PR adds stdin support to the summarize CLI, allowing users to pipe content directly into the tool using '-' as the input argument.

Key Changes:

Extended InputTarget type to include stdin
Added stdin resolution for '-' input
Implemented temp-file approach for stdin processing
Added 50MB size limit with streaming checks
Updated documentation and help text
Added comprehensive tests

Usage:

pbpaste | summarize -
cat file.txt | summarize -
curl -s https://example.com | summarize -

All existing tests pass.

Summary by CodeRabbit

New Features
- Accept piped content as stdin via "-" so you can pipe files, command output, or clipboard into the tool.
Behavior Changes
- Stdin inputs capped at 50 MB; piped stdin cannot be used with the extract option.
- Markdown-mode handling clarified for piped inputs within existing mode constraints.
Documentation
- Help, usage, and README examples updated with stdin piping guidance; added a lessons-learned doc for stdin support.
Tests
- Added comprehensive stdin tests covering validation, errors, and mode interactions.

- Extend InputTarget type to include { kind: 'stdin' } - Add resolveInputTarget handling for '-' input - Implement stdin processing in runCli with temp file cleanup - Update help text to document stdin support - Add comprehensive tests for stdin functionality - Follow existing error handling patterns - Reuse handleFileInput logic for minimal changes

- Fix import duplication: use fs.readFile consistently - Optimize Buffer handling in streamToString function - Fix edge case: check for existing file named '-' before treating as stdin - Add test file cleanup with afterAll hook - Add streaming size limit check to prevent OOM (50MB max) - Update README with stdin documentation and examples - Add stdin examples to CLI help text - Improve error messages for stdin size limits

coderabbitai · 2026-02-06T23:03:09Z

📝 Walkthrough

Walkthrough

Adds stdin ("-") support: input resolver recognizes - as stdin; runner reads piped stdin (50MB cap), writes it to a temp file for processing, enforces empty-input and --extract guards, updates help/docs, and adds tests for stdin behavior.

Changes

Cohort / File(s)	Summary
Documentation `README.md`, `LESSONS_LEARNED.md`	Added "Stdin (pipe content using `-`)" section and guidance on testing, temp-file strategy, limits, and examples.
Input Resolution `src/content/asset.ts`, `tests/input.resolve-input-target.test.ts`	Introduced `InputTarget` variant `{ kind: 'stdin' }`; `resolveInputTarget` treats `-` (and surrounding whitespace) as stdin.
CLI Help & Usage `src/run/help.ts`, `tests/cli.run.arg-branches.test.ts`	Updated usage text to `summarize <input> [flags]`, added clipboard/piping examples (e.g., `pbpaste
Core Runner Logic `src/run/runner.ts`, `manifest_file`, `package.json`	Added `streamToString` with 50MB cap; extended `RunEnv` with optional `stdin`; updated `runCli` to accept `stdin`; implements stdin flow (read stream, write temp file with 0o600 perms, process as file, enforce empty-input and disallow `--extract`, cleanup).
Tests `tests/cli.stdin.test.ts`, `tests/...`	New extensive stdin-focused tests: empty/whitespace stdin errors, disallow `--extract` for piped stdin, markdown-mode/llm handling for stdin, in-memory stdout/stderr utilities, and temporary HOME setup per test.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI Runner
    participant Resolver as Input Resolver
    participant FS as File System
    participant Processor as Content Processor

    User->>CLI: Pipe content into "summarize -"
    CLI->>Resolver: resolveInputTarget("-")
    Resolver-->>CLI: { kind: "stdin" }
    CLI->>CLI: streamToString(stdin) (50MB cap)
    CLI->>FS: write temp file (0o600 perms)
    FS-->>CLI: temp file path
    CLI->>Processor: process file input (temp path)
    Processor-->>CLI: processed result
    CLI->>FS: delete temp file
    FS-->>CLI: deleted
    CLI-->>User: output result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hear the pipe and twitch my nose,
I nibble fifty megs in rows,
A temp file made, I hop and hum,
Then tidy up where bytes have come,
A dashed request — the summary flows.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and accurately describes the primary change: adding stdin support to the CLI using a temporary-file approach, which is the core feature implemented across multiple files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/stdin-temp-file-support

No actionable comments were generated in the recent review. 🎉

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@src/run/help.ts`:
- Around line 247-249: The example in the help text uses an invalid preset
"bullet" for the summarize command; locate the string containing cmd('cat
file.txt | summarize - --length bullet') in src/run/help.ts and replace the
"--length bullet" token with a valid preset such as "--length short" (or
"medium", "long", "xl", "xxl" or shorthand "s/m/l") or a numeric limit so the
example matches the supported summarize --length values.

In `@src/run/runner.ts`:
- Around line 723-739: The code reads directly from process.stdin and should use
the injected stdin from RunEnv instead; update the function to destructure stdin
from the existing RunEnv (matching how stdout/stderr are handled), replace
process.stdin with stdin when calling streamToString, and ensure the
temporary-file branch handles unexpected false returns from handleFileInput by
throwing an explicit error (referencing symbols: process.stdin -> use stdin from
RunEnv, streamToString, tempPath, stdinInputTarget, handleFileInput, RunEnv).
Also keep the cleanup in the finally block but ensure you propagate the error
when handleFileInput returns false so execution doesn't silently continue.

In `@tests/cli.stdin.test.ts`:
- Around line 29-39: The RunEnv type and runCli implementation ignore an
injected stdin stream: update the RunEnv type to include an optional stdin?:
NodeJS.ReadableStream (or same type used for stdout/stderr), then modify the
stdin read in runCli (the block that calls streamToString(process.stdin,
MAX_STDIN_BYTES)) to use the injected env.stdin when present (e.g.,
streamToString(env.stdin ?? process.stdin, MAX_STDIN_BYTES)); ensure references
to streamToString and MAX_STDIN_BYTES remain unchanged and that tests passing
stdin streams will be respected.

🧹 Nitpick comments (2)

tests/cli.stdin.test.ts (1)
67-81: Fragile assertion: test passes regardless of why it throws.

This test asserts .rejects.toThrow() without matching a message, so it will pass even if stdin processing itself is broken. Consider asserting the error message does not match stdin-related errors (e.g., "Stdin is empty") to confirm the code progressed past stdin handling.
Proposed improvement
-    ).rejects.toThrow() // Will throw but not due to stdin processing
+    ).rejects.toThrow(/(?!Stdin is empty)/) // Should fail past stdin, e.g. on missing API key
Or more robustly, catch the error and assert it's not a stdin error:
try {
  await runCli(['-'], { ... })
  expect.unreachable('should have thrown')
} catch (e) {
  expect((e as Error).message).not.toMatch(/Stdin is empty/)
}
src/run/runner.ts (1)
724-724: Temp file name collision potential with Date.now().

Date.now() has millisecond resolution. While unlikely for a single-user CLI, using crypto.randomUUID() or Math.random() as a suffix would be more robust and is a trivial change.
Suggestion
-      const tempPath = path.join(os.tmpdir(), `summarize-stdin-${Date.now()}.txt`)
+      const tempPath = path.join(os.tmpdir(), `summarize-stdin-${Date.now()}-${Math.random().toString(36).slice(2, 8)}.txt`)

src/run/help.ts

tests/cli.stdin.test.ts

- Fix help text: change --length bullet to valid --length short preset - Add stdin to RunEnv type for dependency injection - Use injected stdin from RunEnv instead of hardcoded process.stdin - Handle false return from handleFileInput with explicit error - Add random suffix to temp file name for uniqueness - Fix fragile test assertion to properly verify stdin processing

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/run/runner.ts`:
- Around line 724-744: The UX issue is that the current validation that rejects
the --extract flag only mentions website/YouTube URLs and omits stdin, which
confuses piped-use cases (e.g., cat file.txt | summarize - --extract); update
the validation logic that checks inputTarget.kind (the same check that throws
"--extract is only supported for website/YouTube URLs") to detect the stdin case
(where inputTarget.kind === 'stdin' or where streamToString/stdin is used) and
either (a) include stdin in the error text (e.g., "--extract is only supported
for website/YouTube URLs and not for piped stdin") or (b) add a separate branch
that throws a stdin-specific error message; ensure the change references the
same inputTarget.kind check and keep behavior of handleFileInput and temp-file
flow unchanged.

🧹 Nitpick comments (1)

src/run/runner.ts (1)
53-67: streamToString helper looks solid overall; minor note on chunk handling.

The streaming size check and Buffer accumulation pattern are correct. One small thing: when the stream is in object mode or the source emits strings, Buffer.from(chunk) works but creates a copy even when chunk is already a Buffer. In practice, at the 50 MB ceiling this is negligible, but you could use Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk) to avoid the extra allocation.
♻️ Optional: avoid redundant Buffer copy
   for await (const chunk of stream) {
-    const buffer = Buffer.from(chunk)
+    const buffer = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)
     totalSize += buffer.length

src/run/runner.ts

- Add stdin-specific error message when using --extract with piped input - Update test to expect new error message - Optimize streamToString to avoid redundant Buffer copying when chunk is already a Buffer

- Remove cat file.txt | summarize - example from help.ts and README.md (passing filename directly is more practical) - Remove curl -s https://example.com | summarize - example (summarize can fetch URLs directly) - Remove cat file.txt | summarize - from concise help - Remove LESSONS_LEARNED.md (not intended for upstream) - Keep pbpaste | summarize - as primary stdin example (this is the most useful real-world use case)

- Update restriction to allow --markdown-mode llm for stdin and file inputs - Reject other markdown modes (readability, auto, off) with clear error message - Add tests for --markdown-mode llm allowance and other mode restrictions - Error message indicates --markdown-mode llm transcript formatting is coming soon - Follows CodeRabbit recommendation to support --markdown-mode llm use case

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/run/runner.ts`:
- Around line 519-538: The third guard duplicates and contradicts the intended
logic: when markdownModeExplicitlySet is true it rethrows "--markdown-mode is
only supported for URL inputs" for file inputs even though the prior guard
allows markdownMode === 'llm' for file/stdin; fix by removing the redundant
third check (the if that tests markdownModeExplicitlySet && inputTarget.kind !==
'url' && inputTarget.kind !== 'stdin') so the allowed path for markdownMode ===
'llm' with inputTarget.kind === 'file' or 'stdin' is not rejected, ensuring the
existing guards using markdownModeExplicitlySet, inputTarget.kind, and
markdownMode remain as the single source of truth.

src/run/runner.ts

The third guard was contradicting the intended logic: - First guard: Allow URL/file/stdin - Second guard: Only allow llm mode for file/stdin - Third guard (BUG): Rejected file inputs even with llm mode Removed the redundant third guard since the first guard already restricts to URL/file/stdin, making the third guard unreachable and buggy. CodeRabbit issue: #3 (comment)

- Update README.md to remove 'file contents' from notes (removed examples) - Fix awkward usage string in concise help: <url-or-file-or--> -> <input> - Clarify --markdown-mode error message for file/stdin inputs - Update test expectation for new error message

mvance added 3 commits February 6, 2026 14:25

docs: add testing lessons learned for development workflow

284dffc

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

src/run/help.ts Outdated Show resolved Hide resolved

tests/cli.stdin.test.ts Show resolved Hide resolved

coderabbitai bot reviewed Feb 6, 2026

View reviewed changes

src/run/runner.ts Show resolved Hide resolved

mvance added 3 commits February 6, 2026 17:40

fix: improve UX for --extract with stdin error message

2e34bf6

- Add stdin-specific error message when using --extract with piped input - Update test to expect new error message - Optimize streamToString to avoid redundant Buffer copying when chunk is already a Buffer

coderabbitai bot reviewed Feb 7, 2026

View reviewed changes

src/run/runner.ts Outdated Show resolved Hide resolved

mvance added 6 commits February 6, 2026 23:46

test: update test to match new usage string

2080e67

ci: trigger workflow run

c9c1abb

ci: trigger workflow after enabling actions

d8f5e10

ci: trigger workflow run

37a0f67

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add stdin support for CLI using temp-file approach#3

feat: add stdin support for CLI using temp-file approach#3
mvance wants to merge 13 commits intomainfrom
feature/stdin-temp-file-support

mvance commented Feb 6, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Feb 6, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mvance commented Feb 6, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key Changes:

Usage:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mvance commented Feb 6, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 6, 2026 •

edited

Loading