feat: add stdin support for CLI using temp-file approach #3

Open
mvance wants to merge 13 commits into main from feature/stdin-temp-file-support

Conversation

@mvance
Owner

@mvance mvance commented Feb 6, 2026

This PR adds stdin support to the summarize CLI, allowing users to pipe content directly into the tool using '-' as the input argument.

Key Changes:

  • Extended InputTarget type to include stdin
  • Added stdin resolution for '-' input
  • Implemented temp-file approach for stdin processing
  • Added 50MB size limit with streaming checks
  • Updated documentation and help text
  • Added comprehensive tests
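
The first two changes can be pictured with a minimal sketch. Only the `{ kind: 'stdin' }` variant and the `resolveInputTarget` name come from this PR; the `url` and `file` variants, their field names, and the injectable `fileExists` parameter are assumptions made for illustration.

```typescript
import { existsSync } from 'node:fs'

// Hypothetical shapes: only `{ kind: 'stdin' }` is confirmed by the PR.
type InputTarget =
  | { kind: 'url'; url: string }
  | { kind: 'file'; filePath: string }
  | { kind: 'stdin' }

function resolveInputTarget(
  raw: string,
  fileExists: (p: string) => boolean = existsSync, // injectable for tests (assumption)
): InputTarget {
  const trimmed = raw.trim()
  if (trimmed === '-') {
    // Edge case named in the commit list: a real file called "-" wins over stdin.
    return fileExists(trimmed) ? { kind: 'file', filePath: trimmed } : { kind: 'stdin' }
  }
  if (/^https?:\/\//i.test(trimmed)) return { kind: 'url', url: trimmed }
  return { kind: 'file', filePath: trimmed }
}
```

Checking for an existing file before returning the stdin variant keeps `summarize -` backward compatible for anyone who really has a file named `-` in the working directory.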

Usage:

pbpaste | summarize -
cat file.txt | summarize -
curl -s https://example.com | summarize -

All existing tests pass.

Summary by CodeRabbit

  • New Features

    • Accept piped content as stdin via "-" so you can pipe files, command output, or clipboard into the tool.
  • Behavior Changes

    • Stdin inputs capped at 50 MB; piped stdin cannot be used with the extract option.
    • Markdown-mode handling clarified for piped inputs within existing mode constraints.
  • Documentation

    • Help, usage, and README examples updated with stdin piping guidance; added a lessons-learned doc for stdin support.
  • Tests

    • Added comprehensive stdin tests covering validation, errors, and mode interactions.

- Extend InputTarget type to include { kind: 'stdin' }
- Add resolveInputTarget handling for '-' input
- Implement stdin processing in runCli with temp file cleanup
- Update help text to document stdin support
- Add comprehensive tests for stdin functionality
- Follow existing error handling patterns
- Reuse handleFileInput logic for minimal changes
- Fix import duplication: use fs.readFile consistently
- Optimize Buffer handling in streamToString function
- Fix edge case: check for existing file named '-' before treating as stdin
- Add test file cleanup with afterAll hook
- Add streaming size limit check to prevent OOM (50MB max)
- Update README with stdin documentation and examples
- Add stdin examples to CLI help text
- Improve error messages for stdin size limits
@coderabbitai

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Adds stdin ("-") support: input resolver recognizes - as stdin; runner reads piped stdin (50MB cap), writes it to a temp file for processing, enforces empty-input and --extract guards, updates help/docs, and adds tests for stdin behavior.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation | `README.md`, `LESSONS_LEARNED.md` | Added "Stdin (pipe content using `-`)" section and guidance on testing, temp-file strategy, limits, and examples. |
| Input Resolution | `src/content/asset.ts`, `tests/input.resolve-input-target.test.ts` | Introduced `InputTarget` variant `{ kind: 'stdin' }`; `resolveInputTarget` treats `-` (and surrounding whitespace) as stdin. |
| CLI Help & Usage | `src/run/help.ts`, `tests/cli.run.arg-branches.test.ts` | Updated usage text to `summarize <input> [flags]`, added clipboard/piping examples (e.g., `pbpaste` …). |
| Core Runner Logic | `src/run/runner.ts`, `manifest_file`, `package.json` | Added `streamToString` with 50MB cap; extended `RunEnv` with optional `stdin`; updated `runCli` to accept stdin; implements stdin flow (read stream, write temp file with 0o600 perms, process as file, enforce empty-input and disallow `--extract`, cleanup). |
| Tests | `tests/cli.stdin.test.ts`, `tests/...` | New extensive stdin-focused tests: empty/whitespace stdin errors, disallow `--extract` for piped stdin, markdown-mode/llm handling for stdin, in-memory stdout/stderr utilities, and temporary HOME setup per test. |

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI Runner
    participant Resolver as Input Resolver
    participant FS as File System
    participant Processor as Content Processor

    User->>CLI: Pipe content into "summarize -"
    CLI->>Resolver: resolveInputTarget("-")
    Resolver-->>CLI: { kind: "stdin" }
    CLI->>CLI: streamToString(stdin) (50MB cap)
    CLI->>FS: write temp file (0o600 perms)
    FS-->>CLI: temp file path
    CLI->>Processor: process file input (temp path)
    Processor-->>CLI: processed result
    CLI->>FS: delete temp file
    FS-->>CLI: deleted
    CLI-->>User: output result
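
The write, process, delete portion of the diagram above can be sketched as follows. This is an illustration, not the code in `src/run/runner.ts`: `withStdinTempFile` is a hypothetical name, `processFile` stands in for `handleFileInput` with an assumed signature, and the file name uses `randomUUID()` as one of the suffix options raised in review.

```typescript
import { promises as fs } from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'
import { randomUUID } from 'node:crypto'

// `processFile` stands in for the PR's handleFileInput; its signature is assumed.
async function withStdinTempFile<T>(
  content: string,
  processFile: (filePath: string) => Promise<T>,
): Promise<T> {
  // Reject empty or whitespace-only input before touching the disk.
  if (content.trim().length === 0) throw new Error('Stdin is empty')
  const tempPath = path.join(os.tmpdir(), `summarize-stdin-${randomUUID()}.txt`)
  // 0o600: readable and writable by the current user only.
  await fs.writeFile(tempPath, content, { mode: 0o600 })
  try {
    return await processFile(tempPath)
  } finally {
    // Cleanup runs whether processing succeeded or threw.
    await fs.rm(tempPath, { force: true })
  }
}
```

Putting the delete in `finally` means a failure inside the processor still removes the piped content from the temp directory.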

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hear the pipe and twitch my nose,
I nibble fifty megs in rows,
A temp file made, I hop and hum,
Then tidy up where bytes have come,
A dashed request — the summary flows.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately describes the primary change: adding stdin support to the CLI using a temporary-file approach, which is the core feature implemented across multiple files. |


No actionable comments were generated in the recent review. 🎉

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/run/help.ts`:
- Around line 247-249: The example in the help text uses an invalid preset
"bullet" for the summarize command; locate the string containing cmd('cat
file.txt | summarize - --length bullet') in src/run/help.ts and replace the
"--length bullet" token with a valid preset such as "--length short" (or
"medium", "long", "xl", "xxl" or shorthand "s/m/l") or a numeric limit so the
example matches the supported summarize --length values.

In `@src/run/runner.ts`:
- Around line 723-739: The code reads directly from process.stdin and should use
the injected stdin from RunEnv instead; update the function to destructure stdin
from the existing RunEnv (matching how stdout/stderr are handled), replace
process.stdin with stdin when calling streamToString, and ensure the
temporary-file branch handles unexpected false returns from handleFileInput by
throwing an explicit error (referencing symbols: process.stdin -> use stdin from
RunEnv, streamToString, tempPath, stdinInputTarget, handleFileInput, RunEnv).
Also keep the cleanup in the finally block but ensure you propagate the error
when handleFileInput returns false so execution doesn't silently continue.

In `@tests/cli.stdin.test.ts`:
- Around line 29-39: The RunEnv type and runCli implementation ignore an
injected stdin stream: update the RunEnv type to include an optional stdin?:
NodeJS.ReadableStream (or same type used for stdout/stderr), then modify the
stdin read in runCli (the block that calls streamToString(process.stdin,
MAX_STDIN_BYTES)) to use the injected env.stdin when present (e.g.,
streamToString(env.stdin ?? process.stdin, MAX_STDIN_BYTES)); ensure references
to streamToString and MAX_STDIN_BYTES remain unchanged and that tests passing
stdin streams will be respected.
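
The injection pattern requested above might look like the following sketch. The `RunEnv` type here is a hypothetical one-field slice of the real type, and `readAllStdin` and `fakeStdin` are illustrative names that do not appear in the PR; the `env.stdin ?? process.stdin` fallback is the part taken from the review comment.

```typescript
import { Readable } from 'node:stream'

// Hypothetical slice of RunEnv; the real type also carries stdout, stderr, and more.
type RunEnv = { stdin?: NodeJS.ReadableStream }

async function readAllStdin(env: RunEnv): Promise<string> {
  // Prefer the injected stream so tests never touch the real process.stdin.
  const source = env.stdin ?? process.stdin
  const chunks: Buffer[] = []
  for await (const chunk of source) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk))
  }
  return Buffer.concat(chunks).toString('utf8')
}

// Test helper (illustrative): an in-memory stream that behaves like a pipe.
const fakeStdin = (text: string): NodeJS.ReadableStream => Readable.from([text])
```

A test can then call `readAllStdin({ stdin: fakeStdin('piped text') })` without spawning a subprocess.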
🧹 Nitpick comments (2)
tests/cli.stdin.test.ts (1)

67-81: Fragile assertion: test passes regardless of why it throws.

This test asserts .rejects.toThrow() without matching a message, so it will pass even if stdin processing itself is broken. Consider asserting the error message does not match stdin-related errors (e.g., "Stdin is empty") to confirm the code progressed past stdin handling.

Proposed improvement
-    ).rejects.toThrow() // Will throw but not due to stdin processing
+    ).rejects.toThrow(/^(?![\s\S]*Stdin is empty)/) // Should fail past stdin, e.g. on missing API key

Or more robustly, catch the error and assert it's not a stdin error:

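const error = await runCli(['-'], { ... }).then(
  () => null,
  (e: unknown) => e,
)
// Capturing the rejection without try/catch keeps a "did not throw" failure from being swallowed.
expect(error).toBeInstanceOf(Error)
expect((error as Error).message).not.toMatch(/Stdin is empty/)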
src/run/runner.ts (1)

724-724: Temp file name collision potential with Date.now().

Date.now() has millisecond resolution. While unlikely for a single-user CLI, using crypto.randomUUID() or Math.random() as a suffix would be more robust and is a trivial change.

Suggestion
-      const tempPath = path.join(os.tmpdir(), `summarize-stdin-${Date.now()}.txt`)
+      const tempPath = path.join(os.tmpdir(), `summarize-stdin-${Date.now()}-${Math.random().toString(36).slice(2, 8)}.txt`)
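
A different technique from the suffix suggested above is to let the OS pick a unique directory with `fs.mkdtemp` and place the file inside it. The sketch below swaps in that approach for comparison only; `createStdinTempFile` is a hypothetical helper and nothing in this PR indicates the runner uses it.

```typescript
import { promises as fs } from 'node:fs'
import * as os from 'node:os'
import * as path from 'node:path'

// Hypothetical helper: mkdtemp appends random characters to the prefix,
// so two concurrent runs get different directories.
async function createStdinTempFile(
  content: string,
): Promise<{ filePath: string; cleanup: () => Promise<void> }> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'summarize-stdin-'))
  const filePath = path.join(dir, 'stdin.txt')
  await fs.writeFile(filePath, content, { mode: 0o600 })
  return {
    filePath,
    // Remove the whole directory, not just the file.
    cleanup: () => fs.rm(dir, { recursive: true, force: true }),
  }
}
```

The trade-off is one extra directory per run in exchange for not having to build a unique name by hand.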

- Fix help text: change --length bullet to valid --length short preset
- Add stdin to RunEnv type for dependency injection
- Use injected stdin from RunEnv instead of hardcoded process.stdin
- Handle false return from handleFileInput with explicit error
- Add random suffix to temp file name for uniqueness
- Fix fragile test assertion to properly verify stdin processing
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/run/runner.ts`:
- Around line 724-744: The UX issue is that the current validation that rejects
the --extract flag only mentions website/YouTube URLs and omits stdin, which
confuses piped-use cases (e.g., cat file.txt | summarize - --extract); update
the validation logic that checks inputTarget.kind (the same check that throws
"--extract is only supported for website/YouTube URLs") to detect the stdin case
(where inputTarget.kind === 'stdin' or where streamToString/stdin is used) and
either (a) include stdin in the error text (e.g., "--extract is only supported
for website/YouTube URLs and not for piped stdin") or (b) add a separate branch
that throws a stdin-specific error message; ensure the change references the
same inputTarget.kind check and keep behavior of handleFileInput and temp-file
flow unchanged.
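
Option (b) from the comment above, a separate branch for stdin, could be sketched like this. `assertExtractAllowed` and `InputKind` are hypothetical names, and the stdin message wording is illustrative; only the website/YouTube message is quoted from the thread.

```typescript
type InputKind = 'url' | 'file' | 'stdin'

function assertExtractAllowed(kind: InputKind, extract: boolean): void {
  if (!extract) return
  if (kind === 'stdin') {
    // Illustrative wording; the PR's final message is not quoted in this thread.
    throw new Error('--extract is not supported for piped stdin input')
  }
  if (kind !== 'url') {
    throw new Error('--extract is only supported for website/YouTube URLs')
  }
}
```

Checking the stdin case first means a user running `pbpaste | summarize - --extract` sees an error that mentions piping instead of one about URLs.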
🧹 Nitpick comments (1)
src/run/runner.ts (1)

53-67: streamToString helper looks solid overall; minor note on chunk handling.

The streaming size check and Buffer accumulation pattern are correct. One small thing: when the stream is in object mode or the source emits strings, Buffer.from(chunk) works but creates a copy even when chunk is already a Buffer. In practice, at the 50 MB ceiling this is negligible, but you could use Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk) to avoid the extra allocation.

♻️ Optional: avoid redundant Buffer copy
   for await (const chunk of stream) {
-    const buffer = Buffer.from(chunk)
+    const buffer = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)
     totalSize += buffer.length
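
For context, a complete helper in the shape this comment describes might look like the sketch below. The names `streamToString` and `MAX_STDIN_BYTES` come from the thread, but the body, the error wording, and the reading of 50MB as 50 × 1024 × 1024 bytes are assumptions.

```typescript
// Assumes binary megabytes; the PR only says "50MB".
const MAX_STDIN_BYTES = 50 * 1024 * 1024

async function streamToString(
  stream: AsyncIterable<string | Buffer>,
  maxBytes: number = MAX_STDIN_BYTES,
): Promise<string> {
  const chunks: Buffer[] = []
  let totalSize = 0
  for await (const chunk of stream) {
    // Reuse the chunk when it is already a Buffer; only strings are converted.
    const buffer = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)
    totalSize += buffer.length
    // Check inside the loop so oversized input fails before it is all buffered.
    if (totalSize > maxBytes) {
      throw new Error(`Stdin exceeds the ${maxBytes}-byte limit`) // illustrative wording
    }
    chunks.push(buffer)
  }
  return Buffer.concat(chunks).toString('utf8')
}
```

Because the size check runs per chunk, memory use stays bounded by the cap plus one chunk even when the producer never stops writing.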

- Add stdin-specific error message when using --extract with piped input
- Update test to expect new error message
- Optimize streamToString to avoid redundant Buffer copying when chunk is already a Buffer
- Remove cat file.txt | summarize - example from help.ts and README.md
  (passing filename directly is more practical)
- Remove curl -s https://example.com | summarize - example
  (summarize can fetch URLs directly)
- Remove cat file.txt | summarize - from concise help
- Remove LESSONS_LEARNED.md (not intended for upstream)
- Keep pbpaste | summarize - as primary stdin example
  (this is the most useful real-world use case)
- Update restriction to allow --markdown-mode llm for stdin and file inputs
- Reject other markdown modes (readability, auto, off) with clear error message
- Add tests for --markdown-mode llm allowance and other mode restrictions
- Error message indicates --markdown-mode llm transcript formatting is coming soon
- Follows CodeRabbit recommendation to support --markdown-mode llm use case
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/run/runner.ts`:
- Around line 519-538: The third guard duplicates and contradicts the intended
logic: when markdownModeExplicitlySet is true it rethrows "--markdown-mode is
only supported for URL inputs" for file inputs even though the prior guard
allows markdownMode === 'llm' for file/stdin; fix by removing the redundant
third check (the if that tests markdownModeExplicitlySet && inputTarget.kind !==
'url' && inputTarget.kind !== 'stdin') so the allowed path for markdownMode ===
'llm' with inputTarget.kind === 'file' or 'stdin' is not rejected, ensuring the
existing guards using markdownModeExplicitlySet, inputTarget.kind, and
markdownMode remain as the single source of truth.

The third guard was contradicting the intended logic:
- First guard: Allow URL/file/stdin
- Second guard: Only allow llm mode for file/stdin
- Third guard (BUG): Rejected file inputs even with llm mode

Removed the third guard: the first guard already restricts inputs to
URL/file/stdin, so the third was redundant, and its rejection of file
inputs in llm mode was a bug.
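
The guard logic that remains after this fix can be sketched as a single function. The names `assertMarkdownModeAllowed`, `InputKind`, and `MarkdownMode` are hypothetical and the error wording is illustrative; the mode names (`llm`, `readability`, `auto`, `off`) come from the commit messages in this thread.

```typescript
type InputKind = 'url' | 'file' | 'stdin'
type MarkdownMode = 'off' | 'auto' | 'llm' | 'readability'

function assertMarkdownModeAllowed(
  kind: InputKind,
  mode: MarkdownMode,
  explicitlySet: boolean,
): void {
  // Only an explicit --markdown-mode flag is validated.
  if (!explicitlySet) return
  // URL inputs accept every mode.
  if (kind === 'url') return
  // File and stdin inputs accept llm only; there is no later check to contradict this.
  if (mode !== 'llm') {
    throw new Error('Only --markdown-mode llm is supported for file/stdin inputs') // illustrative wording
  }
}
```

With the input kind typed as a three-way union, the "allow URL/file/stdin" guard is expressed by the type and only the llm restriction needs a runtime check.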

CodeRabbit issue: #3 (comment)
- Update README.md to remove 'file contents' from notes (removed examples)
- Fix awkward usage string in concise help: <url-or-file-or--> -> <input>
- Clarify --markdown-mode error message for file/stdin inputs
- Update test expectation for new error message