fix: The detected filetype is PLAIN_TEXT, but the provided filetype was HTML #6885

Merged
DOsinga merged 2 commits into main from goose/issue-6873
Feb 13, 2026

Conversation

@github-actions github-actions bot commented Feb 1, 2026

Closes #6873

Summary

Issue #6873 Fix Summary

Problem

When using Goose with the AWS Bedrock provider, viewing .html files containing template syntax (Jinja2, Django, etc.) caused a ValidationException error:
"The detected filetype is PLAIN_TEXT, but the provided filetype was HTML."

This occurred because the to_bedrock_document() function determined the document format from the file extension alone, without inspecting the content. When it encountered .html files, it declared them to Bedrock's API as the Html format. However, Bedrock validates content against the declared format, and template files containing {{ }} or {% %} syntax are detected as plain text rather than HTML, so the request is rejected.

Solution

Removed HTML from the list of supported document formats in the to_bedrock_document() function. HTML files now fall through to the default case (_ => return Ok(None)), which means they are handled as plain text content instead of being declared as HTML documents.

Change Made

File: crates/goose/src/providers/formats/bedrock.rs

Removed line:

Some((name, "html")) => (name, bedrock::DocumentFormat::Html),

From the match statement at lines 258-263:

// Before:
let (name, format) = match filename.split_once('.') {
    Some((name, "txt")) => (name, bedrock::DocumentFormat::Txt),
    Some((name, "csv")) => (name, bedrock::DocumentFormat::Csv),
    Some((name, "md")) => (name, bedrock::DocumentFormat::Md),
    Some((name, "html")) => (name, bedrock::DocumentFormat::Html),  // REMOVED
    _ => return Ok(None), // Not a supported document type
};

// After:
let (name, format) = match filename.split_once('.') {
    Some((name, "txt")) => (name, bedrock::DocumentFormat::Txt),
    Some((name, "csv")) => (name, bedrock::DocumentFormat::Csv),
    Some((name, "md")) => (name, bedrock::DocumentFormat::Md),
    _ => return Ok(None), // Not a supported document type
};
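
The effect of the change can be sketched in isolation. The snippet below is only an illustration under stated assumptions: `DocumentFormat` is a local stand-in for `bedrock::DocumentFormat`, and `format_before` / `format_after` are hypothetical helpers that mirror the two versions of the match statement. They are not functions in the goose codebase.

```rust
// Local stand-in for `bedrock::DocumentFormat` (assumption: the real enum
// comes from the AWS SDK and is not reproduced here).
#[derive(Debug, PartialEq, Clone, Copy)]
enum DocumentFormat {
    Txt,
    Csv,
    Md,
    Html,
}

/// Hypothetical helper mirroring the match statement before the fix:
/// `.html` files are declared as HTML documents.
fn format_before(filename: &str) -> Option<(&str, DocumentFormat)> {
    match filename.split_once('.') {
        Some((name, "txt")) => Some((name, DocumentFormat::Txt)),
        Some((name, "csv")) => Some((name, DocumentFormat::Csv)),
        Some((name, "md")) => Some((name, DocumentFormat::Md)),
        Some((name, "html")) => Some((name, DocumentFormat::Html)),
        _ => None, // Not a supported document type
    }
}

/// Hypothetical helper mirroring the match statement after the fix:
/// `.html` falls through to the default arm and yields `None`.
fn format_after(filename: &str) -> Option<(&str, DocumentFormat)> {
    match filename.split_once('.') {
        Some((name, "txt")) => Some((name, DocumentFormat::Txt)),
        Some((name, "csv")) => Some((name, DocumentFormat::Csv)),
        Some((name, "md")) => Some((name, DocumentFormat::Md)),
        _ => None, // Not a supported document type (now includes html)
    }
}
```

With these helpers, `format_before("index.html")` returns `Some(("index", DocumentFormat::Html))`, while `format_after("index.html")` returns `None`. Other extensions are unaffected: `format_after("notes.md")` still returns `Some(("notes", DocumentFormat::Md))`.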

Verification

  • cargo check - passed
  • cargo fmt - passed
  • ./scripts/clippy-lint.sh - passed (pre-existing warnings only)

Impact

  • HTML files will now be handled as plain text content, avoiding Bedrock's content validation
  • No impact on other file types (txt, csv, md)
  • This is the minimal safe fix as suggested in the issue comments

Generated by goose Issue Solver

@blackgirlbytes blackgirlbytes marked this pull request as ready for review February 2, 2026 02:29
Copilot AI review requested due to automatic review settings February 2, 2026 02:29

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@katzdave katzdave left a comment

Seems fine to delete since scoped to bedrock.

@katzdave katzdave enabled auto-merge February 6, 2026 16:15
@katzdave katzdave disabled auto-merge February 6, 2026 16:16
@DOsinga DOsinga merged commit 6a7adb4 into main Feb 13, 2026
17 of 18 checks passed
@DOsinga DOsinga deleted the goose/issue-6873 branch February 13, 2026 15:31
katzdave added a commit that referenced this pull request Feb 13, 2026
…ntext

* 'main' of github.com:block/goose:
  feat: add onFallbackRequest handler to McpAppRenderer (#7208)
  feat: add streaming support for Claude Code CLI provider (#6833)
  fix: The detected filetype is PLAIN_TEXT, but the provided filetype was HTML (#6885)
  Add prompts (#7212)
  Add testing instructions for speech to text (#7185)
  Diagnostic files copying (#7209)
  fix: allow concurrent tool execution within the same MCP extension (#7202)
  fix: handle missing arguments in MCP tool calls to prevent GUI crash (#7143)
  Filter Apps page to only show standalone Goose Apps (#6811)
  opt: use static for Regex (#7205)
  nit: show dir in title, and less... jank (#7138)
  feat(gemini-cli): use stream-json output and re-use session (#7118)
  chore(deps): bump qs from 6.14.1 to 6.14.2 in /documentation (#7191)
  Switch jsonwebtoken to use aws-lc-rs (already used by rustls) (#7189)
  chore(deps): bump qs from 6.14.1 to 6.14.2 in /evals/open-model-gym/mcp-harness (#7184)
  Add SLSA build provenance attestations to release workflows (#7097)
  fix save and run recipe not working (#7186)
  Upgraded npm packages for latest security updates (#7183)
  docs: reasoning effort levels for Codex provider (#6798)
michaelneale added a commit that referenced this pull request Feb 16, 2026
* origin/main: (42 commits)
  fix: use dynamic port for Tetrate auth callback server (#7228)
  docs: removing LLM Usage admonitions (#7227)
  feat(otel): respect standard OTel env vars for exporter selection (#7144)
  fix: fork session (#7219)
  Bump version numbers for 1.24.0 release (#7214)
  Move platform extensions into their own folder (#7210)
  fix: ignore deprecated skills extension (#7139)
  Add a goosed over HTTP integration test, and test the developer tool PATH (#7178)
  feat: add onFallbackRequest handler to McpAppRenderer (#7208)
  feat: add streaming support for Claude Code CLI provider (#6833)
  fix: The detected filetype is PLAIN_TEXT, but the provided filetype was HTML (#6885)
  Add prompts (#7212)
  Add testing instructions for speech to text (#7185)
  Diagnostic files copying (#7209)
  fix: allow concurrent tool execution within the same MCP extension (#7202)
  fix: handle missing arguments in MCP tool calls to prevent GUI crash (#7143)
  Filter Apps page to only show standalone Goose Apps (#6811)
  opt: use static for Regex (#7205)
  nit: show dir in title, and less... jank (#7138)
  feat(gemini-cli): use stream-json output and re-use session (#7118)
  ...