chore: refactor docx_tool to reduce function size #6273
Split the monolithic docx_tool function into focused helper functions:
- docx_error/invalid_params: Error construction helpers
- read_docx_file/read_or_create_docx/write_docx_file: File I/O helpers
- extract_paragraph_text: Text extraction from paragraphs
- add_styled_paragraphs: Styled paragraph creation
- parse_update_mode: Parameter parsing for update modes
- extract_text_from_docx/extract_structure_from_docx: Document content extraction
- do_extract_text: Extract text operation handler
- do_append: Append mode handler
- do_replace: Replace mode handler
- do_insert_structured: Structured insert mode handler
- load_image_as_png: Image loading and conversion
- do_add_image: Add image mode handler

The main docx_tool function is now a simple dispatcher (33 lines), well under the 200 line target.

All existing tests pass.
Pull request overview
This PR successfully refactors the docx_tool function from 440 lines to 33 lines by extracting focused helper functions. The refactoring maintains all original functionality while significantly improving code organization and readability.
Key changes:
- Extracted 15 helper functions covering error handling, file I/O, content extraction, content creation, and operation handlers
- Simplified error creation with docx_error and invalid_params helper functions
- Consolidated duplicate paragraph text extraction logic into extract_paragraph_text
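As a rough illustration of the error-helper idea, here is a minimal sketch. The PR page does not show the helper bodies or signatures, so everything below is an assumption: `ErrorCode` and `ErrorData` are simplified stand-ins rather than the real MCP types, and the `context` parameter is hypothetical.

```rust
use std::fmt::Display;

/// Simplified stand-in for the MCP error codes (assumption, not the real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorCode {
    InvalidParams,
    InternalError,
}

/// Simplified stand-in for the ErrorData type named in the PR (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ErrorData {
    pub code: ErrorCode,
    pub message: String,
}

/// Wraps any displayable failure as an internal error with a short context prefix.
/// The two-argument signature is hypothetical.
pub fn docx_error(context: &str, err: impl Display) -> ErrorData {
    ErrorData {
        code: ErrorCode::InternalError,
        message: format!("{context}: {err}"),
    }
}

/// Builds an invalid-parameters error from a plain message.
pub fn invalid_params(message: impl Into<String>) -> ErrorData {
    ErrorData {
        code: ErrorCode::InvalidParams,
        message: message.into(),
    }
}
```

With helpers of this shape, a call site can shrink to something like `.map_err(|e| docx_error("Failed to read DOCX file", e))?` instead of building the error struct inline each time.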
I'd also be very willing to remove this MCP server entirely! I'm not sure what use it has. But assuming we want to keep it, this should be ready for review.
        }
    }
    text
}
Is this supposed to have a trailing \n? Consider using filter_map and join instead.
I kept the original code as is, other than splitting up the big function here. I'm thinking I'll keep it, but we can do another PR to clean this implementation up too.
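For reference, the reviewer's suggestion could look roughly like the sketch below. The actual loop is not shown on this page, so the push-based version is a guess at its shape, and `Block` is a hypothetical stand-in for the document's body elements rather than a type from the DOCX library.

```rust
/// Hypothetical stand-in for a document body element (assumption).
pub enum Block {
    /// A paragraph, represented here as its text runs.
    Paragraph(Vec<String>),
    /// Any non-paragraph element, which contributes no text.
    Table,
}

/// Concatenates the runs of a single paragraph.
pub fn paragraph_text(runs: &[String]) -> String {
    runs.concat()
}

/// Push-based extraction: every paragraph is followed by '\n',
/// so the result ends with a trailing newline when any paragraph exists.
pub fn extract_text_push(blocks: &[Block]) -> String {
    let mut text = String::new();
    for block in blocks {
        if let Block::Paragraph(runs) = block {
            text.push_str(&paragraph_text(runs));
            text.push('\n');
        }
    }
    text
}

/// filter_map + join: newlines appear only between paragraphs,
/// so there is no trailing newline.
pub fn extract_text_join(blocks: &[Block]) -> String {
    blocks
        .iter()
        .filter_map(|block| match block {
            Block::Paragraph(runs) => Some(paragraph_text(runs)),
            Block::Table => None,
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

The two versions differ only in the final newline, which is why switching would be a behavior change and fits a follow-up PR better than a pure refactor.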
* 'main' of github.com:block/goose:
  refactor: when changing provider/model,load existing provider/model (#6334)
  chore: refactor configure_extensions_dialog to reduce line count (#6277)
  chore: refactor handle_configure to reduce line count (#6276)
  chore: refactor interactive session to reduce line count (#6274)
  chore: refactor docx_tool to reduce function size (#6273)
  chore: refactor cli() function to reduce line count (#6272)
  make sure the models are using streaming properly (#6331)
  feat: add a max tokens env var (#6264)
  docs: slash commands topic (#6333)
  fix(ci): prevent gh-pages branch bloat (#6340)
  chore(deps): bump qs and body-parser in /documentation (#6338)
  Skip the smoke tests for dependabot PRs (#6337)
Summary
Refactors the docx_tool function in crates/goose-mcp/src/computercontroller/docx_tool.rs to address the clippy too_many_lines warning. The original function was 440 lines; it is now 33 lines.

Changes
Split the monolithic docx_tool function into focused helper functions:

Error Helpers
- docx_error: Creates INTERNAL_ERROR ErrorData
- invalid_params: Creates INVALID_PARAMS ErrorData

File I/O Helpers
- read_docx_file: Reads and parses a DOCX file
- read_or_create_docx: Reads existing or creates new DOCX
- write_docx_file: Builds and writes DOCX to disk

Content Extraction Helpers
- extract_paragraph_text: Extracts text from a paragraph
- extract_text_from_docx: Extracts all text from a document
- extract_structure_from_docx: Extracts heading structure

Content Creation Helpers
- add_styled_paragraphs: Creates styled paragraphs from content
- parse_update_mode: Parses update mode from JSON params
- load_image_as_png: Loads and converts images to PNG

Operation Handlers
- do_extract_text: Handles extract_text operation
- do_append: Handles append mode
- do_replace: Handles replace mode
- do_insert_structured: Handles structured insert mode
- do_add_image: Handles add_image mode

Testing
All 9 existing tests pass:
- test_docx_text_extraction
- test_docx_update_append
- test_docx_update_styled
- test_docx_update_replace
- test_docx_add_image
- test_docx_invalid_path
- test_docx_invalid_operation
- test_docx_update_without_content
- test_docx_update_preserve_content

Result
The main docx_tool function is now a simple 33-line dispatcher, well under the 200 line target.
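To make the dispatcher shape concrete, here is a minimal sketch of how a thin entry point might route to the handlers listed above. It is not the code from this PR: the `UpdateMode` enum, the operation name "update_doc", the mode strings, the simplified string-based signatures, and the stub handler bodies are all assumptions made for illustration.

```rust
/// Hypothetical update modes, mirroring the handlers named in the PR description.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateMode {
    Append,
    Replace,
    InsertStructured,
    AddImage,
}

/// Parses the mode string; the accepted spellings here are assumptions.
pub fn parse_update_mode(mode: &str) -> Result<UpdateMode, String> {
    match mode {
        "append" => Ok(UpdateMode::Append),
        "replace" => Ok(UpdateMode::Replace),
        "structured" => Ok(UpdateMode::InsertStructured),
        "add_image" => Ok(UpdateMode::AddImage),
        other => Err(format!("Invalid update mode: {other}")),
    }
}

// Stub handlers: each stands in for one focused helper and only reports
// which branch was taken, so the routing can be observed.
fn do_extract_text(path: &str) -> Result<String, String> {
    Ok(format!("extract_text: {path}"))
}
fn do_append(path: &str) -> Result<String, String> {
    Ok(format!("append: {path}"))
}
fn do_replace(path: &str) -> Result<String, String> {
    Ok(format!("replace: {path}"))
}
fn do_insert_structured(path: &str) -> Result<String, String> {
    Ok(format!("structured: {path}"))
}
fn do_add_image(path: &str) -> Result<String, String> {
    Ok(format!("add_image: {path}"))
}

/// Thin dispatcher: validates the operation and mode, then delegates.
pub fn docx_tool(path: &str, operation: &str, mode: Option<&str>) -> Result<String, String> {
    match operation {
        "extract_text" => do_extract_text(path),
        "update_doc" => {
            let mode = mode.ok_or_else(|| "Missing 'mode' parameter".to_string())?;
            match parse_update_mode(mode)? {
                UpdateMode::Append => do_append(path),
                UpdateMode::Replace => do_replace(path),
                UpdateMode::InsertStructured => do_insert_structured(path),
                UpdateMode::AddImage => do_add_image(path),
            }
        }
        other => Err(format!("Invalid operation: {other}")),
    }
}
```

Keeping the entry point to a pair of match expressions is what keeps it short: each branch is a single call, and per-mode logic lives in a handler that can be read and tested on its own.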