Format only code that was optimized (and helpers) #211

zomglings · 2025-05-16T19:02:45Z

PR Type

Enhancement, Tests

Description

Implement CST-based targeted formatting for functions
Track function code ranges with OptimFunctionCollector
Enable disabling telemetry via env var
Add bubble sort samples for formatting tests

Changes walkthrough 📝

Relevant files

Tests

bubble_sort_method_preserve_bad_formatting_for_nonoptimized_code.py `Add bubble sort class with bad formatting` code_to_optimize/bubble_sort_method_preserve_bad_formatting_for_nonoptimized_code.py Add poorly formatted top-level `lol` function Introduce `BubbleSorter` class with bad formatting Implement `sorter` bubble sort with prints	+38/-0
bubble_sort_preserve_bad_formatting_for_nonoptimized_code.py `Add bubble sort function sample` code_to_optimize/bubble_sort_preserve_bad_formatting_for_nonoptimized_code.py Add poorly formatted top-level `lol` function Implement `sorter` bubble sort function Print sorting messages and return array	+19/-0

Enhancement

code_replacer.py `Record function code ranges in CST collector` codeflash/code_utils/code_replacer.py Include `PositionProvider` in metadata dependencies Add `modification_code_range_lines` list Record start/end lines of modified functions	+11/-2
formatter.py `Add function to get modification code ranges` codeflash/code_utils/formatter.py Import `libcst`, `OptimFunctionCollector`, and models Add `get_modification_code_ranges` function Parse code and extract modified line ranges via CST	+36/-0
function_optimizer.py `Implement targeted code splicing in optimizer` codeflash/optimization/function_optimizer.py Import `get_modification_code_ranges` and `cast` Update `reformat_code_and_helpers` for targeted formatting Splice formatted code ranges into unformatted code Validate ranges and write updated files	+41/-14

Configuration changes

main.py `Support disabling telemetry via env var` codeflash/main.py Import `os` and read `CODEFLASH_DISABLE_TELEMETRY` Adjust sentry and telemetry initialization logic Respect env var across CLI commands	+12/-7

Need help?
Type /help how to ... in the comments thread for any questions about PR-Agent usage.
Check out the documentation for more information.

…e telemetry

… sections of the file... This is to test the new formatting changes

To test that the new formatting logic correctly handles indentation

…anch We are going to take a gentler, CST-based approach to targeted formatting.

Also updated CODEFLASH_DISABLE_TELEMETRY behavior

github-actions · 2025-05-16T19:04:48Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Off-by-one slicing The code in `reformat_code_and_helpers` uses 1-based line numbers from `PositionProvider` directly for Python list slicing, which is 0-based and may lead to incorrect replacements. Verify and adjust indices by subtracting one before slicing. new_code_lines = unformatted_code.split("\n") for range_0, range_1 in zip(code_ranges_unformatted, code_ranges_formatted): range_0_0, range_0_1 = range_0 range_1_0, range_1_1 = range_1 new_code_lines = new_code_lines[:range_0_0] + formatted_code_lines[range_1_0:range_1_1 + 1] + new_code_lines[range_0_1 + 1:] new_code = "\n".join(new_code_lines) Missing tests The new `get_modification_code_ranges` function has no dedicated unit tests. Add tests covering various scenarios (e.g., nested functions, classes, edge cases) to ensure correct range detection. def get_modification_code_ranges( modified_code: str, fto: FunctionToOptimize, code_context: CodeOptimizationContext, ) -> list[tuple[int, int]]: """ Returns the line number of modified and new functions in a string containing containing the code in a fully modified file. """ modified_functions = set() modified_functions.add(fto.qualified_name) for helper_function in code_context.helper_functions: if helper_function.jedi_definition.type != "class": modified_functions.add(helper_function.qualified_name) parsed_function_names = set() for original_function_name in modified_functions: if original_function_name.count(".") == 0: class_name, function_name = None, original_function_name elif original_function_name.count(".") == 1: class_name, function_name = original_function_name.split(".") else: msg = f"Unable to find {original_function_name}. Returning unchanged source code." logger.error(msg) continue parsed_function_names.add((class_name, function_name)) module = cst.metadata.MetadataWrapper(cst.parse_module(modified_code)) visitor = OptimFunctionCollector(code_context.preexisting_objects, parsed_function_names) module.visit(visitor) return visitor.modification_code_range_lines

github-actions · 2025-05-16T19:07:11Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Sort imports before writing You should apply `sort_imports` before writing the file so the sorted imports are actually persisted. Otherwise the write happens before sorting. codeflash/optimization/function_optimizer.py [619-622] +if should_sort_imports: + new_code = sort_imports(new_code) path.write_text(new_code, encoding="utf8") -if should_sort_imports: - new_code = sort_imports(new_code) - Suggestion importance[1-10]: 9 __ Why: Writing before sorting means the sorted imports aren’t persisted; reordering ensures the final file contains the sorted imports.	High
Possible issue	Initialize modification flag correctly Instead of defaulting `modification` to True and only clearing it in one branch, initialize it to False and set it to True only when you actually add the function to modified or new lists. This prevents unintentional marking of unrelated functions. codeflash/code_utils/code_replacer.py [55-75] -modification = True +modification = False if (self.current_class, node.name.value) in self.function_names: + modification = True self.modified_functions[(self.current_class, node.name.value)] = node elif self.current_class and node.name.value == "__init__": + modification = True self.modified_init_functions[self.current_class] = node elif ( self.preexisting_objects and (node.name.value, ()) not in self.preexisting_objects and self.current_class is None ): + modification = True self.new_functions.append(node) -else: - modification = False if modification: pos = self.get_metadata(cst.metadata.PositionProvider, node) self.modification_code_range_lines.append((pos.start.line, pos.end.line)) Suggestion importance[1-10]: 8 __ Why: Defaulting `modification` to True marks all functions as modified; initializing it to False ensures only targeted functions are tracked, avoiding incorrect metadata.	Medium
General	Support nested qualified names Support qualified names with multiple dots by splitting at the last dot, rather than erroring out. This will correctly handle nested modules or classes. codeflash/code_utils/formatter.py [79-87] for original_function_name in modified_functions: - if original_function_name.count(".") == 0: - class_name, function_name = None, original_function_name - elif original_function_name.count(".") == 1: - class_name, function_name = original_function_name.split(".") - else: - msg = f"Unable to find {original_function_name}. Returning unchanged source code." - logger.error(msg) - continue + head, sep, tail = original_function_name.rpartition(".") + class_name = head if sep else None + function_name = tail if sep else original_function_name Suggestion importance[1-10]: 6 __ Why: Splitting at the last dot correctly handles nested qualified names, extending support without altering the handling of simple names.	Low

…ing purposes This is done on a per-path basis.

... and started adding tests for that function.

Saga4 · 2025-05-17T23:04:06Z

codeflash/optimization/function_optimizer.py

-        new_code = format_code(self.args.formatter_cmds, path)
-        if should_sort_imports:
-            new_code = sort_imports(new_code)
+            if len(code_ranges_formatted) != len(code_ranges_unformatted):


Q: Can you help me understand the case on Why these have to be equal always?

Just a defensive measure. The function to optimize and helpers are non-overlapping blocks of code, and should not become overlapping due to formatting. If this has happened, something has gone wrong in the formatting step or in the extraction step (which identifies function to optimize and helpers).

misrasaurabh1 · 2025-05-18T04:10:16Z

codeflash/optimization/function_optimizer.py

@@ -299,7 +299,6 @@ def optimize_function(self) -> Result[BestOptimization, str]:

                self.log_successful_optimization(explanation, generated_tests, exp_type)

-                # xylophone


xylophone? 😂

Easy to grep, should be gone. :)

misrasaurabh1 · 2025-05-18T04:10:59Z

approach looks good. Can you write tests for reformat_code_and_helpers that check for the variety of cases we expect around unformatted and formatted code?

with --unsafe-fixes

... AGAIN

zomglings · 2025-06-02T02:03:31Z

No longer have permission to push to the official repo, will open this PR from my fork.

zomglings added 19 commits May 14, 2025 11:16

feat(optimizer): Implement targeted formatting (CF-637)

3cbd6b7

Fixed changes to the FunctionOptimizer

a10600c

CODEFLASH_DISABLE_TELEMETRY environment variable can be set to disabl…

bcba527

…e telemetry

Added bubble sort implementation with bad formatting in non-optimized…

36da640

… sections of the file... This is to test the new formatting changes

Added a file containing a bubble sort method in a class

85fd3c0

To test that the new formatting logic correctly handles indentation

Merge branch 'main' into targeted-formatting

ce33708

Added "scratch/" directory to .gitignore

2c40018

Cleaned up the import sorting code in FunctionOptimizer

82b9d41

Added test for sort_imports_in_place

ce87832

Reverted changes to optimizer and formatter on targeted-formatting br…

0373bfa

…anch We are going to take a gentler, CST-based approach to targeted formatting.

Merge branch 'main' into targeted-formatting-cst-based

81dbb33

Started work on targeted formatting using the CST

9345d80

TODO

bfc8423

Updated implementation of FunctionOptimizer.reformat_code_and_helpers

91fe6a7

Fixed a few bugs in reformat_code_and_helpers

b89622c

More codeposition bugs

5a9265c

Issue with splicing

b903d1a

Fixing more bugs, testing live...

16ca27e

Also updated CODEFLASH_DISABLE_TELEMETRY behavior

Got it functional

59e3667

zomglings requested a review from misrasaurabh1 May 16, 2025 19:02

github-actions bot added the Review effort 3/5 label May 16, 2025

zomglings added 4 commits May 16, 2025 14:04

Do not recalculate code_context when reformatting

c3b8063

Correct calculation of all preexisting "function" symbols for formatt…

6dd72cf

…ing purposes This is done on a per-path basis.

Clarified docstring for get_modification_code_ranges

eae756a

... and started adding tests for that function.

removed xylophone

05817f9

Saga4 reviewed May 17, 2025

View reviewed changes

misrasaurabh1 reviewed May 18, 2025

View reviewed changes

zomglings added 9 commits May 19, 2025 09:56

Added test for get_modification_code_ranges.

9efdc21

Merge branch 'main' into targeted-formatting-cst-based

b6baf05

"ruff check --fix"

9face63

Fixed some more ruff check issues

0615f55

"ruff format"

af4df4a

more fixes for "ruff check"...

cf4a665

with --unsafe-fixes

ruff format...

d43862e

... AGAIN

That should be hte last of the ruff stuff

6ec5ef0

Added a test for FunctionOptimizer.reformat_code_and_helpers

a468ba7

mohammedahmed18 mentioned this pull request Jun 1, 2025

WIP - targeted function reformatting with CST #261

Closed

zomglings closed this Jun 2, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Format only code that was optimized (and helpers) #211

Format only code that was optimized (and helpers) #211

Uh oh!

zomglings commented May 16, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented May 16, 2025

Uh oh!

github-actions bot commented May 16, 2025

Uh oh!

Saga4 May 17, 2025

Uh oh!

zomglings Jun 2, 2025

Uh oh!

misrasaurabh1 May 18, 2025

Uh oh!

zomglings May 19, 2025

Uh oh!

misrasaurabh1 commented May 18, 2025

Uh oh!

zomglings commented Jun 2, 2025

Uh oh!

Uh oh!

		@@ -299,7 +299,6 @@ def optimize_function(self) -> Result[BestOptimization, str]:

		self.log_successful_optimization(explanation, generated_tests, exp_type)

		# xylophone

Format only code that was optimized (and helpers) #211

Format only code that was optimized (and helpers) #211

Uh oh!

Conversation

zomglings commented May 16, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Type

Description

Changes walkthrough 📝

Uh oh!

github-actions bot commented May 16, 2025

PR Reviewer Guide 🔍

Uh oh!

github-actions bot commented May 16, 2025

PR Code Suggestions ✨

Uh oh!

Saga4 May 17, 2025

Choose a reason for hiding this comment

Uh oh!

zomglings Jun 2, 2025

Choose a reason for hiding this comment

Uh oh!

misrasaurabh1 May 18, 2025

Choose a reason for hiding this comment

Uh oh!

zomglings May 19, 2025

Choose a reason for hiding this comment

Uh oh!

misrasaurabh1 commented May 18, 2025

Uh oh!

zomglings commented Jun 2, 2025

Uh oh!

Uh oh!

zomglings commented May 16, 2025 •

edited by github-actions bot

Loading