feat(copilot): JSON sanitization logic + operations sequence diff correctness #1521
Greptile Overview
Summary
This PR introduces significant architectural changes to standardize JSON serialization and workflow data structures for copilot training consistency. The main changes include:
Core Architecture Refactoring:
- JSON Sanitization Overhaul: The `json-sanitizer.ts` file has been completely restructured to embed connections directly within blocks rather than maintaining separate edges/loops/parallels arrays. This creates a more nested, self-contained data structure that aligns with how copilot operations work internally.
- Enhanced Security: Added comprehensive sensitive data detection with regex patterns and OAuth input type checking to prevent API keys and secrets from being included in training data.
- Simplified Data Structures: Replaced complex ReactFlow-specific structures with a unified nested approach where loops/parallels are represented as `nestedNodes` within their parent blocks.
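The restructuring described above can be sketched roughly as follows. This is a minimal illustration, not the code in `json-sanitizer.ts`: the type names, the `sanitizeSketch` function, the `parentId` field, the `[REDACTED]` placeholder, and the key-name regex are all assumptions made for the example, and OAuth input type checking is left out.

```typescript
// Hypothetical shapes for illustration; the real types in json-sanitizer.ts may differ.
interface RawBlock {
  id: string
  type: string
  inputs: Record<string, unknown>
  parentId?: string // assumed marker for blocks that live inside a loop or parallel
}

interface RawEdge {
  source: string
  target: string
}

interface SanitizedBlock {
  type: string
  inputs: Record<string, unknown>
  connections: string[] // ids of downstream blocks, embedded in the block itself
  nestedNodes?: Record<string, SanitizedBlock>
}

// Assumed pattern: input key names that usually hold credentials.
const SENSITIVE_KEY = /api[-_]?key|secret|password|token/i

// Replace values of sensitive-looking keys with a placeholder.
function redactInputs(inputs: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(inputs)) {
    out[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : value
  }
  return out
}

// Fold a flat block list plus a separate edge list into one nested structure.
function sanitizeSketch(blocks: RawBlock[], edges: RawEdge[]): Record<string, SanitizedBlock> {
  const byId = new Map<string, SanitizedBlock>()
  for (const block of blocks) {
    byId.set(block.id, {
      type: block.type,
      inputs: redactInputs(block.inputs),
      connections: edges.filter((e) => e.source === block.id).map((e) => e.target),
    })
  }

  const roots: Record<string, SanitizedBlock> = {}
  for (const block of blocks) {
    const node = byId.get(block.id)!
    const parent = block.parentId ? byId.get(block.parentId) : undefined
    if (parent) {
      // Children of a loop/parallel are stored under the parent, not at the top level.
      parent.nestedNodes = parent.nestedNodes ?? {}
      parent.nestedNodes[block.id] = node
    } else {
      roots[block.id] = node
    }
  }
  return roots
}
```

A key-name regex like this one is deliberately coarse; it would also redact a harmless field such as `maxTokens`, so a real implementation would likely need a more careful rule set.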
Training Modal Enhancement:
Added a new "Send Live State" tab to the training modal (training-modal.tsx) that allows users to capture and submit the current workflow state directly for copilot training, complementing the existing session recording functionality. This provides a quick way to submit complete workflow examples without going through the full editing session process.
Operations System Updates:
- Edit Sequence Computation: Refactored `compute-edit-sequence.ts` to work with the new sanitized format, introducing the `extractAllEdgesFromBlocks` function and unified `nestedNodes` handling.
- Workflow Editing: Enhanced `edit-workflow.ts` with comprehensive nested node support for loop and parallel blocks, enabling proper hierarchical block structure management.
- Metadata Standardization: Renamed copilot block metadata properties (`commonParameters` → `inputSchema`, `inputs` → `inputDefinitions`) for better semantic clarity.
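To illustrate what recovering edges from the nested format might involve, here is a minimal sketch. The PR names a function `extractAllEdgesFromBlocks`, but its actual signature is not shown here, so `extractAllEdgesSketch` and the `NestedBlock` and `Edge` types below are hypothetical stand-ins.

```typescript
// Hypothetical shapes; the real sanitized format may carry more fields.
interface NestedBlock {
  connections?: string[] // ids of downstream blocks
  nestedNodes?: Record<string, NestedBlock>
}

interface Edge {
  source: string
  target: string
}

// Walk the block tree and emit one edge per embedded connection,
// recursing into loop/parallel children.
function extractAllEdgesSketch(blocks: Record<string, NestedBlock>): Edge[] {
  const edges: Edge[] = []
  for (const [id, block] of Object.entries(blocks)) {
    for (const target of block.connections ?? []) {
      edges.push({ source: id, target })
    }
    if (block.nestedNodes) {
      edges.push(...extractAllEdgesSketch(block.nestedNodes))
    }
  }
  return edges
}
```

Flattening back to an edge list like this lets two sanitized states be compared edge by edge, even though the edges are no longer stored as a separate array.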
User Settings Integration:
Added two new boolean settings (showFloatingControls and showTrainingControls) to support the new UI controls, with appropriate defaults to maintain backward compatibility while allowing users to opt into new training features.
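One way such defaults could be applied to settings rows stored before these fields existed is sketched below. The setting names come from the PR; the default values (`true` and `false`) and the `withDefaults` helper are assumptions for illustration, not what `route.ts` is confirmed to do.

```typescript
interface TrainingUiSettings {
  showFloatingControls: boolean
  showTrainingControls: boolean
}

// Assumed defaults: keep existing controls visible, keep training controls opt-in.
const DEFAULT_UI_SETTINGS: TrainingUiSettings = {
  showFloatingControls: true,
  showTrainingControls: false,
}

// Fill in missing fields so records saved before this change still resolve to valid settings.
function withDefaults(stored: Partial<TrainingUiSettings> | null | undefined): TrainingUiSettings {
  return {
    showFloatingControls: stored?.showFloatingControls ?? DEFAULT_UI_SETTINGS.showFloatingControls,
    showTrainingControls: stored?.showTrainingControls ?? DEFAULT_UI_SETTINGS.showTrainingControls,
  }
}
```

Using `??` rather than `||` matters here: an explicit `false` saved by the user is preserved instead of being overwritten by the default.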
These changes work together to create a more consistent and robust foundation for copilot training data, moving away from UI-specific ReactFlow concepts toward a more standardized, nested data representation that better reflects actual workflow execution patterns.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| apps/sim/app/workspace/[workspaceId]/w/[workflowId]/components/training-controls/training-modal.tsx | 4/5 | Adds new "Send Live State" tab for direct workflow state submission to copilot training |
| apps/sim/lib/workflows/training/compute-edit-sequence.ts | 3/5 | Major refactor to work with sanitized JSON format, removes edge removal tracking |
| apps/sim/lib/workflows/json-sanitizer.ts | 4/5 | Complete restructuring to embed connections in blocks and use nested structures |
| apps/sim/lib/copilot/tools/server/blocks/get-blocks-metadata-tool.ts | 5/5 | Renames metadata properties for better semantic consistency |
| apps/sim/app/api/users/me/settings/route.ts | 5/5 | Adds user settings for training UI controls with proper defaults |
| apps/sim/lib/copilot/tools/server/workflow/edit-workflow.ts | 4/5 | Enhances workflow editing with comprehensive nested node support |
Confidence score: 3/5
- This PR involves significant architectural changes that affect core serialization logic and may have widespread implications across the codebase
- Score reflects the complexity and scope of the refactoring, particularly the structural changes to JSON sanitization that could impact other systems consuming this data
- Pay close attention to `compute-edit-sequence.ts` and `json-sanitizer.ts` as they contain the most significant structural changes
Sequence Diagram
sequenceDiagram
participant User
participant TrainingModal
participant CopilotTrainingStore
participant JSONSanitizer
participant TrainingAPI
participant AgentIndexer
User->>TrainingModal: "Open Training Modal"
TrainingModal->>CopilotTrainingStore: "Get current state"
CopilotTrainingStore-->>TrainingModal: "Return training data"
User->>TrainingModal: "Start Training Session"
TrainingModal->>CopilotTrainingStore: "startTraining(title, prompt)"
CopilotTrainingStore->>CopilotTrainingStore: "Capture start snapshot"
CopilotTrainingStore-->>TrainingModal: "Training session started"
User->>User: "Edit workflow (add/edit/delete blocks)"
User->>TrainingModal: "Stop Training"
CopilotTrainingStore->>JSONSanitizer: "sanitizeForCopilot(startState)"
JSONSanitizer-->>CopilotTrainingStore: "Sanitized start state"
CopilotTrainingStore->>JSONSanitizer: "sanitizeForCopilot(endState)"
JSONSanitizer-->>CopilotTrainingStore: "Sanitized end state"
CopilotTrainingStore->>CopilotTrainingStore: "computeEditSequence(start, end)"
CopilotTrainingStore->>CopilotTrainingStore: "Save dataset with operations"
CopilotTrainingStore-->>TrainingModal: "Dataset saved"
User->>TrainingModal: "Send Dataset to Indexer"
TrainingModal->>JSONSanitizer: "sanitizeForCopilot(input/output)"
JSONSanitizer-->>TrainingModal: "Sanitized workflow states"
TrainingModal->>TrainingAPI: "POST /api/copilot/training"
TrainingAPI->>AgentIndexer: "POST /examples/add"
AgentIndexer-->>TrainingAPI: "Success/Error response"
TrainingAPI-->>TrainingModal: "Training result"
TrainingModal->>CopilotTrainingStore: "markDatasetSent(id)"
User->>TrainingModal: "Send Live Workflow"
TrainingModal->>JSONSanitizer: "sanitizeForCopilot(currentWorkflow)"
JSONSanitizer-->>TrainingModal: "Sanitized workflow"
TrainingModal->>TrainingAPI: "POST /api/copilot/training/examples"
TrainingAPI->>AgentIndexer: "POST /examples/add"
AgentIndexer-->>TrainingAPI: "Success/Error response"
TrainingAPI-->>TrainingModal: "Result"
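The "Stop Training" step in the diagram, where two sanitized snapshots are turned into a list of operations, could look roughly like the sketch below. It is a simplified illustration under stated assumptions: `computeEditSequenceSketch`, the `Operation` shape, and the flat `WorkflowState` type are hypothetical, and nested nodes and connection changes are ignored.

```typescript
type BlockState = { type: string; inputs: Record<string, unknown> }
type WorkflowState = Record<string, BlockState>

// Hypothetical operation shape; the real operations format may differ.
type Operation =
  | { op: 'add'; blockId: string; block: BlockState }
  | { op: 'edit'; blockId: string; block: BlockState }
  | { op: 'delete'; blockId: string }

// Compare a start and end snapshot and emit delete, edit, and add operations.
function computeEditSequenceSketch(start: WorkflowState, end: WorkflowState): Operation[] {
  const ops: Operation[] = []

  // Blocks present at the start but missing at the end were deleted.
  for (const blockId of Object.keys(start)) {
    if (!(blockId in end)) ops.push({ op: 'delete', blockId })
  }

  for (const [blockId, block] of Object.entries(end)) {
    const before = start[blockId]
    if (!before) {
      ops.push({ op: 'add', blockId, block })
    } else if (JSON.stringify(before) !== JSON.stringify(block)) {
      // Naive comparison: sensitive to key order, which is acceptable for a sketch only.
      ops.push({ op: 'edit', blockId, block })
    }
  }
  return ops
}
```

Because both snapshots pass through the same sanitizer before the diff, differences caused only by UI-specific fields should not show up as edits.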
Additional Comments (1)
- apps/sim/lib/workflows/json-sanitizer.ts, line 277. style: Export sanitization uses different logic than copilot sanitization, which could lead to inconsistencies
6 files reviewed, 3 comments
…rectness (#1521)
* add state sending capability
* progress
* add ability to add title and description to workflow state
* progress in language
* fix
* cleanup code
* fix type issue
* fix subflow deletion case
* Workflow console tool
* fix lint
Co-authored-by: Siddharth Ganesan <siddharthganesan@gmail.com>
Summary
Make copilot JSON language tracking consistent across workflows and operations.
Type of Change
Testing
Manually via training modal
Checklist