@apexlnc
Contributor

@apexlnc apexlnc commented Oct 18, 2025

Improves the LangGraph executor HITL implementation with structured decision support and event-based synchronization, plus fixes for checkpoint duplicate key errors.

Controller (Go):

  • Fix checkpoint duplicate key errors using GORM's OnConflict{DoNothing: true}
  • Use errors.Is() for proper wrapped error detection

kagent-core:

  • Add event-based task save synchronization (wait_for_save())
  • Replace arbitrary sleep delays with reactive event signaling

kagent-langgraph:

  • DataPart Support: Check structured DataPart for decision_type before text parsing
  • Event-based sync: Use wait_for_save() instead of asyncio.sleep(0.5)
  • Bug fix: Properly access part.root for RootModel types
  • Complete HITL implementation with interrupt/resume logic

Tested end-to-end with live HITL workflows. Follows A2A protocol patterns from deepagents and uses idiomatic GORM patterns for database operations.
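The event-based synchronization described above can be pictured with a small sketch. This is not the kagent-core implementation: the `TaskStoreSketch` class, its in-memory storage, and the `timeout` parameter are assumptions made for illustration. Only the name `wait_for_save()` and the idea of replacing a fixed `asyncio.sleep(0.5)` with a signal come from the description.

```python
import asyncio


class TaskStoreSketch:
    """Hypothetical in-memory task store that signals when a save completes."""

    def __init__(self) -> None:
        self._tasks: dict[str, dict] = {}
        self._saved: dict[str, asyncio.Event] = {}

    def _event_for(self, task_id: str) -> asyncio.Event:
        # Create the event lazily so savers and waiters share one instance.
        return self._saved.setdefault(task_id, asyncio.Event())

    async def save(self, task_id: str, task: dict) -> None:
        await asyncio.sleep(0.01)  # stand-in for the real persistence call
        self._tasks[task_id] = task
        self._event_for(task_id).set()  # wake anything blocked in wait_for_save()

    async def wait_for_save(self, task_id: str, timeout: float = 5.0) -> None:
        # Returns as soon as save() signals; raises a timeout error otherwise.
        await asyncio.wait_for(self._event_for(task_id).wait(), timeout)


async def demo() -> dict:
    store = TaskStoreSketch()
    saver = asyncio.create_task(store.save("t1", {"state": "input-required"}))
    await store.wait_for_save("t1")  # no fixed sleep: resumes right after the save
    await saver
    return store._tasks["t1"]


result = asyncio.run(demo())
```

Compared with a fixed delay, the waiter resumes as soon as the save finishes, and a save that never completes surfaces as a timeout error instead of silently racing ahead.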

@apexlnc
Contributor Author

apexlnc commented Oct 18, 2025

cc @EItanya @yuval-k @ilackarms

@apexlnc apexlnc force-pushed the feat/langgraph-hitl-improvements branch 3 times, most recently from 332efa9 to 3b100c7 on October 20, 2025 at 03:26
Contributor

@EItanya EItanya left a comment

Thanks so so much for the PR, the solution is looking really cool. I just have a few nits and questions!

@apexlnc
Contributor Author

apexlnc commented Oct 20, 2025

@EItanya - I realized I hadn't pushed some changes I had locally. They're pushed now, and I've addressed some of your other comments.

"app_name": self.app_name,
},
"project_name": self.app_name,
"run_name": "kagent-langgraph-resume",
Collaborator

are there issues in having the name be constant?

Contributor

You didn't answer this

Contributor Author

I kept it constant to match the initial execution pattern (line 96 also uses constant "kagent-langgraph-exec").

The run_name is used for LangSmith/tracing to categorize run types. Since tags already include dynamic values (task_id, context_id, thread_id), I figured the run_name should be a constant category label. I'm not 100% certain this is correct. Should run_name include unique identifiers? Or is constant appropriate for grouping trace types? I can change it to f"kagent-langgraph-resume-{task_id}" if that's better for observability.
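The trade-off discussed here (a constant `run_name` as a category label, with per-run identifiers carried in tags) could look roughly like the sketch below. The `build_run_config` helper, the `key:value` tag format, and the nesting of `app_name` under `metadata` are assumptions for illustration. Only the key names and the two run-name strings appear in the PR.

```python
RUN_NAME_EXEC = "kagent-langgraph-exec"
RUN_NAME_RESUME = "kagent-langgraph-resume"


def build_run_config(app_name: str, task_id: str, context_id: str,
                     thread_id: str, resume: bool = False) -> dict:
    """Build a tracing config with a constant run_name and per-run tags."""
    return {
        # Dynamic identifiers go in tags so individual runs stay searchable.
        "tags": [
            f"task_id:{task_id}",
            f"context_id:{context_id}",
            f"thread_id:{thread_id}",
        ],
        "metadata": {"app_name": app_name},  # assumed nesting
        "project_name": app_name,
        # Constant label: groups all resume (or all initial) runs together.
        "run_name": RUN_NAME_RESUME if resume else RUN_NAME_EXEC,
    }


config = build_run_config("my-agent", "task-1", "ctx-1", "thread-1", resume=True)
```

With this split, filtering traces by `run_name` gives every resume run, while filtering by a tag such as `task_id:task-1` isolates a single task.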

# Determine decision from message
message_text = context.get_user_input().lower()

if "approved" in message_text or "proceed" in message_text:
Collaborator

would this be meant for approve/deny scenarios only? what if we want to ask for specific input from the user and resume execution with that?

I think the decisions (e.g. "approve"/"deny") should be constants coming from the client -- clicking Approve/Deny (or whatever UX we go with) should automatically send True/False or some enum that we know will always signal an approve/deny decision, so we don't have to check for specific strings in message.

Contributor Author

Agreed, but going to leave TextPart as a fallback for now, if that's ok
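A minimal sketch of the two-tier approach agreed on in this thread: check a structured part first, fall back to keyword matching on text, and deny when nothing matches. The `TextPartSketch` and `DataPartSketch` classes are simplified stand-ins, not the A2A SDK types, and the `"approve"`/`"deny"` values are assumed. The `decision_type` key and the `"approved"`/`"proceed"` keywords come from the PR.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextPartSketch:
    text: str


@dataclass
class DataPartSketch:
    data: dict


Part = Union[TextPartSketch, DataPartSketch]

APPROVE = "approve"  # assumed constant values
DENY = "deny"
APPROVE_KEYWORDS = ("approved", "proceed")


def detect_decision(parts: list[Part]) -> str:
    # Tier 1: a structured decision sent by the client always wins.
    for part in parts:
        if isinstance(part, DataPartSketch):
            decision = part.data.get("decision_type")
            if decision in (APPROVE, DENY):
                return decision
    # Tier 2: fall back to keyword matching on free text.
    for part in parts:
        if isinstance(part, TextPartSketch):
            text = part.text.lower()
            if any(keyword in text for keyword in APPROVE_KEYWORDS):
                return APPROVE
    # Nothing recognizable: deny by default.
    return DENY
```

The text tier is inherently fragile (a message such as "not approved" still contains "approved"), which is the reviewer's argument for having the client send a structured decision.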

apexlnc added a commit to apexlnc/kagent that referenced this pull request Oct 21, 2025
Address maintainer feedback by standardizing HITL (Human-in-the-Loop)
functionality in kagent-core to enable uniform interrupt handling across
all executor types (LangGraph, CrewAI, ADK).

Changes:
- Add kagent-core/a2a/_hitl.py with framework-agnostic HITL types and utilities
- Add HITL constants to kagent-core/a2a/_consts.py
- Refactor kagent-langgraph executor to use core HITL utilities
- Extract backtick escaping to dedicated function
- Implement two-tier decision detection (DataPart priority, TextPart fallback)
- Change security default from approve to deny
- Fix: Use JSON for checkpoint metadata serialization (LangGraph 1.0 compatibility)
- Add 10 comprehensive tests for HITL functionality

Addresses: kagent-dev#1025 (comments kagent-dev#2, kagent-dev#4, kagent-dev#5, kagent-dev#6, kagent-dev#8)
Signed-off-by: apexlnc <43242113+apexlnc@users.noreply.github.com>
@apexlnc apexlnc force-pushed the feat/langgraph-hitl-improvements branch 2 times, most recently from 2c1ab75 to c37cf69 on October 21, 2025 at 19:03
@apexlnc apexlnc requested review from EItanya and peterj October 21, 2025 19:09
@apexlnc
Contributor Author

apexlnc commented Oct 21, 2025

@peterj @EItanya -- I decided to take the time to move the HITL logic into kagent-core so it can be reused across the ecosystem, and I addressed the rest of the comments as well.

Standardize HITL (Human-in-the-Loop) functionality in kagent-core to
enable uniform interrupt handling across all executor types (LangGraph,
CrewAI, ADK).

Changes:
- Add kagent-core/a2a/_hitl.py with framework-agnostic HITL types and utilities
- Add HITL constants to kagent-core/a2a/_consts.py with KAGENT_HITL_ prefix
- Refactor kagent-langgraph executor to use core HITL utilities
- Extract backtick escaping to dedicated function
- Implement two-tier decision detection (DataPart priority, TextPart fallback)
- Change security default from approve to deny
- Fix: Use JSON for checkpoint metadata serialization (LangGraph 1.0 compatibility)
- Add 10 comprehensive tests for HITL functionality

Signed-off-by: apexlnc <43242113+apexlnc@users.noreply.github.com>
@apexlnc apexlnc force-pushed the feat/langgraph-hitl-improvements branch from c37cf69 to 7320883 on October 22, 2025 at 10:14
@apexlnc
Contributor Author

apexlnc commented Oct 22, 2025

cc @peterj @EItanya can you kick off another GHA?

@apexlnc apexlnc changed the title from "feat: Improve LangGraph HITL with DataPart support and event-based sync" to "feat: add HITL with DataPart support and event-based sync" on Oct 22, 2025
Contributor

@EItanya EItanya left a comment

Just a couple of last questions/nits, overall looking awesome!


type_, serialized_checkpoint = self.serde.dumps_typed(checkpoint)
serialized_metadata = self.jsonplus_serde.dumps(get_checkpoint_metadata(config, metadata))
# Serialize metadata as JSON (simpler, no type needed)
Contributor

Can you explain this change a little more? I initially used that serializer in order to match the LangGraph code.

Contributor Author

  • LangGraph 1.0 changed the serializer API: .dumps() was removed and only .dumps_typed() remains, so the old call failed with AttributeError: 'JsonPlusSerializer' object has no attribute 'dumps'

Contributor Author

As for why JSON instead of .dumps_typed():

  • Go backend expects JSON - The database schema comment explicitly says: Metadata string // JSON serialized metadata
  • No metadata_type field - Go schema has checkpoint_type for checkpoints but no metadata_type for metadata
  • Cross-language - JSON can be read from both Python and Go, whereas msgpack would need type info stored alongside the payload
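The reasoning above can be illustrated with a small sketch of storing metadata as a plain JSON string. The helper names and the `default=str` fallback are assumptions, not the code merged in this PR, and the metadata keys are only example values.

```python
import json


def serialize_metadata(metadata: dict) -> str:
    """Encode checkpoint metadata as a JSON string for a single text column."""
    # default=str is an assumed fallback for values json cannot encode natively.
    return json.dumps(metadata, default=str, sort_keys=True)


def deserialize_metadata(raw: str) -> dict:
    """Decode the JSON string back into a dict."""
    return json.loads(raw)


raw = serialize_metadata({"step": 3, "source": "loop"})
restored = deserialize_metadata(raw)
```

Because the payload is self-describing JSON, no separate type column is needed, and a Go reader can decode the same string with its standard JSON package.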

LangGraph's format and delegates to the generic handler in kagent-core.
"""
# Extract interrupt details from LangGraph format
if not interrupt_data:
Contributor

In Python does this make sure it has len > 0?

Contributor Author

Yeah, empty lists are falsy in Python, so this check is true when the list is empty (or None) and the function returns early in that case.
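A quick illustration of that truthiness check. The `first_interrupt` helper is hypothetical and exists only to show the behavior of `if not interrupt_data:`.

```python
def first_interrupt(interrupt_data):
    """Return the first interrupt, or None when there is nothing to handle."""
    # `not x` is True for an empty list and for None, so both return early.
    if not interrupt_data:
        return None
    return interrupt_data[0]


empty_result = first_interrupt([])
none_result = first_interrupt(None)
value_result = first_interrupt(["needs-approval"])
```

For lists, `if not interrupt_data` is equivalent to checking `len(interrupt_data) == 0`, with the added benefit that it does not raise when the value is `None`.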

@apexlnc
Contributor Author

apexlnc commented Oct 24, 2025

Just a couple of last questions/nits, overall looking awesome!

Let me know if you have any other questions!

@EItanya EItanya merged commit e6bedf6 into kagent-dev:main Oct 27, 2025
17 checks passed
killjoycircuit pushed a commit to killjoycircuit/kagent that referenced this pull request Nov 1, 2025
…#1025)

Improves the LangGraph executor HITL implementation with structured
decision support and event-based synchronization, plus fixes for
checkpoint duplicate key errors.

Controller (Go):
- Fix checkpoint duplicate key errors using GORM's OnConflict{DoNothing:
true}
- Use errors.Is() for proper wrapped error detection

kagent-core:
- Add event-based task save synchronization (wait_for_save())
- Replace arbitrary sleep delays with reactive event signaling

kagent-langgraph:
- DataPart Support: Check structured DataPart for decision_type before
text parsing
- Event-based sync: Use wait_for_save() instead of asyncio.sleep(0.5)
- Bug fix: Properly access part.root for RootModel types
- Complete HITL implementation with interrupt/resume logic

Tested end-to-end with live HITL workflows. Follows A2A protocol
patterns from deepagents and uses idiomatic GORM patterns for database
operations.

Signed-off-by: apexlnc <43242113+apexlnc@users.noreply.github.com>
Signed-off-by: killjoycircuit <rutujdhawale@gmail.com>