Re-evaluating Internal Message Handling in SocietyOfMindAgent (v0.4+) #6123

@SongChiYoung

What happened?

Original report (kept for history). This issue started as "SocietyOfMindAgent leaking inner monologues to outer team".

Describe the bug
When using a SocietyOfMindAgent inside a GroupChat, messages from the inner team (e.g., inner agents like agent1, agent2) are exposed to the outer GroupChat stream, rather than being contained within the internal reasoning. This leads to unexpected messages being surfaced to outer-level agents and logic, breaking isolation and causing potential routing/termination issues.

To Reproduce
Run the following minimal example:

import asyncio
from autogen_agentchat.agents import AssistantAgent, SocietyOfMindAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.ui import Console

async def main():
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    agent1 = AssistantAgent("agent1", model_client=model_client, system_message="You are a writer.")
    agent2 = AssistantAgent("agent2", model_client=model_client, system_message="You are a critic.")

    inner_team = RoundRobinGroupChat(
        participants=[agent1, agent2],
        termination_condition=MaxMessageTermination(2)
    )

    society_agent = SocietyOfMindAgent("society", team=inner_team, model_client=model_client)

    outer_agent = AssistantAgent("translator", model_client=model_client, system_message="Translate to Spanish.")

    team = RoundRobinGroupChat(
        participants=[society_agent, outer_agent],
        termination_condition=MaxMessageTermination(2)
    )

    await Console(team.run_stream(task="Write a short story."))

asyncio.run(main())

💥 Console output I anticipated from the leak, showing all intermediate messages from the inner team:

user: Write a short story.
agent1: Once upon a time...
agent2: Needs more drama.
society: Here's the revised version.
translator: Aquí está la historia.

💥 Actual Console output, which shows intermediate messages from the inner team (and termination does not work as expected):

user: Write a short story.
agent1: Once upon a time... 
society: Here's the revised version.

The behavior I would expect instead:

user: Write a short story.
>>> (inner, may be hidden) agent1: Once upon a time...
>>> (inner, may be hidden) agent2: Needs more drama.
society: Here's the revised version.
translator: Aquí está la historia.

Expected behavior

The SocietyOfMindAgent should:

  • ✅ Internally run its team and return a single Response
  • ✅ Prevent inner messages from leaking into the outer GroupChat stream
  • ⚙️ Optionally log intermediate messages to Console, but not expose them as ChatMessages to the rest of the system
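
To make the expectation above concrete, here is a minimal sketch of the boundary I have in mind. It does not use the AutoGen classes: `Message`, `Response`, `run_inner_team`, and `append_to_outer_thread` are hypothetical stand-ins written only for illustration, and the inner team's output is hard-coded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    source: str
    content: str


@dataclass
class Response:
    # Hypothetical stand-in for a Response: one final message plus the
    # inner team's intermediate messages.
    chat_message: Message
    inner_messages: List[Message] = field(default_factory=list)


def run_inner_team(task: str) -> Response:
    # Hard-coded stand-in for running the inner team and summarizing its work.
    inner = [
        Message("agent1", "Once upon a time..."),
        Message("agent2", "Needs more drama."),
    ]
    return Response(Message("society", "Here's the revised version."), inner)


def append_to_outer_thread(thread: List[Message], response: Response) -> None:
    # Only the final message crosses the boundary. inner_messages remain on
    # the Response (e.g. for logging) but never enter the outer thread.
    thread.append(response.chat_message)


outer_thread: List[Message] = [Message("user", "Write a short story.")]
response = run_inner_team(outer_thread[-1].content)
append_to_outer_thread(outer_thread, response)
sources = [m.source for m in outer_thread]
print(sources)
```

With this shape, an outer termination condition that counts messages in the thread would see only `user` and `society`, while the two inner messages are still reachable through `response.inner_messages`.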

Additional context

This behavior breaks message encapsulation, which is particularly problematic when using nested teams.

These changes would make SocietyOfMindAgent much more robust and suitable for nested orchestration scenarios.


Question

Is this behavior (inner messages leaking into the outer GroupChat) intended?
Or is this something that might be considered a design oversight or bug?

Happy to propose a patch once the expected behavior is clarified. 🙌

I was able to investigate the core behavior of SocietyOfMindAgent more deeply. Since the original design intentions were not fully clear from the documentation alone, I compared the current implementation to the earlier version from v0.2.

Through this comparison, I confirmed that the current behavior introduces four functional regressions that did not exist before. Additionally, I identified four more architectural concerns introduced in recent versions.

I've summarized them in the table below for review. I would really appreciate any feedback or validation on this analysis.


🔍 SocietyOfMindAgent: Design Issues and Historical Comparison (v0.2 vs v0.4+)

✅ P1–P4 Regression Issue Table (Updated with Fixes in PR #6142)

| ID | Description | Current v0.4+ Issue | Resolution in PR #6142 | Was it a problem in v0.2? | Notes |
| --- | --- | --- | --- | --- | --- |
| P1 | inner_messages leaks into outer team termination evaluation | Response.inner_messages is appended to the outer team's _message_thread, affecting termination conditions. Violates encapsulation. | inner_messages is excluded from _message_thread, avoiding contamination of outer termination logic. | ❌ No | Structural boundary is now enforced |
| P2 | Inner team does not execute when outer message history is empty | In chained executions, if no new outer message exists, no task is created and the inner team is skipped entirely | ✅ Detects absence of new outer message and reuses the previous task, passing it via a handoff message. This ensures the inner team always receives a valid task to execute | ❌ No | The issue was silent task omission, not summary failure. Summary succeeds as a downstream effect |
| P3 | Summary LLM prompt is built from external input only | Prompt is constructed using external message history, ignoring internal reasoning | ✅ Prompt construction now uses final_response.inner_messages, restoring internal reasoning as the source of summarization | ❌ No | Matches v0.2 internal monologue behavior |
| P4 | External input is included in summary prompt (possibly incorrectly) | Outer messages are used in the final LLM summarization prompt | ✅ Resolved via the same fix as P3; outer messages are no longer used for summary | ❌ No | Redundant with P3, now fully addressed |

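
As a rough illustration of the P2 and P3/P4 rows, the sketch below shows the two ideas in isolation. It is not the code from PR #6142: `select_task` and `build_summary_prompt` are hypothetical helpers, messages are modeled as plain `(source, content)` tuples, and the handoff message mentioned for P2 is not modeled.

```python
from typing import Optional, Sequence, Tuple


def select_task(new_outer_messages: Sequence[str], previous_task: Optional[str]) -> str:
    """P2: reuse the previous task when no new outer message has arrived."""
    if new_outer_messages:
        return "\n".join(new_outer_messages)
    if previous_task is None:
        raise ValueError("no task available for the inner team")
    return previous_task


def build_summary_prompt(inner_messages: Sequence[Tuple[str, str]], instruction: str) -> str:
    """P3/P4: build the summary prompt from inner messages only."""
    lines = [f"{source}: {content}" for source, content in inner_messages]
    return "\n".join(lines + ["", instruction])


first_task = select_task(["Write a short story."], None)
# Second turn: nothing new from the outer team, so the earlier task is reused.
second_task = select_task([], first_task)

prompt = build_summary_prompt(
    [("agent1", "Once upon a time..."), ("agent2", "Needs more drama.")],
    "Summarize the discussion above as a single response.",
)
print(second_task)
print(prompt)
```

The point of keeping the two helpers separate is that task selection only looks at outer input, while the summary prompt only looks at the inner transcript.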

| ID | Description | Current v0.4+ Issue | Suggested Fix | Was it a problem in v0.2? | Notes |
| --- | --- | --- | --- | --- | --- |
| E1 | Fragile count <= len(task) logic in stream parsing | Skips a fixed number of messages assuming they are tasks. Breaks with team structure changes. | Use explicit criteria like source == "user" to filter task messages | ❌ No | v0.2 had no streaming/yield logic |
| E2 | Streaming chunks (e.g. ModelClientStreamingChunkEvent) ambiguity | Some events are streamed but not stored; unclear if this is intentional | Add comments to clarify intent. Maintain current behavior. | ❌ No | v0.2 had no streaming structure. Keep current but document clearly. |
| E3 | Ambiguous task/message boundary | Outer tasks and inner messages are mixed conceptually | Clarify roles using message types or consistent tagging (e.g. source) | ❌ No | v0.2 handled outer input as "User" consistently. Just verify that continues to be true. |
| E4 | reset() may not run if exception occurs | If run_stream() fails mid-execution, reset() is skipped → potential team state corruption | Wrap reset() inside a finally block for guaranteed cleanup | ⚠️ Partially | Same logic in v0.2; lacks finally, so it's not always guaranteed either |
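
The suggested fixes for E1 and E4 could look roughly like the sketch below. `FakeTeam` is a hypothetical stand-in with `run()` and `reset()` coroutines, not the AutoGen team class, and messages are plain `(source, content)` tuples; the only things being shown are filtering task messages by source instead of by count, and calling `reset()` from a `finally` block.

```python
import asyncio
from typing import List, Tuple

Msg = Tuple[str, str]  # (source, content)


class FakeTeam:
    """Hypothetical stand-in for an inner team; not the AutoGen class."""

    def __init__(self, fail: bool) -> None:
        self.fail = fail
        self.reset_calls = 0

    async def run(self, task: str) -> List[Msg]:
        if self.fail:
            raise RuntimeError("inner team failed mid-run")
        return [("user", task), ("agent1", "Once upon a time...")]

    async def reset(self) -> None:
        self.reset_calls += 1


def drop_task_messages(messages: List[Msg]) -> List[Msg]:
    # E1: filter on an explicit criterion (the source) instead of skipping
    # the first len(task) messages.
    return [(source, content) for source, content in messages if source != "user"]


async def run_and_reset(team: FakeTeam, task: str) -> List[Msg]:
    # E4: reset() runs whether or not run() raises.
    try:
        return drop_task_messages(await team.run(task))
    finally:
        await team.reset()


async def demo() -> Tuple[List[Msg], int, int]:
    ok_team, bad_team = FakeTeam(fail=False), FakeTeam(fail=True)
    result = await run_and_reset(ok_team, "Write a short story.")
    try:
        await run_and_reset(bad_team, "Write a short story.")
    except RuntimeError:
        pass  # the failure still propagates, but reset() has already run
    return result, ok_team.reset_calls, bad_team.reset_calls


result, ok_resets, bad_resets = asyncio.run(demo())
print(result, ok_resets, bad_resets)
```

Note that `finally` does not swallow the exception; the caller still sees the failure, but the team is reset before it does.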

Possible error (more information needed)

| ID | Description | Current v0.4+ Issue | Suggested Fix | Was it a problem in v0.2? | Notes |
| --- | --- | --- | --- | --- | --- |
| P5 | reset() on inner team affects outer team state | SocietyOfMindAgent calls await self._team.reset(), which resets shared team instances, unintentionally clearing the outer team's state | Ensure the inner team is a separate instance (e.g., deep copy), or isolate reset() behavior to avoid cross-team interference | Not checked | Not checked; however, I believe it is an error. |

On reflection, this may be fine: each model context retains its own history, so the current behavior is probably correct.


If there are no objections to my conclusions, I would like to open a DRAFT PR to begin addressing these issues.

Since this agent is critical for a production use case I'm working on, I’m highly motivated to contribute toward improving its reliability.

Looking forward to your feedback—thank you!

Which packages was the bug in?

Python AgentChat (autogen-agentchat>=0.4.0)

AutoGen library version.

Python dev (main branch)
