Limit what files and folders FileSurfer can access.#6024
Conversation
python/packages/autogen-ext/src/autogen_ext/agents/file_surfer/_markdown_file_browser.py
Outdated
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6024 +/- ##
==========================================
- Coverage 76.73% 76.71% -0.03%
==========================================
Files 191 191
Lines 13170 13191 +21
==========================================
+ Hits 10106 10119 +13
- Misses 3064 3072 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
husseinmozannar
left a comment
There was a problem hiding this comment.
Not sure this is working, is there a change for how filesurfer is instantiated?
file_surfer = FileSurfer(
name="file_surfer",
model_client=model_client_file_surfer,
base_path="home/path1"
)
I was able to get file_surfer to read files in path "home"
This is the correct signature for creating the FileSurfer. My minimal test code is: import asyncio
import sys
from autogen_agentchat.agents import UserProxyAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.agents.file_surfer import FileSurfer
async def main() -> None:
model_client = OpenAIChatCompletionClient(model="gpt-4o")
file_surfer = FileSurfer(
name="FileSurfer",
model_client = model_client,
base_path=None #"/home/afourney/repos/autogen/python"
)
user_proxy = UserProxyAgent("user_proxy")
termination = TextMentionTermination("exit", sources=["user_proxy"])
team = RoundRobinGroupChat([file_surfer, user_proxy], termination_condition=termination)
try:
# Start the team and wait for it to terminate.
await Console(team.run_stream(task=sys.argv[1]))
finally:
await file_surfer.close()
await model_client.close()
asyncio.run(main()) |
I tested again with a few different paths and it's now giving file not found errors which means it is working correctly right? I think in my earlier testing I messed up the paths. Another thing which might be worth possibly doing is making base_path look like path "." in the filesurfer prompt. For instance, I start filesurfer with base_path="/home/USER/autogen/python" if I want it to open "/home/USER/autogen/python/autogen/README.md" it has to do: [FunctionCall(id='CALL ID', arguments='{"path":"/home/USER/autogen/python/autogen/README.md"}', name='open_path')] ideally it would be: arguments='{"path":"./autogen/README.md"}' This also avoids having the LLM know your exact file structure if you want to hide that. |
Yeah, I did FileNotFound errors to not leak information -- but I'm not sure that's strictly necessary. An out of bounds error could trigger for any path outside the base, regardless of if it exists. As for truncating the path... that's a good idea. We leak information otherwise. |
@husseinmozannar I'm going to do the relative path stuff in another PR. For various reasons, it's a little more complex/deeper integrated, and I want to take care to do that right. At least for now, it's self-consistent and closes one avenue of risk. |
Optionally limit what files and folders FileSurfer can access (constraining it to a subtree of the FS).
This is not a replacement for Docker sandboxing, but can be used in conjunction with sandboxing to help prevent FileSurfer from accessing sensitive files.