Status: Not planned
Labels: bug (Something isn't working)
Description
Describe the bug 🐛
Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9'
The browser is not closed after a run. I'm running this in a Lambda function that gets invoked multiple times, and the leftover browser artifacts eventually fill up /tmp until it runs out of space (ENOSPC).
See the related GitHub issue and the fix in its comments:
microsoft/playwright-java#526
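Since the error is ENOSPC (disk, not RAM), a cheap guard before each invocation can surface the problem early. The helper below is a sketch of my own (`tmp_space_low` is not part of ScrapeGraphAI or Playwright); it only uses the standard library to check free space on /tmp:

```python
import shutil

def tmp_space_low(threshold_bytes: int = 512 * 1024 * 1024) -> bool:
    """Return True if /tmp has less free space than `threshold_bytes`."""
    usage = shutil.disk_usage("/tmp")  # (total, used, free) in bytes
    return usage.free < threshold_bytes
```

In a Lambda handler this could gate a call to the cleanup hotfix shown further down, instead of waiting for `BrowserType.launch` to fail.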
My code 💻 🐊
SCRAPE_CONFIG = {
    "llm": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "openai/gpt-4o-mini",
    },
    "search_engine": "serper",
    "serper_api_key": os.environ["SERPER_API_KEY"],
    # "num_results": 5,
    "loader_kwargs": {
        # https://github.com/microsoft/playwright/issues/14023
        "args": ["--single-process", "--disable-gpu", "--disable-dev-shm-usage"],
    },
    "force": True,
    "verbose": True,
    "headless": True,
}
scraper = SearchGraph(prompt=prompt, config=SCRAPE_CONFIG, schema=ScraperOutput) # type: ignore
scrape_results = scraper.run()
Hotfix update 🧯
So I've tried emptying directories in /tmp once they surpass 1.0 GB, and it seems to work for now.
import os
import shutil

def cleanup_temp_files():
    """Clean up large temporary files and directories in /tmp.

    This function checks for files/directories larger than 1GB in /tmp
    and removes them to prevent disk space issues.
    """
    cleaned_paths = []
    ONE_GB = 1024 * 1024 * 1024  # 1GB in bytes
    try:
        # Get all items in /tmp
        for item in os.listdir("/tmp"):
            full_path = os.path.join("/tmp", item)
            try:
                # Get total size of the file or directory tree
                if os.path.isdir(full_path):
                    total_size = sum(
                        os.path.getsize(os.path.join(dirpath, filename))
                        for dirpath, _, filenames in os.walk(full_path)
                        for filename in filenames
                    )
                else:
                    total_size = os.path.getsize(full_path)
                # Remove if larger than 1GB
                if total_size > ONE_GB:
                    if os.path.isdir(full_path):
                        shutil.rmtree(full_path)
                    else:
                        os.remove(full_path)
                    cleaned_paths.append(f"{full_path} ({total_size / ONE_GB:.2f}GB)")
            except Exception as e:
                print(f"Failed to process {full_path}: {e}")
    except Exception as e:
        print(f"Error accessing /tmp directory: {e}")
    if cleaned_paths:
        print(f"Cleaned up {len(cleaned_paths)} large files/directories: {cleaned_paths}")
    return cleaned_paths
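To sanity-check the hotfix without waiting for a real 1 GB core dump, here is a self-contained variant I parameterized by root directory and threshold (the `root`/`threshold` parameters and the scratch-directory exercise are my additions, not the Lambda code above):

```python
import os
import shutil
import tempfile

def cleanup_large(root: str, threshold: int) -> list[str]:
    """Remove entries under `root` whose total size exceeds `threshold` bytes."""
    cleaned = []
    for item in os.listdir(root):
        full_path = os.path.join(root, item)
        try:
            if os.path.isdir(full_path):
                # Sum every file in the directory tree
                total = sum(
                    os.path.getsize(os.path.join(dirpath, name))
                    for dirpath, _, names in os.walk(full_path)
                    for name in names
                )
            else:
                total = os.path.getsize(full_path)
            if total > threshold:
                if os.path.isdir(full_path):
                    shutil.rmtree(full_path)
                else:
                    os.remove(full_path)
                cleaned.append(full_path)
        except OSError as exc:
            print(f"Failed to process {full_path}: {exc}")
    return cleaned

# Exercise it against a scratch directory with a tiny (10-byte) threshold.
scratch = tempfile.mkdtemp()
big = os.path.join(scratch, "core.headless_shell.1234")
with open(big, "wb") as fh:
    fh.write(b"x" * 64)  # 64 bytes, over the threshold: should be removed
small = os.path.join(scratch, "keep.txt")
with open(small, "wb") as fh:
    fh.write(b"x")       # 1 byte, under the threshold: should be kept
removed = cleanup_large(scratch, threshold=10)
```

The same logic with `root="/tmp"` and `threshold=ONE_GB` matches the hotfix above.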
My log file:
Cleaned up 2 large files/directories: ['/tmp/core.headless_shell.5405 (1.01GB)', '/tmp/core.headless_shell.5910 (1.01GB)']
Activity
VinciGit00 commented on Jan 24, 2025
ok @dejoma can you make the pull request please?
dosubot commented on Apr 25, 2025
Hi, @dejoma. I'm Dosu, and I'm helping the Scrapegraph-ai team manage their backlog. I'm marking this issue as stale.
Issue Summary:
Next Steps:
Thank you for your understanding and contribution!