Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9' #901

@dejoma

Description

Describe the bug 🐛

Attempt 1 failed: BrowserType.launch: ENOSPC: no space left on device, mkdtemp '/tmp/playwright-artifacts-aNE1l9'

The browser is not closed after a run. I am running this in a Lambda function that gets called multiple times, and after a number of invocations /tmp runs out of disk space (ENOSPC), so the next browser launch fails.

See this related GitHub issue; a fix is described in its comments:
microsoft/playwright-java#526

My code 💻 🐊

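import os

from scrapegraphai.graphs import SearchGraph

# `prompt` and `ScraperOutput` are defined elsewhere in my code.
SCRAPE_CONFIG = {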
    "llm": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "openai/gpt-4o-mini",
    },
    "search_engine": "serper",
    "serper_api_key": os.environ["SERPER_API_KEY"],
    # "num_results": 5,
    "loader_kwargs": {
        # https://github.com/microsoft/playwright/issues/14023
        "args": ["--single-process", "--disable-gpu", "--disable-dev-shm-usage"],
    },
    "force": True,
    "verbose": True,
    "headless": True,
}


scraper = SearchGraph(prompt=prompt, config=SCRAPE_CONFIG, schema=ScraperOutput)  # type: ignore
scrape_results = scraper.run()

Hotfix update 🧯

As a workaround, I now delete any file or directory in /tmp once it exceeds 1 GB, and that seems to work for now.

import os
import shutil


def cleanup_temp_files():
    """Clean up large temporary files and directories in /tmp.

    This function checks for files/directories larger than 1GB in /tmp
    and removes them to prevent disk space issues.
    """
    cleaned_paths = []
    ONE_GB = 1024 * 1024 * 1024  # 1GB in bytes

    try:
        # Get all items in /tmp
        tmp_items = os.listdir("/tmp")

        for item in tmp_items:
            full_path = os.path.join("/tmp", item)
            try:
                # Get size of file/directory
                if os.path.isdir(full_path):
                    total_size = sum(
                        os.path.getsize(os.path.join(dirpath, filename))
                        for dirpath, _, filenames in os.walk(full_path)
                        for filename in filenames
                    )
                else:
                    total_size = os.path.getsize(full_path)

                # Remove if larger than 1GB
                if total_size > ONE_GB:
                    if os.path.isdir(full_path):
                        shutil.rmtree(full_path)
                    else:
                        os.remove(full_path)
                    cleaned_paths.append(f"{full_path} ({total_size / ONE_GB:.2f}GB)")

            except Exception as e:
                print(f"Failed to process {full_path}: {e}")

    except Exception as e:
        print(f"Error accessing /tmp directory: {e}")

    if cleaned_paths:
        print(f"Cleaned up {len(cleaned_paths)} large files/directories: {cleaned_paths}")
    return cleaned_paths
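
One way to wire this hotfix into a repeatedly invoked function is to run the cleanup in a `finally` block, so it executes even when the scrape raises. The sketch below is only an illustration: `run_with_cleanup` and `free_mb` are hypothetical helper names that do not exist in Scrapegraph-ai or Playwright, and the free-space check is just a convenience for logging.

```python
import shutil
from typing import Callable, TypeVar

T = TypeVar("T")


def free_mb(path: str = "/tmp") -> float:
    """Return the free space on the filesystem that holds `path`, in MiB."""
    return shutil.disk_usage(path).free / (1024 * 1024)


def run_with_cleanup(job: Callable[[], T], cleanup: Callable[[], object]) -> T:
    """Run `job`, then always call `cleanup`, even if `job` raises."""
    try:
        return job()
    finally:
        # Runs on success and on failure, so leftovers never accumulate
        # across invocations just because one run crashed.
        cleanup()
```

With the code from this issue, the call would look like `scrape_results = run_with_cleanup(scraper.run, cleanup_temp_files)`, and `free_mb()` could be printed before and after to confirm that space is actually being reclaimed.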

My log file:

Cleaned up 2 large files/directories: ['/tmp/core.headless_shell.5405 (1.01GB)', '/tmp/core.headless_shell.5910 (1.01GB)']
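
Judging by their names, the two files in this log appear to be core dumps from crashed `headless_shell` processes, not ordinary Playwright artifacts. If that reading is right, a more targeted alternative to the 1 GB threshold is to stop core dumps from being written and to remove known leftovers by name. The following is a minimal sketch under those assumptions: `disable_core_dumps` and `remove_browser_leftovers` are hypothetical helpers, the two glob patterns are taken only from the paths shown in this issue, and whether lowering `RLIMIT_CORE` is honored in a given Lambda runtime is untested here.

```python
import glob
import os
import resource
import shutil
from typing import List


def disable_core_dumps() -> None:
    """Set the core-file size limit to 0 for this process.

    Child processes inherit the limit. Lowering the hard limit cannot be
    undone within the same process.
    """
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def remove_browser_leftovers(tmp_dir: str = "/tmp") -> List[str]:
    """Delete core dumps and Playwright artifact dirs, regardless of size."""
    # Assumption: these patterns match the paths seen in this issue only.
    patterns = ("core.headless_shell.*", "playwright-artifacts-*")
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(tmp_dir, pattern)):
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path, ignore_errors=True)
            else:
                os.remove(path)
            removed.append(path)
    return sorted(removed)
```

`disable_core_dumps()` would be called once at startup, before any browser is launched. `remove_browser_leftovers()` should only run between invocations, because deleting an artifacts directory that a live browser is still using could break that browser.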

Activity

VinciGit00 commented on Jan 24, 2025

Collaborator

OK @dejoma, can you open a pull request, please?

dosubot commented on Apr 25, 2025

Hi, @dejoma. I'm Dosu, and I'm helping the Scrapegraph-ai team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The issue involves a failure in launching a browser using the Playwright library due to insufficient disk space.
  • The problem arises because the browser does not close properly, leading to memory exhaustion when a lambda function is repeatedly called.
  • A temporary fix involves cleaning up temporary files larger than 1GB to mitigate disk space issues.
  • @VinciGit00 has requested you to make a pull request for the fix.

Next Steps:

  • Please let us know if this issue is still relevant to the latest version of the Scrapegraph-ai repository. If so, you can keep the discussion open by commenting on the issue.
  • Otherwise, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

Label "stale" (Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed) added on Apr 25, 2025.
Label "stale" removed on May 2, 2025.
Metadata

Repository: ScrapeGraphAI/Scrapegraph-ai (Issue #901)
Labels: bug (Something isn't working)
Assignees: none
Participants: @dejoma, @VinciGit00
Development: no branches or pull requests