Skip to content

Git clone hangs with large repo #969

Closed
@Abbyyan

Description

@Abbyyan

I've used Gitpython to clone some repos but it hangs with a specified repo with size of 17G. I've create a Pool to do git clone using Gitpython. There is a large git repo and needs more time than others to clone. Each process do a clone work for one repo. The Pool i used as follows:

  multi_res = [p.apply_async(runfunc, args=(
            incl_info, project_root, skip_dirs,)) for incl_info in incl_infos]
    LogInfo('Waiting for all subprocesses done...')
    for i in range(len(incl_infos)):
        while not multi_res[i].ready():
            LogInfo("Downloading now")
            time.sleep(5)
    p.close()
    p.join()

It works perfectly in most case. But will often hangs in the largest repo. It's wired that when i just clone the repo individually, It works fine. So i wonder if there is some block in python multiprocessing Pool at first.

I've strace the hanged git clone process . The git process output as follows:

Process 27649 attached
read(6, 0x7ffc36dae050, 4)              = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn()                          = 0
read(6, 0x7ffc36dae050, 4)              = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn()                          = 0
read(6, 0x7ffc36dae050, 4)              = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn()                          = 0
read(6, 0x7ffc36dae050, 4)              = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn()   

The git-lfs output as follows:

Process 28006 attached
[ Process PID=28006 runs in 32 bit mode. ]
futex(0x88b982c, FUTEX_WAIT_PRIVATE, 0, NULL

But when i replace the git.repo.clone_from with shell script git clone in a new subprocess, it works fine. So maybe there are some block in git.repo.clone_from, and i wonder whether it's solved. Thanks a lot.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions