Description
I've used Gitpython to clone some repos but it hangs with a specified repo with size of 17G. I've create a Pool to do git clone using Gitpython. There is a large git repo and needs more time than others to clone. Each process do a clone work for one repo. The Pool i used as follows:
multi_res = [p.apply_async(runfunc, args=(
incl_info, project_root, skip_dirs,)) for incl_info in incl_infos]
LogInfo('Waiting for all subprocesses done...')
for i in range(len(incl_infos)):
while not multi_res[i].ready():
LogInfo("Downloading now")
time.sleep(5)
p.close()
p.join()
It works perfectly in most case. But will often hangs in the largest repo. It's wired that when i just clone the repo individually, It works fine. So i wonder if there is some block in python multiprocessing Pool at first.
I've strace the hanged git clone process . The git process output as follows:
Process 27649 attached
read(6, 0x7ffc36dae050, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn() = 0
read(6, 0x7ffc36dae050, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn() = 0
read(6, 0x7ffc36dae050, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn() = 0
read(6, 0x7ffc36dae050, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL, si_value={int=2895997, ptr=0x2c307d}} ---
rt_sigreturn()
The git-lfs output as follows:
Process 28006 attached
[ Process PID=28006 runs in 32 bit mode. ]
futex(0x88b982c, FUTEX_WAIT_PRIVATE, 0, NULL
But when i replace the git.repo.clone_from
with shell script git clone
in a new subprocess, it works fine. So maybe there are some block in git.repo.clone_from
, and i wonder whether it's solved. Thanks a lot.