reduce Upload time for larger datasets #301

Open
frsommer opened this issue Nov 13, 2024 · 2 comments

@frsommer

Is your feature request related to a problem? Please describe.
When committing larger datasets, the commit takes forever. git status seems to run every few minutes to check for changes, and each run also takes a very long time
(git status or git status -z -u ...).
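
To put numbers on the slowdown described above, the status call can be timed outside ARCitect. The following is a minimal sketch, not part of ARCitect: `time_command` is a hypothetical helper name, and it only assumes that `git` is on the PATH when it is pointed at a git command.

```python
import subprocess
import sys
import time


def time_command(cmd, cwd=None):
    """Run cmd once and return (elapsed_seconds, return_code).

    Output is discarded because only the duration is of interest here.
    """
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode
```

Calling `time_command(["git", "status", "-z", "-u"], cwd=repo)` and `time_command(["git", "status", "-z", "-uno"], cwd=repo)` in the same ARC compares the status call with and without the scan for untracked files. A large gap between the two would suggest that the untracked-file scan dominates the runtime.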

Describe the solution you'd like
When committing, reduce how often the status lookup (or whatever causes the delay) runs, to speed things up.

Describe alternatives you've considered
With ARC Commander this is a single workflow; with ARCitect I have to wait for the commit to finish and then sync.

@github-actions github-actions bot added the Status: Needs Triage This item is up for investigation. label Nov 13, 2024
@Hannah-Doerpholz
Contributor

Perhaps related to this problem: we tried to upload two files of about 45 GB in size with LFS and could not manage to do it (we tried uploading either file as the only change in a commit). I can see that ARCitect tried to upload the file multiple times, but it eventually failed with an i/o timeout. I used ARCitect v0.0.49 on Ubuntu 22.04. This is the log (I have replaced the actual token with <token> in this issue):

git branch
* hannah
main
git push --verbose --atomic --progress origin hannah
Pushing to https://git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git
warning: current Git remote contains credentials
Locking support detected on remote "origin". Consider enabling it with:
$ git config lfs.https://Usadellab:<token>@git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git/info/lfs.locksverify true
Uploading LFS objects: 0% (0/1), 0 B | 0 B/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 41 GB | 46 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 83 GB | 47 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 124 GB | 46 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 165 GB | 47 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 206 GB | 46 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 248 GB | 47 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 289 GB | 47 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 330 GB | 46 MB/s
warning: current Git remote contains credentials
Uploading LFS objects: 0% (0/1), 372 GB | 46 MB/s
LFS: Put "https://git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git/gitlab-lfs/objects/fbaf9c6db32cb2f1e4dd3b50bb3b2c052d1a867d4f133bc2f327a755f6249c91/41279675277": read tcp 134.94.68.91:58176->132.230.102.154:443: i/o timeout
error: failed to push some refs to 'https://git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git'
git status
On branch hannah
nothing to commit, working tree clean
git status -z -u
git remote -v
origin https://Usadellab:<token>@git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git (fetch)
origin https://Usadellab:<token>@git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git (push)
git branch
* hannah
main
git rev-parse HEAD
02984ea02edc5fc372ee360e84d0c403c21198e7
git ls-remote https://Usadellab:<token>@git.nfdi4plants.org/usadellab/ribes_nigrum_genome.git -h refs/heads/hannah
befed118aa8e4109c5c6684f0902238a66ead9f3 refs/heads/hannah
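
The log above can be used to estimate how often the upload was retried. The following is a rough sketch under two assumptions that are not confirmed in this thread: the trailing number in the gitlab-lfs PUT URL is the object size in bytes, and the progress counter uses decimal units (1 GB = 10^9 bytes) and keeps accumulating across retries. `estimate_attempts` is a hypothetical helper name.

```python
import re

# Progress lines such as "Uploading LFS objects: 0% (0/1), 372 GB | 46 MB/s"
PROGRESS_RE = re.compile(
    r"Uploading LFS objects:.*?,\s*([\d.]+)\s*(B|KB|MB|GB|TB)\s*\|"
)
# Assumption: the last path segment of the PUT URL is the object size in bytes
PUT_RE = re.compile(r"/gitlab-lfs/objects/[0-9a-f]{64}/(\d+)")

# Assumption: the progress display uses decimal (SI) units
UNIT = {"B": 1, "KB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12}


def estimate_attempts(log_text):
    """Return (bytes_sent, object_size, ratio), or None if the log lacks data.

    ratio is the largest reported transfer volume divided by the object size,
    i.e. roughly how many times the object was sent.
    """
    sent = [float(value) * UNIT[unit] for value, unit in PROGRESS_RE.findall(log_text)]
    size_match = PUT_RE.search(log_text)
    if not sent or size_match is None:
        return None
    object_size = int(size_match.group(1))
    bytes_sent = max(sent)
    return bytes_sent, object_size, bytes_sent / object_size
```

For the log in this comment, the last progress line reports 372 GB against an object size of 41279675277 bytes, a ratio of about 9. Under the assumptions above, that is consistent with roughly nine passes over the same file before the i/o timeout.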

@JonasLukasczyk JonasLukasczyk self-assigned this Nov 19, 2024
@JonasLukasczyk JonasLukasczyk added Type: Bug Something is not working, and it is confirmed by maintainers to be a bug. and removed Status: Needs Triage This item is up for investigation. labels Nov 19, 2024
@JonasLukasczyk
Collaborator

This looks like a DataHub problem. I will get in touch with the administrators.
