Support for more compression formats for prefetching #48

Closed
maxmitti opened this issue Mar 23, 2022 · 6 comments

Comments

@maxmitti

I have enabled prefetching and it works well for most repositories. However, it doesn't work with chaotic-aur and repo-ck.
According to the log output, pacoloco assumes that the .db files are gzip compressed.
According to the file tool, one of them is XZ compressed and the other is zstd compressed.
I am not sure whether other repositories use yet other compression formats.

It would be nice if pacoloco supported the same compression formats as pacman.
It seems that pacman uses libarchive for that.

@anatol
Owner

anatol commented Mar 24, 2022

cc @Focshole

@Focshole
Contributor

Hi,
Pacoloco simply forwards pacman's requests to the upstream mirror, caches the requested package, and saves the relevant info about it to allow prefetching.
The prefetching algorithm fetches the repo.db file, looks for the relevant packages (i.e. the ones that have been requested before), then looks up the %FILENAME% section in each package's desc file. %FILENAME% states the full file name to request, including its extension, so pacoloco tries to prefetch the package with exactly the extension given there.
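The %FILENAME% lookup described above could look roughly like the following. This is a minimal sketch and not pacoloco's actual code: the function name `filenameFromDesc` and the sample desc content are made up for illustration, and it only assumes the usual desc layout of a `%HEADER%` line followed by its value on the next line.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// filenameFromDesc (hypothetical helper) returns the value that follows the
// %FILENAME% header in a desc entry, or an error if it is missing or empty.
func filenameFromDesc(desc string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(desc))
	for scanner.Scan() {
		if strings.TrimSpace(scanner.Text()) != "%FILENAME%" {
			continue
		}
		// The value is expected on the line right after the header.
		if scanner.Scan() {
			if name := strings.TrimSpace(scanner.Text()); name != "" {
				return name, nil
			}
		}
		return "", errors.New("%FILENAME% section has no value")
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("no %FILENAME% section found")
}

func main() {
	// Made-up desc content, for illustration only.
	desc := "%FILENAME%\nexample-pkg-1.0-1-x86_64.pkg.tar.zst\n\n" +
		"%NAME%\nexample-pkg\n\n%VERSION%\n1.0-1\n"

	name, err := filenameFromDesc(desc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would prefetch:", name)
}
```

Note that this only concerns the package file name. It says nothing about how the .db archive that contains the desc files is itself compressed.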

So far, I had assumed that this is also what pacman does. If pacoloco's behaviour differs from pacman's, you should see that updates for those repositories do get prefetched (there are successful prefetches from those repos in the log), but once an updated package is requested, pacoloco starts a new download because it has prefetched a file with the wrong extension.

I'm not sure what the issue is here. Do you see that behaviour, or any other errors, in the logs?

@maxmitti
Author

The issue is the compression of the repository database files (file extension .db):

pacoloco.go:362: downloading http://repo-ck.com//x86_64/repo-ck.db
repo_db_mirror.go:166: Extracting /media/old/var/cache/pacoloco/tmp-db/repo-ck.db...
repo_db_mirror.go:28: error: gzip: invalid header

pacoloco.go:362: downloading https://geo-mirror.chaotic.cx/chaotic-aur//x86_64/chaotic-aur.db
repo_db_mirror.go:166: Extracting /media/old/var/cache/pacoloco/tmp-db/chaotic-aur.db...
repo_db_mirror.go:28: error: gzip: invalid header
$ file /media/old/var/cache/pacoloco/tmp-db/repo-ck.db
/media/old/var/cache/pacoloco/tmp-db/repo-ck.db: XZ compressed data, checksum CRC64
$ file /media/old/var/cache/pacoloco/tmp-db/chaotic-aur.db 
/media/old/var/cache/pacoloco/tmp-db/chaotic-aur.db: Zstandard compressed data (v0.8+), Dictionary ID: None
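
What the file tool reports above can be approximated by checking the first bytes of the .db file against the well-known magic numbers of gzip, XZ and Zstandard. The following is only a sketch of that idea, not pacoloco code: `detectCompression` is a hypothetical name and the sample headers are synthetic.

```go
package main

import (
	"bytes"
	"fmt"
)

// Magic numbers found at the start of each compressed stream.
var magics = []struct {
	name  string
	magic []byte
}{
	{"gzip", []byte{0x1f, 0x8b}},
	{"xz", []byte{0xfd, '7', 'z', 'X', 'Z', 0x00}},
	{"zstd", []byte{0x28, 0xb5, 0x2f, 0xfd}},
}

// detectCompression (hypothetical helper) names the format whose magic number
// prefixes header, or returns "unknown" if none matches.
func detectCompression(header []byte) string {
	for _, m := range magics {
		if bytes.HasPrefix(header, m.magic) {
			return m.name
		}
	}
	return "unknown"
}

func main() {
	// Synthetic headers built from the magic numbers, not real database files.
	samples := [][]byte{
		{0x1f, 0x8b, 0x08, 0x00},
		{0xfd, '7', 'z', 'X', 'Z', 0x00, 0x00},
		{0x28, 0xb5, 0x2f, 0xfd, 0x24},
		[]byte("plain text"),
	}
	for _, s := range samples {
		fmt.Printf("% x -> %s\n", s, detectCompression(s))
	}
}
```

Detecting the format up front lets a program pick the matching decompressor directly instead of relying on the file extension, which is the same for all three formats here (.db).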

@Focshole
Contributor

Focshole commented May 11, 2022

Thank you for your report. Right now, when pacoloco has to parse .db files, it assumes they are gzip compressed, which, as I now see, is not always true. My fault. I'll have to support other formats as well. I'll look for a specification of the compression algorithms allowed for those files; the fix should be easy.
I cannot guarantee any ETA for the fix, as I am very busy at the moment.

@BigBrotherLovesYou

Hi,
Any updates on this?
I am also a chaotic-aur user, on a very low-bandwidth link with 4 Arch machines behind it. As you can imagine, I REALLY like the prefetching. But it also fails for me because the .db is compressed with zstd.

Focshole added a commit to Focshole/pacoloco that referenced this issue Apr 16, 2023
@Focshole
Contributor

I got a minute to finish this feature, let me know if it works for you! It should fall back to zstd if gzip extraction fails!

anatol pushed a commit to Focshole/pacoloco that referenced this issue Apr 18, 2023
anatol closed this as completed in ce5523a Apr 18, 2023