Improve debian package detection #3723

Merged: 1 commit, Apr 12, 2024


AyanSinhaMahapatra
Member

@AyanSinhaMahapatra AyanSinhaMahapatra commented Apr 4, 2024

  • Detect and store more attributes from debian .dsc metadata files
  • Also properly detect and create packages from control and md5sums files.
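
As an illustration of the kind of metadata involved, here is a minimal sketch of reading one deb822-style paragraph (the `Field: value` syntax used by `.dsc` and `control` files) and an `md5sums` listing. The function names `parse_deb822_paragraph` and `parse_md5sums` are hypothetical and are not taken from this PR; the sketch assumes unsigned UTF-8 text (no PGP armor) and is not the implementation merged here.

```python
from typing import Dict, List, Tuple


def parse_deb822_paragraph(text: str) -> Dict[str, str]:
    """Parse the first 'Field: value' paragraph of a deb822-style text.

    Hypothetical helper: assumes unsigned text, so PGP armor around a
    real .dsc file would need to be stripped first.
    """
    fields: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the paragraph once we have collected fields.
            if fields:
                break
            continue
        if line.startswith("#"):
            continue  # comment line
        if line[0] in " \t" and current is not None:
            # Continuation line: belongs to the previous field.
            fields[current] += "\n" + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return {name: value.strip() for name, value in fields.items()}


def parse_md5sums(text: str) -> List[Tuple[str, str]]:
    """Return (path, md5) pairs from lines shaped like '<md5>  <path>'."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        md5, _, path = line.strip().partition(" ")
        entries.append((path.strip(), md5))
    return entries
```

Multi-line fields such as `Files` come back as newline-joined strings, so a caller can split them further into checksum, size and file name as needed.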

Reference: aboutcode-org/scancode.io#1151
Reference: aboutcode-org/purldb#3

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely-named feature branch and have no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the debian-package-detection branch from bd8cfcf to 349d95b Compare April 4, 2024 14:25
Detect and store more attributes from debian .dsc metadata
files. Also properly detect and create packages from control
and md5sums files.

Reference: aboutcode-org/scancode.io#1151
Reference: aboutcode-org/purldb#3

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra AyanSinhaMahapatra force-pushed the debian-package-detection branch from 349d95b to 43dc430 Compare April 4, 2024 14:25
Member

@pombredanne pombredanne left a comment

Looking good!
Yet I think we can do better.
See https://en.wikipedia.org/wiki/Deb_(file_format)#Implementation
We should support xz, lzma, bz2, gz and zst AND the data.tar and control.tar may be uncompressed too

See also:

@pmhahn do you reckon we should support all of .tar.xz, .tar.lzma, .tar.bz2, .tar.gz, .tar.zst, .tar.zstd AND the plain data.tar and control.tar as formats for the tarballs inside a .deb? We need to support all current but also legacy formats to properly scan and index them!
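
To make the list of formats concrete, here is a minimal sketch of classifying the tarball members of a .deb by name. The table `COMPRESSION_BY_SUFFIX` and the function `deb_member_compression` are hypothetical names used for illustration only; the suffix list is simply the one enumerated in the comment above, not a statement of what the merged code does.

```python
from typing import Optional, Tuple

# Suffixes listed in the discussion above, mapped to a compression label.
# Hypothetical table for illustration; "none" means a plain, uncompressed tar.
COMPRESSION_BY_SUFFIX = {
    ".tar": "none",
    ".tar.gz": "gzip",
    ".tar.bz2": "bzip2",
    ".tar.xz": "xz",
    ".tar.lzma": "lzma",
    ".tar.zst": "zstd",
    ".tar.zstd": "zstd",
}


def deb_member_compression(member_name: str) -> Optional[Tuple[str, str]]:
    """Return (kind, compression) for a control/data tarball member name.

    Returns None for any other member, for example "debian-binary".
    """
    # Some ar writers append "/" to member names; tolerate that.
    name = member_name.strip().rstrip("/")
    for kind in ("control", "data"):
        if name.startswith(kind + ".tar"):
            compression = COMPRESSION_BY_SUFFIX.get(name[len(kind):])
            if compression is not None:
                return kind, compression
    return None
```

Keying on the full suffix rather than on the last extension keeps the plain `data.tar` and `control.tar` cases in the same lookup as the compressed ones.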

@pombredanne
Member

@pmhahn are the packages in https://gitlab.com/pmhahn/debian-package-registry testing all known formats?
Note the .deb for deb-lzma_1.0_all.deb and deb-bzip2_1.0_all.deb both contain a .zst extension and not lzma or bzip2 tarballs.
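
One way to check which tarballs a given .deb really contains is to list its ar members directly. The sketch below assumes the common ar layout (an 8-byte global magic followed by 60-byte member headers); `list_ar_members` is a hypothetical helper name and it does not handle GNU long-name tables, which .deb members are not expected to need.

```python
import io
from typing import BinaryIO, List

AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60


def list_ar_members(fileobj: BinaryIO) -> List[str]:
    """Return the member names of an ar archive such as a .deb file."""
    if fileobj.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    names = []
    while True:
        header = fileobj.read(AR_HEADER_SIZE)
        if not header:
            break  # clean end of archive
        if len(header) < AR_HEADER_SIZE or header[58:60] != b"`\n":
            raise ValueError("truncated or corrupt ar member header")
        # Name is the first 16 bytes, space padded, sometimes ending in "/".
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        names.append(name)
        # Member data is padded with one byte when its size is odd.
        fileobj.seek(size + (size % 2), io.SEEK_CUR)
    return names
```

Called on an opened .deb (for example `open(path, "rb")`), this returns names such as `debian-binary`, `control.tar.zst` and `data.tar.zst`, which shows the compression actually used regardless of the package file name.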

@AyanSinhaMahapatra to build all these combos in a CI/CD test (likely from our own clone for stability):

sudo apt-get update
sudo apt-get --assume-yes --no-install-recommends install build-essential dput-ng curl ca-certificates
git clone https://gitlab.com/pmhahn/debian-package-registry
cd debian-package-registry
for f in deb*
do
  pushd "$f"
  sudo apt-get --assume-yes build-dep .
  dpkg-buildpackage --no-sign --build=source,all
  popd
done

@pombredanne
Member

@AyanSinhaMahapatra https://wiki.debian.org/Teams/Dpkg/DebSupport gives some visibility on the topic

@pmhahn

pmhahn commented Apr 5, 2024

@pmhahn are the packages in https://gitlab.com/pmhahn/debian-package-registry testing all known formats?

As I noted in the GitLab issue, the problem is that newer dpkg-deb no longer supports building some old formats like bzip2 and lzma; so sadly no, my repository is not complete, as those two are missing.

But as .deb files are ar archives with .tar files inside, it is trivial to create a test .deb by using ar and tar yourself.
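
The same idea can be sketched without the `ar` and `tar` binaries, using only the Python standard library to write the ar container and the tarballs. All names here (`build_test_deb`, `COMPRESSORS`, the helper functions) are hypothetical, and zstd is left out because it needs a third-party module on most Python versions. The result is meant as a scanner test fixture under these assumptions, not as a package that dpkg is guaranteed to install.

```python
import bz2
import gzip
import io
import lzma
import tarfile
from typing import Dict

# Extension -> compressor for the legacy and current formats that the
# standard library can produce ("" means an uncompressed tarball).
COMPRESSORS = {
    "": lambda data: data,
    ".gz": gzip.compress,
    ".bz2": bz2.compress,
    ".xz": lzma.compress,
    ".lzma": lambda data: lzma.compress(data, format=lzma.FORMAT_ALONE),
}


def _tar_bytes(files: Dict[str, bytes]) -> bytes:
    """Build an uncompressed tar archive in memory from {name: content}."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def _ar_member(name: str, data: bytes) -> bytes:
    """Serialize one ar member: 60-byte header, data, padding to even size."""
    header = (
        f"{name:<16}{0:<12}{0:<6}{0:<6}{'100644':<8}{len(data):<10}"
    ).encode("ascii") + b"`\n"
    padding = b"\n" if len(data) % 2 else b""
    return header + data + padding


def build_test_deb(
    ext: str, control_files: Dict[str, bytes], data_files: Dict[str, bytes]
) -> bytes:
    """Return the bytes of a minimal .deb whose tarballs use the given ext."""
    compress = COMPRESSORS[ext]
    return b"".join(
        [
            b"!<arch>\n",
            _ar_member("debian-binary", b"2.0\n"),
            _ar_member("control.tar" + ext, compress(_tar_bytes(control_files))),
            _ar_member("data.tar" + ext, compress(_tar_bytes(data_files))),
        ]
    )
```

For example, `build_test_deb(".bz2", {"control": b"Package: demo\nVersion: 1.0\n"}, {})` yields a bzip2-flavoured fixture that current dpkg-deb can no longer build.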

@pombredanne
Member

But as .deb files are ar archives with .tar files inside, it is trivial to create a test .deb by using ar and tar yourself.

@pmhahn good point! Thank you ++ for your swift reply! .... @AyanSinhaMahapatra let's support them all then as this has no downside.

Member

@pombredanne pombredanne left a comment

LGTM! Thanks

@pombredanne pombredanne merged commit 04e24e0 into develop Apr 12, 2024
34 checks passed
@pombredanne pombredanne deleted the debian-package-detection branch April 12, 2024 16:02
3 participants