debian mapper: better process Vcs-Git field #3

Open
armijnhemel opened this issue Nov 10, 2022 · 6 comments

Comments

armijnhemel commented Nov 10, 2022

Some .dsc files have a Vcs-Git field, next to Vcs-Browser. From zsh_5.2-3.dsc:

Vcs-Browser: https://anonscm.debian.org/cgit/collab-maint/zsh.git
Vcs-Git: git://anonscm.debian.org/collab-maint/zsh.git -b debian

Although these are somewhat processed and stored as vcs-url, it might be worth breaking them down a bit more (the type of VCS, the branch, and so on). Also, since the repositories mentioned in the Vcs-Git field are where the Debian-specific metadata lives (such as the .dsc itself), it might be worth treating them differently (and potentially using them to mine even more information).
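
As a rough illustration of that breakdown, here is a minimal Python sketch that splits a Vcs-* value into VCS type, URL, branch, and host. The names `VcsReference` and `parse_vcs_field` are hypothetical and not part of the existing mapper, and only the `-b <branch>` option seen in the zsh example is handled.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class VcsReference:
    vcs_type: str          # e.g. "git", derived from the field name "Vcs-Git"
    url: str               # repository URL without trailing options
    branch: Optional[str]  # value following "-b", if present
    host: str              # network location, handy for spotting legacy hosts


def parse_vcs_field(field_name: str, value: str) -> VcsReference:
    """Split a Vcs-<type> field value into its parts (hypothetical helper)."""
    tokens = value.split()
    if not tokens:
        raise ValueError(f"empty value for {field_name}")

    # "Vcs-Git" -> "git", "Vcs-Svn" -> "svn"
    vcs_type = field_name.split("-", 1)[1].lower()
    url = tokens[0]

    # Only the "-b <branch>" option from the example above is recognised.
    branch = None
    if "-b" in tokens[1:]:
        idx = tokens.index("-b", 1)
        if idx + 1 < len(tokens):
            branch = tokens[idx + 1]

    return VcsReference(vcs_type, url, branch, urlsplit(url).netloc)


ref = parse_vcs_field(
    "Vcs-Git", "git://anonscm.debian.org/collab-maint/zsh.git -b debian"
)
print(ref)
```

Keeping the host as a separate attribute is a design choice for this sketch: it would make it easy to flag references to hosts that no longer exist.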

@armijnhemel armijnhemel changed the title debian mapper: process Vcs-Git field debian mapper: better process Vcs-Git field Nov 10, 2022
@armijnhemel
Author

Also see aboutcode-org/vulnerablecode#649

@pombredanne
Member

To add insult to injury, https://anonscm.debian.org/ has long been replaced by Salsa, which is based on GitLab

@armijnhemel
Author

> To add insult to injury, https://anonscm.debian.org/ has long been replaced by Salsa, which is based on GitLab

There are still plenty of references:

$ zcat Sources.gz | grep Vcs| grep -v Browser | grep anonscm | wc -l
3538

I guess that #122 is similar, but this one is about SCM URLs and historical references, not just historical downloads.
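
For use inside a mapper or a test, the shell pipeline above could be mirrored in Python roughly as follows. This is only a sketch: the function names are hypothetical, the path to Sources.gz is an assumption, and it uses the same plain substring matching as the `grep` chain.

```python
import gzip
from typing import Iterable


def count_legacy_vcs_lines(lines: Iterable[str], needle: str = "anonscm") -> int:
    """Count lines mentioning Vcs (but not Browser) that contain `needle`.

    Same substring logic as: grep Vcs | grep -v Browser | grep anonscm | wc -l
    """
    count = 0
    for line in lines:
        if "Vcs" in line and "Browser" not in line and needle in line:
            count += 1
    return count


def count_in_sources_gz(path: str) -> int:
    """Run the count over a gzip-compressed Debian Sources index."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_legacy_vcs_lines(handle)


sample = [
    "Vcs-Browser: https://anonscm.debian.org/cgit/collab-maint/zsh.git",
    "Vcs-Git: git://anonscm.debian.org/collab-maint/zsh.git -b debian",
]
print(count_legacy_vcs_lines(sample))
```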

@pombredanne
Member

@AyanSinhaMahapatra Is this something you may have done while improving Debian support?

AyanSinhaMahapatra added a commit to aboutcode-org/scancode-toolkit that referenced this issue Apr 4, 2024
Detect and store more attributes from debian .dsc metadata
files.

Reference: aboutcode-org/scancode.io#1151
Reference: aboutcode-org/purldb#3

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
AyanSinhaMahapatra added a commit to aboutcode-org/scancode-toolkit that referenced this issue Apr 4, 2024
Detect and store more attributes from debian .dsc metadata
files. Also properly detect and create packages from control
and md5sums files.

Reference: aboutcode-org/scancode.io#1151
Reference: aboutcode-org/purldb#3

Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
@AyanSinhaMahapatra
Member

@pombredanne No, I missed this issue. But see the PR above: we now parse and store vcs_url and code_view_url for a .dsc file.

Are there other fields from .dsc files that we could also store?

Would storing any of these be beneficial?

Build-Depends: bsdmainutils, cm-super-minimal, debhelper-compat (= 12), dpkg-dev (>= 1.16.2~), ghostscript, groff, groff-base, libcap-dev [linux-any], libncursesw5-dev, libpcre3-dev, texinfo (>= 5~), texlive-fonts-recommended, texlive-latex-base, texlive-latex-recommended, yodl (>= 3.08.01) | yodl (<< 3.08.00)
Package-List:
 zsh deb shells optional arch=any
 zsh-common deb shells optional arch=all
 zsh-dev deb libdevel optional arch=any
 zsh-doc deb doc optional arch=all
 zsh-static deb shells optional arch=any
Checksums-Sha1:
 b2fd47fdb878aa681edc974864e37baae9b0d6b7 2776796 zsh_5.7.1.orig.tar.xz
 14a8d38d3fae5b8eec0b124be5c943a60c4e8fec 87028 zsh_5.7.1-1+deb10u1.debian.tar.xz
Checksums-Sha256:
 439aafb4341522c307a67a2680e95fadb1b35a5c7f332089b9cc5154496570ca 2776796 zsh_5.7.1.orig.tar.xz
 ecbe22ed6a2b8dcaf10eff02b6b66583ce8d108a936624fc424c72188dea1ddd 87028 zsh_5.7.1-1+deb10u1.debian.tar.xz
Files:
 acc1a32ef5b3120ead5c6f0d011ceb76 2776796 zsh_5.7.1.orig.tar.xz
 d0f5fe26d9548331d9757e26b95c7aaf 87028 zsh_5.7.1-1+deb10u1.debian.tar.xz

  1. We don't parse or store dependencies.
  2. There are a couple of checksums, but only for the .orig. and .debian. files.
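
If these fields were to be stored, a minimal Python sketch of parsing them could look like the following. The helpers `split_build_depends` and `parse_checksum_lines` are hypothetical and not part of scancode-toolkit. The dependency parser keeps only package names, dropping version constraints and architecture qualifiers, and it does not handle build profiles.

```python
import re
from typing import List, NamedTuple


class FileChecksum(NamedTuple):
    digest: str
    size: int
    filename: str


def split_build_depends(value: str) -> List[List[str]]:
    """Return one list of alternative package names per comma-separated entry.

    Version constraints "(...)" and architecture qualifiers "[...]" are dropped.
    """
    entries = []
    for entry in value.split(","):
        alternatives = []
        for alt in entry.split("|"):
            name = re.sub(r"\(.*?\)|\[.*?\]", "", alt).strip()
            if name:
                alternatives.append(name)
        if alternatives:
            entries.append(alternatives)
    return entries


def parse_checksum_lines(lines: List[str]) -> List[FileChecksum]:
    """Parse the '<digest> <size> <filename>' lines of a Checksums-* or Files field."""
    result = []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            digest, size, filename = parts
            result.append(FileChecksum(digest, int(size), filename))
    return result


deps = split_build_depends(
    "debhelper-compat (= 12), libcap-dev [linux-any], "
    "yodl (>= 3.08.01) | yodl (<< 3.08.00)"
)
print(deps)

sums = parse_checksum_lines(
    [" acc1a32ef5b3120ead5c6f0d011ceb76 2776796 zsh_5.7.1.orig.tar.xz"]
)
print(sums[0].filename, sums[0].size)
```

Keeping alternatives (`a | b`) as an inner list preserves the "any of these" meaning instead of flattening everything into one dependency list.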

@AyanSinhaMahapatra
Member

Note that recent .dsc files, such as https://github.com/nexB/scancode-toolkit/blob/debian-package-detection/tests/packagedcode/data/debian/dsc_files/zsh_5.7.1-1%2Bdeb10u1.dsc#L14, store the correct VCS URL.

> To add insult to injury, https://anonscm.debian.org/ has long been replaced by Salsa, which is based on GitLab

What should we do about this?
