411 toolkit upgrade 2 #435
f0bc99c
to
58d4909
Compare
@JonoYang can we make the new dependency system a separate feature from the toolkit upgrade?
@tdruez No problem, I'll revert the model changes.
d62f75c
to
774c5b7
Compare
I've regenerated the test data using the new command you created and I ran into some problems where I would like your opinion on what should be done.
I run I have some previous attempts where I worked on different solutions. In the branch In a scancode JSON output, all Packages detected in a scan are present in a top-level attribute named For each Resource that is for a Package, the When I first modified the pipes code to handle the new top-level I started testing the pipeline using the package I decided to update how I stored the The The scan output
This causes tests like https://github.com/nexB/scancode.io/blob/411-toolkit-upgrade-2/scanpipe/tests/test_pipes.py#L521 to fail since we expect the json to be in the sctk json output format. Properly handling |
The test data was generated from a time where the
Why not complete the #378 (review) PR instead? It seems that my review points were never addressed, but it looks like we were close to a merge. Will this PR still be relevant to solve this?
Probably not the best place, but I think the goal of this first PR is to upgrade the toolkit with the fewest model changes so we can get it out there asap. We should then have tickets and more PRs to work on the optimizations, such as new models and fields for Packages and Dependencies.
We probably have to update the |
@tdruez I'll revisit #378 and see if I can get it to work with what I have in this branch.
ack. I will create tickets for some of the further work that needs to be done.
I tagged the I've added |
081726b
to
b25c2c8
Compare
While running pipelines on this branch:
The following pipe modules are impacted:
|
Scanning https://github.com/bastikr/boolean.py/archive/refs/heads/master.zip with https://staging.scancode.io/api/projects/ac26f606-4a7b-4359-b585-72a9616a921c/summary/ |
The There doesn't seem to be an easy way to get the previous |
I've brought in the old code that handled processing installed Debian package files and updated enough of it to get the pipeline to work. I had some issues with license detection, but it was resolved when I ran |
I don't think we want to bring old toolkit code over into ScanCode.io. This defeats the purpose of a toolkit upgrade; we should rather make use of the new code. We had a discussion about this today with @pombredanne and he will provide a solution on the toolkit side. In the meantime, we should:
|
scanpipe/pipes/scancode.py
Outdated
key_files_packages.extend(packages_data)
for package in packages:
    package_data = DiscoveredPackageSerializer(package).data
    if package_data not in key_files_packages:
It seems a bit much to test on a whole data structure such as package_data.
Could we use a more specific id instead?
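One way to act on this suggestion is to track the identifiers already seen rather than comparing whole dictionaries. The snippet below is only a sketch under assumptions: `collect_unique_packages` is a hypothetical helper, and the packages are plain dicts carrying a `package_uid` key instead of serialized `DiscoveredPackage` objects.

```python
def collect_unique_packages(key_files_packages, packages):
    """Append each package whose package_uid has not been seen yet.

    Hypothetical helper: both arguments are lists of plain dicts here.
    In the PR the second list would come from serialized packages.
    """
    # Set membership is a cheap lookup, unlike comparing whole dicts
    # against every entry of a list.
    seen_uids = {data.get("package_uid") for data in key_files_packages}
    for package_data in packages:
        uid = package_data.get("package_uid")
        if uid in seen_uids:
            continue
        seen_uids.add(uid)
        key_files_packages.append(package_data)
    return key_files_packages


# Sample data made up for illustration.
existing = [{"package_uid": "pkg:pypi/asgiref@3.3.0?uuid=a", "name": "asgiref"}]
candidates = [
    {"package_uid": "pkg:pypi/asgiref@3.3.0?uuid=a", "name": "asgiref"},
    {"package_uid": "pkg:pypi/django@3.1?uuid=b", "name": "django"},
]
result = collect_unique_packages(existing, candidates)
```

With this sample input the duplicate asgiref entry is skipped and only the django entry is appended.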
* Avoid checking if package_data dictionary is already in the key_files_packages list
* Keep track of package_uids instead
Signed-off-by: Jono Yang <jyang@nexb.com>
Signed-off-by: Thomas Druez <tdruez@nexb.com>
* Remove create_discovered_packages2 and create_codebase_resources2
Signed-off-by: Jono Yang <jyang@nexb.com>
* Normalize package_uids before comparing results in tests
* Update expected test results
Signed-off-by: Jono Yang <jyang@nexb.com>
* Mark ProjectCodebase tests with expectedFailure
* We will revisit ProjectCodebase and update it to fit our current models
Signed-off-by: Jono Yang <jyang@nexb.com>
* We are using scancode scan results for tests since asgiref-3.3.0_scan.json is not exactly the same format as scancode's json output
Signed-off-by: Jono Yang <jyang@nexb.com>
* Update regen_test_data.py to generate asgiref-3.3.0_walk_test_fixtures.json
Signed-off-by: Jono Yang <jyang@nexb.com>
* No need to explicitly get license_clarity_score in make_results_summary()
* Update expected test results
Signed-off-by: Jono Yang <jyang@nexb.com>
* Add .vscode to .gitignore
Signed-off-by: Jono Yang <jyang@nexb.com>
* Check for existence of installed_file attribute before using it
Signed-off-by: Jono Yang <jyang@nexb.com>
* Ensure both installed_file and codebase_resource have the same checksum field before comparing them
Signed-off-by: Jono Yang <jyang@nexb.com>
* Update mappings_keys_by_fieldname
* Look for package data in package_data field instead of packages in save_scan_package_results
Signed-off-by: Jono Yang <jyang@nexb.com>
* Move get_installed_packages to rootfs.py
* Use get_package_data instead of get_package_info
* Rename all instances of packages to package_data when scanning for application packages
* Update test docker images and test results
* Add test for basic rootfs
Signed-off-by: Jono Yang <jyang@nexb.com>
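For the "Normalize package_uids before comparing results in tests" commit, the idea can be sketched roughly as follows. This is not the code from the PR: `normalize_package_uids` is a hypothetical helper, and it assumes the part of a package_uid that varies between runs is a `uuid=<value>` qualifier.

```python
import re

# Assumption: the run-specific part of a package_uid is a "uuid=<value>"
# qualifier; everything else is stable between scans.
UUID_QUALIFIER = re.compile(r"uuid=[^&\s]+")
FIXED_UUID = "uuid=fixed-uid-for-testing"


def normalize_package_uids(data):
    """Return a copy of data with the uuid of every package_uid replaced.

    Walks nested dicts and lists so that two scan results can be compared
    without the random identifiers getting in the way.
    """
    if isinstance(data, dict):
        normalized = {}
        for key, value in data.items():
            if key == "package_uid" and isinstance(value, str):
                normalized[key] = UUID_QUALIFIER.sub(FIXED_UUID, value)
            else:
                normalized[key] = normalize_package_uids(value)
        return normalized
    if isinstance(data, list):
        return [normalize_package_uids(item) for item in data]
    return data


# Two made-up results that differ only by their random uuid.
first = {"packages": [{"package_uid": "pkg:pypi/asgiref@3.3.0?uuid=1111-aaaa"}]}
second = {"packages": [{"package_uid": "pkg:pypi/asgiref@3.3.0?uuid=2222-bbbb"}]}
same_after_normalizing = normalize_package_uids(first) == normalize_package_uids(second)
```

A test would then compare the normalized result against the normalized expected data rather than the raw dictionaries.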
@JonoYang The code looks pretty good.
Do you have a list of what's left to do to complete this PR?
* Update expected test results
Signed-off-by: Jono Yang <jyang@nexb.com>
* Update expected test results
* Remove old ubuntu.tar
Signed-off-by: Jono Yang <jyang@nexb.com>
faf6fba
to
552bdb8
Compare
* Add package_uid to test package data
* Update expected test result
Signed-off-by: Jono Yang <jyang@nexb.com>
feca62b
to
b3f4656
Compare
The last few things I finished up were adding the |
942e06d
to
9fcec67
Compare
* In the LoadInventory pipeline, create the DiscoveredPackages from a scan before creating the CodebaseResources
Signed-off-by: Jono Yang <jyang@nexb.com>
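The reason for this ordering can be illustrated with a small sketch: if packages are indexed first, each resource can be linked to its package in a single pass. The helper below is hypothetical and works on plain dicts, not on the ScanCode.io models. It assumes the scan has a top-level `packages` list and that each entry of `files` carries a `for_packages` list of package_uids, names the conversation above does not spell out.

```python
def load_inventory(scan_data):
    """Index packages by package_uid, then attach resource paths to them.

    Hypothetical stand-in for the pipeline steps: plain dicts replace the
    DiscoveredPackage and CodebaseResource models.
    """
    # Step 1: create the packages so they exist before any resource.
    packages_by_uid = {}
    for package_data in scan_data.get("packages", []):
        packages_by_uid[package_data["package_uid"]] = {
            "name": package_data.get("name"),
            "resources": [],
        }

    # Step 2: create the resources and link each one to its packages.
    for file_data in scan_data.get("files", []):
        for uid in file_data.get("for_packages", []):
            package = packages_by_uid.get(uid)
            if package is not None:
                package["resources"].append(file_data["path"])
    return packages_by_uid


# Made-up scan data for illustration.
scan = {
    "packages": [{"package_uid": "pkg:pypi/asgiref@3.3.0?uuid=a", "name": "asgiref"}],
    "files": [
        {"path": "asgiref/sync.py", "for_packages": ["pkg:pypi/asgiref@3.3.0?uuid=a"]},
        {"path": "README.rst", "for_packages": []},
    ],
}
inventory = load_inventory(scan)
```

Doing it the other way around would require a second pass over the resources once the packages exist.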
bb44b4e
to
784dbbc
Compare
This PR updates scancode-toolkit in scancode.io, updates the expected test results, and updates code that uses deprecated scancode-toolkit functions.