Watch for packages (model and implementation) #244 #271

Merged 20 commits on Jan 30, 2024

pombredanne
Member

This PR creates a new watch model for packages to watch for new versions as detailed in #244

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Member Author

@pombredanne left a comment

Here is my review!
Mostly some cosmetic changes

Review threads: packagedb/models.py (8), packagedb/serializers.py (1), purldb_project/urls.py (1)
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Review threads: packagedb/api.py (1), packagedb/models.py (4)
Member Author

@pombredanne left a comment

LGTM... just a few nits for your consideration!

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
- Use this id to track and manage the periodic watch

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
- Use `redis` and `django-rq` to run the watch tasks and `rq-scheduler` for periodic task scheduling.
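
A minimal sketch of how a periodic watch might be registered with `rq-scheduler`, assuming a scheduler object obtained elsewhere (for example from `django_rq.get_scheduler()`). The names `watch_new_versions` and `schedule_watch` and the day-based interval are hypothetical and not taken from this PR; only `Scheduler.schedule()` and its `scheduled_time`, `func`, `args`, `interval`, and `repeat` parameters come from rq-scheduler.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60


def watch_new_versions(package_url):
    """Hypothetical task body: look up new versions for ``package_url``."""
    return f"checked {package_url}"


def schedule_watch(scheduler, package_url, interval_days):
    """
    Register a repeating job and return its id so that the caller can store
    it and later use it to track or cancel the periodic watch.

    ``scheduler`` is assumed to be an rq-scheduler ``Scheduler`` instance.
    """
    job = scheduler.schedule(
        scheduled_time=datetime.now(tz=timezone.utc),  # first run: now (UTC)
        func=watch_new_versions,
        args=[package_url],
        interval=interval_days * SECONDS_PER_DAY,  # interval is in seconds
        repeat=None,  # None means repeat indefinitely
    )
    return job.id
```

Storing the returned id alongside the watch record would allow the job to be removed later with `scheduler.cancel(job_id)`.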

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@pombredanne changed the title from "Add package watch model #244" to "Watch for packages (model and implementation) #244" on Jan 18, 2024
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@JonoYang
Member

@keshav-space I think the code looks okay, but I'm having trouble running this in a Docker container.

I ran `docker compose build` and got this error: service "redis" refers to undefined volume redis_data: invalid compose project

I added `redis_data:` to the volumes section of docker-compose.yml, ran `docker compose build` and then `docker compose up`, and got these errors:

purldb-web-1        | Operations to perform:
purldb-web-1        |   Apply all migrations: admin, auth, authtoken, clearcode, contenttypes, django_rq, matchcode, minecode, packagedb, sessions
purldb-web-1        | Running migrations:
purldb-web-1        |   Applying django_rq.0001_initial... OK
purldb-web-1        |   Applying packagedb.0082_packagewatch... OK
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [INFO] Starting gunicorn 21.2.0
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [INFO] Listening at: http://0.0.0.0:8000 (9)
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [INFO] Using worker: sync
purldb-web-1        | [2024-01-26 22:15:38 +0000] [10] [INFO] Booting worker with pid: 10
purldb-web-1        | [2024-01-26 22:15:38 +0000] [10] [ERROR] Exception in worker process
purldb-web-1        | Traceback (most recent call last):
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/arbiter.py", line 609, in spawn_worker
purldb-web-1        |     worker.init_process()
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/workers/base.py", line 134, in init_process
purldb-web-1        |     self.load_wsgi()
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
purldb-web-1        |     self.wsgi = self.app.wsgi()
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/app/base.py", line 67, in wsgi
purldb-web-1        |     self.callable = self.load()
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
purldb-web-1        |     return self.load_wsgiapp()
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
purldb-web-1        |     return util.import_app(self.app_uri)
purldb-web-1        |   File "/usr/local/lib/python3.9/site-packages/gunicorn/util.py", line 371, in import_app
purldb-web-1        |     mod = importlib.import_module(module)
purldb-web-1        |   File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
purldb-web-1        |     return _bootstrap._gcd_import(name[level:], package, level)
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
purldb-web-1        |   File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
purldb-web-1        | ModuleNotFoundError: No module named 'purldb'
purldb-web-1        | [2024-01-26 22:15:38 +0000] [10] [INFO] Worker exiting (pid: 10)
purldb-scheduler-1  | wait-for-it: web:8000 is available after 5 seconds
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [ERROR] Worker (pid:10) exited with code 3
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [ERROR] Shutting down: Master
purldb-web-1        | [2024-01-26 22:15:38 +0000] [9] [ERROR] Reason: Worker failed to boot.
purldb-web-1 exited with code 3
purldb-scheduler-1  | Traceback (most recent call last):
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 264, in connect
purldb-scheduler-1  |     sock = self.retry.call_with_retry(
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/retry.py", line 46, in call_with_retry
purldb-scheduler-1  |     return do()
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 265, in <lambda>
purldb-scheduler-1  |     lambda: self._connect(), lambda error: self.disconnect(error)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 627, in _connect
purldb-scheduler-1  |     raise err
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 615, in _connect
purldb-scheduler-1  |     sock.connect(socket_address)
purldb-scheduler-1  | OSError: [Errno 99] Cannot assign requested address
purldb-scheduler-1  | 
purldb-scheduler-1  | During handling of the above exception, another exception occurred:
purldb-scheduler-1  | 
purldb-scheduler-1  | Traceback (most recent call last):
purldb-scheduler-1  |   File "/app/manage.py", line 19, in <module>
purldb-scheduler-1  |     execute_from_command_line(sys.argv)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
purldb-scheduler-1  |     utility.execute()
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 436, in execute
purldb-scheduler-1  |     self.fetch_command(subcommand).run_from_argv(self.argv)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 412, in run_from_argv
purldb-scheduler-1  |     self.execute(*args, **cmd_options)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 458, in execute
purldb-scheduler-1  |     output = self.handle(*args, **options)
purldb-scheduler-1  |   File "/app/packagedb/management/commands/run_scheduler.py", line 29, in handle
purldb-scheduler-1  |     clear_zombie_watch_schedules()
purldb-scheduler-1  |   File "/app/packagedb/schedules.py", line 79, in clear_zombie_watch_schedules
purldb-scheduler-1  |     for job in scheduler.get_jobs():
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/rq_scheduler/scheduler.py", line 366, in get_jobs
purldb-scheduler-1  |     job_ids = self.connection.zrangebyscore(self.scheduled_jobs_key, 0,
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/commands/core.py", line 4679, in zrangebyscore
purldb-scheduler-1  |     return self.execute_command(*pieces, **options)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/client.py", line 533, in execute_command
purldb-scheduler-1  |     conn = self.connection or pool.get_connection(command_name, **options)
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 1086, in get_connection
purldb-scheduler-1  |     connection.connect()
purldb-scheduler-1  |   File "/usr/local/lib/python3.9/site-packages/redis/connection.py", line 270, in connect
purldb-scheduler-1  |     raise ConnectionError(self._error_message(e))
purldb-scheduler-1  | redis.exceptions.ConnectionError: Error 99 connecting to localhost:6379. Cannot assign requested address.
purldb-scheduler-1 exited with code 1
purldb-rq_worker-1  | wait-for-it: timeout occurred after waiting 15 seconds for web:8000
purldb-rq_worker-1  | Error 99 connecting to localhost:6379. Cannot assign requested address.
purldb-rq_worker-1 exited with code 1

I don't know right away about the issue concerning ModuleNotFoundError: No module named 'purldb', but for the redis address related errors, I think you would need to see how we use and configure redis on scancode.io

Signed-off-by: Keshav Priyadarshi <git@keshav.space>
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@keshav-space
Member

I don't know right away about the issue concerning ModuleNotFoundError: No module named 'purldb', but for the redis address related errors, I think you would need to see how we use and configure redis on scancode.io

Thanks @JonoYang,
I fixed the `Error 99 connecting to localhost:6379. Cannot assign requested address` error; it was caused by an incorrect Redis host.
The `ModuleNotFoundError: No module named 'purldb'` error occurred because gunicorn was not using the correct WSGI application. https://github.com/nexB/purldb/blob/86bce5fb7ab97c8e36f6336531db8f321cc0bba1/docker-compose.yml#L16
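
The Redis host issue can be illustrated with a sketch of a django-rq `RQ_QUEUES` setting that reads the host from the environment instead of hard-coding `localhost`. Inside a container, `localhost` refers to the container itself rather than to the `redis` service, so the host has to be the service name. The environment variable names `PURLDB_REDIS_HOST` and `PURLDB_REDIS_PORT` and the helper `build_rq_queues` are assumptions for illustration, not the names used in this PR.

```python
import os


def build_rq_queues(environ=os.environ):
    """Return a django-rq ``RQ_QUEUES`` mapping built from environment values."""
    return {
        "default": {
            # Hypothetical variable names; under docker compose the host
            # would be set to the Redis service name, for example "redis".
            "HOST": environ.get("PURLDB_REDIS_HOST", "localhost"),
            "PORT": int(environ.get("PURLDB_REDIS_PORT", "6379")),
            "DB": 0,
            "DEFAULT_TIMEOUT": 360,
        }
    }


# In settings.py this would be assigned as: RQ_QUEUES = build_rq_queues()
RQ_QUEUES = build_rq_queues({"PURLDB_REDIS_HOST": "redis"})
```

Keeping `localhost` as the default preserves local development outside Docker, while the compose file can override the host for the web, scheduler, and worker services.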

@JonoYang
Member

@keshav-space Thanks for the fixes! I have another question: in the api for watch, there is the depth option. Where is that value used in the code? Is there a function that adds the purls to be scanned to the scan queue?

@keshav-space
Member

I have another question: in the api for watch, there is the depth option. Where is that value used in the code? Is there a function that adds the purls to be scanned to the scan queue?

@JonoYang we'll use the depth value once Multi-level data collection #257 is in place. For now, we add all newly discovered PURLs to the scan queue irrespective of depth.
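
The current behaviour described in this reply, where depth is ignored and every newly discovered PURL is queued, might look like the following sketch. The function name `queue_new_purls` and the plain list used as a scan queue are hypothetical stand-ins, not the PR's actual code.

```python
def queue_new_purls(known_purls, discovered_purls, scan_queue, depth=None):
    """
    Append every PURL in ``discovered_purls`` that is not already known to
    ``scan_queue`` and return the list of newly queued PURLs.

    ``depth`` is accepted but deliberately unused, mirroring the comment that
    it will only matter once multi-level data collection exists.
    """
    seen = set(known_purls)
    new_purls = []
    for purl in discovered_purls:
        if purl not in seen:
            seen.add(purl)  # also de-duplicates within discovered_purls
            new_purls.append(purl)
    scan_queue.extend(new_purls)
    return new_purls
```

For example, with `pkg:pypi/foo@1.0` already known and versions `1.0` and `1.1` discovered, only `pkg:pypi/foo@1.1` would be queued, whatever `depth` is passed.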

@JonoYang
Member

@keshav-space ack, I will merge this now, and then we will move on to fleshing it out.

@JonoYang merged commit baf41d2 into main on Jan 30, 2024
9 checks passed
@JonoYang deleted the 244-watch-model branch on January 30, 2024 20:07
3 participants