djoseph-apphelix commented on Aug 1, 2025

Description

This PR extends the functionality of the Course Optimizer page with the following enhancements:

Previous Run Link Detection:

  • Adds support for detecting links that point to previous course runs, controlled by the feature flag contentstore.enable_course_optimizer_check_prev_run_links. When the flag is disabled, the scan covers only broken, external forbidden, and locked links.

Plain Text Link Scanning:

  • Updates link detection to also search for URLs in plain text, not just in href and src attributes.
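
A minimal sketch of the idea, not the code from this PR: the two regexes and the `extract_urls` name below are assumptions chosen for illustration, and the actual patterns in `contentstore.tasks` may differ. It collects URLs from `href`/`src` attributes and from bare text, then de-duplicates them.

```python
import re

# Hypothetical patterns for illustration; the PR's real regexes may differ.
ATTR_URL_RE = re.compile(r'''(?:href|src)\s*=\s*["']([^"']+)["']''', re.IGNORECASE)
PLAIN_URL_RE = re.compile(r'''https?://[^\s"'<>]+''', re.IGNORECASE)


def extract_urls(content):
    """Return unique URLs from href/src attributes and plain text, in first-seen order."""
    found = ATTR_URL_RE.findall(content) + PLAIN_URL_RE.findall(content)
    urls = []
    for url in found:
        # Drop sentence punctuation that often trails a URL written in prose.
        url = url.rstrip(".,;:!?)")
        if url and url not in urls:
            urls.append(url)
    return urls
```

Attribute matches are listed first, so a URL that appears both in an anchor tag and in surrounding text is reported once.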

Expanded Scanning Scope:

  • Includes Course Updates, Handouts, and Custom Pages in the link scan process, in addition to the existing course content.

Supporting information

  • Waffle Flag: contentstore.enable_course_optimizer_check_prev_run_links controls previous run detection.

  • Enhanced API Response: New course_updates and custom_pages fields with link categorization.

  • Improved URL Detection: Regex patterns detect URLs both in HTML attributes and in plain text content.

  • Previous Run Link Detection: Identifies and categorizes links pointing to previous course runs using CourseRerunState.
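
The flag-gated categorization could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the function name, the result keys, and the plain boolean argument are hypothetical. In the PR, the previous run is looked up through CourseRerunState and the switch is the waffle flag named above.

```python
def categorize_links(urls, previous_run_key, check_prev_run_links):
    """Split URLs into previous-run links and all other links.

    previous_run_key: course key string of the source run, e.g. "course-v1:Org+C1+2024"
    check_prev_run_links: stand-in for the waffle flag (assumption: passed as a bool)
    """
    result = {"previous_run_links": [], "other_links": []}
    for url in urls:
        if check_prev_run_links and previous_run_key and previous_run_key in url:
            result["previous_run_links"].append(url)
        else:
            # With the flag off, nothing is ever reported as a previous-run link.
            result["other_links"].append(url)
    return result
```

With the flag off, every URL lands in `other_links`, which mirrors the "existing behavior remains unchanged" expectation.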

Jira

Testing Instructions

  • With the feature flag disabled, verify that existing behavior remains unchanged and no previous-run links are returned.
  • Create a course rerun and ensure a CourseRerunState record is created.
  • Add links pointing to the original (old) course in content, updates, handouts, and custom pages of the new course rerun.
  • Run the Course Optimizer and verify that these links are correctly categorized as previous-run.
  • Confirm the API response includes course_updates and custom_pages arrays with the expected data.
  • Verify that plain text URLs (not just HTML anchor tags) are also detected and categorized correctly.

djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch 2 times, most recently from 1e03b7c to 3d680b5 on August 5, 2025 09:55
djoseph-apphelix requested review from Faraz32123 and jristau1984 and removed request for Faraz32123 and jristau1984 on August 5, 2025 10:35
djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch from 3d680b5 to cf48844 on August 5, 2025 11:39
djoseph-apphelix self-assigned this on Aug 5, 2025
djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch 2 times, most recently from e9a1b00 to 760d224 on August 6, 2025 13:11
papphelix self-requested a review on August 7, 2025 08:09
from user_tasks.models import UserTaskArtifact, UserTaskStatus

-from cms.djangoapps.contentstore.tasks import CourseLinkCheckTask, LinkState
+from cms.djangoapps.contentstore.tasks import CourseLinkCheckTask, LinkState, _get_urls
Contributor

Names prefixed with an underscore should not be imported from another module, per PEP 8 conventions.

Contributor

Let's rename _get_urls to something like extract_content_URLs_from_course to avoid the PEP 8 violation.

Contributor Author

Updated to follow PEP 8.

# .. toggle_default: False
# .. toggle_description: When enabled, allows the Course Optimizer to detect and update links pointing to previous course runs.
# This feature enables instructors to fix internal course links that still point to old course runs
# after creating a course rerun.
Contributor

The description says the links are still pointing to the old run, whereas this is a feature to show previous-run links in the new run. Correct me if I am wrong, @Faraz32123.

Contributor

The description seems right to me, but a few small changes can be made here:

# .. toggle_description: When enabled, allows the Course Optimizer to detect and update links pointing to previous course runs.
#   This feature allows instructors to see & update internal course links that are still pointing to previous course run
#   in a new course run.

Contributor Author

Updated the description as suggested above.

@@ -10,7 +10,8 @@ def cms_api_filter(endpoints):
"""
filtered = []
CMS_PATH_PATTERN = re.compile(
Contributor

cms_paths = [
    "xblock", 
    "videos", 
    "video_transcripts", 
    "file_assets", 
    "youtube_transcripts", 
    "link_check", 
    "link_check_status"
]

CMS_PATH_PATTERN = re.compile(r"^/api/contentstore/v0/(" + "|".join(cms_paths) + r")")

Is it possible to write it this way to avoid the multiline regexp?
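
For reference, a self-contained version of the suggestion above might look like this. It is a sketch, not code from the PR: the `re.escape` call and the trailing `(/|$)` boundary are additions beyond what was suggested, and the sample paths used with it are illustrative only.

```python
import re

# Path segments taken from the suggestion above.
CMS_PATHS = [
    "xblock",
    "videos",
    "video_transcripts",
    "file_assets",
    "youtube_transcripts",
    "link_check",
    "link_check_status",
]

# re.escape guards against regex metacharacters in a segment; the (/|$) boundary
# (an addition, not part of the suggestion) rejects paths such as ".../videosfoo".
CMS_PATH_PATTERN = re.compile(
    r"^/api/contentstore/v0/(" + "|".join(re.escape(p) for p in CMS_PATHS) + r")(/|$)"
)


def is_cms_path(path):
    """Return True if the path starts with one of the listed CMS API prefixes."""
    return CMS_PATH_PATTERN.match(path) is not None
```

Keeping the segments in a list makes adding or removing an endpoint a one-line change, at the cost of a slightly less obvious pattern.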

Contributor Author

Reverted the file, as these APIs are not intended to be exposed in the Swagger UI.

djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch from 760d224 to d34c891 on August 7, 2025 09:49
Contributor

Faraz32123 left a comment

LGTM! 👌

djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch from 0253811 to 585bb9d on August 8, 2025 03:14
main_content = None

if main_content is None:
    main_content = {"sections": []}
Contributor

Can we assign main_content = {"sections": []} directly in the except block?
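
A rough sketch of what that change looks like. The surrounding try block is not visible in this thread, so the JSON parsing and the `load_main_content` name here are assumptions made purely to keep the example runnable.

```python
import json


def load_main_content(raw):
    """Parse stored link-check results, falling back to an empty structure on bad input."""
    try:
        # Assumption: the real try block does something comparable that can fail.
        main_content = json.loads(raw)
    except (TypeError, ValueError):
        # Assign the default here instead of setting None and checking it afterwards.
        main_content = {"sections": []}
    return main_content
```

One caveat: this is only equivalent to the original two-step version if the try block can never succeed with a value of None.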

Contributor Author

Updated as suggested above.

djoseph-apphelix force-pushed the djoseph/TNL2-138-Update-API-endpoints-for-course-re-run-link-updation-tool branch from 585bb9d to b8bd541 on August 8, 2025 08:04
Faraz32123 merged commit d3eba7a into openedx:master on Aug 8, 2025
49 checks passed
@edx-pipeline-bot
Contributor

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

@edx-pipeline-bot
Contributor

2U Release Notice: This PR has been deployed to the edX production environment.


# after creating a course rerun.
# .. toggle_use_cases: temporary
# .. toggle_creation_date: 2025-07-21
# .. toggle_target_removal_date: None
Member

@djoseph-apphelix @Faraz32123 If this is a temporary toggle, why is there no removal date?

Contributor

We'll handle this in an upcoming PR!

@kdmccormick
Member

@djoseph-apphelix @Faraz32123 Could you provide me the broader context of this pull request? For example, is there a public document outlining the plan, or are there related PRs you could link to? The TNL ticket is internal to 2U so it is not accessible to me.

@Faraz32123
Contributor

> @djoseph-apphelix @Faraz32123 Could you provide me the broader context of this pull request? For example, is there a public document outlining the plan, or are there related PRs you could link to? The TNL ticket is internal to 2U so it is not accessible to me.

Problem: When we create a rerun of a course through Publisher, internal links inside course updates, handouts, custom pages, and even course content still point to the older run instead of the newly created run. As a result, learners are redirected to the previous run instead of staying inside the new one.

The purpose of this PR is to enhance the Course Optimizer APIs so that, in addition to broken links, they also detect previous-run links (prevRunLinks, as described in the PR description) across the expanded scanning scope. There will be more PRs to update these links, and we'll link them to this one.

@jristau1984 Do we have any public doc for this?
