feat: add initial version of learning analytics computation and dashboard illustrations #4211
Conversation
…ponent (#4181) Co-authored-by: Julius Schlapbach <80708107+sjschlapbach@users.noreply.github.com> Co-authored-by: sjschlapbach <julius.schlapbach@gmx.net>
Current Aviator status
This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.
…on approach for analytics (#4182) Co-authored-by: Julius Schlapbach <80708107+sjschlapbach@users.noreply.github.com> Co-authored-by: sjschlapbach <julius.schlapbach@gmx.net>
klicker-uzh Cypress run
Project: klicker-uzh
Branch Review: v3
Run duration: 12m 31s
Committer: Julius Schlapbach
Test results: 0 | 0 | 0 | 0 | 148
klicker-uzh Cypress run
Project: klicker-uzh
Branch Review: v3-analytics
Run duration: 12m 02s
Committer: Julius Schlapbach
Test results: 0 | 0 | 0 | 0 | 148
Actionable comments posted: 22
Outside diff range, codebase verification and nitpick comments (1)
apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py (1)
8-12: Consider adding type hints for function parameters. Adding type hints for `db`, `df_courses`, and `verbose` can improve code readability and maintainability. Apply this diff to add type hints:
-def compute_participant_course_analytics(db, df_courses, verbose=False):
+def compute_participant_course_analytics(db: DatabaseType, df_courses: pd.DataFrame, verbose: bool = False):
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
apps/analytics/poetry.lock is excluded by !**/*.lock
Files selected for processing (44)
- .gitignore (1 hunks)
- apps/analytics/pyproject.toml (2 hunks)
- apps/analytics/src/init.py (1 hunks)
- apps/analytics/src/modules/init.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/init.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/aggregate_analytics.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/compute_correctness.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/compute_participant_analytics.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/get_participant_responses.py (1 hunks)
- apps/analytics/src/modules/participant_analytics/save_participant_analytics.py (1 hunks)
- apps/analytics/src/notebooks/archive/daily_participant_analytics.ipynb (1 hunks)
- apps/analytics/src/notebooks/archive/test.ipynb (1 hunks)
- apps/analytics/src/notebooks/participant_analytics.ipynb (1 hunks)
- apps/frontend-control/src/components/sessions/EmbeddingModal.tsx (1 hunks)
- apps/frontend-control/tsconfig.json (1 hunks)
- apps/frontend-manage/next.config.js (1 hunks)
- apps/frontend-manage/src/components/common/Header.tsx (1 hunks)
- apps/frontend-manage/src/components/courses/actions/EvaluationLinkLiveQuiz.tsx (1 hunks)
- apps/frontend-manage/src/components/courses/actions/RunningLiveQuizLink.tsx (1 hunks)
- apps/frontend-manage/src/components/courses/actions/StartLiveQuizButton.tsx (1 hunks)
- apps/frontend-manage/src/components/interaction/AudienceInteraction.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/EmbeddingModal.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/Session.tsx (2 hunks)
- apps/frontend-manage/src/components/sessions/cockpit/CancelSessionModal.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/cockpit/SessionTimeline.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/creation/liveQuiz/LiveSessionWizard.tsx (2 hunks)
- apps/frontend-manage/src/components/sessions/evaluation/EvaluationControlBar.tsx (1 hunks)
- apps/frontend-manage/src/pages/analytics/index.tsx (1 hunks)
- apps/frontend-manage/src/pages/index.tsx (1 hunks)
- apps/frontend-manage/src/pages/quizzes/[id]/cockpit.tsx (2 hunks)
- apps/frontend-manage/src/pages/quizzes/index.tsx (2 hunks)
- apps/frontend-manage/tsconfig.json (1 hunks)
- apps/frontend-pwa/tsconfig.json (1 hunks)
- apps/office-addin/src/content/components/App.tsx (1 hunks)
- apps/office-addin/src/content/components/URLForm.tsx (2 hunks)
- cypress/cypress/e2e/F-live-quiz-workflow.cy.ts (10 hunks)
- cypress/cypress/e2e/H-practice-quiz-workflow.cy.ts (3 hunks)
- cypress/cypress/e2e/K-group-activity-workflow.cy.ts (1 hunks)
- packages/i18n/messages/de.ts (1 hunks)
- packages/i18n/messages/en.ts (1 hunks)
- packages/prisma/src/prisma/migrations/20240826113537_participant_analytics_correctness_counts/migration.sql (1 hunks)
- packages/prisma/src/prisma/migrations/20240827152536_computed_at_analytics/migration.sql (1 hunks)
- packages/prisma/src/prisma/schema/analytics.prisma (9 hunks)
Files skipped from review due to trivial changes (5)
- .gitignore
- apps/frontend-manage/src/components/courses/actions/EvaluationLinkLiveQuiz.tsx
- apps/frontend-manage/src/components/sessions/evaluation/EvaluationControlBar.tsx
- apps/office-addin/src/content/components/App.tsx
- packages/i18n/messages/de.ts
Additional context used
Ruff
apps/analytics/src/__init__.py
1-1: `from .modules import *` used; unable to detect undefined names (F403)
2-2: `from .notebooks import *` used; unable to detect undefined names (F403)
apps/analytics/src/modules/__init__.py
1-1: `.participant_analytics.compute_correctness` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
1-1: `.participant_analytics.get_participant_responses` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
apps/analytics/src/modules/participant_analytics/__init__.py
1-1: `.compute_correctness.compute_correctness` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.get_participant_responses.get_participant_responses` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
3-3: `.aggregate_analytics.aggregate_analytics` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
4-4: `.save_participant_analytics.save_participant_analytics` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
5-5: `.compute_participant_course_analytics.compute_participant_course_analytics` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
apps/analytics/src/modules/participant_analytics/compute_correctness.py
10-12: Use `instance_dict["elementData"].get("options", None)` instead of an `if` block. Replace with `instance_dict["elementData"].get("options", None)` (SIM401)
Additional comments not posted (98)
apps/frontend-manage/src/pages/analytics/index.tsx (3)
4-6: LGTM! The function is correctly implemented and follows best practices for component rendering. The code changes are approved.
8-13: LGTM! The function is correctly implemented and follows best practices for localization. The code changes are approved.
16-16: LGTM! The default export is correctly implemented. The code changes are approved.
apps/frontend-manage/tsconfig.json (1)
19-19: LGTM! The added path mapping improves code readability and maintainability by standardizing import paths. The code changes are approved.
apps/analytics/pyproject.toml (3)
5-5: LGTM! The expanded authors list reflects a change in the project's collaborative authorship. The code changes are approved.
20-21: LGTM! The new section for development dependencies enhances type checking during development. The code changes are approved.
31-32: LGTM! The new configuration for `pyright` enforces stricter type checks across the codebase. The code changes are approved.
packages/prisma/src/prisma/migrations/20240826113537_participant_analytics_correctness_counts/migration.sql (1)
1-16: Verify the impact of dropping columns and ensure correct data types for new columns. Dropping the columns `meanFirstCorrectCount`, `meanLastCorrectCount`, and `collectedAchievements` will result in data loss. Ensure that this data is no longer needed or has been backed up. Additionally, verify that the new columns `firstCorrectCount`, `firstWrongCount`, `lastCorrectCount`, and `lastWrongCount` have the correct data types and constraints.
Run the following script to verify the data types and constraints of the new columns:
Verification successful
Verify the impact of dropping columns and ensure correct data types for new columns.
The new columns `firstCorrectCount`, `firstWrongCount`, `lastCorrectCount`, and `lastWrongCount` have the correct data type `REAL`, consistent with existing similar fields. However, dropping the columns `meanFirstCorrectCount`, `meanLastCorrectCount`, and `collectedAchievements` will result in data loss. Ensure that this data is no longer needed or has been backed up.
- File: packages/prisma/src/prisma/migrations/20240826113537_participant_analytics_correctness_counts/migration.sql
- Model Definition: packages/prisma/src/prisma/schema/analytics.prisma
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the data types and constraints of the new columns in the `ParticipantAnalytics` table.
# Test: Check the schema of the `ParticipantAnalytics` table. Expect: Correct data types and constraints for the new columns.
psql -c '\d+ "ParticipantAnalytics"'
Length of output: 110
Script:
#!/bin/bash
# Search for files related to the `ParticipantAnalytics` table schema or ORM models
fd -e sql -e js -e ts -e prisma
# Search for mentions of `ParticipantAnalytics` within those files to gather schema information
rg 'ParticipantAnalytics' --type sql --type js --type ts --type prisma
Length of output: 19081
Script:
#!/bin/bash
# Search for mentions of `ParticipantAnalytics` within `.prisma` files to gather schema information
rg 'ParticipantAnalytics' --type-add 'prisma:*.prisma' --type prisma
Length of output: 550
Script:
#!/bin/bash
# Extract the definition of the `ParticipantAnalytics` model from the `analytics.prisma` file
rg 'model ParticipantAnalytics {' -A 20 packages/prisma/src/prisma/schema/analytics.prisma
Length of output: 212
Script:
#!/bin/bash
# Correct the regex pattern and extract the definition of the `ParticipantAnalytics` model from the `analytics.prisma` file
rg 'model ParticipantAnalytics \{' -A 20 packages/prisma/src/prisma/schema/analytics.prisma
Length of output: 1004
apps/frontend-control/tsconfig.json (1)
19-19: LGTM! The new path alias mapping `~/*` pointing to `src/*` improves code readability and maintainability by providing a clearer structure for imports. The code changes are approved.
apps/frontend-manage/src/components/courses/actions/RunningLiveQuizLink.tsx (1)
17-17: LGTM! The modified URL structure correctly reflects the shift in navigation logic, directing users to a quiz cockpit instead of a session cockpit. The code changes are approved.
apps/frontend-pwa/tsconfig.json (1)
19-19: LGTM! The new path mapping for the alias `~/*` enhances module resolution and simplifies import statements. The code changes are approved.
apps/frontend-manage/next.config.js (1)
11-39: LGTM! The new asynchronous `redirects` function ensures seamless redirection from old session URLs to new quiz URLs, improving user experience and maintaining link integrity. The code changes are approved.
apps/frontend-manage/src/components/courses/actions/StartLiveQuizButton.tsx (1)
27-27: LGTM! The URL path modification in the `router.push` call correctly reflects the reorganization of the application's routing strategy. The code changes are approved.
apps/analytics/src/modules/participant_analytics/compute_participant_analytics.py (5)
1-4: LGTM! The import statements correctly import the necessary functions. The code changes are approved.
7-11: LGTM! The function definition and initial data retrieval are correctly implemented. The code changes are approved.
12-19: LGTM! The correctness computation is correctly implemented. The code changes are approved.
21-25: LGTM! The analytics aggregation and saving are correctly implemented. The code changes are approved.
27-30: LGTM! The data cleanup is correctly implemented. The code changes are approved.
apps/office-addin/src/content/components/URLForm.tsx (2)
24-26: LGTM! The regex for URL validation is correctly updated to match quiz URLs. The code changes are approved.
45-45: LGTM! The placeholder text is correctly updated to match the new URL format and provides clear guidance to the user. The code changes are approved.
packages/prisma/src/prisma/migrations/20240827152536_computed_at_analytics/migration.sql (2)
1-11: Address the warnings. The warnings indicate that new columns are added without default values. Verify that the tables are empty or provide default values for the new columns.
12-36: LGTM! The SQL statements to alter tables and add new columns are correctly implemented. The code changes are approved.
apps/frontend-manage/src/components/sessions/cockpit/CancelSessionModal.tsx (1)
48-48: Verify the updated navigation route. The route in the `router.push` call has been updated from `/sessions` to `/quizzes`. Ensure this change aligns with the intended user experience and that all related navigation flows are updated accordingly.
apps/analytics/src/modules/participant_analytics/get_participant_responses.py (5)
5-11: Verify the structure of `detail`. Ensure that `detail` always contains either `practiceQuiz` or `microLearning` to avoid potential `NoneType` errors.
14-21: Verify the structure of `participant_dict`. Ensure that `participant_dict` always contains `detailQuestionResponses` to avoid a potential `KeyError`.
24-31: Verify the DataFrame structure. Ensure that the DataFrame structure aligns with the expected schema for participant details.
35-49: Verify the structure of `detail`. Ensure that `detail` always contains `practiceQuiz` or `microLearning`, or handle cases where neither is present.
52-86: Verify the database query and verbose logging. Ensure that the database query retrieves the expected data and that the verbose logging provides useful information.
apps/frontend-control/src/components/sessions/EmbeddingModal.tsx (1)
38-38: Verify the updated URL structure. The URL structure in the `LazyHMACLink` function has been updated from `/sessions` to `/quizzes`. Ensure this change aligns with the intended functionality and that all related URL structures are updated accordingly.
apps/frontend-manage/src/components/sessions/EmbeddingModal.tsx (1)
32-32: LGTM! The URL path change from `/sessions/` to `/quizzes/` is consistent with the shift to a quiz-based model. The code changes are approved.
apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py (9)
1-5: LGTM! The necessary modules are imported correctly. The code changes are approved.
21-39: LGTM! The database query to find all participants and corresponding linked responses is correctly implemented. The code changes are approved.
41-51: LGTM! The creation of dataframes containing all detail responses and question responses is correctly implemented. The code changes are approved.
64-80: LGTM! The creation of pandas dataframes containing all question responses and details is correctly implemented. The code changes are approved.
81-90: LGTM! The computation of the correctness of each question response detail is correctly implemented. The code changes are approved.
92-94: LGTM! The computation of participant analytics is correctly implemented. The code changes are approved.
95-103: LGTM! The saving of aggregated analytics into the database is correctly implemented. The code changes are approved.
105-108: LGTM! The deletion of dataframes to avoid conflicts in the next iteration is correctly implemented. The code changes are approved.
110-110: LGTM! The function correctly returns the count of courses without responses. The code changes are approved.
apps/frontend-manage/src/pages/quizzes/index.tsx (1)
18-18: LGTM! Renaming the component from `SessionList` to `QuizList` is consistent with the shift to a quiz-based model. The code changes are approved.
apps/analytics/src/modules/participant_analytics/save_participant_analytics.py (3)
1-1: Import statement is correct. The import statement for `datetime` is correct. The code changes are approved.
4-5: Ensure timestamp consistency. The `computedAt` timestamp is set to the current date with a fixed time of "00:00:00.000Z". Ensure this is the intended behavior and that the timestamp format is consistent with other parts of the application.
91-92: Raise appropriate exceptions. The function raises a `ValueError` for unknown analytics types. Verify that `ValueError` is the appropriate exception type and that it is handled correctly in the calling code.
apps/frontend-manage/src/pages/quizzes/[id]/cockpit.tsx (2)
41-42: Update prefetch paths. The prefetch paths have been updated to reflect the transition from sessions to quizzes. Verify that the new prefetch paths are correct and that the corresponding pages exist.
118-118: Update push route. The push route has been updated to reflect the transition from sessions to quizzes. Verify that the new push route is correct and that the corresponding page exists.
apps/analytics/src/modules/participant_analytics/aggregate_analytics.py (9)
1-1: Import statement is correct. The import statement for `pandas` is correct. The code changes are approved.
4-21: Ensure the aggregation logic is correct. The aggregation logic for `df_analytics_counts` seems correct. Verify that the column names and aggregation functions match the expected schema.
23-31: Ensure correctness of the unstack operation. The unstack operation in `df_analytics_corr_temp` seems correct. Verify that the resulting dataframe has the expected structure.
33-65: Ensure correctness of the mean calculations. The mean calculations for the correctness columns seem correct. Verify that the division operations handle cases where the denominator is zero.
67-79: Ensure correctness of the aggregation logic. The aggregation logic for `df_analytics_correctness` seems correct. Verify that the column names and aggregation functions match the expected schema.
81-94: Ensure correctness of the column mapping. The column mapping for `df_analytics_counts` seems correct. Verify that the new column names match the expected schema.
96-126: Ensure correctness of the course analytics aggregation. The aggregation logic for `df_course_analytics` seems correct. Verify that the column names and aggregation functions match the expected schema.
127-144: Ensure correctness of the dataframe merge. The merge operations for the dataframes seem correct. Verify that the resulting dataframe has the expected structure and that there are no duplicate rows.
145-145: The return statement is correct. The code changes are approved.
apps/frontend-manage/src/components/common/Header.tsx (5)
33-35: LGTM! The changes to the navigation item label and `cy` attribute are consistent with the new focus of the application. The code changes are approved.
38-41: LGTM! The changes to the navigation item label, `href`, and `cy` attributes are consistent with the new focus of the application. The code changes are approved.
49-54: LGTM! The new navigation item for "analytics" enhances the functionality of the component by providing users with more options. The code changes are approved.
59-66: LGTM! The updated class names for the `Navigation` component indicate a visual redesign, transitioning from a dark theme to a lighter theme. The code changes are approved.
74-143: LGTM! The updates to the class names, properties, icon sizes, and hover effects for the `Navigation.ButtonItem` and `Navigation.TriggerItem` components enhance the user interface and experience. The `onClick` handlers have been correctly updated to route to new paths. The code changes are approved.
apps/analytics/src/modules/participant_analytics/compute_correctness.py (2)
17-96: LGTM! The function is well-structured and correctly computes the correctness columns based on the element type. The code changes are approved.
99-164: LGTM! The function is well-structured and correctly processes the data to compute correctness. The code changes are approved.
packages/prisma/src/prisma/schema/analytics.prisma (6)
19-20: LGTM! The addition of the `computedAt` field enhances the granularity of data captured for each participant. The code changes are approved.
38-41: LGTM! The addition of the new fields allows for more detailed tracking of correctness over questions attempted. The code changes are approved.
51-52: LGTM! The addition of the `createdAt` and `updatedAt` fields enhances the temporal context of the data. The code changes are approved.
Line range hint 64-79: LGTM! The addition of the `computedAt`, `createdAt`, and `updatedAt` fields enhances the granularity and temporal context of the data captured for each participant. The code changes are approved.
98-100: LGTM! The addition of the `createdAt` and `updatedAt` fields enhances the temporal context of the data. The code changes are approved.
119-121: LGTM! The addition of the `createdAt` and `updatedAt` fields enhances the temporal context of the data. The code changes are approved.
Also applies to: 140-142, 161-162
apps/analytics/src/notebooks/participant_analytics.ipynb (5)
4-8: LGTM! The markdown cell is correctly formatted and provides a clear section header. The code changes are approved.
11-27: LGTM! The imports and setup are correctly implemented. The code changes are approved.
54-58: LGTM! The markdown cell is correctly formatted and provides a clear section header. The code changes are approved.
113-117: LGTM! The markdown cell is correctly formatted and provides a clear section header. The code changes are approved.
150-157: LGTM! The database disconnection is correctly implemented. The code changes are approved.
apps/frontend-manage/src/components/interaction/AudienceInteraction.tsx (1)
91-91: LGTM! But verify the URL structure consistency. The change correctly updates the URL structure to reference a quizzes endpoint. Ensure that the new URL structure is consistent with the rest of the application. The code changes are approved.
Run the following script to verify the URL structure consistency:
Verification successful
URL Structure Consistency Verified
The `/quizzes/` URL structure is consistently used across the application, confirming that the update in `AudienceInteraction.tsx` aligns with the existing structure. This supports the correctness of the change.
- Files with consistent URL usage:
- apps/frontend-manage/next.config.js
- apps/frontend-manage/src/components/interaction/AudienceInteraction.tsx
- Various components and test files
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the URL structure consistency for the quizzes endpoint.
# Test: Search for the URL structure. Expect: Consistent URL structure across the application.
rg --type js --type tsx --type ts '/quizzes/'
Length of output: 73
Script:
#!/bin/bash
rg --type javascript --type typescript '/quizzes/'
Length of output: 85
Script:
#!/bin/bash
rg --type js --type ts '/quizzes/'
Length of output: 2487
apps/analytics/src/notebooks/archive/test.ipynb (3)
16-27: LGTM! The database connection setup is correctly implemented. The code changes are approved.
128-139: LGTM! The code correctly processes response details and activity analytics. The code changes are approved.
4-13: Avoid hardcoding the database URL. Hardcoding the database URL is not a best practice. Consider using environment variables or configuration files to manage sensitive information. Apply this diff to avoid hardcoding the database URL:
- os.environ['DATABASE_URL'] = 'postgresql://klicker:klicker@localhost:5432/klicker-prod'
+ os.environ['DATABASE_URL'] = os.getenv('DATABASE_URL', 'default_value')
Likely invalid or redundant comment.
apps/frontend-manage/src/components/sessions/cockpit/SessionTimeline.tsx (1)
198-198: Verify consistency of routing changes. The `href` attribute has been updated to reflect the transition from sessions to quizzes. Ensure that this change is consistent across the application and does not break any existing functionality.
Run the following script to verify the consistency of routing changes:
apps/frontend-manage/src/components/sessions/creation/liveQuiz/LiveSessionWizard.tsx (2)
241-241: Verify consistency of routing changes. The `router.push` call has been updated to redirect users to `/quizzes`. Ensure that this change is consistent across the application and does not break any existing functionality.
Run the following script to verify the consistency of routing changes:
259-259: Verify consistency of routing changes. The `router.push` call has been updated to redirect users to `/quizzes/${data?.createSession?.id}/cockpit`. Ensure that this change is consistent across the application and does not break any existing functionality.
Run the following script to verify the consistency of routing changes:
apps/frontend-manage/src/components/sessions/Session.tsx (2)
195-195: Verify consistency of routing changes. The `href` attribute has been updated to redirect users to `/quizzes/${session.id}/evaluation`. Ensure that this change is consistent across the application and does not break any existing functionality.
Run the following script to verify the consistency of routing changes:
180-180: Verify consistency of routing changes. The `href` attribute has been updated to redirect users to `/quizzes/${session.id}/cockpit`. Ensure that this change is consistent across the application and does not break any existing functionality.
Run the following script to verify the consistency of routing changes:
apps/analytics/src/notebooks/archive/daily_participant_analytics.ipynb (5)
4-10: LGTM! The markdown cell provides useful context for the notebook. The content is approved.
13-23: LGTM! The imports are necessary for the notebook's functionality. The code changes are approved.
72-76: LGTM! The markdown cell provides useful context for the notebook. The content is approved.
79-104: Validate the column selection. Ensure that all necessary columns are included in the DataFrame. Consider adding a validation step to check for missing columns. Apply this diff to add a validation step:
  df_details = convert_to_df(participant_response_details)
+ required_columns = ['score', 'pointsAwarded', 'xpAwarded', 'timeSpent', 'response', 'elementInstanceId', 'participantId', 'courseId']
+ missing_columns = [col for col in required_columns if col not in df_details.columns]
+ if missing_columns:
+     raise ValueError(f"Missing columns: {', '.join(missing_columns)}")
  df_details = df_details[['score', 'pointsAwarded', 'xpAwarded', 'timeSpent', 'response', 'elementInstanceId', 'participantId', 'courseId']]
325-331: LGTM! The database disconnection is correctly implemented. The code changes are approved.
apps/frontend-manage/src/pages/index.tsx (1)
87-88: LGTM! But verify the new routes. The changes correctly update the prefetching behavior. Ensure that the new routes `/quizzes/running` and `/quizzes` are valid and accessible. The code changes are approved.
Run the following script to verify the new routes:
cypress/cypress/e2e/H-practice-quiz-workflow.cy.ts (3)
30-30
: LGTM! But verify the consistency of the new selector.The change correctly updates the selector to match the new UI structure. Ensure that the new selector
data-cy="library"
is used consistently across the codebase.The code changes are approved.
Run the following script to verify the consistency of the new selector:
Verification successful
Selector
data-cy="library"
is used consistently across the codebase.The selector appears in multiple test files, confirming its consistent application. No further action is needed regarding this change.
cypress/cypress/e2e/K-group-activity-workflow.cy.ts
cypress/cypress/e2e/H-practice-quiz-workflow.cy.ts
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that the new selector is used consistently across the codebase. # Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector. rg --type js --type ts --type tsx --type jsx 'data-cy="library"'Length of output: 94
Script:
#!/bin/bash # Description: Verify that the new selector is used consistently across the codebase. # Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector. rg --type js --type ts 'data-cy="library"'Length of output: 417
397-397
: LGTM! But verify the consistency of the new selector.The change correctly updates the selector to match the new UI structure. Ensure that the new selector
data-cy="library"
is used consistently across the codebase.The code changes are approved.
Run the following script to verify the consistency of the new selector:
Verification successful
The new selector is used consistently across the codebase.
The
data-cy="library"
selector appears in multiple test files, confirming its consistent usage.
cypress/cypress/e2e/K-group-activity-workflow.cy.ts
cypress/cypress/e2e/H-practice-quiz-workflow.cy.ts
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that the new selector is used consistently across the codebase. # Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector. rg --type js --type ts --type tsx --type jsx 'data-cy="library"'Length of output: 94
Script:
#!/bin/bash # Description: Verify that the new selector is used consistently across the codebase. # Test: Search for the new selector in JavaScript and TypeScript files. rg --type js --type ts 'data-cy="library"'Length of output: 417
`201-201`: LGTM! But verify the consistency of the new selector. The change correctly updates the selector to match the new UI structure. Ensure that the new selector `data-cy="library"` is used consistently across the codebase. The code changes are approved.
Run the following script to verify the consistency of the new selector:
Verification successful
The selector `data-cy="library"` is used consistently across the codebase. The new selector appears in multiple test files, indicating that the change has been applied correctly. No inconsistencies were found.
cypress/cypress/e2e/H-practice-quiz-workflow.cy.ts
cypress/cypress/e2e/K-group-activity-workflow.cy.ts
Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Verify that the new selector is used consistently across the codebase.
# Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector.
rg --type js --type ts --type tsx --type jsx 'data-cy="library"'
```

Length of output: 94
Script:
```shell
#!/bin/bash
# Description: Verify that the new selector is used consistently across the codebase.
# Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector.
rg --type javascript --type typescript --type jsx 'data-cy="library"'
```

Length of output: 106
Script:
```shell
#!/bin/bash
# Description: Verify that the new selector is used consistently across the codebase.
# Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector.
rg --type js --type ts --type jsx 'data-cy="library"'
```

Length of output: 83
Script:
```shell
#!/bin/bash
# Description: Verify that the new selector is used consistently across the codebase.
# Test: Search for the new selector in the codebase. Expect: Only occurrences of the new selector.
rg --type js --type ts 'data-cy="library"'
```

Length of output: 417
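A note on the shorter outputs among these runs (94, 106, 83 characters): those are error messages, because ripgrep defines no built-in `tsx` or `jsx` file types (in current releases, `.tsx` files fall under the `ts` type and `.jsx` under `js`). Where no named type fits, include globs work the same way. A small self-contained sketch using plain `grep` on a scratch file (the directory and file names here are illustrative only, not paths from this repository):

```shell
# set up a scratch file that mimics a Cypress spec
mkdir -p /tmp/selector-demo
printf "%s\n" "cy.get('[data-cy=\"library\"]').click()" > /tmp/selector-demo/a.cy.ts

# include-globs replace named file types (ripgrep equivalent: rg -g '*.ts' -g '*.tsx' ...)
grep -rn --include='*.ts' --include='*.tsx' 'data-cy="library"' /tmp/selector-demo
```

The same glob approach avoids the failed `--type tsx`/`--type jsx` invocations entirely.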
cypress/cypress/e2e/K-group-activity-workflow.cy.ts (1)
`19-19`: Verify the correctness of the new selector. Ensure that the new selector `[data-cy="library"]` correctly targets the intended UI element and that the test passes with this change. Run the following script to verify the correctness of the new selector:
cypress/cypress/e2e/F-live-quiz-workflow.cy.ts (4)
`653-653`: Verify the correctness of the new selector. Ensure that the new selector `[data-cy="quizzes"]` correctly targets the intended UI element and that the test passes with this change. Run the following script to verify the correctness of the new selector:
`680-680`: Verify the correctness of the new selector. Ensure that the new selector `[data-cy="quizzes"]` correctly targets the intended UI element and that the test passes with this change. Run the following script to verify the correctness of the new selector:
`760-760`: Verify the correctness of the new selector. Ensure that the new selector `[data-cy="quizzes"]` correctly targets the intended UI element and that the test passes with this change. Run the following script to verify the correctness of the new selector:
`689-689`: Verify the correctness of the new URL path. Ensure that the new URL path `/quizzes/` correctly targets the intended endpoint and that the test passes with this change. Run the following script to verify the correctness of the new URL path:
Verification successful
The new URL path
/quizzes/
is correctly integrated across the codebase. The change from
/sessions/
to/quizzes/
is consistent and appears intentional, with the new path being used in multiple relevant contexts, such as routing and component links.
- Files and contexts using
/quizzes/
:
apps/office-addin/src/content/components/URLForm.tsx
cypress/cypress/e2e/F-live-quiz-workflow.cy.ts
apps/frontend-manage/next.config.js
- Various components in
apps/frontend-manage/src
This confirms that the new URL path is correctly targeting the intended endpoint.
Scripts executed
The following scripts were executed for the analysis:
Script:
```shell
#!/bin/bash
# Description: Verify the correctness of the new URL path `/quizzes/`.
# Test: Search for the new URL path usage. Expect: Only occurrences of the new URL path.
rg --type js --type ts -A 5 $'/quizzes/'
```

Length of output: 11105
packages/i18n/messages/en.ts (3)
`696-696`: LGTM! The entry `library` is correctly added and follows the existing pattern. The code changes are approved.
`697-697`: LGTM! The entry `quizzes` is correctly added and follows the existing pattern. The code changes are approved.
`698-698`: LGTM! The entry `analytics` is correctly added and follows the existing pattern. The code changes are approved.
apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py
apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py
# Conflicts: # apps/frontend-manage/src/components/common/Header.tsx # apps/office-addin/src/content/components/URLForm.tsx
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (13)
- apps/frontend-control/src/components/sessions/EmbeddingModal.tsx (1 hunks)
- apps/frontend-manage/src/components/common/Header.tsx (1 hunks)
- apps/frontend-manage/src/components/courses/actions/EvaluationLinkLiveQuiz.tsx (1 hunks)
- apps/frontend-manage/src/components/courses/actions/RunningLiveQuizLink.tsx (1 hunks)
- apps/frontend-manage/src/components/interaction/AudienceInteraction.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/EmbeddingModal.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/Session.tsx (2 hunks)
- apps/frontend-manage/src/components/sessions/cockpit/CancelSessionModal.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/cockpit/SessionTimeline.tsx (1 hunks)
- apps/frontend-manage/src/components/sessions/evaluation/EvaluationControlBar.tsx (1 hunks)
- apps/frontend-manage/src/pages/index.tsx (1 hunks)
- apps/office-addin/src/content/components/URLForm.tsx (2 hunks)
- packages/markdown/src/public/utilities.css (1 hunks)
Files skipped from review due to trivial changes (2)
- apps/frontend-manage/src/components/courses/actions/EvaluationLinkLiveQuiz.tsx
- packages/markdown/src/public/utilities.css
Files skipped from review as they are similar to previous changes (10)
- apps/frontend-control/src/components/sessions/EmbeddingModal.tsx
- apps/frontend-manage/src/components/courses/actions/RunningLiveQuizLink.tsx
- apps/frontend-manage/src/components/interaction/AudienceInteraction.tsx
- apps/frontend-manage/src/components/sessions/EmbeddingModal.tsx
- apps/frontend-manage/src/components/sessions/Session.tsx
- apps/frontend-manage/src/components/sessions/cockpit/CancelSessionModal.tsx
- apps/frontend-manage/src/components/sessions/cockpit/SessionTimeline.tsx
- apps/frontend-manage/src/components/sessions/evaluation/EvaluationControlBar.tsx
- apps/frontend-manage/src/pages/index.tsx
- apps/office-addin/src/content/components/URLForm.tsx
Additional comments not posted (6)
apps/frontend-manage/src/components/common/Header.tsx (6)
`33-35`: LGTM! The changes to the first navigation item are straightforward and align with the new focus of the application.
The code changes are approved.
`38-41`: LGTM! The changes to the second navigation item reflect the shift in the application's navigation structure.
The code changes are approved.
`49-54`: LGTM! The addition of the new navigation item for "analytics" enhances the component's functionality by providing users with more options.
The code changes are approved.
`59-66`: LGTM! The stylistic changes to the `Navigation` component indicate a visual redesign of the header. The code changes are approved.
`74-78`: LGTM! The changes to the `Navigation.ButtonItem` component improve the visual consistency of the navigation items. The code changes are approved.
`86-142`: LGTM! The changes to the `Navigation.TriggerItem` component improve the visual consistency and functionality of the navigation items. The code changes are approved.
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
| --- | --- | --- | --- | --- | --- |
| 1509424 | Triggered | Generic Password | a47b11f | util/_restore-db-dev.sh | View secret |
| 1509424 | Triggered | Generic Password | 01eee07 | docker-compose.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
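As a concrete sketch of the pre-commit suggestion, a `.pre-commit-config.yaml` along these lines wires GitGuardian's `ggshield` scanner into local commits (the hook id follows ggshield's published pre-commit integration; the `rev` tag below is a placeholder to be pinned to a real release, and a GitGuardian API token must be configured for the scan to run):

```yaml
repos:
  - repo: https://github.com/gitguardian/ggshield
    rev: v1.25.0 # placeholder: pin to an actual released tag
    hooks:
      - id: ggshield
        language_version: python3
        stages: [commit]
```

With `pre-commit install` in place, a commit that introduces a string matching a known secret pattern is rejected before it ever reaches the remote.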
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.
🛑 Comments failed to post (62)
apps/frontend-manage/src/components/analytics/activity/LevelTag.tsx (1)
6-24: 🛠️ Refactor suggestion
Consider refactoring to reduce duplication and improve accessibility.
The current implementation has duplicated JSX structure and relies solely on colors to convey information.
Consider applying these improvements:
```diff
+interface TagConfig {
+  bgColor: string;
+  label: string;
+  ariaLabel: string;
+}
+
+const TAG_CONFIGS: Record<1 | 2 | 3, TagConfig> = {
+  3: { bgColor: 'bg-green-700', label: 'levelHigh', ariaLabel: 'High level' },
+  2: { bgColor: 'bg-yellow-500', label: 'levelMedium', ariaLabel: 'Medium level' },
+  1: { bgColor: 'bg-red-600', label: 'levelLow', ariaLabel: 'Low level' },
+};
+
 function LevelTag({ level }: { level: 1 | 2 | 3 }) {
   const t = useTranslations()
+  const config = TAG_CONFIGS[level]

-  if (level === 3) {
-    return (
-      <div className="w-max rounded-md border bg-green-700 px-1 py-0.5 text-white">
-        {t('manage.analytics.levelHigh')}
-      </div>
-    )
-  } else if (level === 2) {
-    return (
-      <div className="w-max rounded-md border bg-yellow-500 px-1 py-0.5 text-white">
-        {t('manage.analytics.levelMedium')}
-      </div>
-    )
-  } else {
-    return (
-      <div className="w-max rounded-md border bg-red-600 px-1 py-0.5 text-white">
-        {t('manage.analytics.levelLow')}
-      </div>
-    )
-  }
+  return (
+    <div
+      className={`w-max rounded-md border ${config.bgColor} px-1 py-0.5 text-white`}
+      role="status"
+      aria-label={config.ariaLabel}
+    >
+      {t(`manage.analytics.${config.label}`)}
+    </div>
+  )
 }
```

This refactoring:
- Reduces code duplication
- Adds proper ARIA attributes for accessibility
- Makes the component more maintainable
- Centralizes the configuration
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.interface TagConfig { bgColor: string; label: string; ariaLabel: string; } const TAG_CONFIGS: Record<1 | 2 | 3, TagConfig> = { 3: { bgColor: 'bg-green-700', label: 'levelHigh', ariaLabel: 'High level' }, 2: { bgColor: 'bg-yellow-500', label: 'levelMedium', ariaLabel: 'Medium level' }, 1: { bgColor: 'bg-red-600', label: 'levelLow', ariaLabel: 'Low level' }, }; function LevelTag({ level }: { level: 1 | 2 | 3 }) { const t = useTranslations() const config = TAG_CONFIGS[level] return ( <div className={`w-max rounded-md border ${config.bgColor} px-1 py-0.5 text-white`} role="status" aria-label={config.ariaLabel} > {t(`manage.analytics.${config.label}`)} </div> ) }
apps/frontend-manage/src/components/analytics/StackedBarChartLabel.tsx (1)
14-26: 🛠️ Refactor suggestion
Enhance accessibility and flexibility of the text element.
The SVG text element needs accessibility improvements and could be more flexible.
Consider these improvements:
```diff
 <text
   x={x + width / 2}
   y={y + height / 2 + 1}
   fill="white"
   fontSize={14}
   textAnchor="middle"
   dominantBaseline="middle"
-  className="font-bold"
+  className="font-bold select-none"
+  role="presentation"
+  aria-hidden="true"
 >
-  {value} %
+  {Math.round(value)}%
 </text>
```

Also consider:
- Making
fill
andfontSize
configurable via props- Adding text truncation for large numbers
- Documenting the purpose of the 1px vertical offset or removing it if unnecessary
Committable suggestion skipped: line range outside the PR's diff.
apps/analytics/src/notebooks/participant_performance.ipynb (2)
94-102: 🛠️ Refactor suggestion
Ensure proper database connection cleanup
The database disconnection should be guaranteed even if errors occur.
Consider using a context manager:
```diff
+from contextlib import contextmanager
+
+@contextmanager
+def database_connection():
+    db = Prisma()
+    try:
+        db.connect()
+        yield db
+    finally:
+        db.disconnect()
+
+with database_connection() as db:
+    # Your database operations here
+    pass
```

Committable suggestion skipped: line range outside the PR's diff.
70-91: 🛠️ Refactor suggestion
Add error handling and performance optimizations
The main computation loop lacks error handling and could be optimized for performance.
Consider these improvements:
- Add try-catch blocks for database operations
- Implement batch processing for database operations
- Add progress tracking using
tqdm
- Implement memory-efficient processing for large datasets
```diff
+from tqdm import tqdm
+
 # Fetch all courses from the database
-df_courses = get_running_past_courses(db)
+try:
+    df_courses = get_running_past_courses(db)
+except Exception as e:
+    print(f"Error fetching courses: {e}")
+    db.disconnect()
+    raise

-for idx, course in df_courses.iterrows():
+for idx, course in tqdm(df_courses.iterrows(), total=len(df_courses)):
     course_id = course["id"]
-    print(f"Processing course", idx, "of", len(df_courses), "with id", course_id)
+    try:
         # fetch all question responses linked to this course
         question_responses = db.questionresponse.find_many(where={"courseId": course_id})
         df_responses = pd.DataFrame(list(map(lambda x: x.dict(), question_responses)))

         if df_responses.empty:
             print("No responses linked to course", course_id)
             continue

         df_performance = compute_response_error_rates(df_responses)
         df_performance = compute_performance_levels(df_performance)

         # store computed performance analytics in the corresponding database table
         save_participant_performance(db, df_performance, course_id)
+    except Exception as e:
+        print(f"Error processing course {course_id}: {e}")
+        continue
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from tqdm import tqdm # Fetch all courses from the database try: df_courses = get_running_past_courses(db) except Exception as e: print(f"Error fetching courses: {e}") db.disconnect() raise # Iterate over the course and fetch all question responses linked to it for idx, course in tqdm(df_courses.iterrows(), total=len(df_courses)): course_id = course["id"] try: # fetch all question responses linked to this course question_responses = db.questionresponse.find_many(where={"courseId": course_id}) df_responses = pd.DataFrame(list(map(lambda x: x.dict(), question_responses))) # if no responses are linked to the course, skip the iteration if df_responses.empty: print("No responses linked to course", course_id) continue df_performance = compute_response_error_rates(df_responses) df_performance = compute_performance_levels(df_performance) # store computed performance analytics in the corresponding database table save_participant_performance(db, df_performance, course_id) except Exception as e: print(f"Error processing course {course_id}: {e}") continue
apps/analytics/src/notebooks/participant_course_analytics.ipynb (2)
47-55:
⚠️ Potential issue (Security): Remove hardcoded database credentials.
Several critical issues in the database setup:
- Database credentials should not be hardcoded in the notebook
- Missing error handling for database connection
- The
verbose
flag is defined but never used. Consider this safer approach:
```diff
-os.environ["DATABASE_URL"] = "postgresql://klicker:klicker@localhost:5432/klicker-prod"
+# Load database URL from environment variable
+if "DATABASE_URL" not in os.environ:
+    raise ValueError("DATABASE_URL environment variable is required")

-db.connect()
+try:
+    db.connect()
+except Exception as e:
+    print(f"Failed to connect to database: {e}")
+    raise
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.db = Prisma() # set the environment variable DATABASE_URL to the connection string of your database # Load database URL from environment variable if "DATABASE_URL" not in os.environ: raise ValueError("DATABASE_URL environment variable is required") try: db.connect() except Exception as e: print(f"Failed to connect to database: {e}") raise # Script settings verbose = False
71-92: 🛠️ Refactor suggestion
Enhance robustness of the computation logic.
The main computation loop could benefit from several improvements:
- Better progress tracking for long-running operations
- Error handling for computation functions
- Transaction management for database operations
Consider this enhanced implementation:
+from tqdm.notebook import tqdm +from contextlib import contextmanager +@contextmanager +def db_transaction(db): + try: + yield + db.commit() + except Exception as e: + db.rollback() + raise # find all courses that started in the past df_courses = get_running_past_courses(db) # iterate over all courses and compute the participant course analytics -for idx, course in df_courses.iterrows(): +for idx, course in tqdm(df_courses.iterrows(), total=len(df_courses), desc="Processing courses"): print(f"Processing course", idx, "of", len(df_courses), "with id", course["id"]) # compute the number of active weeks per participant and activity level - df_activity = get_active_weeks(db, course) + try: + df_activity = get_active_weeks(db, course) + except Exception as e: + print(f"Error processing course {course['id']}: {e}") + continue # if the dataframe is empty, no participant was active in the course if df_activity.empty: print("No participant was active in the course, skipping") continue # compute the number of active days per week and mean elements per day - df_activity = compute_participant_activity( - db, df_activity, course["id"], course["startDate"], course["endDate"] - ) + try: + df_activity = compute_participant_activity( + db, df_activity, course["id"], course["startDate"], course["endDate"] + ) + except Exception as e: + print(f"Error computing activity for course {course['id']}: {e}") + continue # store the computed participant course analytics - save_participant_course_analytics(db, df_activity) + with db_transaction(db): + save_participant_course_analytics(db, df_activity)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from tqdm.notebook import tqdm from contextlib import contextmanager @contextmanager def db_transaction(db): try: yield db.commit() except Exception as e: db.rollback() raise # find all courses that started in the past df_courses = get_running_past_courses(db) # iterate over all courses and compute the participant course analytics for idx, course in tqdm(df_courses.iterrows(), total=len(df_courses), desc="Processing courses"): print(f"Processing course", idx, "of", len(df_courses), "with id", course["id"]) # compute the number of active weeks per participant and activity level try: df_activity = get_active_weeks(db, course) except Exception as e: print(f"Error processing course {course['id']}: {e}") continue # if the dataframe is empty, no participant was active in the course if df_activity.empty: print("No participant was active in the course, skipping") continue # compute the number of active days per week and mean elements per day try: df_activity = compute_participant_activity( db, df_activity, course["id"], course["startDate"], course["endDate"] ) except Exception as e: print(f"Error computing activity for course {course['id']}: {e}") continue # store the computed participant course analytics with db_transaction(db): save_participant_course_analytics(db, df_activity)
apps/frontend-manage/src/components/analytics/ActivityTypeFilter.tsx (1)
18-28: 🛠️ Refactor suggestion
Consider deriving activity types from enum and sharing code with ElementTypeFilter.
- The items array is hardcoded with specific ActivityTypes. Consider deriving it from the ActivityType enum for maintainability.
- There's significant code duplication with ElementTypeFilter. Consider creating a shared generic filter component.
+import { useMemo } from 'react' function ActivityTypeFilter({ activityType, setActivityType, }: { activityType: ActivityType | 'all' setActivityType: Dispatch<SetStateAction<ActivityType | 'all'>> }) { const t = useTranslations() + + const items = useMemo(() => [ + { value: 'all', label: t('manage.analytics.allActivityTypes') }, + ...Object.values(ActivityType).map((value) => ({ + value, + label: t(`shared.generic.${value.toLowerCase()}s`), + })), + ], [t]) return ( <SelectField label={t('manage.analytics.activityType')} - items={[ - { value: 'all', label: t('manage.analytics.allActivityTypes') }, - { - value: ActivityType.PracticeQuiz, - label: t('shared.generic.practiceQuizzes'), - }, - { - value: ActivityType.MicroLearning, - label: t('shared.generic.microlearnings'), - }, - ]} + items={items} value={activityType}Alternative: Create a shared generic filter component:
interface TypeFilterProps<T> { type: T | 'all' setType: Dispatch<SetStateAction<T | 'all'>> enumValues: Record<string, T> labelPrefix: string translationKey: string } function TypeFilter<T extends string>({ type, setType, enumValues, labelPrefix, translationKey, }: TypeFilterProps<T>) { const t = useTranslations() const items = useMemo(() => [ { value: 'all', label: t(`${labelPrefix}.all${translationKey}Types`) }, ...Object.values(enumValues).map((value) => ({ value, label: t(`${labelPrefix}.${value.toLowerCase()}s`), })), ], [t, enumValues, labelPrefix, translationKey]) return ( <SelectField label={t(`${labelPrefix}.${translationKey}Type`)} items={items} value={type} onChange={(value) => setType(value as T | 'all')} className={{ select: { root: 'w-52', trigger: 'h-8' } }} /> ) } // Usage: <TypeFilter type={activityType} setType={setActivityType} enumValues={ActivityType} labelPrefix="manage.analytics" translationKey="activity" />apps/frontend-manage/src/components/analytics/ActivitiesElementsSwitch.tsx (1)
21-40: 🛠️ Refactor suggestion
Enhance accessibility for button group
The button group implementation should include proper ARIA attributes for better screen reader support.
```diff
-    <div className="flex flex-row">
+    <div
+      className="flex flex-row"
+      role="group"
+      aria-label={t('manage.analytics.viewTypeSelector')}
+    >
       <Button
         basic
         onClick={() => setType('activity')}
+        aria-pressed={type === 'activity'}
         className={{
           root: `py-0.25 h-8 rounded-l border !border-r-0 border-solid px-2 ${type === 'activity' ? 'bg-primary-100 border-primary-100 text-white' : ''}`,
         }}
       >
         {t('manage.analytics.activities')}
       </Button>
       <Button
         basic
         onClick={() => setType('instance')}
+        aria-pressed={type === 'instance'}
         className={{
           root: `h-8 rounded-r border !border-l-0 border-solid px-2 py-0.5 ${type === 'instance' ? 'bg-primary-100 border-primary-100 text-white' : ''}`,
         }}
       >
         {t('manage.analytics.elements')}
       </Button>
     </div>
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.<div className="flex flex-row" role="group" aria-label={t('manage.analytics.viewTypeSelector')} > <Button basic onClick={() => setType('activity')} aria-pressed={type === 'activity'} className={{ root: `py-0.25 h-8 rounded-l border !border-r-0 border-solid px-2 ${type === 'activity' ? 'bg-primary-100 border-primary-100 text-white' : ''}`, }} > {t('manage.analytics.activities')} </Button> <Button basic onClick={() => setType('instance')} aria-pressed={type === 'instance'} className={{ root: `h-8 rounded-r border !border-l-0 border-solid px-2 py-0.5 ${type === 'instance' ? 'bg-primary-100 border-primary-100 text-white' : ''}`, }} > {t('manage.analytics.elements')} </Button> </div>
apps/frontend-manage/src/components/analytics/activity/useTotalStudentActivityHistogram.ts (2)
21-24:
⚠️ Potential issue: Add error handling for empty participant activity
The hook should handle the case when there's no participant activity data.
```diff
   return useMemo(() => {
+    if (!participantActivity?.length) {
+      return []
+    }
     const activeWeeks = participantActivity.map((p) => p.activeWeeks)
     const histogram = groupStudentActivityBins(activeWeeks, courseWeeks)
     const stats = computeHistogramStatistics(activeWeeks)
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.return useMemo(() => { if (!participantActivity?.length) { return [] } const activeWeeks = participantActivity.map((p) => p.activeWeeks) const histogram = groupStudentActivityBins(activeWeeks, courseWeeks) const stats = computeHistogramStatistics(activeWeeks)
6-12:
⚠️ Potential issue: Add input validation for histogram binning
The
groupStudentActivityBins
function should validate inputs to prevent potential issues with largetotalWeeks
values or negative numbers.function groupStudentActivityBins(activeWeeks: number[], totalWeeks: number) { + if (totalWeeks < 0 || totalWeeks > 52) { + throw new Error('Total weeks must be between 0 and 52') + } + if (activeWeeks.some(weeks => weeks < 0 || weeks > totalWeeks)) { + throw new Error('Active weeks must be between 0 and total weeks') + } const histogram = new Array(totalWeeks + 1).fill(0) activeWeeks.forEach((weeks) => { histogram[weeks] = (histogram[weeks] || 0) + 1 }) return histogram }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.function groupStudentActivityBins(activeWeeks: number[], totalWeeks: number) { if (totalWeeks < 0 || totalWeeks > 52) { throw new Error('Total weeks must be between 0 and 52') } if (activeWeeks.some(weeks => weeks < 0 || weeks > totalWeeks)) { throw new Error('Active weeks must be between 0 and total weeks') } const histogram = new Array(totalWeeks + 1).fill(0) activeWeeks.forEach((weeks) => { histogram[weeks] = (histogram[weeks] || 0) + 1 }) return histogram }
apps/frontend-manage/src/components/analytics/activity/DailyActivityTimeSeries.tsx (1)
21-34: 🛠️ Refactor suggestion
Optimize data transformation logic
The current implementation has several potential improvements:
- Date formatting logic could be extracted to a utility function
- Percentage calculation should handle division by zero
- Consider using useMemo for the transformed data to prevent unnecessary recalculations
```diff
+const formatDate = (date: Date): string => {
+  return date.toLocaleDateString('en-GB', {
+    day: '2-digit',
+    month: '2-digit',
+    year: 'numeric',
+  }).replace(/\//g, '-')
+}
+
+const calculatePercentage = (active: number, total: number): number => {
+  return total > 0 ? (active / total) * 100 : 0
+}
+
 function DailyActivityTimeSeries({ activity, courseParticipants }: DailyActivityTimeSeriesProps) {
   const t = useTranslations()
+  const transformedData = useMemo(() =>
+    activity.map((item) => ({
+      date: formatDate(new Date(item.date)),
+      activeParticipants: calculatePercentage(item.activeParticipants, courseParticipants)
+    })), [activity, courseParticipants])

   return (
     // ...
-    <ActivityTimeSeriesPlot
-      activityData={activity.map((item) => {
-        const date = new Date(item.date)
-        return {
-          date: date.toLocaleDateString('en-GB', {
-            day: '2-digit',
-            month: '2-digit',
-            year: 'numeric',
-          }).replace(/\//g, '-'),
-          activeParticipants: (item.activeParticipants / courseParticipants) * 100,
-        }
-      })}
-    />
+    <ActivityTimeSeriesPlot activityData={transformedData} />
   )
```

Committable suggestion skipped: line range outside the PR's diff.
apps/frontend-manage/src/components/analytics/activity/ActivityTimeSeriesPlot.tsx (1)
14-24: 🛠️ Refactor suggestion
Enhance type safety and validation for chart data
The component could benefit from stronger typing and validation:
- Define explicit interfaces for the data structure
- Add runtime validation for required props
```diff
+interface ActivityDataPoint {
+  date: string
+  activeParticipants: number
+  activeParticipantsReference?: number
+}
+
+interface ActivityTimeSeriesPlotProps {
+  singleCourse?: boolean
+  currentCourse?: string
+  comparisonCourse?: string
+  activityData: ActivityDataPoint[]
+}
+
-function ActivityTimeSeriesPlot({
+function ActivityTimeSeriesPlot({
   singleCourse = true,
   currentCourse,
   comparisonCourse,
   activityData,
-}: {
-  singleCourse?: boolean
-  currentCourse?: string
-  comparisonCourse?: string
-  activityData: { date: string; activeParticipants: number }[]
-}) {
+}: ActivityTimeSeriesPlotProps) {
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.interface ActivityDataPoint { date: string activeParticipants: number activeParticipantsReference?: number } interface ActivityTimeSeriesPlotProps { singleCourse?: boolean currentCourse?: string comparisonCourse?: string activityData: ActivityDataPoint[] } function ActivityTimeSeriesPlot({ singleCourse = true, currentCourse, comparisonCourse, activityData, }: ActivityTimeSeriesPlotProps) {
apps/frontend-manage/src/components/analytics/activity/SuspendedCourseComparison.tsx (1)
24-31:
⚠️ Potential issue: Improve error handling and data transformation
The GraphQL query handling could be improved:
- Add error handling for the query
- Memoize the transformed courses data
- Add loading state for initial data fetch
```diff
+import { ApolloError } from '@apollo/client'
+import { UserNotification } from '@uzh-bf/design-system'

 function SuspendedCourseComparison({ ... }) {
-  const { data } = useSuspenseQuery(GetUserCoursesDocument)
+  const { data, error } = useSuspenseQuery(GetUserCoursesDocument)

+  const courses = useMemo(() =>
+    data?.userCourses
+      ?.filter((course) => course.id !== router.query.courseId)
+      .map((course) => ({
+        label: course.name,
+        value: course.id,
+      })) ?? [],
+    [data?.userCourses, router.query.courseId]
+  )

+  if (error) {
+    return (
+      <UserNotification
+        message={t('shared.errors.failedToLoadCourses')}
+        type="error"
+      />
+    )
+  }

-  const courses =
-    data.userCourses
-      ?.filter((course) => course.id !== router.query.courseId)
-      .map((course) => ({
-        label: course.name,
-        value: course.id,
-      })) ?? []
```

Committable suggestion skipped: line range outside the PR's diff.
apps/frontend-manage/src/components/analytics/activity/WeeklyActivityTimeSeries.tsx (2)
40-73: 🛠️ Refactor suggestion
Refactor complex data transformation logic
The data transformation logic is complex and could benefit from better organization and safety checks.
- Extract data transformation logic:
const formatComparisonData = ( activity: ParticipantActivityTimestamp[], courseParticipants: number, secondActivity: any[], secondParticipants: number ) => { return activity.map((item, idx) => ({ date: t('manage.analytics.weekN', { number: idx + 1 }), activeParticipants: (item.activeParticipants / courseParticipants) * 100, activeParticipantsReference: secondActivity.length > idx ? (secondActivity[idx].activeParticipants / secondParticipants) * 100 : undefined, })) } const formatSingleCourseData = ( activity: ParticipantActivityTimestamp[], courseParticipants: number ) => { return activity.map((item) => ({ date: new Date(item.date).toLocaleDateString('en-GB', { day: '2-digit', month: '2-digit', year: 'numeric', }).replace(/\//g, '-'), activeParticipants: (item.activeParticipants / courseParticipants) * 100, })) }
- Add safety check for division by zero:
- (secondActivity[idx].activeParticipants / secondParticipants) * 100
+ secondParticipants > 0
+   ? (secondActivity[idx].activeParticipants / secondParticipants) * 100
+   : 0
26-33: 🛠️ Refactor suggestion
Add error handling for the GraphQL query
The query implementation should handle error states to improve user experience.
Add error handling:
- const { data, loading } = useQuery(GetCourseWeeklyActivityDocument, {
+ const { data, loading, error } = useQuery(GetCourseWeeklyActivityDocument, {
    variables: { courseId: courseComparison?.id },
    skip: typeof courseComparison === 'undefined',
  })
+
+ if (error) {
+   return (
+     <UserNotification
+       message={t('manage.analytics.errorLoadingComparison')}
+       type="error"
+     />
+   )
+ }
apps/analytics/src/modules/instance_activity_performance/agg_activity_performance.py (1)
1-6:
⚠️ Potential issue
Fix the to_dict() call and add type hints, docstring, and error handling
The function has several issues that need to be addressed:
- The `to_dict()` call on line 4 has no effect as its result is not assigned to any variable
- Missing docstring for documentation
- No error handling for edge cases
Here's the suggested fix:
-def agg_activity_performance(df_instance_performance): - activity_performance = df_instance_performance.mean() - activity_performance.drop("instanceId", inplace=True) - activity_performance.to_dict() - - return activity_performance +from typing import Dict +import pandas as pd + +def agg_activity_performance(df_instance_performance: pd.DataFrame) -> Dict[str, float]: + """ + Compute mean performance metrics across all instances. + + Args: + df_instance_performance: DataFrame containing performance metrics with instanceId + + Returns: + Dictionary of aggregated performance metrics + + Raises: + ValueError: If DataFrame is empty or missing required columns + """ + if df_instance_performance.empty: + raise ValueError("DataFrame is empty") + + if "instanceId" not in df_instance_performance.columns: + raise ValueError("DataFrame missing required column: instanceId") + + activity_performance = df_instance_performance.mean() + activity_performance.drop("instanceId", inplace=True) + return activity_performance.to_dict()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from typing import Dict import pandas as pd def agg_activity_performance(df_instance_performance: pd.DataFrame) -> Dict[str, float]: """ Compute mean performance metrics across all instances. Args: df_instance_performance: DataFrame containing performance metrics with instanceId Returns: Dictionary of aggregated performance metrics Raises: ValueError: If DataFrame is empty or missing required columns """ if df_instance_performance.empty: raise ValueError("DataFrame is empty") if "instanceId" not in df_instance_performance.columns: raise ValueError("DataFrame missing required column: instanceId") activity_performance = df_instance_performance.mean() activity_performance.drop("instanceId", inplace=True) return activity_performance.to_dict()
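The no-op `to_dict()` call flagged above is easy to reproduce in isolation. The snippet below is a minimal sketch (the metric columns are illustrative stand-ins for the real DataFrame) showing that `Series.to_dict()` returns a new dict rather than mutating the Series, so the result must be captured or returned:

```python
import pandas as pd

# Toy stand-in for df_instance_performance (column names are illustrative only)
df = pd.DataFrame({
    "instanceId": [1, 2],
    "totalCorrectRate": [0.5, 0.7],
    "averageTimeSpent": [10.0, 20.0],
})

means = df.mean()
means.drop("instanceId", inplace=True)

means.to_dict()            # no-op as written: the returned dict is discarded
result = means.to_dict()   # the fix: capture (or return) the conversion

print(type(result).__name__)
```

Returning the conversion directly, as the suggested fix does, avoids the discarded intermediate entirely.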
apps/analytics/src/modules/instance_activity_performance/get_course_activities.py (1)
1-14: 🛠️ Refactor suggestion
Add type hints, error handling, and refactor duplicate query logic
The function needs improvements in several areas:
- Missing type hints
- Duplicate query structure
- No error handling for database operations
- No validation for course_id
Here's the suggested refactor:
+from typing import List, Dict, Tuple +from prisma import Prisma + +def get_activities_by_type(db: Prisma, course_id: str, collection: str) -> List[Dict]: + """Helper function to fetch activities from a specific collection""" + try: + results = getattr(db, collection).find_many( + where={"courseId": course_id}, + include={"stacks": {"include": {"elements": {"include": {"responses": True}}}}}, + ) + return list(map(lambda x: x.dict(), results)) + except Exception as e: + raise Exception(f"Failed to fetch {collection}: {str(e)}") + +def get_course_activities(db: Prisma, course_id: str) -> Tuple[List[Dict], List[Dict]]: + """ + Fetch practice quizzes and microlearning activities for a course. + + Args: + db: Prisma database instance + course_id: ID of the course + + Returns: + Tuple of (practice_quizzes, microlearning_activities) + + Raises: + ValueError: If course_id is invalid + Exception: If database operations fail + """ + if not course_id: + raise ValueError("course_id is required") + + return ( + get_activities_by_type(db, course_id, "practicequiz"), + get_activities_by_type(db, course_id, "microlearning") + )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from typing import List, Dict, Tuple from prisma import Prisma def get_activities_by_type(db: Prisma, course_id: str, collection: str) -> List[Dict]: """Helper function to fetch activities from a specific collection""" try: results = getattr(db, collection).find_many( where={"courseId": course_id}, include={"stacks": {"include": {"elements": {"include": {"responses": True}}}}}, ) return list(map(lambda x: x.dict(), results)) except Exception as e: raise Exception(f"Failed to fetch {collection}: {str(e)}") def get_course_activities(db: Prisma, course_id: str) -> Tuple[List[Dict], List[Dict]]: """ Fetch practice quizzes and microlearning activities for a course. Args: db: Prisma database instance course_id: ID of the course Returns: Tuple of (practice_quizzes, microlearning_activities) Raises: ValueError: If course_id is invalid Exception: If database operations fail """ if not course_id: raise ValueError("course_id is required") return ( get_activities_by_type(db, course_id, "practicequiz"), get_activities_by_type(db, course_id, "microlearning") )
apps/analytics/src/modules/instance_activity_performance/save_instance_performances.py (1)
1-44: 🛠️ Refactor suggestion
Add type hints, validation, error handling, and optimize performance
The function needs several improvements:
- Missing type hints
- No validation of required DataFrame columns
- Inefficient row-by-row processing
- No error handling for database operations
- Hard-coded column names
Here's the suggested refactor:
+from typing import List, Dict, Set +import pandas as pd +from prisma import Prisma + +REQUIRED_COLUMNS: Set[str] = { + "instanceId", "responseCount", "totalErrorRate", "totalPartialRate", + "totalCorrectRate", "averageTimeSpent" +} + +DETAILED_COLUMNS: Set[str] = { + "firstErrorRate", "firstPartialRate", "firstCorrectRate", + "lastErrorRate", "lastPartialRate", "lastCorrectRate" +} + +def validate_dataframe(df: pd.DataFrame, total_only: bool) -> None: + """Validate DataFrame has required columns""" + missing_columns = REQUIRED_COLUMNS - set(df.columns) + if missing_columns: + raise ValueError(f"Missing required columns: {missing_columns}") + + if not total_only: + missing_detailed = DETAILED_COLUMNS - set(df.columns) + if missing_detailed: + raise ValueError(f"Missing detailed columns: {missing_detailed}") + +def prepare_upsert_data(row: pd.Series, course_id: str, total_only: bool) -> Dict: + """Prepare data for upsert operation""" + values = {col: row[col] for col in REQUIRED_COLUMNS if col != "instanceId"} + + if not total_only: + values.update({col: row[col] for col in DETAILED_COLUMNS}) + + create_values = values.copy() + create_values.update({ + "instance": {"connect": {"id": row["instanceId"]}}, + "course": {"connect": {"id": course_id}}, + }) + + return { + "where": {"instanceId": row["instanceId"]}, + "data": {"create": create_values, "update": values} + } + +def save_instance_performances( + db: Prisma, + df_instance_performance: pd.DataFrame, + course_id: str, + total_only: bool = False +) -> None: + """ + Save or update instance performance metrics in the database. 
+ + Args: + db: Prisma database instance + df_instance_performance: DataFrame with performance metrics + course_id: ID of the course + total_only: If True, only save total metrics + + Raises: + ValueError: If DataFrame is invalid or missing required columns + Exception: If database operations fail + """ + if df_instance_performance.empty: + raise ValueError("DataFrame is empty") + + if not course_id: + raise ValueError("course_id is required") + + validate_dataframe(df_instance_performance, total_only) + + try: + # Prepare all upsert operations + upsert_data = [ + prepare_upsert_data(row, course_id, total_only) + for _, row in df_instance_performance.iterrows() + ] + + # Execute upserts in batches + batch_size = 100 + for i in range(0, len(upsert_data), batch_size): + batch = upsert_data[i:i + batch_size] + db.instanceperformance.batch_( + [db.instanceperformance.upsert(**data) for data in batch] + ) + except Exception as e: + raise Exception(f"Failed to save instance performances: {str(e)}")This refactor:
- Adds type hints for better code maintainability
- Validates DataFrame columns before processing
- Uses batch processing for better performance
- Adds proper error handling
- Extracts magic strings into constants
- Separates concerns into helper functions
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from typing import List, Dict, Set import pandas as pd from prisma import Prisma REQUIRED_COLUMNS: Set[str] = { "instanceId", "responseCount", "totalErrorRate", "totalPartialRate", "totalCorrectRate", "averageTimeSpent" } DETAILED_COLUMNS: Set[str] = { "firstErrorRate", "firstPartialRate", "firstCorrectRate", "lastErrorRate", "lastPartialRate", "lastCorrectRate" } def validate_dataframe(df: pd.DataFrame, total_only: bool) -> None: """Validate DataFrame has required columns""" missing_columns = REQUIRED_COLUMNS - set(df.columns) if missing_columns: raise ValueError(f"Missing required columns: {missing_columns}") if not total_only: missing_detailed = DETAILED_COLUMNS - set(df.columns) if missing_detailed: raise ValueError(f"Missing detailed columns: {missing_detailed}") def prepare_upsert_data(row: pd.Series, course_id: str, total_only: bool) -> Dict: """Prepare data for upsert operation""" values = {col: row[col] for col in REQUIRED_COLUMNS if col != "instanceId"} if not total_only: values.update({col: row[col] for col in DETAILED_COLUMNS}) create_values = values.copy() create_values.update({ "instance": {"connect": {"id": row["instanceId"]}}, "course": {"connect": {"id": course_id}}, }) return { "where": {"instanceId": row["instanceId"]}, "data": {"create": create_values, "update": values} } def save_instance_performances( db: Prisma, df_instance_performance: pd.DataFrame, course_id: str, total_only: bool = False ) -> None: """ Save or update instance performance metrics in the database. 
Args: db: Prisma database instance df_instance_performance: DataFrame with performance metrics course_id: ID of the course total_only: If True, only save total metrics Raises: ValueError: If DataFrame is invalid or missing required columns Exception: If database operations fail """ if df_instance_performance.empty: raise ValueError("DataFrame is empty") if not course_id: raise ValueError("course_id is required") validate_dataframe(df_instance_performance, total_only) try: # Prepare all upsert operations upsert_data = [ prepare_upsert_data(row, course_id, total_only) for _, row in df_instance_performance.iterrows() ] # Execute upserts in batches batch_size = 100 for i in range(0, len(upsert_data), batch_size): batch = upsert_data[i:i + batch_size] db.instanceperformance.batch_( [db.instanceperformance.upsert(**data) for data in batch] ) except Exception as e: raise Exception(f"Failed to save instance performances: {str(e)}")
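The batch-execution idea in the refactor above can be illustrated independently of Prisma. A small sketch, with a hypothetical `flush` callback standing in for the database batch call:

```python
from typing import Iterator, List


def batched(items: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


sent_batches: List[List[dict]] = []


def flush(batch: List[dict]) -> None:
    # Hypothetical stand-in for the Prisma batch call in the suggestion
    sent_batches.append(batch)


rows = [{"instanceId": i} for i in range(250)]
for batch in batched(rows, 100):
    flush(batch)

print([len(b) for b in sent_batches])  # → [100, 100, 50]
```

250 rows with a batch size of 100 yield three round trips instead of 250, which is the point of batching the upserts.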
apps/analytics/src/modules/instance_activity_performance/save_activity_performance.py (1)
38-41: 🛠️ Refactor suggestion
Add error handling for database operations
The database operation should be wrapped in try-except to handle potential errors.
+ try:
      db.activityperformance.upsert(
          where=where_clause,
          data={"create": create_values, "update": values},
      )
+ except Exception as e:
+     raise RuntimeError(f"Failed to save activity performance: {str(e)}") from e
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

try:
    db.activityperformance.upsert(
        where=where_clause,
        data={"create": create_values, "update": values},
    )
except Exception as e:
    raise RuntimeError(f"Failed to save activity performance: {str(e)}") from e
apps/analytics/src/modules/instance_activity_performance/compute_instance_performance.py (3)
25-29: 🛠️ Refactor suggestion
Add error handling for DataFrame creation
The DataFrame creation from instance responses lacks error handling.
- df_responses = pd.DataFrame(instance["responses"])
+ try:
+     df_responses = pd.DataFrame(instance["responses"])
+ except Exception as e:
+     print(f"Warning: Failed to process responses for instance {instance['id']}: {str(e)}")
+     continue
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

try:
    df_responses = pd.DataFrame(instance["responses"])
except Exception as e:
    print(f"Warning: Failed to process responses for instance {instance['id']}: {str(e)}")
    continue

if df_responses.empty:
    continue
73-81:
⚠️ Potential issue
Potential division by zero risk
The calculations of rates don't handle the case where trialsCount is zero.
- df_responses["responseErrorRate"] = (
-     df_responses["wrongCount"] / df_responses["trialsCount"]
- )
+ safe_trials = df_responses["trialsCount"].replace(0, float("nan"))
+ df_responses["responseErrorRate"] = (
+     df_responses["wrongCount"].div(safe_trials).fillna(0)
+ )
- df_responses["responsePartialRate"] = (
-     df_responses["partialCorrectCount"] / df_responses["trialsCount"]
- )
+ df_responses["responsePartialRate"] = (
+     df_responses["partialCorrectCount"].div(safe_trials).fillna(0)
+ )
- df_responses["responseCorrectRate"] = (
-     df_responses["correctCount"] / df_responses["trialsCount"]
- )
+ df_responses["responseCorrectRate"] = (
+     df_responses["correctCount"].div(safe_trials).fillna(0)
+ )

Note that a guard such as `df_responses["trialsCount"].any()` tests the Series as a whole, not individual rows, so rows with zero trials would still divide by zero; replacing zeros with NaN before dividing handles the edge case row by row.

Committable suggestion skipped: line range outside the PR's diff.
4-21: 🛠️ Refactor suggestion
Consider splitting the function and adding type hints
The function is quite long and handles multiple responsibilities. Consider breaking it down and adding type hints.
+from typing import Dict, Any +import pandas as pd + -def compute_instance_performance(db, activity, total_only=False): +def compute_instance_performance( + db: Any, + activity: Dict[str, Any], + total_only: bool = False +) -> pd.DataFrame: + """Compute performance metrics for activity instances. + + Args: + db: Database connection + activity: Activity data containing stacks and elements + total_only: If True, compute only total metrics + + Returns: + DataFrame containing performance metrics for each instance + """Committable suggestion skipped: line range outside the PR's diff.
apps/analytics/src/notebooks/instance_activity_performance.ipynb (2)
134-135: 🛠️ Refactor suggestion
Ensure database disconnection in case of errors
The database disconnection should be guaranteed even if errors occur.
- # Disconnect from the database
- db.disconnect()
+ try:
+     # Disconnect from the database
+     db.disconnect()
+ except Exception as e:
+     print(f"Error disconnecting from database: {str(e)}")
+     raise
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

try:
    # Disconnect from the database
    db.disconnect()
except Exception as e:
    print(f"Error disconnecting from database: {str(e)}")
    raise
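Note that wrapping only `db.disconnect()` in try/except, as the diff above does, reports disconnect failures but does not guarantee the call runs when an earlier notebook step raises. A try/finally around the whole workflow does; a minimal sketch with a fake client standing in for the Prisma instance:

```python
class FakeDb:
    """Stand-in for the Prisma client used in the notebook."""

    def __init__(self) -> None:
        self.connected = False

    def connect(self) -> None:
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False


def run_analytics(db: FakeDb, step) -> None:
    db.connect()
    try:
        step()
    finally:
        db.disconnect()  # runs even when step() raises


db = FakeDb()
try:
    run_analytics(db, lambda: 1 / 0)  # simulate a failing analytics step
except ZeroDivisionError:
    pass

print(db.connected)  # → False: the connection was closed despite the error
```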
52-57:
⚠️ Potential issue
Security concern: Hardcoded database credentials
The database connection string contains hardcoded credentials, which is a security risk.
- os.environ["DATABASE_URL"] = "postgresql://klicker:klicker@localhost:5432/klicker-prod" + # Load database URL from environment variable or .env file + if "DATABASE_URL" not in os.environ: + raise ValueError("DATABASE_URL environment variable must be set")Committable suggestion skipped: line range outside the PR's diff.
apps/analytics/src/modules/participant_performance/save_participant_performance.py (1)
1-30: 🛠️ Refactor suggestion
Add error handling and input validation
The function needs several improvements for robustness and performance:
- Add error handling for database operations
- Validate DataFrame columns before processing
- Consider batch processing for better performance
- Reduce code duplication in create/update blocks
Here's a suggested implementation:
def save_participant_performance(db, df_performance, course_id): + required_columns = [ + "participantId", "firstErrorRate", "firstPerformance", + "lastErrorRate", "lastPerformance", "totalErrorRate", "totalPerformance" + ] + + # Validate input DataFrame + missing_columns = [col for col in required_columns if col not in df_performance.columns] + if missing_columns: + raise ValueError(f"Missing required columns: {missing_columns}") + + # Prepare common data structure + performance_data = { + "firstErrorRate": "firstErrorRate", + "firstPerformance": "firstPerformance", + "lastErrorRate": "lastErrorRate", + "lastPerformance": "lastPerformance", + "totalErrorRate": "totalErrorRate", + "totalPerformance": "totalPerformance", + } + + try: for _, row in df_performance.iterrows(): db.participantperformance.upsert( where={ "participantId_courseId": { "participantId": row["participantId"], "courseId": course_id, } }, data={ "create": { - "firstErrorRate": row["firstErrorRate"], - "firstPerformance": row["firstPerformance"], - "lastErrorRate": row["lastErrorRate"], - "lastPerformance": row["lastPerformance"], - "totalErrorRate": row["totalErrorRate"], - "totalPerformance": row["totalPerformance"], + **{k: row[v] for k, v in performance_data.items()}, "participant": {"connect": {"id": row["participantId"]}}, "course": {"connect": {"id": course_id}}, }, "update": { - "firstErrorRate": row["firstErrorRate"], - "firstPerformance": row["firstPerformance"], - "lastErrorRate": row["lastErrorRate"], - "lastPerformance": row["lastPerformance"], - "totalErrorRate": row["totalErrorRate"], - "totalPerformance": row["totalPerformance"], + **{k: row[v] for k, v in performance_data.items()}, }, }, ) + except Exception as e: + raise Exception(f"Failed to save participant performance: {str(e)}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.def save_participant_performance(db, df_performance, course_id): required_columns = [ "participantId", "firstErrorRate", "firstPerformance", "lastErrorRate", "lastPerformance", "totalErrorRate", "totalPerformance" ] # Validate input DataFrame missing_columns = [col for col in required_columns if col not in df_performance.columns] if missing_columns: raise ValueError(f"Missing required columns: {missing_columns}") # Prepare common data structure performance_data = { "firstErrorRate": "firstErrorRate", "firstPerformance": "firstPerformance", "lastErrorRate": "lastErrorRate", "lastPerformance": "lastPerformance", "totalErrorRate": "totalErrorRate", "totalPerformance": "totalPerformance", } try: for _, row in df_performance.iterrows(): db.participantperformance.upsert( where={ "participantId_courseId": { "participantId": row["participantId"], "courseId": course_id, } }, data={ "create": { **{k: row[v] for k, v in performance_data.items()}, "participant": {"connect": {"id": row["participantId"]}}, "course": {"connect": {"id": course_id}}, }, "update": { **{k: row[v] for k, v in performance_data.items()}, }, }, ) except Exception as e: raise Exception(f"Failed to save participant performance: {str(e)}")
apps/analytics/src/modules/participant_performance/compute_performance_levels.py (1)
1-39: 🛠️ Refactor suggestion
Improve robustness and maintainability
The function needs several improvements:
- Add input validation
- Make performance thresholds configurable
- Handle edge cases
- Reduce code duplication
Here's a suggested implementation:
+from typing import Dict, List, Literal + +PerformanceLevel = Literal["LOW", "MEDIUM", "HIGH"] + +def compute_performance_levels( + df_performance, + quantiles: List[float] = [0.25, 0.75], + metrics: Dict[str, str] = { + "firstErrorRate": "firstPerformance", + "lastErrorRate": "lastPerformance", + "totalErrorRate": "totalPerformance", + }, +) -> pd.DataFrame: + """ + Compute performance levels based on error rate quantiles. + + Args: + df_performance: DataFrame containing error rate columns + quantiles: List of quantile values for thresholds + metrics: Dictionary mapping error rate columns to performance columns + + Returns: + DataFrame with added performance level columns + """ + if df_performance.empty: + return df_performance + + # Validate input DataFrame + missing_columns = [col for col in metrics.keys() if col not in df_performance.columns] + if missing_columns: + raise ValueError(f"Missing required columns: {missing_columns}") + + def assign_performance_levels( + df: pd.DataFrame, error_col: str, perf_col: str + ) -> pd.DataFrame: + qs = df[error_col].quantile(quantiles) + df[perf_col] = "MEDIUM" + df.loc[df[error_col] <= qs[quantiles[0]], perf_col] = "HIGH" + df.loc[df[error_col] >= qs[quantiles[1]], perf_col] = "LOW" + return df + + for error_col, perf_col in metrics.items(): + df_performance = assign_performance_levels(df_performance, error_col, perf_col) + + return df_performance -def compute_performance_levels(df_performance): - # set the performance levels based on the quantiles - first_qs = df_performance.firstErrorRate.quantile([0.25, 0.75]) - last_qs = df_performance.lastErrorRate.quantile([0.25, 0.75]) - total_qs = df_performance.totalErrorRate.quantile([0.25, 0.75]) - - first_q1 = first_qs[0.25] - first_q3 = first_qs[0.75] - last_q1 = last_qs[0.25] - last_q3 = last_qs[0.75] - total_q1 = total_qs[0.25] - total_q3 = total_qs[0.75] - - # set the performance levels based on the quantiles - df_performance["firstPerformance"] = "MEDIUM" - 
df_performance.loc[ - df_performance.firstErrorRate <= first_q1, "firstPerformance" - ] = "HIGH" - df_performance.loc[ - df_performance.firstErrorRate >= first_q3, "firstPerformance" - ] = "LOW" - - df_performance["lastPerformance"] = "MEDIUM" - df_performance.loc[df_performance.lastErrorRate <= last_q1, "lastPerformance"] = ( - "HIGH" - ) - df_performance.loc[df_performance.lastErrorRate >= last_q3, "lastPerformance"] = ( - "LOW" - ) - - df_performance["totalPerformance"] = "MEDIUM" - df_performance.loc[ - df_performance.totalErrorRate <= total_q1, "totalPerformance" - ] = "HIGH" - df_performance.loc[ - df_performance.totalErrorRate >= total_q3, "totalPerformance" - ] = "LOW" - - return df_performance📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from typing import Dict, List, Literal PerformanceLevel = Literal["LOW", "MEDIUM", "HIGH"] def compute_performance_levels( df_performance, quantiles: List[float] = [0.25, 0.75], metrics: Dict[str, str] = { "firstErrorRate": "firstPerformance", "lastErrorRate": "lastPerformance", "totalErrorRate": "totalPerformance", }, ) -> pd.DataFrame: """ Compute performance levels based on error rate quantiles. Args: df_performance: DataFrame containing error rate columns quantiles: List of quantile values for thresholds metrics: Dictionary mapping error rate columns to performance columns Returns: DataFrame with added performance level columns """ if df_performance.empty: return df_performance # Validate input DataFrame missing_columns = [col for col in metrics.keys() if col not in df_performance.columns] if missing_columns: raise ValueError(f"Missing required columns: {missing_columns}") def assign_performance_levels( df: pd.DataFrame, error_col: str, perf_col: str ) -> pd.DataFrame: qs = df[error_col].quantile(quantiles) df[perf_col] = "MEDIUM" df.loc[df[error_col] <= qs[quantiles[0]], perf_col] = "HIGH" df.loc[df[error_col] >= qs[quantiles[1]], perf_col] = "LOW" return df for error_col, perf_col in metrics.items(): df_performance = assign_performance_levels(df_performance, error_col, perf_col) return df_performance
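The quantile-based bucketing that both the original function and the refactor implement can be shown on toy data: error rates at or below the 25th percentile become HIGH performance, at or above the 75th percentile LOW, everything in between MEDIUM.

```python
import pandas as pd

df = pd.DataFrame({"totalErrorRate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]})

# With nine evenly spaced values, the quartiles fall exactly on 0.2 and 0.6
q1, q3 = df["totalErrorRate"].quantile([0.25, 0.75])

df["totalPerformance"] = "MEDIUM"
df.loc[df["totalErrorRate"] <= q1, "totalPerformance"] = "HIGH"
df.loc[df["totalErrorRate"] >= q3, "totalPerformance"] = "LOW"

print(df["totalPerformance"].tolist())
# → ['HIGH', 'HIGH', 'HIGH', 'MEDIUM', 'MEDIUM', 'MEDIUM', 'LOW', 'LOW', 'LOW']
```

Because the thresholds are relative quantiles, roughly a quarter of participants land in each extreme bucket regardless of the absolute error-rate scale, which is the design choice the helper in the refactor parameterizes.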
apps/analytics/src/modules/participant_performance/compute_response_error_rates.py (1)
1-54: 🛠️ Refactor suggestion
Add validation and optimize DataFrame operations
The function needs several improvements:
- Add input validation and edge case handling
- Optimize DataFrame operations
- Add proper documentation
- Handle potential division by zero
Here's a suggested implementation:
+from typing import List +import pandas as pd + def compute_response_error_rates(df_responses): + """ + Compute various error rates from participant responses. + + Args: + df_responses: DataFrame containing response data with columns: + - participantId + - wrongCount + - trialsCount + - firstResponseCorrectness + - lastResponseCorrectness + + Returns: + DataFrame containing computed error rates and counts + """ + required_columns = [ + "participantId", "wrongCount", "trialsCount", + "firstResponseCorrectness", "lastResponseCorrectness" + ] + + # Validate input DataFrame + if df_responses.empty: + return pd.DataFrame() + + missing_columns = [col for col in required_columns if col not in df_responses.columns] + if missing_columns: + raise ValueError(f"Missing required columns: {missing_columns}") + # compute the error rate for each response itself - df_responses["responseErrorRate"] = ( - df_responses["wrongCount"] / df_responses["trialsCount"] - ) + df_responses["responseErrorRate"] = df_responses["wrongCount"].div( + df_responses["trialsCount"] + ).fillna(0) # compute the total number of responses, number of wrong first and last responses, # total number of wrong responses, and the average total error rate - df_response_count = ( - df_responses.groupby("participantId").size().reset_index(name="responseCount") - ) - df_first_response_wrong_count = ( - df_responses[df_responses["firstResponseCorrectness"] == "WRONG"] - .groupby("participantId") - .size() - .reset_index(name="wrongFirstResponseCount") - ) - df_last_response_wrong_count = ( - df_responses[df_responses["lastResponseCorrectness"] == "WRONG"] - .groupby("participantId") - .size() - .reset_index(name="wrongLastResponseCount") - ) - df_total_error_rate = ( - df_responses[["participantId", "responseErrorRate"]] - .groupby("participantId") - .agg("mean") - .reset_index() - .rename( - columns={ - "responseErrorRate": "totalErrorRate", - } - ) - ) + agg_dict = { + "responseCount": ("participantId", "size"), + 
"wrongFirstResponseCount": ("firstResponseCorrectness", + lambda x: (x == "WRONG").sum()), + "wrongLastResponseCount": ("lastResponseCorrectness", + lambda x: (x == "WRONG").sum()), + "totalErrorRate": ("responseErrorRate", "mean") + } + + df_performance = df_responses.groupby("participantId").agg(**agg_dict).reset_index() - # combine the dataframes into a single one - df_performance = ( - df_response_count.merge( - df_first_response_wrong_count, on="participantId", how="left" - ) - .merge(df_last_response_wrong_count, on="participantId", how="left") - .merge(df_total_error_rate, on="participantId", how="left") - .fillna(0) + # compute the first and last error rates safely + df_performance["firstErrorRate"] = df_performance["wrongFirstResponseCount"].div( + df_performance["responseCount"] + ).fillna(0) + df_performance["lastErrorRate"] = df_performance["wrongLastResponseCount"].div( + df_performance["responseCount"] + ).fillna(0) - # compute the first and last error rates - df_performance["firstErrorRate"] = ( - df_performance["wrongFirstResponseCount"] / df_performance["responseCount"] - ) - df_performance["lastErrorRate"] = ( - df_performance["wrongLastResponseCount"] / df_performance["responseCount"] ) return df_performanceThe improvements include:
- Added input validation and edge case handling
- Optimized DataFrame operations by using a single groupby
- Added proper documentation with type hints
- Added safe division operations with fillna
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.from typing import List import pandas as pd def compute_response_error_rates(df_responses): """ Compute various error rates from participant responses. Args: df_responses: DataFrame containing response data with columns: - participantId - wrongCount - trialsCount - firstResponseCorrectness - lastResponseCorrectness Returns: DataFrame containing computed error rates and counts """ required_columns = [ "participantId", "wrongCount", "trialsCount", "firstResponseCorrectness", "lastResponseCorrectness" ] # Validate input DataFrame if df_responses.empty: return pd.DataFrame() missing_columns = [col for col in required_columns if col not in df_responses.columns] if missing_columns: raise ValueError(f"Missing required columns: {missing_columns}") # compute the error rate for each response itself df_responses["responseErrorRate"] = df_responses["wrongCount"].div( df_responses["trialsCount"] ).fillna(0) # compute the total number of responses, number of wrong first and last responses, # total number of wrong responses, and the average total error rate agg_dict = { "responseCount": ("participantId", "size"), "wrongFirstResponseCount": ("firstResponseCorrectness", lambda x: (x == "WRONG").sum()), "wrongLastResponseCount": ("lastResponseCorrectness", lambda x: (x == "WRONG").sum()), "totalErrorRate": ("responseErrorRate", "mean") } df_performance = df_responses.groupby("participantId").agg(**agg_dict).reset_index() # compute the first and last error rates safely df_performance["firstErrorRate"] = df_performance["wrongFirstResponseCount"].div( df_performance["responseCount"] ).fillna(0) df_performance["lastErrorRate"] = df_performance["wrongLastResponseCount"].div( df_performance["responseCount"] ).fillna(0) return df_performance
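The single-groupby approach proposed in the refactor relies on pandas named aggregation; a small sketch on toy response data (the column values are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "participantId": ["a", "a", "b"],
    "firstResponseCorrectness": ["WRONG", "CORRECT", "WRONG"],
    "responseErrorRate": [1.0, 0.0, 0.5],
})

# One groupby pass computes all per-participant metrics at once
summary = (
    df.groupby("participantId")
    .agg(
        responseCount=("participantId", "size"),
        wrongFirstResponseCount=(
            "firstResponseCorrectness",
            lambda s: int((s == "WRONG").sum()),
        ),
        totalErrorRate=("responseErrorRate", "mean"),
    )
    .reset_index()
)

print(summary.to_dict("records"))
```

Collapsing the four separate groupby/merge chains into one pass also removes the need for the trailing `fillna(0)` on merge misses, since every metric is computed over the same groups.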
apps/analytics/src/modules/participant_analytics/get_participant_responses.py (1)
5-11:
⚠️ Potential issue: Ensure fallback handling if both practiceQuiz and microLearning are None.
If both keys are None, the current else-branch would attempt to access a "courseId" in a nonexistent object, leading to a runtime error. Consider adding an explicit check or fallback.def map_details(detail, participantId): - courseId = ( - detail["practiceQuiz"]["courseId"] - if detail["practiceQuiz"] - else detail["microLearning"]["courseId"] - ) + if detail["practiceQuiz"] is not None: + courseId = detail["practiceQuiz"]["courseId"] + elif detail["microLearning"] is not None: + courseId = detail["microLearning"]["courseId"] + else: + # Handle the edge case explicitly + courseId = None return {**detail, "participantId": participantId, "courseId": courseId}📝 Committable suggestion
apps/analytics/src/modules/aggregated_course_analytics/compute_weekday_activity.py (1)
104-107:
⚠️ Potential issue: Fix double appending to avoid incorrect counts.
The code appends zero if empty, then always appends the total count. Rows with data add both 0 and the actual length. The logic should do one or the other.if df_weekday.empty: - collector.append(0) - collector.append(len(df_weekday)) + collector.append(0) + else: + collector.append(len(df_weekday))📝 Committable suggestion
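The corrected branch can be exercised with a minimal stand-in for the per-weekday DataFrames; the two toy frames below are illustrative, not real activity data:

```python
import pandas as pd

# Exactly one count should be appended per weekday bucket
collector = []
for df_weekday in [pd.DataFrame(), pd.DataFrame({"x": [1, 2]})]:
    if df_weekday.empty:
        collector.append(0)
    else:
        collector.append(len(df_weekday))

print(collector)  # [0, 2]
```

With the original code, the non-empty frame would have contributed two entries (`0` and `2`), skewing downstream weekday statistics.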
apps/analytics/src/modules/aggregated_analytics/save_aggregated_analytics.py (1)
40-58: 💡 Codebase verification
Add error handling for find_unique_or_raise operation
The `find_unique_or_raise` operation is used within a loop processing analytics data, and a failure to find a course would interrupt the entire analytics aggregation. Add a try-except block to handle missing courses gracefully:

```python
try:
    course = db.course.find_unique_or_raise(
        where={"id": row["courseId"]}, include={...}
    )
except Exception as e:
    # Log the error and skip this course
    print(f"Error processing course {row['courseId']}: {str(e)}")
    continue
```
Handle potential exceptions from find_unique_or_raise.
If the course is unexpectedly missing, the operation will raise an exception. Consider wrapping this call to provide a fallback or a more graceful error.
Let me gather more context about the error handling patterns in this codebase.
Let me try a different approach to check the error handling patterns.
Let me check one more thing to understand if there's any global error handling or if this is part of a larger try-except block.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Checks for rescue or fallback usage of find_unique_or_raise calls across the codebase. rg --context 3 'find_unique_or_raise'Length of output: 936
Script:
#!/bin/bash # Check for try-except blocks around database operations in the analytics module rg -A 5 -B 5 "try:.*\n.*db\." apps/analytics/src/modules/ # Check the implementation of find_unique_or_raise to understand its behavior ast-grep --pattern 'def find_unique_or_raise'Length of output: 299
Script:
#!/bin/bash # Check for try-except blocks around database operations in the analytics module rg -U -A 5 -B 5 "try:" apps/analytics/src/modules/ # Check for any existing error handling in save_aggregated_analytics.py rg -A 5 -B 5 "except|raise|error" apps/analytics/src/modules/aggregated_analytics/save_aggregated_analytics.pyLength of output: 763
Script:
#!/bin/bash # Get the full content of save_aggregated_analytics.py to understand the broader context cat apps/analytics/src/modules/aggregated_analytics/save_aggregated_analytics.pyLength of output: 4843
apps/analytics/src/modules/activity_progress/save_practice_quiz_progress.py (2)
1-9: 🛠️ Refactor suggestion
Add type hints and docstring documentation
The function lacks type hints and documentation. This is particularly important for analytics code where data integrity is crucial.
Add type hints and documentation:
+ from typing import Any + + def save_practice_quiz_progress( + db: Any, + course_participants: int, + started_count: int, + completed_count: int, + repeated_count: int, + course_id: str, + quiz_id: str, + ) -> None: + """Save or update practice quiz progress statistics. + + Args: + db: Database connection instance + course_participants: Total number of course participants + started_count: Number of participants who started the quiz + completed_count: Number of participants who completed the quiz + repeated_count: Number of participants who repeated the quiz + course_id: Unique identifier of the course + quiz_id: Unique identifier of the practice quiz + """📝 Committable suggestion
1-23:
⚠️ Potential issue: Add input validation and error handling
The function should validate input parameters and handle potential database errors.
Add validation and error handling:
+ from typing import Any + def save_practice_quiz_progress( db: Any, course_participants: int, started_count: int, completed_count: int, repeated_count: int, course_id: str, quiz_id: str, ) -> None: + # Validate input parameters + if not all(isinstance(count, int) and count >= 0 for count in [ + course_participants, started_count, completed_count, repeated_count + ]): + raise ValueError("All count parameters must be non-negative integers") + + if not all(isinstance(id, str) and id for id in [course_id, quiz_id]): + raise ValueError("Course ID and Quiz ID must be non-empty strings") + values = { "totalCourseParticipants": course_participants, "startedCount": started_count, "completedCount": completed_count, "repeatedCount": repeated_count, } creation_values = values.copy() creation_values["course"] = {"connect": {"id": course_id}} creation_values["practiceQuiz"] = {"connect": {"id": quiz_id}} + try: db.activityprogress.upsert( where={"practiceQuizId": quiz_id}, data={"create": creation_values, "update": values}, ) + except Exception as e: + raise RuntimeError(f"Failed to save practice quiz progress: {str(e)}") from e📝 Committable suggestion
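One detail worth keeping from the reviewed function is the `values.copy()` before attaching the relation connectors: without the copy, the `course`/`practiceQuiz` keys would leak into the plain update payload. A minimal sketch of why the copy matters (field names mirror the reviewed function):

```python
# Plain update payload with only scalar counters
values = {"startedCount": 3, "completedCount": 1}

# Creation payload adds relation connectors; copy() keeps `values` untouched
creation_values = values.copy()
creation_values["course"] = {"connect": {"id": "course-1"}}

print("course" in values)           # False
print("course" in creation_values)  # True
```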
apps/analytics/pyproject.toml (1)
31-32: 💡 Codebase verification
Based on the evidence gathered, I can now provide the final response:
Type hints required for strict type checking
The codebase lacks proper type hints which are required for strict type checking mode. Found issues:
- Missing return type annotations in all functions
- Missing parameter type hints (e.g., `db`, `df_activity`, `df_analytics`, `timestamp`)
- 21 out of 38 Python files contain functions without type hints
Key files needing attention:
apps/analytics/src/modules/participant_course_analytics/save_participant_course_analytics.py
apps/analytics/src/modules/participant_analytics/save_participant_analytics.py
- And 19 other files identified with missing type hints
🔗 Analysis chain
Verify strict type checking compatibility
The `strict` type checking mode is enabled, which is great for code quality. However, ensure all modules have proper type hints to prevent build failures.
Let me verify the type hints in more detail by checking the function signatures in these files.
Let me check for any type annotations in the function bodies and variable declarations to get a more complete picture.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for missing type hints in Python files # Find Python files without type hints echo "Files potentially missing type hints:" rg -l "def [^(]*\([^:]*\)\s*[^->]*:" -g "*.py" # Count total Python files for comparison echo -e "\nTotal Python files:" fd -e py | wc -lLength of output: 1916
Script:
#!/bin/bash # Check a sample of files for their function signatures for file in $(head -n 3 <<EOF apps/analytics/src/modules/participant_course_analytics/save_participant_course_analytics.py apps/analytics/src/modules/participant_analytics/save_participant_analytics.py apps/analytics/src/modules/participant_performance/save_participant_performance.py EOF ); do echo "=== $file ===" rg "^def" "$file" -A 1 done # Also check if type imports are present echo -e "\nChecking type-related imports:" rg "^from typing import|^import typing" apps/analytics/src/modules/Length of output: 2294
Script:
#!/bin/bash # Check for type annotations in the first few files for file in $(head -n 2 <<EOF apps/analytics/src/modules/participant_course_analytics/save_participant_course_analytics.py apps/analytics/src/modules/participant_analytics/save_participant_analytics.py EOF ); do echo "=== $file ===" # Show the entire file content to see type annotations cat "$file" echo -e "\n" done # Also check if there are any type-related configurations in setup files echo "=== Checking for type-related configurations ===" fd "setup.py|setup.cfg|mypy.ini|pytest.ini" apps/analytics/Length of output: 6883
apps/analytics/src/modules/participant_course_analytics/get_running_past_courses.py (2)
6-16:
⚠️ Potential issue: Add timezone handling and optimize database query
The current date handling doesn't account for timezones, which could lead to inconsistencies. Also, loading all participations might cause memory issues with large datasets.
Consider these improvements:
+ from datetime import datetime, timezone def get_running_past_courses(db): - curr_date = datetime.now().strftime("%Y-%m-%d") + curr_date = datetime.now(timezone.utc).strftime("%Y-%m-%d") courses = db.course.find_many( where={ # Incremental scripts can add this statement to reduce the amount of required computations # 'endDate': { # 'gt': datetime.now().strftime('%Y-%m-%d') + 'T00:00:00.000Z' # } "startDate": {"lte": curr_date + "T23:59:59.999Z"}, }, - include={"participations": True}, + include={ + "participations": { + "select": { + "id": True, + "userId": True, + "status": True + } + } + }, )📝 Committable suggestion
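A small illustration of the naive-vs-aware distinction behind this suggestion; around midnight in a non-UTC server timezone the two formatted dates can differ:

```python
from datetime import datetime, timezone

# Naive local time vs. explicit UTC: formatting the UTC value avoids
# day-boundary drift when the server's local timezone is not UTC.
local_day = datetime.now().strftime("%Y-%m-%d")
utc_day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
print(local_day, utc_day)  # may differ near midnight in non-UTC zones
```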
18-23: 🛠️ Refactor suggestion
Optimize memory usage and add error handling
The DataFrame conversion could be memory-intensive for large datasets. Also, add error handling for the database query.
+ from typing import Any + import logging + + def get_running_past_courses(db: Any) -> pd.DataFrame: + """Retrieve courses that have started in the past. + + Args: + db: Database connection instance + + Returns: + DataFrame containing course information + """ + try: df_courses = pd.DataFrame(list(map(lambda x: x.dict(), courses))) print( "Found {} courses with a start date before {}".format( len(df_courses), curr_date ) ) + except Exception as e: + logging.error(f"Failed to retrieve courses: {str(e)}") + raise + + # Consider chunking for large datasets + if len(df_courses) > 1000: + logging.warning(f"Large dataset detected: {len(df_courses)} courses")Committable suggestion skipped: line range outside the PR's diff.
apps/analytics/src/modules/aggregated_analytics/aggregate_participant_analytics.py (1)
1-8: 🛠️ Refactor suggestion
Add type hints and improve error handling
The function should include type hints and better error handling for input validation.
-def aggregate_participant_analytics(df_participant_analytics, verbose=False): +from pandas import DataFrame + +def aggregate_participant_analytics(df_participant_analytics: DataFrame, verbose: bool = False) -> DataFrame | None: + """Aggregate participant analytics by courseId. + + Args: + df_participant_analytics: DataFrame containing participant analytics + verbose: If True, print debug information + + Returns: + DataFrame with aggregated analytics or None if input is empty + + Raises: + ValueError: If required columns are missing + """ # if the dataframe is empty, return None if df_participant_analytics.empty: if verbose: print("No participant analytics to aggregate") return None + + required_columns = {"courseId", "id", "responseCount", "totalScore", "totalPoints", "totalXp"} + missing_columns = required_columns - set(df_participant_analytics.columns) + if missing_columns: + raise ValueError(f"Missing required columns: {missing_columns}")📝 Committable suggestion
apps/analytics/src/modules/aggregated_analytics/load_participant_analytics.py (2)
4-10: 🛠️ Refactor suggestion
Add type hints and error handling to convert_to_df
The function needs type hints and should handle potential conversion errors.
-def convert_to_df(analytics): +from typing import List, Dict, Any +import pandas as pd + +def convert_to_df(analytics: List[Dict[str, Any]]) -> pd.DataFrame: + """Convert analytics data to pandas DataFrame. + + Args: + analytics: List of analytics dictionaries + + Returns: + DataFrame containing the analytics data + + Raises: + ValueError: If analytics data is invalid + """ # convert the database query result into a pandas dataframe + if not isinstance(analytics, list): + raise ValueError("Analytics must be a list of dictionaries") rows = [] for item in analytics: - rows.append(dict(item)) + try: + rows.append(dict(item)) + except (TypeError, ValueError) as e: + raise ValueError(f"Failed to convert item to dictionary: {e}")📝 Committable suggestion
13-30:
⚠️ Potential issue: Fix critical issues in load_participant_analytics
The function has several critical issues that need to be addressed.
-def load_participant_analytics(db, timestamp, analytics_type, verbose=False): +from datetime import datetime +from typing import Optional + +def load_participant_analytics( + db: Any, + timestamp: datetime, + analytics_type: str, + verbose: bool = False +) -> pd.DataFrame: + """Load participant analytics from database. + + Args: + db: Database connection + timestamp: Timestamp for filtering analytics + analytics_type: Type of analytics to load + verbose: If True, print debug information + + Returns: + DataFrame containing the loaded analytics + + Raises: + Exception: If database query fails + """ + try: participant_analytics = db.participantanalytics.find_many( where={"timestamp": timestamp, "type": analytics_type}, ) if verbose: - # Print the first participant analytics print( "Found {} analytics for the timespan from {} to {}".format( - len(participant_analytics), start_date, end_date + len(participant_analytics), + timestamp.strftime("%Y-%m-%d %H:%M:%S"), + timestamp.strftime("%Y-%m-%d %H:%M:%S") ) ) - print(participant_analytics[0]) + if participant_analytics: + print(participant_analytics[0]) - # convert the analytics to a dataframe df_loaded_analytics = convert_to_df(participant_analytics) - return df_loaded_analytics + except Exception as e: + raise Exception(f"Failed to load participant analytics: {e}")📝 Committable suggestion
🧰 Tools
🪛 Ruff (0.8.2)
22-22: Undefined name `start_date` (F821)

22-22: Undefined name `end_date` (F821)
apps/analytics/src/modules/participant_course_analytics/save_participant_course_analytics.py (1)
1-26:
⚠️ Potential issue: Improve performance and add error handling
The current implementation processes rows one by one and lacks error handling, which could be problematic for large datasets.
-def save_participant_course_analytics(db, df_activity): +from typing import Any +import pandas as pd +from tqdm import tqdm + +def save_participant_course_analytics(db: Any, df_activity: pd.DataFrame, batch_size: int = 100) -> None: + """Save participant course analytics to database. + + Args: + db: Database connection + df_activity: DataFrame containing activity data + batch_size: Number of records to process in each batch + + Raises: + Exception: If database operations fail + """ + total_rows = len(df_activity) + + try: + # Process in batches with progress bar + for start_idx in tqdm(range(0, total_rows, batch_size), desc="Saving analytics"): + batch = df_activity.iloc[start_idx:start_idx + batch_size] + + for _, row in batch.iterrows(): + try: db.participantcourseanalytics.upsert( where={ "courseId_participantId": { "courseId": row["courseId"], "participantId": row["participantId"], } }, data={ "create": { "activeWeeks": row["activeWeeks"], "activeDaysPerWeek": row["activeDaysPerWeek"], "meanElementsPerDay": row["meanElementsPerDay"], "activityLevel": row["activityLevel"], "course": {"connect": {"id": row["courseId"]}}, "participant": {"connect": {"id": row["participantId"]}}, }, "update": { "activeWeeks": row["activeWeeks"], "activeDaysPerWeek": row["activeDaysPerWeek"], "meanElementsPerDay": row["meanElementsPerDay"], "activityLevel": row["activityLevel"], }, }, ) + except Exception as e: + print(f"Error processing row {row.name}: {e}") + except Exception as e: + raise Exception(f"Failed to save participant course analytics: {e}")Additionally, consider implementing bulk upsert if your database supports it for better performance.
Would you like me to provide an example implementation of bulk upsert?
📝 Committable suggestion
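The Prisma Python client is used in this repo, so the exact bulk API is left open here; as a generic sketch of the bulk-upsert idea at the SQL level, standard-library `sqlite3` with `INSERT ... ON CONFLICT DO UPDATE` (requires SQLite 3.24+) and `executemany` shows the shape. The table and column names below are hypothetical:

```python
import sqlite3

# Hypothetical table mirroring the (courseId, participantId) unique key
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE analytics (
           course_id TEXT, participant_id TEXT, active_weeks INTEGER,
           PRIMARY KEY (course_id, participant_id)
       )"""
)

upsert_sql = """
    INSERT INTO analytics (course_id, participant_id, active_weeks)
    VALUES (?, ?, ?)
    ON CONFLICT (course_id, participant_id)
    DO UPDATE SET active_weeks = excluded.active_weeks
"""

# One round trip per batch instead of one upsert per row
conn.executemany(upsert_sql, [("c1", "p1", 3), ("c1", "p2", 5)])

# Re-running with updated values overwrites instead of duplicating
conn.executemany(upsert_sql, [("c1", "p1", 4)])
print(conn.execute("SELECT * FROM analytics ORDER BY participant_id").fetchall())
```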
apps/analytics/src/modules/aggregated_analytics/compute_aggregated_analytics.py (2)
31-34: 🛠️ Refactor suggestion
Add error handling for database operations.
Consider wrapping the database operation in a try-except block to handle potential errors gracefully.
if df_aggregated_analytics is not None: + try: save_aggregated_analytics( db, df_aggregated_analytics, timestamp, analytics_type ) + except Exception as e: + print(f"Error saving aggregated analytics: {e}") + raise📝 Committable suggestion
19-28: 🛠️ Refactor suggestion
Use f-strings for string formatting and add error handling.
Replace string concatenation with f-strings and consider adding error handling for database operations.
if df_aggregated_analytics is not None and verbose: - print("Aggregated analytics for time range:" + start_date + " to " + end_date) + print(f"Aggregated analytics for time range: {start_date} to {end_date}") print(df_aggregated_analytics.head()) elif df_aggregated_analytics is None: - print( - "No aggregated analytics to compute for time range:" - + start_date - + " to " - + end_date - ) + print(f"No aggregated analytics to compute for time range: {start_date} to {end_date}")📝 Committable suggestion
apps/analytics/src/modules/participant_course_analytics/get_active_weeks.py (2)
4-7: 🛠️ Refactor suggestion
Add input validation and type hints.
The function should validate the input course object and include type hints for better maintainability.
-def get_active_weeks(db, course): +from typing import Dict, Any +import pandas as pd + +def get_active_weeks(db, course: Dict[str, Any]) -> pd.DataFrame: + """Calculate the number of active weeks for participants in a course. + + Args: + db: Database connection object + course: Course object containing id and participations + + Returns: + DataFrame with participant activity metrics + """ + if not course or "id" not in course or "participations" not in course: + raise ValueError("Invalid course object") + course_id = course["id"] participations = course["participations"]📝 Committable suggestion
14-20: 🛠️ Refactor suggestion
Optimize database query performance.
Consider fetching all weekly analytics in a single query instead of querying for each participant.
- weekly_analytics = db.participantanalytics.find_many( - where={ - "type": "WEEKLY", - "courseId": course_id, - "participantId": participant_id, - }, - ) + # Fetch all weekly analytics for the course in one query + all_weekly_analytics = db.participantanalytics.find_many( + where={ + "type": "WEEKLY", + "courseId": course_id, + "participantId": {"in": [p["participantId"] for p in participations]}, + }, + ) + + # Group by participant + analytics_by_participant = {} + for analytic in all_weekly_analytics: + pid = analytic["participantId"] + if pid not in analytics_by_participant: + analytics_by_participant[pid] = [] + analytics_by_participant[pid].append(analytic)Committable suggestion skipped: line range outside the PR's diff.
apps/analytics/src/modules/activity_progress/compute_progress_counts.py (1)
4-8: 🛠️ Refactor suggestion
Add input validation and type hints.
The function should validate the input activity object and include type hints.
-def compute_progress_counts(activity): +from typing import Dict, Any, Tuple + +def compute_progress_counts(activity: Dict[str, Any]) -> Tuple[int, int, int]: + """Compute progress counts for an activity. + + Args: + activity: Activity object containing responses and stacks + + Returns: + Tuple of (started_count, completed_count, repeated_count) + """ + if not activity or "responses" not in activity or "stacks" not in activity: + raise ValueError("Invalid activity object") + started_count = 0 completed_count = 0 repeated_count = 0📝 Committable suggestion
apps/analytics/src/modules/participant_course_analytics/compute_participant_activity.py (2)
56-67: 🛠️ Refactor suggestion
Add type hints and docstring for better maintainability
The helper function lacks type hints and documentation.
Add type hints and docstring:
-def sum_active_days_per_week(week_end, daily_analytics): +from datetime import datetime +from typing import List, Dict, Any + +def sum_active_days_per_week(week_end: datetime, daily_analytics: List[Dict[str, Any]]) -> List[Dict[str, Any]]: + """Filter daily analytics entries for a specific week. + + Args: + week_end: The end date of the week + daily_analytics: List of daily analytics entries + + Returns: + List of analytics entries within the specified week + """ week_start = week_end - pd.DateOffset(days=6)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
from datetime import datetime
from typing import List, Dict, Any

def sum_active_days_per_week(week_end: datetime, daily_analytics: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter daily analytics entries for a specific week.

    Args:
        week_end: The end date of the week
        daily_analytics: List of daily analytics entries

    Returns:
        List of analytics entries within the specified week
    """
    week_start = week_end - pd.DateOffset(days=6)

    # filter the daily analytics entries for the current week
    week_analytics = list(
        filter(
            lambda daily: week_start <= dict(daily)["timestamp"] <= week_end,
            daily_analytics,
        )
    )

    return week_analytics
```
4-53:
⚠️ Potential issueAdd error handling and input validation
The function lacks validation for input parameters and error handling for database operations.
Consider adding these validations:
```diff
 def compute_participant_activity(db, df_activity, course_id, course_start, course_end):
+    if course_end < course_start:
+        raise ValueError("course_end must be after course_start")
+    if df_activity is None or df_activity.empty:
+        return df_activity
+
     # compute course duration in days
     course_duration = (course_end - course_start).days + 1
```
apps/analytics/src/notebooks/aggregated_course_analytics.ipynb (1)
66-74:
⚠️ Potential issueAdd error handling for database operations
The course processing loop lacks error handling for database operations.
Add try-except blocks:
```diff
 # find all courses that started in the past
-df_courses = get_running_past_courses(db)
+try:
+    df_courses = get_running_past_courses(db)
+except Exception as e:
+    print(f"Error fetching courses: {e}")
+    db.disconnect()
+    raise

 # iterate over all courses and compute the participant course analytics
 for idx, course in df_courses.iterrows():
     print(f"Processing course", idx, "of", len(df_courses), "with id", course["id"])
```
apps/analytics/src/notebooks/activity_progress.ipynb (1)
73-113: 🛠️ Refactor suggestion
Optimize database operations and add error handling
The processing loop makes multiple database calls without proper error handling or batching.
Consider these improvements:
- Add error handling:
```diff
 for idx, course in df_courses.iterrows():
+    try:
         course_id = course["id"]
         print(f"Processing course", idx, "of", len(df_courses), "with id", course_id)
         # ... processing logic ...
+    except Exception as e:
+        print(f"Error processing course {course_id}: {e}")
+        continue
```
- Consider batching database operations:
```python
def batch_save_progress(db, progress_data, batch_size=100):
    """Batch save progress data to reduce database calls"""
    for i in range(0, len(progress_data), batch_size):
        batch = progress_data[i:i + batch_size]
        db.batch_([
            db.practiceQuizProgress.create(data=item)
            for item in batch
        ])
```
1-4: 🛠️ Refactor suggestion
Define module's public API using
__all__
The imports appear to be intended for re-export, but they're currently flagged as unused. To explicitly define the module's public API and silence the linter warnings, add an
__all__
declaration.Add this after the imports:
```diff
 from .get_running_past_courses import get_running_past_courses
 from .get_active_weeks import get_active_weeks
 from .compute_participant_activity import compute_participant_activity
 from .save_participant_course_analytics import save_participant_course_analytics
+
+__all__ = [
+    'get_running_past_courses',
+    'get_active_weeks',
+    'compute_participant_activity',
+    'save_participant_course_analytics',
+]
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
from .get_running_past_courses import get_running_past_courses
from .get_active_weeks import get_active_weeks
from .compute_participant_activity import compute_participant_activity
from .save_participant_course_analytics import save_participant_course_analytics

__all__ = [
    'get_running_past_courses',
    'get_active_weeks',
    'compute_participant_activity',
    'save_participant_course_analytics',
]
```
🧰 Tools
🪛 Ruff (0.8.2)
1-1:
.get_running_past_courses.get_running_past_courses
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
2-2:
.get_active_weeks.get_active_weeks
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
3-3:
.compute_participant_activity.compute_participant_activity
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
4-4:
.save_participant_course_analytics.save_participant_course_analytics
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
apps/analytics/README.md (3)
5-9: 🛠️ Refactor suggestion
Add missing critical requirements.
Please include:
- Database requirements (type and version) since Prisma is used
- Minimum system requirements (memory, disk space)
- More specific version ranges (e.g., "Python >=3.12.0, <3.13.0")
1-24: 🛠️ Refactor suggestion
Add essential documentation sections.
Consider adding the following sections to make the documentation more complete:
- Configuration guide (environment variables, config files)
- Contributing guidelines
- Development workflow
- Troubleshooting guide
- API documentation (if applicable)
🧰 Tools
🪛 LanguageTool
[grammar] ~16-~16: The word ‘install’ is not a noun.
Context: ...onment not matching specifications, the install command or script execution might fail....(A_INSTALL)
15-15: 💡 Codebase verification
Update README to accurately describe the sync-schema script's functionality
The script exists and is functional, but the README's description is not entirely accurate. The script specifically syncs
.prisma
files frompackages/prisma/src/prisma/schema/
toapps/analytics/prisma/schema/
, excludingjs.prisma
. The current README description oversimplifies this and could mislead users.Suggested locations to update:
apps/analytics/README.md
: Update the instruction to: "Make sure that all required Prisma schema files are available inapps/analytics/prisma/schema/
. If this is not the case, run theutil/sync-schema.sh
script to sync them from the main schema directory."🔗 Analysis chain
Verify the sync-schema script location and functionality.
The README references
util/sync-schema.sh
but its existence and location should be verified.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if the sync-schema script exists and is executable if [ -f "util/sync-schema.sh" ]; then echo "Script exists" ls -l "util/sync-schema.sh" else echo "Script not found at util/sync-schema.sh" # Search for the script in the repository fd "sync-schema.sh" fiLength of output: 165
Script:
#!/bin/bash # Let's verify the content of the script to ensure it matches the README's description cat util/sync-schema.sh # Also check for .prisma files in the prisma directory fd -e prismaLength of output: 883
apps/frontend-manage/src/components/analytics/overview/QuizDashboardLabel.tsx (1)
5-14: 🛠️ Refactor suggestion
Add TypeScript return type and improve accessibility
The component should have explicit TypeScript types and better accessibility support.
Apply these improvements:
```diff
-function QuizDashboardLabel() {
+function QuizDashboardLabel({ className }: { className?: string }): JSX.Element {
   const t = useTranslations()

   return (
     <>
-      <FontAwesomeIcon className="mr-1" icon={faChartPie} />
+      <FontAwesomeIcon
+        className="mr-1"
+        icon={faChartPie}
+        aria-hidden="true"
+      />
-      <div>{t('manage.analytics.quizDashboard')}</div>
+      <div className={className}>{t('manage.analytics.quizDashboard')}</div>
     </>
   )
 }
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```tsx
function QuizDashboardLabel({ className }: { className?: string }): JSX.Element {
  const t = useTranslations()

  return (
    <>
      <FontAwesomeIcon
        className="mr-1"
        icon={faChartPie}
        aria-hidden="true"
      />
      <div className={className}>{t('manage.analytics.quizDashboard')}</div>
    </>
  )
}
```
apps/frontend-manage/src/components/analytics/overview/ActivityDashboardLabel.tsx (1)
1-16: 🛠️ Refactor suggestion
Consider creating a shared base component
This component shares identical structure with QuizDashboardLabel and PerformanceDashboardLabel. Consider creating a reusable base component.
Create a new shared component:
```tsx
// components/analytics/overview/DashboardLabel.tsx
import { IconDefinition } from '@fortawesome/free-solid-svg-icons'
import { FontAwesomeIcon } from '@fortawesome/react-fontawesome'
import { useTranslations } from 'next-intl'

interface DashboardLabelProps {
  icon: IconDefinition
  translationKey: string
  className?: string
}

export function DashboardLabel({ icon, translationKey, className }: DashboardLabelProps): JSX.Element {
  const t = useTranslations()

  return (
    <>
      <FontAwesomeIcon className="mr-1" icon={icon} aria-hidden="true" />
      <div className={className}>{t(translationKey)}</div>
    </>
  )
}
```

Then simplify this component to:
```diff
-import { faChartLine } from '@fortawesome/free-solid-svg-icons'
-import { FontAwesomeIcon } from '@fortawesome/react-fontawesome'
-import { useTranslations } from 'next-intl'
+import { DashboardLabel } from './DashboardLabel'
+import { faChartLine } from '@fortawesome/free-solid-svg-icons'

-function ActivityDashboardLabel() {
+function ActivityDashboardLabel({ className }: { className?: string }): JSX.Element {
-  const t = useTranslations()
-
-  return (
-    <>
-      <FontAwesomeIcon className="mr-1" icon={faChartLine} />
-      <div>{t('manage.analytics.activityDashboard')}</div>
-    </>
-  )
+  return <DashboardLabel
+    icon={faChartLine}
+    translationKey="manage.analytics.activityDashboard"
+    className={className}
+  />
 }
```
1-16: 🛠️ Refactor suggestion
Apply shared component pattern consistently
This component should also use the shared DashboardLabel component proposed earlier.
Apply the same pattern:
```diff
-import { faChartSimple } from '@fortawesome/free-solid-svg-icons'
-import { FontAwesomeIcon } from '@fortawesome/react-fontawesome'
-import { useTranslations } from 'next-intl'
+import { DashboardLabel } from './DashboardLabel'
+import { faChartSimple } from '@fortawesome/free-solid-svg-icons'

-function PerformanceDashboardLabel() {
+function PerformanceDashboardLabel({ className }: { className?: string }): JSX.Element {
-  const t = useTranslations()
-
-  return (
-    <>
-      <FontAwesomeIcon className="mr-1" icon={faChartSimple} />
-      <div>{t('manage.analytics.performanceDashboard')}</div>
-    </>
-  )
+  return <DashboardLabel
+    icon={faChartSimple}
+    translationKey="manage.analytics.performanceDashboard"
+    className={className}
+  />
 }
```
1-16: 💡 Codebase verification
Translation key not found in message files
The translation key 'manage.analytics.performanceDashboard' appears to be missing from both English and German translation files. Please add the translation key to the message files at:
- packages/i18n/messages/en.ts
- packages/i18n/messages/de.ts
🔗 Analysis chain
Verify translation keys exist
Ensure that all translation keys used in these components are defined in the translation files.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for translation key definitions rg -g '*.{json,ts,js}' "manage\.analytics\.(quizDashboard|activityDashboard|performanceDashboard)"Length of output: 100
Script:
#!/bin/bash # Search for translation files in the repository fd -e json messages # After finding the translation files, let's check their content fd -e json messages -x cat {}Length of output: 55
Script:
#!/bin/bash # Try to find translation files in common i18n/locales directories fd -e json . -p "i18n|locales|translations" # Let's also check for any files that might contain translation definitions rg -g '*.{json,ts,js,tsx}' -i "translations|messages|i18n" --files-with-matchesLength of output: 22851
Script:
#!/bin/bash # Let's check the content of the translation files we found cat packages/i18n/messages/en.ts packages/i18n/messages/de.tsLength of output: 131031
Script:
#!/bin/bash # Let's check if the translation key exists in the messages files rg "manage\.analytics\.performanceDashboard" packages/i18n/messages/en.ts packages/i18n/messages/de.tsLength of output: 104
apps/frontend-manage/src/components/analytics/overview/AnalyticsCourseLabel.tsx (1)
13-18: 🛠️ Refactor suggestion
Optimize date comparisons and add error handling.
The date comparisons could be memoized, and error handling should be added for invalid dates.
```diff
+import { useMemo } from 'react'

 function AnalyticsCourseLabel({
   course,
 }: {
   course: Pick<Course, 'id' | 'name' | 'startDate' | 'endDate'>
 }) {
-  const isPast = course.endDate
-    ? dayjs(course.endDate).isBefore(dayjs())
-    : false
-  const isFuture = course.startDate
-    ? dayjs(course.startDate).isAfter(dayjs())
-    : false
+  const { isPast, isFuture } = useMemo(() => {
+    try {
+      return {
+        isPast: course.endDate ? dayjs(course.endDate).isBefore(dayjs()) : false,
+        isFuture: course.startDate ? dayjs(course.startDate).isAfter(dayjs()) : false
+      }
+    } catch (error) {
+      console.error('Error parsing course dates:', error)
+      return { isPast: false, isFuture: false }
+    }
+  }, [course.startDate, course.endDate])
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```tsx
const { isPast, isFuture } = useMemo(() => {
  try {
    return {
      isPast: course.endDate ? dayjs(course.endDate).isBefore(dayjs()) : false,
      isFuture: course.startDate ? dayjs(course.startDate).isAfter(dayjs()) : false
    }
  } catch (error) {
    console.error('Error parsing course dates:', error)
    return { isPast: false, isFuture: false }
  }
}, [course.startDate, course.endDate])
```
apps/frontend-manage/src/components/analytics/overview/CourseDashboardList.tsx (2)
7-11: 🛠️ Refactor suggestion
Add loading and error states.
The component should handle loading and error states for better user experience.
```diff
 function CourseDashboardList({
   courses,
+  isLoading,
+  error,
 }: {
   courses?: Pick<Course, 'id' | 'name' | 'startDate' | 'endDate'>[] | null
+  isLoading?: boolean
+  error?: Error
 }) {
```
1-42: 🛠️ Refactor suggestion
Wrap component with error boundary.
Consider implementing an error boundary to gracefully handle runtime errors.
```diff
+import { ErrorBoundary } from '@uzh-bf/design-system'

 function CourseDashboardList({
   // ... props
 }) {
   // ... component logic
 }

-export default CourseDashboardList
+export default function CourseDashboardListWithErrorBoundary(props) {
+  return (
+    <ErrorBoundary fallback={<ErrorMessage />}>
+      <CourseDashboardList {...props} />
+    </ErrorBoundary>
+  )
+}
```
apps/frontend-manage/src/components/analytics/overview/DashboardButtons.tsx (1)
11-48: 🛠️ Refactor suggestion
Optimize component performance and organization
Consider these improvements for better performance and maintainability:
```diff
+const DASHBOARD_BUTTONS = [
+  {
+    href: (courseId: string) => `/analytics/${courseId}/activity`,
+    label: <ActivityDashboardLabel />,
+    getCy: (courseName: string) => `activity-dashboard-button-${courseName}`,
+  },
+  // ... other buttons
+];

 function DashboardButtons({ course }: { course: Pick<Course, 'id' | 'name'> }) {
   const router = useRouter()

+  const handleNavigation = useCallback((href: string) => {
+    router.push(href)
+  }, [router])

   return (
     <div
-      key={`dashboards-${course.id}`}
       className="flex h-10 flex-row items-center gap-2"
     >
-      {[
-        {
-          href: `/analytics/${course.id}/activity`,
-          // ...
-        },
-      ].map((button, ix) => (
+      {DASHBOARD_BUTTONS.map((button, ix) => (
         <Button
           className={{
             root: 'h-8.5 border-uzh-blue-40 shadow-none',
           }}
-          onClick={() => router.push(button.href)}
-          data={{ cy: button.cy }}
+          onClick={() => handleNavigation(button.href(course.id))}
+          data={{ cy: button.getCy(course.name) }}
           key={`dashboard-button-${ix}-course-${course.id}`}
         >
           {button.label}
         </Button>
```
apps/frontend-manage/src/components/analytics/overview/AnalyticsNavigation.tsx (2)
29-35:
⚠️ Potential issueAdd error handling for GraphQL query
The component currently handles loading state but doesn't handle query errors, which could lead to runtime errors.
```diff
-  const { data, loading } = useQuery(GetUserCoursesDocument)
+  const { data, loading, error } = useQuery(GetUserCoursesDocument)
   const router = useRouter()
   const t = useTranslations()

+  if (error) {
+    return (
+      <div className="text-red-600">
+        {t('shared.errors.failedToLoadCourses')}
+      </div>
+    )
+  }
+
   if (loading) {
     return <Loader />
   }
```
46-62: 🛠️ Refactor suggestion
Optimize SelectField implementation and add null safety
The current implementation could benefit from performance optimizations and better null handling.
```diff
+  const courseItems = useMemo(
+    () =>
+      data?.userCourses?.map((course) => ({
+        label: course.name,
+        value: course.id,
+      })) ?? [],
+    [data?.userCourses]
+  )
+
+  const handleCourseChange = useCallback(
+    (value: string) => {
+      router.push({ pathname: `/analytics/${value}/${slug}` })
+    },
+    [router, slug]
+  )

   <div className="hidden justify-center md:flex">
     <SelectField
       label={`${t('shared.generic.course')}:`}
       labelType="large"
-      value={router.query.courseId as string}
+      value={router.query.courseId?.toString() ?? ''}
       items={courseItems}
-      onChange={(value) => {
-        router.push({ pathname: `/analytics/${value}/${slug}` })
-      }}
+      onChange={handleCourseChange}
       className={{ select: { trigger: 'h-8' } }}
     />
   </div>
```
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/prisma/src/prisma/schema/element.prisma (2)
Line range hint
156-166
: Track migration tasks for ElementStackTypeThe migration path is well-documented, but we should ensure these TODOs are tracked in the issue tracker:
- Drop
type
field after migration- Rename
typeNEW
toElementStackType
and make it requiredWould you like me to create GitHub issues to track these migration tasks?
Line range hint 1-400: Consider implementing analytics-specific indices
Given the focus on learning analytics, consider adding database indices to optimize analytics queries, particularly for:
- ElementInstance.instancePerformance lookups
- ElementType filtering for new types (CONTENT, FLASHCARD)
This will improve the performance of analytics queries, especially as the dataset grows.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (2)
packages/graphql/src/services/stacks.ts
(0 hunks)packages/prisma/src/prisma/schema/element.prisma
(1 hunks)
💤 Files with no reviewable changes (1)
- packages/graphql/src/services/stacks.ts
🔇 Additional comments (2)
packages/prisma/src/prisma/schema/element.prisma (2)
90-94
: Verify InstancePerformance model definition
The addition of instancePerformance
field for analytics tracking looks good, but we need to ensure the referenced type is properly defined.
✅ Verification successful
Based on the search results, I can see that the InstancePerformance
model is properly defined in packages/prisma/src/prisma/schema/analytics.prisma
with the following fields:
- id (Int, auto-incremented primary key)
- responseCount (Int)
- averageTimeSpent (Float)
- firstErrorRate, firstPartialRate, firstCorrectRate (Float, optional)
- lastErrorRate, lastPartialRate, lastCorrectRate (Float, optional)
- totalErrorRate, totalPartialRate, totalCorrectRate (Float)
- instance relation (ElementInstance, one-to-one)
- course relation (Course, many-to-one)
Additionally, I found the migration file 20241206141120_learning_analytics_performance_progress/migration.sql
that introduced these analytics-related changes.
InstancePerformance model is properly defined
The addition of the instancePerformance
field to track analytics is well-structured and properly integrated with the necessary relations and migrations.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify InstancePerformance model definition and usage
# Look for InstancePerformance model definition
echo "Checking for InstancePerformance model definition..."
rg -l "model InstancePerformance" --type prisma
# Check for related migrations
echo "Checking for related migrations..."
fd -e sql -e prisma . | rg -l "InstancePerformance"
Length of output: 326
Script:
#!/bin/bash
# Let's try a different approach to find the model definition and related files
# Search for InstancePerformance in all files
echo "Searching for InstancePerformance model..."
rg "model InstancePerformance" -A 5
# Find all .prisma files and check their content
echo "Checking .prisma files..."
fd -e prisma . -x cat {}
# Look for any migrations or schema changes
echo "Looking for migrations..."
fd -g "*migration*.sql" .
Length of output: 64668
Line range hint 6-15
: Verify handling of new ElementTypes across the codebase
The addition of CONTENT
and FLASHCARD
types is aligned with the learning analytics feature. However, we should ensure proper handling across related components.
✅ Verification successful
Let me gather more specific information about the switch statements by modifying the pattern to better match ElementType cases.
Based on the comprehensive results, I can now provide the final response:
New ElementTypes are properly handled across the codebase
The new CONTENT
and FLASHCARD
types are well-implemented throughout the codebase with proper handling in:
- Type-specific response processing in frontend components
- Data transformation and migration scripts
- GraphQL services and resolvers
- Evaluation and grading logic
The codebase shows consistent handling of these types alongside existing ones, with proper type checks and corresponding business logic in:
- Response processing (
packages/graphql/src/services/stacks.ts
) - Results aggregation (
packages/graphql/src/scripts/2024-05-07_compute_aggregated_FCCT_responses.ts
) - Frontend components (
apps/frontend-pwa/src/components/practiceQuiz/ElementStack.tsx
) - Data migration scripts (
packages/prisma/src/scripts/
)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify implementation of new ElementTypes
# Check for CONTENT type handling
echo "Checking CONTENT type implementation..."
rg -l "ElementType.CONTENT|type: ['|\"]CONTENT" --type ts
# Check for FLASHCARD type handling
echo "Checking FLASHCARD type implementation..."
rg -l "ElementType.FLASHCARD|type: ['|\"]FLASHCARD" --type ts
# Check for potential switch statements that need updating
echo "Checking switch statements on ElementType..."
ast-grep --pattern 'switch ($expr) { $$$case ElementType$$$ }'
Length of output: 2416
Script:
#!/bin/bash
# Check for switch statements specifically handling ElementType cases with context
rg "switch.*ElementType.*\{" -A 10 --type ts
# Check for if-else chains handling ElementType
rg "if.*ElementType\." -A 5 --type ts
# Look for type-specific handling in function signatures
ast-grep --pattern 'function $name($_: { type: ElementType, $$$ }) { $$$ }'
Length of output: 39299
…earning analytics computation (#4410)
Actionable comments posted: 8
🧹 Nitpick comments (46)
apps/analytics/src/scripts/2_initial_aggregated_course_analytics.py (4)
4-4
: Remove unused import 'os'
The import statement for os is never used in this file. Removing it will help avoid clutter.Suggested fix:
```diff
- import os
  from prisma import Prisma
```
🧰 Tools
🪛 Ruff (0.8.2)
4-4:
os
imported but unusedRemove unused import:
os
(F401)
6-6
: Remove unused import 'pandas'
Although you use a DataFrame object from another module, the direct import pandas as pd is never referenced here. Consider removing it.Suggested fix:
```diff
- import pandas as pd
+ # import pandas as pd  # (Remove altogether if not needed)
```
🪛 Ruff (0.8.2)
6-6:
pandas
imported but unusedRemove unused import:
pandas
(F401)
32-32
: Remove extraneous 'f' prefix in the print statement
The string does not contain any placeholders and therefore does not need an f-string.Suggested fix:
```diff
- print(f"Processing course", idx, "of", len(df_courses), "with id", course["id"])
+ print("Processing course", idx, "of", len(df_courses), "with id", course["id"])
```
🪛 Ruff (0.8.2)
32-32: f-string without any placeholders
Remove extraneous
f
prefix(F541)
1-38
: Consider adding tests to validate the script's functionality
This script could benefit from automated tests to confirm that course analytics are computed correctly and that any future modifications won't break existing functionality.🧰 Tools
🪛 Ruff (0.8.2)
4-4:
os
imported but unusedRemove unused import:
os
(F401)
6-6:
pandas
imported but unusedRemove unused import:
pandas
(F401)
32-32: f-string without any placeholders
Remove extraneous
f
prefix(F541)
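A minimal sketch of what such a regression test could look like. `compute_daily_counts` and its column names are hypothetical stand-ins for the script's real computation, which lives behind the helper modules and may be named differently:

```python
# Hypothetical pytest-style regression test; `compute_daily_counts` stands in
# for the real analytics aggregation, whose module/function names may differ.
import pandas as pd


def compute_daily_counts(df: pd.DataFrame) -> pd.Series:
    # aggregate: number of responses per calendar day
    return df.groupby(df["timestamp"].dt.date)["response_id"].count()


def test_compute_daily_counts():
    df = pd.DataFrame({
        "timestamp": pd.to_datetime(
            ["2024-01-01 08:00", "2024-01-01 12:00", "2024-01-02 09:00"]
        ),
        "response_id": [1, 2, 3],
    })
    counts = compute_daily_counts(df)
    assert counts.loc[pd.Timestamp("2024-01-01").date()] == 2
    assert counts.loc[pd.Timestamp("2024-01-02").date()] == 1
```

Running such tests against fixed in-memory frames (or a fixture database) would catch regressions in the aggregation logic before a future change breaks the computed analytics.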
apps/analytics/src/scripts/6_initial_activity_progress.py (4)
4-4
: Remove the unused 'os' import.This import is flagged as unused by static analysis tools, so removing it will reduce clutter and potential confusion.
Apply this diff to remove the unused import:
```diff
- import os
  from prisma import Prisma
  import pandas as pd
  import sys
```
🧰 Tools
🪛 Ruff (0.8.2)
4-4:
os
imported but unusedRemove unused import:
os
(F401)
6-6
: Remove the unused 'pandas' import.Similarly, this import is also unused. Removing it keeps dependencies clear and avoids unnecessary clutter.
Apply this diff to remove the unused import:
```diff
  import prisma
- import pandas as pd
  import sys
```
🧰 Tools
🪛 Ruff (0.8.2)
6-6:
pandas
imported but unusedRemove unused import:
pandas
(F401)
42-42
: Fix the extraneous `f` prefix in the print statement.
Since the statement does not contain any interpolation placeholders, the `f` prefix is unnecessary. Alternatively, you could keep the `f` string and add actual placeholders. Apply one of the following diffs:
Option A: Remove the `f` prefix if no interpolation is intended:

```diff
-print(f"Processing course", idx, "of", len(df_courses), "with id", course_id)
+print("Processing course", idx, "of", len(df_courses), "with id", course_id)
```
f
prefix and use placeholders for improved readability:-print(f"Processing course", idx, "of", len(df_courses), "with id", course_id) +print(f"Processing course {idx} of {len(df_courses)} with id {course_id}")🧰 Tools
🪛 Ruff (0.8.2)
42-42: f-string without any placeholders
Remove extraneous
f
prefix(F541)
39-76
: Consider adding error handling for database and function calls.The current script assumes that all operations succeed. In production, exceptions (e.g., network disruptions, incomplete data, or stale connections) could occur.
As a follow-up, you could wrap critical sections in try/except blocks or use context managers for database operations:
```python
try:
    db.connect()
    # Perform queries and computations
finally:
    db.disconnect()
```
🧰 Tools
🪛 Ruff (0.8.2)
42-42: f-string without any placeholders
Remove extraneous
f
prefix(F541)
apps/analytics/src/scripts/4_initial_participant_performance.py (5)
4-4
: Remove the unused import.The import statement for "os" is not referenced anywhere in this script and should be removed to comply with best practices and to avoid clutter.
- import os + # import os not used, removing🧰 Tools
🪛 Ruff (0.8.2)
4-4:
os
imported but unusedRemove unused import:
os
(F401)
28-30
: Consider using the verbose variable or removing it.You have defined "verbose = False" but it does not appear to be used. If you intended to implement logging or detailed console output, implement it. Otherwise, remove this unused variable to keep the code clean.
38-38
: Remove unnecessary f-string.This print statement uses the 'f' prefix but doesn't utilize any placeholder expressions in curly braces.
```diff
- print(f"Processing course", idx, "of", len(df_courses), "with id", course_id)
+ print("Processing course", idx, "of", len(df_courses), "with id", course_id)
```
🪛 Ruff (0.8.2)
38-38: f-string without any placeholders
Remove extraneous
f
prefix(F541)
25-27
: Add exception handling for database connection.While the script correctly connects and disconnects from Prisma, consider wrapping the database operations in a try-except block to handle potential connection errors. This helps avoid incomplete runs in production environments.
35-53
: Data integrity & performance assurance.This block processes each course and saves performance analytics. For large datasets, consider chunking or batched writes. Also, double-check that the data insertion logic properly handles potential duplicates or conflicts in the database.
🧰 Tools
🪛 Ruff (0.8.2)
38-38: f-string without any placeholders
Remove extraneous
f
prefix(F541)
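As an illustration of the chunked-write idea, here is a minimal, database-agnostic sketch. `save_rows` is a hypothetical persistence callback, not an existing Prisma API; the real implementation would wrap the repo's own upsert calls:

```python
# Sketch: persist rows in fixed-size chunks to reduce round trips.
# `save_rows` is a hypothetical callback standing in for the actual DB writes.
from typing import Callable, List


def save_in_chunks(
    rows: List[dict],
    save_rows: Callable[[List[dict]], None],
    chunk_size: int = 100,
) -> int:
    """Persist rows in chunks of `chunk_size`; returns the number of chunks written."""
    chunks = 0
    for start in range(0, len(rows), chunk_size):
        save_rows(rows[start:start + chunk_size])
        chunks += 1
    return chunks
```

Batching like this keeps memory bounded and turns N per-row writes into N/chunk_size calls, which matters once courses have thousands of participants.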
apps/analytics/src/scripts/5_initial_participant_course_analytics.py (4)
5-7
: Remove unused imports "os", "json", and "datetime".
These imports are flagged by static analysis as unused. Please remove them to keep the codebase clean and comply with best practices.-import os -import json -from datetime import datetime🧰 Tools
🪛 Ruff (0.8.2)
5-5:
os
imported but unusedRemove unused import:
os
(F401)
6-6:
json
imported but unusedRemove unused import:
json
(F401)
7-7:
datetime.datetime
imported but unusedRemove unused import:
datetime.datetime
(F401)
9-9
: Remove unused import "pandas".
Although a DataFrame is used, it appears to be returned by the helper functions. The code here does not use the "pd" namespace directly.-import pandas as pd
🧰 Tools
🪛 Ruff (0.8.2)
9-9:
pandas
imported but unusedRemove unused import:
pandas
(F401)
31-31
: "verbose" variable is defined but never used.
Consider removing it or incorporating logging statements controlled by this variable to make the script more informative.
40-40
: Fix the extraneous f-string usage.
No placeholders exist in the string as currently written. Either remove the "f" prefix or convert it to a proper f-string with placeholders.- print(f"Processing course", idx, "of", len(df_courses), "with id", course["id"]) + print(f"Processing course {idx} of {len(df_courses)} with id {course['id']}")🧰 Tools
🪛 Ruff (0.8.2)
40-40: f-string without any placeholders
Remove extraneous
f
prefix(F541)
apps/analytics/package.json (2)
5-7
: Consider adding @types/node and stricter version constraints
- Since this is a Node.js project (as specified in engines), consider adding
@types/node
as a dev dependency for better TypeScript support.- For development tools like
nodemon
, it's safer to use exact versions (^) instead of patch-level updates (~) to ensure consistent behavior across development environments."devDependencies": { - "nodemon": "~3.1.7" + "nodemon": "^3.1.7", + "@types/node": "^20.0.0" }
8-15
: Improve script configuration and documentation
- The
script
,script:prod
, andscript:qa
commands are incomplete as they require a module name parameter.- The watch paths in the
dev
script could be more specific to avoid unnecessary rebuilds.Consider adding:
- Documentation comments for each script's purpose and usage
- More specific watch paths in the dev script
- A helper script that validates the module parameter
"scripts": { + // Watch and reload the analytics service in development "dev": "doppler run --config dev -- nodemon --exec 'poetry run poe main' --watch src,prisma --ext py,prisma", + // Generate Prisma client "generate": "poetry run poe generate", + // Run the analytics service "main": "doppler run --config dev -- poetry run poe main", + // Run a specific Python module in development (usage: pnpm script <module_name>) "script": "doppler run --config dev -- poetry run python -m", + // Run a specific Python module in production (usage: pnpm script:prod <module_name>) "script:prod": "doppler run --config prd -- poetry run python -m", + // Run a specific Python module in QA (usage: pnpm script:qa <module_name>) "script:qa": "doppler run --config qa -- poetry run python -m" }apps/analytics/src/scripts/0_initial_participant_analytics.py (2)
Lines 4-5: Remove unused imports.
`os` and `json` are not utilized here, so consider removing them. Apply this diff to remove the unused imports:

```diff
-import os
-import json
```

🧰 Tools: 🪛 Ruff (0.8.2)
4-4: `os` imported but unused (F401)
5-5: `json` imported but unused (F401)

Lines 42-91: Consolidate repetitive daily, weekly, and monthly analytics logic.
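For illustration, the three near-identical interval computations could collapse into one parameterized loop. `INTERVALS`, `bin_timestamps`, and the simple event count below are hypothetical stand-ins for the real per-interval aggregation, not existing code:

```python
import pandas as pd

# Hypothetical mapping from analytics type to a pandas period alias;
# the real aggregation logic would replace the simple event count shown here.
INTERVALS = {"DAILY": "D", "WEEKLY": "W", "MONTHLY": "M"}

def bin_timestamps(timestamps, freq):
    """Count events per period of the given frequency."""
    s = pd.to_datetime(pd.Series(timestamps))
    return s.groupby(s.dt.to_period(freq)).size()

def compute_all_intervals(timestamps):
    # One loop instead of three near-identical copies of the computation
    return {name: bin_timestamps(timestamps, freq) for name, freq in INTERVALS.items()}

counts = compute_all_intervals(["2024-01-01", "2024-01-02", "2024-01-15", "2024-02-01"])
```

Adding a new time window then only requires a new entry in the mapping rather than another copy of the computation.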
Currently, there is repetitive code for daily, weekly, and monthly computations. Consider extracting a reusable function or loop structure to handle these intervals. This approach reduces duplication and simplifies future maintenance.

apps/analytics/src/scripts/3_initial_instance_activity_performance.py (2)
Lines 4-6: Remove unused imports to optimize your script.
Currently, the script doesn't use the `os`, `json`, `datetime`, or `pandas` modules. Removing them would comply with static analysis guidance and improve maintainability. Here is a suggested diff:

```diff
-import os
-import json
-from datetime import datetime
-import pandas as pd
```

Also applies to: 8-8

🧰 Tools: 🪛 Ruff (0.8.2)
4-4: `os` imported but unused (F401)
5-5: `json` imported but unused (F401)
6-6: `datetime.datetime` imported but unused (F401)

Line 45: Convert the print statement to either a standard print or use placeholders within the f-string.
The current f-string on this line does not use any placeholders, which triggers a static analysis warning. Consider the following improvement for clearer output:

```diff
-print(f"Processing course", idx, "of", len(df_courses), "with id", course_id)
+print(f"Processing course {idx} of {len(df_courses)} with id {course_id}")
```

🧰 Tools: 🪛 Ruff (0.8.2)
45-45: f-string without any placeholders. Remove extraneous `f` prefix (F541)
apps/analytics/src/scripts/1_initial_aggregated_analytics.py (4)
Line 4: Remove unused import `os`.
According to the static analysis (F401), the `os` module is not used in the script. Removing it will clean up the imports.

```diff
-import os
```

🧰 Tools: 🪛 Ruff (0.8.2)
4-4: `os` imported but unused (F401)

Line 5: Remove unused import `json`.
As indicated by the static analysis (F401), the `json` module is not used in this script. Consider removing it to keep imports concise.

```diff
-import json
```

🧰 Tools: 🪛 Ruff (0.8.2)
5-5: `json` imported but unused (F401)
Lines 37-45: Consider using logging instead of print statements.
Print statements provide basic info, but using the logging module allows for configurable log levels, better formatting, and integration with external log monitoring. This can be especially beneficial for batch jobs running multiple computations.
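A minimal sketch of that switch, with the in-memory stream handler only there to make the snippet self-contained (logger name and format are illustrative):

```python
import io
import logging

# Send log output through a configurable handler instead of bare print();
# in the real scripts this would typically be basicConfig plus an env-driven level.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("analytics")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def process_course(idx, total, course_id):
    # %-style arguments defer formatting until the record is actually emitted
    logger.info("Processing course %d of %d with id %s", idx, total, course_id)

process_course(1, 10, "abc123")
```

Dropping the level to `logging.WARNING` would then silence the per-course progress lines without touching the code.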
Line 92: Remove extraneous `f` prefix.
Static analysis (F541) indicates this f-string has no placeholders. You can simply remove the prefix.

```diff
-    print(f"Computing course-wide aggregated analytics")
+    print("Computing course-wide aggregated analytics")
```

🧰 Tools: 🪛 Ruff (0.8.2)
92-92: f-string without any placeholders. Remove extraneous `f` prefix (F541)
packages/prisma/src/prisma/schema/analytics.prisma (4)
Lines 14-18: Consider reusing existing enums or centralizing naming conventions.
The new `PerformanceLevel` enum is well-defined, but it duplicates the idea of `ActivityLevel` declared earlier. If they're semantically different, clarify it in comments or rename to avoid confusion.
Lines 42-47: Validate optional counts and avoid negative or inconsistent values.
These new float fields capturing first/last correctness/wrongness rates can cause confusion if left uninitialized or negative. Consider enforcing data integrity checks during inserts/updates.
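One possible shape for such a guard before upserting, assuming rates are stored as fractions in [0, 1]; `validate_rate` is a hypothetical helper, not existing code:

```python
def validate_rate(value, field="rate"):
    """Reject rates outside [0, 1]; None stays allowed for optional fields."""
    if value is None:
        return None
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{field} out of range: {value}")
    return float(value)

# Example: check a subset of the optional rate fields before an upsert
checked = {field: validate_rate(value, field)
           for field, value in {"firstCorrectRate": 0.75, "lastWrongRate": None}.items()}
```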
Lines 155-163: Evaluate future extensibility of weekly participant activity fields.
Storing each weekday's activity count in a single row is fine for now, but if you envisage adding multiple time windows (e.g., monthly or daily details), consider a separate table or structure.

Lines 270-288: The ActivityProgress model is straightforward.
The fields are self-explanatory for progress tracking. Consider minimum values (e.g., repeatedCount >= 0) if data anomalies can occur.

apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py (3)
Lines 1-2: Iterate carefully over DataFrame rows.
Using `iterrows()` for potentially large data sets can impact performance. Consider vectorized or batch upsert approaches if performance becomes an issue.
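If this ever becomes a bottleneck, one option is to materialize the rows once with `to_dict("records")` and write them in fixed-size batches; `run_batch` below is a stand-in for whatever bulk write the client offers, not an existing API:

```python
import pandas as pd

def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def batch_upserts(df, run_batch, size=100):
    # A single to_dict() call replaces the per-row iterrows() overhead
    payloads = df.to_dict("records")
    for batch in chunked(payloads, size):
        run_batch(batch)

# Demo with a stand-in sink that just records the batches it receives
df = pd.DataFrame({"participantId": ["p1", "p2", "p3"], "score": [4, 2, 7]})
batches = []
batch_upserts(df, batches.append, size=2)
```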
Lines 3-7: Creation and update values are logically consistent.
The approach of mirroring fields in both dictionaries for upsert is clear. Optional: unify them if identical to reduce duplication.
Also applies to: 8-11

Lines 25-26: The MicroLearnings section mirrors the quiz logic well.
No immediate issues. Could be refactored into a smaller shared routine if duplication grows over time.
Also applies to: 27-35
apps/analytics/src/scripts/7_participant_activity_performance.py (2)
Line 5: Unused imports detected by static analysis.
`import os` and `import pandas as pd` are flagged as unused. Remove them if you're not using these modules.
Also applies to: 7-7

🧰 Tools: 🪛 Ruff (0.8.2)
5-5: `os` imported but unused. Remove unused import: `os` (F401)

Line 38: Remove unnecessary f-string usage.
The print statement includes `f` but no placeholders, so it can be simplified.

🧰 Tools: 🪛 Ruff (0.8.2)
38-38: f-string without any placeholders. Remove extraneous `f` prefix (F541)
apps/analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py (1)
59-72
: Nested loops are generally OK but watch out for performance.
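Should that become necessary, one sketch (all names below are illustrative, not existing code) is to prefetch the responses once, index them in memory, and build the result DataFrame in a single call rather than querying inside the nested loops:

```python
from collections import defaultdict

import pandas as pd

def index_responses(responses):
    """Group prefetched response dicts by (participantId, activityId)."""
    by_key = defaultdict(list)
    for response in responses:
        by_key[(response["participantId"], response["activityId"])].append(response)
    return by_key

def summarize(responses):
    indexed = index_responses(responses)
    rows = [
        {"participantId": p, "activityId": a, "totalScore": sum(r["score"] for r in rs)}
        for (p, a), rs in indexed.items()
    ]
    return pd.DataFrame(rows)  # built once at the end, not concatenated per iteration

df_summary = summarize([
    {"participantId": "p1", "activityId": "a1", "score": 2},
    {"participantId": "p1", "activityId": "a1", "score": 3},
    {"participantId": "p2", "activityId": "a1", "score": 1},
])
```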
This straightforward approach is fine for moderate data sizes. If you handle large courses or many responses, look into bulk operations or indexes to speed queries.apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py (2)
20-20
: Rename loop index to avoid confusion.
The variable `idx` is unused. Rename it to `_idx` or `_` to adhere to static analysis warnings and clarify intent.
🪛 Ruff (0.8.2)
20-20: Loop control variable
idx
not used within loop bodyRename unused
idx
to_idx
(B007)
22-82
: DataFrame concatenation in a loop can be costly.
Repeatedly callingpd.concat()
in a loop might degrade performance. Consider building a list of dictionaries and creating a single DataFrame at the end.apps/analytics/src/notebooks/participant_activity_performance.ipynb (1)
98-98
: Validate database disconnection logic.
Calling db.disconnect() is good practice. Ensure that this method is not prematurely disconnecting, especially if multiple notebooks or scripts reuse the same Prisma client instance within the same session.packages/prisma/src/prisma/schema/participant.prisma (2)
51-52
: Confirm optional vs. mandatory relationships.
Both fields 'participantActivityPerformances' and 'coursePerformances' are declared as arrays. Make sure the foreign key constraints and cascade rules behave as intended (e.g., no orphaned records if a participant is removed).
51-52
: Validate indexing strategy for performance data.
Consider adding composite indexes or partial indexes on these new performance-related fields if you plan to query large volumes of participant performance data frequently.packages/prisma/src/prisma/schema/quiz.prisma (2)
34-36
: Ensure performance vs. progress fields are well-defined.
These newly-added fields in 'PracticeQuiz' can lead to confusion if there's conceptual overlap (e.g., performance vs. completion). Provide clear definitions and usage guidelines for each field to avoid data inconsistencies.
219-221
: Sustain consistent naming conventions.
'performance' and 'progress' are added to 'MicroLearning'. For clarity, consider harmonizing them with any existing naming patterns (e.g., 'activityPerformance', 'activityProgress') to maintain consistency across the codebase.packages/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql (1)
4-5
: Consider adding constraints for score and completion fields.The
totalScore
andcompletion
fields should have value constraints:
completion
should be between 0 and 1 (or 0 and 100 if using percentage)totalScore
might benefit from a non-negative constraintALTER TABLE "ParticipantActivityPerformance" ADD CONSTRAINT "completion_range" CHECK ("completion" >= 0 AND "completion" <= 1), ADD CONSTRAINT "score_non_negative" CHECK ("totalScore" >= 0);apps/analytics/src/modules/participant_activity_performance/__init__.py (1)
1-3
: Define module's public API explicitly.Add
__all__
to explicitly declare the module's public interface and silence the unused import warnings.from .prepare_participant_activity_data import prepare_participant_activity_data from .save_participant_activity_performance import save_participant_activity_performance from .agg_participant_activity_performance import agg_participant_activity_performance + +__all__ = [ + 'prepare_participant_activity_data', + 'save_participant_activity_performance', + 'agg_participant_activity_performance', +]🧰 Tools
🪛 Ruff (0.8.2)
1-1:
.prepare_participant_activity_data.prepare_participant_activity_data
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
2-2:
.save_participant_activity_performance.save_participant_activity_performance
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
3-3:
.agg_participant_activity_performance.agg_participant_activity_performance
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
apps/analytics/package.json
(1 hunks)apps/analytics/src/__init__.py
(1 hunks)apps/analytics/src/modules/__init__.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/__init__.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py
(1 hunks)apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py
(1 hunks)apps/analytics/src/notebooks/participant_activity_performance.ipynb
(1 hunks)apps/analytics/src/notebooks/participant_analytics.ipynb
(1 hunks)apps/analytics/src/scripts/0_initial_participant_analytics.py
(1 hunks)apps/analytics/src/scripts/1_initial_aggregated_analytics.py
(1 hunks)apps/analytics/src/scripts/2_initial_aggregated_course_analytics.py
(1 hunks)apps/analytics/src/scripts/3_initial_instance_activity_performance.py
(1 hunks)apps/analytics/src/scripts/4_initial_participant_performance.py
(1 hunks)apps/analytics/src/scripts/5_initial_participant_course_analytics.py
(1 hunks)apps/analytics/src/scripts/6_initial_activity_progress.py
(1 hunks)apps/analytics/src/scripts/7_participant_activity_performance.py
(1 hunks)packages/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql
(1 hunks)packages/prisma/src/prisma/schema/analytics.prisma
(8 hunks)packages/prisma/src/prisma/schema/participant.prisma
(1 hunks)packages/prisma/src/prisma/schema/quiz.prisma
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/analytics/src/notebooks/participant_analytics.ipynb
🧰 Additional context used
🪛 Ruff (0.8.2)
apps/analytics/src/scripts/2_initial_aggregated_course_analytics.py
4-4: os
imported but unused
Remove unused import: os
(F401)
6-6: pandas
imported but unused
Remove unused import: pandas
(F401)
32-32: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/6_initial_activity_progress.py
4-4: os
imported but unused
Remove unused import: os
(F401)
6-6: pandas
imported but unused
Remove unused import: pandas
(F401)
42-42: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/4_initial_participant_performance.py
4-4: os
imported but unused
Remove unused import: os
(F401)
38-38: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/1_initial_aggregated_analytics.py
4-4: os
imported but unused
Remove unused import: os
(F401)
5-5: json
imported but unused
Remove unused import: json
(F401)
92-92: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/0_initial_participant_analytics.py
4-4: os
imported but unused
Remove unused import: os
(F401)
5-5: json
imported but unused
Remove unused import: json
(F401)
apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py
20-20: Loop control variable idx
not used within loop body
Rename unused idx
to _idx
(B007)
apps/analytics/src/scripts/7_participant_activity_performance.py
5-5: os
imported but unused
Remove unused import: os
(F401)
7-7: pandas
imported but unused
Remove unused import: pandas
(F401)
38-38: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/3_initial_instance_activity_performance.py
4-4: os
imported but unused
Remove unused import: os
(F401)
5-5: json
imported but unused
Remove unused import: json
(F401)
6-6: datetime.datetime
imported but unused
Remove unused import: datetime.datetime
(F401)
8-8: pandas
imported but unused
Remove unused import: pandas
(F401)
45-45: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/scripts/5_initial_participant_course_analytics.py
5-5: os
imported but unused
Remove unused import: os
(F401)
6-6: json
imported but unused
Remove unused import: json
(F401)
7-7: datetime.datetime
imported but unused
Remove unused import: datetime.datetime
(F401)
9-9: pandas
imported but unused
Remove unused import: pandas
(F401)
40-40: f-string without any placeholders
Remove extraneous f
prefix
(F541)
apps/analytics/src/__init__.py
1-1: from .modules import *
used; unable to detect undefined names
(F403)
2-2: from .notebooks import *
used; unable to detect undefined names
(F403)
3-3: from .scripts import *
used; unable to detect undefined names
(F403)
apps/analytics/src/modules/__init__.py
1-1: from .participant_analytics import *
used; unable to detect undefined names
(F403)
2-2: from .aggregated_analytics import *
used; unable to detect undefined names
(F403)
3-3: from .participant_course_analytics import *
used; unable to detect undefined names
(F403)
4-4: from .aggregated_course_analytics import *
used; unable to detect undefined names
(F403)
5-5: from .participant_performance import *
used; unable to detect undefined names
(F403)
6-6: from .instance_activity_performance import *
used; unable to detect undefined names
(F403)
7-7: from .activity_progress import *
used; unable to detect undefined names
(F403)
8-8: from .participant_activity_performance import *
used; unable to detect undefined names
(F403)
apps/analytics/src/modules/participant_activity_performance/__init__.py
1-1: .prepare_participant_activity_data.prepare_participant_activity_data
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
2-2: .save_participant_activity_performance.save_participant_activity_performance
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
3-3: .agg_participant_activity_performance.agg_participant_activity_performance
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
🔇 Additional comments (17)
apps/analytics/package.json (1)
16-21
: Verify Node.js version consistency
The Node.js version is strictly pinned to 20, which might be too restrictive. Let's verify if this matches the root configuration and consider using a more flexible version range.
✅ Verification successful
Node.js version is consistent with root configuration
The Node.js version (=20
) in apps/analytics/package.json
exactly matches the version specified in the root package.json
. Most packages in the monorepo also use the same strict version, with only one exception (apps/office-addin
using >=16
). Since this follows the project-wide convention, the strict version pinning appears to be intentional.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check Node.js version consistency across the project
# Get Node.js version from root package.json
echo "Root package.json Node.js version:"
jq -r '.engines.node' package.json
# List all package.json files and their Node.js version requirements
echo "\nNode.js versions across all package.json files:"
fd package.json --type f --exec sh -c 'echo "{}" && jq -r ".engines.node // \"not specified\"" "{}"'
Length of output: 1198
apps/analytics/src/scripts/0_initial_participant_analytics.py (1)
36-40
: Verify time zone handling.
Using naive date/time strings (e.g., “YYYY-MM-DD”) and converting them to “T00:00:00.000Z” might overlook actual time zone offsets. If your data or users span multiple time zones, ensure that time zone conversions are addressed or confirm that naive handling is sufficient.
✅ Verification successful
Let me search for timezone-related code and patterns in the codebase to better understand the context and potential impact.
Let me check the specific timezone handling in the files where we found datetime usage.
Based on the analysis of the codebase, I can now provide a final response:
Time zone handling is consistently implemented and safe.
The codebase consistently handles timezone-sensitive operations by:
- Using UTC for all database timestamps (appending 'Z' suffix)
- Properly localizing timestamps when needed (e.g.,
tz_localize("UTC")
in weekday activity calculations) - Following a consistent pattern where date strings are converted to timestamps with explicit timezone markers (T00:00:00.000Z or T23:59:59.999Z)
The naive date handling in the reviewed code is intentional and appropriate because:
- Start/end dates are always converted to UTC timestamps before database operations
- Date ranges are used for binning/aggregation purposes where timezone precision isn't critical
- All timestamp comparisons use consistent UTC timezone markers
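A small pandas illustration of the convention described above:

```python
import pandas as pd

# Pin a naive date string to explicit UTC day boundaries before comparing it
# against stored timestamps (the T00:00:00.000Z / T23:59:59.999Z markers).
day = "2024-03-01"
start = pd.Timestamp(day).tz_localize("UTC")
end = pd.Timestamp(f"{day} 23:59:59.999").tz_localize("UTC")
```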
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Search for timezone-related imports and usage
rg -l "timezone|pytz|tzinfo|tz" apps/analytics/src/
# Look for datetime usage patterns
ast-grep --pattern 'datetime.now()' apps/analytics/src/
# Search for any configuration files that might contain timezone settings
fd -e yaml -e yml -e json -e conf -e ini . apps/analytics/src/
# Check if there are any other files using date ranges
rg "date_range" apps/analytics/src/
Length of output: 5141
Script:
#!/bin/bash
# Check the content of files where we found datetime usage, focusing on timezone handling
rg -B 5 -A 5 "T00:00:00.000Z|T23:59:59.999Z" apps/analytics/src/
# Check the content of the weekday activity file that had timezone references
cat apps/analytics/src/modules/aggregated_course_analytics/compute_weekday_activity.py
# Look for any timezone-related configurations or imports at the top of analytics files
rg -B 5 "^import|^from.*import" apps/analytics/src/modules/participant_analytics/compute_participant_course_analytics.py
rg -B 5 "^import|^from.*import" apps/analytics/src/scripts/0_initial_participant_analytics.py
Length of output: 37608
apps/analytics/src/scripts/3_initial_instance_activity_performance.py (3)
50-68
: Review the logic for handling practice quizzes.
The core flow looks consistent:
• Compute instance performance.
• Skip if empty.
• Aggregate activity performance.
• Save instance performance data.
• Save activity performance data.
Everything seems logical. Ensure you have comprehensive tests covering courses with no quizzes and courses where quizzes have zero participants.
33-34
: Database connection management is implemented correctly.
The connection is opened once and closed at the end, which is good practice. Continue to ensure no intermittent errors remain unhandled.
Also applies to: 90-91
69-89
: Align microlearning flow with practice quizzes to ensure consistent data handling.
The structure parallels the practice quiz flow. Confirm that the “total_only” parameter is correctly used and tested.
packages/prisma/src/prisma/schema/analytics.prisma (6)
57-59
: Consistent timestamp fields added.
The added “createdAt” and “updatedAt” fields across multiple models follow standard best practices. Make sure to maintain them consistently, especially when performing bulk or manual updates.
Also applies to: 70-70, 83-85, 104-106, 125-127, 146-148
164-169
: No issues spotted with the relational mapping.
Ensuring a unique course ID for aggregated course analytics is consistent with single-record logic per course.
223-249
: ActivityPerformance model looks consistent with your approach.
No critical issues found, but be mindful of large-scale expansions if you plan to store many fields per day or per attempt.
251-268
: Two unique constraints might cause unexpected duplication errors.
With "@@unique([participantId, practiceQuizId])" and "@@unique([participantId, microLearningId])", ensure your logic never tries to fill both IDs in a single record.
25-26
: Double-check time precision requirements for analytics.
Storing timestamps as a date only (without time) might limit or complicate more granular analyses or daily cross-overs. Confirm that this meets your analytics needs.
✅ Verification successful
Let me search for more context about the analytics schema and its usage to better understand the time precision requirements.
Let me try a different approach to search for analytics-related files and usage patterns.
Let me search for the actual usage of these timestamp fields in the analytics processing code.
Based on the code analysis, I can now provide a definitive response about the timestamp precision requirements:
@db.Date precision is appropriate for the analytics timestamps
The timestamp fields in the analytics schema are correctly using @db.Date
precision because:
- The analytics data is aggregated on a daily, weekly, monthly, or course-wide basis as shown by the
AnalyticsType
enum - The Python code processes analytics in day-level granularity, with timestamps always set to the start or end of a day (e.g.,
T00:00:00.000Z
orT23:59:59.999Z
) - No sub-day (hourly/minute) analytics are being computed or stored
- Cross-day analysis is handled through the separate daily aggregations, not through time components
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Search for files containing analytics schema references
rg -l "analytics" --type prisma
# Search for any analytics-related queries or operations
rg "analytics" -A 5 --type typescript --type javascript
Length of output: 153
Script:
#!/bin/bash
# Search for files containing "analytics" in their name or content
fd analytics
# Search for content with analytics references using basic grep
rg "analytics" -A 5
Length of output: 74237
Script:
#!/bin/bash
# Search for timestamp usage in Python files
rg "timestamp" --type py -A 3 apps/analytics/
# Search for timestamp in the schema file
cat packages/prisma/src/prisma/schema/analytics.prisma
Length of output: 29003
195-221
: Optionally, constrain first/last/total rates in “InstancePerformance.”
It’s good to track attempt-based rates. However, ensure partial rates, error rates, etc., sum to sensible values (e.g., 1.0) if that’s a design requirement.
✅ Verification successful
Let me search for any related schema constraints or validation logic in the codebase.
Let me search for any validation or calculation logic related to these rates.
Based on the code analysis so far, I can see that the rates are used in GraphQL types and services, but I haven't found explicit validation logic. Let me try one more search pattern to look for any rate-related calculations or validations.
Based on the code analysis, I can now provide a final response:
The rates in InstancePerformance are already constrained by design.
The codebase shows that these rates are handled consistently throughout the application:
- The rates are used in pairs (error/partial/correct) across first, last, and total attempts
- Frontend components like
PerformanceRatesBarChart
enforce that rates sum to 100% by calculatingroundedCorrectRate = 100 - roundedErrorRate - roundedPartialRate
- The analytics service maintains consistency by deriving rates from actual performance data
No additional database constraints are necessary as the data integrity is maintained at the application level.
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Search for any validation logic related to InstancePerformance rates
rg -A 5 "InstancePerformance.*Rate" --type ts
# Look for any test files related to InstancePerformance
fd "InstancePerformance.*test" --type file
# Search for any service/utility files that handle rate calculations
rg -A 5 "(first|last|total)(Error|Partial|Correct)Rate" --type ts
Length of output: 65904
Script:
#!/bin/bash
# Search for any validation or calculation logic for the rates
rg -A 5 "calculatePerformance|validateRates|computeRates" --type ts
# Look for any database triggers or constraints
fd "analytics.*sql" --type file
# Search for any test files that verify rate calculations
rg -A 5 "test.*Rate" --type ts
Length of output: 136
Script:
#!/bin/bash
# Search for any files that might contain rate calculations or validations
rg -A 5 "Rate.*=|updateRates|setRates|rate.*validation" --type ts
# Look for any prisma middleware or hooks that might handle rate validation
fd "middleware.*ts|hooks.*ts" --type file
# Search for any utility functions that handle rates
rg -A 5 "utils.*rate" --type-add 'ts:*.{ts,tsx}' --type ts
Length of output: 31660
apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py (1)
13-14
: PracticeQuizzes upsert logic is sound.
The “where” clause ensures uniqueness on participant and quiz ID, preventing duplicates. Confirm that you handle missing or invalid “activityId” values upstream.
Also applies to: 15-23
✅ Verification successful
Let me gather information about how activityId is handled before reaching this point.
Let me search for the validation of activityId in the Python codebase.
Let me search for the complete implementation and any validation in the surrounding code.
Let me check how activityId is prepared before reaching this function.
Based on the complete code analysis, I can now generate the final response:
The activityId handling is properly validated upstream. The prepare_participant_activity_data.py ensures that:
- Only activities with valid statuses ("PUBLISHED", "ENDED", "GRADED") are processed
- ActivityIds are extracted directly from the database query results
- The data is properly structured before reaching the upsert operation
The uniqueness constraint in the upsert operation (participantId_practiceQuizId
and participantId_microLearningId
) provides additional safety against duplicates.
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Search for the function containing this code and its context
ast-grep --pattern 'def save_participant_activity_performance($$$)'
# Also search for any validation or preprocessing of activityId
rg "activityId" -A 5 -B 5
Length of output: 65632
Script:
#!/bin/bash
# Search for the save_participant_activity_performance function implementation
rg -A 10 -B 10 "def save_participant_activity_performance" apps/analytics/src/modules/participant_activity_performance/
# Search for any validation of activityId
rg "activityId.*validate" -A 5 -B 5
Length of output: 1735
Script:
#!/bin/bash
# Get the complete file content to understand the validation context
cat apps/analytics/src/modules/participant_activity_performance/save_participant_activity_performance.py
# Also search for any input validation in related files
rg -g "*.py" "activityId" apps/analytics/src/modules/participant_activity_performance/
Length of output: 3763
Script:
#!/bin/bash
# Check the prepare_participant_activity_data.py file for activityId handling
cat apps/analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py
# Also check the aggregation file
cat apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py
Length of output: 6351
apps/analytics/src/modules/participant_activity_performance/agg_participant_activity_performance.py (2)
5-17
: Grouping approach for aggregated data is clear.
The logic to sum scores, trial counts, and response counts is straightforward. Keep an eye on memory usage for large DataFrame
merges.
85-87
: Upserting data is correct.
Storing the aggregated performance data using save_participant_activity_performance
is consistent with the rest of this module’s design.
apps/analytics/src/notebooks/participant_activity_performance.ipynb (1)
70-89
: Verify handling of empty responses.
When no responses are submitted, the script correctly skips analytics processing. However, confirm that you handle edge cases such as partially completed courses or incomplete data for other fields (like missing participant metadata) before continuing analytics computations.
✅ Verification successful
Let me gather information about how edge cases are handled in the codebase.
Let me try a different approach to search for these functions and their error handling.
Let me try one more search to find the implementation of these functions.
Based on the code analysis, I can now provide a final response:
Edge cases are properly handled in the analytics pipeline.
The codebase shows comprehensive handling of empty or missing data:
- Empty responses are explicitly checked and skipped:
if df_responses.empty: continue
- Missing participant data is handled:
if df_activity.empty: print("No participant was active in the course, skipping")
- Null checks are performed before processing:
if df_details is None: print("No participant responses found...")
- Empty performance data is gracefully handled:
if df_instance_performance.empty: continue
The implementation shows robust error handling for all the edge cases mentioned in the review comment, including empty responses, partially completed courses, and missing participant metadata.
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:

```shell
#!/bin/bash
# Search for prepare_participant_activity_data function implementation
ast-grep --pattern 'def prepare_participant_activity_data($$$)'

# Search for agg_participant_activity_performance function implementation
ast-grep --pattern 'def agg_participant_activity_performance($$$)'

# Look for any error handling or data validation in these functions
rg -A 5 "raise|error|exception|assert|validate|check" --type python
```
Length of output: 235
Script:

```shell
#!/bin/bash
# Search for the functions in any file
rg -A 10 "def prepare_participant_activity_data"

# Search for the aggregation function
rg -A 10 "def agg_participant_activity_performance"

# Look for error handling patterns
rg -A 5 "raise|error|exception|assert|validate|check"

# Find all Python files that might contain these functions
fd ".py$"
```
Length of output: 125936
Script:

```shell
#!/bin/bash
# Look for the prepare_participant_activity_data function implementation
fd -e py -x cat {} | rg -A 10 "def prepare_participant_activity_data"

# Look for the agg_participant_activity_performance function implementation
fd -e py -x cat {} | rg -A 10 "def agg_participant_activity_performance"

# Look for any data validation in these files
fd -e py -x cat {} | rg -A 5 "if.*empty|if.*None|if not|raise|assert"
```
Length of output: 8692
packages/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql (1), lines 20-26: Review CASCADE deletion impact.
The ON DELETE CASCADE behavior on all foreign keys means that deleting a participant or activity will automatically remove all their performance records. Verify if this aligns with data retention requirements.
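To make the consequence concrete, here is a minimal sketch of `ON DELETE CASCADE` using an in-memory SQLite database; the real schema is the Prisma/Postgres migration, and the simplified table and column names below are illustrative only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this explicitly
conn.execute("CREATE TABLE Participant (id INTEGER PRIMARY KEY)")
conn.execute(
    """CREATE TABLE ParticipantActivityPerformance (
        id INTEGER PRIMARY KEY,
        participantId INTEGER NOT NULL
            REFERENCES Participant(id) ON DELETE CASCADE)"""
)
conn.execute("INSERT INTO Participant (id) VALUES (1)")
conn.execute(
    "INSERT INTO ParticipantActivityPerformance (id, participantId) VALUES (10, 1)"
)

# Deleting the participant silently removes all dependent performance rows.
conn.execute("DELETE FROM Participant WHERE id = 1")
remaining = conn.execute(
    "SELECT COUNT(*) FROM ParticipantActivityPerformance"
).fetchone()[0]
print(remaining)  # 0
```

If analytics history must survive account deletion (e.g. for anonymized course-level reporting), a `SET NULL` or soft-delete strategy would be needed instead.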
apps/analytics/src/__init__.py (1), lines 1-3: Avoid wildcard imports.
Wildcard imports can lead to namespace pollution and make it difficult to track which names are present in the namespace. Import only the necessary components explicitly.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `from .modules import *` used; unable to detect undefined names (F403)
2-2: `from .notebooks import *` used; unable to detect undefined names (F403)
3-3: `from .scripts import *` used; unable to detect undefined names (F403)
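A small illustration of the concern behind F403: with `import *`, the set of bound names depends on the source module's contents (or its `__all__`), so static tools cannot verify usages. The standard-library `math` module stands in for the project's `.modules` package here:

```python
# Explicit import: the bound name is visible at the import site,
# so linters can check every usage.
from math import sqrt
print(sqrt(9.0))  # 3.0

# With a wildcard import, every public name of the module lands in the
# namespace; a later wildcard from another module could silently shadow
# some of them, which is the F403 concern Ruff raises above.
namespace = {}
exec("from math import *", namespace)
print("sqrt" in namespace, "pi" in namespace)  # True True
```

Listing the needed names explicitly (or defining `__all__` in each submodule) resolves the warning without changing behavior.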
.../analytics/src/modules/participant_activity_performance/prepare_participant_activity_data.py
...s/prisma/src/prisma/migrations/20241223092511_participant_activity_performance/migration.sql
Important: Review skipped. More than 25% of the files were skipped due to the max files limit, so the review was skipped to prevent a low-quality review: 132 of 222 files are above the max files limit of 75.
…formance as table (#4412)
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores