Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add service to update namespace stats. #5075

Merged

Conversation

vaidehi44
Copy link
Contributor

@vaidehi44 vaidehi44 commented Jul 27, 2022

What this PR does

Defines a service to generate/update namespace stats to CourseStats.

Steps to reproduce for cookbook namespace

  1. Copy/ Import the course -https://outreachdashboard.wmflabs.org/courses/Edriiic/Nigerian_Cuisine
  2. In rails console,
course = Course.find_by(title: "Nigerian Cuisine");
wiki = course.wikis.find_by(project: "wikibooks");
  1. Update revisions with UpdateCourseStats.new(course)
  2. Update namespace stats with UpdateWikiNamespaceStats.new(course, wiki, 102)


# live, tracked articles filtered by wiki and namespace
def articles_filtered_by_wiki_namespace
@course.articles_courses.joins(:article).where(articles: { wiki: @wiki, namespace: @namespace })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't include generating ArticlesCourses for non-mainspace articles, correct? So this will always be zero for this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right. I missed it. Will correct it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ragesoss, right now tracked_namespaces are not known, so should I enable article_courses for all namespaces?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, i don't think so. I think it would be better to include a minimal version of tracked_namespaces in this PR such that it could be deployed without changing current behavior, but would still allow for testing.

One approach would be to define tracked_namespaces for course.rb but just have it return [0] (ie, hard-code the current behavior of mainspace being the only tracked namespace for any course), and then in the test you can stub that method to override it and add cookbook. This would be the simplest way i could think of for this PR, although it would mean a small amount of rewriting for the next stage.

Another approach would be include the new database table in this PR, so that tracked_namespaces could be determined from the database (which will be necessary for the later PRs anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, are you going to deploy it right away? I mean before we have tracked_namespaces from CourseWikiNamespaces ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe! Part of the point of the separate PRs is to make it possible to deploy before everything else is ready.

@ragesoss
Copy link
Member

What additional changes would be necessary to ensure that the correct stats are being generated (via an rspec test)?

I guess generating ArticlesCourses for non-mainspace articles is the main thing?

@ragesoss
Copy link
Member

ragesoss commented Aug 1, 2022

Now with the tracked_namespaces, you should be able to write a test for generating ArticlesCourses records and calculating stats for a Cookbook course. This will probably entail something like allow_any_instance_of(Course).to receive(:tracked_namespacse).and_return([0, 102]) or similar.

@@ -282,6 +282,10 @@ def tracked_revisions
.where(wiki_id: wiki_ids)
end

def tracked_namespaces
return ['0']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an integer instead of a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def self.get_mainspace_revisions(revisions)
revisions.joins(:article).where(articles: { namespace: '0' })
end

Yeah, but in articles_courses it was a string, so I kept it a string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I don't think there's any reason for it to be a string, as namespace is an integer in the database.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay.


require 'rails_helper'

describe UpdateWikiNamespaceStats do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look solid!

expect(stats).to have_key(:user_count)
expect(stats).to have_key(:word_count)
expect(stats).to have_key(:reference_count)
expect(stats).to have_key(:view_count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What values do these have? Does this test exercise the generation of non-zero values for all of these stats? (I'm guessing no for the view count, and yes for the most of the rest?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I haven't checked for the values.
So, should I get the count for each attribute and write a test for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I wasn't sure if I should include the test where it checks for each attribute in key. Should I keep it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We might find that such values can vary a little bit over time (if any of the edits get deleted at some point) but for now, I think it's better to have a little extra test focus on verifying that these stats are actually being counted the way we think they should be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I wasn't sure if I should include the test where it checks for each attribute in key. Should I keep it?

Yes, I think that's fine. It makes it a little more explicit what shape we expect this data to take.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ragesoss, have updated the test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants