Add dkan linkchecker feature #2339

Merged
merged 4 commits into from
Apr 28, 2018

@janette (Member) commented Jan 19, 2018

User Story

As a user in charge of data integrity, I want to be able to see where links are failing in my data catalog.

  • See broken URLs report displayed in a human-readable format
  • Find/exclude intentionally non-public results (public access level)
  • Find/exclude different error codes/types (i.e. redirects vs. file not found)
  • See, along with broken URLs, the data resource URL so that they can easily find the resource
  • See, along with broken URLs, the dataset, so that they can group multiple-bad-URLs that are from a single dataset
  • See contact info for the dataset so that they can reach out to the data source
  • Know if a bad link is from a JSON harvest, so that they can change it there.

QA steps

  • drush en dkan_linkchecker

  • drush fr dkan_linkchecker -y

  • drush cc all

  • Add a dataset with bad links in multiple fields

  • Add resource with a bad remote file

  • Add a harvest source with a bad url

  • if Local:

    • drush linkchecker-analyze
    • drush cron
  • if Probo:

    • Go to Configuration > Content authoring > Link checker
    • Scroll to the bottom, click Maintenance, then click Reanalyze content for links
    • run cron
  • Admin should see Broken Links Report under 'Reports'

  • Admin should see Link checker under 'Configuration > Content Authoring'

  • Sitemanager should see Link Checker Settings in the admin menu under 'Site Configuration'

  • Sitemanager should see Broken Links Report in the admin menu under 'Site Configuration > Link Checker Settings'

  • View broken link report and confirm the bad links are listed.
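
The local QA commands above can be collected into a small script. This is only a sketch: it assumes drush is on the PATH and that the file is sourced from the Drupal site root. `DRUSH` is a hypothetical override variable and the function names are illustrative; neither is part of DKAN or drush. The `-y` on the enable step and the final cache clear are additions to the listed steps.

```shell
#!/bin/sh
# Sketch of the local (non-Probo) QA commands from the list above.
# Assumption: drush is on PATH and this file is sourced from the site root.
# DRUSH is a hypothetical override variable, not part of DKAN or drush.
DRUSH="${DRUSH:-drush}"

# Enable the module, revert its feature, and clear caches.
# -y is added to the enable step so the run does not stop at a prompt.
linkchecker_qa_setup() {
  "$DRUSH" en dkan_linkchecker -y &&
  "$DRUSH" fr dkan_linkchecker -y &&
  "$DRUSH" cc all
}

# Run this after the test content (dataset with bad links, resource with a
# bad remote file, harvest source with a bad URL) has been added by hand.
linkchecker_qa_scan() {
  "$DRUSH" linkchecker-analyze &&
  "$DRUSH" cron &&
  # Extra cache clear (an addition): the report menu links may not show up
  # until caches are rebuilt.
  "$DRUSH" cc all
}
```

Call `linkchecker_qa_setup` first, add the test content through the UI, then call `linkchecker_qa_scan` and check the report pages listed above.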

Reminders

  • There is a test for the issue.
  • CHANGELOG updated.
  • Coding standards checked.
  • Review docs.getdkan.com (or in /docs) to see if it still covers the scope of the PR and update if needed.

Connects https://github.com/NuCivic/client-usva-data/issues/69

erogray commented Jan 23, 2018

@janette janette force-pushed the dkan-linkchecker branch 2 times, most recently from e8dda20 to 9b2ff14 Compare February 6, 2018 02:23
@kimwdavidson kimwdavidson mentioned this pull request Feb 6, 2018
@janette janette force-pushed the dkan-linkchecker branch 5 times, most recently from 32cbb50 to 6da7289 Compare February 13, 2018 19:51
@janette (Member Author) commented Mar 13, 2018

Fails linting

Generating autoload files
Config value "installed_paths" added successfully
Diff URL: https://github.com/GetDKAN/dkan/pull/2339.diff
Linting: modules/dkan/dkan_dataset/modules/dkan_dataset_content_types/dkan_dataset_content_types.info
modules/dkan/dkan_dataset/modules/dkan_dataset_groups/dkan_dataset_groups.info
modules/dkan/dkan_dataset/modules/dkan_dataset_rest_api/dkan_dataset_rest_api.info
modules/dkan/dkan_dataset/modules/dkan_dataset_voting/dkan_dataset_voting.info
modules/dkan/dkan_datastore/modules/dkan_datastore_api/dkan_datastore_api.info
modules/dkan/dkan_datastore/modules/dkan_datastore_fast_import/dkan_datastore_fast_import.info
modules/dkan/dkan_linkchecker/dkan_linkchecker.features.inc
modules/dkan/dkan_linkchecker/dkan_linkchecker.features.user_permission.inc
modules/dkan/dkan_linkchecker/dkan_linkchecker.info
modules/dkan/dkan_linkchecker/dkan_linkchecker.install
modules/dkan/dkan_linkchecker/dkan_linkchecker.module
modules/dkan/dkan_linkchecker/dkan_linkchecker.strongarm.inc
modules/dkan/dkan_linkchecker/dkan_linkchecker.views_default.inc
modules/dkan/dkan_sitewide/modules/dkan_sitewide_panelizer/dkan_sitewide_panelizer.info
modules/dkan/dkan_topics/modules/dkan_default_topics/dkan_default_topics.info
+linkchecker.module
test/dkanextension/src/Drupal/DKANExtension/Context/LinkcheckerContext.php
themes/nuboot_radix/nuboot_radix.info
ERROR: The file "+linkchecker.module" does not exist.

* Implements hook_preprocess_page().
*/
function dkan_linkchecker_preprocess_page(&$vars) {
if (arg(2) == 'dkan-linkchecker-report' || arg(4) == 'dkan_linkchecker_reports') {
Contributor:
Why the check on arg(4)?

Member Author:
The size of the table is so large that I'm reducing font size so it does not require an annoying amount of scrolling. The arg(4) is to include this small font size on the views UI page for the same reason.

Contributor:
OK, let's add a comment to that effect. I think the arg(2) check is self-explanatory.

*/
function dkan_linkchecker_menu_alter(&$items) {
// Remove normal linkchecker report link.
$items['admin/reports/linkchecker']['access callback'] = FALSE;
Contributor:
I am assuming that this report is the default view you disabled previously. In that case, is this necessary?

Member Author:
In my opinion, if you don't disable the original view, there will be multiple report pages and confusion.

Contributor:
What does the code in dkan_linkchecker_enable do?

Member Author:
That is removing the link to the disabled view from the admin menu.

class LinkcheckerContext extends RawDKANContext {

protected $old_global_user;
public static $modules_before_feature = array();
Contributor:
Is there a particular reason for these variables to be public? It looks like we (this class) are managing them through the whole life cycle of this context.

Member Author:
I based this code on what is being used in the workbench context; I have no reasoning beyond that.

Contributor:
Let's make them private then. Public properties are generally not good, as we can easily lose control of their state.

| name | mail | roles |
| John | john@example.com | site manager |

Given groups:
Contributor:
We are not using the groups or group memberships in the test, I think? Why are we creating them?

Member Author:
This is so the test passes on client sites that require group assignment on their datasets.

Contributor:
Boooo... Let's add a comment then. Without that knowledge I would have deleted that during cleanup.

Member Author:
OK, after looking again I remember that I removed the test that created a dataset with a bad link, because too much needs to happen to get that bad link into the report. Since we are not creating a dataset, I've removed the group background setup.

*/
class LinkcheckerContext extends RawDKANContext {

protected $old_global_user;
Contributor:
At some point we should discuss OOP style standards as a group, but I personally like proper camel case both for methods and properties of a class.

Member Author:
k

Contributor:
I looked at the PSR standards and they do not say anything about how to name properties, so I will back down for now :)

@kimwdavidson

@janette We QA'd this, but got stuck on the last step!

After running cron, there's no Broken Links Report under 'Reports' for admins and no Broken Links Report in the admin menu for site managers.

@janette (Member Author) commented Apr 23, 2018

@kimwdavidson @dafeder did you run cron?
I cleared cache

Logged in as admin:
[screenshot: reports___dkan]

Logged in as site manager:
[screenshot: site-manager]

kimwdavidson commented Apr 23, 2018 via email

@janette (Member Author) commented Apr 23, 2018

@kimwdavidson OK, I guess the trick was clearing the cache.

@fmizzell (Contributor)

@janette yes we did

@kimwdavidson

@janette: QA'd and this all looks great, except when I'm logged in as a site manager, I can find the broken links report if I go into Site Configuration -> Link Checker Settings, but is it supposed to just be an item at the top of the menu?

@janette (Member Author) commented Apr 27, 2018

@kimwdavidson Do you want to screen share? The link is under Site Configuration if you are a site manager.

@kimwdavidson

Checked that last thing, and it's working as expected, so this is all ready!

@fmizzell fmizzell merged commit 6ee39a2 into develop Apr 28, 2018
@fmizzell fmizzell deleted the dkan-linkchecker branch April 28, 2018 23:41
fmizzell pushed a commit that referenced this pull request Apr 28, 2018
* Adds a linkchecker to DKAN.
fmizzell pushed a commit that referenced this pull request Apr 29, 2018
* Adds a linkchecker to DKAN.
dafeder pushed a commit that referenced this pull request Apr 24, 2020
* Adds a linkchecker to DKAN.
dafeder pushed a commit that referenced this pull request Apr 24, 2020
* Adds a linkchecker to DKAN.
4 participants