Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display format of authors name on dataset page (#80, #81, #82) #165

Merged
merged 32 commits into from
Feb 13, 2018

Conversation

rija
Copy link
Contributor

@rija rija commented Feb 12, 2018

Author names display

Functionality

The work aims to improve the way author names are displayed on datasets by implementing the following:

  • Fix issue where accentuated characters where not displayed correctly
  • Apply the same formatting rule to author names on dataset page as on Gigascience Journal's papers
  • Enable curators with free-form editing of the display of author names on dataset pages

Implementation

algorithm for displaying author name

In the model Author.php, I've created a new static function (generateDisplayName) that:

  • applies a formula for automatically generating an appropriately formatted display name for authors
  • that caters for many variations and combinations of surname, middle name and first name
  • removes some trailing special characters
  • does not uppercase names and initials (unlike the old behaviour) so to respect that some names should stay lowercase even when taken as initials

The model getters for author's name used in controllers and views make use of generateDisplayName() in its various forms.

mbstring

The reason the accentuated characters were not always displaying properly,
is because the string manipulation to display the name were using non unicode-aware PHP functions.

PHP has unicode-aware string manipulation functions but they are not part of the core library.
They are part of the php-mbstring PHP extention.

So for this to work, I have:

  • updated gigadb Chef's default recipe to install the php-mbstring module.
  • updated gigadb Chef's default template for the main config to configure mbstring (only needed for PHP<5.6)
  • updated Yii test config to configure mbstring (only needed for PHP<5.6)

database change

A database change has been performed to enable the feature for free-form customising the display of author names.
a new column, custom_name has been added to the author table.

I have updated the sql files in the protected/sql directory as well as added a migration file in the protected/schema directory.

Development & Testing

automated acceptance tests using Behat and Mink

There are two new .feature files, author-names.feature has the acceptance criteria for how the names are displayed
While name-preview.feature has the acceptance criteria for the new author field to free-form customise the display of an author's name.

The step definitions and the way database sate is reset between scenarios have been refactored
so that all feature can set the database according to their needs without impacting other features (e.g: affilate login is happy to run with default test data, while the author-names.feature need a special production-like dataset that has a wide selection of authors)

unit tests using PHPUnit and Yii CDbFixtureManager

This branch introduces unit testing.
Initially used to reproduce the bug from issue #82, it is used to test-drive at code level the functionalities of issues #80, #81.

It requires phpunit which is part of the Behat folder (installed with composer install).

It also uses Yii's unit testing features (CDbFixtureManager) to set up database fixtures.
This required creating a new test database and:

  • new Chef instructions in Gigadb's default recipe to create a gigadb_test database
  • a new json config for the test database configuration details: db_test.json
  • the test config file has been updated to load the test database and to configure constraints settings
  • a new sql file gigadb_unit_tests.sql to repopulate the test database between test run

The test runner ./tests/run has been updated to run all the tests (acceptance tests and then unit tests) and set initial and final database state appropriately. TESTING.md has also been updated with info on unit testing.

Rija Menage added 30 commits January 31, 2018 20:51
Wrote test scenarios for issue #81 using examples from the prodcution database.
Feature Context file boilerplate created.
Behat framework installed.
…science story

implemented the step definitions for the scenario to implement the story for issue #81 from gigascience perspective.

Made used of Behat tables to and real author names from the database.

That means taht before each run of the suite, the sample of production database needs to be loaded. The database dump is called "author-names-80-81-82.pgdmp" and it differs only from the  2016 dump by having an Attribute record for "urlredirect".
It was tricky to get PHPUnit with Yii1.1 with Gigadb-website running in Vagrant.
Among the several problem:   recent version of Php unit and its extensions would cause issues (4.8 fails. but 4.1 is ok).
Also the phpunit selenium extension is required even we don't use it  because the Yii framework references it!
Also avoid using minimum-stability as 'dev' in composer.json. prefer 'stable' and override with @dev on individual package if necessary.
…uce bug

Set up unit test framework for gigadb-website's Yii install.
Test are to be run inside vagrant ssh session using the command:
./../../Behat/vendor/phpunit/phpunit/phpunit --config=phpunit.xml unit

Created a test to replicate the bug mentioned in #82. Now the test fail as expected.

The unit test infrastructure follow best practices for Yii 1.1.

configuring bootstrap.php required extra care and is what glues everything.
work on fix to  #82 by replacing php string operations by unicode aware functions from mbstring.

The mbstring php extension is not installed by default, so needs to update gigadb provisioning to include the Chef recipe for mbstring.

The problem is that it seems the php.ini file needs to be updated with some minimal configuration parameters for mbstring, but the Chef recipe for php::module_mbstring is not doing it:

[mbstring]
mbstring.language = Neutral
mbstring.internal_encoding    = UTF-8
mbstring.encoding_translation = On

So I still needs to figure either how to get the provisioning mechanims to update php.ini or to make the configuration change unecessary.

I've also improved unit tests coverage for this functionality to cover Dataset methods that are related to authors' names.
second part of the fix to #82.
Added  a php configuration setting to enable the use of multibyte unicode aware string functions from the php module mbstring.

That setting is only needed when running PHP <5.6 as from 5.6, that setting is deprecated and the  default setting for characterset is UTF-8.

Updated both the main config and the test config as the unit tests are running from the test config.
set up the unit tests for the necessary routines. Currently failing as expected as implementation incomplete.
The display format for authors on the dataset page now is the same as the references on Gigascience Journal and all tests (unit and acceptance) are passing.
Installed php-pecl-xcebug in vagrant chef recipe as it is needed by phpunit to calculate test coverage.
Set up the creation of a test database in gigadb chef recipe, so that unit tests can run without disrupting manual and automated user testing by wiping out the development database.
Created a db_test.json config for the test database.
Moidified Yii test config to use the test database.
Run the test coverage for the gigadb-website codebase and generate a comprehensive html report.
I didn't realised the test.php config for Yii was also pulling main.php and therefore overriding the $dbConfig variable causing the unit tests to use the mai database. Corrected it by loadding the test database config into a distinct variable.
figured out how to organise Behat FeatureContext's step definitions in multiple file. The answer is to use subContexts.
Created a feature and a scenario for the preview functionality.
Refactored the Author's displayName function to be more flexible.
Added a field in the author view to show display name.
Updated the login form to be actionable by automated tests (submit button needs to be wihin the <form></form> element.

Moved snapshot after failure hook to GigadbWebsite so it can be automatically available to all subcontexts.

Got all acceptance tests scenario passing.
part of attempt at setting functional testing with phpunit
all test passing.

Just made sure the admin login comes from env variables.
Also add test_users* env variables with tests login/password in .gitignore.
moved the sign in as admin to the main context as it's a common scenario step.
added a test runner for acceptance tests
configure unit test to stop on failure and to show colorful output
…unit tests

fix to the following error:

CDbException: CDbCommand failed to execute the SQL statement: SQLSTATE[42501]: Insufficient privilege: 7 ERROR:  permission denied: "RI_ConstraintTrigger_73211" is a system trigger
temporary fix to the error below. Proper fix is to find way to create a fixture for gigadb_user table (whose model class as a different name) but I failed to figure how to get that working so far.

CDbException: CDbCommand failed to execute the SQL statement: SQLSTATE[23503]: Foreign key violation: 7 ERROR:  insert or update on table "dataset" violates foreign key constraint "dataset_submitter_id_fkey"
DETAIL:  Key (submitter_id)=(1) is not present in table "gigadb_user".
It's best practice to use "composer install" insstad of keeeping the composer vendorised subtree of dependencies.
downgraded PHPUnit to 4.1.1, so it work with the Yii Framework (even 4.3.* doesn't work with Yii 1.1).
Switched to minimum-stability: stable, so I can remove the @stable suffixes
add @dev suffix to mink-wunit-driver as that's the only branch avaiable for that package.
Removed google api client as it's already in Gigadb codebase.

Because of the PHPUNit downgraded, cannot use namespace when using PHPUNit Assert
(PHPUNit\Framework\Assert had to be replaced by PHPUnit_Framework_Assert)

Controlled which Behat hooks get run with which scenario using tags on features and on hooks
so that for example oauth revokation hook are only actioned in affilate login scenario

Addded a scenario to ensure the test environment is loaded up with all the test data in the author name display and name display preview tests.

Updated one sql test data script to drop the not null constraint on the affiliation column of the gigadb_user table.

Updated the test runner so it run all the acceptance tests and unit tests.
…yr is gitignored and dev will run "composer install" on their environment.
…cumentation

After all tests have run, the previous state of the database is restored.
This is done in the test runner now as it is a concern that overarch all tests.
Updating the TESTING docs with info on running unit tests and the above database setup.
-behat.yml because the correct one is in tests/behat.yml
-composer.lock because dev are supposed to do composer install on their environment. The entire composer vendord directory is gitignored so no need of this file.
Fixed syntax of pg_dump to ensure the initial state of database is saved and restored.
increased sleep time after terminating pg backend processes
Made the test runner logs visible in protected/runtime/
Moved the printCurrentUrl inside the try{} block as it blows up when a step fails and no web session has been started yet(with visit)
…e deterministic

By default CDbFixtureManager.php loads the fixtures using readdir() which returns list files in the order in which they are stored by the filesystem. It "seems" than that order vary from system to system resulting in the following error:

CDbException: CDbCommand failed to execute the SQL statement: SQLSTATE[23503]: Foreign key violation: 7 ERROR:  insert or update on table "dataset_author" violates foreign key constraint "dataset_author_author_id_fkey"
DETAIL:  Key (author_id)=(1) is not present in table "author".

because the join table fixture is loaded before the data table.

I've created an init.php file in the fixtures directory to customise the fixture loading behaviour, in this case, ensuring the fixtures are loaded  in the order that won't violate the foreign key constraint.
getDisplayName on Author model used the third form for generateDisplayName (as intended) instead of making the concatenation itself
Rija Menage added 2 commits February 8, 2018 21:45
…a custom name

updated the database schema to add a new varchar column custom_name to the author table.
updated the author form and model to edit/save the new column.
updated the getDisplayName to display the custom_name field if it is not null (otherwise display the calculated display name)
@pli888 pli888 merged commit 6f12b89 into gigascience:develop Feb 13, 2018
@rija rija mentioned this pull request Sep 24, 2019
pli888 added a commit to pli888/gigadb-website that referenced this pull request Nov 25, 2019
…ebsite into rija-file-upload-wizard

This commit merges pull request File upload wizard $346:

1. [gigascience#146: User stories, GitHub issues,  Acceptance tests and dependencies] (rija#146)
1. [gigascience#147: Prototype of client/server file upload and software architecture blueprint] (rija#147)
1. [gigascience#148: Setup upload server and ftp servers in Docker Compose and for CI/CD] (rija#148)
1. [gigascience#151: Build admin function to create restricted access: filesystem directories & security token ] (rija#151)
1. [gigascience#152: Build secure and reliable  upload server based on validated prototype] (rija#152)
1. [gigascience#153: Add remaining time estimate for file uploads] (rija#163)
1. [gigascience#165: Add FTP server interface to the Drop-Off area] (rija#165)

This pull request has the following functionalities:

File Upload Wizard FileDrop administration API and a prototype using it:
* Create or Delete a FileDrop account and filesystem for a specific dataset through a REST API.
* JSon Web Token based authentication for accessing the REST API
* The FileDrop account's filesystem can be interacted with through a TUSd file upload server and an FTP server
* A watcher (based on Linux's inotify) service moves files around the filesystem from the drop-off area to a download area (where ftp links used in mockup page are built from)
* The initial prototype has been updated to use the API and to be optionally deployable from within the project
* the File Upload Wizard is a standalone, full-stack web application built out of the [Yii 2 Advanced template](https://www.yiiframework.com/extension/yiisoft/yii2-app-advanced/doc/guide/2.0/en/start-installation) but still accessed through GigaDB website's URL space and sharing the same Database server and code repository
* Nginx setup was updated to proxy multiple web apps and other services (Tusd). See ``ops/configuration/nginx-conf/sites``
* Nginx setup was updated to make it easy to disable/enable specific webapps (used for feature flagging the prototype). See ``ops/configuration/nginx-conf/enable_sites``
* Significant changes to the CI/CD scripts for deploying multiple webapps and for improved operational safety
* Added [TUSd](https://tusd.io) service as container
* Added [Pure-FTPD](https://pureftpd.org) service as container
* Added a [docker-inotify-command](https://github.com/coppit/docker-inotify-command) service as container

This PR is meant to lay down infrastructure changes needed to implement File Upload Wizard, so no visible business features are delivered here but the prototype was updated to use the API and to be deployable at ``/proto/``.

Both the [GoogleDocs spreadsheet for tasks](https://docs.google.com/spreadsheets/d/1bogVGdvchB790TpMQDN3LL_BvolXWoCni76k1Qi7OiA/edit?pli=1#gid=0) and my [Git project board](https://github.com/users/rija/projects/1) should be up-to-date.

No change to **gigadb** database, but a new database **fuwdb** to support File Upload Wizard web application's functions.

**fuwdb**'s schema is almost entirely built out of Yii2 migrations in ``fuw/app/console/migrations``.

Only the **user** table is created by the Yii2 Advanced Template's [Composer script](https://www.yiiframework.com/extension/yiisoft/yii2-app-advanced/doc/guide/2.0/en/start-installation#installing-using-composer).

No change to GigaDb model classes.
New model layer in the File Upload Wizard webapps divided in 3 categories:
* Backend models (``fuw/app/backend/models/FiledropAccount.php`` and ``fuw/app/backend/models/DockerManager.php``): admin functions, used for admin management of Filedrop accounts
* Frontend models: user facing for future integration between GigaDB UX and FUW's REST API, currently just scaffolded code from Yii2 Advanced template
* Common models: model and config used by backend and frontend. ``fuw/app/common/models/User.php`` is used for the JSon web token (JWT) based authentication to the REST API.

No change to GigaDB controller classes.
There is a new REST controller in File Upload Wizard for creating and deleting filedrop accounts ``fuw/app/backend/controllers/FiledropAccountController.php``.
That controller that corresponds to the ``FiledropAccount.php`` model was automatically scaffoled using `gii`.

However, I've added a custom ``fuw/app/backend/actions/FiledropAccountController/DeleteAction.php`` action so that DELETE command marks the model with a special status instead of deleting it.

N/A

N/A

The test setup for GigaDB was updated to fix inconsistent and fragile approach to loading and restoring database data and to work in context of multi-webapp testing. ([relevant commit](rija@f84d739#diff-ddbd905c83513d6eac66f30e9a493068))

Also Behat acceptance tests now use profiles instead of tags for configuration. See ``behat.yml``
So acceptance are run with a profile as argument. e.g:
```
$ docker-compose run --rm test bin/behat --profile local -v --stop-on-failure
```

The 'cli' profile is for Gitlab CI environment.

The File Upload Wizard has unit tests and functional tests. There's no business functionality completed yet so no acceptance tests has been added yet.

The tests are located in:
* ``fuw/app/backend/tests/``
* ``fuw/app/common/tests/``
* ``fuw/app/frontend/tests/``

[Codeception](https://github.com/Codeception/Codeception) is setup in File Upload Wizard directory as a test runner and test framework for unit, functional and acceptance tests (and it can support Behat features). It is integrated, extensible, up-to-date and maintained and work with the most recent and future version of underlying technology (PHP, PHPunit, Yii2, Selenium).

The config files relevant to Codeception are:
* ``fuw/app/backend/codeception.yml``
* ``fuw/app/common/codeception.yml``
* ``fuw/app/frontend/codeception.yml``
* ``fuw/app/codeception.yml``

More info on testing is included in ``docs/fuw/developer_guide.md``.

The overall flow and technologies haven't changed. The following changes have been made to accomodate CI and CD of a multi-webapps project or to fix structural weakness in the CI/CD process:
* Build manifests (Dockerfile and Production-Dockerfile) should be part of each webapp directory space, but service composition (docker-compose.\*.yml), appliance provisioning (terraform) and system configuraiton (ansible) should stay centralized
* It is dangerous to only have one Terraform state for all environments as a provisioning mistake can take down the wrong environment (e.g: production), so now there is a distinct Terraform state for each environment represented by a separate environment directory in ``ops/infrastructure/envs``
* Previously, provisioning a new environment required duplicating Terraform and Ansible code. Now they are modularized with the bulk of the code (Ansible roles and Terraform modules) stored outside the environment directory because they are environment-agnostic.
* The Ansible playbook and variables and Terraform main script and variables are in the environment-specific directory in ``ops/infrastructure/envs``

Furthermore, the provisioning for GigaDB has been upgraded to Debian 9 Stretch and the database server upgraded to Postgresql 9.6.

Finally, there are updates to Gitlab CI config ``.gitlab-ci.yml``:
* distinct stages to deploy on staging for GigaDB only (**deploy_gigadb**), GigaDB and File Upload Wizard API (**deploy_apps**), or GigaDB with File Upload Wizard API and the prototype (**deploy_proto**)
* It is no longer necessary to have a separate stage for initial deployment of Let's Encrypt certificate.

* The CI/CD docs have been updated in light of changes in provisioning
* There's a new ``docs/fuw`` directory with documentation relating to File Upload Wizard (with the quickstart guide in ``docs/fuw/index.md``)
* I now use [mkdocs](https://www.mkdocs.org) to create a doc server and to build a doc web site easily (configured with ``mkdocs.yml`` at the root of the project)
* There's a ``docs/index.md`` with instructions for using mkdocs and to serve as index page for doc server/website.
* updated ``README.md`` and ``docs/INSTALL.md`` to reflect the change in starting and testing the GigaDB webapp

Bind-mounting of Docker Daemon unix socket inside container is a security vulnerability, so to allow inter-container communication, File Upload Wizard use the ``Docker-PHP`` php library that uses the Docker Daemon API instead. See ``DockerManager.php`` for the current implementation.
(I haven't changed yet the GigaDB acceptance tests that still use the unix socket approach).
However on the Mac, the current version of Docker For Desktop doesn't expose the API on a TCP port, so when working on File Upload Wizard, we need to manually expose the API on a TCP port during development. See ``docs/fuw/index.md``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants