handle multiple middle names of authors #81

only1chunts · 2016-11-07T07:51:10Z

This should be considered in conjunction with #80 & #82

Many authors have more than one middle name, e.g. Thomas Gilbert = M Thomas P Gilbert, or Irene Wing Shan Chik, Steve Kwan Hok Tong, Kwok Wing Stephen Tsui, etc. We need a method to make the display of names slightly more configurable. Perhaps the easiest method is to have a new column in the database for "display name", which auto-populates with the usual first initial of each name but can be overwritten if required (in the admin pages), so Steve Kwan Hok Tong auto-populates to "Tong SK" but can be updated to "Tong SKH" if required. The webpage then takes the display name to generate the web view.
In GigaScience papers the citations list authors as surname followed by initials, e.g. Hebsgaard MB, Gilbert MTP, Arneborg J

only1chunts · 2017-09-15T01:09:33Z

This is the same issue as ticket #8 but this solution is better so I have closed #8.
Some of our authors have multiple middle initials.
For example, "Teo A S M" is currently stored in the database as:
Surname = "Teo"
middle name = "SM"
first name = "Audrey"
but displays on the dataset as "Teo, A S" (http://gigadb.org/dataset/100165) instead of "Teo A S M".
Make it so that the website displays the first letter of each word of the middle_name column. We then need to store the names in the database with spaces between multiple middle initials, e.g.
Surname = "Teo"
middle name = "S M"
first name = "Audrey"

rija · 2018-01-11T10:33:16Z

Hi,

As the goal of this change request is to accommodate the diversity of names,
a standardised format (GigaScience Journal), and the ability to tweak the displayed information,
I suggest implementing the following display algorithm:

1. Extract the surname.
2. Extract the first letter of every word in the first_name and middle_name fields.
3. If a word in the first_name or middle_name field consists only of capital letters, return each of the capital letters before moving on to the next word.
4. If a period, space or comma is met, it is considered a word separator and the algorithm applies to the words before and after it.
5. Merge all initials into one string.
6. Return the resulting display name in the order: surname, initials string, separated by a space.

Below, I take a sample of author records from the database and apply the algorithm to each.

Given the author has surname surname
And the author has first name first_name
And the author has middle name middle_name
When gigadb web site shows a paper by the author
Then the author's name should be displayed as display_name

Examples:

| surname | first_name | middle_name | display_name |
|---|---|---|---|
| Teo | Audrey | SM | Teo ASM |
| Gilbert | M.Thomas | P | Gilbert MTP |
| Muñoz | Ángel | GG | Muñoz ÁGG |
| Martinez-Cruzado | Juan | Carlos | Martinez-Cruzado JC |
| Shen | Yong-Yi | | Shen Y |
| Tong | Steve | KwanHok | Tong SK |
| Tong | Steve | Kwan.Hok | Tong SKH |
| Tong | Steve | Kwan,Hok | Tong SKH |
| Tong | Steve | Kwan Hok | Tong SKH |
| Tong | Kwan Hok | Steve | Tong KHS |

In the example that yields "Tong SK", the middle name has been entered as "KwanHok" in the database, so only its first letter is used. If the author changes the middle name
to "Kwan Hok", "Kwan.Hok" or "Kwan,Hok",
then the resulting display name becomes: Tong SKH
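The rules and examples above can be sketched in a few lines. The snippet below is only a Python illustration, not the project's PHP implementation; the function names `initials` and `display_name` are hypothetical, and reading "only capital letters" as "a purely alphabetic, upper-case word" is an assumption.

```python
import re

# Separators named in the algorithm: period, comma and whitespace.
SEPARATORS = re.compile(r"[.,\s]+")


def initials(field):
    """Return the initials for one name field (first_name or middle_name)."""
    result = []
    for word in SEPARATORS.split(field or ""):
        if not word:
            continue
        if word.isalpha() and word.isupper():
            # A word typed entirely in capitals ("SM", "GG") is taken as a
            # run of initials, so every letter is kept.
            result.append(word)
        else:
            # Otherwise only the first character is kept, with its case
            # and any accent left untouched.
            result.append(word[0])
    return "".join(result)


def display_name(surname, first_name, middle_name=""):
    """Build 'Surname INITIALS' from the three stored name fields."""
    letters = initials(first_name) + initials(middle_name)
    return f"{surname} {letters}".strip()


if __name__ == "__main__":
    print(display_name("Tong", "Steve", "Kwan Hok"))  # prints "Tong SKH"
```

Because a hyphen is not a separator in this sketch, "Yong-Yi" contributes a single "Y", which matches the "Shen Y" example.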

It will help to have a preview area in the author edit form, so that the form user can see in the preview how the name would be displayed on the web site.

Thus, with this predictable algorithm and a preview area, authors (or admins) have the flexibility to experiment in the form's existing name fields to influence how the name is displayed.

Let me know if the above examples and suggestion meet your goal for this ticket.

only1chunts · 2018-01-11T11:04:39Z

Thank you Rija, this is perfect! Cheers, Chris


create the initial behat config file in tests.

Wrote test scenarios for issue #81 using examples from the production database. Feature Context file boilerplate created. Behat framework installed.

…science story implemented the step definitions for the scenario to implement the story for issue #81 from the GigaScience perspective. Made use of Behat tables and real author names from the database. That means that before each run of the suite, the sample of the production database needs to be loaded. The database dump is called "author-names-80-81-82.pgdmp" and it differs from the 2016 dump only by having an Attribute record for "urlredirect".

@dev

It was tricky to get PHPUnit working with Yii 1.1 and Gigadb-website running in Vagrant. Among the several problems: recent versions of PHPUnit and its extensions cause issues (4.8 fails, but 4.1 is OK). Also, the PHPUnit Selenium extension is required even though we don't use it, because the Yii framework references it! Also, avoid setting minimum-stability to 'dev' in composer.json; prefer 'stable' and override with @dev on individual packages if necessary.

set up the unit tests for the necessary routines. Currently failing as expected as implementation incomplete.

The display format for authors on the dataset page is now the same as the references in GigaScience Journal and all tests (unit and acceptance) are passing. However, I came across two names which were entered in a way that may make me revise one of my initial assumptions.

The display format for authors on the dataset page now is the same as the references on Gigascience Journal and all tests (unit and acceptance) are passing.

Installed php-pecl-xdebug in the Vagrant Chef recipe as it is needed by PHPUnit to calculate test coverage. Set up the creation of a test database in the gigadb Chef recipe, so that unit tests can run without disrupting manual and automated user testing by wiping out the development database. Created a db_test.json config for the test database. Modified the Yii test config to use the test database. Ran the test coverage for the gigadb-website codebase and generated a comprehensive HTML report.

I didn't realise the test.php config for Yii was also pulling in main.php and therefore overriding the $dbConfig variable, causing the unit tests to use the main database. Corrected it by loading the test database config into a distinct variable.

Figured out how to organise Behat FeatureContext's step definitions in multiple files. The answer is to use subContexts. Created a feature and a scenario for the preview functionality.

Refactored the Author's displayName function to be more flexible. Added a field in the author view to show the display name. Updated the login form to be actionable by automated tests (the submit button needs to be within the <form></form> element). Moved the snapshot-after-failure hook to GigadbWebsite so it is automatically available to all subcontexts. Got all acceptance test scenarios passing.

part of attempt at setting functional testing with phpunit

make sure test coverage reports are not committed

all tests passing. Just made sure the admin login comes from env variables. Also added test_users* env variables with test login/password to .gitignore.

rija · 2018-01-30T08:14:18Z

Hi @only1chunts,

I've made a first implementation of this functionality on my branch:
https://github.com/rija/gigadb-website/tree/author-names-81

You can see the stories and acceptance criteria on the page below (click the [+] all button to expand and see them)

https://rija.github.io/gigadb-website/scenarios_report/author-names-80-81-82.html

It implements the algorithm I've described (see the examples in the page) but it scales down a little the idea of a preview area in the form:
The form is so small and simple (only 4 fields matching the database columns of the author table) that it's no less quick and convenient if I add a display name preview field to the author view page the curator sees after updating an author form.
(Maybe adding an "edit" button on the author view for admin users would make it even easier to make further corrections.)

Whereas a faithful preview area in the form would require a round-trip to the server for an AJAX call, which is not resource efficient.

Other observations I made during my implementation, shown in the examples on the above acceptance criteria page:

(1) There are some abbreviated names which are meant to stay lowercase. See the example of "Hekkert BtL": the "t" should stay lowercase, as shown in the PubMed article linked from the dataset: https://www.ncbi.nlm.nih.gov/pubmed/21743474
Therefore the algorithm only abbreviates; it won't capitalise. Authors would have to capitalise appropriately for the case to be retained (which is what is happening anyway, looking at a sample from the database).

(2) Accented characters (issue #82) are now represented correctly, as seen in the examples "Muñoz ÁGG" and "Schiøtt M".

(3) Some trailing special characters are filtered out automatically, as seen in "Schiøtt, ".

(4) Some names have a "Jr" attached. This should not be abbreviated, hence the resulting display name "Loughran TPJr"; the corresponding GigaScience manuscript preserves the "Jr" in the list of references: https://academic.oup.com/gigascience/article/2/1/1/2656134
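The four observations above can be folded into a small sketch. This is a hedged Python illustration rather than the code on the branch: the function names, the `KEPT_WHOLE` set, the set of stripped trailing characters and the way the example names are split across fields (e.g. middle name "te Lintel" or "P Jr") are all assumptions made for the example.

```python
import re

SEPARATORS = re.compile(r"[.,\s]+")

# Assumption: "Jr" is the only suffix mentioned in the thread, so it is the
# only token kept whole here.
KEPT_WHOLE = {"Jr"}

# Assumption: the thread only shows a trailing ", "; the other characters
# are illustrative.
TRAILING_NOISE = " ,.;"


def initials(field):
    """Abbreviate one name field without changing the case of any letter."""
    result = []
    for word in SEPARATORS.split(field or ""):
        if not word:
            continue
        if word in KEPT_WHOLE or (word.isalpha() and word.isupper()):
            # Suffixes and all-capital runs of initials are kept as typed.
            result.append(word)
        else:
            # First character only, case preserved: "te" contributes "t".
            result.append(word[0])
    return "".join(result)


def display_name(surname, first_name, middle_name=""):
    """Return 'Surname INITIALS' with trailing punctuation removed."""
    clean_surname = surname.strip(TRAILING_NOISE)
    letters = initials(first_name) + initials(middle_name)
    return f"{clean_surname} {letters}".strip()


if __name__ == "__main__":
    # Field values below are hypothetical reconstructions of the records.
    print(display_name("Hekkert", "Bas", "te Lintel"))
    print(display_name("Loughran", "Thomas", "P Jr"))
    print(display_name("Schiøtt, ", "Morten", ""))
```

Nothing here upper-cases a letter, so retaining the right case stays the responsibility of whoever enters the name, as described in (1).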

Let me know what you think.

only1chunts · 2018-01-31T09:01:59Z

I think that all sounds great, I think I need to see it in action to understand if it does what I need so I will ask @jessesiu to get it up and running on his machine for me to see.

It's best practice to use "composer install" instead of keeping the Composer vendorised subtree of dependencies.

@dev

Downgraded PHPUnit to 4.1.1 so it works with the Yii Framework (even 4.3.* doesn't work with Yii 1.1). Switched to minimum-stability: stable, so I can remove the @stable suffixes, and added the @dev suffix to mink-wunit-driver as that's the only branch available for that package. Removed the Google API client as it's already in the Gigadb codebase. Because of the PHPUnit downgrade, namespaces cannot be used with PHPUnit Assert (PHPUnit\Framework\Assert had to be replaced by PHPUnit_Framework_Assert). Controlled which Behat hooks get run with which scenario using tags on features and on hooks so that, for example, OAuth revocation hooks are only actioned in the affiliate login scenario. Added a scenario to ensure the test environment is loaded with all the test data in the author name display and name display preview tests. Updated one SQL test data script to drop the not null constraint on the affiliation column of the gigadb_user table. Updated the test runner so it runs all the acceptance tests and unit tests.

…cumentation After all tests have run, the previous state of the database is restored. This is done in the test runner now as it is a concern that overarches all tests. Updating the TESTING docs with info on running unit tests and the above database setup.

Fixed the syntax of pg_dump to ensure the initial state of the database is saved and restored. Increased the sleep time after terminating pg backend processes. Made the test runner logs visible in protected/runtime/. Moved printCurrentUrl inside the try{} block as it blows up when a step fails and no web session has been started yet (with visit).

…e deterministic By default CDbFixtureManager.php loads the fixtures using readdir(), which returns files in the order in which they are stored by the filesystem. It "seems" that this order varies from system to system, resulting in the following error: CDbException: CDbCommand failed to execute the SQL statement: SQLSTATE[23503]: Foreign key violation: 7 ERROR: insert or update on table "dataset_author" violates foreign key constraint "dataset_author_author_id_fkey" DETAIL: Key (author_id)=(1) is not present in table "author". This happens because the join table fixture is loaded before the data table. I've created an init.php file in the fixtures directory to customise the fixture loading behaviour, in this case ensuring the fixtures are loaded in an order that won't violate the foreign key constraint.
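The loading-order problem described above is a dependency-ordering problem: a join table fixture must come after the tables it references. The init.php itself is not shown in this thread, so the following is only a Python sketch of the idea; `load_order` and the `depends_on` mapping are hypothetical names, and the `dataset` dependency is an assumption (the error message only names the `author` foreign key).

```python
def load_order(fixtures, depends_on):
    """Return fixture names so that referenced tables come first.

    fixtures   -- table names found in the fixtures directory, in any order
    depends_on -- maps a table to the tables its foreign keys point at
    """
    available = set(fixtures)
    ordered = []
    done = set()

    def visit(name, path):
        if name in done:
            return
        if name in path:
            raise ValueError("circular dependency at " + name)
        # Load every referenced table before the table that references it.
        for parent in sorted(depends_on.get(name, ())):
            if parent in available:
                visit(parent, path + (name,))
        done.add(name)
        ordered.append(name)

    # Sorting makes the result independent of the directory listing order.
    for name in sorted(available):
        visit(name, ())
    return ordered


if __name__ == "__main__":
    listing = ["dataset_author", "dataset", "author"]  # arbitrary readdir order
    foreign_keys = {"dataset_author": ["author", "dataset"]}
    print(load_order(listing, foreign_keys))
```

Whatever order the filesystem returns, the same input set always produces the same sequence, which is what makes the test run reproducible across machines.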

only1chunts · 2018-02-08T08:35:32Z

Hi @rija , I've had a look at the version @jessesiu got running on his machine, it appears to be mostly fine but when I look at the admin page of the authors (e.g. http://127.0.0.1:9170/adminAuthor/update/id/3789 ) I have no way to update the display name?

rija · 2018-02-08T10:39:22Z

Hi @only1chunts,

The list of examples in the acceptance test scenarios was to show how the display format of an author’s name can be tweaked in a variety of ways while adhering to Gigascience Journal’s reference standards.

As it is implemented, there is no explicit editable display name field for you to type in.
The variety of ways an author’s name could be displayed as can be obtained by using the existing fields (Surname, First name, and Middle name) on the new formatting rules. Additionally, the author’s view screen has a display field to show how it would appear on a dataset page.

What I understand in your last comment, is that, irrespective of those new rules and how an author name’s elements can be edited in existing fields, there is a need for an additional free-form, editable, display name field on that form.
Could you confirm that my understanding there is correct?

only1chunts · 2018-02-08T12:16:24Z

Hi, yes that is correct. Thanks.

rija · 2018-02-08T13:41:14Z

Thank you @only1chunts,

In that case, two things:

1. We will need a new varchar column (custom_name) in the author database table.
2. We need to add the following two acceptance criteria:


Scenario: If custom display name field is empty, save calculated value 
	Given I sign in as an admin
	And I am on "/adminAuthor/update/id/19"
	When I fill in "Author_surname" with "Poe"
	And I fill in "Author_first_name" with "Edgar"
	And I fill in "Author_middle_name" with "Allan"
	And I press "Save"
	Then I should see "Poe EA"


Scenario: when display name edited, save it instead of calculated value
	Given I sign in as an admin
	And I am on "/adminAuthor/update/id/19"
	When I fill in "Author_surname" with "Poe"
	And I fill in "Author_first_name" with "Edgar"
	And I fill in "Author_middle_name" with "Allan"
	And I fill in "Author_custom_name" with "PEA"
	And I press "Save"
	Then I should see "PEA"
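The two scenarios above amount to a simple fallback rule: show the custom name when one has been entered, otherwise show the calculated one. Below is a minimal Python sketch of that rule, not the Author model itself; the class and method names are hypothetical, the calculation is deliberately simplified to the first letter of each space-separated word, and treating an empty or blank custom name the same as a missing one is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:
    surname: str
    first_name: str
    middle_name: str = ""
    custom_name: Optional[str] = None  # stands in for the proposed column

    def calculated_name(self):
        # Simplified stand-in for the full algorithm: first letter of each
        # space-separated word in first_name and middle_name.
        words = (self.first_name + " " + self.middle_name).split()
        letters = "".join(word[0] for word in words)
        return (self.surname + " " + letters).strip()

    def display_name(self):
        # A non-blank custom name wins over the calculated value.
        if self.custom_name and self.custom_name.strip():
            return self.custom_name.strip()
        return self.calculated_name()


if __name__ == "__main__":
    print(Author("Poe", "Edgar", "Allan").display_name())
    print(Author("Poe", "Edgar", "Allan", custom_name="PEA").display_name())
```

Keeping the calculation and the override in separate methods means clearing the custom field is enough to return to the standard format.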

getDisplayName on Author model used the third form for generateDisplayName (as intended) instead of making the concatenation itself

…a custom name Updated the database schema to add a new varchar column custom_name to the author table. Updated the author form and model to edit/save the new column. Updated getDisplayName to display the custom_name field if it is not null (otherwise display the calculated display name).

…splay name

…ure for issue #81 that keeps crashing since adding the feature for #49

This was referenced Nov 7, 2016

handle accented letters in authors names #82

Closed

change the author list display #80

Closed

only1chunts mentioned this issue Sep 15, 2017

Authors with multiple middle initials #8

Closed

only1chunts added the enhancement label Sep 15, 2017

only1chunts added the Rija label Oct 28, 2017

rija mentioned this issue Jan 11, 2018

reproduce accentuated character problem rija/gigadb-website#51

Closed

rija mentioned this issue Jan 11, 2018

Create user stories and scenarios for acceptance tests rija/gigadb-website#53

Closed

rija referenced this issue in rija/gigadb-website Jan 12, 2018

Author-names (#81): setting up test infrastructure

a610bd6

create the initial behat config file in tests.

rija referenced this issue in rija/gigadb-website Jan 12, 2018

Auhtor-names(#81): setting up test scenarios

1e784cc

Wrote test scenarios for issue #81 using examples from the production database. Feature Context file boilerplate created. Behat framework installed.

rija referenced this issue in rija/gigadb-website Jan 14, 2018

Author-names(#81): started implement display format changes

8af9db1

set up the unit tests for the necessary routines. Currently failing as expected as implementation incomplete.

rija referenced this issue in rija/gigadb-website Jan 17, 2018

Author-names(#81): base class for functional test

bac74fb

part of attempt at setting functional testing with phpunit

rija referenced this issue in rija/gigadb-website Jan 17, 2018

Author-names(#81): unit tests

f9a4348

make sure test coverage reports are not committed

rija referenced this issue in rija/gigadb-website Jan 17, 2018

Author-names(#81): Fixed #81

3caf2b6

all tests passing. Just made sure the admin login comes from env variables. Also added test_users* env variables with test login/password to .gitignore.

rija referenced this issue in rija/gigadb-website Jan 19, 2018

acceptance test runs for issues #80, #81, #82

830094a

rija referenced this issue in rija/gigadb-website Feb 2, 2018

Auhtor-names(#81): setting up test scenarios

138d72b

Wrote test scenarios for issue #81 using examples from the production database. Feature Context file boilerplate created. Behat framework installed.

rija referenced this issue in rija/gigadb-website Feb 2, 2018

Author-names(#81): started implement display format changes

b0c25cb

set up the unit tests for the necessary routines. Currently failing as expected as implementation incomplete.

rija referenced this issue in rija/gigadb-website Feb 2, 2018

Author-names(#81): base class for functional test

54164af

part of attempt at setting functional testing with phpunit

rija referenced this issue in rija/gigadb-website Feb 2, 2018

Author-names(#81): Fixed #81

158f21c

all tests passing. Just made sure the admin login comes from env variables. Also added test_users* env variables with test login/password to .gitignore.

rija referenced this issue in rija/gigadb-website Feb 2, 2018

author-names (#80,#81,#82): removed the vendor directory in Behat

da7d627

It's best practice to use "composer install" instead of keeping the Composer vendorised subtree of dependencies.

rija mentioned this issue Feb 2, 2018

Fix tests after rebasing from develop rija/gigadb-website#66

Closed

rija referenced this issue in rija/gigadb-website Feb 8, 2018

edit display name (#81)

e7d4897

getDisplayName on Author model used the third form for generateDisplayName (as intended) instead of making the concatenation itself

rija mentioned this issue Feb 9, 2018

Add a custom_name field to author rija/gigadb-website#74

Closed

rija referenced this issue in rija/gigadb-website Feb 9, 2018

author names (#81): added unit test for custom name field used for di…

6f12b89

…splay name

rija mentioned this issue Feb 12, 2018

Display format of authors name on dataset page (#80, #81, #82) #165

Merged

pli888 pushed a commit that referenced this issue May 14, 2018

Removed a non-consequential checks from a background scenario in feat…

a842137

…ure for issue #81 that keeps crashing since adding the feature for #49

only1chunts closed this as completed Nov 1, 2018
