handle multiple middle names of authors #81
Comments
This is the same issue as ticket #8, but this solution is better, so I have closed #8.
Thank you Rija, this is perfect!
Cheers
Chris
…----- Original Message -----
From: "Rija Ménagé" <notifications@github.com>
To: "gigascience/gigadb-website" <gigadb-website@noreply.github.com>
Cc: "Chris Hunter" <only1chunts@gmail.com>, "Author" <author@noreply.github.com>
Sent: Thursday, 11 January, 2018 10:33:17 AM
Subject: Re: [gigascience/gigadb-website] handle multiple middle names of authors (#81)
Hi,
As the goal of this change request is to accommodate the diversity of names,
a standardised format (GigaScience journal), and the ability to tweak the displayed information,
I suggest implementing the following display algorithm:
1. Extract the surname.
2. Extract the first letter of every word in the first_name and middle_name fields.
3. If a word in the first_name or middle_name field consists only of capital letters, return each of those capital letters before moving on to the next word.
4. A period, space or comma is treated as a word separator, and the algorithm is applied to the words before and after it.
5. Merge all initials into one string.
6. Return the resulting display name in the order: surname, initials string, separated by a space.
Below, I take a sample of author records from the database and apply the algorithm to each.
Given the author has surname surname
And the author has first name first_name
And the author has middle name middle_name
When gigadb web site shows a paper by the author
Then the author's name should be displayed as display_name
Examples:
surname first_name middle_name display_name
Teo Audrey SM Teo ASM
Gilbert M.Thomas P Gilbert MTP
Muñoz Ángel GG Muñoz ÁGG
Martinez-Cruzado Juan Carlos Martinez-Cruzado JC
Shen Yong-Yi Shen Y
Tong Steve KwanHok Tong SK
Tong Steve Kwan.Hok Tong SKH
Tong Steve Kwan,Hok Tong SKH
Tong Steve Kwan Hok Tong SKH
Tong Kwan Hok Steve Tong KHS
In the "Tong SK" example, the middle name has been entered as "KwanHok" in the database, but if the author changes the middle name
to "Kwan Hok" or "Kwan.Hok" or "Kwan,Hok"
then the resulting display name would be: Tong SKH
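The six steps and the examples table above can be sketched as follows. This is only an illustration in Python, not the project's PHP implementation: the function names `initials` and `display_name` are made up for the example, and the Unicode normalisation step is an added assumption so that an accented first letter such as "Á" is treated as a single character.

```python
import re
import unicodedata

# Step 4: a period, comma or whitespace separates words.
_SEPARATORS = re.compile(r"[.,\s]+")


def initials(field: str) -> str:
    """Return the initials contributed by one name field (steps 2 to 4)."""
    # Assumption: compose accented letters so "Á" is one character.
    field = unicodedata.normalize("NFC", field or "")
    letters = []
    for word in _SEPARATORS.split(field):
        if not word:
            continue
        if word.isalpha() and word.isupper():
            # Step 3: a word made only of capitals is a run of initials.
            letters.append(word)
        else:
            # Step 2: otherwise keep only the first letter of the word.
            letters.append(word[0])
    return "".join(letters)


def display_name(surname: str, first_name: str = "", middle_name: str = "") -> str:
    """Steps 1, 5 and 6: surname, a space, then the merged initials."""
    merged = initials(first_name) + initials(middle_name)
    return f"{surname} {merged}".strip()


print(display_name("Tong", "Steve", "KwanHok"))   # middle name as one word
print(display_name("Tong", "Steve", "Kwan.Hok"))  # period acts as separator
```

Because a hyphen is not a separator, "Yong-Yi" contributes a single initial, which matches the "Shen Y" row.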
It will help to have a preview area in the author edit form, so that the form user can see in the preview how the name would be displayed on the web site.
Thus, with this predictable algorithm and a preview area, authors (or admins) have the flexibility to experiment in the form's existing name fields to influence how the name is displayed.
Let me know if the above examples and suggestion meet your goal for this ticket.
create the initial behat config file in tests.
Wrote test scenarios for issue #81 using examples from the production database. Feature Context file boilerplate created. Behat framework installed.
…science story. Implemented the step definitions for the scenario to implement the story for issue #81 from the GigaScience perspective. Made use of Behat tables and real author names from the database. That means that before each run of the suite, the sample of the production database needs to be loaded. The database dump is called "author-names-80-81-82.pgdmp" and it differs from the 2016 dump only by having an Attribute record for "urlredirect".
It was tricky to get PHPUnit working with Yii 1.1 and gigadb-website running in Vagrant. Among the several problems: recent versions of PHPUnit and its extensions cause issues (4.8 fails, but 4.1 is OK). Also, the PHPUnit Selenium extension is required even though we don't use it, because the Yii framework references it! Also, avoid using 'dev' as minimum-stability in composer.json; prefer 'stable' and override with @dev on individual packages if necessary.
Set up the unit tests for the necessary routines. Currently failing, as expected, because the implementation is incomplete.
The display format for authors on the dataset page is now the same as the references in the GigaScience journal, and all tests (unit and acceptance) are passing. However, I came across two names which were entered in a way that may make me revise one of my initial assumptions.
The display format for authors on the dataset page now is the same as the references on Gigascience Journal and all tests (unit and acceptance) are passing.
Installed php-pecl-xdebug in the Vagrant Chef recipe, as it is needed by PHPUnit to calculate test coverage. Set up the creation of a test database in the gigadb Chef recipe, so that unit tests can run without disrupting manual and automated user testing by wiping out the development database. Created a db_test.json config for the test database. Modified the Yii test config to use the test database. Ran the test coverage for the gigadb-website codebase and generated a comprehensive HTML report.
I didn't realise the test.php config for Yii was also pulling in main.php and therefore overriding the $dbConfig variable, causing the unit tests to use the main database. Corrected it by loading the test database config into a distinct variable.
Figured out how to organise Behat FeatureContext's step definitions in multiple files. The answer is to use subContexts. Created a feature and a scenario for the preview functionality.
Refactored the Author's displayName function to be more flexible. Added a field in the author view to show the display name. Updated the login form to be actionable by automated tests (the submit button needs to be within the <form></form> element). Moved the snapshot-after-failure hook to GigadbWebsite so it is automatically available to all subcontexts. Got all acceptance test scenarios passing.
part of attempt at setting functional testing with phpunit
make sure test coverage reports are not committed
Hi @only1chunts,
I've made a first implementation of this functionality on my branch. You can see the stories and acceptance criteria on the page below (click the [+] all button to expand and see them):
https://rija.github.io/gigadb-website/scenarios_report/author-names-80-81-82.html
It implements the algorithm I've described (see the examples on the page), but it scales down the idea of a preview area in the form a little: a faithful preview area in the form would require a round trip to the server for an Ajax call, which is not resource efficient.
Other observations I made during my implementation, shown in the examples on the above acceptance criteria page:
(1) There are some abbreviated names that are meant to stay lowercase. See the example of "Hekkert BtL": the "t" should stay lowercase, as shown in the PubMed article linked from the dataset: https://www.ncbi.nlm.nih.gov/pubmed/21743474
(2) Accented characters (issue #82) are now represented correctly, as seen in the examples "Muñoz ÁGG" and "Schiøtt M".
(3) Some trailing special characters are filtered out automatically, as seen in "Schiøtt, ".
(4) Some names have a "Jr" attached. This should not be abbreviated, hence the resulting display name "Loughran TPJr"; the corresponding GigaScience manuscript preserves the "Jr" in the list of references: https://academic.oup.com/gigascience/article/2/1/1/2656134
Let me know what you think.
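Observations (1), (3) and (4) can be illustrated with a small sketch. It is written in Python purely for illustration and is not the branch's PHP code. The helper names, the `KEPT_SUFFIXES` set and the input field values are all hypothetical, since the thread only shows the resulting display names and not what is stored in the database.

```python
import re

# Assumption: "Jr" is the only suffix mentioned in the thread.
KEPT_SUFFIXES = {"Jr"}


def clean_surname(surname: str) -> str:
    """Drop trailing punctuation and whitespace, e.g. "Schiøtt, "."""
    return re.sub(r"[\s,.;:]+$", "", surname)


def initials_keeping_suffix(field: str) -> str:
    """Build initials without changing case and without abbreviating suffixes."""
    parts = []
    for word in re.split(r"[.,\s]+", field):
        if not word:
            continue
        if word in KEPT_SUFFIXES:
            # Observation (4): keep the suffix whole.
            parts.append(word)
        elif word.isalpha() and word.isupper():
            # A run of capitals is already a run of initials.
            parts.append(word)
        else:
            # Observation (1): keep the first letter as entered, no upper-casing.
            parts.append(word[0])
    return "".join(parts)


# Hypothetical field values chosen to reproduce the display names in the thread.
print(clean_surname("Schiøtt, ") + " " + initials_keeping_suffix("M"))
print("Loughran " + initials_keeping_suffix("Thomas P Jr"))
print("Hekkert " + initials_keeping_suffix("B t L"))
```

Leaving the case of the first letter untouched is what lets a deliberately lowercase initial survive, at the cost of also preserving accidental lowercase entries.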
I think that all sounds great. I need to see it in action to understand whether it does what I need, so I will ask @jessesiu to get it up and running on his machine for me to see.
It's best practice to use "composer install" instead of keeping the vendorised Composer subtree of dependencies.
Downgraded PHPUnit to 4.1.1 so it works with the Yii framework (even 4.3.* doesn't work with Yii 1.1).
Switched to minimum-stability: stable, so I can remove the @stable suffixes; added an @dev suffix to mink-wunit-driver, as that's the only branch available for that package.
Removed the Google API client, as it's already in the Gigadb codebase.
Because of the PHPUnit downgrade, namespaces cannot be used with PHPUnit's Assert (PHPUnit\Framework\Assert had to be replaced by PHPUnit_Framework_Assert).
Controlled which Behat hooks run with which scenario using tags on features and on hooks, so that, for example, OAuth revocation hooks are only actioned in the affiliate login scenario.
Added a scenario to ensure the test environment is loaded with all the test data in the author name display and name display preview tests.
Updated one SQL test data script to drop the NOT NULL constraint on the affiliation column of the gigadb_user table.
Updated the test runner so it runs all the acceptance tests and unit tests.
…cumentation. After all tests have run, the previous state of the database is restored. This is now done in the test runner, as it is a concern that spans all tests. Updated the TESTING docs with info on running unit tests and the above database setup.
Fixed the syntax of pg_dump to ensure the initial state of the database is saved and restored. Increased the sleep time after terminating pg backend processes. Made the test runner logs visible in protected/runtime/. Moved printCurrentUrl inside the try{} block, as it blows up when a step fails and no web session has been started yet (with visit).
…e deterministic. By default, CDbFixtureManager.php loads the fixtures using readdir(), which returns files in the order in which they are stored by the filesystem. That order seems to vary from system to system, resulting in the following error:
CDbException: CDbCommand failed to execute the SQL statement: SQLSTATE[23503]: Foreign key violation: 7 ERROR: insert or update on table "dataset_author" violates foreign key constraint "dataset_author_author_id_fkey" DETAIL: Key (author_id)=(1) is not present in table "author".
This happens because the join table fixture is loaded before the data table. I've created an init.php file in the fixtures directory to customise the fixture loading behaviour, in this case ensuring the fixtures are loaded in an order that won't violate the foreign key constraint.
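One way to derive a load order that respects foreign keys is a topological sort over table dependencies. The sketch below is a Python illustration of that idea and not the contents of the init.php mentioned in the commit, which may simply list the fixtures explicitly. The `dataset` table and the dependency map are assumptions made for the example; only `author` and `dataset_author` appear in the error message.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def fixture_load_order(depends_on):
    """Return table names so that every table comes after the tables it references."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical dependency map: each table maps to the tables its foreign keys point to.
tables = {
    "author": set(),
    "dataset": set(),
    "dataset_author": {"author", "dataset"},
}

order = fixture_load_order(tables)
print(order)  # the join table is always listed after the tables it references
```

Unlike relying on directory listing order, the result here depends only on the declared dependencies, so it is the same on every filesystem.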
Hi @rija, I've had a look at the version @jessesiu got running on his machine. It appears to be mostly fine, but when I look at the admin page of the authors (e.g. http://127.0.0.1:9170/adminAuthor/update/id/3789) I have no way to update the display name?
Hi, yes that is correct. Thanks.
On Feb 8, 2018, at 6:39 PM, "Rija Ménagé" wrote:
Hi @only1chunts,
The list of examples in the acceptance test scenarios was to show how
the display format of an author’s name can be tweaked in a variety of
ways while adhering to Gigascience Journal’s reference standards.
As it is implemented, there is no explicit editable display name field
for you to type in.
The various ways an author’s name can be displayed are obtained by
applying the new formatting rules to the existing fields (Surname,
First name, and Middle name). Additionally, the author’s view
screen has a display field to show how the name would appear on a dataset
page.
What I understand in your last comment, is that, irrespective of those
new rules and how an author name’s elements can be edited in existing
fields, there is a need for an additional free-form, editable, display
name field on that form.
Could you confirm that my understanding there is correct?
Thank you @only1chunts. In that case, two things:
getDisplayName on Author model used the third form for generateDisplayName (as intended) instead of making the concatenation itself
…a custom name. Updated the database schema to add a new varchar column custom_name to the author table. Updated the author form and model to edit/save the new column. Updated getDisplayName to display the custom_name field if it is not null (otherwise it displays the calculated display name).
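The fallback described in this commit can be sketched as below. This is a Python illustration only, not the Yii model code: the `Author` dataclass and its method names are hypothetical, and `calculated_name` is a deliberately simplified stand-in for the rule-based formatter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:
    surname: str
    first_name: str = ""
    middle_name: str = ""
    custom_name: Optional[str] = None  # mirrors the new nullable custom_name column

    def calculated_name(self) -> str:
        # Simplified stand-in: surname plus the first letter of each word.
        words = f"{self.first_name} {self.middle_name}".split()
        return f"{self.surname} {''.join(w[0] for w in words)}".strip()

    def display_name(self) -> str:
        # Prefer the admin-entered value; otherwise fall back to the calculated one.
        if self.custom_name is not None:
            return self.custom_name
        return self.calculated_name()


auto = Author("Tong", "Steve", "KwanHok")
overridden = Author("Tong", "Steve", "KwanHok", custom_name="Tong SKH")
print(auto.display_name(), "|", overridden.display_name())
```

Keeping the calculated name as the default means existing records need no data migration; only authors whose names need an override get a value in the new column.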
This should be considered in conjunction with #80 & #82
Many authors have more than one middle name, e.g. Thomas Gilbert = M Thomas P Gilbert, or Irene Wing Shan Chik, Steve Kwan Hok Tong, Kwok Wing Stephen Tsui, etc. We need a method to make the display of names slightly more configurable. Perhaps the easiest method is to have a new column in the database for "display name", which auto-populates with the usual first initial of each name but can be overwritten if required (in the admin pages), so Steve Kwan Hok Tong auto-populates to "Tong SK" but can be updated to "Tong SKH" if required. The webpage then uses the display name to generate the web view.
The citations are listed in GigaScience papers as surname followed by initials, e.g. Hebsgaard MB, Gilbert MTP, Arneborg J