added some medical suffixes #88
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,12 +7,10 @@ | |
'brother', | ||
'dame', | ||
'father', | ||
'king', | ||
Both king and queen are included in the set of titles that indicate first names when placed before a single name, e.g. King David and Queen Mary, so this pull request will break some tests. In 2005 there were 148 people born in the US named King, so maybe it is a more useful case to handle than the title. I know people have used this parser on datasets that include kings and queens before, but I guess we can let them customize the titles constant to pick them up. We should update the test cases that include "king" to use one of the other titles in that set. |
||
'maid', | ||
'master', | ||
'mother', | ||
'pope', | ||
'queen', | ||
'sir', | ||
'sister', | ||
'uncle', | ||
|
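To make the king/queen trade-off concrete, here is a minimal sketch of how a "first name title" set could change the parse of a two-word name. It is not the library's code: `FIRST_NAME_TITLES` and `parse_two_word_name` are hypothetical stand-ins, and the rule is simplified to two-word input only.

```python
# Hypothetical stand-in for the title set shown in the diff; not the real constant.
FIRST_NAME_TITLES = {
    "brother", "dame", "father", "king", "maid", "master",
    "mother", "pope", "queen", "sir", "sister", "uncle",
}


def parse_two_word_name(full_name, first_name_titles=FIRST_NAME_TITLES):
    """Split a two-word name into title, first, and last parts.

    If the first word is a first name title, the second word is read as a
    first name (King David). Otherwise the words are read as first name
    followed by last name.
    """
    words = full_name.split()
    if len(words) != 2:
        raise ValueError("this sketch only handles two-word names")
    lead, rest = words
    if lead.lower() in first_name_titles:
        return {"title": lead, "first": rest, "last": ""}
    return {"title": "", "first": lead, "last": rest}


# With "king" in the set, King is treated as a title.
with_king = parse_two_word_name("King David")

# With "king" and "queen" removed, King is treated as a first name.
trimmed = FIRST_NAME_TITLES - {"king", "queen"}
without_king = parse_two_word_name("King David", trimmed)

# A user with royal data could add the titles back to their own copy of the set.
customized = parse_two_word_name("Queen Mary", trimmed | {"queen"})
```

Passing the set as a parameter mirrors the suggestion that users who need kings and queens can customize the titles themselves.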
Van is sometimes a first name, so including it in prefixes would break parsing for all the Vans of the world. Skimming the US birth names database, there do appear to be people named Van, e.g. 183 people born in 1983.
A similar comment applies to Mac. I went to school with a guy named Mac.
Mc is fine because there's no vowel, so it can't be a first name. Although I guess it could be a title abbreviation, Master of Ceremonies, and I'm not sure how that would play out.
El is an article in Spanish, so I'd kinda like to know how it is used in a name. Is it used as the Spanish article in a title like el senator, or as a prefix like del?
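A rough sketch of the concern with Van and Mac, assuming a naive parser that folds a leading prefix into the last name. The `split_name` function and the prefix sets are hypothetical and only illustrate the failure mode; the library's actual prefix handling may differ.

```python
def split_name(full_name, prefixes):
    """Naive sketch: a leading prefix word pulls the whole name into `last`.

    Otherwise the first word is the first name and the rest is the last name.
    """
    words = full_name.split()
    if words[0].lower() in prefixes:
        return {"first": "", "last": " ".join(words)}
    return {"first": words[0], "last": " ".join(words[1:])}


# Hypothetical prefix sets, with and without the ambiguous entries.
with_van = {"van", "mac", "mc", "del"}
without_van = {"mc", "del"}

# A person whose first name is Van loses it when "van" is a prefix.
broken = split_name("Van Johnson", with_van)
fixed = split_name("Van Johnson", without_van)

# A prefix in the middle of the name is unaffected by this rule.
middle = split_name("Vincent van Gogh", with_van)
```

Under this assumption, only prefixes that are never used as first names, such as Mc, are safe to add.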