Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,107 @@ | ||
# sloinnte | ||
Bunachar Sloinnte Gaeilge | Database of Irish-Language Surnames | ||
# Database of Irish-Language Surnames | ||
|
||
This repository contains a snapshot of the Database of Irish-Language Surnames *(Bunachar Sloinnte Gaeilge)* developed by the Gaois research group in *Fiontar & Scoil na Gaeilge* in Dublin City University. The structure and design principles of the database are described in this publication: | ||
|
||
> Brian Ó Raghallaigh, Michal Boleslav Měchura, Aengus Ó Fionnagáin, Sophie Osborne (forthcoming, 2020): | ||
> **‘Handling Irish-language surnames’** | ||
> in *Names: A Journal of Onomastics* | ||
You can browse the database through our web interface at [www.gaois.ie/surnames/](https://www.gaois.ie/surnames/). This repository contains a downloadable version of the database in XML format which you can reuse in your own applications. | ||
|
||
## `minimal.xml` | ||
|
||
This file contains the database in *minimal* format. Example: | ||
|
||
```xml | ||
<cluster> | ||
<title>Ceallach</title> | ||
<surnames-irish-main> | ||
<surname-irish> | ||
<form>Ó Ceallaigh</form> | ||
</surname-irish> | ||
<surname-irish> | ||
<form>Mac Ceallaigh</form> | ||
</surname-irish> | ||
</surnames-irish-main> | ||
<surnames-irish-historical /> | ||
<surnames-irish-alternative /> | ||
<surnames-english-main> | ||
<surname-english> | ||
<form>Kelly</form> | ||
</surname-english> | ||
<surname-english> | ||
<form>O'Kelly</form> | ||
</surname-english> | ||
</surnames-english-main> | ||
<surnames-english-historical /> | ||
<surnames-english-alternative /> | ||
</cluster> | ||
``` | ||
|
||
Each `<cluster>` contains at least one `<surname-irish>` and at least one `<surname-english>`. Each `<surname-irish>` and each `<surname-english>` contains exactly one `<form>`. There are over 600 such clusters in the database (and in the XML file). Each cluster groups surnames in both languages (Irish and English) which are more or less equivalent (see the publication for an explanation of how we decide when surnames are equivalent and when not). | ||
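
As a rough illustration of the structure described above, the following Python sketch builds a lookup from English forms to the Irish forms in the same cluster. It uses only the standard library. The `<clusters>` root element and the function name `english_to_irish` are assumptions made for this example; the real file may use a different root, which is why the code searches with `iter("cluster")` rather than relying on it.

```python
import xml.etree.ElementTree as ET

# One cluster in the minimal format, taken from the example above.
# The <clusters> wrapper is hypothetical: the real root element may differ.
SAMPLE = """<clusters>
<cluster>
  <title>Ceallach</title>
  <surnames-irish-main>
    <surname-irish><form>Ó Ceallaigh</form></surname-irish>
    <surname-irish><form>Mac Ceallaigh</form></surname-irish>
  </surnames-irish-main>
  <surnames-irish-historical />
  <surnames-irish-alternative />
  <surnames-english-main>
    <surname-english><form>Kelly</form></surname-english>
    <surname-english><form>O'Kelly</form></surname-english>
  </surnames-english-main>
  <surnames-english-historical />
  <surnames-english-alternative />
</cluster>
</clusters>"""


def english_to_irish(root):
    """Map each English form to the Irish forms found in the same cluster."""
    index = {}
    for cluster in root.iter("cluster"):
        # iter() walks all descendants, so main, historical and
        # alternative groups are all included.
        irish = [s.findtext("form") for s in cluster.iter("surname-irish")]
        for s in cluster.iter("surname-english"):
            index.setdefault(s.findtext("form"), []).extend(irish)
    return index


index = english_to_irish(ET.fromstring(SAMPLE))
print(index["Kelly"])
```

To run this against the downloaded file, replace `ET.fromstring(SAMPLE)` with `ET.parse("minimal.xml").getroot()`.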
|
||
## `expand.xsl` | ||
|
||
This XSL stylesheet can be used to transform the database from the *minimal* format into an *expanded* format. It contains an algorithm for detecting which **inflection pattern** an Irish-language surname belongs to, and for inflecting surnames according to such patterns. | ||
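
For readers who do not work with XSLT, here is a minimal Python sketch of the same idea: detect the pattern from the start (or end) of the surname, strip the prefix, and apply lenition. It is a hand-written port of the rules in `expand.xsl` from this commit, not part of the project. The function names and the dictionary keys are made up for this example, and only three of the forms for the `Ó` pattern are produced.

```python
VOWELS = "AÁEÉIÍOÓUÚ"
LENITABLE = "BCDFGMPT"
S_FOLLOWERS = "rnlaeiouáéíóú"


def detect_pattern(form):
    """Return the name of the inflection pattern, checked in stylesheet order."""
    if form.startswith("Ó "):
        return "pattern-o"
    if form.startswith("Mac Giolla ") or form.startswith("Mac Con "):
        return "pattern-mac-unmut"
    if form.startswith("Mac "):
        return "pattern-mac"
    if form.endswith("ach") and " " not in form:
        return "pattern-adj-ach"
    return "pattern-zero"


def extract_base(form):
    """Strip the prefix (and a prefixed h after Ó) from the surname."""
    if form.startswith("Ó h"):
        return form[3:]
    if form.startswith("Ó "):
        return form[2:]
    if form.startswith("Mac "):
        return form[4:]
    return form


def lenite(base, skip=""):
    """Insert a marked-up h after a lenitable initial; skip="CG" mimics lenite-except-cg."""
    first, rest = base[:1], base[1:]
    if first and first in LENITABLE and first not in skip:
        if rest.startswith("h"):
            return base  # already lenited
        return f"{first}<mut>h</mut>{rest}"
    # S is lenited only before a vowel or r, n, l. Unlike the stylesheet,
    # this sketch also guards against a one-letter base.
    if first == "S" and rest[:1] and rest[:1] in S_FOLLOWERS:
        return f"S<mut>h</mut>{rest}"
    return base


def inflect_o(form):
    """Produce three of the forms for a surname that starts with Ó."""
    base = extract_base(form)
    # After Ó a vowel-initial base takes a prefixed h instead of lenition.
    nom = f"<mut>h</mut>{base}" if base[:1] and base[:1] in VOWELS else base
    mutated = lenite(base)
    return {
        "male-nom": f"<pre>Ó</pre> {nom}",
        "male-gen": f"<pre>Uí</pre> {mutated}",
        "daughter-nom": f"<pre>Ní</pre> {mutated}",
    }


print(detect_pattern("Mac Ceallaigh"))
print(inflect_o("Ó Ceallaigh")["male-gen"])
```

The order of the checks matters: `Mac Giolla` and `Mac Con` have to be tested before the general `Mac` rule, otherwise they would never match.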
|
||
## `expanded.xml` | ||
|
||
This file contains the database in *expanded* format, after it has been transformed by the stylesheet. Example: | ||
|
||
```xml | ||
<cluster> | ||
<title>Ceallach</title> | ||
<surnames-irish-main> | ||
<surname-irish> | ||
<form gender="male" case="nom"><pre>Ó</pre> Ceallaigh</form> | ||
<form gender="male" case="gen"><pre>Uí</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="male" case="voc"><pre>Uí</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="wife" case="nom"><pre>Uí</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="wife" case="gen"><pre>Uí</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="wife" case="voc"><pre>Uí</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="nom"><pre>Ní</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="gen"><pre>Ní</pre> C<mut>h</mut>eallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="voc"><pre>Ní</pre> C<mut>h</mut>eallaigh</form> | ||
</surname-irish> | ||
<surname-irish> | ||
<form gender="male" case="nom"><pre>Mac</pre> Ceallaigh</form> | ||
<form gender="male" case="gen"><pre>Mhic</pre> Ceallaigh</form> | ||
<form gender="male" case="voc"><pre>Mhic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="wife" case="nom"><pre>Mhic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="wife" case="gen"><pre>Mhic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="wife" case="voc"><pre>Mhic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="nom"><pre>Nic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="gen"><pre>Nic</pre> Ceallaigh</form> | ||
<form gender="female" familyStatus="daughter" case="voc"><pre>Nic</pre> Ceallaigh</form> | ||
</surname-irish> | ||
</surnames-irish-main> | ||
<surnames-irish-historical /> | ||
<surnames-irish-alternative /> | ||
<surnames-english-main> | ||
<surname-english> | ||
<form>Kelly</form> | ||
</surname-english> | ||
<surname-english> | ||
<form>O'Kelly</form> | ||
</surname-english> | ||
</surnames-english-main> | ||
<surnames-english-historical /> | ||
<surnames-english-alternative /> | ||
</cluster> | ||
``` | ||
|
||
The *expanded* format has the same structure as the *minimal* format, except that each `<surname-irish>` contains a larger number of `<form>` elements. | ||
|
||
- If the surname takes on different shapes for men and women (which most – but not all! – Irish-language surnames do), then those forms are distinguished with the `gender` attribute (value `male` or `female`). | ||
|
||
- If the woman's surname takes on different shapes depending on whether the bearer of the name is conceptualized as a "wife" or as a "daughter" (which most – but not all! – Irish-language female surnames do), then those forms are distinguished with the `familyStatus` attribute (value `wife` or `daughter`). | ||
|
||
- Each Irish-language `<form>` also has a `case` attribute to tell you which grammatical case this form is in: nominative (`nom`), genitive (`gen`) or vocative (`voc`). | ||
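
The three attributes above are enough to pick the right form for a given person. The following Python sketch shows one way to do that with the standard library. The helper name `pick_form` and the shortened sample are assumptions for this example; forms that carry no `familyStatus` attribute are treated as valid for any family status.

```python
import xml.etree.ElementTree as ET

# A shortened <surname-irish> element in the expanded format.
SAMPLE = """<surname-irish>
<form gender="male" case="nom"><pre>Ó</pre> Ceallaigh</form>
<form gender="male" case="gen"><pre>Uí</pre> C<mut>h</mut>eallaigh</form>
<form gender="female" familyStatus="wife" case="nom"><pre>Uí</pre> C<mut>h</mut>eallaigh</form>
<form gender="female" familyStatus="daughter" case="nom"><pre>Ní</pre> C<mut>h</mut>eallaigh</form>
</surname-irish>"""


def pick_form(surname, gender, case, family_status=None):
    """Return the plain text of the first matching form, or None."""
    for form in surname.findall("form"):
        if form.get("gender") != gender or form.get("case") != case:
            continue
        status = form.get("familyStatus")
        # A form without familyStatus matches any status; a caller who
        # passes no status gets the first form for that gender and case.
        if status is None or family_status is None or status == family_status:
            # itertext() joins the text inside and around <pre> and <mut>.
            return "".join(form.itertext())
    return None


surname = ET.fromstring(SAMPLE)
print(pick_form(surname, "female", "nom", "daughter"))
```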
|
||
The text content of each Irish-language `<form>` is marked up with the following inline elements: | ||
|
||
- `<pre>` marks up prefixed elements such as `Ó` and `Mac` as well as their inflected forms. | ||
- `<mut>` marks up initial mutations caused by the prefixed elements. | ||
|
||
If you remove the `<pre>` and `<mut>` elements (including their content) from the `<form>` and trim any leading whitespace from the result, you will end up with a gender-neutral, family status-neutral sortkey for the surname. | ||
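
The sort key procedure described above could be sketched in Python as follows. The function name `sortkey` is an assumption for this example. The code drops `<pre>` and `<mut>` together with their content, keeps the text that follows them, and trims leading whitespace.

```python
import xml.etree.ElementTree as ET


def sortkey(form):
    """Return the form's text without <pre>/<mut> content, left-trimmed."""
    parts = [form.text or ""]
    for child in form:
        if child.tag not in ("pre", "mut"):
            # Keep the content of any other inline element.
            parts.append("".join(child.itertext()))
        # The tail is the text that follows the child inside <form>.
        parts.append(child.tail or "")
    return "".join(parts).lstrip()


forms = [
    ET.fromstring('<form gender="male" case="nom"><pre>Ó</pre> Ceallaigh</form>'),
    ET.fromstring('<form gender="male" case="gen"><pre>Uí</pre> C<mut>h</mut>eallaigh</form>'),
    ET.fromstring('<form gender="female" familyStatus="daughter" case="nom"><pre>Ní</pre> C<mut>h</mut>eallaigh</form>'),
]
keys = [sortkey(f) for f in forms]
print(keys)
```

All three forms reduce to the same key, so the male and female variants of a surname sort together.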
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,209 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> | ||
<xsl:template match="cluster"> | ||
<cluster> | ||
<xsl:apply-templates/> | ||
</cluster> | ||
</xsl:template> | ||
|
||
<xsl:template match="title"> | ||
<title> | ||
<xsl:apply-templates/> | ||
</title> | ||
</xsl:template> | ||
|
||
<xsl:template match="surnames-irish-main"> | ||
<surnames-irish-main> | ||
<xsl:apply-templates/> | ||
</surnames-irish-main> | ||
</xsl:template> | ||
<xsl:template match="surnames-irish-historical"> | ||
<surnames-irish-historical> | ||
<xsl:apply-templates/> | ||
</surnames-irish-historical> | ||
</xsl:template> | ||
<xsl:template match="surnames-irish-alternative"> | ||
<surnames-irish-alternative> | ||
<xsl:apply-templates/> | ||
</surnames-irish-alternative> | ||
</xsl:template> | ||
|
||
<xsl:template match="surnames-english-main"> | ||
<surnames-english-main> | ||
<xsl:apply-templates/> | ||
</surnames-english-main> | ||
</xsl:template> | ||
<xsl:template match="surnames-english-historical"> | ||
<surnames-english-historical> | ||
<xsl:apply-templates/> | ||
</surnames-english-historical> | ||
</xsl:template> | ||
<xsl:template match="surnames-english-alternative"> | ||
<surnames-english-alternative> | ||
<xsl:apply-templates/> | ||
</surnames-english-alternative> | ||
</xsl:template> | ||
|
||
<xsl:template match="surname-english"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<surname-english> | ||
<form xml:space="preserve"><xsl:value-of select="$form"/></form> | ||
</surname-english> | ||
</xsl:template> | ||
|
||
<xsl:template match="surname-irish"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<xsl:choose> | ||
<xsl:when test="starts-with($form, 'Ó ')"><xsl:call-template name="pattern-o"/></xsl:when> | ||
<xsl:when test="starts-with($form, 'Mac Giolla ')"><xsl:call-template name="pattern-mac-unmut"/></xsl:when> | ||
<xsl:when test="starts-with($form, 'Mac Con ')"><xsl:call-template name="pattern-mac-unmut"/></xsl:when> | ||
<xsl:when test="starts-with($form, 'Mac ')"><xsl:call-template name="pattern-mac"/></xsl:when> | ||
<xsl:when test="substring($form, string-length($form)-2)='ach' and not(contains($form, ' '))"><xsl:call-template name="pattern-adj-ach"/></xsl:when> | ||
<xsl:otherwise><xsl:call-template name="pattern-zero"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="pattern-zero"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<surname-irish> | ||
<form xml:space="preserve" gender="male" case="nom"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="male" case="gen"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="male" case="voc"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="female" case="nom"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="female" case="gen"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="female" case="voc"><xsl:copy-of select="$form"/></form> | ||
</surname-irish> | ||
</xsl:template> | ||
|
||
<xsl:template name="pattern-o"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<xsl:variable name="base"><xsl:call-template name="extract-base"><xsl:with-param name="what" select="$form"/></xsl:call-template></xsl:variable> | ||
<xsl:variable name="prefh"><xsl:call-template name="prefh"><xsl:with-param name="what" select="$base"/></xsl:call-template></xsl:variable> | ||
<xsl:variable name="lenited"><xsl:call-template name="lenite"><xsl:with-param name="what" select="$base"/></xsl:call-template></xsl:variable> | ||
<surname-irish> | ||
<form xml:space="preserve" gender="male" case="nom"><pre>Ó</pre> <xsl:copy-of select="$prefh"/></form> | ||
<form xml:space="preserve" gender="male" case="gen"><pre>Uí</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="male" case="voc"><pre>Uí</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="nom"><pre>Uí</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="gen"><pre>Uí</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="voc"><pre>Uí</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="nom"><pre>Ní</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="gen"><pre>Ní</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="voc"><pre>Ní</pre> <xsl:copy-of select="$lenited"/></form> | ||
</surname-irish> | ||
</xsl:template> | ||
|
||
<xsl:template name="pattern-mac-unmut"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<xsl:variable name="base"><xsl:call-template name="extract-base"><xsl:with-param name="what" select="$form"/></xsl:call-template></xsl:variable> | ||
<surname-irish> | ||
<form xml:space="preserve" gender="male" case="nom"><pre>Mac</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="male" case="gen"><pre>Mhic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="male" case="voc"><pre>Mhic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="nom"><pre>Mhic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="gen"><pre>Mhic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="voc"><pre>Mhic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="nom"><pre>Nic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="gen"><pre>Nic</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="voc"><pre>Nic</pre> <xsl:copy-of select="$base"/></form> | ||
</surname-irish> | ||
</xsl:template> | ||
|
||
<xsl:template name="pattern-mac"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<xsl:variable name="base"><xsl:call-template name="extract-base"><xsl:with-param name="what" select="$form"/></xsl:call-template></xsl:variable> | ||
<xsl:variable name="lenited"><xsl:call-template name="lenite-except-cg"><xsl:with-param name="what" select="$base"/></xsl:call-template></xsl:variable> | ||
<surname-irish> | ||
<form xml:space="preserve" gender="male" case="nom"><pre>Mac</pre> <xsl:copy-of select="$base"/></form> | ||
<form xml:space="preserve" gender="male" case="gen"><pre>Mhic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="male" case="voc"><pre>Mhic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="nom"><pre>Mhic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="gen"><pre>Mhic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="wife" case="voc"><pre>Mhic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="nom"><pre>Nic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="gen"><pre>Nic</pre> <xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" familyStatus="daughter" case="voc"><pre>Nic</pre> <xsl:copy-of select="$lenited"/></form> | ||
</surname-irish> | ||
</xsl:template> | ||
|
||
<xsl:template name="pattern-adj-ach"> | ||
<xsl:variable name="form" select="form/text()"/> | ||
<xsl:variable name="lenited"><xsl:call-template name="lenite"><xsl:with-param name="what" select="$form"/></xsl:call-template></xsl:variable> | ||
<xsl:variable name="gm"><xsl:call-template name="genitivise-ending-masc"><xsl:with-param name="what" select="$lenited"/></xsl:call-template></xsl:variable> | ||
<xsl:variable name="gf"><xsl:call-template name="genitivise-ending-fem"><xsl:with-param name="what" select="$lenited"/></xsl:call-template></xsl:variable> | ||
<surname-irish> | ||
<form xml:space="preserve" gender="male" case="nom"><xsl:copy-of select="$form"/></form> | ||
<form xml:space="preserve" gender="male" case="gen"><xsl:copy-of select="$gm"/></form> | ||
<form xml:space="preserve" gender="male" case="voc"><xsl:copy-of select="$gm"/></form> | ||
<form xml:space="preserve" gender="female" case="nom"><xsl:copy-of select="$lenited"/></form> | ||
<form xml:space="preserve" gender="female" case="gen"><xsl:copy-of select="$gf"/></form> | ||
<form xml:space="preserve" gender="female" case="voc"><xsl:copy-of select="$gf"/></form> | ||
</surname-irish> | ||
</xsl:template> | ||
|
||
|
||
<xsl:template name="extract-base"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="starts-with($what, 'Ó h')"><xsl:value-of select="substring($what, 4)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'Ó ')"><xsl:value-of select="substring($what, 3)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'Mac ')"><xsl:value-of select="substring($what, 5)"/></xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="prefh"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="contains('AÁEÉIÍOÓUÚ', substring($what, 1, 1))"><mut>h</mut><xsl:value-of select="$what"/></xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="lenite"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="starts-with($what, 'B') and not(starts-with($what, 'Bh'))">B<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'C') and not(starts-with($what, 'Ch'))">C<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'D') and not(starts-with($what, 'Dh'))">D<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'F') and not(starts-with($what, 'Fh'))">F<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'G') and not(starts-with($what, 'Gh'))">G<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'M') and not(starts-with($what, 'Mh'))">M<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'P') and not(starts-with($what, 'Ph'))">P<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'S') and contains('rnlaeiouáéíóú', substring($what, 2, 1))">S<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'T') and not(starts-with($what, 'Th'))">T<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="lenite-except-cg"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="starts-with($what, 'B') and not(starts-with($what, 'Bh'))">B<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'D') and not(starts-with($what, 'Dh'))">D<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'F') and not(starts-with($what, 'Fh'))">F<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'M') and not(starts-with($what, 'Mh'))">M<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'P') and not(starts-with($what, 'Ph'))">P<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'S') and contains('rnlaeiouáéíóú', substring($what, 2, 1))">S<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:when test="starts-with($what, 'T') and not(starts-with($what, 'Th'))">T<mut>h</mut><xsl:value-of select="substring($what, 2)"/></xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="genitivise-ending-masc"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="substring($what, string-length($what)-2)='ach'"><xsl:value-of select="substring($what, 1, string-length($what)-3)"/>aigh</xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
<xsl:template name="genitivise-ending-fem"> | ||
<xsl:param name="what"/> | ||
<xsl:choose> | ||
<xsl:when test="substring($what, string-length($what)-2)='ach'"><xsl:value-of select="substring($what, 1, string-length($what)-3)"/>aí</xsl:when> | ||
<xsl:otherwise><xsl:value-of select="$what"/></xsl:otherwise> | ||
</xsl:choose> | ||
</xsl:template> | ||
|
||
</xsl:stylesheet> |