Duplicate person when using write_eml() #339

Open
peterdesmet opened this issue Apr 29, 2022 · 2 comments

@peterdesmet
Member

The Creating EML vignette suggests using as_emld(R_person) to efficiently code a person as an EML party.

library(EML)
me <- person("Peter", "Desmet", , "fakeaddress@email.com", "mdc", comment = c(ORCID = "0000-0002-8442-8025"))
my_eml <- list(dataset = list(
  title = "A Minimal Valid EML Dataset",
  creator = as_emld(me),
  contact = as_emld(me)
))
my_eml
#> $dataset
#> $dataset$title
#> [1] "A Minimal Valid EML Dataset"
#> 
#> $dataset$creator
#> individualName:
#>   givenName: Peter
#>   surName: Desmet
#> electronicMailAddress: fakeaddress@email.com
#> '@id': https://orcid.org/0000-0002-8442-8025
#> 
#> $dataset$contact
#> individualName:
#>   givenName: Peter
#>   surName: Desmet
#> electronicMailAddress: fakeaddress@email.com
#> '@id': https://orcid.org/0000-0002-8442-8025
write_eml(my_eml, "ex.xml")

Created on 2022-04-29 by the reprex package (v2.0.1)

The generated list does indeed contain that info nicely. However, the EML file written by write_eml() contains individualName twice in each party:

<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.2" packageId="4ef7c004-cb89-4888-b095-240ecbf18c28" system="uuid" xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd">
  <dataset>
    <title>A Minimal Valid EML Dataset</title>
    <creator id="https://orcid.org/0000-0002-8442-8025">
      <individualName>
        <givenName>Peter</givenName>
        <surName>Desmet</surName>
      </individualName>
      <individualName> <!-- DUPLICATE -->
        <givenName>Peter</givenName>
        <surName>Desmet</surName>
      </individualName>
      <electronicMailAddress>fakeaddress@email.com</electronicMailAddress>
    </creator>
    <contact id="https://orcid.org/0000-0002-8442-8025">
      <individualName>
        <givenName>Peter</givenName>
        <surName>Desmet</surName>
      </individualName>
      <individualName> <!-- DUPLICATE -->
        <givenName>Peter</givenName>
        <surName>Desmet</surName>
      </individualName>
      <electronicMailAddress>fakeaddress@email.com</electronicMailAddress>
    </contact>
  </dataset>
</eml:eml>

Any idea why this is happening? Note, it is not happening when:

  • Only the creator or contact is set (rather than both)
  • The person only contains first and last name (not email, role, comment)

@cboettig
Member

Thanks for reporting. @amoeba may be able to shed more light here, but my shot in the dark is that it's related to the fact that we parse the ORCID identifier as the id of the block. I think when you re-use an element in EML you really want to use a reference rather than repeat the element as we do in the example, but that's only really an issue when the element has an id.

E.g., can you try the above but without a comment element on the person that is used in the contact field? (I could be entirely wrong here too.)

@peterdesmet
Member Author

Hmm, yes, if I try:

me <- person("Peter", "Desmet", , "fakeaddress@email.com", "mdc")
my_eml <- list(dataset = list(
  title = "A Minimal Valid EML Dataset",
  creator = as_emld(me),
  contact = as_emld(me)
))
write_eml(my_eml, "ex.xml")

It doesn't get duplicated.

Too bad, it was pretty useful that I could use as_emld() on person. I guess I'll have to parse those out and feed them to set_responsibleParty() where I specifically assign each property?
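
The "parse those out" step could look roughly like the sketch below. It uses only base R to pull the fields out of a person object; person_to_party_fields() is a hypothetical helper name, not part of EML. The guarded call at the end assumes set_responsibleParty() accepts givenName, surName and electronicMailAddress arguments, and whether the result avoids the duplication has not been verified here.

```r
# Hypothetical helper (not part of EML): extract the fields of a single
# utils::person object into a plain named list.
person_to_party_fields <- function(p) {
  stopifnot(inherits(p, "person"), length(p) == 1)
  cmt <- p$comment
  orcid <- NULL
  if (!is.null(cmt) && "ORCID" %in% names(cmt)) {
    # Normalise to the bare identifier, whether or not a URL prefix is present.
    orcid <- sub("^https?://orcid\\.org/", "", cmt[["ORCID"]])
  }
  list(
    givenName = p$given,
    surName = p$family,
    electronicMailAddress = p$email,
    orcid = orcid
  )
}

me <- person("Peter", "Desmet", email = "fakeaddress@email.com",
             comment = c(ORCID = "0000-0002-8442-8025"))
fields <- person_to_party_fields(me)

# Assumption: EML is installed and set_responsibleParty() takes these arguments.
if (requireNamespace("EML", quietly = TRUE)) {
  party <- EML::set_responsibleParty(
    givenName = fields$givenName,
    surName = fields$surName,
    electronicMailAddress = fields$electronicMailAddress
  )
}
```

The ORCID is returned separately so that it can be attached however is appropriate (for example as a userId rather than as the element id), which is the part the thread suspects of triggering the duplication.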
