Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.2 schema updates from DataCite #50

Merged
merged 4 commits into from
Apr 6, 2021
Merged

4.2 schema updates from DataCite #50

merged 4 commits into from
Apr 6, 2021

Conversation

tmorrell
Copy link
Contributor

@tmorrell tmorrell commented Aug 7, 2019

Get the latest 4.2 schema updates from DataCite. I'm still working on the XML<->JSON transform for identifiers. Both identifier and alternative identifiers in the XML need to get merged into the identifiers field in JSON. I'll also add a small description on the changes between the 4.1 schema and 4.2 DataCite schema (beyond just the new metadata fields).

@tmorrell
Copy link
Contributor Author

I've gotten the identifiers transform to work correctly, but we may want to wait for datacite/schema#68 to get merged to avoid another PR.

@tmorrell tmorrell changed the title [WIP] 4.2 schema updates from DataCite 4.2 schema updates from DataCite Aug 15, 2019
@lnielsen
Copy link
Member

I would prefer we wait until the "official" 4.2 schema is fully locked down. I think there's still a couple of discussions to take like datacite/schema#68 but also the identifiers vs alternate identifiers

Also, can I ask you to rebase and create logical commits, otherwise it is very hard for me to review it. Here's how I would structure it:

  1. a commit for changing the identifier + alternative_identifiers into identifiers (schema42.py, datacite-v4.2.json , test_schema42.py, xmlutils.py, conftest.py, only the identifiers property in examples). Only change actual lines that need change (e.g. don't include JSON reformatting changes for datacite-v4.2.json and examples)
  2. a commit for adding new test JSON files
  3. a commit for reformatting JSON files (schema + examples)
  4. a commit for CHANGES.rst

This way I can clearly see the behaviour change (bullet 1) and what impact it has on examples files without having to understand if it's just reformatting or the behaviour changed. Also, there's less risk that I miss to see important changes (changes to example files are often a good indicator to understand how the behaviour changed). It also drastically reduce the number of lines I have to carefully review, vs what I just need to skim through (e.g. I don't have to review pure reformatting much). Once the code is merged, it is also a lot easier to track down bugs and understand why something was changed when using logical commits.

Two last things, I'd also like to ask:

  • For each logical commit add a descriptive commit message why something was changed
  • Match commit message style to previous commit history and ensure there's no merge commits in the history.

@tmorrell
Copy link
Contributor Author

Thanks @lnielsen! I've reorganized the commits to make more sense. I'm not sure how I changed the spacing in the jsonschema when I did the first PR, but the first commit is just matching what is currently in datacite/schema. The rest are the actual changes to get identifier + alternative_identifiers working correctly, an extra test (which can be removed if you don't like it), and the docs.

No problem waiting until 4.2 is settled. I'll rebase my other PRs once this is merged.

@tmorrell
Copy link
Contributor Author

tmorrell commented Apr 3, 2020

It doesn't seem like there will be any more changes to 4.2 @lnielsen. If the formatting changes are too much I can re-do the pull request and separate out formatting from the identifiers changes, but since I'm just syncing with DataCite I think it would have to be manual.

Once 4.2 is merged I can rebase the 4.3 PR.

@tmorrell
Copy link
Contributor Author

tmorrell commented Apr 3, 2020

Oh, and in case it's not clear I'm not including the geopoints as floats component. It might make sense for a general jsonschema, but I can't figure out a straightforward way to implement it in a way that makes sense in Python.

@tmorrell
Copy link
Contributor Author

tmorrell commented May 8, 2020

Rebased with master so Travis passes

@lnielsen lnielsen self-assigned this Oct 19, 2020
@tmorrell
Copy link
Contributor Author

tmorrell commented Oct 21, 2020

@lnielsen I've added a new PR #57 to address an issue with isort that was making travis fail. I've included the new PR in here already, so start with the new PR first. Thanks!

@ntarocco ntarocco merged commit 649c535 into inveniosoftware:master Apr 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants