Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge parsing of bracketed patterns #6989

Merged
merged 13 commits into from
Oct 13, 2020

Conversation

k3KAW8Pnf7mkmdSMPHz27
Copy link
Member

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 commented Oct 7, 2020

Fixes #6892.
Character classes (regex expressions in brackets, e.g., [a-zA-Z]) should be usable with the regex modifier, but it currently isn't for citation keys.

Bracketed patterns are parsed using several different methods in CitationKeyGenerator.java, BracketedPattern.java and AbstractCitationKeyPattern. This PR aims to merge all parsing code into BracketedPattern.java to make it easier to modify bracketed patterns without breaking the citation key generator.

Currently BracketedPattern's parser can parse expressions containing the regex modifier using character classes.

Todo

  • What are AbstractCitationKeyPattern and GlobalCitationKeyPattern used for, and should they be removed?
  • Why are both GlobalCitationKeyPattern and CitationKeyPatternPreferences needed to create a CitationKeyGenerator?

Checklist

  • Change in CHANGELOG.md described (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.

@k3KAW8Pnf7mkmdSMPHz27
Copy link
Member Author

I think I need an opinion on the cleaning of "unwanted characters".

In #4770 the argument were that citation patterns containing special characters ([auth]-[year]) should not remove the "-" between the two bracketed expressions/field markers as it is a wanted special character.
I'd argue based on recent changes #6300, this should no longer be the case, as it is a preference that the user can set themselves.

In particular, I believe this test case,

@Test
void generateKeyWithMinusInCitationStyleOutsideAField() {
BibEntry entry = new BibEntry()
.withField(StandardField.AUTHOR, AUTHOR_STRING_FIRSTNAME_FULL_LASTNAME_FULL_COUNT_1)
.withField(StandardField.YEAR, "2019");
assertEquals("Newton-2019", generateKey(entry, "[auth]-[year]"));
}

should fail using the default unwanted characters.
public static final String DEFAULT_UNWANTED_CHARACTERS = "-`ʹ:!;?^+";

Is this reasonable? Should I open a separate PR with more information on this issue? (the answer dictates where the cleanKey method call goes and should only change 1-4 lines of code)

Extracts code from `generateKey` as methods for readability purposes and
 switches out the parsing code for the one in `BracketedPattern`
@Siedlerchr
Copy link
Member

Regarding the unwanted characters, you are right. The generator should respect the user defined unwanted chars (and of course the illegal ones)

@k3KAW8Pnf7mkmdSMPHz27
Copy link
Member Author

@Siedlerchr I'll change it and add some test cases

@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 marked this pull request as ready for review October 12, 2020 19:51
@k3KAW8Pnf7mkmdSMPHz27 k3KAW8Pnf7mkmdSMPHz27 changed the title [WIP] Merge parsing of bracketed patterns Merge parsing of bracketed patterns Oct 12, 2020
@k3KAW8Pnf7mkmdSMPHz27
Copy link
Member Author

I think looking more into AbstractCitationKeyPattern and GlobalCitationKeyPattern will end up being out of scope of this PR. I have left them unchanged.

Copy link
Member

@Siedlerchr Siedlerchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, just one minor thing where I am not sure

@Siedlerchr Siedlerchr merged commit e955c46 into JabRef:master Oct 13, 2020
Siedlerchr added a commit that referenced this pull request Oct 14, 2020
* upstream/master:
  Jstor Fetcher (#6992)
  Group: "Searching for keywords" searches for a single keyword ==> use singular (#6995)
  Merge parsing of bracketed patterns (#6989)
  6848 fixed the issue of clicking collapse all expanding tree (#6993)
  Enable auto sync per default for Open/Libre Office (#6985)
  Bump unirest-java from 3.11.00 to 3.11.01 (#7001)
  Bump byte-buddy-parent from 1.10.16 to 1.10.17 (#7004)
  Bump lucene-queryparser from 8.6.2 to 8.6.3 (#7002)
  Bump postgresql from 42.2.16 to 42.2.17 (#7005)
  Bump pascalgn/automerge-action from v0.11.0 to v0.12.0 (#7006)
  Bump flowless from 0.6.1 to 0.6.2 (#7003)
Siedlerchr added a commit that referenced this pull request Oct 16, 2020
* upstream/master:
  Update journalList.mv
  Update to javafx15 (#7018)
  Squashed 'src/main/resources/csl-styles/' changes from 6fab78b..5297abd
  try to fix DEP issue with official jdk (#7008)
  Jstor Fetcher (#6992)
  Group: "Searching for keywords" searches for a single keyword ==> use singular (#6995)
  Merge parsing of bracketed patterns (#6989)
  6848 fixed the issue of clicking collapse all expanding tree (#6993)
  Enable auto sync per default for Open/Libre Office (#6985)
  Bump unirest-java from 3.11.00 to 3.11.01 (#7001)
  Bump byte-buddy-parent from 1.10.16 to 1.10.17 (#7004)
  Bump lucene-queryparser from 8.6.2 to 8.6.3 (#7002)
  Bump postgresql from 42.2.16 to 42.2.17 (#7005)
  Bump pascalgn/automerge-action from v0.11.0 to v0.12.0 (#7006)
  Bump flowless from 0.6.1 to 0.6.2 (#7003)
Siedlerchr added a commit that referenced this pull request Oct 17, 2020
* upstream/master: (58 commits)
  remove any newlines and spaces in isbn when fetching (#7023)
  add exception to error handler in integrity check
  Update journalList.mv
  Update to javafx15 (#7018)
  Squashed 'src/main/resources/csl-styles/' changes from 6fab78b..5297abd
  try to fix DEP issue with official jdk (#7008)
  Jstor Fetcher (#6992)
  Group: "Searching for keywords" searches for a single keyword ==> use singular (#6995)
  Merge parsing of bracketed patterns (#6989)
  6848 fixed the issue of clicking collapse all expanding tree (#6993)
  Enable auto sync per default for Open/Libre Office (#6985)
  Bump unirest-java from 3.11.00 to 3.11.01 (#7001)
  Bump byte-buddy-parent from 1.10.16 to 1.10.17 (#7004)
  Bump lucene-queryparser from 8.6.2 to 8.6.3 (#7002)
  Bump postgresql from 42.2.16 to 42.2.17 (#7005)
  Bump pascalgn/automerge-action from v0.11.0 to v0.12.0 (#7006)
  Bump flowless from 0.6.1 to 0.6.2 (#7003)
  Rewrite guidelines to Java 15 (#6973)
  Lint CHANGELOG.md
  Removing "BibTeX" when not specific to BibTeX (#6983)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Key generator regex() function not handling character classes
3 participants