Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Towards hierarchical keywords #1950

Merged
merged 17 commits into from
Sep 24, 2016
Merged

Conversation

tobiasdiez
Copy link
Member

This PR is the first step towards supporting hierarchical keywords #628.

  • Refactor the keyword-related code to use new classes KeywordList and Keyword
  • Unified keyword delimiter to be a single character and in this way fixed the issues mentioned in Keyword separator #705 and Keyword separator behaviour (again, from a groups perspective) #1877
  • Move keyword delimiter from preferences to metadata (database properties) -> new PR
  • Add hierarchical delimiter and change parsing logic in KeywordList accordingly -> new PR
  • Change UI to support hierarchical keywords -> new PR

Note: I also moved the StringUtil class to model (in some sense it is JabRef's own String class) and removed EntryUtil (only had methods related to strings (-> StringUtil) or keywords (-> KeywordList) ).

  • Change in CHANGELOG.md described
  • Tests created for changes
  • Screenshots added (for bigger UI changes)
  • Manually tested changed features in running JabRef
  • Check documentation status (Issue created for outdated help page at help.jabref.org?)

@tobiasdiez tobiasdiez added status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers dev: code-quality Issues related to code or architecture decisions component: keywords labels Sep 11, 2016
@tobiasdiez tobiasdiez changed the title Towards Towards hierarchical keywords Sep 11, 2016
Copy link
Member

@lenhard lenhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things need to be addressed before merging (especially StringUtil and the inheritance of KeywordList).

@@ -65,7 +65,7 @@ public BibDatabaseContext(BibDatabase database, MetaData metaData, File file) {
this(database, metaData, file, new Defaults());
}

public BibDatabaseContext(Defaults defaults, DatabaseLocation location, String keywordSeparator) {
public BibDatabaseContext(Defaults defaults, DatabaseLocation location, Character keywordSeparator) {
Copy link
Member

@lenhard lenhard Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that this is an error or anything the like. But there is hardly ever an advantage of using Character over String (except for illusive memory benefits). So what is the reason for this change and all the related changes in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some issues related to strings as keyword separators. For example, the default is , (with space) but some places tried to split strings only at the comma (i.e. recognize apple,orange). The space was only used to reformat the list as apple, orange. So I decided to force that only one character can be used as a separator and the final space is always adapted upon writing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sounds reasonable. Let's hope that no one ever wants to use some weird multi-keyword separator.

@@ -3,7 +3,7 @@
import java.util.Arrays;
import java.util.Collection;

import net.sf.jabref.model.entry.EntryUtil;
import net.sf.jabref.model.strings.StringUtil;
Copy link
Member

@lenhard lenhard Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I get this correctly, you merged StringUtil into EntryUtil? We have had discussions about this before and I am not entirely happy with it. StringUtil has a high potential of becoming a god class that can do everything and is used everywhere. Please let's be careful with this.

In the end, however, I also see no point in arguing endlessly over it, so it is ok for me for now. However, please not that on current master there is the class model.util.ModelStringUtil. There is absolutely no change that I would tolerate two string util classes in model, so you'll have to merge that one as well

Copy link
Member Author

@tobiasdiez tobiasdiez Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we now have a ModelStringUtil class, which already moved some string-related methods to model. I would propose to merge everything together: logic.StringUtil, model.ModelStringUtil and model.EntryUtil, they all deal with the same thing. I will create a separate PR with this change (because it also makes merging easier).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I understand. Do this in a separate PR, but please do not forget it :)

sortedKeywordsOfAllEntriesBeforeUpdateByUser.retainAll(separatedKeywords);
}
}
for (String s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) {
for (Keyword s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Rename s to keyword :)

new ExplicitGroup(Localization.lang("Automatically created groups"),
GroupHierarchyType.INCLUDING,
Globals.prefs.getKeywordDelimiter()));
Set<String> hs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Improve naming in the new code here.

* Represents a list of keyword chains.
* For example, "Type > A, Type > B, Something else".
*/
public class KeywordList extends ArrayList<Keyword> {
Copy link
Member

@lenhard lenhard Sep 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like that you extend the API class here. That way, KeywordList inherits all kinds of methods that it probably does not need and is prone to weird polymorphic misuses. Instead, KeyWordList should have an internal attribute of type ArrayList<Keyword> and just provide a limited subset of methods that are really needed to modify this attribute.

Also, the current implementation supports duplicates in the list. Is this reasonable for keywords? Wouldn't duplicate elimination make sense here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Composition over inheritance, noted!

I wasn't sure how to handle duplicates. There was some code which tried to replace a keyword exactly at the same position: a, b, c -> a, replace, c. This was not easily possible with Set, but just one line with List. Not sure that this was the right decision through. What is your opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tough question, really. Personally, I cannot come up with a reason to have duplicate keywords, but users are weird, as we all know :) It is probably best if we just keep things as they were for the moment. Since the keywords were prior stored as a String, it was certainly possible to add duplicates. So let us keep this for now. My fear is that it may result in random bugs, where processing stops, because a keyword is found and another occurrence of the keyword is still in the list.

As a side note: You do not have to use a Set to avoid duplicates. You can also use a List internally and purge duplicates at the level of KeywordList.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now changed it so that no duplicates are left upon parsing a string to a KeywordList. I feel we get more bugs if the list may actually contain duplicates.

import org.junit.Test;
import org.xml.sax.SAXException;

public class GvkParserTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get why this class is introduced in this PR.


import static org.junit.Assert.assertEquals;

public class KeywordListTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests that document the behavior in case of duplicate keywords.

@tobiasdiez tobiasdiez merged commit 6ae86ae into JabRef:master Sep 24, 2016
@tobiasdiez tobiasdiez deleted the hierKeywords branch September 24, 2016 12:04
@tobiasdiez tobiasdiez mentioned this pull request Sep 24, 2016
Siedlerchr added a commit that referenced this pull request Sep 25, 2016
* upstream/master:
  Implemented Integrity NoBibtexFieldChecker (#2059)
  Implemented title and camel key modifiers (#1894)
  Fix localization task hints (#2031)
  Result of syncLang.py update
  Result of syncLang.py update (with manual correction of captial_letters, and The_BibTeX_entry...)
  Fix "large capitals" to "capital letters"
  Updated Menu_tr.properties (#2057)
  Updated jabref_tr.properties (#2056)
  fix ignore version (#2055)
  Towards hierarchical keywords (#1950)
  Fix failing test, replace \uxxx encoded chars with proper UTF8 chars.
  Import Italian patch
zesaro pushed a commit to zesaro/jabref that referenced this pull request Nov 22, 2016
* Small code cleanup in SpecialFieldsUtils

* Refactor code related to keywords

* Move StringUtil to model and remove EntryUtil

* Add a few more tests

* Change keyword delemiter in groups to Character

* Optimize imports

* Removed unused keyword separator in shareddb ui manager

* Fix build errors

* Fix failing architecture tests

* Reformat imports

* Small renamings

* Move from Inheritance to composition in KeywordList

* Fix tests

* ArXiv accepts import format preferences instead of keyword delimiter

* Fix binding

* Fix arXiv tests
zesaro added a commit to zesaro/jabref that referenced this pull request Nov 22, 2016
* Small code cleanup in SpecialFieldsUtils

* Refactor code related to keywords

* Move StringUtil to model and remove EntryUtil

* Add a few more tests

* Change keyword delemiter in groups to Character

* Optimize imports

* Removed unused keyword separator in shareddb ui manager

* Fix build errors

* Fix failing architecture tests

* Reformat imports

* Small renamings

* Move from Inheritance to composition in KeywordList

* Fix tests

* ArXiv accepts import format preferences instead of keyword delimiter

* Fix binding

* Fix arXiv tests

# Conflicts:
#	src/main/java/net/sf/jabref/gui/importer/fetcher/EntryFetchers.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: keywords dev: code-quality Issues related to code or architecture decisions status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants