-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Towards hierarchical keywords #1950
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things need to be addressed before merging (especially StringUtil
and the inheritance of KeywordList
).
@@ -65,7 +65,7 @@ public BibDatabaseContext(BibDatabase database, MetaData metaData, File file) { | |||
this(database, metaData, file, new Defaults()); | |||
} | |||
|
|||
public BibDatabaseContext(Defaults defaults, DatabaseLocation location, String keywordSeparator) { | |||
public BibDatabaseContext(Defaults defaults, DatabaseLocation location, Character keywordSeparator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that this is an error or anything the like. But there is hardly ever an advantage of using Character
over String
(except for illusive memory benefits). So what is the reason for this change and all the related changes in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some issues related to strings as keyword separators. For example, the default is ,
(with space) but some places tried to split strings only at the comma (i.e. recognize apple,orange
). The space was only used to reformat the list as apple, orange
. So I decided to force that only one character can be used as a separator and the final space is always adapted upon writing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds reasonable. Let's hope that no one ever wants to use some weird multi-keyword separator.
@@ -3,7 +3,7 @@ | |||
import java.util.Arrays; | |||
import java.util.Collection; | |||
|
|||
import net.sf.jabref.model.entry.EntryUtil; | |||
import net.sf.jabref.model.strings.StringUtil; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I get this correctly, you merged StringUtil
into EntryUtil
? We have had discussions about this before and I am not entirely happy with it. StringUtil
has a high potential of becoming a god class that can do everything and is used everywhere. Please let's be careful with this.
In the end, however, I also see no point in arguing endlessly over it, so it is ok for me for now. However, please not that on current master there is the class model.util.ModelStringUtil
. There is absolutely no change that I would tolerate two string util classes in model, so you'll have to merge that one as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we now have a ModelStringUtil class, which already moved some string-related methods to model. I would propose to merge everything together: logic.StringUtil, model.ModelStringUtil and model.EntryUtil, they all deal with the same thing. I will create a separate PR with this change (because it also makes merging easier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I understand. Do this in a separate PR, but please do not forget it :)
sortedKeywordsOfAllEntriesBeforeUpdateByUser.retainAll(separatedKeywords); | ||
} | ||
} | ||
for (String s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) { | ||
for (Keyword s : sortedKeywordsOfAllEntriesBeforeUpdateByUser) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Rename s
to keyword
:)
new ExplicitGroup(Localization.lang("Automatically created groups"), | ||
GroupHierarchyType.INCLUDING, | ||
Globals.prefs.getKeywordDelimiter())); | ||
Set<String> hs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Improve naming in the new code here.
* Represents a list of keyword chains. | ||
* For example, "Type > A, Type > B, Something else". | ||
*/ | ||
public class KeywordList extends ArrayList<Keyword> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like that you extend the API class here. That way, KeywordList
inherits all kinds of methods that it probably does not need and is prone to weird polymorphic misuses. Instead, KeyWordList
should have an internal attribute of type ArrayList<Keyword>
and just provide a limited subset of methods that are really needed to modify this attribute.
Also, the current implementation supports duplicates in the list. Is this reasonable for keywords? Wouldn't duplicate elimination make sense here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Composition over inheritance, noted!
I wasn't sure how to handle duplicates. There was some code which tried to replace a keyword exactly at the same position: a, b, c
-> a, replace, c
. This was not easily possible with Set, but just one line with List. Not sure that this was the right decision through. What is your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tough question, really. Personally, I cannot come up with a reason to have duplicate keywords, but users are weird, as we all know :) It is probably best if we just keep things as they were for the moment. Since the keywords were prior stored as a String, it was certainly possible to add duplicates. So let us keep this for now. My fear is that it may result in random bugs, where processing stops, because a keyword is found and another occurrence of the keyword is still in the list.
As a side note: You do not have to use a Set to avoid duplicates. You can also use a List internally and purge duplicates at the level of KeywordList
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now changed it so that no duplicates are left upon parsing a string to a KeywordList. I feel we get more bugs if the list may actually contain duplicates.
import org.junit.Test; | ||
import org.xml.sax.SAXException; | ||
|
||
public class GvkParserTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get why this class is introduced in this PR.
|
||
import static org.junit.Assert.assertEquals; | ||
|
||
public class KeywordListTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests that document the behavior in case of duplicate keywords.
629d6ad
to
2665d83
Compare
* upstream/master: Implemented Integrity NoBibtexFieldChecker (#2059) Implemented title and camel key modifiers (#1894) Fix localization task hints (#2031) Result of syncLang.py update Result of syncLang.py update (with manual correction of captial_letters, and The_BibTeX_entry...) Fix "large capitals" to "capital letters" Updated Menu_tr.properties (#2057) Updated jabref_tr.properties (#2056) fix ignore version (#2055) Towards hierarchical keywords (#1950) Fix failing test, replace \uxxx encoded chars with proper UTF8 chars. Import Italian patch
* Small code cleanup in SpecialFieldsUtils * Refactor code related to keywords * Move StringUtil to model and remove EntryUtil * Add a few more tests * Change keyword delemiter in groups to Character * Optimize imports * Removed unused keyword separator in shareddb ui manager * Fix build errors * Fix failing architecture tests * Reformat imports * Small renamings * Move from Inheritance to composition in KeywordList * Fix tests * ArXiv accepts import format preferences instead of keyword delimiter * Fix binding * Fix arXiv tests
* Small code cleanup in SpecialFieldsUtils * Refactor code related to keywords * Move StringUtil to model and remove EntryUtil * Add a few more tests * Change keyword delemiter in groups to Character * Optimize imports * Removed unused keyword separator in shareddb ui manager * Fix build errors * Fix failing architecture tests * Reformat imports * Small renamings * Move from Inheritance to composition in KeywordList * Fix tests * ArXiv accepts import format preferences instead of keyword delimiter * Fix binding * Fix arXiv tests # Conflicts: # src/main/java/net/sf/jabref/gui/importer/fetcher/EntryFetchers.java
This PR is the first step towards supporting hierarchical keywords #628.
KeywordList
andKeyword
KeywordList
accordingly -> new PRNote: I also moved the StringUtil class to model (in some sense it is JabRef's own String class) and removed EntryUtil (only had methods related to strings (-> StringUtil) or keywords (-> KeywordList) ).