Store LaTeX-free fields in BibEntry #2102

Merged
merged 12 commits into from
Oct 3, 2016
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -58,6 +58,7 @@ We refer to [GitHub issues](https://github.com/JabRef/jabref/issues) by using `#
- Improve language quality of the German translation of shared database

### Fixed
- Fixed [#1993](https://github.com/JabRef/jabref/issues/1993): Various optimizations regarding search performance
- Fixed [koppor#160](https://github.com/koppor/jabref/issues/160): Tooltips now working in the main table
- Fixed [#2054](https://github.com/JabRef/jabref/issues/2054): Ignoring a new version now works as expected
- Fixed selecting an entry out of multiple duplicates
5 changes: 5 additions & 0 deletions jabref.install4j
@@ -750,4 +750,9 @@ return true;</string>
</mediaSets>
<buildIds buildAll="true" />
<buildOptions verbose="false" faster="false" disableSigning="false" disableJreBundling="false" debug="false" />
<jvmArguments>
Member

Note that if these settings go missing, it may be caused by a plain save of the config using the Install4J GUI. I'm trying to add them as a "VM options file" at 53ffd02, which seems to work.

<arg>-XX:+UseG1GC</arg>
<arg>-XX:+UseStringDeduplication</arg>
<arg>-XX:StringTableSize=1000003</arg>
</jvmArguments>
</install4j>
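
Since the comment above warns that these JVM arguments can silently disappear from the installer config, a small runtime check may help. The following is a minimal sketch, not part of this PR: `JvmFlagCheck` and `missingFlags` are hypothetical names, and the only assumption is that the flags are passed to the JVM exactly as written in `jabref.install4j`.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JabRef: reports which expected JVM flags are missing.
final class JvmFlagCheck {

    // The three flags added to jabref.install4j in this change.
    static final List<String> EXPECTED = Arrays.asList(
            "-XX:+UseG1GC",
            "-XX:+UseStringDeduplication",
            "-XX:StringTableSize=1000003");

    // Returns the expected flags that are absent from the given argument list.
    static List<String> missingFlags(List<String> actualArguments) {
        List<String> missing = new ArrayList<>(EXPECTED);
        missing.removeAll(actualArguments);
        return missing;
    }

    public static void main(String[] args) {
        // Arguments the running JVM was actually started with.
        List<String> actual = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Missing JVM flags: " + missingFlags(actual));
    }
}
```

Run from a plain `java` invocation, this would list all three flags as missing; run from an installation that picked up the settings, the list should be empty.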
55 changes: 42 additions & 13 deletions src/main/java/net/sf/jabref/model/entry/BibEntry.java
@@ -27,6 +27,7 @@
import net.sf.jabref.model.database.BibDatabaseMode;
import net.sf.jabref.model.entry.event.EntryEventSource;
import net.sf.jabref.model.entry.event.FieldChangedEvent;
import net.sf.jabref.model.strings.LatexToUnicode;
import net.sf.jabref.model.strings.StringUtil;

import com.google.common.base.Strings;
@@ -52,11 +53,22 @@ public class BibEntry implements Cloneable {

private String type;
private Map<String, String> fields = new ConcurrentHashMap<>();
/*

/**
* Map to store the words in every field
*/
private final Map<String, Set<String>> fieldsAsWords = new HashMap<>();

/**
* Cache that stores LaTeX-free versions of fields.
*/
private final Map<String, String> latexFreeFields = new ConcurrentHashMap<>();

/**
* Used to cleanse field values for internal LaTeX-free storage
Member

clean

Member Author

What is wrong with cleanse? My dictionary says this is a proper term :-)

*/
Contributor

/** instead of /*

private LatexToUnicode unicodeConverter = new LatexToUnicode();

// Search and grouping status is stored in boolean fields for quick reference:
private boolean searchHit;
private boolean groupHit;
@@ -65,7 +77,7 @@ public class BibEntry implements Cloneable {

private String commentsBeforeEntry = "";

/*
/**
* Marks whether the complete serialization, which was read from file, should be used.
*
* Is set to false, if parts of the entry change. This causes the entry to be serialized based on the internal state (and not based on the old serialization)
@@ -95,7 +107,7 @@ public BibEntry(String id) {
/**
* Constructs a new BibEntry with the given ID and given type
*
* @param id The ID to be used
* @param id The ID to be used
* @param type The type to set. May be null or empty. In that case, DEFAULT_TYPE is used.
*/
public BibEntry(String id, String type) {
@@ -107,7 +119,7 @@ public BibEntry(String id, String type) {
}

public Optional<FieldChange> replaceKeywords(KeywordList keywordsToReplace, Optional<Keyword> newValue,
Character keywordDelimiter) {
Character keywordDelimiter) {
KeywordList keywordList = getKeywords(keywordDelimiter);
keywordList.replaceKeywords(keywordsToReplace, newValue);

@@ -180,7 +192,6 @@ public String getId() {

/**
* Sets the cite key AKA citation key AKA BibTeX key.
*
* Note: This is <emph>not</emph> the internal Id of this entry. The internal Id is always present, whereas the BibTeX key might not be present.
*
* @param newCiteKey The cite key to set. Must not be null; use {@link #clearCiteKey()} to remove the cite key.
@@ -191,7 +202,6 @@ public void setCiteKey(String newCiteKey) {

/**
* Returns the cite key AKA citation key AKA BibTeX key, or null if it is not set.
*
* Note: this is <emph>not</emph> the internal Id of this entry. The internal Id is always present, whereas the BibTeX key might not be present.
*/
@Deprecated
@@ -396,8 +406,9 @@ public void setField(Map<String, String> fields) {

/**
* Set a field, and notify listeners about the change.
* @param name The field to set
* @param value The value to set
*
* @param name The field to set
* @param value The value to set
* @param eventSource Source the event is sent from
*/
public Optional<FieldChange> setField(String name, String value, EntryEventSource eventSource) {
@@ -421,8 +432,8 @@ public Optional<FieldChange> setField(String name, String value, EntryEventSourc

changed = true;

fields.put(fieldName, value);
fieldsAsWords.remove(fieldName);
fields.put(fieldName, value.intern());
invalidateFieldCache(fieldName);

FieldChange change = new FieldChange(this, fieldName, oldValue, value);
eventBus.post(new FieldChangedEvent(change, eventSource));
@@ -460,7 +471,7 @@ public Optional<FieldChange> clearField(String name) {
* Remove the mapping for the field name, and notify listeners about
* the change including the {@link EntryEventSource}.
*
* @param name The field to clear.
* @param name The field to clear.
* @param eventSource the source a new {@link FieldChangedEvent} should be posted from.
*/
public Optional<FieldChange> clearField(String name, EntryEventSource eventSource) {
@@ -478,7 +489,8 @@ public Optional<FieldChange> clearField(String name, EntryEventSource eventSourc
changed = true;

fields.remove(fieldName);
fieldsAsWords.remove(fieldName);
invalidateFieldCache(fieldName);

FieldChange change = new FieldChange(this, fieldName, oldValue.get(), null);
eventBus.post(new FieldChangedEvent(change, eventSource));
return Optional.of(change);
@@ -570,7 +582,7 @@ public void setGroupHit(boolean groupHit) {
* Author1, Author2: Title (Year)
*/
public String getAuthorTitleYear(int maxCharacters) {
String[] s = new String[] {getField(FieldName.AUTHOR).orElse("N/A"), getField(FieldName.TITLE).orElse("N/A"),
String[] s = new String[]{getField(FieldName.AUTHOR).orElse("N/A"), getField(FieldName.TITLE).orElse("N/A"),
getField(FieldName.YEAR).orElse("N/A")};

String text = s[0] + ": \"" + s[1] + "\" (" + s[2] + ')';
@@ -765,4 +777,21 @@ public Set<String> getFieldAsWords(String field) {
public Optional<FieldChange> clearCiteKey() {
return clearField(KEY_FIELD);
}

private void invalidateFieldCache(String fieldName) {
latexFreeFields.remove(fieldName);
fieldsAsWords.remove(fieldName);
}

public Optional<String> getLatexFreeField(String name) {
if (!hasField(name)) {
return Optional.empty();
}
// Normalize once so that lookup, insertion and invalidation all use the same key
String fieldName = toLowerCase(name);
if (latexFreeFields.containsKey(fieldName)) {
return Optional.ofNullable(latexFreeFields.get(fieldName));
} else {
String latexFreeField = unicodeConverter.format(getField(name).get()).intern();
latexFreeFields.put(fieldName, latexFreeField);
return Optional.of(latexFreeField);
}
}
}
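
The caching idea in this file (convert a field on first access, drop the cached value whenever the field is set or cleared) can be sketched in isolation. This is an illustrative stand-in rather than JabRef code: `CachedFieldStore`, `getConvertedField` and the injected `converter` are hypothetical names, and the converter plays the role that `LatexToUnicode` has in the diff.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the caching part of BibEntry; not JabRef code.
final class CachedFieldStore {

    private final Map<String, String> fields = new ConcurrentHashMap<>();
    private final Map<String, String> convertedFields = new ConcurrentHashMap<>();
    private final UnaryOperator<String> converter;

    CachedFieldStore(UnaryOperator<String> converter) {
        this.converter = converter;
    }

    // Every map access goes through the same key normalization.
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ENGLISH);
    }

    void setField(String name, String value) {
        String key = normalize(name);
        fields.put(key, value);
        convertedFields.remove(key); // invalidate the stale converted value
    }

    void clearField(String name) {
        String key = normalize(name);
        fields.remove(key);
        convertedFields.remove(key);
    }

    Optional<String> getConvertedField(String name) {
        String key = normalize(name);
        String raw = fields.get(key);
        if (raw == null) {
            return Optional.empty();
        }
        // Convert on first access, then serve the cached result.
        return Optional.of(convertedFields.computeIfAbsent(key, k -> converter.apply(raw)));
    }
}
```

The sketch normalizes the field name in one place so that reads, writes and invalidation agree on the key. It does not try to close every race between a concurrent `setField` and a read; that would need additional synchronization.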
Original file line number Diff line number Diff line change
@@ -38,8 +38,8 @@ public boolean applyRule(String query, BibEntry bibEntry) {

List<String> unmatchedWords = new SentenceAnalyzer(searchString).getWords();

for (String fieldContent : bibEntry.getFieldValues()) {
String formattedFieldContent = LATEX_TO_UNICODE_FORMATTER.format(fieldContent);
for (String fieldKey : bibEntry.getFieldNames()) {
String formattedFieldContent = bibEntry.getLatexFreeField(fieldKey).get();
if (!caseSensitive) {
formattedFieldContent = formattedFieldContent.toLowerCase();
}
Original file line number Diff line number Diff line change
@@ -162,7 +162,7 @@ public boolean compare(BibEntry entry) {
}

for (String field : fieldsKeys) {
Optional<String> fieldValue = entry.getField(field);
Optional<String> fieldValue = entry.getLatexFreeField(field);
if (fieldValue.isPresent()) {
if (matchFieldValue(fieldValue.get())) {
return true;
Original file line number Diff line number Diff line change
@@ -53,8 +53,7 @@ public boolean applyRule(String query, BibEntry bibEntry) {
for (String field : bibEntry.getFieldNames()) {
Optional<String> fieldOptional = bibEntry.getField(field);
if (fieldOptional.isPresent()) {
String fieldContent = RegexBasedSearchRule.LATEX_TO_UNICODE_FORMATTER.format(fieldOptional.get());
String fieldContentNoBrackets = RegexBasedSearchRule.LATEX_TO_UNICODE_FORMATTER.format(fieldContent);
String fieldContentNoBrackets = bibEntry.getLatexFreeField(field).get();
Matcher m = pattern.matcher(fieldContentNoBrackets);
if (m.find()) {
return true;
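
For readers unfamiliar with the search rules touched above, the word-based matching can be approximated as follows. This is a simplified sketch under stated assumptions: `WordSearchSketch` and `matches` are hypothetical, and the map stands in for the already LaTeX-free field values that `getLatexFreeField` returns in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a contains-based search; the real rules operate on BibEntry.
final class WordSearchSketch {

    // True if every search word occurs in at least one of the (already LaTeX-free) field values.
    static boolean matches(Map<String, String> latexFreeFields, List<String> words, boolean caseSensitive) {
        List<String> unmatched = new ArrayList<>();
        for (String word : words) {
            unmatched.add(caseSensitive ? word : word.toLowerCase(Locale.ROOT));
        }
        for (String content : latexFreeFields.values()) {
            String haystack = caseSensitive ? content : content.toLowerCase(Locale.ROOT);
            // Drop every word found in this field; stop as soon as nothing is left.
            unmatched.removeIf(haystack::contains);
            if (unmatched.isEmpty()) {
                return true;
            }
        }
        return unmatched.isEmpty();
    }
}
```

Because the values are converted once and cached on the entry, a loop like this no longer pays for a LaTeX-to-Unicode conversion on every search.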
23 changes: 19 additions & 4 deletions src/main/java/net/sf/jabref/model/strings/LatexToUnicode.java
@@ -1,21 +1,33 @@
package net.sf.jabref.model.strings;

import java.util.Map;
import java.util.regex.Pattern;

public class LatexToUnicode {

private static final Map<String, String> CHARS = HTMLUnicodeConversionMaps.LATEX_UNICODE_CONVERSION_MAP;
private static final Map<String, String> ACCENTS = HTMLUnicodeConversionMaps.UNICODE_ESCAPED_ACCENTS;

private static final Pattern AMP_LATEX = Pattern.compile("&|\\\\&");
private static final Pattern P_LATEX = Pattern.compile("[\\n]{1,}");
private static final Pattern DOLLAR_LATEX = Pattern.compile("\\\\\\$");
private static final Pattern DOLLARS_LATEX = Pattern.compile("\\$([^\\$]*)\\$");

private static final Pattern AMP = Pattern.compile("\\&amp;");
private static final Pattern P = Pattern.compile("<p>");
private static final Pattern DOLLAR = Pattern.compile("\\&dollar;");
private static final Pattern TILDE = Pattern.compile("~");

public String format(String inField) {
if (inField.isEmpty()) {
return "";
}
int i;
// TODO: document what does this do
String field = inField.replaceAll("&|\\\\&", "&amp;").replaceAll("[\\n]{1,}", "<p>").replace("\\$", "&dollar;") // Replace \$ with &dollar;
.replaceAll("\\$([^\\$]*)\\$", "\\{$1\\}");
String field = AMP_LATEX.matcher(inField).replaceAll("&amp;");
field = P_LATEX.matcher(field).replaceAll("<p>");
field = DOLLAR_LATEX.matcher(field).replaceAll("&dollar;");
field = DOLLARS_LATEX.matcher(field).replaceAll("\\{$1\\}");

StringBuilder sb = new StringBuilder();
StringBuilder currentCommand = null;
@@ -187,8 +199,11 @@ public String format(String inField) {
}
}

return sb.toString().replace("&amp;", "&").replace("<p>", "\n").replace("&dollar;", "$").replace("~",
"\u00A0");
String result = AMP.matcher(sb.toString()).replaceAll("&");
result = P.matcher(result).replaceAll("\n");
result = DOLLAR.matcher(result).replaceAll("\\$");
result = TILDE.matcher(result).replaceAll("\u00A0");
return result;

}
}
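
The change above replaces chained `String.replaceAll` and `String.replace` calls with `Pattern` constants. A small standalone sketch of that technique follows; `PrecompiledPatternSketch` and its method names are hypothetical, and the two patterns are copied from the diff. It also illustrates why the replacement for `&dollar;` must escape `$`, which `String.replace` did not require.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of precompiled patterns; not part of LatexToUnicode.
final class PrecompiledPatternSketch {

    // Compiled once; String.replaceAll would compile the regex on every call.
    private static final Pattern DOLLAR_ENTITY = Pattern.compile("\\&dollar;");
    private static final Pattern INLINE_MATH = Pattern.compile("\\$([^\\$]*)\\$");

    static String restoreDollar(String text) {
        // "$" starts a group reference in a replacement string, so it must be quoted.
        return DOLLAR_ENTITY.matcher(text).replaceAll(Matcher.quoteReplacement("$"));
    }

    static String mathToBraces(String text) {
        // Here "$1" is intended: it inserts the captured math content between braces.
        return INLINE_MATH.matcher(text).replaceAll("\\{$1\\}");
    }
}
```

`Matcher.quoteReplacement("$")` produces the same replacement string as the hand-written `"\\$"` in the diff; which form to use is a matter of taste.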