Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #3472: Files starting with BibTeX key of a similar entry are no longer found by mistake #3509

Merged
merged 10 commits into from
Dec 23, 2017
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@ We refer to [GitHub issues](https://github.com/JabRef/jabref/issues) by using `#
- We fixed an issue where the same Java Look and Feel would be listed more than once in the Preferences. [#3391](https://github.com/JabRef/jabref/issues/3391)
- We fixed an issue where errors in citation styles triggered an exception when opening the preferences dialog [#3389](https://github.com/JabRef/jabref/issues/3389)
- We fixed an issue where entries were displayed twice after insertion into a shared database. [#3164](https://github.com/JabRef/jabref/issues/3164)
- We improved the auto link algorithm so that files starting with a similar key are no longer found (e.g, `Einstein1902` vs `Einstein1902a`). [#3472](https://github.com/JabRef/jabref/issues/3472)
- We fixed an issue where special fields (such as `printed`) could not be cleared when syncing special fields via the keywords [#3432](https://github.com/JabRef/jabref/issues/3432)
- We fixed an issue where the tooltip of the global search bar showed html tags instead of formatting the text [#3381](https://github.com/JabRef/jabref/issues/3381)
- We fixed an issue where timestamps were not updated for changed entries [#2810](https://github.com/JabRef/jabref/issues/2810)
Expand Down
4 changes: 2 additions & 2 deletions src/main/java/org/jabref/gui/BasePanel.java
Original file line number Diff line number Diff line change
Expand Up @@ -2016,9 +2016,9 @@ public void searchAndOpen() {

// Run the search operation:
FileFinder fileFinder = FileFinders.constructFromConfiguration(Globals.prefs.getAutoLinkPreferences());
List<Path> files = fileFinder.findAssociatedFiles(entry, dirs, extensions);
Collection<Path> files = fileFinder.findAssociatedFiles(entry, dirs, extensions);
if (!files.isEmpty()) {
Path file = files.get(0);
Path file = files.iterator().next();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no advantage in using Collection here

Optional<ExternalFileType> type = ExternalFileTypes.getInstance().getExternalFileTypeByFile(file);
if (type.isPresent()) {
try {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
Expand All @@ -26,22 +27,33 @@
public class AutoSetFileLinksUtil {

private static final Log LOGGER = LogFactory.getLog(AutoSetLinks.class);
private List<Path> directories;
private AutoLinkPreferences autoLinkPreferences;
private ExternalFileTypes externalFileTypes;

public List<LinkedFile> findassociatedNotLinkedFiles(BibEntry entry, BibDatabaseContext databaseContext, FileDirectoryPreferences fileDirPrefs, AutoLinkPreferences autoLinkPrefs, ExternalFileTypes externalFileTypes) {
public AutoSetFileLinksUtil(BibDatabaseContext databaseContext, FileDirectoryPreferences fileDirectoryPreferences, AutoLinkPreferences autoLinkPreferences, ExternalFileTypes externalFileTypes) {
this(databaseContext.getFileDirectoriesAsPaths(fileDirectoryPreferences), autoLinkPreferences, externalFileTypes);
}

public AutoSetFileLinksUtil(List<Path> directories, AutoLinkPreferences autoLinkPreferences, ExternalFileTypes externalFileTypes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am against creating a second constructor. It was my initial idea in #3368 and guess who supposed the current alternative? 😆
See your comment #3368 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean. The linked comment is about the list of entries and not about how the AutoSetFileLinksUtil gets the things needed (like directories and preferences). I think, the current setup makes sense: on initialization, specify everything the auto set functionality needs to know. Why don't you like these constructors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I could not see all code in the line on my mobile, so I was a b9it confused. Forget about my remark,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem!

this.directories = directories;
this.autoLinkPreferences = autoLinkPreferences;
this.externalFileTypes = externalFileTypes;
}

public List<LinkedFile> findAssociatedNotLinkedFiles(BibEntry entry) {
List<LinkedFile> linkedFiles = new ArrayList<>();

List<Path> dirs = databaseContext.getFileDirectoriesAsPaths(fileDirPrefs);
List<String> extensions = externalFileTypes.getExternalFileTypeSelection().stream().map(ExternalFileType::getExtension).collect(Collectors.toList());

// Run the search operation:
FileFinder fileFinder = FileFinders.constructFromConfiguration(autoLinkPrefs);
List<Path> result = fileFinder.findAssociatedFiles(entry, dirs, extensions);

// Iterate over the entries:
// Run the search operation
FileFinder fileFinder = FileFinders.constructFromConfiguration(autoLinkPreferences);
Collection<Path> result = fileFinder.findAssociatedFiles(entry, directories, extensions);

// Collect the found files that are not yet linked
for (Path foundFile : result) {
boolean existingSameFile = entry.getFiles().stream()
.map(file -> file.findIn(dirs))
boolean fileAlreadyLinked = entry.getFiles().stream()
.map(file -> file.findIn(directories))
.anyMatch(file -> {
try {
return file.isPresent() && Files.isSameFile(file.get(), foundFile);
Expand All @@ -50,8 +62,7 @@ public List<LinkedFile> findassociatedNotLinkedFiles(BibEntry entry, BibDatabase
}
return false;
});
if (!existingSameFile) {

if (!fileAlreadyLinked) {
Optional<ExternalFileType> type = FileHelper.getFileExtension(foundFile)
.map(externalFileTypes::getExternalFileTypeByExt)
.orElse(Optional.of(new UnknownExternalFileType("")));
Expand Down
4 changes: 2 additions & 2 deletions src/main/java/org/jabref/gui/externalfiles/AutoSetLinks.java
Original file line number Diff line number Diff line change
Expand Up @@ -82,11 +82,11 @@ public static Runnable autoSetLinks(final List<BibEntry> entries, final NamedCom

Runnable r = () -> {
boolean foundAny = false;
AutoSetFileLinksUtil util = new AutoSetFileLinksUtil();
AutoSetFileLinksUtil util = new AutoSetFileLinksUtil(databaseContext, Globals.prefs.getFileDirectoryPreferences(), Globals.prefs.getAutoLinkPreferences(), ExternalFileTypes.getInstance());

for (BibEntry entry : entries) {

List<LinkedFile> linkedFiles = util.findassociatedNotLinkedFiles(entry, databaseContext, Globals.prefs.getFileDirectoryPreferences(), Globals.prefs.getAutoLinkPreferences(), ExternalFileTypes.getInstance());
List<LinkedFile> linkedFiles = util.findAssociatedNotLinkedFiles(entry);

if (ce != null) {
for (LinkedFile linkedFile : linkedFiles) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -158,9 +158,8 @@ public void bindToEntry(BibEntry entry) {
private List<LinkedFileViewModel> findAssociatedNotLinkedFiles(BibEntry entry) {
List<LinkedFileViewModel> result = new ArrayList<>();

AutoSetFileLinksUtil util = new AutoSetFileLinksUtil();
List<LinkedFile> linkedFiles = util.findassociatedNotLinkedFiles(entry, databaseContext, Globals.prefs.getFileDirectoryPreferences(), Globals.prefs.getAutoLinkPreferences(), ExternalFileTypes.getInstance());

AutoSetFileLinksUtil util = new AutoSetFileLinksUtil(databaseContext, Globals.prefs.getFileDirectoryPreferences(), Globals.prefs.getAutoLinkPreferences(), ExternalFileTypes.getInstance());
List<LinkedFile> linkedFiles = util.findAssociatedNotLinkedFiles(entry);
for (LinkedFile linkedFile : linkedFiles) {
LinkedFileViewModel newLinkedFile = new LinkedFileViewModel(linkedFile, entry, databaseContext);
newLinkedFile.markAsAutomaticallyFound();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ public class BibtexKeyPatternUtil extends BracketedPattern {
private static final Log LOGGER = LogFactory.getLog(BibtexKeyPatternUtil.class);

// All single characters that we can use for extending a key to make it unique:
private static final String CHARS = "abcdefghijklmnopqrstuvwxyz";
public static final String CHARS = "abcdefghijklmnopqrstuvwxyz";

private BibtexKeyPatternUtil() {
}
Expand Down
2 changes: 1 addition & 1 deletion src/main/java/org/jabref/logic/net/URLDownload.java
Original file line number Diff line number Diff line change
Expand Up @@ -259,7 +259,7 @@ public Path toTemporaryFile() throws IOException {

// Take everything after the last '/' as name + extension
String fileNameWithExtension = sourcePath.substring(sourcePath.lastIndexOf('/') + 1);
String fileName = FileUtil.getFileName(fileNameWithExtension);
String fileName = FileUtil.getBaseName(fileNameWithExtension);
String extension = "." + FileHelper.getFileExtension(fileNameWithExtension).orElse("tmp");

// Create temporary file and download to it
Expand Down
42 changes: 24 additions & 18 deletions src/main/java/org/jabref/logic/util/io/CiteKeyBasedFileFinder.java
Original file line number Diff line number Diff line change
Expand Up @@ -4,21 +4,22 @@
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.jabref.logic.bibtexkeypattern.BibtexKeyPatternUtil;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.strings.StringUtil;
import org.jabref.model.util.FileHelper;

import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

Expand All @@ -32,40 +33,34 @@ class CiteKeyBasedFileFinder implements FileFinder {
}

@Override
public Map<BibEntry, List<Path>> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions) {
public Multimap<BibEntry, Path> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions) {
Objects.requireNonNull(directories);
Objects.requireNonNull(entries);

Map<BibEntry, List<Path>> result = new HashMap<>();
Multimap<BibEntry, Path> result = ArrayListMultimap.create();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you structure the code similar to the one in AutoSetLinks, you won't need a multimap:

 for (BibEntry entry : entries) {

                List<LinkedFile> linkedFiles = util.findassociatedNotLinkedFiles(entry, databaseContext, Globals.prefs.getFileDirectoryPreferences(), Globals.prefs.getAutoLinkPreferences(), ExternalFileTypes.getInstance());

                if (ce != null) {
                    for (LinkedFile linkedFile : linkedFiles) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is really good feedback. I just realized that the finder never gets a list of entries as argument but always a single entry. Thus the design can be simplified drastically without mulitmaps or maps to collections.


// First scan directories
Set<Path> filesWithExtension = findFilesByExtension(directories, extensions);

// Initialize Result-Set
for (BibEntry entry : entries) {
result.put(entry, new ArrayList<>());
}

// Now look for keys
nextFile:
for (Path file : filesWithExtension) {
String name = file.getFileName().toString();
int dot = name.lastIndexOf('.');
// First, look for exact matches:
String nameWithoutExtension = FileUtil.getBaseName(name);

// First, look for exact matches
for (BibEntry entry : entries) {
Optional<String> citeKey = entry.getCiteKeyOptional();
if ((citeKey.isPresent()) && !citeKey.get().isEmpty() && (dot > 0)
&& name.substring(0, dot).equals(citeKey.get())) {
if (StringUtil.isNotBlank(citeKey) && nameWithoutExtension.equals(citeKey.get())) {
result.get(entry).add(file);
continue nextFile;
}
}
// If we get here, we did not find any exact matches. If non-exact
// matches are allowed, try to find one:
// If we get here, we did not find any exact matches. If non-exact matches are allowed, try to find one
if (!exactKeyOnly) {
for (BibEntry entry : entries) {
Optional<String> citeKey = entry.getCiteKeyOptional();
if ((citeKey.isPresent()) && !citeKey.get().isEmpty() && name.startsWith(citeKey.get())) {
if (StringUtil.isNotBlank(citeKey) && matches(name, citeKey.get())) {
result.get(entry).add(file);
continue nextFile;
}
Expand All @@ -76,10 +71,21 @@ public Map<BibEntry, List<Path>> findAssociatedFiles(List<BibEntry> entries, Lis
return result;
}

private boolean matches(String filename, String citeKey) {
boolean startsWithKey = filename.startsWith(citeKey);
if (startsWithKey) {
// The file name starts with the key, that's already a good start
// However, we do not want to match "JabRefa" for "JabRef" since this is probably a file belonging to another entry published in the same time / same name
char charAfterKey = filename.charAt(citeKey.length());
return !StringUtil.contains(BibtexKeyPatternUtil.CHARS, charAfterKey);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better direclty use the java method: String.contains(Character.toString(c))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also take a look at the codacy

}
return false;
}

/**
* Returns a list of all files in the given directories which have one of the given extension.
*/
public Set<Path> findFilesByExtension(List<Path> directories, List<String> extensions) {
private Set<Path> findFilesByExtension(List<Path> directories, List<String> extensions) {
Objects.requireNonNull(extensions, "Extensions must not be null!");

BiPredicate<Path, BasicFileAttributes> isFileWithCorrectExtension = (path, attributes) ->
Expand Down
11 changes: 6 additions & 5 deletions src/main/java/org/jabref/logic/util/io/FileFinder.java
Original file line number Diff line number Diff line change
@@ -1,12 +1,14 @@
package org.jabref.logic.util.io;

import java.nio.file.Path;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.jabref.model.entry.BibEntry;

import com.google.common.collect.Multimap;

public interface FileFinder {

/**
Expand All @@ -17,10 +19,9 @@ public interface FileFinder {
* @param directories The root directories to search.
* @param extensions The extensions that are acceptable.
*/
Map<BibEntry, List<Path>> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions);
Multimap<BibEntry, Path> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions);

default List<Path> findAssociatedFiles(BibEntry entry, List<Path> directories, List<String> extensions) {
Map<BibEntry, List<Path>> associatedFiles = findAssociatedFiles(Collections.singletonList(entry), directories, extensions);
return associatedFiles.getOrDefault(entry, Collections.emptyList());
default Collection<Path> findAssociatedFiles(BibEntry entry, List<Path> directories, List<String> extensions) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use the collection interface here? List interface defines a somehow ordered collection
I would really prefer list here
https://stackoverflow.com/a/3317408/3450689

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The multimap only returns a collection because it actually is not implemented as a list (the sorting is random). But now we don't need multimaps anymore so we can go back to lists!

return findAssociatedFiles(Collections.singletonList(entry), directories, extensions).get(entry);
}
}
12 changes: 4 additions & 8 deletions src/main/java/org/jabref/logic/util/io/FileUtil.java
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
import org.jabref.model.entry.BibEntry;
import org.jabref.model.util.OptionalUtil;

import org.apache.commons.io.FilenameUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

Expand Down Expand Up @@ -65,13 +66,8 @@ public static Optional<String> getFileExtension(File file) {
/**
* Returns the name part of a file name (i.e., everything in front of last ".").
*/
public static String getFileName(String fileNameWithExtension) {
int dotPosition = fileNameWithExtension.lastIndexOf('.');
if (dotPosition >= 0) {
return fileNameWithExtension.substring(0, dotPosition);
} else {
return fileNameWithExtension;
}
public static String getBaseName(String fileNameWithExtension) {
return FilenameUtils.getBaseName(fileNameWithExtension);
}

/**
Expand All @@ -80,7 +76,7 @@ public static String getFileName(String fileNameWithExtension) {
* Currently, only the length is restricted to 255 chars, see MAXIMUM_FILE_NAME_LENGTH.
*/
public static String getValidFileName(String fileName) {
String nameWithoutExtension = getFileName(fileName);
String nameWithoutExtension = getBaseName(fileName);

if (nameWithoutExtension.length() > MAXIMUM_FILE_NAME_LENGTH) {
Optional<String> extension = getFileExtension(fileName);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,7 @@
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
Expand All @@ -18,6 +16,8 @@
import org.jabref.model.entry.BibEntry;
import org.jabref.model.strings.StringUtil;

import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

Expand Down Expand Up @@ -65,10 +65,10 @@ public static String expandBrackets(String bracketString, BibEntry entry, BibDat
}

@Override
public Map<BibEntry, List<Path>> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions) {
Map<BibEntry, List<Path>> res = new HashMap<>();
public Multimap<BibEntry, Path> findAssociatedFiles(List<BibEntry> entries, List<Path> directories, List<String> extensions) {
Multimap<BibEntry, Path> res = ArrayListMultimap.create();
for (BibEntry entry : entries) {
res.put(entry, findFiles(entry, extensions, directories));
res.putAll(entry, findFiles(entry, extensions, directories));
}
return res;
}
Expand Down
4 changes: 4 additions & 0 deletions src/main/java/org/jabref/model/strings/StringUtil.java
Original file line number Diff line number Diff line change
Expand Up @@ -719,4 +719,8 @@ public static List<String> getStringAsWords(String text) {
public static boolean containsIgnoreCase(String text, String searchString) {
return StringUtils.containsIgnoreCase(text, searchString);
}

public static boolean contains(String string, char character) {
return string.indexOf(character) != -1;
}
}
2 changes: 1 addition & 1 deletion src/test/java/org/jabref/BibtexTestData.java
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ public static BibEntry getBibtexEntry(ImportFormatPreferences importFormatPrefer
return database.getEntryByKey("HipKro03").get();
}

public static BibDatabase getBibtexDatabase(ImportFormatPreferences importFormatPreferences) throws IOException {
private static BibDatabase getBibtexDatabase(ImportFormatPreferences importFormatPreferences) throws IOException {
String article = "@ARTICLE{HipKro03,\n" + " author = {Eric von Hippel and Georg von Krogh},\n"
+ " title = {Open Source Software and the \"Private-Collective\" Innovation Model: Issues for Organization Science},\n"
+ " journal = {Organization Science},\n" + " year = {2003},\n" + " volume = {14},\n"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,17 +38,16 @@ public void setUp() throws Exception {
entry.setCiteKey("CiteKey");
file = folder.newFile("CiteKey.pdf").toPath();
when(databaseContext.getFileDirectoriesAsPaths(any())).thenReturn(Collections.singletonList(folder.getRoot().toPath()));
when(externalFileTypes.getExternalFileTypeSelection()).thenReturn(new TreeSet<>(externalFileTypes.getDefaultExternalFileTypes()));

when(externalFileTypes.getExternalFileTypeSelection()).thenReturn(new TreeSet<>(ExternalFileTypes.getDefaultExternalFileTypes()));
}

@Test
public void test() {
//Due to mocking the externalFileType class, the file extension will not be found
List<LinkedFile> expected = Collections.singletonList(new LinkedFile("", file.toString(), ""));

AutoSetFileLinksUtil util = new AutoSetFileLinksUtil();
List<LinkedFile> actual = util.findassociatedNotLinkedFiles(entry, databaseContext, fileDirPrefs, autoLinkPrefs, externalFileTypes);
AutoSetFileLinksUtil util = new AutoSetFileLinksUtil(databaseContext, fileDirPrefs, autoLinkPrefs, externalFileTypes);
List<LinkedFile> actual = util.findAssociatedNotLinkedFiles(entry);
assertEquals(expected, actual);
}

Expand Down
Loading