Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve encoding while copy and pasting in maintable #5048

Merged
merged 1 commit into from
Jun 28, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 50 additions & 36 deletions src/main/java/org/jabref/gui/ClipBoardManager.java
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
package org.jabref.gui;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
Expand Down Expand Up @@ -28,13 +30,11 @@
import org.slf4j.LoggerFactory;

public class ClipBoardManager {

public static final DataFormat XML = new DataFormat("application/xml");

private static final Logger LOGGER = LoggerFactory.getLogger(ClipBoardManager.class);

private final Clipboard clipboard;

private final ImportFormatReader importFormatReader;

public ClipBoardManager() {
Expand Down Expand Up @@ -63,9 +63,8 @@ public String getContents() {
String result = clipboard.getString();
if (result == null) {
return "";
} else {
return result;
}
return result;
}

public void setHtmlContent(String html) {
Expand All @@ -89,40 +88,55 @@ public void setContent(List<BibEntry> entries) throws IOException {
clipboard.setContent(content);
}

public List<BibEntry> extractEntries() {
public List<BibEntry> extractData() {
Object entries = clipboard.getContent(DragAndDropDataFormats.ENTRIES);

if (entries == null) {
return handleStringData(clipboard.getString());
}
return handleBibTeXData((String) entries);
}

private List<BibEntry> handleBibTeXData(String entries) {
BibtexParser parser = new BibtexParser(Globals.prefs.getImportFormatPreferences(), Globals.getFileUpdateMonitor());
if (entries != null) {
// We have determined that the clipboard data is a set of entries (serialized as a string).
try {
return parser.parseEntries((String) entries);
} catch (ParseException ex) {
LOGGER.error("Could not paste", ex);
}
} else {
String data = clipboard.getString();
if (data != null) {
try {
// fetch from doi
Optional<DOI> doi = DOI.parse(data);
if (doi.isPresent()) {
LOGGER.info("Found DOI in clipboard");
Optional<BibEntry> entry = new DoiFetcher(Globals.prefs.getImportFormatPreferences()).performSearchById(doi.get().getDOI());
return OptionalUtil.toList(entry);
} else {
try {
UnknownFormatImport unknownFormatImport = importFormatReader.importUnknownFormat(data);
return unknownFormatImport.parserResult.getDatabase().getEntries();
} catch (ImportException e) {
// import failed and result will be empty
}
}
} catch (FetcherException ex) {
LOGGER.error("Error while fetching", ex);
}
}
try {
return parser.parseEntries(new ByteArrayInputStream(entries.getBytes(StandardCharsets.UTF_8)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Stefan,
Nice to see some commits from you ;-)
I'm not sure whether this is not breaking then the parsing from Strings in other encodings... Have you Hi checked it with some ISO encoded special characters?

Copy link
Member Author

@stefan-kolb stefan-kolb Jun 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISO special characters should be mappable in UTF-8, right?! I guess currently the default encoding for copying to the clipboard is UTF-8, so pasting with UTF-8 made sense to me. If we want to preserve the real encoding, we probably need to pass the encoding with the DataFlavor somehow? Like https://stackoverflow.com/questions/29651301/text-copied-from-jtextarea-have-broken-encoding-after-paste-in-foxpro-applicatio

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always use UTF-8 internally, so this makes indeed sense. Only use case: What if the users has selected a different encoding for the database in the prefs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to preserve the real encoding, we probably need to pass the encoding with the DataFlavor somehow? Like https://stackoverflow.com/questions/29651301/text-copied-from-jtextarea-have-broken-encoding-after-paste-in-foxpro-applicatio

I guess we need to do something like this then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is old swing stuff, don't do this.
It should be sufficient to use the encoding in this method and apply the same encoding when pasting.

public void setContent(List<BibEntry> entries) throws IOException {
final ClipboardContent content = new ClipboardContent();
BibEntryWriter writer = new BibEntryWriter(new LatexFieldFormatter(Globals.prefs.getLatexFieldFormatterPreferences()), false);
String serializedEntries = writer.serializeAll(entries, BibDatabaseMode.BIBTEX);
content.put(DragAndDropDataFormats.ENTRIES, serializedEntries);
content.putString(serializedEntries);
clipboard.setContent(content);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot spot any place during serialization where we use an encoding different to the default encoding which seems to be UTF-8 then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't we need to pay attention to the database encoding since this option only controls the encoding when the database is written to the file (and does not affect the entries in itself).

It may be worth a try to use parser.parseEntries(new StringReader(entries)) which would circumvent the need to specify an encoding. But whether this works depends on if the JavaFX clipboard is able to correctly encode/decode the string (I tried to dig into the javafx code but it seems that they let the os handle the encoding). Otherwise your solution looks good to me.

} catch (ParseException ex) {
LOGGER.error("Could not paste", ex);
return Collections.emptyList();
}
}

private List<BibEntry> handleStringData(String data) {
if (data == null || data.isEmpty()) {
return Collections.emptyList();
}

Optional<DOI> doi = DOI.parse(data);
if (doi.isPresent()) {
return fetchByDOI(doi.get());
}

return tryImportFormats(data);
}

private List<BibEntry> tryImportFormats(String data) {
try {
UnknownFormatImport unknownFormatImport = importFormatReader.importUnknownFormat(data);
return unknownFormatImport.parserResult.getDatabase().getEntries();
} catch (ImportException ignored) {
return Collections.emptyList();
}
}

private List<BibEntry> fetchByDOI(DOI doi) {
LOGGER.info("Found DOI in clipboard");
try {
Optional<BibEntry> entry = new DoiFetcher(Globals.prefs.getImportFormatPreferences()).performSearchById(doi.getDOI());
return OptionalUtil.toList(entry);
} catch (FetcherException ex) {
LOGGER.error("Error while fetching", ex);
return Collections.emptyList();
}
return Collections.emptyList();
}
}
2 changes: 1 addition & 1 deletion src/main/java/org/jabref/gui/maintable/MainTable.java
Original file line number Diff line number Diff line change
Expand Up @@ -209,7 +209,7 @@ private void clearAndSelectLast() {

public void paste() {
// Find entries in clipboard
List<BibEntry> entriesToAdd = Globals.clipboardManager.extractEntries();
List<BibEntry> entriesToAdd = Globals.clipboardManager.extractData();

if (!entriesToAdd.isEmpty()) {
// Add new entries
Expand Down