EN DASH disappears from PDF bookmarks in Japanese documents #4187

Closed
jfbu opened this issue Oct 24, 2017 · 2 comments

@jfbu
Contributor

jfbu commented Oct 24, 2017

Consider the following index.rst:

Welcome to FOO's documentation!
===============================

もしもし–––さようなら
---------------------

何もない

with language = 'ja'. Running make latexpdf then produces the following console output:

dvipdfmx:warning: No character mapping available.
 CMap name: EUC-UCS2
 input str: <85>

After investigation, this warning comes from the PDF bookmarks. The EN DASH characters are fine in the PDF body, but absent from the bookmark. Screenshot:

[Image: capture d ecran 2017-10-24 a 17 40 52]

Sphinx escapes U+2013 (EN DASH) to \textendash{} in LaTeX, but this is not the cause of the problem. It also happens when using the approach at https://github.com/jfbu/sphinx/tree/fixplatexendash and inserting a literal EN DASH in the .tex file.
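For readers unfamiliar with this escaping step, here is a minimal sketch of what mapping U+2013 to \textendash{} looks like. The table `TEX_ESCAPES` and the function `escape_for_latex` are hypothetical names used only for illustration; this is not Sphinx's actual implementation, and only the EN DASH entry is taken from the report.

```python
# Hypothetical translation table: only the EN DASH mapping comes from the report.
TEX_ESCAPES = {0x2013: r'\textendash{}'}


def escape_for_latex(text: str) -> str:
    """Replace each U+2013 (EN DASH) with the LaTeX macro \\textendash{}."""
    return text.translate(TEX_ESCAPES)


# The section title from the index.rst above, with the dashes written as escapes.
title = 'もしもし\u2013\u2013\u2013さようなら'
print(escape_for_latex(title))
```

As noted in the report, the bookmark problem shows up with or without this escaping, so the sketch only illustrates the step that was ruled out.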

Environment info

  • OS: Mac
  • Python version: 3.5.4
  • Sphinx version: 1.6.5
@jfbu jfbu added this to the 1.7 milestone Oct 24, 2017
@jfbu
Contributor Author

jfbu commented Oct 24, 2017

The issue may be related to the following code:

  \RequirePackage{atbegshi}
  \ifx\ucs\undefined
    \ifnum 42146=\euc"A4A2
      \AtBeginShipoutFirst{\special{pdf:tounicode EUC-UCS2}}
    \else
      \AtBeginShipoutFirst{\special{pdf:tounicode 90ms-RKSJ-UCS2}}
    \fi
  \else
    \AtBeginShipoutFirst{\special{pdf:tounicode UTF8-UCS2}}
  \fi

Is this a problem Japanese LaTeX users know how to solve?

Using \PassOptionsToPackage{pdfencoding=unicode}{hyperref} fixes the EN DASH problem in bookmarks but destroys all Japanese characters there...
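To try this experiment from a Sphinx project, the option can presumably be passed through the `passoptionstopackages` key of `latex_elements` in conf.py. The fragment below is a sketch for reproducing the observation, not a recommended fix, since it breaks the Japanese characters in the bookmarks as described above.

```python
# conf.py fragment (sketch) reproducing the experiment described above.
language = 'ja'

latex_elements = {
    # Placed early in the preamble, so the option reaches hyperref when it loads.
    'passoptionstopackages': r'\PassOptionsToPackage{pdfencoding=unicode}{hyperref}',
}
```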

@jfbu jfbu added help wanted and removed type:bug labels Oct 24, 2017
@jfbu jfbu modified the milestones: 1.7, 2.0 Oct 24, 2017
@tk0miya
Member

tk0miya commented Mar 10, 2019

To resolve this, we need to migrate to upLaTeX. With pLaTeX, EUC-JP is used as the internal encoding of PDF bookmarks, and that encoding does not contain EN DASH. As a workaround, we could replace EN DASH with another character (e.g. an ASCII hyphen) before writing the .tex code. But there is no strong reason to keep using pLaTeX, so it would be better to migrate.
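The workaround mentioned above can be sketched as follows. The helper names `can_encode` and `replace_en_dash` are hypothetical, and Python's `euc_jp` codec is used here only as a stand-in for the EUC-UCS2 CMap that dvipdfmx consults; the two are assumed, not guaranteed, to agree on this character.

```python
EN_DASH = '\u2013'


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def replace_en_dash(text: str) -> str:
    """Workaround sketch: swap each EN DASH for an ASCII hyphen."""
    return text.replace(EN_DASH, '-')


# The section title from the report, with the dashes written as escapes.
title = 'もしもし\u2013\u2013\u2013さようなら'
print(can_encode(title, 'euc_jp'))
print(can_encode(replace_en_dash(title), 'euc_jp'))
```

This changes the visible text of the title, which is one reason the migration to upLaTeX is preferred over the replacement.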

@tk0miya tk0miya modified the milestones: 2.0.0, 2.1.0 Mar 18, 2019
@tk0miya tk0miya modified the milestones: 2.1.0, 2.2.0 May 28, 2019
@tk0miya tk0miya modified the milestones: 2.2.0, 3.0.0 Aug 18, 2019
tk0miya added a commit to tk0miya/sphinx that referenced this issue Mar 14, 2020
@tk0miya tk0miya modified the milestones: 3.0.0, 3.1.0 Mar 14, 2020
tk0miya added a commit to tk0miya/sphinx that referenced this issue May 23, 2020
tk0miya added a commit that referenced this issue May 23, 2020
Fix #4187: latex: EN DASH disappears from PDF bookmarks in Japanese documents
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 24, 2021