Copying text goes wrong when using Source Han Sans / Source Han Serif under XeTeX #286
Comments
This is probably a XeTeX bug. It can be reproduced with plain XeTeX alone.
% XeTeX
\font\1="Source Han Sans SC"
\font\2="Source Han Serif SC"
\font\3="Microsoft YaHei"
{\1 孤立子 ABC} \par
{\2 孤立子 ABC} \par
{\3 孤立子 ABC} \par
\bye
In any case, the ctex-kit project cannot fix this kind of problem. Also, it does not seem to be …
% upTeX + dvipdfmx
\font\rm=upzhserif-h
\font\it=upzhserifit-h
\font\bf=upzhserifb-h
\special{pdf:mapline upserif-h unicode SourceHanSansSC-Regular.otf}
\special{pdf:mapline upserifit-h unicode SourceHanSerifSC-Regular.otf}
\special{pdf:mapline upserifb-h unicode msyh.ttc}
{\rm 孤立子 ABC} \par
{\it 孤立子 ABC} \par
{\bf 孤立子 ABC} \par
\bye
Or the following, based on …
% latex + dvipdfmx
\documentclass{article}
\usepackage{zhmCJK}
\setCJKmainfont{SourceHanSansSC-Regular.otf}
\setCJKsansfont{SourceHanSerifSC-Regular.otf}
\setCJKmonofont{msyh.ttc}
\begin{document}
{\rmfamily 孤立子 ABC} \par
{\sffamily 孤立子 ABC} \par
{\ttfamily 孤立子 ABC} \par
\end{document}
Ma Qiyuan (马起园) has confirmed that this is a XeTeX bug in its CMap handling.
As a workaround, does … I should add that although the ToUnicode map is incorrect, Preview on macOS seems to handle copying perfectly fine. A quick fix would be to simply blacklist the KANGXI RADICALS region (U+2F00 to U+2FD5), since those characters are rarer. A smarter fix would be to leverage the ActualText Unicode string to generate a better ToUnicode map.
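To make the blacklist idea concrete, here is a minimal Python sketch. It is not the actual dvipdfm-x or XeTeX code. It assumes that the ToUnicode map is produced by inverting the font's code point to glyph mapping, and that the lowest code point wins when several code points share a glyph. The function name `build_reverse_map` and the glyph IDs in `toy_cmap` are made up for illustration.

```python
# Kangxi Radicals block, U+2F00..U+2FD5 inclusive.
KANGXI_RADICALS = range(0x2F00, 0x2FD6)


def build_reverse_map(cmap, blacklist=KANGXI_RADICALS):
    """Invert a {code point: glyph ID} map into {glyph ID: code point}.

    Assumption: without a blacklist, the lowest code point that maps to a
    glyph wins. With the blacklist, a blacklisted code point is replaced as
    soon as a non-blacklisted code point maps to the same glyph.
    """
    reverse = {}
    for codepoint in sorted(cmap):
        glyph = cmap[codepoint]
        current = reverse.get(glyph)
        if current is None:
            # First code point seen for this glyph.
            reverse[glyph] = codepoint
        elif current in blacklist and codepoint not in blacklist:
            # Prefer the ordinary character over the Kangxi radical.
            reverse[glyph] = codepoint
    return reverse


# Hypothetical glyph IDs: the radical and the ideograph share one glyph.
toy_cmap = {
    0x2F26: 101, 0x5B50: 101,  # KANGXI RADICAL CHILD / 子
    0x2F74: 102, 0x7ACB: 102,  # KANGXI RADICAL STAND / 立
    0x2F00: 103,               # a glyph reachable only from a radical
}

reverse = build_reverse_map(toy_cmap)
for glyph, codepoint in sorted(reverse.items()):
    print(glyph, f"U+{codepoint:04X}")
```

A glyph that is reachable only from a Kangxi radical keeps that radical, so the blacklist changes the result only where an ordinary Han character is available as an alternative.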
By the way, I noticed that you have modified TeX Live. When will this change take effect?
I modified my branch of texlive as an experiment, but I'm not yet convinced that this is the best approach. In general, TeX Live has an annual release schedule, so any changes made now will be released next year.
It seems more reasonable to fix this on the XeTeX side. After all, (up)latex + dvipdfmx does not go wrong; it is XeTeX that loses some information during compilation. Patching dvipdfm-x feels a bit odd.
For the record, I'm not able to reproduce the issue, even with Adobe Reader on macOS. So at this point I cannot confirm that the ToUnicode map is the problem here.
I sent an email to the XeTeX mailing list (http://tug.org/pipermail/xetex/2017-June/027142.html) in case anyone is willing to help.
http://tug.org/pipermail/xetex/2017-June/027143.html claimed that …
I have tried some PDF viewers. With … My OS is Windows 10 1607 (build 14393.1198).
Thanks for testing again. I will see if Akira-san can build a new w32tex binary for …
Akira-san kindly confirmed that my patch indeed helped the … Feel free to try. Since this is not a ctex-kit bug, I will close this issue and move further discussion elsewhere. If you have more to add, please reply to the email thread.
I ran a test on Windows, …
I have tested on TeX Live 2018 (revision 47303 2018-04-05 19:52:22 +0200) and XeTeX (3.14159265-2.6-0.99999). Test file:
% Compiled with XeTeX
% \XeTeXgenerateactualtext=0 or 1
\font\1="Source Han Sans SC" % 1.004
\font\2="Source Han Serif SC" % 1.001
\font\3="Microsoft YaHei"
\def\KANGXI{%
⽴% U+2F74
⼦% U+2F26
}
\def\HAN{%
立% U+7ACB
子% U+5B50
}
{\1 \KANGXI ~ \HAN} \par
{\2 \KANGXI ~ \HAN} \par
{\3 \KANGXI ~ \HAN} \par
\bye
Copying the strings from the PDF files into https://r12a.github.io/app-conversion/ gives the following results:
% \XeTeXgenerateactualtext=1
U+2F74 U+2F26 U+7ACB U+5B50
U+2F74 U+2F26 U+7ACB U+5B50
U+2F74 U+2F26 U+7ACB U+5B50
% \XeTeXgenerateactualtext=0
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
U+107419 U+107316 U+7ACB U+5B50
% \XeTeXgenerateactualtext=1
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
?? U+7ACB U+5B50
% \XeTeXgenerateactualtext=0
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
?? U+7ACB U+5B50
% \XeTeXgenerateactualtext=1
U+2F74 U+2F26 U+7ACB U+5B50
U+2F74 U+2F26 U+7ACB U+5B50
U+2F74 U+2F26 U+7ACB U+5B50
% \XeTeXgenerateactualtext=0
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
U+7419 U+7316 U+7ACB U+5B50
% \XeTeXgenerateactualtext=1
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50
% \XeTeXgenerateactualtext=0
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50 U+7ACB U+5B50
U+7ACB U+5B50
Notice that for ordinary Han characters, all programs give the correct results; for Kangxi radicals, however, the mapping still seems to be a mess.
OS: Windows 10 1709 (build 16299.334)
PS: @jjgod I don't think ctex-kit is the right place to discuss this issue. Could you point me to a better one (the mailing list seems to be outdated)?
PS: Judging from a quick search of the SumatraPDF source code, it still does not support … Regarding …
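The code is as follows:
After compiling with XeLaTeX, the three Han characters copied from the first two lines are U+5B64, U+2F74, and U+2F26, while those copied from the last two lines are U+5B64, U+7ACB, and U+5B50. Under Source Han Sans / Source Han Serif, the two characters “立子” have turned into Kangxi radicals (U+2F74 and U+2F26). The same problem occurs when only fontspec is used. Compiling with LuaLaTeX gives the correct result.
As for PDF viewers, both Adobe Reader and SumatraPDF show the error.
Platform: Windows 10, TeX Live 2017, XeTeX 3.14159265-2.6-0.99998, LuaTeX 1.0.4.
Fonts: Source Han Sans 1.004, Source Han Serif 1.000.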
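For anyone who wants to check copied text without an online converter, a small Python sketch along the following lines may help. It only relies on the standard `unicodedata` module; the function name `describe` and the category labels are my own choices, not part of any tool mentioned in this thread. It lists each code point, flags Kangxi radicals and private-use characters, and shows the NFKC-normalized form, which maps a Kangxi radical back to the corresponding ordinary Han character.

```python
import unicodedata


def describe(text):
    """Return (code point, kind, NFKC form) for each non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        cp = ord(ch)
        if 0x2F00 <= cp <= 0x2FD5:
            kind = "Kangxi radical"
        elif unicodedata.category(ch) == "Co":
            kind = "private use"
        else:
            kind = "other"
        # NFKC applies the compatibility decomposition, so a Kangxi
        # radical becomes the matching unified ideograph.
        rows.append((f"U+{cp:04X}", kind, unicodedata.normalize("NFKC", ch)))
    return rows


# Text as copied from the affected PDF: 孤 followed by two Kangxi radicals.
copied = "\u5B64\u2F74\u2F26"
for code, kind, normalized in describe(copied):
    print(code, kind, normalized)
```

Note that NFKC normalization also rewrites many other compatibility characters, so it is useful for diagnosing the copied text but is not a general substitute for a correct ToUnicode map.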