YaLafi to handle LaTeX syntax (recognize begin/end of environments) #239

Closed
be4 opened this issue Jan 25, 2023 · 10 comments
Labels
type: support request A question on how to use YaLafi

Comments


be4 commented Jan 25, 2023

I'd like to check a larger document. The corresponding LaTeX sources (several tex files are included in the main document) contain many comments, and the main document has a switch at the beginning so that both a German and an English version of the PDF can be created from the same sources. See the attached minimal example, where \begin{en} ... \end{en} is such an environment, which is only processed by LaTeX if the English language is selected. A second way to keep the languages apart is \iftoggle{en} (also see the minimal example).

However, I still get way too many false positives because YaLafi also reads the comments and the switched-off language, and sends them to LanguageTool.

For the minimal example, here is the call to YaLafi and the output (in a bash terminal on Ubuntu):

```
$ python3 -m yalafi.shell --lt-directory ~/Downloads/LanguageTool-6.0/ --output html --include --language en-US main_two_languages.tex > t1.html
=== checking for file inclusions ... main_two_languages.tex
=== main_two_languages.tex
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.comment'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.ifthen'
*** yalafi.shell: warning:
*** could not load module 'yalafi.packages.etoolbox'
Expected text language: English (US)
Working on STDIN...
```

How can I make the packages "comment", "ifthen" and "etoolbox" be known to YaLafi?

In addition to the minimal example "main_two_languages.tex", I attached the output (the file "t1.html"): 15 lines of text are output there, with only one line needed: the one with the misspelled word "misspellingx" in line 56.

If I just missed an option YaLafi already has or if you need more information, please send me a message. Thanks a lot.

PS: Concerning the minimal example: As GitHub said, "tex" and "html" are not supported as file types for an attachment, the two sample files are in the attached zip file.
samplefiles.zip

Owner

torik42 commented Jan 26, 2023

Thank you for opening this issue. To add these packages to YaLafi, you would basically need to implement a Python version of them. How to do that is sketched here (see also here). However, I don’t think that is needed in your case. If a command (call it \unknown) is unknown to YaLafi, it will be replaced by its arguments (i.e. it behaves like \newcommand{\unknown}[1]{#1}).

In your case, I would suggest using \LTadd (read the explanation here). For LaTeX you define it as \newcommand{\LTadd}[1]{}, but YaLafi will ignore this definition. Then you redefine all your environments within \LTadd such that both are shown to YaLafi but not to LaTeX (just use \newenvironment, which will be parsed by YaLafi), in the following way:

  1. Load babel before the environments (and set the language to ngerman).
  2. Make the de environment without further settings (maybe use \selectlanguage to set the language to ngerman).
  3. Make the en environment and use the babel otherlanguage environment or \selectlanguage command within it to set the language to english.

Now YaLafi should check both languages simultaneously.

If you really want to check only one language at a time, one would at least need to implement an LTskip environment such that you could replace the de environment with an LTskip environment. You would still need to toggle that by hand. Support for the comment package and basic functions from the ifthen package could in general be added to YaLafi, I might add these as new issues. Or you define two custom packages, one removing the de, the other the en environment and call YaLafi with option --packages *,.your_en_package, where your_en_package.py is in the same directory and defines the de environment to vanish (see circuitikz.py for how to do that).

Feel free to ask further questions if something is still unclear. I have also not explicitly tested the above.

PS: You can use `...` for code in GitHub Markdown and the following for code blocks with syntax highlighting:

```latex
\documentclass{scrartcl}
...
```

Author

be4 commented Jan 26, 2023

Hello Torik42,
thanks for the quick answer. I am not able to implement a skip function for an included package, but I am willing to work as a tester for LaTeX files when a new version of YaLafi handles this.

If for the given file (minimal example in "samplefiles.zip" in my last comment), all comments and all German parts would be stripped out, this would be ok for me. The attachment "samplefiles-filtered.zip" in this comment contains two samples of what I would expect as a result from the filters and what then can be handed over to LanguageTool.

Is this something what YaLafi already can handle, or do I have to wait till you or one of the other contributors implemented this new issue? I really appreciate this work and if you need a beta tester, please let me know.
samplefiles-filtered.zip

Owner

torik42 commented Jan 31, 2023

My initial suggestion was to check both languages simultaneously by just setting the language in each of your two environments. Indeed, YaLafi can handle babel's \selectlanguage command. Unfortunately – and I wasn’t aware of that – \newenvironment is not defined, so we cannot do that in LaTeX code. Instead, one needs to write a Python module.

You can use the following python code saved as en.py in the directory from where you call YaLafi:

```python
from yalafi.defs import Environ, InitModule

require_packages = []

def init_module(parser, options, position):
    parms = parser.parms

    macros_latex = ''

    macros_python = []

    # Remove every 'de' environment together with its content.
    environments = [
        Environ(parms, 'de', remove=True, add_pars=False),
    ]

    return InitModule(macros_latex=macros_latex, macros_python=macros_python,
                      environments=environments)
```

It will remove the de environments if you call yalafi.shell with option --packages .en. If you also pass option --multi-language, YaLafi will correctly set the language based on the \usepackage[…]{babel} in the en environment. With the same file saved as de.py, where you change 'de' to 'en' in Environ(…) you can check the German text.
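Since en.py and de.py differ only in the environment name passed to Environ(…), the two files can also be generated from one template. The following is a minimal sketch, not part of YaLafi: `render_module` and `write_modules` are hypothetical helper names, and the generated text simply mirrors the module shown above.

```python
from pathlib import Path

# Template mirroring the en.py module from this thread; only the
# list of removed environments varies between en.py and de.py.
MODULE_TEMPLATE = '''\
from yalafi.defs import Environ, InitModule

require_packages = []

def init_module(parser, options, position):
    parms = parser.parms
    environments = [
{entries}
    ]
    return InitModule(macros_latex='', macros_python=[],
                      environments=environments)
'''


def render_module(hidden_envs):
    """Return module source that removes each environment in hidden_envs."""
    entries = "\n".join(
        f"        Environ(parms, {name!r}, remove=True, add_pars=False),"
        for name in hidden_envs
    )
    return MODULE_TEMPLATE.format(entries=entries)


def write_modules(directory):
    """Write en.py (hides 'de') and de.py (hides 'en') into directory."""
    directory = Path(directory)
    written = []
    for module_name, hidden in (("en", ["de"]), ("de", ["en"])):
        path = directory / f"{module_name}.py"
        path.write_text(render_module(hidden), encoding="utf-8")
        written.append(path)
    return written
```

Calling `write_modules(".")` in the directory from which YaLafi is started would create both files, which can then be selected with --packages .en or --packages .de. Passing more names to `render_module` yields a module that hides several environments at once.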

Since YaLafi does not know the ifthen package, the toggle does not work. But you could just define \newcommand{\deen}[2]{#1} and \newcommand{\deen}[2]{#2} in the de and en environments, respectively. Then use \deen{deutscher Text}{english text} instead of \iftoggle{de}{deutscher Text}{english text}. YaLafi will properly parse the definition of \deen.

Alternatively, you can check both languages simultaneously. For that, hide the preamble from YaLafi by wrapping it in the following pair of comments:

```latex
%%% LT-SKIP-BEGIN
%%% LT-SKIP-END
```

Then, use the following saved as auto.py:

```python
from yalafi.defs import Environ, InitModule

require_packages = []

def init_module(parser, options, position):
    parms = parser.parms

    macros_latex = ''

    macros_python = []

    # Keep both environments, but switch the language at the start of each.
    environments = [
        Environ(parms, 'en', remove=False, repl=r'\selectlanguage{english}', add_pars=False),
        Environ(parms, 'de', remove=False, repl=r'\selectlanguage{ngerman}', add_pars=False),
    ]

    return InitModule(macros_latex=macros_latex, macros_python=macros_python,
                      environments=environments)
```

and call YaLafi with --packages .auto and --multi-language. You also need to add

```latex
\newcommand{\LTadd}[1]{}
\LTadd{\usepackage[english,ngerman]{babel}}
\LTadd{\newcommand{\deen}[2]{{\selectlanguage{ngerman}#1} {\selectlanguage{english}#2}}}
```

to your preamble so that YaLafi loads babel with the correct settings (again assuming you use \deen).

Edit: Fix missing bracket.

Owner

torik42 commented Feb 1, 2023

You could probably also skip the last part and put the commands within \LTadd into

```python
    # Raw string, so that Python does not interpret \u and \n as escape sequences.
    macros_latex = r'''
        \usepackage[english,ngerman]{babel}
        \newcommand{\deen}[2]{{\selectlanguage{ngerman}#1} {\selectlanguage{english}#2}}
    '''
```

within `auto.py`.

Author

be4 commented Feb 3, 2023

Hello Torik42,

thanks a lot for your last answer.

(1) Using en.py as an additional package, the following YaLafi command worked very well. It now ignores everything in the "de" environments:

```
python3 -m yalafi.shell --lt-directory ~/Downloads/LanguageTool-6.0/ --output html --language en-US --packages .en m-orig.tex > m-orig_03en.html
```

The same worked independently when I wanted to ignore the "en" environments.
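The per-language calls differ only in a few options, so they can be assembled by a small helper. This is a sketch under assumptions: `yalafi_command` is a hypothetical name, the flags are only those that appear in this thread, and `de-DE` as the LanguageTool code for the German run is an assumption not stated above. The helper only builds the argument list; it does not run YaLafi.

```python
import shlex


def yalafi_command(main_tex, lt_directory, language,
                   packages=(), multi_language=False):
    """Build the argument list for one yalafi.shell run (nothing is executed)."""
    cmd = [
        "python3", "-m", "yalafi.shell",
        "--lt-directory", lt_directory,
        "--output", "html",
        "--language", language,
    ]
    if packages:
        # Local modules are referenced with a leading dot, e.g. ".en".
        cmd += ["--packages", ",".join(packages)]
    if multi_language:
        cmd.append("--multi-language")
    cmd.append(main_tex)
    return cmd


if __name__ == "__main__":
    lt_dir = "~/Downloads/LanguageTool-6.0/"
    # "de-DE" is an assumed language code for the German check.
    for lang, module in (("en-US", ".en"), ("de-DE", ".de")):
        # shlex.join is for display only: it quotes "~", so a pasted command
        # would not expand it. When passing the list to subprocess.run,
        # expand it first with os.path.expanduser.
        print(shlex.join(yalafi_command("m-orig.tex", lt_dir, lang, [module])))
```

As in the command above, the HTML report is written to standard output and has to be redirected to a file by the caller.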

(2) I modified your suggestion to substitute \iftoggle in all tex files with a newcommand \deen (since YaLafi does not know the ifthen package).

It was easier to do something similar only once in the document header:

  • Apply \let or the newer \NewCommandCopy to \iftoggle.
  • Then renew \iftoggle and set the parameter according to the selected language.

```latex
\NewCommandCopy{\oldiftoggle}{\iftoggle}
\renewcommand{\iftoggle}[3]{\textbf{#3}}
```

See the attached minimal sample "m-language-switch.tex" in the zip file.

Thanks again for your great help. These changes reduced the number of "problems" found by LanguageTool in my case in a "real" LaTeX document from 10901 to 1665. So I got rid of almost all false positives. You already added dealing with the comment package as another issue which is the main reason for remaining false positives in my case.

We recently had a discussion about language checks of LaTeX documents at DANTE. If you would like to give an introduction to YaLafi please let me know. You could write to my email address directly.

m-language-switch.zip

Owner

torik42 commented Feb 3, 2023

I am glad that it worked.

> (2) I modified your suggestion to substitute \iftoggle in all tex files with a newcommand \deen (since YaLafi does not know the ifthen package).

If you want to keep the \iftoggle in the LaTeX code, you could also use ~~\LTskip{\renewcommand{\iftoggle}[3]{#3}}~~ \LTadd{\renewcommand{\iftoggle}[3]{#3}} (see how to define ~~\LTskip~~ \LTadd above). This would only be read by YaLafi, not LaTeX.

> Thanks again for your great help. These changes reduced the number of "problems" found by LanguageTool in my case in a "real" LaTeX document from 10901 to 1665. So I got rid of almost all false positives. You already added dealing with the comment package as another issue which is the main reason for remaining false positives in my case.

If you want to remove more comment environments, you can – for now – just add more

     Environ(parms, 'environment_to_be_ignored', remove=True, add_pars=False),

statements to en.py. Or copy en.py to something like mycomment.py, add all environments you want to ignore, and call YaLafi with option --packages .en,.mycomment.

Author

be4 commented Feb 3, 2023

Hello Torik42,

  1. Removing the comment environments via the Environ command you suggested worked well. Thanks.

  2. However, I didn't manage to get \LTskip to work on my side.
    Could you please add a minimal working example including the enhanced file en.py?
    Please note: I don't want to make changes in the many tex files included by the main.tex file (they contain > 1000 occurrences of \iftoggle). All changes should be in the header of the main file or within additional files like en.py.

Thanks again for your help.

Owner

torik42 commented Feb 3, 2023

That was my fault. As initially written, it should be \LTadd. You define it in LaTeX as \newcommand{\LTadd}[1]{} so that arguments are ignored by LaTeX. Then you put \LTadd{\renewcommand{\iftoggle}[3]{#3}} and YaLafi will parse \renewcommand{\iftoggle}[3]{#3}. This you only put in the en environment within the preamble.

\LTskip has the exact opposite meaning: the argument is ignored by YaLafi, but parsed by LaTeX if defined as \newcommand{\LTskip}[1]{#1}. See here. I hope it also works without a minimal example.
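The difference between the two macros can be illustrated with a toy model. The sketch below is not how YaLafi or LaTeX work internally; it only encodes the rule described above (LaTeX drops the argument of \LTadd and keeps that of \LTskip, YaLafi does the reverse). All function names are hypothetical, and the \usepackage{etoolbox} line is just an illustrative argument for \LTskip.

```python
def read_group(text, start):
    """Return (content, end) for the brace group that opens at text[start]."""
    if text[start] != "{":
        raise ValueError("expected '{'")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced braces")


def apply_one_arg_macro(text, name, keep):
    """Replace each \\name{arg} by arg (keep=True) or by nothing (keep=False)."""
    token = "\\" + name + "{"
    out, pos = [], 0
    while True:
        hit = text.find(token, pos)
        if hit < 0:
            out.append(text[pos:])
            return "".join(out)
        out.append(text[pos:hit])
        arg, pos = read_group(text, hit + len(token) - 1)
        if keep:
            out.append(arg)


def latex_view(text):
    # \newcommand{\LTadd}[1]{}  and  \newcommand{\LTskip}[1]{#1}
    return apply_one_arg_macro(apply_one_arg_macro(text, "LTadd", False), "LTskip", True)


def yalafi_view(text):
    # YaLafi ignores those definitions and treats the macros the other way round.
    return apply_one_arg_macro(apply_one_arg_macro(text, "LTadd", True), "LTskip", False)


preamble = r"\LTadd{\renewcommand{\iftoggle}[3]{#3}}\LTskip{\usepackage{etoolbox}}"
print(latex_view(preamble))
print(yalafi_view(preamble))
```

In this model the \renewcommand of \iftoggle is visible only in the YaLafi view, which is why it has to be wrapped in \LTadd rather than \LTskip.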

torik42 added the "type: support request" label Feb 3, 2023
Author

be4 commented Feb 3, 2023

> I hope it also works without a minimal example.

Yes, \LTadd worked well. Thanks a lot.

PS: Could you please answer my DANTE question above. I'd like to remove my email address again.

Owner

torik42 commented Feb 8, 2023

Separated the remaining issues: #241 (parse \newenvironment), #242 (add LTskip environments), #243 (add comment package).

torik42 closed this as completed Feb 8, 2023