Prevent unintended expansion of \input and similar commands in \luabridge_tl_set:Nn in LuaTeX #29

Closed
Witiko opened this issue Nov 21, 2024 · 1 comment · Fixed by #31
Witiko commented Nov 21, 2024

Description

Using \luabridge_tl_set:Nn causes issues in LuaTeX when \input is produced by the Lua code.

Minimal reproducing example

File example.tex

\input lt3luabridge
\ExplSyntaxOn
\luabridge_tl_set:Nn
  \l_tmpa_tl
  { print(require("example")) }
\tl_show:N
  \l_tmpa_tl
\ExplSyntaxOff
\bye

File example.lua

return [[\input{example_input}\relax]] 

File example_input.tex

Hello, world!

Expected behavior

Executing luatex example should produce the following text on the terminal, like pdftex --shell-escape example does:

> \l_tmpa_tl=\input {example_input}\relax .

Actual behavior

Executing luatex example produces the following text on the terminal:

 (./example_input.tex)
Runaway argument?
{Hello,world!
! File ended while scanning use of \tl_set:Nn.
<inserted text> 
\par

Discussion

Cause of the issue

This behavior seems to be due to the way we define the function \luabridge_tl_set:Nn for LuaTeX:

\cs_new:Nn
  \luabridge_tl_set:Nn
  {
    \tl_set:Nn
      \l_tmpa_tl
      { #2 }
    \tl_set:Nx
      \l_tmpa_tl
      {
        _ENV = setmetatable({}, {__index = _ENV})
        local~function~print(input)
          input = tostring(input)
          local~output = {}
          for~line~in~input:gmatch("[^
            \iow_char:N \\ r
            \iow_char:N \\ n
          ]+") do~
            table.insert(output, line)
          end~
          tex.print(output)
        end~
        \exp_not:V \l_tmpa_tl
      }
    \tl_set:Nf
      #1
      {
        \lua_now:V
          \l_tmpa_tl
      }
  }

Specifically, in the Lua code, we redefine print(text) to call tex.print() on the individual lines in text:

local~function~print(input)
  input = tostring(input)
  local~output = {}
  for~line~in~input:gmatch("[^
    \iow_char:N \\ r
    \iow_char:N \\ n
  ]+") do~
    table.insert(output, line)
  end~
  tex.print(output)
end~

Then we collect the lines in a tl variable:

\tl_set:Nf
  #1
  {
    \lua_now:V
      \l_tmpa_tl
  }

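The issue seems to be with the f-expansion. It keeps expanding the leading token until it reaches an unexpandable one, so after \lua_now:V has run, it seems to go on and expand the command \input that the Lua code printed as well.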
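
To make the suspected mechanism concrete, here is a minimal sketch of f-expansion in plain TeX, assuming expl3 is loaded through expl3-generic. The macros \example_outer: and \example_inner: are hypothetical names invented for this illustration. They stand in for \lua_now:V and for the expandable \input that the Lua code prints.

```tex
\input expl3-generic
\ExplSyntaxOn
% Hypothetical helper macros, used only for this illustration.
\cs_new:Npn \example_outer: { \example_inner: ~ world! }
\cs_new:Npn \example_inner: { Hello, }
% f-expansion expands the leading token repeatedly: first \example_outer:,
% then the \example_inner: that it leaves at the front. It only stops at
% the letter H, which is the first unexpandable token.
\tl_set:Nf
  \l_tmpa_tl
  { \example_outer: }
\tl_show:N
  \l_tmpa_tl
\ExplSyntaxOff
\bye
```

The stored value should contain no trace of either macro. In the same way, an \input at the front of the Lua output is an expandable token, so the f-expansion opens the file instead of storing the command.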

Potential solutions

Replace \input with \exp_not:N \input

The simplest solution would be to change our definition of the function print() to replace any occurrence of \input with \exp_not:N \input. This solves this particular issue, but it is easily fooled by e.g. \csname input\endcsname and only treats one symptom of the bigger issue.
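
As a sketch of why the prefix would help, the following plain TeX file (assuming expl3 is loaded through expl3-generic) f-expands a hand-written token list that starts with \exp_not:N \input. The token list is typed in directly here rather than produced by Lua, so this only illustrates the TeX side of the idea.

```tex
\input expl3-generic
\ExplSyntaxOn
% \exp_not:N protects the first token of an f-type argument, so the
% f-expansion stops at \input instead of opening the file.
% The file example_input.tex is never read and does not need to exist.
\tl_set:Nf
  \l_tmpa_tl
  { \exp_not:N \input {example_input} \relax }
\tl_show:N
  \l_tmpa_tl
\ExplSyntaxOff
\bye
```

Only the leading token is protected this way, which is enough here because the f-expansion stops as soon as it meets a token it cannot expand.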

Controlled expansion

Another solution would be to control the expansion more tightly when storing the output of Lua in a tl variable. According to interface3.pdf, \l_tmpa_tl can be expanded in a single expansion and \lua_now:n in two expansions. Therefore, changing our code as follows and adding the correct number of \exp_after:wN in the correct places should do the trick:

\tl_set:Nn
  #1
  {
    \lua_now:n
      { \l_tmpa_tl }
  }

However, this solution seems brittle because it relies on a specific number of expansions being sufficient, which does not seem to be guaranteed by expl3.
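
The counting of expansions can be illustrated without Lua. The sketch below, assuming plain TeX with expl3 loaded through expl3-generic, expands a tl variable exactly once with a chain of \exp_after:wN, which leaves the \input inside it untouched. Using \l_tmpb_tl as a stand-in for the Lua output is an assumption made for this illustration; the extra expansions that \lua_now:n needs are not shown.

```tex
\input expl3-generic
\ExplSyntaxOn
% Stand-in for the text that the Lua code would print.
\tl_set:Nn
  \l_tmpb_tl
  { \input {example_input} \relax }
% Each \exp_after:wN skips one token and expands the token after it, so
% the chain reaches \l_tmpb_tl and expands it exactly once. The \input
% that appears as a result is stored without being expanded.
\exp_after:wN \tl_set:Nn
\exp_after:wN \l_tmpa_tl
\exp_after:wN { \l_tmpb_tl }
\tl_show:N
  \l_tmpa_tl
\ExplSyntaxOff
\bye
```

This is the same effect as \tl_set:No. The brittleness discussed above comes from having to know in advance how many such steps a given command needs.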

Using token.set_macro()

Instead of collecting the result of \lua_now:n, we might redefine the function print(text) to accumulate the individual lines of text in an auxiliary expl3 variable such as \l_tmpa_tl using e.g. token.set_macro("l_tmpa_tl"):

token.set_macro("l_tmpa_tl", "")
local function print(input)
  input = tostring(input)
  local value = token.get_macro("l_tmpa_tl")
  for line in input:gmatch("[^\r\n]+") do
    value = value .. line .. "\n"
  end
  token.set_macro("l_tmpa_tl", value)
end

Then, we can simply execute the code and assign the content of \l_tmpa_tl to the output variable:

\lua_now:V
  \l_tmpa_tl
\tl_set:NV
  #1
  \l_tmpa_tl

This is an intriguing solution, but it would likely have trouble with e.g. print([[\catcode`\%=12]]). Currently, this print() statement affects how percent signs in the following print() statements are tokenized. With the above solution, nothing is tokenized until the Lua code has finished, so the percent signs in the following print() statements would still be tokenized as before.

@Witiko Witiko added the bug Something isn't working label Nov 21, 2024
@Witiko Witiko added this to the 2.2.1 milestone Nov 21, 2024

Witiko commented Nov 21, 2024

Would you like to have a look, @TeXhackse? Seems like something a TeXnician might be able to fix over a coffee break.
