-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move away from the string
for handling executed commands
#213
Comments
type
ShellResult* = object
cmd*: ShellCmd
cwd*: AbsoluteDir ## Absolute path of initial command execution directory
retcode*: int ## Exit code
stderr*: string ## Stderr for command
stdout*: string ## Stdout for command
ShellCmdPartKind* = enum
cpkArgument ## String argument to command
cpkOption ## Key-value pair
cpkFlag ## Boolean flag (on/off)
cpkRaw ## Raw parameter
cpkTemplated ## Interpolated parameter
ShellCmdPart* = object
prefix*: string
key*: string
case kind*: ShellCmdPartKind
of cpkOption:
val*: string
sep*: string
else:
discard
ShellCmd* = object
bin: string
opts: seq[ShellCmdPart]
func flag*(cmd: var ShellCmd, fl: string, prefix: string = "-") =
## Add flag for command
cmd.opts.add ShellCmdPart(kind: cpkFlag, key: fl)
func opt*(cmd: var ShellCmd, key, sep, val: string, prefix: string = "--") =
cmd.opts.add ShellCmdPart(
kind: cpkOption, key: key, val: val, sep: sep, prefix: prefix)
func opt*(cmd: var ShellCmd, opts: openarray[tuple[key, val: string]], prefix: string = "--") =
## Add sequence of key-value pairs
for (key, val) in opts:
cmd.opt(key, "=", val, prefix)
func raw*(cmd: var ShellCmd, str: string) =
## Add raw string for command (for things like `+2` that are not
## covered by default options)
cmd.opts.add ShellCmdpart(kind: cpkRaw, key: str)
func arg*(cmd: var ShellCmd, arg: string) =
## Add argument for command
cmd.opts.add ShellCmdPart(kind: cpkArgument, key: arg)
func arg*(cmd: var ShellCmd, arg: int) = cmd.arg($arg)
func add*(cmd: var ShellCmd, part: ShellCmdPart) =
cmd.opts.add part
func shell*(bin: string): ShellCmd = ShellCmd(bin: bin)
func toStr*(part: ShellCmdPart): string =
result = part.prefix & part.key
case part.kind:
of cpkOption:
result.add part.sep
result.add part.val
else:
discard
func interpolate(parts: ShellCmdPart, interpolations: Table[string, seq[ShellCmdPart]]): seq[ShellCmdPart] =
for part in parts:
if part.kind == cpkTemplated:
result.add interpolations[part.key]
else:
result.add part
func toStr*(cmd: ShellCmd, interpolations: Table[string, seq[ShellCmdPart]]): seq[string] =
result = @[cmd.bin]
for part in cmd.opts.interpolate(interpolations):
result.add part.toStr()
This is an example API - it is pretty simple and provides only minimally necessary abstractions over the |
func opt*(cmd: var ShellCmd, opts: openarray[tuple[key, val: string]], prefix: string = "--") =
## Add sequence of key-value pairs
for (key, val) in opts:
cmd.opt(key, "=", val, prefix) I don't think this works. Even the C compiler, heavily used by nim has options like these:
The first two would work, but |
In the end I think I might be happier to not use abstractions. |
I think the abstraction would be per shell escaped or potentially quoted string. |
@saem I don't think the shell should be involved at all here. You can start processes, pass arguments, set environment variables all without using the shell. The shell is just syntax to do it on the command line. https://man7.org/linux/man-pages/man3/exec.3.html The abstraction provided by the operating system to start a process is a list of arguments, and a list of environment variables. Then there is Abstractions like flags, options, named parameters, nested arguments ( |
Maybe the better solution would be to dumb down abstraction even further, making it a
cmd.flag("I", "/usr/local/include")
cmd.flag("std", "=", "c++11")
cmd.flag("o")
cmd.arg("outfile")
cmd.flag("W", "l,--defsym,__stack_limit=0x7ffe0000") In the end, Also, I forgot about |
int execl(const char *pathname, const char *arg, ...
/*, (char *) NULL */);
int execlp(const char *file, const char *arg, ...
/*, (char *) NULL */);
int execle(const char *pathname, const char *arg, ...
/*, (char *) NULL, char *const envp[] */);
int execv(const char *pathname, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]); In that case if linkTmpl.len == 0:
linkTmpl = CC[conf.cCompiler].linkTmpl
result.addf(linkTmpl, ["builddll", builddll,
"mapfile", mapfile,
"buildgui", buildgui, "options", linkOptions,
"objfiles", objfiles, "exefile", exefile,
"nim", quoteShell(getPrefixDir(conf)),
"lib", quoteShell(conf.libpath),
"vccplatform", vccplatform(conf)]) basically something better than abominations like these linkTmpl: "$buildgui $builddll -Wl,-Map,$mapfile -o $exefile $objfiles $options",
compileTmpl: "-w -MMD -MP -MF $dfile -c $options $include -o $objfile $file",
linkTmpl: "$builddll$vccplatform /Fe$exefile $objfiles $buildgui /nologo $options", |
I have no general problem with a specified command templating syntax, like it is done with
(Please no "smart" solution) |
If text is a single parameter, it stays as a single parameter and passed as a single parameter to the os process execution. |
I think there are two major parts to this:
The abstraction for the former needs to allow for composition with the abstraction for the latter. |
All of the comments above are pointing in the same direction it seems. So at this point is just implementation and iterating in it including docs and tests. |
This can be added and used in new stuff, with old implementation getting replaced in small pieces as we progress through. |
@haxscramper The thing I would like to point out is, as much as I approve the idea that you want to clean the things up a bit, I do not like the direction this thing is heading. The point is the following:
Here is an example. The manual of gcc tells me to pass an argument of the form "-I/usr/local/include". So my goal is to pass that string to gcc. I don't care about a This here is cryptic to read cmd.flag("I", "/usr/local/include")
cmd.flag("std", "=", "c++11")
cmd.flag("o")
cmd.arg("outfile")
cmd.flag("W", "l,--defsym,__stack_limit=0x7ffe0000")
So whatever abstraction you provide, I would like to pass my command parameters as a list of string and nothing else. cmd.args(["-I/usr/local/include", "-std=c++11", "-o", "outfile", "-Wl,--defsym,__stack_limit=0x7ffe0000"]) or maybe even like this cmd.args(splitWhitespace("-I/usr/local/include -std=c++11 -o outfile -Wl,--defsym,__stack_limit=0x7ffe0000")) When I know gcc, I know instantly what it does. When I don't know gcc, but I do know about starting processes, I can still look up with "-I" and "-o" does in I don't want to be the fun killer, but it feels like I have to be it here. I do not want abstractions. Especially I do not want abstractions for the sake of implementing abstractions. If abstractions are implemented, they need to provide real benefits, either syntactically, or structurally to prevent mistakes in the future. I do not see this here as given. |
Given we don't even have this (and instead just have a string Ok, we can just make it a binary+seq[arg] where arg is a string. Just like execve arguments
No, if you have a list of strings why would you need to joint it in the literal and then split on the whitespace?
Yeah sure, why even refactor anything, just leave the garbage alone
Not the worst thing in the world, we can live with this |
The example is still accepting a list of strings. I've just put it in there as an example of how to enable a command line feel without being cryptic about what is going on in the backend. I shows how a
yay PS: I just want to point out that I do agree with you that the code examples you picked in the biginning are very bad and need to be addressed nimskull/compiler/backend/extccomp.nim Lines 357 to 359 in afbd5fe
does not escape potential spaces and other characters in nimskull/compiler/backend/extccomp.nim Lines 491 to 492 in afbd5fe
Should probably return a list of strings. nimskull/compiler/backend/extccomp.nim Line 539 in afbd5fe
string literals like |
import std/[strutils, tables]
type
ShellResult* = object
cmd*: ShellCmd
cwd*: AbsoluteDir ## Absolute path of initial command execution directory
retcode*: int ## Exit code
stderr*: string ## Stderr for command
stdout*: string ## Stdout for command
ShellArgKind* = enum
cpkArgument ## String argument to command
cpkTemplated ## Interpolated parameter
ShellArg* = object
cmd*: string
kind*: ShellArgKind
ShellCmd* = object
bin*: string
opts*: seq[ShellArg]
func shArg*(arg: string): ShellArg = ShellArg(kind: cpkArgument, cmd: arg)
func shSub*(arg: string): ShellArg = ShellArg(kind: cpkTemplated, cmd: arg)
func shSub*(cmd: var ShellCmd, subs: openarray[string]) =
for sub in subs:
cmd.opts.add shSub(sub)
func args*(cmd: var ShellCmd, args: openarray[string]) =
## Add argument for command
for arg in args:
cmd.opts.add shArg(arg)
func arg*(cmd: var ShellCmd, format: string, args: varargs[string]) =
cmd.args([format(format, args)])
func shell*(bin: string, args: openarray[string] = @[]): ShellCmd =
result = ShellCmd(bin: bin)
result.args args
func add*(cmd: var ShellCmd, arg: ShellArg) = cmd.opts.add arg
func add*(cmd: var ShellCmd, args: openarray[ShellArg]) = cmd.opts.add args
func toStr*(part: ShellArg): string =
part.cmd
type
ShInterpolate* = Table[string, seq[ShellArg]]
func interpolate(
part: ShellArg,
interpolations: ShInterpolate = default(ShInterpolate)
): seq[ShellArg] =
if part.kind == cpkTemplated:
if part.cmd in interpolations:
result.add interpolations[part.cmd]
else:
result.add part
func argsToStr*(
cmd: ShellCmd,
interpolations: ShInterpolate = default(ShInterpolate)
): seq[string] =
for part in cmd.opts:
for interpol in part.interpolate(interpolations):
result.add interpol.toStr()
func toStr*(
cmd: ShellCmd,
interpolations: ShInterpolate = default(ShInterpolate)
): seq[string] =
result = @[cmd.bin] & cmd.argsToStr(interpolations)
block:
var gcc = shell("gcc")
for path in ["/usr/local/include", "/usr/include"]:
gcc.arg("-I$#", path)
gcc.args([
"-std=c++11",
"-o",
"outfile",
"-Wl,--defsym,__stack_limit=0x7ffe0000"
])
echo gcc.toStr()
type
ConfigRef = object
linkOptions: seq[ShellArg]
linkOptionsCmd: seq[ShellArg]
proc getLinkOptions(conf: ConfigRef): seq[ShellArg] =
result = conf.linkOptions & conf.linkOptionsCmd
var conf: ConfigRef
conf.linkOptions.add shArg("--random")
gcc.add conf.getLinkOptions()
echo gcc.toStr()
|
And something like this for running command import std/[osproc, streams, os]
proc exec*(
cmd: ShellCmd,
workdir: string = "",
stdin: string = ""
): ShellResult =
let workdir = if len(workdir) == 0: getCurrentDir() else: workdir
result.cwd = AbsoluteDir(workdir)
var p = startProcess(
cmd.bin,
args = cmd.argsToStr(),
options = {poUsePath}
)
result.cmd = cmd
var outp = outputStream(p)
var outerr = errorStream(p)
if stdin.len > 0:
inputStream(p).write(stdin)
close inputStream(p)
result.retcode = -1
var line = newStringOfCap(120)
while true:
if outp.readLine(line):
result.stdout.add(line)
result.stdout.add("\n")
elif outerr.readLine(line):
result.stderr.add(line)
result.stderr.add("\n")
else:
result.retcode = peekExitCode(p)
if result.retcode != -1:
break
close(p) block:
var gcc = shell("gcc", ["--version 123"])
echo gcc.exec() (cmd: (bin: "gcc", opts: @[(cmd: "--version 123", kind: cpkArgument)]), cwd: "/tmp", retcode: 1, stderr: "gcc: error: unrecognized command-line option ‘--version 123’; did you mean ‘--version’?\ngcc: fatal error: no input files\ncompilation terminated.\n", stdout: "") echo shell("nim", ["--eval=echo 12 + 2"]).exec()
|
For parallel execution (hacked on top of osproc, just to provide an example). import std/posix
iterator exec*[T](
cmds: openarray[tuple[cmd: ShellCmd, data: T]],
fullParams: set[ProcessOption] = {poUsePath},
workdir: string = "",
maxPool: int = 8,
beforeStart: proc(cmd: ShellCmd, data: T) = nil
): tuple[res: ShellResult, data: T] =
assert 0 < maxPool
var processList: seq[Process] = newSeq[Process](maxPool)
var commandMap: seq[int] = newSeq[int](maxPool)
let maxPool = min(maxPool, cmds.len)
var leftCount = len(cmds)
var cmdIdx = 0
let workdir = if len(workdir) == 0: getCurrentDir() else: workdir
while cmdIdx < maxPool:
if not isNil(beforeStart):
beforeStart(cmds[cmdIdx].cmd, cmds[cmdIdx].data)
processList[cmdIdx] = start(
cmds[cmdIdx].cmd,
workdir = workdir,
options = fullParams
)
commandMap[cmdIdx] = cmdIdx
inc(cmdIdx)
while leftCount > 0:
var exitedIdx = -1
var status: cint = 1
let res: Pid = waitpid(-1, status, 0)
if res > 0:
for idx, process in mpairs(processList):
if not isNil(process) and process.processID() == res:
if WIFEXITED(status) or WIFSIGNALED(status):
# process.exitFlag = true
# process.exitStatus = status
exitedIdx = idx
break
else:
let err = osLastError()
if err == OSErrorCode(ECHILD):
# some child exits, we need to check our childs exit codes
for idx, process in mpairs(processList):
if (not isNil(process)) and (not running(process)):
# process.exitFlag = true
# process.exitStatus = status
exitedIdx = idx
break
elif err == OSErrorCode(EINTR):
# signal interrupted our syscall, lets repeat it
continue
else:
# all other errors are exceptions
raiseOSError(err)
if exitedIdx >= 0:
var res: ShellResult
let p = processList[exitedIdx]
res.cmd = cmds[commandMap[exitedIdx]].cmd
res.stdout = p.outputStream.readAll()
res.stderr = p.errorStream.readAll()
res.retcode = p.peekExitCode()
yield (res, cmds[commandMap[exitedIdx]].data)
close(processList[exitedIdx])
if cmdIdx < len(cmds):
if not isNil(beforeStart):
beforeStart(cmds[cmdIdx].cmd, cmds[cmdIdx].data)
processList[exitedIdx] = start(
cmds[cmdIdx].cmd,
workdir = workdir,
options = fullParams)
commandMap[exitedIdx] = cmdIdx
inc(cmdIdx)
else:
processList[exitedIdx] = nil
dec(leftCount)
for (res, data) in exec({
shell("echo", ["AA"]): "long",
shell("echo", ["BB"]): "all"
}):
echo ">>> ", data
echo res
|
@krux02 I need to clarify something about philosophy here. Bluntly, these arguments are entirely invalid. We'll not use these to make decisions. The other parts of the comment with respect to type modelling issues, error prevention, and such, that is totally fine. |
For testament and a few other cases, I'm not sure aobut pulling out the std error/output is a good idea as This is porbably totally fine for extcomp and main. Testament might be ok until needs grow. |
Can you elabotae on this. What do you want to clarify about philosophy here? How are they entirely invalid? No problem with being blunt here, as I was blunt as well, so sorry for that. |
* Changes - Move missing 'input' configuration parameters to the ~CurrentConf~ - Added documentation for most of the configuration variables like ~git.url~, ~doc.item~, ~<module>.always~, module search logic - Add tests for the CLI parser. - Add ~[FileIndex]~ implementation for the ~ConfigRef~ to avoid constantly writing ~conf.m.fileInfos[index.int32]~ as if we don't have templates/procedures to take care of repetitive code. - Closes nim-works#210 - it is hardly possible to gain anything by refactoring this part of the compiler. I updated documentation on some of the relevant pieces I found, as well as some minor tests, but otherwise there is nothing to gain here. Current implementation is usable for the purposes of nim-works#211 - it is possible to call into ~parseCommand()~ and get self-contained chunk of data for all the flags, that we can also do roundtrips on (using nim-works#213). * Tried to do but failed Due to large number of dependencies between data storage, reporting etc. it was not possible to really separate CLI parser logic from everything else. I tried separating them out into standalone files, but it was a lot more troubles than it worth: - Moving CLI parser would've require separating 'external' reports into different file, but this would still pull in ~ast_types.nim~, which is a pretty large dependency on its own. - CLI parser can do external reports - but it is possible to have multiple failures before processing terminates if there is a ~--errormax=N~ before. So logical idea of - "CLI flag is either valid or invalid" does not really cut here, since "invalid" can be postponed. - Because there is still a need to drag around whole suite of configuration state, reporting "deprecated" warnings can stay as it is now.
Yeah, I didn't like how I phrased that, was playing with the wording any how, thanks for taking it charitably. What I mean about the philosophy is that we have a genuine change to restart with nimskull, let's not simply get stuck with decisions that got stuck in other languages here. I hope that makes sense. So in order to do that, let's avoid conventions established in other languages where those conventions might well existing because of the poor abstraction capabilities in those languages. So more specifically, lots of languages have strings, and yeah there are lots of issues with trying to create a good set of types over strings that model the CLI invocations. But... let's not be defeatist, there are some pretty good type abstractions already available to us and let's not shoot ourselves in the foot because others have done the same before. I'm not going to say there zero benefits to following the herd, there are lots! But let's follow the herd for where things are an improvement and rework for where things are not. In this case the herd is regressing to flinging strings around which can also have security implications, but I digress, so the herd is towards a regression. In other cases, let's saying testing library design, like OMG I would kill for something a lot more like JUnit, PHPUnit, RSpec, Mocha, KTest, ... you name it and it's probably better. 😆 That's a lot of words, sorry. I hope that helped; thanks for reading. I'm happy to talk more we can discuss in chat, a discussion, or whatever. |
* Changes - Move missing 'input' configuration parameters to the ~CurrentConf~ - Added documentation for most of the configuration variables like ~git.url~, ~doc.item~, ~<module>.always~, module search logic - Add tests for the CLI parser. - Add ~[FileIndex]~ implementation for the ~ConfigRef~ to avoid constantly writing ~conf.m.fileInfos[index.int32]~ as if we don't have templates/procedures to take care of repetitive code. - Closes nim-works#210 - it is hardly possible to gain anything by refactoring this part of the compiler. I updated documentation on some of the relevant pieces I found, as well as some minor tests, but otherwise there is nothing to gain here. Current implementation is usable for the purposes of nim-works#211 - it is possible to call into ~parseCommand()~ and get self-contained chunk of data for all the flags, that we can also do roundtrips on (using nim-works#213). * Tried to do but failed Due to large number of dependencies between data storage, reporting etc. it was not possible to really separate CLI parser logic from everything else. I tried separating them out into standalone files, but it was a lot more troubles than it worth: - Moving CLI parser would've require separating 'external' reports into different file, but this would still pull in ~ast_types.nim~, which is a pretty large dependency on its own. - CLI parser can do external reports - but it is possible to have multiple failures before processing terminates if there is a ~--errormax=N~ before. So logical idea of - "CLI flag is either valid or invalid" does not really cut here, since "invalid" can be postponed. - Because there is still a need to drag around whole suite of configuration state, reporting "deprecated" warnings can stay as it is now.
Most of the commands executed in the compiler go through the
execProcesses*(cmds: openArray[string],
orexecCmdEx*(command: string,
and as a consequence, most of the code handles strings by immediately joining them, using string template interpolation all over the place. While by itself it is not too bad, it makes it harder to reason about the code, because semi-random string joins all over the place, likenimskull/compiler/backend/extccomp.nim
Lines 357 to 359 in afbd5fe
nimskull/compiler/backend/extccomp.nim
Lines 491 to 492 in afbd5fe
nimskull/compiler/backend/extccomp.nim
Line 539 in afbd5fe
As well as weird handling of the shell execution results, where (again), premature string formatting completely conceals the purpose and data flow in the code.
nimskull/testament/testament.nim
Line 474 in afbd5fe
nimskull/testament/specs.nim
Lines 120 to 124 in afbd5fe
nimskull/testament/testament.nim
Lines 179 to 187 in afbd5fe
Instead, several helper types should be introduced to abstract away the command handling logic -
ShellCmd
,ShellResult
,ShellCmdPart
.nimskull/compiler/front/main.nim
Lines 84 to 93 in 829ffa1
nimskull/compiler/backend/extccomp.nim
Lines 416 to 418 in 829ffa1
nimskull/compiler/backend/extccomp.nim
Lines 235 to 240 in 829ffa1
nimskull/compiler/backend/extccomp.nim
Line 112 in 829ffa1
The text was updated successfully, but these errors were encountered: