
Performance significantly worse than dash #156

Open
ghost opened this issue Apr 13, 2020 · 19 comments
@ghost

ghost commented Apr 13, 2020

I don't know if this is a real issue or just a symptom of the project not being mature yet, but because afaik there is no IRC channel where I could talk about this, I wanted to create this issue.
When comparing the run times of this test script in mrsh and in dash:

#!/bin/sh
echo "Test starting..."
if test -e "123"; then
	echo hi
else
	echo hello
fi
head -c 50 /dev/urandom
printf "\n"
cat << EOF
123
456
789
EOF

i=0
while [ $i -le 100 ]; do
	printf "%s hello world\n" $i
	i=$((i+1))
done

I get these time results:
dash: dash test.sh 0.00s user 0.01s system 95% cpu 0.007 total
mrsh: mrsh test.sh 0.15s user 0.06s system 102% cpu 0.205 total

That's very significant: roughly 30 times the time dash takes to run the script, and running it a few times gave similar results. You can even perceptibly feel the script take some time in mrsh, while in dash it completes basically instantly.

@emersion
Owner

I bet this is because dash implements some commands like echo, test and printf as built-ins, while mrsh forks and execs them.

@ghost
Author

ghost commented Apr 14, 2020

You are right: when replacing them with absolute paths, dash's speed is about the same as mrsh's. Are built-in versions of those planned, or are they out of scope?

@emersion
Owner

That's a good question. I'm not sure yet. It's a little bit annoying to ship implementations of standard POSIX utilities in mrsh, but fork/exec is really slow.

@ddevault
Contributor

The performance is comparable in more realistic scenarios - this one is pretty contrived. But ultimately, I think that if your shell's performance is the bottleneck in your system, you're probably doing something wrong anyway.

@ghost
Author

ghost commented Apr 14, 2020

I agree that this one is pretty contrived, but test and printf/echo are used very often in shell scripts. It may not be noticeable for scripts run from cron jobs or as services, but for interactive scripts it could very well be.

@ddevault
Contributor

I'll believe it when I see it. Let's see if anyone complains about performance during typical use, rather than while running benchmarks.

@ghost
Author

ghost commented Apr 14, 2020

I didn't notice this while running benchmarks, but while running scripts normally; the benchmark here is mostly for illustrative purposes.

@ddevault
Contributor

Can you share the specific scripts which had noticeable performance problems, so we can optimize for their bottlenecks rather than for the benchmark?

@ghost
Author

ghost commented Apr 14, 2020

One script I had this issue with I don't want to share here, but the other one is this one; the difference wasn't as significant there, just about ~2 times the time that dash takes.

@ghost
Author

ghost commented Apr 14, 2020

Also, the shell is pretty surely not the bottleneck of the system, but one of many. If every application thinks like that, it adds up and you get a pretty perceptible difference compared to when every application is well optimized.

@ddevault
Contributor

Also, the shell is pretty surely not the bottleneck of the system, but one of many. If every application thinks like that, it adds up and you get a pretty perceptible difference compared to when every application is well optimized.

Not necessarily. Simplicity always beats performance for anything which isn't a bottleneck.

@ghost
Author

ghost commented Apr 14, 2020

Well, that makes sense. But I wouldn't call the shell "no bottleneck at all". I'd say it's not a big one, but as I said, several of those together can make a pretty significant difference, even if each one alone doesn't look like it does.

@ddevault
Contributor

@ghost
Author

ghost commented Apr 14, 2020

Oh, my bad, I forgot that I had edited that one to use POSIX sh. This one is mine:

#!/bin/sh
issuers=$(echo \
        | openssl s_client -servername "$1" -connect "$1:443" -showcerts 2>/dev/null \
        | grep -E '^  *i:' \
        | sed 's/^ *i:/issuer= /')

echo "$issuers" | while read -r issuer; do
        cn=$(echo "$issuer" | sed 's/.*\(CN\|OU\) *= *//')
        echo "==> For issuer $cn"

        for f in /etc/ca-certificates/trust-source/blacklist/*.pem; do
                [ -f "$f" ] || continue;
                i=$(openssl x509 -noout -issuer < "$f" 2>/dev/null)
                fcn=$(echo "$i" | sed 's/.*\(CN\|OU\) *= *//')
                #echo "'$cn' vs '$fcn'"
                if [ "$cn" = "$fcn" ]; then
                        echo "   -> Blacklist match on $(basename "${f%.pem}")"
                        echo "  $issuer"
                        echo "  $i"
                        break
                fi
        done
done

@concatime
Contributor

How about using vfork when possible?

@emersion
Owner

POSIX.1-2008 removes the specification of vfork().

@concatime
Contributor

Didn’t know! At least posix_spawn is POSIX.

CONFORMING TO
POSIX.1-2001, POSIX.1-2008.

@emersion
Owner

I seriously doubt this will improve performance though. Interested in benchmarks in any case.

@ddevault
Contributor

posix_spawn is also a really bad API, and I would rather not keep it around.
