Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace regular expressions for performance #32

Open
probonopd opened this issue Dec 8, 2018 · 1 comment
Open

Replace regular expressions for performance #32

probonopd opened this issue Dec 8, 2018 · 1 comment
Labels
performance Enhancements that increase performance

Comments

@probonopd
Copy link
Member

Based on the results of AppImageCommunity/appimaged#53, regular expressions seem to be really CPU intensive.

Hence, we should replace them with something more efficient. Possibly fnmatch(3) would be more efficient (to be verified),

After having replaced regular expressions with something else, please test that

  • --appimage-extract still works with * wildcards
  • appimaged consumes significantly less CPU
@TheAssassin
Copy link
Member

I've run the following two test programs in a VM (4GiB RAM, 4 CPU cores).

#include <stdio.h>
#include <fnmatch.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: %s <num rounds>\n", argv[0]);
	return 2;
    }

    const int rounds = atoi(argv[1]);

    char pattern[] = "abc*nop";
    char test[] = "abcdefghijklnmop";

    for (int i = 0; i < rounds; ++i) {
        fnmatch(pattern, test, FNM_PATHNAME);
    }

    return 0;
}
#include <stdio.h>
#include <regex.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: %s <num rounds>\n", argv[0]);
	return 2;
    }

    const int rounds = atoi(argv[1]);

    char pattern[] = "abc.*nop";
    char test[] = "abcdefghijklnmop";

    for (int i = 0; i < rounds; ++i) {
        regex_t regex;
        regcomp(&regex, pattern, /*REG_ICASE |*/ REG_EXTENDED);
        regmatch_t matches[1];
        regexec(&regex, test, 1, matches, 0);
	regfree(&regex);
    }

    return 0;
}

The fnmatch based implementation is a lot shorter than the regex based one.

In terms of performance, these are the results for 10.000.000 iterations (both applications compiled using gcc -O0 -o <app> <app>.c):

$ time ./regex 10000000
real 1m54.342s
...

$ time ./fnmatch 10000000
real 0.780s

With -O3, the results change to 1m44.257s and 0.738s.

@probonopd probonopd added the performance Enhancements that increase performance label Dec 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Enhancements that increase performance
Projects
None yet
Development

No branches or pull requests

2 participants