Performance issue on relatively small file #537

Open
Porges opened this issue Apr 23, 2024 · 4 comments
Labels
bug Something isn't working

Comments

Porges commented Apr 23, 2024

Describe the bug

Parsing the file located here with yaml.parse(str, {merge:true}) takes a very long time. Could there be something non-linear in the length of the file?

The file is currently under 500 kB:

$ wc bibliography.yaml
 18610  48360 489877 bibliography.yaml
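
One way to probe the non-linearity question is to time the parser on inputs of doubling size and compare consecutive timings: a ratio near 2 suggests linear behaviour, a ratio near 4 suggests quadratic. The following is a minimal sketch, not a benchmark from this thread. `makeFlatYaml` and `measureScaling` are hypothetical helpers, and the synthetic input is a flat mapping without aliases or merge keys, so it may not trigger the same slowdown as bibliography.yaml.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: build a flat YAML mapping with n entries.
function makeFlatYaml(n) {
  const lines = [];
  for (let i = 0; i < n; i++) lines.push(`key${i}: ${i}`);
  return lines.join('\n') + '\n';
}

// Hypothetical helper: run parseFn once per input size and record
// the elapsed wall-clock time in milliseconds.
function measureScaling(parseFn, sizes) {
  return sizes.map((n) => {
    const src = makeFlatYaml(n);
    const start = performance.now();
    parseFn(src);
    return { n, ms: performance.now() - start };
  });
}
```

To try it against the library discussed here, pass something like `(s) => require('yaml').parse(s, { merge: true })` as `parseFn` with sizes such as `[1000, 2000, 4000, 8000]`. Single runs are noisy, so repeat a few times before drawing conclusions.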

To Reproduce

const yaml = require('yaml');
const fs = require('fs');

fs.readFile("bibliography.yaml", 'utf8', (_, data) => {
    console.time('yaml-parse');
    const loaded = yaml.parse(data, {merge: true});
    console.timeEnd('yaml-parse');
});

Output:

yaml-parse: 10.501s

Expected behaviour

By comparison, loading the same file with js-yaml is far faster:

const yaml = require('js-yaml');
const fs = require('fs');

fs.readFile("bibliography.yaml", 'utf8', (_, data) => {
    console.time('yaml-load');
    const loaded = yaml.load(data);
    console.timeEnd('yaml-load');
});

Output:

yaml-load: 125.814ms

Versions (please complete the following information):

  • Environment: Node v18.17.1
  • yaml: 2.4.1

Porges added the bug label Apr 23, 2024
Owner

eemeli commented Apr 23, 2024

Interesting. Thank you for sharing the file that's demonstrating this! Looks like the slowdown is during the document -> JS conversion. Probably due to the alias resolution?

I'll see when I can find time to dig into this properly; might take a while.
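
To check where the time goes, the two stages can be timed separately. This is a sketch under assumptions, not the maintainer's own measurement: `timeStages` is a hypothetical helper, and it relies on the `yaml` package's `parseDocument()` and `Document#toJS()`, which together cover what `yaml.parse()` does in one call.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: time the text -> Document stage and the
// Document -> JS stage separately. `yamlLib` is expected to expose
// parseDocument(), as the `yaml` package does.
function timeStages(yamlLib, src, options) {
  const t0 = performance.now();
  const doc = yamlLib.parseDocument(src, options); // text -> Document
  const t1 = performance.now();
  const value = doc.toJS(); // Document -> plain JS values
  const t2 = performance.now();
  return { parseMs: t1 - t0, toJsMs: t2 - t1, value };
}
```

With the reproduction above, this would be called as `timeStages(require('yaml'), data, { merge: true })`. If `toJsMs` dominates, that supports the document -> JS conversion hypothesis.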

rejetto commented Feb 2, 2025

Loading a 3.7 MB file containing a simple single-level dictionary was taking 77 seconds on my machine, a fast Mac (M1 Max).
I unexpectedly solved it by reading the file line by line and parsing each line separately. Now it takes 3 seconds, a roughly 25x difference. 👀
Of course, this is not a solution unless you are in the same situation as me, but I wanted to share my experience, as it may shed some light on the problem.

Owner

eemeli commented Feb 2, 2025

@rejetto Do you mean parsing each line as a separate YAML document?

rejetto commented Feb 2, 2025

Yes, invoking parse on each single line.
It's weird, and I don't have time to dig into it further, but since this is hard to anticipate, I thought it would be good to share the information.
My dictionary uses a file path as the key and a number as the value.
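
The workaround described above might look roughly like the following. rejetto's actual code is not shown in the thread, so this is a guess at its shape: `parseFlatMappingByLine` is a hypothetical helper, and it only works when every line is a complete `key: value` entry of a flat mapping.

```javascript
// Hypothetical helper: parse each non-empty line as its own YAML document
// and merge the results into one object. Only valid for a flat mapping
// with one complete `key: value` entry per line (no multi-line values,
// anchors, or nesting).
function parseFlatMappingByLine(text, parseFn) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const entry = parseFn(line); // e.g. { "/some/path": 42 }
    if (entry && typeof entry === 'object') Object.assign(result, entry);
  }
  return result;
}
```

Passing `require('yaml').parse` as `parseFn` gives the per-line behaviour described in the comment. The parser is injected so the same helper can be compared against other parsers.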
