Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic import (lazy load) parsers #2231

Merged
merged 1 commit into from
Sep 5, 2024
Merged

Conversation

Borewit
Copy link
Owner

@Borewit Borewit commented Aug 29, 2024

Use dynamic imports enabling to load parsers in lazy fashion

@Borewit Borewit self-assigned this Aug 29, 2024
@coveralls
Copy link

coveralls commented Aug 29, 2024

Coverage Status

coverage: 97.018% (+0.05%) from 96.969%
when pulling 7cee540 on dynamic-import-parsers
into f04f3ed on master.

@minht11
Copy link

minht11 commented Sep 2, 2024

Oooh this is pretty cool and looks clean. I would still want user configurable sync parsers because while lazy loading is good, it is not always optimal. Lazy loading requires more chunks=more separate network requests, more latency before parsing can be started.
If for example you know you will never need to include mp4 parser, even if it is not loaded it could be somewhere in your bundle which is cached into service worker (very specific situation I am in 😆) .
Or you could want preload one parser and load others lazily.

Your current implementation could support that pretty well I think, in a future it could look something similar to

// AiffLoader.ts
export const aiffLazyParserLoader: IParserLoader = {
  parserType: 'aiff',
  extensions: ['.aif', 'aiff', 'aifc'],
  async load(): Promise<ITokenParser> {
    return new (await import('./AiffParser.js')).AIFFParser;
  }
};
// ApexLoaderSync.ts
import { APEv2Parser } from './APEv2Parser.js'
export const apeParserLoader: IParserLoader = {
  parserType: 'apev2',
  extensions: ['.ape'],
  load(): ITokenParser{
    return new APEv2Parser;
  }
};

Putting it all together into pseudo code:

import { parserBlob } from 'music-metadata';
import { apeParserLoader } from 'music-metadata/loaders-sync';
import { aiffLazyParserLoader } from 'music-metadata/loaders';

parserBlob(blob, { parsers: [apeParserLoader, aiffLazyParserLoader] })

This would make aiff load when needed and ape sync.

@minht11
Copy link

minht11 commented Sep 3, 2024

I started thinking more, about my previous comment and came with following points:

  1. Having two separate sync/async entrypoints would be bad API design. Sorry for that.
  2. Some parsers are likely more commonly used than others.
  3. I am not convinced lazy loading by default (at least for all parsers) is a good thing. User who wants to parser mp3 file in a browser, will now first have to load library, get mp3 file, now hit the network again for the mp3 parser to load. Not ideal.

So my new proposal would be, to allow passing a function which loads parser when discovered:

import { parserBlob } from 'music-metadata/without-parsers'; // to not break old users
import { apeParser } from 'music-metadata/parser/ape';

parserBlob(blob, {
 parser: (parserType) => {
    if (parserType === 'aiff') {
      return import('music-metadata/parser/aiff'')
    }

    if (parserType === 'ape') {
      return apeParser
    }
 }
})

This allows best of both worlds, now user is in full control when to load parser sync and when async, they can even preload it if they need it. And the API design is simpler.

For users who do not care they can import all of the parsers.

import { parserBlob } from 'music-metadata/without-parsers'; // to not break old users
import { defaultParsers } from 'music-metadata/parser';

parserBlob(blob, {
 parser: defaultParsers
})

@Borewit
Copy link
Owner Author

Borewit commented Sep 3, 2024

Oooh this is pretty cool and looks clean.

Glad you like it, still work to do. Some cleanup to to, and I think parser init method can now move to the constructor. This will cleanup some nasty TypeScripts hacks.

I started thinking more, about my previous comment and came with following points:

Having two separate sync/async entrypoints would be bad API design. Sorry for that.
Some parsers are likely more commonly used than others.
I am not convinced lazy loading by default (at least for all parsers) is a good thing. User who wants to parser mp3 file in a > browser, will now first have to load library, get mp3 file, now hit the network again for the mp3 parser to load. Not ideal.

I see you concerns with lazy loading, and need to put that in a bot of perspective.

"Premature optimization is the root of all evil (or at least most of it) in programming"1

Nothing changed since 1974, remember that one.

Because the the parsers are lazy loaded, the entire API which is using music-metadata, is able to spin up faster. Maybe music-metadata is also lazy loaded, only when a certain operation is triggered which require parsing.

The other difference is, let's we a user parses a single file, music-metadata is loaded, and only that specific parser. And no more music-metadata plus all the other parser.

Especially if these modules are small, and consistently lazy (dynamically) loaded, the user will not even notice it. The user experience is that everything is instantly available. Yes you can try to over-complicate things, and try to eliminate the 10ms load time. But think that is the job of these fancy module bundlers, to whom we clearly defined what is dependent on what, and when what is to be loaded. Let those things optimize, they are designed for that reason.

Footnotes

  1. Quote from: Computer Programming as an Art (1974) - Donald_Knuth

@Borewit
Copy link
Owner Author

Borewit commented Sep 3, 2024

The primary use case for music-metadata is extracting metadata from a single file, or more commonly, from a collection of files. Users want to extract metadata in the simplest way possible, without hassle. Most expect the process to work consistently across all files, without the frustration of it working for some files and not others, or encountering inconsistencies in the output simply because the input comes from different file formats.

@Borewit Borewit force-pushed the dynamic-import-parsers branch 2 times, most recently from 8c8ac12 to 111eb54 Compare September 3, 2024 19:21
@Borewit Borewit changed the title Dynamic import parsers Dynamic import (lazy load) parsers Sep 5, 2024
@Borewit Borewit merged commit 4456def into master Sep 5, 2024
25 checks passed
@Borewit Borewit deleted the dynamic-import-parsers branch September 5, 2024 18:25
@Borewit
Copy link
Owner Author

Borewit commented Sep 5, 2024

Part of v10.4.0

@minht11
Copy link

minht11 commented Sep 5, 2024

@Borewit The latest release reduced chunk using musicmetada from ~180kb to ~104kb in my application. Thank you!

@Borewit
Copy link
Owner Author

Borewit commented Sep 6, 2024

@Borewit The latest release reduced chunk using musicmetada from ~180kb to ~104kb in my application. Thank you!

Awesome!, and interesting as the strtok3 dependencies in music-metadata and file-type are not alligned yet, there is a major version number difference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants