Data URLs truncated #53775

Closed
ghost opened this issue Jul 9, 2024 · 4 comments · Fixed by #54748
Labels
loaders Issues and PRs related to ES module loaders


ghost commented Jul 9, 2024

Version

v22.4.0

Platform

Linux server 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 10 10:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

loaders

What steps will reproduce the bug?

This snippet reproduces the bug.

node --eval '' --import 'data:text/javascript,console.log("Whither wanders thy distempered mind?")'
SyntaxError: Invalid or unexpected token

This is the embedded data URL.

data:text/javascript,console.log("Whither wanders thy distempered mind?")

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior? Why is that the expected behavior?

By the spec, data URLs do not have a query string or fragment.

Characters ? # should be allowed in the URL body.


This file loads the same data URL successfully in a browser.

<!DOCTYPE html>
<html>
  <head>
    <script>
import('data:text/javascript,console.log("Whither wanders thy distempered mind?")')
    </script>
  </head>
</html>
Whither wanders thy distempered mind?

What do you see instead?

SyntaxError: Invalid or unexpected token

Additional information

There's a correct regex that extracts the data URL body.

const DATA_URL_PATTERN = /^[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;

But it's run on URL#pathname, which chops off query string and fragment.

const match = RegExpPrototypeExec(DATA_URL_PATTERN, url.pathname);

new URL('data:,How far? Until ID #12345')
URL {
  href: 'data:,How far?%20Until%20ID%20#12345',
  origin: 'null',
  protocol: 'data:',
  username: '',
  password: '',
  host: '',
  hostname: '',
  port: '',
  pathname: ',How far',
  search: '?%20Until%20ID%20',
  searchParams: URLSearchParams { ' Until ID ' => '' },
  hash: '#12345'
}
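
The same split explains the SyntaxError in the reproduction: everything from the `?` onward lands in `search`, so the loader only sees an unterminated string literal. This is a minimal sketch that assumes nothing beyond the global WHATWG `URL` class in Node.js; the helper name `splitDataUrl` is invented for illustration.

```javascript
// Hypothetical helper (not part of Node.js): report how the WHATWG URL
// parser divides a data: URL into pathname, search, and hash.
function splitDataUrl(input) {
  const { pathname, search, hash } = new URL(input);
  return { pathname, search, hash };
}

const parts = splitDataUrl(
  'data:text/javascript,console.log("Whither wanders thy distempered mind?")'
);

// pathname stops right before the "?", so the closing quote and paren
// of the console.log call are no longer part of it.
console.log(parts);
```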

This might be a quick fix:

const match = RegExpPrototypeExec(DATA_URL_PATTERN, url.pathname + url.search + url.hash);
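
As a rough, standalone illustration of that idea (not the loader's actual code, and not necessarily what the eventual fix does), the pattern can be run against the recombined string. The helper name `extractDataUrlBody` and the decoding step are assumptions made for this sketch.

```javascript
// Same pattern as quoted above.
const DATA_URL_PATTERN = /^[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;

// Hypothetical helper: extract the body from pathname + search + hash
// instead of from pathname alone.
function extractDataUrlBody(input) {
  const url = new URL(input);
  const match = DATA_URL_PATTERN.exec(url.pathname + url.search + url.hash);
  if (match === null) return null;
  const [, base64, body] = match;
  // Assumption for this sketch: percent-decode first, then treat the
  // result as base64 or UTF-8 text depending on the ;base64 marker.
  return Buffer.from(decodeURIComponent(body), base64 ? 'base64' : 'utf8')
    .toString('utf8');
}

const body = extractDataUrlBody(
  'data:text/javascript,console.log("Whither wanders thy distempered mind?")'
);
console.log(body);
```

With the recombined string, the text after the `?` is back in the captured body, so the full `console.log(...)` call is recovered.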
Member

targos commented Jul 9, 2024

@nodejs/loaders

Contributor

aduh95 commented Jul 9, 2024

I'm probably saying things you already know about, but here it is anyway: a workaround is to encode the data (e.g. data:text/javascript,console.log(%22Whither wanders thy distempered mind%3F%22)) so there's no ambiguity on how to parse the URL.
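
A sketch of that workaround, assuming the standard `encodeURIComponent` is acceptable for building the URL (it escapes more than the hand-written example above does, spaces included):

```javascript
const source = 'console.log("Whither wanders thy distempered mind?")';

// encodeURIComponent escapes ?, # and ", so nothing in the body can be
// parsed as a query string or fragment.
const encodedUrl = `data:text/javascript,${encodeURIComponent(source)}`;

const parsedEncoded = new URL(encodedUrl);
console.log(encodedUrl);
console.log(parsedEncoded.search === '' && parsedEncoded.hash === '');
```

The resulting string can be passed to `--import` or `import()` in place of the raw data URL.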

Should URL be fixed to not report "fake" query string for data: URLs? I would expect it to parse URL strings as spec'd.

Author

ghost commented Jul 9, 2024

> I'm probably saying things you already know about, but here it is anyway: a workaround is to encode the data (e.g. data:text/javascript,console.log(%22Whither wanders thy distempered mind%3F%22)) so there's no ambiguity on how to parse the URL.

👍

Also could base64 it.
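
For instance, a minimal sketch using Node's global `Buffer` (the variable names are illustrative only):

```javascript
const moduleSource = 'console.log("Whither wanders thy distempered mind?")';

// The base64 alphabet (A-Z, a-z, 0-9, +, /, =) contains neither ? nor #,
// so the whole payload stays in the URL's pathname.
const payload = Buffer.from(moduleSource, 'utf8').toString('base64');
const base64Url = `data:text/javascript;base64,${payload}`;

console.log(base64Url);
```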

> Should URL be fixed to not report "fake" query string for data: URLs? I would expect it to parse URL strings as spec'd.

I know it, it does this simplified parsing. I've never really been happy with it.

Member

targos commented Jul 9, 2024

I opened #53778 with a fix.

RedYetiDev added the loaders label Jul 9, 2024
KhafraDev added a commit to KhafraDev/node that referenced this issue Sep 3, 2024
aduh95 closed this as completed in 6c85d40 Sep 7, 2024
aduh95 pushed a commit that referenced this issue Sep 12, 2024
Fixes: #53775
PR-URL: #54748
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue Sep 30, 2024
louwers pushed a commit to louwers/node that referenced this issue Nov 2, 2024
Fixes: nodejs#53775
PR-URL: nodejs#54748
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
3 participants