Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature proposal: buffer.convert(value[, fromEnc], toEnc) #9797

Closed
ChALkeR opened this issue Nov 25, 2016 · 21 comments
Closed

Feature proposal: buffer.convert(value[, fromEnc], toEnc) #9797

ChALkeR opened this issue Nov 25, 2016 · 21 comments
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js.

Comments

@ChALkeR
Copy link
Member

ChALkeR commented Nov 25, 2016

While investigating the Buffer usage, I noticed one thing — in many (really, many) cases, Buffers are constructed just for the sake of base64-encoding a string.

base64-encodinng is used in many places, like auth headers, data uris, messages, etc.

With the new API, that would look like Buffer.from(from).toString(encoding).
With the old API, that would look like new Buffer(from).toString(encoding) (note that this lacks checks).

So I propose to add a small utility function, e.g.

Buffer.convert = function(value, targetEncoding, sourceEncoding) {
  if (!Buffer.isEncoding(targetEncoding)) {
    throw new TypeError('"targetEncoding" must be a valid string encoding');
  }
  return Buffer.from(value, sourceEncoding).toString(targetEncoding);
}

Note that the impl does not default to utf-8 and forces an encoding to be specified, this way it would be more clear what the userspace code does.

Sure, that could be done on userside, but this doesn't deserve a separate package, and re-implementing that in all the packages that do conversion also doesn't look very good to me.

Note that if we have had such method earlier, the usage of Buffer without new (and other Buffer-related issues) would have been measurably lower.

If everyone thinks that this is a good idea, I am willing to file a PR for it (impl/docs/tests).

@ChALkeR ChALkeR added buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js. labels Nov 25, 2016
@ChALkeR
Copy link
Member Author

ChALkeR commented Nov 25, 2016

Or perhaps Buffer.convert(value[, sourceEncoding], targetEncoding) would be better — I am not sure of that yet.

@addaleax
Copy link
Member

addaleax commented Nov 25, 2016

I’d be okay with it, although I would probably prefer it as buffer.convert directly on the buffer module, like we have buffer.transcode (to which convert would be the dual).

And you should probably specify both encodings explicitly. For this conversion to make sense, one of these has to be a binary-to-text encoding like base64, and I don’t think there’s any way of knowing whether it’s supposed to be encoding or decoding?

/cc @nodejs/buffer

@ChALkeR
Copy link
Member Author

ChALkeR commented Nov 25, 2016

@addaleax Ah, I entirely missed that one (from #9038).

Then we could use the same header, like buffer.convert(value, fromEnc, toEnc) and perpahs also just buffer.convert(value, toEnc) (defaulting fromEnc to 'utf-8').

@ChALkeR ChALkeR changed the title Feature proposal: Buffer.convert(value, targetEncoding[, sourceEncoding]) Feature proposal: buffer.convert(value[, fromEnc], toEnc) Nov 25, 2016
@addaleax
Copy link
Member

@ChALkeR I am not sure you can compare the situation wrt default encodings here. For transcode, it makes sense to default to utf-8 for the arguments, because transcode only makes sense to be called for character encodings, not binary-to-text ones.

If you prefer it another way, I guess that’s fine, but I really think explicit encodings would be better here.

@seishun
Copy link
Contributor

seishun commented Nov 28, 2016

Not sure this is a good idea.

  • buffer.convert(fromStr, target, source) isn't much simpler than Buffer.from(fromStr, source).toString(target).
  • Order of arguments is confusing (I'd expect source encoding to come before target encoding, so it seems different people have different expectations).
  • Can be confused with buffer.transcode.
  • It might not be clear why it's a part of Buffer API even though both the input and the output are strings.

@ChALkeR
Copy link
Member Author

ChALkeR commented Nov 28, 2016

@seishun Thanks.

  • It's a bit shorter, more clear, does the single thing, also it could be optimized later if we decide that it should be.
  • I have already changed the order in the title to be buffer.convert(value[, fromEnc], toEnc), I did not update the impl, though. Does that look better?
  • We could limit buffer.convert to strings only and remove Buffer support on input. This way, it couldn't be confused with .transcode, which accepts only Buffer objects.

@seishun
Copy link
Contributor

seishun commented Nov 28, 2016

I have already changed the order in the title to be buffer.convert(value[, fromEnc], toEnc), I did not update the impl, though. Does that look better?

I didn't say that the order is better one or the other way. My concern is that people will likely get confused either way.

This way, it couldn't be confused with .transcode, which accepts only Buffer objects.

But they still both live in the buffer module. And it's not obvious from the names which does which.

@Trott
Copy link
Member

Trott commented Jul 15, 2017

@ChALkeR Leave this open? (I imagine so, but it's been inactive for long enough that I figure it's worth checking.)

@jasnell
Copy link
Member

jasnell commented Jul 17, 2017

This is essentially Buffer.transcode()... and can also be accomplished using the Encoding API implementation I've done.

@seishun
Copy link
Contributor

seishun commented Jul 17, 2017

This is essentially Buffer.transcode()

Not quite. Buffer.transcode converts a buffer to a buffer. This converts a string to a string.

@gireeshpunathil
Copy link
Member

so the proposal itself has a prototype code, not sure why it got stalled for 18 months. Calling for contributors!

@gireeshpunathil gireeshpunathil added the good first issue Issues that are suitable for first-time contributors. label May 20, 2018
@apapirovski
Copy link
Member

apapirovski commented May 20, 2018

I thought @seishun brought up some valid concerns. Also Buffer.from(fromStr, source).toString(target) is already very convenient. I don't really know how I feel about adding this... It's not like there's even that many people asking for this feature.

@gireeshpunathil
Copy link
Member

@apapirovski - fair point, while I also don't have strong opinion either way (not much demanded from users vs a convenient API abstraction) my motivation was to see how this can be moved towards a conclusion. If we have consensus on no action at this point, we can close this out.

/cc @ChALkeR @addaleax @seishun @jasnell

@gireeshpunathil gireeshpunathil removed the good first issue Issues that are suitable for first-time contributors. label May 20, 2018
@BridgeAR
Copy link
Member

I personally think Buffer.from(fromStr, source).toString(target) is convenient enough.

The only compelling reason I can see is that it might be possible to improve the performance of that operation. But that would be a much more complicated implementation than suggested and it is not clear how significant the difference would actually be.

@ChALkeR
Copy link
Member Author

ChALkeR commented May 28, 2018

@BridgeAR I disagree — people do that a lot, and some still use new Buffer constructor there.
Simplifying that usecase would be helpful, probably.

@ChALkeR
Copy link
Member Author

ChALkeR commented May 28, 2018

@seishun I am also not sure if this should be on the buffer module or somewhere else. Changing the order from to, from to from, to looks fine, I suppose.

@BridgeAR
Copy link
Member

@ChALkeR if they did not move away from new Buffer by now I doubt that they would change it to anything else, no matter if it is simpler or not.

@jasnell
Copy link
Member

jasnell commented Jun 19, 2020

There's been no activity on this in over 2 years and it can be accomplished using existing API. Closing.

@jasnell jasnell closed this as completed Jun 19, 2020
@jorangreef
Copy link
Contributor

I think that Node has no way to convert buffer encodings from one buffer to another without creating an intermediary string and doing associated GC?

@addaleax
Copy link
Member

@jorangreef buffer.transcode() does exactly what you describe.

This issue here seems to be about a string → Buffer → string conversion. As I pointed out above, this is not really useful unless one of the encodings is a binary-to-text encoding, i.e. hex or base64. Given that limited scope and the fact that there has been no activity here, I think closing this is the right call.

@jorangreef
Copy link
Contributor

Thanks @addaleax

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buffer Issues and PRs related to the buffer subsystem. feature request Issues that request new features to be added to Node.js.
Projects
None yet
Development

No branches or pull requests

9 participants