Proposal: Base64/128 Variable Length Quantity (VLQ) in System.Convert #15237

Closed
am11 opened this issue Sep 19, 2015 · 5 comments
Labels
api-needs-work (API needs work before it is approved, it is NOT ready for implementation); area-System.Runtime; help wanted ([up-for-grabs] Good issue for external contributors)
Milestone

Comments

@am11
Member

am11 commented Sep 19, 2015

Concept:

System.Convert currently resides in the CoreCLR repo: https://github.com/dotnet/coreclr/../System/Convert.cs. The class has a number of ToBase64* and FromBase64* methods with various signatures. That Base64 encoding is fixed-length: it maps 8-bit input bytes onto characters that each carry six bits.

There are many applications of variable-length Base64 encoding, i.e. Variable Length Quantity (VLQ). IMO, one of the most popular applications of Base64 VLQ is V3 source maps, which browsers, editors, trans-compilers etc. use to obtain source-to-source mappings in trans-compilation scenarios. For instance, CoffeeScript, Less, Sass, Stylus, SweetJS and TypeScript are languages that compile to CSS or JavaScript. Today, the developer tools in all major browsers can map generated source back to the original by virtue of source maps. Additionally, Base128 VLQ has applications in media formats such as MIDI and XMF. [1]

Proposed API:

namespace System {
  public static class Convert {

    [...]

    public static byte[] FromBaseNVLQString(
                                      EncodingRadix radix,
                                      string s,
                                      BaseNVLQSetting setting);

    public static byte[] FromBaseNVLQCharArray(
                                      EncodingRadix radix,
                                      char[] inArray,
                                      int offset,
                                      int length,
                                      BaseNVLQSetting setting);

    public static string ToBaseNVLQString(
                                      EncodingRadix radix,
                                      byte[] inArray,
                                      BaseNVLQSetting setting);

    public static string ToBaseNVLQString(
                                      EncodingRadix radix,
                                      byte[] inArray,
                                      int offset,
                                      int length,
                                      BaseNVLQSetting setting);

    public static int ToBaseNVLQCharArray(
                                      EncodingRadix radix,
                                      byte[] inArray,
                                      int offsetIn,
                                      int length,
                                      char[] outArray,
                                      int offsetOut,
                                      BaseNVLQSetting setting);
  }

  public enum EncodingRadix {
    Base64 = 64,
    Base128 = 128
  }

  public struct BaseNVLQSetting {
    private int _base,
                _baseShift,
                _baseMask,
                _continuationBit;
    private char[] separators;

    // 'base' is a C# keyword, so the parameter is escaped as '@base'.
    public BaseNVLQSetting(
                int @base,
                int baseShift,
                int baseMask,
                int continuationBit,
                char[] separators);

    public enum Template { V3SourceMap, MIDI, XMF }

    public static BaseNVLQSetting FromTemplate(BaseNVLQSetting.Template template) {
      switch(template) {
        case Template.V3SourceMap:
          // base = 32, shift = 5, mask = base - 1 (0b11111), continuation bit = 32
          return new BaseNVLQSetting(1 << 5, 5, (1 << 5) - 1, 1 << 5, new char[] {',', ';'});
        [...]
        default: throw new ArgumentException("Unknown template.", nameof(template));
      }
    }
  }
}

Working Example:

We implemented source-map encoding and decoding in Web Essentials 2013 (WE2013):
https://github.com/madskristensen/WebEssentials2013/../Base64VLQ.cs, inspired by Mozilla's JavaScript implementation: https://github.com/mozilla/source-map.
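
For readers unfamiliar with the encoding, here is a minimal sketch of how a single signed integer is turned into Base64 VLQ characters in the style used by V3 source-map "mappings" values. The class and method names (Base64VlqSketch, Encode) are hypothetical and are not taken from the linked implementations; the constants follow the usual source-map convention of 5 payload bits per character, with the sixth bit as the continuation flag and the sign stored in the lowest bit.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not the WE2013 or Mozilla code.
public static class Base64VlqSketch
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    private const int BaseShift = 5;           // payload bits per character
    private const int Base = 1 << BaseShift;   // 32
    private const int BaseMask = Base - 1;     // 0b011111
    private const int ContinuationBit = Base;  // 0b100000

    public static string Encode(int value)
    {
        // Store the sign in the least significant bit, the magnitude above it.
        long magnitude = Math.Abs((long)value);
        long vlq = (magnitude << 1) | (value < 0 ? 1L : 0L);

        var sb = new StringBuilder();
        do
        {
            int digit = (int)(vlq & BaseMask);   // take the low 5 bits
            vlq >>= BaseShift;
            if (vlq > 0)
            {
                digit |= ContinuationBit;        // more characters follow
            }
            sb.Append(Alphabet[digit]);
        } while (vlq > 0);

        return sb.ToString();
    }
}
```

Small values fit in one character, while larger ones spill into further characters, least significant group first; for example, Encode(16) needs two characters because 16 shifted left by the sign bit no longer fits in 5 bits.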

@CodesInChaos

I don't like stuffing more encodings into Convert, since that's rather non-modular. Even Base64 should have been a separate class.

I don't see much benefit in unifying the Base 64 VLQ and Base 128 VLQ implementations. Base 128 VLQ seems to be a binary 8-bit encoding, whereas Base 64 VLQ is a text encoding. The names suggest that they're the same encoding with a 1-bit difference, but the payload actually differs by 2 bits per unit (7 versus 5).
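
To make the contrast concrete, here is a minimal sketch of a Base 128 VLQ encoder in the style of MIDI variable-length quantities: the output is raw bytes, each carrying 7 payload bits, most significant group first, with the high bit set on every byte except the last. The names Base128VlqSketch and Encode are hypothetical, and restricting the input to unsigned values is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; names are not from any existing API.
public static class Base128VlqSketch
{
    public static byte[] Encode(uint value)
    {
        var bytes = new List<byte>();

        // The final byte of the sequence has its continuation bit clear.
        bytes.Add((byte)(value & 0x7F));
        value >>= 7;

        // Every earlier byte carries 7 payload bits plus the continuation bit.
        while (value > 0)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }

        // Groups were collected least significant first; emit them big-endian.
        bytes.Reverse();
        return bytes.ToArray();
    }
}
```

Unlike the Base 64 variant, nothing here is text: the result is a byte sequence, there is no alphabet lookup, and no sign bit is involved.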

I'd prefer a separate assembly for source map handling which includes Base 64 VLQ encoding. You included separator handling in your code (',' and ';'), but I think those are part of the source map format and not of Base 64 VLQ.

@am11
Member Author

am11 commented Sep 22, 2015

Thanks for the input @CodesInChaos.

> Even Base64 should have been a separate class.

In that case the VLQ variant could probably share the same class: with a sufficiently flexible implementation, redundancy between the fixed-length and variable-length encodings can be avoided.

The bit which distinguish between source-map V

> I'd prefer a separate assembly for source map handling which includes Base 64 VLQ encoding.

Note that the only part of the V3 source-map template I referred to here is "mappings": "<this value>", as this proposal chiefly concerns variable-length encoding. The rest of the constituents of the .map file (or of the data, in the case of embedded and encoded maps), as well as the source-map-specific symbols (',' and ';') in the mappings value, should be considered extraneous to this proposal.
The difference between raw B64VLQ and the source-map flavoured variant (with the additional ',' and ';' pointers) can be observed in this Ruby implementation: http://murzwin.com/base64vlq.html.
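
As a rough sketch of that layering, the code below keeps the raw Base64 VLQ decoder (DecodeSegment) free of any separator knowledge and lets a thin source-map-specific wrapper (DecodeMappings) deal with ',' and ';'. Both names and the nested-list return shape are hypothetical and chosen only for illustration; the decoder assumes the usual source-map convention of 5 payload bits per character and the sign in the lowest bit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not a proposed API surface.
public static class MappingsSketch
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Raw Base64 VLQ: decodes a separator-free run of characters into signed values.
    public static List<int> DecodeSegment(string segment)
    {
        var values = new List<int>();
        long accumulator = 0;
        int shift = 0;

        foreach (char c in segment)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Not a Base64 character: " + c);

            accumulator |= (long)(digit & 31) << shift;   // low 5 bits are payload
            if ((digit & 32) != 0)                        // continuation bit set
            {
                shift += 5;
                continue;
            }

            bool negative = (accumulator & 1) == 1;       // sign lives in bit 0
            int magnitude = (int)(accumulator >> 1);
            values.Add(negative ? -magnitude : magnitude);
            accumulator = 0;
            shift = 0;
        }

        if (shift != 0) throw new FormatException("Segment ends in the middle of a value.");
        return values;
    }

    // Source-map layer: ';' separates generated lines, ',' separates segments.
    public static List<List<List<int>>> DecodeMappings(string mappings)
    {
        var lines = new List<List<List<int>>>();
        foreach (string line in mappings.Split(';'))
        {
            var segments = new List<List<int>>();
            foreach (string segment in line.Split(','))
            {
                if (segment.Length > 0) segments.Add(DecodeSegment(segment));
            }
            lines.Add(segments);
        }
        return lines;
    }
}
```

With this split, a source-map library could own DecodeMappings while consuming a general-purpose VLQ decoder unchanged.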

IMO, while "source-map" as a library implementation should be a separate assembly, there is room for FLQ and VLQ encodings to coexist in .NET Core; depending on the flexibility of the implementation, a source-map assembly implementer would be able to consume .NET Core's Base64 VLQ implementation as is.

@AlexGhiondea
Contributor

@am11 thanks for writing the proposal.

Since these methods are somewhat specific to one scenario, I think the right place for them would be their own separate assembly.

One place where you can start a prototype is CoreFxLab.

@am11
Member Author

am11 commented Dec 22, 2016

@AlexGhiondea, I am busy with other projects at the moment. This API is on my TODO list. I will try to start working on it as soon as I get some free time.

Thanks for mentioning CoreFXLab repo. That certainly seems like the best place to start for new APIs.

@AlexGhiondea
Contributor

@am11 sounds great! Looking forward to seeing what you come up with!

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.0.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 5, 2021
7 participants