sarilouis/GPT3Encoder-dotnet

GPT3Encoder

.NET BPE Encoder / Decoder for GPT-2 / GPT-3

About

GPT-2 and GPT-3 use byte pair encoding (BPE) to turn text into a sequence of integer tokens that is fed into the model. This is a C# implementation of OpenAI's original Python encoder/decoder, also inspired by Latitude's JavaScript GPT-3-Encoder.
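
To make the idea of byte pair encoding concrete, here is a minimal sketch of the merge loop at its core. It is not this library's code: the class `BpeSketch`, the method `Merge`, and the tiny rank table in the usage note are hypothetical names and data chosen for illustration. It also assumes the word is split into plain characters, whereas the GPT-2 encoder first maps the raw bytes to printable characters and loads its merge ranks from a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a toy version of the BPE merge loop, not the library's implementation.
public static class BpeSketch
{
    // Repeatedly merges the adjacent symbol pair with the lowest rank
    // until no adjacent pair appears in the rank table.
    public static List<string> Merge(string word, IReadOnlyDictionary<(string, string), int> ranks)
    {
        // Assumption: start from single characters.
        var symbols = word.Select(c => c.ToString()).ToList();

        while (symbols.Count > 1)
        {
            // Find the adjacent pair with the best (lowest) rank.
            int bestIndex = -1;
            int bestRank = int.MaxValue;
            for (int i = 0; i < symbols.Count - 1; i++)
            {
                if (ranks.TryGetValue((symbols[i], symbols[i + 1]), out int rank) && rank < bestRank)
                {
                    bestRank = rank;
                    bestIndex = i;
                }
            }

            // No known pair left: the word is fully merged.
            if (bestIndex < 0) break;

            string first = symbols[bestIndex];
            string second = symbols[bestIndex + 1];

            // Merge every occurrence of the chosen pair, scanning left to right.
            var merged = new List<string>();
            for (int i = 0; i < symbols.Count;)
            {
                if (i < symbols.Count - 1 && symbols[i] == first && symbols[i + 1] == second)
                {
                    merged.Add(first + second);
                    i += 2;
                }
                else
                {
                    merged.Add(symbols[i]);
                    i++;
                }
            }
            symbols = merged;
        }

        return symbols;
    }
}
```

With a hypothetical rank table that ranks the pairs (l, o), (lo, w) and (e, r) in that order, `BpeSketch.Merge("lower", ranks)` returns the two symbols `low` and `er`. A real encoder would then look each symbol up in its vocabulary to obtain the integer token.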

Usage

using System;
using System.Linq; // needed for Count() and ToList() below

var encoder = new GPT3Encoder.Encoder();

// Encode
var encodedString = encoder.Encode("Arbitrarily sampled sentence to encode. Should result in 14 tokens.");
Console.WriteLine("Encoded string:");
Console.WriteLine($"[{string.Join(", ", encodedString)}]");
Console.WriteLine($"\nToken count: {encodedString.Count()}");

// Decode
var decodedString = encoder.Decode(encodedString);
Console.WriteLine("\nDecoded string:");
Console.WriteLine(decodedString);
Console.WriteLine($"\n|{"Token",-10}|String");
Console.WriteLine($"|{"-----",-10}|------");
encodedString.ToList().ForEach(token => Console.WriteLine($"|{token,-10}|{encoder.Decode(new int[] { token })}"));
