Base58 encoding a ~366KB byte array takes relatively forever #8

Closed
rsmets opened this issue Jul 22, 2022 · 4 comments

@rsmets

rsmets commented Jul 22, 2022

I'm curious why encoding a byte array of only a few hundred kilobytes takes 10+ minutes. I can't find any material that explains why base58 encoding would struggle with a moderate amount of data. Maybe someone here can explain?

Below is some example code:

/**
 * nodejs script to test encrypting + encoding a base64 image data uri
 */

import fs from 'fs';
import * as url from 'url';
import path from 'path';
import { randomBytes, createCipheriv, createDecipheriv } from 'crypto';
import {encode, decode} from 'base58-universal';

import base58 from 'bs58';

const __dirname = url.fileURLToPath(new URL('.', import.meta.url));

const data = fs.readFileSync(path.join(__dirname, 'examples/docImageBase64String.txt'));

const key = randomBytes(32);
const iv = randomBytes(16);
const algorithm = 'aes-256-cbc';
const cipher = createCipheriv(algorithm, key, iv);

const startEncryption = new Date().getTime();

const encrypted1 = cipher.update(data);
const encrypted2 = cipher.final();
const encrypted = Buffer.concat([encrypted1, encrypted2]);

const endEncryption = new Date().getTime();

const timeToEncrypt = endEncryption - startEncryption;

console.log('time to encrypt', timeToEncrypt);

const beforeBase64Encoding = new Date().getTime();
const encodedBase64 = encrypted.toString('base64');
const afterBase64Encoding = new Date().getTime();

const timeToBase64Encode = afterBase64Encoding - beforeBase64Encoding;
console.log('time to encode base64 (using Buffer.toString())', timeToBase64Encode);

// write the encrypted data to a file so we can look at it without crashing the terminal lol
const outputDir = path.join(__dirname, 'output');

if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir);
}
fs.writeFileSync(path.join(__dirname, 'output/encrypted.txt'), encodedBase64);

// base58 encoding
// this is V E R Y slow
const beforeEncoding = new Date().getTime();
const encoded = encode(encrypted);

const afterEncoding = new Date().getTime();
const timeToEncode = afterEncoding - beforeEncoding;
console.log('time to encode base58', timeToEncode);

// write base58 encoded data to a file
fs.writeFileSync('base58encoded.txt', encoded);

// let's decrypt it and write back to a file as a further sanity check
const decipher = createDecipheriv(algorithm, key, iv);

const decrypted1 = decipher.update(encrypted);
const decrypted2 = decipher.final();
const decrypted = Buffer.concat([decrypted1, decrypted2]);

fs.writeFileSync(path.join(__dirname, 'output/decrypted.txt'), decrypted.toString());

To run it, you just need a base64-encoded image saved as examples/docImageBase64String.txt.

Running the above produces this output (times in milliseconds):

time to encrypt 2
time to encode base64 (using Buffer.toString()) 1
time to encode base58 1140551

As you can see, the base64 encoding takes a couple of milliseconds while the base58 encoding takes tens of minutes (about 19 minutes in this run). Why?

Thanks for your responses and for your efforts in open sourcing this project.

@davidlehn
Member

There are some links explaining the issue and graphs I made over here:
#6

The graphs were from benchmark code to show the issue, though it needs to be rebased now:
#5

I'm not sure if there are easy solutions other than just using another encoding for large inputs. base58 works just fine for small inputs.
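
For readers wondering where the time goes: because 58 is not a power of 2, no fixed group of input bits maps to a fixed group of output characters, so a straightforward encoder has to treat the whole input as one big number. The sketch below shows one common way to write that. It is an illustration, not necessarily this library's actual source, and `encodeBase58Sketch` and the `innerSteps` counter are names made up for the example. Each new byte forces a pass over every base58 digit produced so far, so the total work grows roughly with the square of the input length.

```javascript
// Illustrative sketch only; not taken from any particular library.
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

function encodeBase58Sketch(bytes) {
  // Base58 digits of the number read so far, least significant first.
  const digits = [];
  // Hypothetical counter, only here to expose how the work grows.
  let innerSteps = 0;

  for (const byte of bytes) {
    // number = number * 256 + byte, carried through every existing digit.
    let carry = byte;
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
      innerSteps++;
    }
    // Append any new most significant digits.
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }

  // Each leading zero byte is written as a leading '1'.
  let output = '';
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) {
    output += ALPHABET[0];
  }
  for (let i = digits.length - 1; i >= 0; i--) {
    output += ALPHABET[digits[i]];
  }
  return { output, innerSteps };
}

// Small input: fast, as expected.
const demo = encodeBase58Sketch(Uint8Array.from([0x00, 0x00, 0x28, 0x7f, 0xb4, 0xcd]));
console.log(demo.output); // "11233QC4"

// Doubling the input size roughly quadruples the inner-loop work.
const steps1k = encodeBase58Sketch(new Uint8Array(1000).fill(255)).innerSteps;
const steps2k = encodeBase58Sketch(new Uint8Array(2000).fill(255)).innerSteps;
console.log(steps2k / steps1k);
```

If a loop like this were run on a ~366 KB input, the inner loop would execute on the order of 10^11 times, which fits a runtime of minutes rather than milliseconds.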

@rsmets
Author

rsmets commented Jul 23, 2022

Thank you @davidlehn for the helpful links. Since I need to encode at least hundreds of KB, it seems like base64 encoding is the way to go. Nonetheless, thank you for this helpful base58 project.

Pretty tangential, but I know many blockchains use base58 encoding. Is that why image NFTs can't just be stored on chain, and why they are really just pointers to URLs?

@davidlehn
Member

Yeah, a power of 2 base encoding would probably work better, depending on your needs. The implementations of those can often be optimized better than arbitrary base N code too.
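
To illustrate the power-of-2 point: in base64 every 3 input bytes map to exactly 4 output characters, independent of the rest of the data, so the work is linear and the input can even be encoded piece by piece. Here is a small sketch using Node's built-in `Buffer`. The helper name `base64InChunks` and the default chunk size are assumptions for the example.

```javascript
// Illustrative sketch: encode a Buffer to base64 one chunk at a time.
// chunkSize must be a multiple of 3 so padding only appears at the very end.
function base64InChunks(buffer, chunkSize = 3 * 1024) {
  if (chunkSize <= 0 || chunkSize % 3 !== 0) {
    throw new RangeError('chunkSize must be a positive multiple of 3');
  }
  const parts = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    // Each chunk is encoded without looking at any other chunk.
    parts.push(buffer.subarray(offset, offset + chunkSize).toString('base64'));
  }
  return parts.join('');
}

// The chunked result matches encoding the whole buffer in one call.
const sample = Buffer.from(Array.from({ length: 10001 }, (_, i) => (i * 31) % 256));
console.log(base64InChunks(sample) === sample.toString('base64')); // true
```

Nothing comparable is possible with a plain base58 encoding of the whole input, since every input byte can influence every output character.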

I think most uses of base58 are for smaller data like addresses, signatures, hashes, and similar. The performance is fine when it's just 10s to 100s of bytes.

As for bulk data in a blockchain, it's almost always ill-advised for performance reasons, no matter what encoding (if any) you use. In many designs you have to sync all the data between nodes, and syncing pointers and hashes is less costly than syncing huge amounts of data.

@rsmets
Author

rsmets commented Aug 18, 2022

Thanks again @davidlehn. I'm going to close this issue.

@rsmets closed this as not planned on Aug 18, 2022