rethab changed the title from "awk cannot read binary ??" to "suggestion how to read binary input directly with gawk" on Feb 7, 2022
Hmm, what do you mean by that? Here's a fully functional hex encoder for gawk (sorry for the poor formatting; I dug it up from my pile).
Even with gawk in Unicode mode (not byte mode), I got it to hex-encode two different binary MP3 files with ease, without any error messages popping up. (Try not to use it in gawk -P POSIX mode; all kinds of weird behavior may bubble up.) I think the octal encoder also works, but I haven't tested it lately. Let me know whether this works.
If the offset 8^8 doesn't work, use 0xDC00 instead. If that also fails, try -4^4 as a last resort.
gawk -e '
function hexencode(str,chr) {
  for(chr in b2hex) {
    if (chr!~/[[:alnum:]%\\]/) { gsub(chr,b2hex[chr],str) }
  }
  return str
}
function octencode(str,chr) {
  gsub(/\\/,b2oct["\\"],str)
  gsub(/[0-7]/,"\06&",str)
  for(chr in b2oct) {
    if(chr!~/[0-7\\]/) { gsub(chr,b2oct[chr],str) str }
  }
  return str
}
BEGIN {
  offset=8^8
  for(x=0;x<256;x++) {
    byte=sprintf("%c",x+offset)
    b2hex[byte]=sprintf("\\x%.2X",x)
    b2oct[byte]=sprintf("\\%03o",x)
  }
  spc1="/\\^[]"
  spc2="~!@#%&_-{}:;\42\47\140 <>,$.|()*+=?"
  for(x=length(spc1);x;x-=1) {
    byte=substr(spc1,x,1)
    b2hex[("\\"(byte))]=b2hex[byte]
    b2oct[("\\"(byte))]=b2oct[byte]
    delete b2hex[byte]
    delete b2oct[byte]
  }
  for(x=length(spc2);x;x--) {
    byte=substr(spc2,x,1)
    b2hex[("["(byte)"]")]=b2hex[byte]
    b2oct[("["(byte)"]")]=b2oct[byte]
    delete b2hex[byte]
    delete b2oct[byte]
  }
}
BEGIN { RS=FS="^$"; OFS=""; ORS="" }
END { print hexencode($0) }'
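The three offsets mentioned above (8^8, 0xDC00 and -4^4) have one thing in common: each is an exact multiple of 256, so adding any of them to a byte value x leaves the low 8 bits equal to x. Presumably the trick relies on gawk emitting just that low byte when x+offset is not a valid character in the current locale, but that is an assumption about gawk's behavior and not something confirmed here. A minimal sketch of the arithmetic only (the function name `low_bytes` and the sample value 65 are made up for illustration):

```shell
low_bytes() {
  awk 'BEGIN {
    # 8^8, 0xDC00 and -4^4 written out in decimal so any awk can parse them.
    n = split("16777216 56320 -256", offset, " ")
    x = 65    # sample byte value (the letter A)
    for (i = 1; i <= n; i++) {
      v = offset[i] + x
      # Reduce v to the range 0..255, also when v is negative.
      printf "%d -> %d\n", offset[i], ((v % 256) + 256) % 256
    }
  }'
}

low_bytes
```

Every line of output ends in 65, which is why the three offsets can stand in for one another as far as the byte value is concerned.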
This encoder may not be 100% to the URL-encoding spec; it's just something I quickly slapped together a while back. It currently skips encoding only the alphanumeric characters, so it also encodes punctuation that the spec would leave alone. Feel free to modify it.
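For anyone who wants something closer to the spec: RFC 3986 lists the unreserved characters as the alphanumerics plus `-`, `.`, `_` and `~`, and everything else is written as `%XX`. The sketch below is not the encoder above. It is a separate, simpler take that walks the input character by character under `LC_ALL=C`, so awk treats each byte as one character. The function name `urlencode` is made up for illustration. It is line-oriented, so newlines pass through unencoded, and it is meant for text, not arbitrary binary data.

```shell
urlencode() {
  LC_ALL=C awk '
    BEGIN {
      # ord[] maps each single-byte character to its numeric value (1..255).
      for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i
    }
    {
      out = ""
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        # RFC 3986 unreserved characters pass through unchanged.
        if (c ~ /^[A-Za-z0-9._~-]$/) out = out c
        else out = out sprintf("%%%02X", ord[c])
      }
      print out
    }'
}

printf 'a b/c~d\n' | urlencode   # prints a%20b%2Fc~d
```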