-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
luajit: add decoder #709
luajit: add decoder #709
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Did some quick review and will do some more later. I think we should be able to add some custom decode function to replace some FieldAnyFn and maybe kgcType
could use some standard mapper?
Overall looks good i think and thanks for including tests!
BTW you can add a luajit.md
with author add embedd it, see wasm.go/wasm.md for example. Then it will be part of the documentation and also available in the CLI help.
format/luajit/luajit.go
Outdated
d.FieldValueBool("BE", flags&0x01 > 0) | ||
d.FieldValueBool("STRIP", flags&0x02 > 0) | ||
d.FieldValueBool("FFI", flags&0x04 > 0) | ||
d.FieldValueBool("FR2", flags&0x08 > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Field names are usually lowercase as it's the convention in jq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can also use 0b...
notation if you want to be explicit about bit positions
format/luajit/luajit.go
Outdated
ok := bytes.Equal([]byte(magic), []byte{0x1b, 0x4c, 0x4a}) | ||
if !ok { | ||
d.Errorf("invalid magic number") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think you can replace this with d.FieldStr("magic", 3, d.AssertStr(".."))
or d.FieldRawLen("magic", 3*8, d.AssertBitBuf([]byte{...}))
which will do the compare and also the magic will be shown as "valid"
format/luajit/luajit.go
Outdated
|
||
if !di.Strip { | ||
namelen := d.FieldU8("namelen") | ||
d.FieldStr("name", int(namelen), encoding.Nop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is encoding.Nop
important? can skip if it's ok to be decoded as UTF-8
func init() { | ||
for i := 0; i < len(opcodes); i++ { | ||
bcOpSyms[uint64(i)] = opcodes[i].Name | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if you can make the BcDef
implement scalar.UintMapper
, i think the formatEntries
type in msgpack.go
does something similar. If so you can skip init and bcOpSyms
I added more decode function, but is still need FieldAnyFn, because LuaJIT's format makes a weird use of ULEB128 encoding, and I want to map the bits of the two 32bit-max ULEB128 to the resulting 64bit Field. (see LuaJITDecodeI64(), for example) |
The problem is that we need a default case: 0 to 4 are types with know size, but 5 and more indicates a string and its size: I don't think we can use a UintMapSymStr here. |
format/luajit/testdata/simple.fqtest
Outdated
@@ -0,0 +1,274 @@ | |||
$ fq dv -d luajit simple.luac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment to the .fqtest files or a README.md or Makefile how to generate the luac files?
format.LuaJIT, | ||
&decode.Format{ | ||
Description: "LuaJIT 2.0 bytecode dump", | ||
DecodeFn: LuaJITDecode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The magic is quite unique so if you think it make senes you could add the format to the probe group
format/luajit/luajit.go
Outdated
interp.RegisterFormat( | ||
format.LuaJIT, | ||
&decode.Format{ | ||
Description: "LuaJIT 2.0 bytecode dump", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly skip the "dump" part or are .luac files known as dump files?
I see, yeap that might be tricky.
Aha that makes sense, i guess you could use a range mapper but maybe not worth it. I added some more comments but getting a bit late here so will continue tomorrow. Think we're close to something mergable. Thanks for the patience! |
format/luajit/luajit.go
Outdated
Description: "LuaJIT 2.0 bytecode", | ||
Groups: []*decode.Group{format.Probe}, | ||
DecodeFn: LuaJITDecode, | ||
Dependencies: []decode.Dependency{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dependencies
you can skip, it's if the format decoder itself want to decode something using some group
format/luajit/luajit.go
Outdated
}) | ||
|
||
if !di.Strip { | ||
d.FieldArray("debug", func(d *decode.D) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When i played around with the decoder i noticed that data in debug has some strings, is there any known structure to it or should be threaded as opaque blob? if blob should the array be replaced with d.FieldRawLen("debug", debuglen*8)
?
format/luajit/luajit.go
Outdated
} | ||
} | ||
|
||
type kgcType struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One variant is to do something like:
type fallbackUintMapSymStr struct {
fallback string
scalar.UintMapSymStr
}
func (m fallbackUintMapSymStr) MapUint(s scalar.Uint) (scalar.Uint, error) {
s.Sym = m.fallback
return m.UintMapSymStr.MapUint(s)
}
and then
kgctype := d.FieldULEB128("type", fallbackUintMapSymStr{
fallback: "str",
UintMapSymStr: scalar.UintMapSymStr{
0: "child",
1: "tab",
2: "i64",
3: "u64",
4: "complex",
},
})
and use it for ktabType
also
Maybe the scalar package should have something like this if its common hmm
format/luajit/luajit.go
Outdated
} | ||
|
||
func LuaJITDecodeI64(d *decode.D) { | ||
d.FieldAnyFn("i64", func(d *decode.D) any { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some other format decoders that do something similar i've decode the value as a "value" field, see msgpack or bson etc. Maybe that would make sense here also? for complex maybe the value field could an extra d.FieldValueAny("value", ...)
?
interp.RegisterFS(LuaJITFS) | ||
} | ||
|
||
// reinterpret an int as a float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand how this works. So all numbers are floats but encoded i two parts where the hi part will be zero for < 32 bit integers? and hi might be > 2^32 so hi << 32 + lo will be > 2^64 and needs to be float? but looks like we only care about 8 bytes when reinterpreting? hope that made sense :)
// make it a signed integer | ||
lo_data_i32 := int32(lo_data_u32) | ||
|
||
// return a larger type to make fq happy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕺💃 maybe there should be smaller number types hmm, should probably do another round of profiling memory soon
format/luajit/luajit.go
Outdated
}) | ||
|
||
// TODO: find out more about how to decode these strings | ||
d.FieldArray("anotations", func(d *decode.D) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: annotations
👍 with more debug structure
Lint not happy about unused Think i'm happy to merge if your happy too? |
it's good for me |
Thanks! |
a decoder for LuaJIT bytecode/precompiled files