-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Silent failure when a field references a subsequent field #469
Comments
ksc currently provides no guarantees like that, i.e. data in seq:
- id: a
size: 16
process: xor(b)
- id: b
type: u1
seq:
- id: a
size: 16
process: xor(c)
- id: b
type: u1
instances:
c:
value: b + 1
|
That said, I personally never found it to be a big problem, as typically a target language compiler is good in issuing warning about such behavior in generated code (i.e. usage of uninitialized variables). |
I think the fact that KSC wraps all accesses to the fields inside nice accessor functions masks the fact that the accesses might be to uninitialized memory. (There's no way, inside the accessor function, to tell that the field is uninitialized, and there's probably no way in the calling function ( So, I'm willing to buy into the fact that my The code generated by KSC for me look similar to this:
At least conceptually, it seems like it would be straightforward for the compiler to emit the calls to In the absence of that compiler feature, how can I specify to KSC that I want the field processed by a value that I haven't read yet? Can I create a type and apply it lazily? (Or, is there some way to do the processing in a value instance?) |
Ok, so there are 2 distinct questions:
Second one is easier to answer to in some respects. Existing solutions are: All manualseq:
- id: placeholder
size: 16 + 32
- id: xor_key
type: u2
instances:
serial_number:
pos: 0
size: 16
process: xor(xor_key & 0xff)
hash:
pos: 16
size: 32
process: xor(xor_key & 0xff) Hacky but automaticseq:
- id: serial_number_placeholder
size: 16
if: serial_number_pos >= 0 # always true, but stores pos
- id: hash_placeholder
size: 32
if: hash_pos >= 0 # always true, but stores pos
- id: xor_key
type: u2
instances:
serial_number_pos:
value: _io.pos
serial_number:
pos: serial_number_pos
size: 16
process: xor(xor_key & 0xff)
hash_pos:
value: _io.pos
hash:
pos: hash_pos
size: 32
process: xor(xor_key & 0xff) And one not-yet-implemented yet lazy seq solution: seq:
- id: serial_number
process: xor(xor_key & 0xff)
size: 16
lazy: true
- id: hash
process: xor(xor_key & 0xff)
size: 32
lazy: true
- id: xor_key
type: u2 This is obviously not a panacea, but typically wrapping everything possible in |
Commenting some of the questions / proposals:
First of all, KSC does not do it everywhere. For example, in JavaScript and Python, one just accesses raw object's variables using Second, I don't think that detection of uninitialized memory access is particularly hard in runtime even when it's not done automatically by the language, i.e. we can in theory prepare an extra boolean field for every attribute that will start as if self._i_foo:
return self.foo
else:
raise Exception("foo is accessed, but it is yet undefined") May be it's a good idea to have some sort of flag to generate that kind of "safety" machinery. It will be obviously slower (and less readable), but definitely safer and won't allow that silent corruption you've mentioned.
No, it's not: seq:
- id: a
type: foo
size: 16
process: xor(0xaa)
- id: b
size: a.some_attribute In this case, |
I don't think it's completely infeasible to support this in "sane" cases where the graph of "field cross-references" forms a DAG: for each field, you presumably know which other fields its (That said, it's definitely a new feature, I guess) |
@FireyFly It is definitely possible, but I just wanted to point you that it's trickier than it looks like. seq:
- id: a
size: b
- id: b
type: u4 And this one looks very similar it, but it is actually totally ok, as sum of size(a)+size(b) is const. seq:
- id: a
size: 20 + c
- id: b
size: 20 - c
- id: c # actually always starts at constant offset = 20 + c + 20 - c = 40
type: u4 |
@FireyFly, I think constructing and traversing the DAG would allow the compiler to flag cases which aren't sane (which would have been a boon to me...). However, as @GreyCat has demonstrated, the constraint of having to read the fields in order can conflict with a requirement to process them out of order. @GreyCat, thanks as always for your prompt attention and cogent responses. I plan to pursue the "Hacky but automatic" solution -- since the fields occur in the middle of my record, have to calculate and maintain the offsets manually is unappealing. And, as for referencing the fields out of order, that is the sort of thing up with which I shall not put. ;-) |
(I tried to search for similar issue reports, but failed to find any -- sorry if this is a duplicate.)
I have a
.ksy
file containing the following:The generated code reads the fields in order (which is not surprising), but it processes
serial_number
when it is read, whenxor_key
is still uninitialized.I would have expected either that KSC would have issued a warning/error at compile time or that it would have deferred the processing of
serial_number
. (Replacing the direct reference to thexor_key
expression with an value instance yields the same result.) Having it "silently fail" (i.e., produce bad results at runtime) is...disappointing.Any suggestions on how I should implement this? (I would expect that I should be able to define an value instance for the serial number, but I don't know how to specify a
process
attribute for an instance, and I don't know how to apply an XOR operator to each byte of a byte array....)Thanks!
The text was updated successfully, but these errors were encountered: