[Text Format] Allow the data in data segments to be written in a vector of u8 and s8 numbers. #1348
Comments
I agree that the current format is pretty limited. That said, I think if we change this we'll want to have something more comprehensive. For example, see this simple raytracer I made. These values are We'll probably want syntax for this (maybe |
Interesting demo :) If we want something more advanced than an array of bytes, maybe we need to introduce special keywords for them? For example:
data ::= '(' 'data' memidx '(' 'offset' expr ')' datacontent* ')'
datacontent ::= string
| '(' 'data_u8' u8 ')'
| '(' 'data_u16' u16 ')'
| '(' 'data_u32' u32 ')'
...
| '(' 'data_f32' f32 ')'
| '(' 'data_f64' f64 ')'
Here's the usage example:
(data (i32.const 0)
"hello"
(data_f32 3.14159265359)
(data_u32 0xABCD)
)
Because they are converted to data in the segment data during the wat2wasm compilation, we don't need to add any additional And the additional grammar above should be backward compatible, as it still accepts strings. |
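To make the proposed lowering concrete, here is a minimal Python sketch of how a tool such as wat2wasm might flatten the usage example above into segment bytes. The helper name `encode_datacontent` and the tuple representation of each item are hypothetical and not taken from any existing tool. Little-endian order is assumed, since that is how WebAssembly lays out numbers in linear memory.

```python
import struct

# Hypothetical mapping from the proposed keywords to struct format codes.
# "<" forces little-endian, matching WebAssembly's memory byte order.
FORMATS = {
    "data_u8": "<B",
    "data_u16": "<H",
    "data_u32": "<I",
    "data_f32": "<f",
    "data_f64": "<d",
}

def encode_datacontent(items):
    """Concatenate strings and (keyword, value) pairs into raw segment bytes."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            # Plain strings keep their existing meaning: raw bytes.
            out += item.encode("utf-8")
        else:
            keyword, value = item
            out += struct.pack(FORMATS[keyword], value)
    return bytes(out)

# Mirrors: "hello" (data_f32 3.14159265359) (data_u32 0xABCD)
segment = encode_datacontent([
    "hello",
    ("data_f32", 3.14159265359),
    ("data_u32", 0xABCD),
])
print(segment.hex())
```

Because the numbers become ordinary bytes at this stage, the binary format would not need to change under this reading of the proposal.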
Good point. In that case it could use the same format anyway (which is also used in other places too, e.g. assertions), as if we had an This is a pretty simple change, but still requires going through the proposal process for standardization. Would you want to champion it? |
Sure, I would be glad to do that. What are the steps that I should take? |
Next steps would be to bring this to a future wasm community group meeting to see whether the group is interested in pursuing it. Are you a community group member? If not, you can join here: https://www.w3.org/community/webassembly/ |
Suggestion:
So for example if we wanted to have a vector of 3 floats, instead of having to say Could take that idea even farther and allow for a mixable data stream. For example, variable width vectors that are prefixed by their length: |
@jgravelle-google's suggestion seems to be useful. For example, this copied code from @binji's ray tracer: (data (i32.const 0)
"\00\00\00\00" ;; ground.x
"\00\00\c8\42" ;; ground.y
"\00\00\00\00" ;; ground.z
"\00\00\c2\42" ;; ground.r
"\3b\df\6f\3f" ;; ground.R
"\04\56\8e\3e" ;; ground.G
"\91\ed\bc\3e" ;; ground.B
)
can be re-written as
(data (offset (i32.const 0))
(data_f32 0 100.0 0 97.0) ;; ground.{x,y,z,r}
(data_f32
0.936999976635 ;; ground.R
0.277999997139 ;; ground.G
0.368999987841 ;; ground.B
)
)
Regarding @binji's previous comment:
Yeah, maybe it's better to use those formats instead of introducing the new ones. So I'm thinking the possible updated grammar would be this:
data ::= '(' 'data' memidx '(' 'offset' e:expr ')' datacontent* ')'
datacontent ::= string
| '(' 'u8.const' u8+ ')'
| '(' 'u16.const' u16+ ')'
| '(' 'u32.const' u32+ ')'
| '(' 'u64.const' u64+ ')'
| '(' 's8.const' s8+ ')'
| '(' 's16.const' s16+ ')'
| '(' 's32.const' s32+ ')'
| '(' 's64.const' s64+ ')'
| '(' 'f32.const' f32+ ')'
| '(' 'f64.const' f64+ ')' |
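As a sanity check on the ray tracer rewrite quoted earlier in this comment, the following Python sketch compares the bytes of the original escaped strings with the little-endian f32 encoding of the decimal values. The variable names are illustrative only, and the check assumes each value is rounded to the nearest 32-bit float, which is what `struct.pack` does.

```python
import struct

# Bytes of the original string-based segment, copied from the escapes above.
original = bytes.fromhex(
    "00000000"  # ground.x
    "0000c842"  # ground.y
    "00000000"  # ground.z
    "0000c242"  # ground.r
    "3bdf6f3f"  # ground.R
    "04568e3e"  # ground.G
    "91edbc3e"  # ground.B
)

# The same data written as seven little-endian 32-bit floats.
rewritten = struct.pack(
    "<7f",
    0.0, 100.0, 0.0, 97.0,   # ground.{x,y,z,r}
    0.936999976635,          # ground.R
    0.277999997139,          # ground.G
    0.368999987841,          # ground.B
)

print(original == rewritten)
```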
Nice, I like this! Although we can simplify it further, since we already require that
datacontent ::= string
| '(' 'i8.const' i8+ ')'
| '(' 'i16.const' i16+ ')'
| '(' 'i32.const' i32+ ')'
| '(' 'i64.const' i64+ ')'
| '(' 'f32.const' f32+ ')'
| '(' 'f64.const' f64+ ')' |
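One way to read this simplification: if an `iN` literal accepts both the unsigned and the signed range, as integer literals elsewhere in the text format do, then separate `u` and `s` forms are redundant because both spellings produce the same bytes. The Python sketch below illustrates that reading; `encode_int` is a hypothetical helper, not part of any tool.

```python
def encode_int(value, bits):
    """Encode an integer literal as little-endian bytes of the given width.

    Accepts anything from the signed minimum up to the unsigned maximum,
    so -1 and 255 are both valid 8-bit literals.
    """
    lo = -(1 << (bits - 1))   # smallest signed value, e.g. -128 for 8 bits
    hi = (1 << bits) - 1      # largest unsigned value, e.g. 255 for 8 bits
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Reduce modulo 2**bits so negative values wrap to two's complement.
    return (value % (1 << bits)).to_bytes(bits // 8, "little")

print(encode_int(-1, 8) == encode_int(255, 8))
```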
@binji Great, I hope it can make its way to the text format spec. I've just joined the wasm community group with id: echamudi. |
@echamudi Thanks! If you can attend one of the next CG meetings (the next is June 9th) to present, that would be best. You can open a PR to add this as an agenda item: https://github.com/WebAssembly/meetings/tree/master/2020 |
Can we simplify that further to
This mirrors what we did for SIMD values, and it avoids mistaking these value encodings for instructions. Also, I would tend to replace |
Looks fine to me. Also, I found out that the previous syntax (especially
(data (i32.const 10) (i32.const 20)) ;; same keyword, but 1st one is offset, 2nd one is data.
So, using @rossberg's suggested format might remove the problem.
(data (i32.const 10) (i32 20)) |
Yeah, in fact, t.const might even create an ambiguity for passive segments. |
Ah, good points. This is looking very nice now! |
Hey, I decided to fork and play with the wat2wasm code last weekend 😃. So, here's the sneak peek demo of this proposal: https://wasmprop-numerical-data.netlify.app/wat2wasm/ . And here are the code changes: code diff |
Hello,
In the current specification, when we initialize data segments, we are only allowed to use a vector of strings.
For example:
or
To save some arbitrary numbers, we can use backslashes and character hex numbers.
But this approach feels a bit hacky. Wouldn't it be better if we were also allowed to put in a vector of numbers, written directly as numbers?
For example:
And I think this addition still conforms to the core syntax definition, which defines the data part as a vector of bytes: